Generates a list of parameters to be used in a pyDarwin run.

Usage

create_pyDarwinOptions(
  author = "",
  project_name = NULL,
  algorithm = c("GA", "EX", "MOGA", "MOGA3", "GP", "RF", "GBRT", "PSO"),
  GA = pyDarwinOptionsGA(),
  MOGA = pyDarwinOptionsMOGA(),
  PSO = pyDarwinOptionsPSO(),
  random_seed = 11,
  num_parallel = 4,
  num_generations = 6,
  population_size = 4,
  num_opt_chains = 4,
  exhaustive_batch_size = 100,
  crash_value = 99999999,
  penalty = pyDarwinOptionsPenalty(),
  effect_limit = -1,
  downhill_period = 2,
  num_niches = 2,
  niche_radius = 2,
  local_2_bit_search = TRUE,
  final_downhill_search = TRUE,
  local_grid_search = FALSE,
  max_local_grid_search_bits = 5,
  search_omega_blocks = FALSE,
  search_omega_bands = FALSE,
  individual_omega_search = TRUE,
  search_omega_sub_matrix = FALSE,
  max_omega_sub_matrix = 4,
  model_run_timeout = 1200,
  model_run_priority_class = c("below_normal", "normal"),
  postprocess = pyDarwinOptionsPostprocess(),
  keep_key_models = TRUE,
  keep_best_models = TRUE,
  rerun_key_models = FALSE,
  rerun_front_models = TRUE,
  use_saved_models = FALSE,
  saved_models_file = "{working_dir}/models0.json",
  saved_models_readonly = FALSE,
  remove_run_dir = FALSE,
  remove_temp_dir = FALSE,
  keep_files = c("dmp.txt", "posthoc.csv"),
  keep_extensions = NULL,
  use_system_options = TRUE,
  model_cache = "darwin.MemoryModelCache",
  model_run_man = c("darwin.LocalRunManager", "darwin.GridRunManager"),
  engine_adapter = c("nlme", "nonmem"),
  skip_running = FALSE,
  working_dir = NULL,
  data_dir = NULL,
  output_dir = "{working_dir}/output",
  temp_dir = NULL,
  key_models_dir = "{working_dir}/key_models",
  non_dominated_models_dir = "{working_dir}/non_dominated_models",
  nlme_dir = "C:/Program Files/Certara/NLME_Engine",
  gcc_dir = "C:/Program Files/Certara/mingw64",
  nmfe_path = NULL,
  rscript_path = file.path(normalizePath(R.home("bin")), "Rscript"),
  generic_grid_adapter = pyDarwinOptionsGridAdapter(),
  remote_run = FALSE,
  ...
)

Arguments

author

Character string: The name of the author.

project_name

Character string (optional): The name of the project. If not specified, pyDarwin will set its value to the name of the parent folder of the options file.

algorithm

Character string: One of EX, GA, MOGA, MOGA3, GP, RF, GBRT, PSO. See section Details below for more information.

GA

List: Options specific to the Genetic Algorithm (GA). See pyDarwinOptionsGA(). Ignored if algorithm is not "GA".

MOGA

List: Options specific to the Multi-Objective Genetic Algorithm (MOGA or MOGA3). See pyDarwinOptionsMOGA(). Ignored if algorithm is not "MOGA" or "MOGA3".

PSO

List: Options specific to the Particle Swarm Optimization (PSO). See pyDarwinOptionsPSO(). Ignored if algorithm is not "PSO".

random_seed

Positive integer: Seed for random number generation.

num_parallel

Positive integer: Number of models to execute in parallel, i.e., how many threads to create to handle model runs. Default: 4.

num_generations

Positive integer: Number of iterations or generations of the search algorithm to run. Not used/required for EX. Default: 6.

population_size

Positive integer: Number of models to create in every generation. Not used/required for EX. Default: 4.

num_opt_chains

Positive integer: Number of parallel processes to perform the "ask" step (to increase performance). Required only for GP, RF, and GBRT. Default: 4.

exhaustive_batch_size

Positive integer: Batch size for the EX (Exhaustive Search) algorithm. Default: 100.

crash_value

Positive real: Value of fitness or reward assigned when model output is not generated. Should be set larger than any anticipated completed model fitness. Default: 99999999.

penalty

List: Options specific to the penalty calculation. See pyDarwinOptionsPenalty().

effect_limit

Integer: Limits number of effects. Applicable only for NONMEM and GA/MOGA/MOGA3. If < 1, effect limit is turned off. Default: -1.

downhill_period

Integer: How often to run the downhill step. If < 1, no periodic downhill search will be performed. Default: 2.

num_niches

Integer: Used for GA and downhill. A penalty is assigned for each model based on the number of similar models within a niche radius. This penalty is applied only to the selection process (not to the fitness of the model). The purpose is to ensure maintaining a degree of diversity in the population. num_niches is also used to select the number of models that are entered into the downhill step for all algorithms, except EX. Default: 2.

niche_radius

Positive real: The radius of the niches. Used to define how similar pairs of models are, for Local search and GA sharing penalty. Default: 2.

local_2_bit_search

Logical: Whether to perform the two-bit local search. Substantially increases search robustness. The search starts from num_niches models. Ignored for MOGA and MOGA3. Default: TRUE.

final_downhill_search

Logical: Whether to perform a local search (1-bit and 2-bit) at the end of the global search. Default: TRUE.

local_grid_search

Logical: Whether to perform a local grid search during downhill. Default: FALSE.

max_local_grid_search_bits

Positive integer: Maximum number of bits to explore in the local grid search. Default: 5.

search_omega_blocks

Logical: Whether to perform search for block omegas. Used only when engine_adapter == 'nlme'. Default: FALSE.

search_omega_bands

Logical: Whether to perform search for band omegas. Used only when engine_adapter == 'nonmem'. Default: FALSE.

individual_omega_search

Logical: If TRUE, every omega search block is handled individually. If FALSE, all search blocks have the same pattern. Default: TRUE.

search_omega_sub_matrix

Logical: Set to TRUE to search omega submatrix. Default: FALSE.

max_omega_sub_matrix

Integer: Maximum size of sub matrix to use in search. Default: 4.

model_run_timeout

Positive real: Time (seconds) after which the execution will be terminated, and the crash value assigned. Default: 1200.

model_run_priority_class

Character string (Windows only): Priority class for child processes. Options are below_normal (default) and normal.

postprocess

List: Options specific to postprocessing. See pyDarwinOptionsPostprocess(). For algorithm = "MOGA3", postprocessing is required to define objectives and constraints. For algorithm = "MOGA" (NSGA-II), pyDarwin does not use postprocessing for objective calculation.

keep_key_models

Logical: Whether to save the best model from every generation to key_models_dir. Default: TRUE.

keep_best_models

Logical: If TRUE, saves only "key" models that improve on the fitness value of the previous overall best model. Models are saved to key_models_dir. Not applicable to Exhaustive Search (EX). Default: TRUE.

rerun_key_models

Logical: Whether to re-run key models that lack output after the search. Default: FALSE.

rerun_front_models

Logical: Similar to rerun_key_models, but for non-dominated models (typically from MOGA/MOGA3). Models are copied to non_dominated_models_dir. Default: TRUE.

use_saved_models

Logical: Whether to restore saved Model Cache from file. Default: FALSE.

saved_models_file

Character string: The file from which to restore Model Cache. Default: "{working_dir}/models0.json".

saved_models_readonly

Logical: Do not overwrite the saved_models_file content. Default: FALSE.

remove_run_dir

Logical: If TRUE, delete the entire model run directory; otherwise delete only unnecessary files. Default: FALSE.

remove_temp_dir

Logical: Whether to delete the entire temp_dir after the search. Default: FALSE.

keep_files

Character vector (optional): List of exact file names to keep when cleaning up run directories. Default is c("dmp.txt", "posthoc.csv") when engine_adapter is "nlme".

keep_extensions

Character vector (optional): List of file extensions (without dot) to keep. Default: NULL.

use_system_options

Logical: Whether to override options with environment-specific values. Default: TRUE.

model_cache

Character string: ModelCache subclass to be used. Default: "darwin.MemoryModelCache".

model_run_man

Character string: ModelRunManager subclass to be used. Options: "darwin.LocalRunManager" (default), "darwin.GridRunManager".

engine_adapter

Character string: ModelEngineAdapter subclass. Options: "nlme" (default), "nonmem".

skip_running

Logical: If TRUE, no actual NONMEM/NLME runs will be performed. Default: FALSE.

working_dir

Character string (optional): Project's working directory.

data_dir

Character string (optional): Directory for datasets.

output_dir

Character string: Directory for pyDarwin output. Default: "{working_dir}/output".

temp_dir

Character string (optional): Parent directory for model run subdirectories.

key_models_dir

Character string: Directory where key/best models will be saved. Default: "{working_dir}/key_models".

non_dominated_models_dir

Character string: Directory where non-dominated models will be saved (typically for MOGA/MOGA3). Default: "{working_dir}/non_dominated_models".

nlme_dir

Character string (optional): Directory for NLME Engine installation.

gcc_dir

Character string (optional): Directory for Mingw-w64 compiler.

nmfe_path

Character string (optional): Path to NONMEM execution command.

rscript_path

Character string (optional): Path to Rscript executable.

generic_grid_adapter

List: Options for grid execution. See pyDarwinOptionsGridAdapter(). Used if model_run_man == "darwin.GridRunManager".

remote_run

Logical: Indicates if pyDarwin execution is for a remote host. Default: FALSE.

...

Additional parameters.

Value

A list of pyDarwin options.

Details

The algorithm parameter specifies the search algorithm. The algorithms "MOGA" and "MOGA3" are used for multi-objective optimization: "MOGA" uses NSGA-II (see the documentation at https://pymoo.org/algorithms/moo/nsga2.html?highlight=nsga%20ii), and "MOGA3" uses NSGA-III (see the documentation at https://pymoo.org/algorithms/moo/nsga3.html?highlight=nsga%20ii). For MOGA3, the objectives and constraints must be defined and returned by postprocessing scripts (post_run_r_code or post_run_python_code) in a specific format:

  • R scripts should return a list of two vectors: the first vector is for the objectives and the second one is for the constraints. If there are no constraints, the second vector should be empty.

  • Python scripts should return a tuple of two lists: the first list is for the objectives and the second one is for the constraints. If there are no constraints, the second list should be empty.
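
As a minimal sketch of the R return shape described above, the following base-R snippet builds a list of two vectors for three objectives and one constraint. The helper make_moga3_result() and all values are hypothetical and are not part of pyDarwin or of this package. How pyDarwin reads the value back from the script, and how it interprets the sign of a constraint value, are not covered here and should be checked in the pyDarwin documentation.

```r
# Hypothetical helper (an assumption for illustration, not a pyDarwin API):
# packs objectives and constraints into the two-vector list described above.
make_moga3_result <- function(objectives, constraints = numeric(0)) {
  stopifnot(is.numeric(objectives), is.numeric(constraints))
  list(unname(objectives), unname(constraints))
}

# Illustrative values only; a real postprocessing script would derive
# these from the model run output.
aic          <- 1234.5
n_effects    <- 3
run_time     <- 42
max_run_time <- 600

# Three objectives and one constraint.
# The constraint value is a placeholder; its sign convention is an assumption.
result <- make_moga3_result(
  objectives  = c(aic, n_effects, run_time),
  constraints = c(run_time - max_run_time)
)

# No constraints: the second vector is left empty.
no_constraints <- make_moga3_result(c(aic, n_effects, run_time))

str(result)
str(no_constraints)
```

The lengths of the two vectors are expected to match the numbers of objectives and constraints configured for the search.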

Other algorithms include "EX" (Exhaustive), "GA" (Genetic Algorithm), "GP" (Gaussian Process), "RF" (Random Forest), "GBRT" (Gradient Boosted Random Tree), and "PSO" (Particle Swarm Optimization).

Please see pyDarwin documentation for complete details on all options.

Examples

# Basic options with GA
ga_opts <- create_pyDarwinOptions(author = "Jane Doe", algorithm = "GA")

# Options for MOGA (NSGA-II)
# pyDarwin internally uses 2 objectives; postprocessing for objectives is not used by pyDarwin.
moga_opts_nsga2 <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "MOGA_Test_NSGA2",
  algorithm = "MOGA", # NSGA-II
  MOGA = pyDarwinOptionsMOGA(), # Default MOGA options are suitable
  population_size = 50,
  num_generations = 100,
  engine_adapter = "nonmem",
  nmfe_path = "/opt/NONMEM/nm75/run/nmfe75"
)

# Options for MOGA3 (NSGA-III with 3 objectives, 1 constraint via R postprocessing)
moga_opts_nsga3_custom <- pyDarwinOptionsMOGA(
  objectives = 3,
  names = c("AIC", "NumEffects", "RunTime"), # Example custom names
  constraints = 1,
  partitions = 10 # Custom partitions
)
main_opts_nsga3 <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "MOGA_Test_NSGA3",
  algorithm = "MOGA3", # NSGA-III
  MOGA = moga_opts_nsga3_custom,
  population_size = 60, # NSGA-III population size might need adjustment
  num_generations = 100,
  postprocess = pyDarwinOptionsPostprocess( # Required for MOGA3
    use_r = TRUE,
    post_run_r_code = "{project_dir}/moga3_postprocess.R"
  ),
  engine_adapter = "nonmem",
  nmfe_path = "/opt/NONMEM/nm75/run/nmfe75"
)