Create pyDarwin Options
create_pyDarwinOptions.Rd
Generates a list of parameters to be used in a pyDarwin run.
Usage
create_pyDarwinOptions(
  author = "",
  project_name = NULL,
  algorithm = c("GA", "EX", "MOGA", "MOGA3", "GP", "RF", "GBRT", "PSO"),
  GA = pyDarwinOptionsGA(),
  MOGA = pyDarwinOptionsMOGA(),
  PSO = pyDarwinOptionsPSO(),
  random_seed = 11,
  num_parallel = 4,
  num_generations = 6,
  population_size = 4,
  num_opt_chains = 4,
  exhaustive_batch_size = 100,
  crash_value = 99999999,
  penalty = pyDarwinOptionsPenalty(),
  effect_limit = -1,
  downhill_period = 2,
  num_niches = 2,
  niche_radius = 2,
  local_2_bit_search = TRUE,
  final_downhill_search = TRUE,
  local_grid_search = FALSE,
  max_local_grid_search_bits = 5,
  search_omega_blocks = FALSE,
  search_omega_bands = FALSE,
  individual_omega_search = TRUE,
  search_omega_sub_matrix = FALSE,
  max_omega_sub_matrix = 4,
  model_run_timeout = 1200,
  model_run_priority_class = c("below_normal", "normal"),
  postprocess = pyDarwinOptionsPostprocess(),
  keep_key_models = TRUE,
  keep_best_models = TRUE,
  rerun_key_models = FALSE,
  rerun_front_models = TRUE,
  use_saved_models = FALSE,
  saved_models_file = "{working_dir}/models0.json",
  saved_models_readonly = FALSE,
  remove_run_dir = FALSE,
  remove_temp_dir = FALSE,
  keep_files = c("dmp.txt", "posthoc.csv"),
  keep_extensions = NULL,
  use_system_options = TRUE,
  model_cache = "darwin.MemoryModelCache",
  model_run_man = c("darwin.LocalRunManager", "darwin.GridRunManager"),
  engine_adapter = c("nlme", "nonmem"),
  skip_running = FALSE,
  working_dir = NULL,
  data_dir = NULL,
  output_dir = "{working_dir}/output",
  temp_dir = NULL,
  key_models_dir = "{working_dir}/key_models",
  non_dominated_models_dir = "{working_dir}/non_dominated_models",
  nlme_dir = "C:/Program Files/Certara/NLME_Engine",
  gcc_dir = "C:/Program Files/Certara/mingw64",
  nmfe_path = NULL,
  rscript_path = file.path(normalizePath(R.home("bin")), "Rscript"),
  generic_grid_adapter = pyDarwinOptionsGridAdapter(),
  remote_run = FALSE,
  ...
)
Arguments
- author
Character string: The name of the author.
- project_name
Character string (optional): The name of the project. If not specified, pyDarwin will set its value to the name of the parent folder of the options file.
- algorithm
Character string: One of EX, GA, MOGA, MOGA3, GP, RF, GBRT, PSO. See section Details below for more information.
- GA
List: Options specific to the Genetic Algorithm (GA). See pyDarwinOptionsGA(). Ignored if algorithm is not "GA".
- MOGA
List: Options specific to the Multi-Objective Genetic Algorithm (MOGA or MOGA3). See pyDarwinOptionsMOGA(). Ignored if algorithm is not "MOGA" or "MOGA3".
- PSO
List: Options specific to the Particle Swarm Optimization (PSO). See pyDarwinOptionsPSO(). Ignored if algorithm is not "PSO".
- random_seed
Positive integer: Seed for random number generation.
- num_parallel
Positive integer: Number of models to execute in parallel, i.e., how many threads to create to handle model runs. Default: 4.
- num_generations
Positive integer: Number of iterations or generations of the search algorithm to run. Not used/required for EX. Default: 6.
- population_size
Positive integer: Number of models to create in every generation. Not used/required for EX. Default: 4.
- num_opt_chains
Positive integer: Number of parallel processes to perform the "ask" step (to increase performance). Required only for GP, RF, and GBRT. Default: 4.
- exhaustive_batch_size
Positive integer: Batch size for the EX (Exhaustive Search) algorithm. Default: 100.
- crash_value
Positive real: Value of fitness or reward assigned when model output is not generated. Should be set larger than any anticipated completed model fitness. Default: 99999999.
- penalty
List: Options specific to the penalty calculation. See pyDarwinOptionsPenalty().
- effect_limit
Integer: Limits number of effects. Applicable only for NONMEM and GA/MOGA/MOGA3. If < 1, effect limit is turned off. Default: -1.
- downhill_period
Integer: How often to run the downhill step. If < 1, no periodic downhill search will be performed. Default: 2.
- num_niches
Integer: Used for GA and downhill. A penalty is assigned for each model based on the number of similar models within a niche radius. This penalty is applied only to the selection process (not to the fitness of the model). The purpose is to maintain a degree of diversity in the population. num_niches is also used to select the number of models that are entered into the downhill step for all algorithms except EX. Default: 2.
- niche_radius
Positive real: The radius of the niches. Used to define how similar pairs of models are, for Local search and GA sharing penalty. Default: 2.
- local_2_bit_search
Logical: Whether to perform the two-bit local search. Substantially increases search robustness. The search is started from num_niches models. Ignored for MOGA and MOGA3. Default: TRUE.
- final_downhill_search
Logical: Whether to perform a local search (1-bit and 2-bit) at the end of the global search. Default: TRUE.
- local_grid_search
Logical: Whether to perform a local grid search during downhill. Default: FALSE.
- max_local_grid_search_bits
Positive integer: Maximum number of bits to explore in the local grid search. Default: 5.
- search_omega_blocks
Logical: Whether to perform search for block omegas. Used only when engine_adapter == 'nlme'. Default: FALSE.
- search_omega_bands
Logical: Whether to perform search for band omegas. Used only when engine_adapter == 'nonmem'. Default: FALSE.
- individual_omega_search
Logical: If TRUE, every omega search block is handled individually. If FALSE, all search blocks have the same pattern. Default: TRUE.
- search_omega_sub_matrix
Logical: Set to TRUE to search omega submatrix. Default: FALSE.
- max_omega_sub_matrix
Integer: Maximum size of sub matrix to use in search. Default: 4.
- model_run_timeout
Positive real: Time (seconds) after which the execution will be terminated, and the crash value assigned. Default: 1200.
- model_run_priority_class
Character string (Windows only): Priority class for child processes. Options are below_normal (default) and normal.
- postprocess
List: Options specific to postprocessing. See pyDarwinOptionsPostprocess(). For algorithm = "MOGA3", postprocessing is required to define objectives and constraints. For algorithm = "MOGA" (NSGA-II), pyDarwin does not use postprocessing for objective calculation.
- keep_key_models
Logical: Whether to save the best model from every generation to key_models_dir. Default: TRUE.
- keep_best_models
Logical: If TRUE (default), saves only "key" models that represent an improvement in fitness value compared to the previous overall best model. Models are saved to key_models_dir. Not applicable to Exhaustive Search (EX). Default: TRUE.
- rerun_key_models
Logical: Whether to re-run key models that lack output after the search. Default: FALSE.
- rerun_front_models
Logical: Similar to rerun_key_models, but for non-dominated models (typically from MOGA/MOGA3). Models are copied to non_dominated_models_dir. Default: TRUE.
- use_saved_models
Logical: Whether to restore saved Model Cache from file. Default: FALSE.
- saved_models_file
Character string: The file from which to restore Model Cache. Default: "{working_dir}/models0.json".
- saved_models_readonly
Logical: If TRUE, the content of saved_models_file is not overwritten. Default: FALSE.
- remove_run_dir
Logical: If TRUE, delete the entire model run directory, otherwise only unnecessary files. Default: FALSE.
- remove_temp_dir
Logical: Whether to delete the entire temp_dir after the search. Default: FALSE.
- keep_files
Character vector (optional): List of exact file names to keep when cleaning up run directories. Default is c("dmp.txt", "posthoc.csv") when engine_adapter is "nlme".
- keep_extensions
Character vector (optional): List of file extensions (without dot) to keep. Default: NULL.
- use_system_options
Logical: Whether to override options with environment-specific values. Default: TRUE.
- model_cache
Character string: ModelCache subclass to be used. Default: "darwin.MemoryModelCache".
- model_run_man
Character string: ModelRunManager subclass to be used. Options: "darwin.LocalRunManager" (default), "darwin.GridRunManager".
- engine_adapter
Character string: ModelEngineAdapter subclass. Options: "nlme" (default), "nonmem".
- skip_running
Logical: If TRUE, no actual NONMEM/NLME runs will be performed. Default: FALSE.
- working_dir
Character string (optional): Project's working directory.
- data_dir
Character string (optional): Directory for datasets.
- output_dir
Character string: Directory for pyDarwin output. Default: "{working_dir}/output".
- temp_dir
Character string (optional): Parent directory for model run subdirectories.
- key_models_dir
Character string: Directory where key/best models will be saved. Default: "{working_dir}/key_models".
- non_dominated_models_dir
Character string: Directory where non-dominated models will be saved (typically for MOGA/MOGA3). Default: "{working_dir}/non_dominated_models".
- nlme_dir
Character string (optional): Directory for NLME Engine installation.
- gcc_dir
Character string (optional): Directory for Mingw-w64 compiler.
- nmfe_path
Character string (optional): Path to NONMEM execution command.
- rscript_path
Character string (optional): Path to Rscript executable.
- generic_grid_adapter
List: Options for grid execution. See pyDarwinOptionsGridAdapter(). Used if model_run_man == "darwin.GridRunManager".
- remote_run
Logical: Indicates if pyDarwin execution is for a remote host. Default: FALSE.
- ...
Additional parameters.
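The niche idea described under num_niches and niche_radius can be pictured with a small base R sketch. This is not pyDarwin's implementation: the function name count_niche_neighbors, the representation of models as rows of bits, and the use of Hamming distance are all assumptions made for illustration, and the conversion of a neighbor count into a selection penalty is not shown.

```r
# Illustrative only (not pyDarwin code): for each model, count how many other
# models lie within niche_radius. Hamming distance between bit strings is an
# assumed similarity measure.
count_niche_neighbors <- function(bits, niche_radius = 2) {
  # bits: matrix with one row per model and one column per bit
  n <- nrow(bits)
  vapply(seq_len(n), function(i) {
    # Repeat model i's bits on every row, then count differing positions
    ref <- matrix(bits[i, ], nrow = n, ncol = ncol(bits), byrow = TRUE)
    d <- rowSums(bits != ref)
    sum(d <= niche_radius) - 1L  # exclude the model itself
  }, integer(1))
}

# Three hypothetical models encoded as 6-bit strings
population <- rbind(
  c(0, 0, 0, 0, 0, 0),
  c(0, 0, 0, 0, 1, 1),
  c(1, 1, 1, 1, 1, 1)
)
neighbors <- count_niche_neighbors(population, niche_radius = 2)
```

In this toy population the first two models differ in two bits and so count each other as neighbors, while the third model has none; a crowded niche would attract a larger selection penalty under the description above.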
Details
The algorithm parameter specifies the search algorithm. The algorithms "MOGA" and "MOGA3" are used for multi-objective optimization: "MOGA" uses NSGA-II (see the documentation at https://pymoo.org/algorithms/moo/nsga2.html?highlight=nsga%20ii), and "MOGA3" uses NSGA-III (see the documentation at https://pymoo.org/algorithms/moo/nsga3.html?highlight=nsga%20ii). For MOGA3, the objectives and constraints must be defined and returned by postprocessing scripts (post_run_r_code or post_run_python_code) in a specific format:
R scripts should return a list of two vectors: the first vector holds the objectives and the second holds the constraints. If there are no constraints, the second vector should be empty.
Python scripts should return a tuple of two lists: the first list holds the objectives and the second holds the constraints. If there are no constraints, the second list should be empty.
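As a rough illustration of the R return format described above, a MOGA3 post-run script might end by producing a two-element list like the one below. The helper name compute_moga3_result, its arguments (ofv, n_params, run_time, max_run_time), the AIC-like formula, and the sign convention of the constraint are all hypothetical; how pyDarwin hands run results to the script is not shown here and should be taken from the pyDarwin documentation.

```r
# Hypothetical helper: builds the value a MOGA3 post-run R script would return.
# Argument names and formulas are illustrative assumptions, not pyDarwin API.
compute_moga3_result <- function(ofv, n_params, run_time, max_run_time = 600) {
  # First vector: three objectives (as with objectives = 3)
  objectives <- c(
    ofv + 2 * n_params,  # AIC-like value
    n_params,            # model size
    run_time             # run time in seconds
  )
  # Second vector: one constraint (sign convention assumed for illustration)
  constraints <- c(run_time - max_run_time)
  # Return format from the text: list(objectives, constraints)
  list(objectives, constraints)
}

result <- compute_moga3_result(ofv = 1234.5, n_params = 8, run_time = 42)
```

With no constraints, the second element would simply be an empty vector, for example list(objectives, numeric(0)).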
Other algorithms include "EX" (Exhaustive), "GA" (Genetic Algorithm), "GP" (Gaussian Process), "RF" (Random Forest), "GBRT" (Gradient Boosted Random Tree), and "PSO" (Particle Swarm Optimization).
Please see pyDarwin documentation for complete details on all options.
Examples
# Basic options with GA
ga_opts <- create_pyDarwinOptions(author = "Jane Doe", algorithm = "GA")
# Options for MOGA (NSGA-II)
# pyDarwin internally uses 2 objectives; postprocessing for objectives is not used by pyDarwin.
moga_opts_nsga2 <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "MOGA_Test_NSGA2",
  algorithm = "MOGA", # NSGA-II
  MOGA = pyDarwinOptionsMOGA(), # Default MOGA options are suitable
  population_size = 50,
  num_generations = 100,
  engine_adapter = "nonmem",
  nmfe_path = "/opt/NONMEM/nm75/run/nmfe75"
)
# Options for MOGA3 (NSGA-III with 3 objectives, 1 constraint via R postprocessing)
moga_opts_nsga3_custom <- pyDarwinOptionsMOGA(
  objectives = 3,
  names = c("AIC", "NumEffects", "RunTime"), # Example custom names
  constraints = 1,
  partitions = 10 # Custom partitions
)
main_opts_nsga3 <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "MOGA_Test_NSGA3",
  algorithm = "MOGA3", # NSGA-III
  MOGA = moga_opts_nsga3_custom,
  population_size = 60, # NSGA-III population size might need adjustment
  num_generations = 100,
  postprocess = pyDarwinOptionsPostprocess( # Required for MOGA3
    use_r = TRUE,
    post_run_r_code = "{project_dir}/moga3_postprocess.R"
  ),
  engine_adapter = "nonmem",
  nmfe_path = "/opt/NONMEM/nm75/run/nmfe75"
)