API

High level functions

run_search

darwin.run_search.run_search(template_file: str, tokens_file: str, options_file: str) → ModelRun: Runs the search algorithm selected in options_file, using template_file and tokens_file. When the search ends, the best model's control file and output file are written to project_dir (specified in options_file). The options_file path should, in general, be absolute; the other file names can be absolute paths or paths relative to project_dir. Returns the final model object.

run_search_in_folder

darwin.run_search_in_folder.run_search_in_folder(folder: str, template_file: str = 'template.txt', tokens_file: str = 'tokens.json', options_file: str = 'options.json') → ModelRun
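A minimal usage sketch for the signature above. It assumes that the default file names are looked up inside folder, which the signature suggests but this page does not state. The helpers default_search_files and search_in_folder are hypothetical names introduced only for illustration, and the pyDarwin import is guarded so the snippet still loads when the package is not installed.

```python
from pathlib import Path


def default_search_files(folder: str) -> dict:
    """Hypothetical helper: join the default file names from the signature to `folder`.

    Assumption: run_search_in_folder resolves its defaults relative to `folder`.
    """
    base = Path(folder)
    return {
        "template_file": base / "template.txt",
        "tokens_file": base / "tokens.json",
        "options_file": base / "options.json",
    }


def search_in_folder(folder: str):
    """Hypothetical wrapper: run the search if pyDarwin is importable, else return None."""
    try:
        from darwin.run_search_in_folder import run_search_in_folder
    except ImportError:
        return None
    # Only `folder` is passed, so the three default file names apply.
    return run_search_in_folder(folder)


for name, path in default_search_files("my_project").items():
    print(name, path.name)
```

The wrapper is deliberately not called here, because a real call starts a full search.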

Algorithms

exhaustive

darwin.algorithms.exhaustive.run_exhaustive(model_template: Template) → ModelRun

Run a full exhaustive search on the Template, covering all possible combinations. All models are run in iteration number 0.

Parameters: model_template (Template) – Model Template
Returns: The final/best model run
Return type: ModelRun

darwin.algorithms.exhaustive.get_search_space(template: Template) → ndarray

darwin.algorithms.exhaustive.get_search_space_size(template: Template) → int
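The idea of an exhaustive search space can be sketched without pyDarwin. The snippet assumes each token group contributes one integer gene and that the space is the Cartesian product of the per-group choices. The function names and the group sizes are illustrative, and the result is a plain list rather than the ndarray that get_search_space returns.

```python
from itertools import product
from math import prod


def search_space_size(group_sizes):
    """Number of combinations: the product of the option counts per token group."""
    return prod(group_sizes)


def search_space(group_sizes):
    """Every combination of option indices, one integer per token group."""
    return [list(code) for code in product(*(range(n) for n in group_sizes))]


sizes = [2, 3, 2]  # hypothetical: three token groups with 2, 3 and 2 options
print(search_space_size(sizes))  # 12
print(search_space(sizes)[:3])   # [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
```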

GA

darwin.algorithms.GA.run_ga(model_template: Template) → ModelRun

Run the Genetic Algorithm (GA) search, using the DEAP (https://github.com/deap/deap) package. The template object includes the control file template and all the token groups.

Parameters: model_template (Template) – Template object for the search
Returns: The single best model from the search
Return type: ModelRun

OPT

darwin.algorithms.OPT.run_skopt(model_template: Template) → ModelRun

Run one of the scikit-optimize algorithms (GP, RF, GBRT). See https://scikit-optimize.github.io/stable/.

Parameters: model_template (Template) – Model template to be run
Returns: The best model from the search
Return type: ModelRun

downhill

darwin.algorithms.run_downhill.run_downhill(template: Template, pop: Population, return_all: bool = False) → list: Run the downhill step, with a full (2 bit) search if requested. Finds N <= num_niches niches in pop and replaces the N worst models in pop with the best models from the niches. If return_all is true, returns a list of ALL models.
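The replacement step described above can be illustrated on bare fitness values. This is only a sketch: it assumes lower fitness is better, takes the best model of each niche as given (niche discovery and the local search are not shown), and replace_worst is a hypothetical name, not a pyDarwin function.

```python
def replace_worst(pop_fitness, niche_best_fitness):
    """Drop the N worst fitness values and add the N niche bests (lower is better)."""
    n = len(niche_best_fitness)
    if n > len(pop_fitness):
        raise ValueError("more niche models than population members")
    # Keep the best len(pop) - N members, then merge in the niche bests.
    survivors = sorted(pop_fitness)[: len(pop_fitness) - n]
    return sorted(survivors + list(niche_best_fitness))


population = [10.0, 50.0, 30.0, 90.0, 70.0]
niche_bests = [8.0, 25.0]  # N = 2 niches found
print(replace_worst(population, niche_bests))  # [8.0, 10.0, 25.0, 30.0, 50.0]
```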

Model classes

darwin.Template

class darwin.Template.Template(template_file: str, tokens_file: str)

The Template object contains information common to all the model objects, including the template code (from the template file) and the tokens set. It DOES NOT include any model specific information, such as the phenotype, the control file text, or any of the output results from NONMEM. Other housekeeping functions are performed, such as defining the gene structure (by counting the number of token groups for each token set), parsing out the THETA/OMEGA/SIGMA blocks, and counting the number of fixed/non-searched THETAs/OMEGAs/SIGMAs.

Parameters:

template_file (str) – Path to the plain ASCII text template file
tokens_file (str) – Path to the JSON tokens file

get_search_space_coordinates() → list

darwin.Model

class darwin.Model.Model(code: ModelCode)

The full model, used for GA, GP, RF, GBRT and exhaustive search.

Model instantiation takes a template as an argument, along with the model code, model number, and generation. Functions include constructing the control file, executing the control file, and calculating the fitness/reward.

model_code: a ModelCode object. Contains the bit string/integer string representation of the model. Includes the full binary string, integer string, and minimal binary representation of the model (for GA, GP/RF/GBRT and downhill, respectively).

genotype() → list

to_dict()

classmethod from_dict(src)

darwin.ModelCode

class darwin.ModelCode.ModelCode

Class for model code. It keeps track of whether a code is full binary, minimal binary, or integer, and converts between these representations.

classmethod from_full_binary(code: list, gene_max: list, length: list): Create ModelCode object from “full binary”

classmethod from_min_binary(code: list, gene_max: list, length: list): Create ModelCode object from “minimal binary”

classmethod from_int(code: list, gene_max: list, length: list): Create ModelCode object from integers
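To make the relationship between the integer and full binary forms concrete, here is a standalone sketch. It assumes that length gives the number of bits per gene (at least one) and gene_max the largest valid integer per gene, and that an out-of-range bit pattern wraps around with a modulo. The wrap-around rule and both function names are assumptions for illustration, not pyDarwin's documented behaviour, and the minimal binary form is not covered.

```python
def int_to_full_binary(int_code, length):
    """Encode each integer gene as a fixed-width bit field and concatenate the fields."""
    bits = []
    for value, n_bits in zip(int_code, length):
        bits.extend(int(b) for b in format(value, f"0{n_bits}b"))
    return bits


def full_binary_to_int(bits, gene_max, length):
    """Decode consecutive bit fields back to integers.

    Assumption: values above gene_max wrap around modulo (gene_max + 1).
    """
    ints, start = [], 0
    for max_value, n_bits in zip(gene_max, length):
        chunk = bits[start:start + n_bits]
        value = int("".join(str(b) for b in chunk), 2)
        ints.append(value % (max_value + 1))
        start += n_bits
    return ints


gene_max = [1, 2, 3]  # hypothetical: genes with 2, 3 and 4 options
length = [1, 2, 2]    # bits needed for each gene
bits = int_to_full_binary([1, 2, 0], length)
print(bits)                                        # [1, 1, 0, 0, 0]
print(full_binary_to_int(bits, gene_max, length))  # [1, 2, 0]
```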

darwin.ModelResults

class darwin.ModelResults.ModelResults

get_message_text() → str

to_dict()

classmethod from_dict(src)

calc_fitness(model: Model): Calculates the fitness, based on the model output and the penalties (from the options file).

darwin.ModelRun

class darwin.ModelRun.ModelRun(model: Model, model_num, generation, adapter: ModelEngineAdapter)

generation: int

The current generation/iteration.

Generation + model_num creates a unique “file_stem”

model_num: int

Model number within the generation.

Generation + model_num creates a unique “file_stem”

file_stem: string: Prefix string used to create unique names for the control file, executable, and run directory. Defined as stem_prefix + generation + model_num.
control_file_name: string: Name of the control file, will be file_stem + “.mod”
executable_file_name: string: Name of the executable, will be file_stem + “.exe”

run_dir: string: Path to the directory where the model is run; the run_dir name is based on the file_stem, which must be unique for each model in the search.

set_status(status: str)

init_stem(model_num, generation)

is_duplicate() → bool: Whether the run is a duplicate of another run in the same population.

started() → bool: Whether the run has been started.

to_dict(): Assembles what goes into the JSON file of saved models.

classmethod from_dict(src)

make_control_file(): Constructs the control file from the template and the model code.

check_files_present_impl()

run_command(cmd_count: int, command: dict) → bool

run_model(): Runs the model. The model is terminated if the timeout option (model_run_timeout) is exceeded. After the model is run, the post run R code and post run Python code (if used) are run, and the calc_fitness function is called to calculate the fitness/reward.
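The timeout behaviour can be illustrated with the standard library alone. This sketch is not how pyDarwin launches NONMEM. It only shows the general pattern of stopping a child process that runs longer than a limit, with the current Python interpreter standing in for the model executable and run_with_timeout as a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Return True if `command` exits with code 0 within the limit, False otherwise.

    subprocess.run kills the child process itself when the timeout expires.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds, capture_output=True)
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0


# Stand-ins for a model executable: one finishes at once, one runs too long.
quick = [sys.executable, "-c", "print('done')"]
slow = [sys.executable, "-c", "import time; time.sleep(10)"]
print(run_with_timeout(quick, 30))
print(run_with_timeout(slow, 1))
```

In a real search, a timed-out run would still need a fitness value so that the search can continue; how that value is chosen is outside this sketch.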

finish()

keep(): Keep all necessary files in a separate folder after the run completes.

cleanup(): Deletes all unneeded files after the run.

output_results(): Prints results to the output (.lst) file.

darwin.ModelRun.write_best_model_files(control_path: str, result_path: str) → bool

Saves the current best model control and output in control_path and result_path, respectively.

Parameters:

control_path – Path to current best model control file
result_path – Path to current best model result file

darwin.ModelRun.run_to_json(run: ModelRun, file: str)

darwin.ModelRun.json_to_run(file: str) → ModelRun

darwin.ModelRun.log_run(run: ModelRun)

darwin.Population

class darwin.Population.Population(template: Template, name, start_number=0, max_number=0, max_iteration=0)

Population of individuals (model runs).

__init__(template: Template, name, start_number=0, max_number=0, max_iteration=0)

Create an empty population.

Parameters:

name – Population name. Used as the generation for every ModelRun added to this population. If an integer is specified, the name is formatted according to max_iteration (padded with leading zeroes if needed). Otherwise, it is simply converted to a string.
start_number – Starting model number of this population.
max_number – Maximum model number of the entire iteration. Used for formatting the model number. Note that an iteration may contain multiple populations (see exhaustive search).
max_iteration – Maximum iteration number. Used for formatting the population name.
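The formatting rules for population names and model numbers, together with the file_stem of a ModelRun (stem_prefix + generation + model_num), can be sketched as follows. The padding width (the digit count of the maximum value) and the underscore separator are assumptions made for illustration, and both function names are hypothetical.

```python
def format_number(value, max_value):
    """Zero-pad `value` to the number of digits in `max_value` (assumed width rule)."""
    return str(value).zfill(len(str(max_value)))


def make_file_stem(stem_prefix, generation, model_num, max_iteration, max_number):
    """Build a unique stem from the prefix, generation and model number.

    Integer generations are padded; any other name is converted to a string.
    The '_' separator is an assumption, not taken from the documentation.
    """
    if isinstance(generation, int):
        gen = format_number(generation, max_iteration)
    else:
        gen = str(generation)
    return f"{stem_prefix}_{gen}_{format_number(model_num, max_number)}"


print(make_file_stem("NM", 3, 7, max_iteration=20, max_number=150))       # NM_03_007
print(make_file_stem("NM", "FN01", 7, max_iteration=20, max_number=150))  # NM_FN01_007
```

Padding keeps the stems the same length, so the generated files and run directories sort in run order.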

classmethod from_codes(template: Template, name, codes, code_converter, start_number=0, max_number=0, max_iteration=0): Create a new population from a set of codes.

add_model_run(code: ModelCode): Create a new ModelRun and append it to self.runs. If a ModelRun with the same code already exists in self.runs, the new one is marked as a duplicate and is not run. If the code is found in the cache, the ModelRun is restored from there and is not run.

get_best_run() → ModelRun: Get the best run (the run with the lowest fitness in the entire population).

get_best_runs(n: int) → list: Get the n best runs of the entire population.
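Since the best run is the one with the lowest fitness, selecting the n best runs is a partial sort. The sketch below uses a hypothetical RunStub dataclass in place of ModelRun and is not the library's implementation.

```python
import heapq
from dataclasses import dataclass


@dataclass
class RunStub:
    """Hypothetical stand-in for a model run: only a name and a fitness value."""
    file_stem: str
    fitness: float


def get_best_runs(runs, n):
    """Return the n runs with the lowest fitness, best first."""
    return heapq.nsmallest(n, runs, key=lambda run: run.fitness)


runs = [RunStub("NM_1_1", 120.5), RunStub("NM_1_2", 98.2), RunStub("NM_1_3", 101.7)]
print([run.file_stem for run in get_best_runs(runs, 2)])  # ['NM_1_2', 'NM_1_3']
```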

run(remaining_models=None): Run the population: pass all runs to the current run manager. There is no return value; the runs themselves are updated.

darwin.Population.init_pop_nums(template: Template)

darwin.Population.get_remaining_model_num(pop: Population)

Caching model runs

Duplicate models are commonly encountered during a search. To prevent unnecessary runs, all unique models from the current search are stored in the Model Cache.
Every new model is checked against the cache. If a match is found, the new model is replaced with the cached one, so the result is obtained instantly.
Currently, only an in-memory cache is available, so with large searches (millions of model runs) the memory footprint may be substantial.
Normally the cache is dumped at the end of every iteration or when you stop the search. This behaviour changes when saved_models_readonly is set to true (the dump is then skipped). You can also load the cache from a saved state.
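The check-then-run pattern described above can be sketched with a plain dictionary. Keying the cache by the model's genotype is an assumption, and DictCache, evaluate and fake_run are hypothetical names used only for this illustration.

```python
class DictCache:
    """Toy in-memory cache keyed by genotype (assumed key; not pyDarwin's class)."""

    def __init__(self):
        self._runs = {}

    def store(self, genotype, result):
        self._runs[tuple(genotype)] = result

    def find(self, genotype):
        return self._runs.get(tuple(genotype))


def evaluate(genotype, cache, run_model):
    """Return (result, from_cache); the model is run only on a cache miss."""
    cached = cache.find(genotype)
    if cached is not None:
        return cached, True
    result = run_model(genotype)
    cache.store(genotype, result)
    return result, False


calls = []


def fake_run(genotype):
    """Stand-in for an expensive model run; records each real call."""
    calls.append(tuple(genotype))
    return float(sum(genotype))


cache = DictCache()
print(evaluate([0, 2, 1], cache, fake_run))  # (3.0, False)
print(evaluate([0, 2, 1], cache, fake_run))  # (3.0, True)
print(len(calls))                            # 1
```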

darwin.ModelCache

class darwin.ModelCache.ModelCache

Abstract Model Cache. Describes the interface that must be implemented by every Model Cache.

abstract store_model_run(run: ModelRun): Store a run.

abstract find_model_run(**kwargs) → ModelRun: Find a run by parameters. Actual parameters depend on implementation.

abstract load(): Load the cache from a saved state.

abstract dump(): Dump the cache to a saved state.

finalize(): Finalize all ongoing activities.

darwin.ModelCache.set_model_cache(cache): Set the current Model Cache instance. Intended to be called once at startup.

darwin.ModelCache.get_model_cache(): Get the current Model Cache instance.

darwin.ModelCache.register_model_cache(cache_name, mc_class): Register a Model Cache class. cache_name is arbitrary but must be unique among all registered caches.

darwin.ModelCache.create_model_cache(cache_name) → ModelCache: Create a Model Cache instance. The cache class must be registered under cache_name.
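The register/create/set/get pattern listed above is a small class registry plus a module-level current instance. The following standalone sketch mirrors that pattern with hypothetical names (CacheBase, ToyCache, register_cache and so on). It does not reproduce pyDarwin's code, and raising ValueError on a duplicate name is an assumption.

```python
from abc import ABC, abstractmethod


class CacheBase(ABC):
    """Hypothetical abstract interface with the same method names as listed above."""

    @abstractmethod
    def store_model_run(self, run): ...

    @abstractmethod
    def find_model_run(self, **kwargs): ...

    @abstractmethod
    def load(self): ...

    @abstractmethod
    def dump(self): ...

    def finalize(self):
        """Nothing to finalize by default."""


class ToyCache(CacheBase):
    """Minimal concrete cache; assumes each run is a dict with a 'genotype' key."""

    def __init__(self):
        self.runs = {}

    def store_model_run(self, run):
        self.runs[tuple(run["genotype"])] = run

    def find_model_run(self, **kwargs):
        return self.runs.get(tuple(kwargs["genotype"]))

    def load(self):
        pass  # a real cache would read a saved state here

    def dump(self):
        pass  # a real cache would write a saved state here


_registry = {}
_current = None


def register_cache(cache_name, cache_class):
    if cache_name in _registry:  # assumed behaviour for duplicate names
        raise ValueError(f"cache name already registered: {cache_name}")
    _registry[cache_name] = cache_class


def create_cache(cache_name):
    return _registry[cache_name]()


def set_cache(cache):
    global _current
    _current = cache


def get_cache():
    return _current


register_cache("toy", ToyCache)
set_cache(create_cache("toy"))
get_cache().store_model_run({"genotype": [0, 1], "fitness": 12.3})
print(get_cache().find_model_run(genotype=[0, 1]))
```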

darwin.MemoryModelCache

class darwin.MemoryModelCache.MemoryModelCache

Bases: ModelCache

Simple Model Cache that stores model runs in a dictionary. Default option for pyDarwin.

store_model_run(run: ModelRun): Store a run.

find_model_run(**kwargs) → ModelRun: Find a run by parameters. Actual parameters depend on implementation.

load(): Load the cache from saved_models_file.

dump(): Save cached runs to file.

Does nothing if saved_models_readonly is set to true.

class darwin.MemoryModelCache.AsyncMemoryModelCache

Bases: MemoryModelCache

Non-blocking MemoryModelCache.

Dumps model runs in a separate thread so that the dump call doesn’t block the search execution.

finalize(): Finish the working thread and dump any unsaved model runs.

dump(): Signal the working thread that a dump was requested.
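A non-blocking dump of this kind can be built from a worker thread and an event. The sketch below is an assumption about the general technique, not pyDarwin's implementation: AsyncDumper is a hypothetical class, and "saving" copies the runs into a second dictionary instead of writing a file.

```python
import threading


class AsyncDumper:
    """Toy cache whose dump() only signals a worker thread (hypothetical class)."""

    def __init__(self):
        self._runs = {}
        self.saved = {}  # stand-in for the file a real cache would write
        self._lock = threading.Lock()
        self._dump_requested = threading.Event()
        self._stop = False
        self._worker = threading.Thread(target=self._work, daemon=True)
        self._worker.start()

    def store(self, key, run):
        with self._lock:
            self._runs[key] = run

    def dump(self):
        """Return immediately; the worker thread performs the save."""
        self._dump_requested.set()

    def _save(self):
        with self._lock:
            self.saved = dict(self._runs)

    def _work(self):
        while True:
            self._dump_requested.wait()
            self._dump_requested.clear()
            if self._stop:
                return
            self._save()

    def finalize(self):
        """Stop the worker, then save whatever has not been saved yet."""
        self._stop = True
        self._dump_requested.set()
        self._worker.join()
        self._save()


dumper = AsyncDumper()
dumper.store("a", 1)
dumper.dump()  # does not block the caller
dumper.store("b", 2)
dumper.finalize()
print(dumper.saved)  # {'a': 1, 'b': 2}
```

The final save in finalize is what guarantees that runs stored after the last dump request are not lost.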

darwin.MemoryModelCache.register(): Register MemoryModelCache and AsyncMemoryModelCache.