API¶
High level functions¶
run_search¶
- darwin.run_search.run_search(template_file: str, tokens_file: str, options_file: str) ModelRun ¶
The run_search function runs the algorithm selected in options_file, using template_file and tokens_file. When the search finishes, it writes the best control file and output file to project_dir (specified in options_file). The options_file path should, in general, be absolute; the other file names can be absolute paths or paths relative to project_dir. Returns the final model object.
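The path rule above can be sketched as follows. This is a minimal illustration, not part of pyDarwin: the helper names resolve_inputs and launch are hypothetical, and project_dir is passed in explicitly here rather than read from the options file. Only the run_search(template_file, tokens_file, options_file) call itself comes from the signature above.

```python
from pathlib import Path


def resolve_inputs(project_dir, template_file, tokens_file, options_file):
    """Return (template, tokens, options) as absolute path strings.

    Hypothetical helper: relative template/tokens paths are resolved
    against project_dir; the options path is made absolute against the cwd.
    """
    project = Path(project_dir).resolve()
    options = Path(options_file).resolve()

    def against_project(name):
        path = Path(name)
        return str(path if path.is_absolute() else project / path)

    return against_project(template_file), against_project(tokens_file), str(options)


def launch(project_dir, template_file, tokens_file, options_file):
    """Resolve the paths, then hand them to pyDarwin (imported lazily)."""
    from darwin.run_search import run_search  # requires pyDarwin to be installed

    template, tokens, options = resolve_inputs(
        project_dir, template_file, tokens_file, options_file
    )
    return run_search(template, tokens, options)
```

Since run_search already accepts paths relative to project_dir, resolving them up front is optional; it mainly makes log messages and error reports unambiguous.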
run_search_in_folder¶
Algorithms¶
exhaustive¶
GA¶
- darwin.algorithms.GA.run_ga(model_template: Template) ModelRun ¶
Run the Genetic Algorithm (GA) search, using the DEAP (https://github.com/deap/deap) package. The template object includes the control file template and all the token groups.
OPT¶
downhill¶
- darwin.algorithms.run_downhill.run_downhill(template: Template, pop: Population, return_all: bool = False) list ¶
Run the downhill step, with full (2 bit) search if requested. Finds N <= num_niches niches in pop and replaces the N worst models in pop with the best models from the niches. If return_all is true, returns a list of ALL models.
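The replacement step described above can be sketched in isolation. This is a toy illustration under stated assumptions, not pyDarwin's implementation: replace_worst is a hypothetical name, models are plain (name, fitness) tuples, lower fitness is taken to be better, and how niches are found or searched is not shown.

```python
def replace_worst(population, niche_bests):
    """Replace the N worst entries of population with N niche winners.

    population  : list of (name, fitness) tuples; lower fitness is better.
    niche_bests : list of (name, fitness) tuples, one per niche (N items).
    """
    n = len(niche_bests)
    if n > len(population):
        raise ValueError("more niche winners than population members")
    # Sort best-first, drop the n worst, then append the niche winners.
    survivors = sorted(population, key=lambda run: run[1])[: len(population) - n]
    return survivors + list(niche_bests)
```

The population size stays constant, which is what lets the result feed straight into the next generation.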
Model classes¶
darwin.Template¶
- class darwin.Template.Template(template_file: str, tokens_file: str)¶
The Template object contains information common to all the model objects, including the template code (from the template file) and the tokens set. It DOES NOT include any model specific information, such as the phenotype, the control file text, or any of the output results from NONMEM. Other housekeeping functions are performed, such as defining the gene structure (by counting the number of token groups for each token set), parsing out the THETA/OMEGA/SIGMA blocks, and counting the number of fixed/non-searched THETAs/OMEGAs/SIGMAs.
- Parameters:
template_file (str) – Path to the plain ASCII text template file
tokens_file (str) – Path to the JSON tokens file
- get_search_space_coordinates() list ¶
darwin.Model¶
- class darwin.Model.Model(code: ModelCode)¶
The full model, used for GA, GP, RF, GBRT and exhaustive search.
Model instantiation takes a template as an argument, along with the model code, model number, and generation. Functions include constructing the control file, executing the control file, and calculating the fitness/reward.
- model_code: a ModelCode object
Contains the bit string/integer string representation of the model. Includes the full binary string, integer string, and minimal binary representation of the model (for GA, GP/RF/GBRT and downhill, respectively).
- genotype() list ¶
- to_dict()¶
- classmethod from_dict(src)¶
darwin.ModelCode¶
- class darwin.ModelCode.ModelCode¶
Class for a model code. It keeps track of whether the code is full binary, minimal binary, or integer, and converts between these representations.
- classmethod from_full_binary(code: list, gene_max: list, length: list)¶
Create ModelCode object from “full binary”
- classmethod from_min_binary(code: list, gene_max: list, length: list)¶
Create ModelCode object from “minimal binary”
- classmethod from_int(code: list, gene_max: list, length: list)¶
Create ModelCode object from integers
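To make the idea of interconverting representations concrete, here is a minimal sketch of a round trip between an integer code and a fixed-width bit string. It is an assumption-laden illustration only: int_to_bits and bits_to_int are hypothetical helpers, the big-endian fixed-width layout is assumed, gene_max is not used, and the sketch does not attempt to reproduce pyDarwin's actual full-binary or minimal-binary encodings.

```python
def int_to_bits(code, length):
    """Encode each integer gene as a fixed-width, big-endian bit group."""
    bits = []
    for value, width in zip(code, length):
        if value < 0 or value >= 2 ** width:
            raise ValueError(f"{value} does not fit in {width} bits")
        bits.extend((value >> shift) & 1 for shift in range(width - 1, -1, -1))
    return bits


def bits_to_int(bits, length):
    """Decode fixed-width bit groups back into integer genes."""
    if len(bits) != sum(length):
        raise ValueError("bit string length does not match gene lengths")
    code, start = [], 0
    for width in length:
        value = 0
        for bit in bits[start:start + width]:
            value = (value << 1) | bit
        code.append(value)
        start += width
    return code
```

Here length plays the same role as the length argument in the classmethods above: one bit width per gene.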
darwin.ModelResults¶
darwin.ModelRun¶
- class darwin.ModelRun.ModelRun(model: Model, model_num, generation, adapter: ModelEngineAdapter)¶
- generation: int
The current generation/iteration.
Generation + model_num creates a unique “file_stem”
- model_num: int
Model number within the generation.
Generation + model_num creates a unique “file_stem”
- file_stem: string
Prefix string used to create unique names for control files, executable, and run directory. Defined as stem_prefix + generation + model_num.
- control_file_name: string
Name of the control file, will be file_stem + “.mod”
- executable_file_name: string
Name of the executable, will be file_stem + “.exe”
- run_dir: string
Path to the directory where the model is run; run_dir name is based on the file_stem, which must be unique for each model in the search.
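The naming scheme described by these attributes can be sketched as below. The function make_names and the underscore separator are assumptions made for illustration; the source only states that file_stem is built from stem_prefix, generation, and model_num, and that the control file and executable add ".mod" and ".exe".

```python
def make_names(stem_prefix, generation, model_num, sep="_"):
    """Build file_stem and the names derived from it.

    The separator is an assumption; only the ordering of the three parts
    and the two file extensions come from the attribute descriptions.
    """
    file_stem = f"{stem_prefix}{sep}{generation}{sep}{model_num}"
    return {
        "file_stem": file_stem,
        "control_file_name": file_stem + ".mod",
        "executable_file_name": file_stem + ".exe",
    }
```

Because generation plus model_num is unique within a search, every name derived from file_stem is unique as well.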
- set_status(status: str)¶
- init_stem(model_num, generation)¶
- is_duplicate() bool ¶
Whether the run is a duplicate of another run in the same population.
- started() bool ¶
Whether the run has been started.
- to_dict()¶
Assembles what goes into the JSON file of saved models.
- classmethod from_dict(src)¶
- make_control_file()¶
Constructs the control file from the template and the model code.
- check_files_present_impl()¶
- run_command(cmd_count: int, command: dict) bool ¶
- run_model()¶
Runs the model. Terminates the model if it exceeds the timeout option (model_run_timeout). After the model has run, the post-run R code and post-run Python code (if used) are executed, and the calc_fitness function is called to calculate the fitness/reward.
- finish()¶
- keep()¶
Keep all necessary files in a separate folder after run completes.
- cleanup()¶
Deletes all unneeded files after run.
- output_results()¶
Prints results to output (.lst) file.
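The timeout behaviour described for run_model can be illustrated with the standard library. This is a generic sketch, not pyDarwin's code: run_with_timeout is a hypothetical name, and the command and timeout value are supplied by the caller rather than read from the options.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return (finished, returncode).

    On timeout the child process is terminated and (False, None) is returned.
    """
    try:
        completed = subprocess.run(
            command, timeout=timeout_seconds, capture_output=True
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        return False, None
    return True, completed.returncode
```

A caller would typically treat the (False, None) case as a failed run and assign it a penalty fitness instead of parsing output files.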
- darwin.ModelRun.write_best_model_files(control_path: str, result_path: str) bool ¶
Saves the current best model control and output in control_path and result_path, respectively.
- Parameters:
control_path – Path to current best model control file
result_path – Path to current best model result file
darwin.Population¶
- class darwin.Population.Population(template: Template, name, start_number=0, max_number=0, max_iteration=0)¶
Population of individuals (model runs).
- __init__(template: Template, name, start_number=0, max_number=0, max_iteration=0)¶
Create an empty population.
- Parameters:
name – Population name. Used as the generation for every ModelRun added to this population. If an integer is specified, the name is formatted according to max_iteration (padded with leading zeros if needed). Otherwise, it is simply converted to a string.
start_number – Starting model number of this population.
max_number – Maximum model number in the entire iteration. Used for formatting the model number. Note that an iteration may contain multiple populations (see exhaustive search).
max_iteration – Maximum iteration number. Used for formatting population name.
- classmethod from_codes(template: Template, name, codes, code_converter, start_number=0, max_number=0, max_iteration=0)¶
Create a new population from a set of codes.
- add_model_run(code: ModelCode)¶
Create a new ModelRun and append it to self.runs. If a ModelRun with the same code already exists in self.runs, the new one is marked as a duplicate and is not run. If the code is found in the cache, the ModelRun is restored from there and is not run.
- get_best_run() ModelRun ¶
Get the best run (the run with the lowest fitness in the entire population).
- get_best_runs(n: int) list ¶
Get the n best runs of the entire population.
- run(remaining_models=None)¶
Run the population by passing all runs to the current run manager. There is no return value; the runs are updated in place.
- darwin.Population.get_remaining_model_num(pop: Population)¶
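The name formatting, duplicate handling, cache lookup, and best-run selection described for Population can be sketched with a toy class. TinyPopulation is a hypothetical stand-in, not the pyDarwin class: runs are plain dictionaries, the cache is a dict mapping a code to a fitness, and the status strings are invented for illustration.

```python
import heapq


class TinyPopulation:
    """Toy stand-in for a population; not the pyDarwin class."""

    def __init__(self, name, max_iteration=0, cache=None):
        # Integer names are zero-padded to the width of max_iteration.
        if isinstance(name, int):
            name = str(name).zfill(len(str(max_iteration)))
        self.name = str(name)
        self.cache = cache if cache is not None else {}  # code -> fitness
        self.runs = []

    def add_model_run(self, code):
        key = tuple(code)
        run = {"code": key, "fitness": None, "status": "Not Started"}
        if any(r["code"] == key for r in self.runs):
            run["status"] = "Duplicate"  # same code already in this population
        elif key in self.cache:
            run["fitness"] = self.cache[key]  # restored, so it need not be run
            run["status"] = "Restored"
        self.runs.append(run)
        return run

    def get_best_runs(self, n):
        # Lower fitness is better; runs without a fitness are ignored.
        scored = [r for r in self.runs if r["fitness"] is not None]
        return heapq.nsmallest(n, scored, key=lambda r: r["fitness"])
```

Checking the population before the cache mirrors the order in which the two cases are described above; whether pyDarwin checks them in that order is an assumption.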
Caching model runs¶
darwin.ModelCache¶
- class darwin.ModelCache.ModelCache¶
Abstract Model Cache. Describes the interface that must be implemented by every Model Cache.
- abstract find_model_run(**kwargs) ModelRun ¶
Find a run by parameters. Actual parameters depend on implementation.
- abstract load()¶
Load the cache from a saved state.
- abstract dump()¶
Dump the cache to a saved state.
- finalize()¶
Finalize all ongoing activities.
- darwin.ModelCache.set_model_cache(cache)¶
Set the current Model Cache instance. Intended to be called once at startup.
- darwin.ModelCache.get_model_cache()¶
Get current Model Cache instance.
- darwin.ModelCache.register_model_cache(cache_name, mc_class)¶
Register a Model Cache class. cache_name is arbitrary but must be unique among all registered caches.
- darwin.ModelCache.create_model_cache(cache_name) ModelCache ¶
Create Model Cache instance. The cache class must be registered under cache_name.
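The register/create/set/get pattern described above can be sketched with a small registry. Everything here is a self-contained stand-in written for illustration, not pyDarwin's implementation: BaseCache and DictCache are hypothetical classes, the genotype keyword is an invented lookup parameter, and raising ValueError on a repeated name is an assumption about how uniqueness might be enforced.

```python
from abc import ABC, abstractmethod

_registry = {}   # cache_name -> cache class
_current = None  # the cache instance in use


class BaseCache(ABC):
    """Interface every cache must implement (hypothetical stand-in)."""

    @abstractmethod
    def find_model_run(self, **kwargs): ...

    @abstractmethod
    def load(self): ...

    @abstractmethod
    def dump(self): ...

    def finalize(self):
        pass  # optional hook; nothing to do by default


class DictCache(BaseCache):
    """Keeps runs in a dictionary; load and dump are no-ops in this sketch."""

    def __init__(self):
        self.runs = {}

    def find_model_run(self, **kwargs):
        return self.runs.get(kwargs.get("genotype"))

    def load(self):
        pass

    def dump(self):
        pass


def register_model_cache(cache_name, mc_class):
    if cache_name in _registry:
        raise ValueError(f"cache name already registered: {cache_name}")
    _registry[cache_name] = mc_class


def create_model_cache(cache_name):
    return _registry[cache_name]()


def set_model_cache(cache):
    global _current
    _current = cache


def get_model_cache():
    return _current
```

Separating registration from creation lets the cache implementation be chosen by name from a configuration value without the calling code importing the concrete class.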
darwin.MemoryModelCache¶
- class darwin.MemoryModelCache.MemoryModelCache¶
Bases: ModelCache
Simple Model Cache that stores model runs in a dictionary. Default option for pyDarwin.
- find_model_run(**kwargs) ModelRun ¶
Find a run by parameters. Actual parameters depend on implementation.
- load()¶
Load the cache from saved_models_file.
- dump()¶
Save cached runs to file. Does nothing if saved_models_readonly is set to true.
- class darwin.MemoryModelCache.AsyncMemoryModelCache¶
Bases: MemoryModelCache
Non-blocking MemoryModelCache. Dumps model runs in a separate thread so the dump call doesn’t block the search execution.
- finalize()¶
Finish the working thread and dump any unsaved model runs.
- dump()¶
Signal the working thread that a dump was requested.
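The non-blocking dump pattern described for AsyncMemoryModelCache can be sketched with a worker thread and an event. BackgroundDumper is a hypothetical class written for illustration, not pyDarwin's implementation; it assumes the cached data is JSON-serializable and written to a single file.

```python
import json
import threading


class BackgroundDumper:
    """Writes a dict to disk from a worker thread when a dump is requested."""

    def __init__(self, path):
        self.path = path
        self.runs = {}
        self._lock = threading.Lock()
        self._requested = threading.Event()
        self._stopping = False
        self._worker = threading.Thread(target=self._work, daemon=True)
        self._worker.start()

    def add(self, key, value):
        with self._lock:
            self.runs[key] = value

    def dump(self):
        """Non-blocking: only signal the worker that a dump is wanted."""
        self._requested.set()

    def finalize(self):
        """Stop the worker, then write whatever is still unsaved."""
        self._stopping = True
        self._requested.set()
        self._worker.join()
        self._write()

    def _work(self):
        while True:
            self._requested.wait()
            self._requested.clear()
            if self._stopping:
                return
            self._write()

    def _write(self):
        # Copy under the lock so the file write itself does not block add().
        with self._lock:
            snapshot = dict(self.runs)
        with open(self.path, "w") as handle:
            json.dump(snapshot, handle)
```

Several dump requests that arrive while a write is in progress collapse into a single follow-up write, and the final write in finalize guarantees that nothing added before shutdown is lost.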
- darwin.MemoryModelCache.register()¶