API

High level functions

run_search_in_folder

darwin.run_search_in_folder.run_search_in_folder(folder: str, template_file: str = 'template.txt', tokens_file: str = 'tokens.json', options_file: str = 'options.json') → ModelRun
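
A minimal sketch of a pre-flight check before calling run_search_in_folder. It assumes that the three file names are resolved relative to folder, which the signature suggests but does not state. missing_search_files and run_if_ready are hypothetical helpers, not part of pyDarwin.

```python
from pathlib import Path

# Default file names taken from the run_search_in_folder signature.
DEFAULT_FILES = {
    "template_file": "template.txt",
    "tokens_file": "tokens.json",
    "options_file": "options.json",
}


def missing_search_files(folder, **overrides):
    """Return the input file names that are not present in folder.

    Assumption: file names are resolved relative to folder.
    """
    names = {**DEFAULT_FILES, **overrides}
    return [n for n in names.values() if not (Path(folder) / n).is_file()]


def run_if_ready(folder, **overrides):
    """Hypothetical wrapper: start the search only if all inputs exist."""
    missing = missing_search_files(folder, **overrides)
    if missing:
        raise FileNotFoundError(f"Missing in {folder}: {', '.join(missing)}")
    # Imported lazily so the check above works without pyDarwin installed.
    from darwin.run_search_in_folder import run_search_in_folder
    return run_search_in_folder(folder, **overrides)
```

Calling run_if_ready('my_project') would then either raise with the list of missing files or hand over to run_search_in_folder with the default file names.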

Algorithms

exhaustive

darwin.algorithms.exhaustive.run_exhaustive(model_template: Template) → ModelRun

Run a full exhaustive search on the Template, evaluating all possible combinations. All models will be run in iteration number 0.

Parameters:

model_template (Template) – Model Template

Returns:

Returns final/best model run

Return type:

ModelRun

darwin.algorithms.exhaustive.get_search_space(template: Template) → ndarray
darwin.algorithms.exhaustive.get_search_space_size(template: Template) → int
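
To illustrate what "all possible combinations" means, the sketch below enumerates a search space as the Cartesian product of the options in each token group. It does not call pyDarwin. The function names and the options_per_group argument (a list holding the number of options in each token group) are assumptions made for the example, and the result is a plain list of tuples rather than an ndarray.

```python
from itertools import product
from math import prod


def search_space_size(options_per_group):
    """Number of distinct models: the product of the option counts."""
    return prod(options_per_group)


def search_space(options_per_group):
    """Every combination of option indices, one tuple per candidate model."""
    return list(product(*(range(n) for n in options_per_group)))


# Three hypothetical token groups with 2, 3 and 2 options.
space = search_space([2, 3, 2])
print(search_space_size([2, 3, 2]), space[0], space[-1])
```

Because the size grows multiplicatively with every added token group, checking the size first is a cheap way to decide whether an exhaustive search is feasible.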

GA

darwin.algorithms.GA.run_ga(model_template: Template) → ModelRun

Run the Genetic Algorithm (GA) search, using the DEAP (https://github.com/deap/deap) package. The template object includes the control file template and all the token groups.

Parameters:

model_template (Template) – Template object for the search

Returns:

The single best model from the search

Return type:

ModelRun

OPT

darwin.algorithms.OPT.run_skopt(model_template: Template) → ModelRun

Run one of the scikit-optimize algorithms (GP, RF, GBRT). See https://scikit-optimize.github.io/stable/.

Parameters:

model_template (Template) – Model template to be run

Returns:

The best model from search

Return type:

ModelRun

downhill

darwin.algorithms.run_downhill.run_downhill(template: Template, pop: Population, return_all: bool = False) → list

Run the downhill step, with full (2 bit) search if requested. Finds N <= num_niches niches in pop and replaces the N worst models in pop with the best models from the niches. If return_all is true, returns a list of all models.

Model classes

darwin.Template

class darwin.Template.Template(template_file: str, tokens_file: str)

The Template object contains information common to all the model objects, including the template code (from the template file) and the tokens set. It DOES NOT include any model specific information, such as the phenotype, the control file text, or any of the output results from NONMEM. Other housekeeping functions are performed, such as defining the gene structure (by counting the number of token groups for each token set), parsing out the THETA/OMEGA/SIGMA blocks, and counting the number of fixed/non-searched THETAs/OMEGAs/SIGMAs.

Parameters:
  • template_file (str) – Path to the plain ASCII text template file

  • tokens_file (str) – Path to the JSON tokens file

get_search_space_coordinates() → list

darwin.Model

class darwin.Model.Model(code: ModelCode)

The full model, used for GA, GP, RF, GBRT and exhaustive search.

Model instantiation takes a template as an argument, along with the model code, model number, and generation. Functions include constructing the control file, executing the control file, and calculating the fitness/reward.

model_code: a ModelCode object

Contains the bit string/integer string representation of the model. Includes the full binary string, integer string, and minimal binary representation of the model (for GA, GP/RF/GBRT and downhill, respectively).

genotype() → list
to_dict()
classmethod from_dict(src)

darwin.ModelCode

class darwin.ModelCode.ModelCode

Class for model code, just to keep straight whether this is full binary, minimal binary, or integer and to interconvert between them.

classmethod from_full_binary(code: list, gene_max: list, length: list)

Create ModelCode object from “full binary”

classmethod from_min_binary(code: list, gene_max: list, length: list)

Create ModelCode object from “minimal binary”

classmethod from_int(code: list, gene_max: list, length: list)

Create ModelCode object from integers
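
The interconversion between integer and binary codes can be pictured with the sketch below. It is not the ModelCode implementation. It assumes that length holds the number of bits per gene and that bits are stored most significant first. How ModelCode maps bit patterns that exceed gene_max is not described here, so the sketch leaves gene_max out entirely. ints_to_bits and bits_to_ints are hypothetical names.

```python
def ints_to_bits(code, length):
    """Pack each integer gene into a fixed-width bit field (MSB first)."""
    bits = []
    for value, width in zip(code, length):
        if not 0 <= value < 2 ** width:
            raise ValueError(f"{value} does not fit in {width} bits")
        bits.extend((value >> shift) & 1 for shift in range(width - 1, -1, -1))
    return bits


def bits_to_ints(bits, length):
    """Split a flat bit list into genes and decode each one to an integer."""
    if len(bits) != sum(length):
        raise ValueError("bit string length does not match gene lengths")
    ints, start = [], 0
    for width in length:
        gene = bits[start:start + width]
        # A zero-width gene has a single possible value, 0.
        ints.append(int("".join(map(str, gene)), 2) if gene else 0)
        start += width
    return ints


# Three hypothetical genes using 2, 2 and 1 bits.
bits = ints_to_bits([1, 2, 0], [2, 2, 1])
print(bits, bits_to_ints(bits, [2, 2, 1]))
```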

darwin.ModelResults

class darwin.ModelResults.ModelResults
get_message_text() → str
to_dict()
classmethod from_dict(src)
calc_fitness(model: Model)

Calculates the fitness, based on the model output, and the penalties (from the options file).
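
The exact formula used by calc_fitness is not given in this reference. As a rough illustration only, the sketch below assumes an additive scheme: a base objective value plus one penalty for every check the model failed. The function name, the argument names, and the penalty keys are all hypothetical.

```python
def fitness_sketch(objective_value, failed_checks, penalties):
    """Assumed additive fitness: objective value plus applicable penalties.

    objective_value: value reported by the model run (lower is better)
    failed_checks:   names of checks the run did not pass
    penalties:       mapping of check name to penalty, as might be read
                     from an options file
    """
    return objective_value + sum(penalties[name] for name in failed_checks)


# Hypothetical penalty values; not pyDarwin defaults.
example_penalties = {"convergence": 100, "covariance": 100}
print(fitness_sketch(1234.5, ["covariance"], example_penalties))
```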

darwin.ModelRun

class darwin.ModelRun.ModelRun(model: Model, model_num, generation, adapter: ModelEngineAdapter)
generation: int

The current generation/iteration.

Generation + model_num creates a unique “file_stem”

model_num: int

Model number within the generation.

Generation + model_num creates a unique “file_stem”

file_stem: string

Prefix string used to create unique names for control files, executable, and run directory. Defined as stem_prefix + generation + model_num.

control_file_name: string

Name of the control file: file_stem + “.mod”.

executable_file_name: string

Name of the executable: file_stem + “.exe”.

run_dir: string

Path to the directory where the model is run; run_dir name is based on the file_stem, which must be unique for each model in the search.
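
The naming attributes above can be sketched as follows. The helper run_names is hypothetical. The underscore separator and the choice to name the run directory after file_stem are assumptions, since the text only says that the parts are combined and that run_dir is based on file_stem.

```python
from pathlib import Path


def run_names(stem_prefix, generation, model_num, parent_dir, sep="_"):
    """Derive per-run names from prefix, generation and model number.

    Assumptions: parts are joined with sep, and the run directory is
    named after file_stem inside parent_dir.
    """
    file_stem = sep.join([stem_prefix, str(generation), str(model_num)])
    return {
        "file_stem": file_stem,
        "control_file_name": file_stem + ".mod",
        "executable_file_name": file_stem + ".exe",
        "run_dir": str(Path(parent_dir) / file_stem),
    }


names = run_names("NM", "01", "005", "work")
print(names["control_file_name"], names["executable_file_name"])
```

Because file_stem combines generation and model number, two runs in the same search never share a control file, executable, or run directory.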

set_status(status: str)
init_stem(model_num, generation)
is_duplicate() → bool

Whether the run is a duplicate of another run in the same population.

started() → bool

Whether the run has been started.

to_dict()

Assembles what goes into the JSON file of saved models.

classmethod from_dict(src)
make_control_file()

Constructs the control file from the template and the model code.

check_files_present_impl()
run_command(cmd_count: int, command: dict) → bool
run_model()

Runs the model and terminates it if the timeout option (model_run_timeout) is exceeded. After the model has run, the post-run R code and post-run Python code (if used) are executed, and the calc_fitness function is called to calculate the fitness/reward.
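
The timeout behaviour can be illustrated with the standard library alone. The sketch below is not the run_model implementation. It shows one common way to stop an external process that exceeds a time limit, using subprocess.run with its timeout argument. run_with_timeout is a hypothetical name.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return True on success, False on failure or timeout.

    subprocess.run kills the child process and raises TimeoutExpired
    when the time limit is exceeded.
    """
    try:
        completed = subprocess.run(
            command, timeout=timeout_seconds, capture_output=True
        )
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0


# A trivial command standing in for a model run.
print(run_with_timeout([sys.executable, "-c", "pass"], 60))
```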

finish()
keep()

Keep all necessary files in a separate folder after run completes.

cleanup()

Deletes all unneeded files after run.

output_results()

Prints results to output (.lst) file.

darwin.ModelRun.write_best_model_files(control_path: str, result_path: str) → bool

Saves the current best model control and output in control_path and result_path, respectively.

Parameters:
  • control_path – Path to current best model control file

  • result_path – Path to current best model result file

darwin.ModelRun.run_to_json(run: ModelRun, file: str)
darwin.ModelRun.json_to_run(file: str) → ModelRun
darwin.ModelRun.log_run(run: ModelRun)

darwin.Population

class darwin.Population.Population(template: Template, name, start_number=0, max_number=0, max_iteration=0)

Population of individuals (model runs).

__init__(template: Template, name, start_number=0, max_number=0, max_iteration=0)

Create an empty population.

Parameters:
  • name – Population name. Will be used as the generation for every ModelRun added to this population. If an integer is specified, the name is formatted according to max_iteration (padded with leading zeros if needed). Otherwise, it is simply converted to a string.

  • start_number – Starting model number of this population.

  • max_number – Maximum model number in the entire iteration. Used for formatting the model number. Note that an iteration may contain multiple populations (see exhaustive search).

  • max_iteration – Maximum iteration number. Used for formatting population name.
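
The name formatting described for the name parameter can be sketched as below. The helper is hypothetical, and padding to the number of digits in max_iteration is an assumption about what formatting relative to max_iteration means.

```python
def format_population_name(name, max_iteration):
    """Zero-pad integer names to the width of max_iteration.

    Non-integer names are converted to a string unchanged.
    """
    if isinstance(name, int):
        return str(name).zfill(len(str(max_iteration)))
    return str(name)


print(format_population_name(3, 100))       # integer name, padded
print(format_population_name("FN01", 100))  # string name, unchanged
```

Padding keeps names such as generation labels the same width, so they sort correctly as text.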

classmethod from_codes(template: Template, name, codes, code_converter, start_number=0, max_number=0, max_iteration=0)

Create a new population from a set of codes.

add_model_run(code: ModelCode)

Create a new ModelRun and append it to self.runs. If a ModelRun with such code already exists in self.runs, the new one will be marked as a duplicate and will not be run. If the code is found in the cache, ModelRun will be restored from there and will not be run.

get_best_run() → ModelRun

Get the best run (the run with the lowest fitness in the entire population).

get_best_runs(n: int) → list

Get the n best runs of the entire population.
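
Since the best run is the one with the lowest fitness, selecting the best runs amounts to taking the smallest fitness values. The sketch below uses a stand-in RunStub class rather than ModelRun. Both the class and the helper functions are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass
class RunStub:
    """Stand-in for a model run; only the fields needed for ranking."""
    file_stem: str
    fitness: float


def best_run(runs):
    """The run with the lowest fitness."""
    return min(runs, key=lambda r: r.fitness)


def best_runs(runs, n):
    """The n runs with the lowest fitness, best first."""
    return heapq.nsmallest(n, runs, key=lambda r: r.fitness)


runs = [RunStub("a", 105.2), RunStub("b", 98.7), RunStub("c", 101.0)]
print(best_run(runs).file_stem, [r.file_stem for r in best_runs(runs, 2)])
```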

run(remaining_models=None)

Run the population - pass all runs to current run manager. There is no return value, the runs are just updated.

darwin.Population.init_pop_nums(template: Template)
darwin.Population.get_remaining_model_num(pop: Population)

Caching model runs

Duplicate models are commonly encountered during a search. To prevent unnecessary runs, all unique models from the current search are stored in the Model Cache.
Every new model is checked against the cache. If a match is found, the new model is replaced with the cached one, so the result is obtained immediately.
Currently, only an in-memory cache is available, so for large searches (millions of model runs) the memory footprint may be substantial.
Normally the cache is dumped at the end of every iteration or when you stop the search. If saved_models_readonly is set to true, the cache is not dumped. You can also load the cache from a saved state.
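
The check-before-run idea can be sketched with a plain dictionary. This is not the pyDarwin cache. DictCacheSketch and evaluate are hypothetical names, and keying the cache by the genotype tuple is an assumption made for the example.

```python
class DictCacheSketch:
    """Toy in-memory cache keyed by the model's genotype."""

    def __init__(self):
        self._runs = {}

    def store(self, genotype, result):
        self._runs[tuple(genotype)] = result

    def find(self, genotype):
        # Returns None when the genotype has not been seen before.
        return self._runs.get(tuple(genotype))


def evaluate(genotype, cache, run_model):
    """Return the cached result if present; otherwise run and store it."""
    cached = cache.find(genotype)
    if cached is not None:
        return cached
    result = run_model(genotype)
    cache.store(genotype, result)
    return result
```

With this pattern, a genotype that reappears in a later generation costs a dictionary lookup instead of a full model run, at the price of keeping every result in memory.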

darwin.ModelCache

class darwin.ModelCache.ModelCache

Abstract Model Cache. Describes the interface that must be implemented by every Model Cache.

abstract store_model_run(run: ModelRun)

Store a run.

abstract find_model_run(**kwargs) → ModelRun

Find a run by parameters. Actual parameters depend on implementation.

abstract load()

Load the cache from a saved state.

abstract dump()

Dump the cache to a saved state.

finalize()

Finalize all ongoing activities.

darwin.ModelCache.set_model_cache(cache)

Set the current Model Cache instance. Intended to be called once at startup.

darwin.ModelCache.get_model_cache()

Get current Model Cache instance.

darwin.ModelCache.register_model_cache(cache_name, mc_class)

Register a Model Cache class. cache_name is arbitrary but must be unique among all registered caches.

darwin.ModelCache.create_model_cache(cache_name) → ModelCache

Create Model Cache instance. The cache class must be registered under cache_name.
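
The register/create pair follows a common registry pattern, sketched below without importing pyDarwin. CacheInterfaceSketch mirrors the abstract methods listed above. NullCacheSketch, register_cache and create_cache are hypothetical, and rejecting a duplicate name with ValueError is an assumption about how uniqueness might be enforced.

```python
from abc import ABC, abstractmethod


class CacheInterfaceSketch(ABC):
    """Mirror of the abstract cache interface, for illustration only."""

    @abstractmethod
    def store_model_run(self, run): ...

    @abstractmethod
    def find_model_run(self, **kwargs): ...

    @abstractmethod
    def load(self): ...

    @abstractmethod
    def dump(self): ...

    def finalize(self):
        """Optional hook; nothing to finalize by default."""


class NullCacheSketch(CacheInterfaceSketch):
    """A cache that never remembers anything."""

    def store_model_run(self, run):
        pass

    def find_model_run(self, **kwargs):
        return None

    def load(self):
        pass

    def dump(self):
        pass


_registry = {}


def register_cache(cache_name, cache_class):
    """Map a unique name to a cache class."""
    if cache_name in _registry:
        raise ValueError(f"cache name already registered: {cache_name}")
    _registry[cache_name] = cache_class


def create_cache(cache_name):
    """Instantiate the class registered under cache_name."""
    return _registry[cache_name]()
```

Registering classes by name lets the cache type be chosen from configuration, since only the name needs to appear there.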

darwin.MemoryModelCache

class darwin.MemoryModelCache.MemoryModelCache

Bases: ModelCache

Simple Model Cache that stores model runs in a dictionary. Default option for pyDarwin.

store_model_run(run: ModelRun)

Store a run.

find_model_run(**kwargs) → ModelRun

Find a run by parameters. Actual parameters depend on implementation.

load()

Load the cache from saved_models_file.

dump()

Save cached runs to file. Does nothing if saved_models_readonly is set to true.

class darwin.MemoryModelCache.AsyncMemoryModelCache

Bases: MemoryModelCache

Non-blocking MemoryModelCache. Dumps model runs in a separate thread so that the dump call doesn’t block the search execution.

finalize()

Finish the working thread and dump any unsaved model runs.

dump()

Signal the working thread that a dump was requested.
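
A non-blocking dump of this kind can be sketched with a worker thread and two events. This is an illustration of the pattern, not the AsyncMemoryModelCache code. AsyncDumperSketch and dump_fn are hypothetical, and the sketch always performs one last dump in finalize rather than tracking which runs are unsaved.

```python
import threading


class AsyncDumperSketch:
    """Runs dump_fn in a background thread whenever a dump is requested."""

    def __init__(self, dump_fn):
        self._dump_fn = dump_fn
        self._requested = threading.Event()
        self._stopping = threading.Event()
        self._worker = threading.Thread(target=self._loop, daemon=True)
        self._worker.start()

    def _loop(self):
        while not self._stopping.is_set():
            # Wake on a dump request, or periodically to notice shutdown.
            if self._requested.wait(timeout=0.1):
                self._requested.clear()
                self._dump_fn()

    def dump(self):
        """Return immediately; the worker thread does the actual dump."""
        self._requested.set()

    def finalize(self):
        """Stop the worker, then dump once more to capture late changes."""
        self._stopping.set()
        self._worker.join()
        self._dump_fn()
```

Several dump requests that arrive while a dump is in progress collapse into a single follow-up dump, because the event only records that a request is pending.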

darwin.MemoryModelCache.register()

Register MemoryModelCache and AsyncMemoryModelCache.