Example 4: PK Model, DMAG by GA with post-run R code

Example 4 uses a dataset simulated from the following CATIE study , courtesy of Dr. Rob Bies . This search again uses nested tokens, as it searches whether K32 is a function of Weight, and 1 vs 2 vs 3 compartments. Another important feature of example 4 is the use of post run R code. In this case, it was of interest to capture the Cmax value. There is no straightforward way to include a penalty for missing the Cmax in the NONMEM control stream. Therefore, the penalty for missing Cmax is added after the NONMEM run is complete. Any R code can be provided by the user and should return a vector of two values. The first is a real values penalty to be added to the fitness/reward. The 2nd is text that will be appended to the NONMEM output file to describe the results of the R code execution.

The penalty for missing the Cmax is essentially a Posterior Predictive Check (PPC). The template file includes two tasks that are used to generate a table of the PPC: an estimation, followed by a simulation.

$PROB SIMULATION FOR CMAX

$INPUT       ID TIME AMT {D1LAG[1]} DV MDV EVID WT AGE SEX BUN SCR OCC CRCL COV1 COV2
$DATA      {data_dir}/dataWithPeriod.csv IGNORE=@  REWIND
$MSFI = MSF1
$SIMULATION (1) ONLYSIM
$TABLE REP ID TIME IOBS EVID  NOAPPEND NOPRINT FILE = SIM.DAT ONEHEADER NOAPPEND

User provided R code is then saved to a file (.R):

library(dplyr)
orgdata <-  read.table("ORG.DAT",skip=1,header=TRUE)
orgdata <- orgdata[orgdata$EVID==0&orgdata$TIME<=24,] # only first 24 hours
simdata <-  read.table("SIM.DAT",skip=1,header=TRUE) # only first 24 hours
simdata <- simdata[orgdata$EVID==0&orgdata$TIME<=24,]
observed_maxes <- orgdata %>% group_by(ID) %>%
summarise(max = max(DV))
sim_maxes <- simdata %>% group_by(ID) %>% summarise(max = max(IOBS))
# penalty of 4 points for each % difference
obs_geomean = exp(mean(log(observed_maxes$max)))
sim_geomean = exp(mean(log(sim_maxes$max)))
penalty <- 4*abs((obs_geomean-sim_geomean)/obs_geomean)*100
text <- paste0("Observed day 1 Cmax geomean = ", round(obs_geomean,1), " simulated day 1 Cmax geo mean = ", round(sim_geomean,1))
c(penalty,text)

The R code executes from the run directory of that model. For example, if a model runs in c:\\pydarwin\\example4\\00\\04, the ORG.DAT and SIM.DAT files will be written to that directory and the R code will be read from the same directory.

This R code returns a character vector of length 2. The first is a string that can be read as numeric (penalty) and the 2nd is a string that is appended to the NONMEM output file, as a comment to document the output of the R code. The content of the 2nd element of the vector is entirely up to the user, it has no impact on the pyDarwin process, it is just appended to the output file.

The first element of the returned value is added to the fitness/reward value and is then used in the selection of subsequent populations, just as any other penalty. The penalty value can be any real value - including negative, but typically would be positive.

The post processing code options are given in the options file:

"postprocess": {
    "use_r": true,
    "post_run_r_code": "{project_dir}/Cmaxppc.r",
    "rscript_path": "c:\\Program Files\\R\\R-4.1.3\\bin\\Rscript.exe",
    "r_timeout": 120,
    "use_python": false

Be sure to set "use_r": true to use R for post processing. Additionally, include the path to the R code ("post_run_r_code"), the path to Rscript.exe ("r_script_path"), and the timeout ("r_timeout"), after which the R session will be terminated if not complete and the crash_value added as the penalty.

The Template file

$PROBLEM    2 compartment fitting
$INPUT       ID TIME AMT {D1LAG[1]} DV MDV EVID WT AGE SEX BUN SCR OCC CRCL COV1 COV2

$DATA      {data_dir}/dataWithPeriod.csv IGNORE=@

$SUBROUTINE {ADVAN[1]}
$PK
CWTKGONE = WT/81  ;; WEIGHT CENTERED ON ONE
CWTKGZERO = WT-81 ;; WEIGHT CENTERED ON ZERO
CAGE = AGE/60     ;; AGE CENTERED ON ONE
CCRCL = CRCL/85.6 ;; CRCL CENTERD ON ONE
CCOV1 = COV1-15.4 ;; COVARIATE 1 CENTERED ON ZERO
{IOVV[1]}
TVV2=THETA(1) {V2~WT[1]} {V2~SEX[1]} {V2~AGE[1]} {V2~COV2[1]}  *EXP(IOVV)
V2=TVV2*EXP(ETA(2))
{IOVCL[1]}
TVCL= {INITCL[1]} {CL~WT[1]} {CL~AGE[1]} {CL~CRCL[1]} {CL~COV1[1]} *EXP(IOVCL)
CL=TVCL*EXP(ETA(1))

K=CL/V2
{ADVAN[2]}
{D1LAG[2]}
TVKA=THETA(2)
KA=TVKA  {KAETA[1]}
S2 = V2
$ERROR
REP = IREP
IPRED =F
IOBS = F {RESERR[1]}
Y=IOBS
$THETA
(0.001,100)         ; THETA(1) V  UNITS = L
(0.001, 10)         ; THETA(2) KA UNITS = 1/HR
{INITCL[2]} ; THETA(INITCL) CL UNITS =  L/HR
{ADVAN[3]}
{V2~WT[2]}
{V2~SEX[2]}
{V2~AGE[2]}
{V2~COV2[2]}
{CL~WT[2]}
{CL~AGE[2]}
{CL~CRCL[2]}
{CL~COV1[2]}
{D1LAG[3]}
$OMEGA
0.1                 ; ETA(1) CLEARANCE
0.4                 ; ETA(2) VOLUME
{KAETA[2]}
{D1LAG[4]}
{IOVCL[2]}
{IOVV[2]}
$SIGMA
{RESERR[2]}

$EST METHOD=COND INTER MAX = 9999 MSFO=MSF1 PRINT = 10
$COV UNCOND PRINT=E  PRECOND=1 PRECONDS=TOS  MATRIX = R

$TABLE REP ID TIME DV EVID NOPRINT FILE = ORG.DAT ONEHEADER NOAPPEND

$PROB SIMULATION FOR CMAX

$INPUT       ID TIME AMT {D1LAG[1]} DV MDV EVID WT AGE SEX BUN SCR OCC CRCL COV1 COV2
$DATA      {data_dir}/dataWithPeriod.csv IGNORE=@  REWIND
$MSFI = MSF1
$SIMULATION (1) ONLYSIM
$TABLE REP ID TIME IOBS EVID  NOAPPEND NOPRINT FILE = SIM.DAT ONEHEADER NOAPPEND

Example 4 template file: text

The Tokens file

Nothing new in the tokens file, we see again an example of nested tokens:

{
    "ADVAN": [
            ["ADVAN2 ;; advan2",
                    ";; PK 1 compartment ",
                    ";; THETA 1 compartment"
            ],
            ["ADVAN4 ;; advan4",
                    "K23=THETA(ADVANA){K23~WT[1]}\n  K32=THETA(ADVANB){K23~WT[1]}",
                    "(0.001,0.02)  \t ; THETA(ADVANA) K23 \n (0.001,0.3) \t ; THETA(ADVANB) K32 \n{K23~WT[2]} \t ; init for K23~WT "
            ],
            ["ADVAN12 ;; advan12",
                    "K23=THETA(ADVANA){K23~WT[1]}\n  K32=THETA(ADVANB){K23~WT[1]}\n  K24=THETA(ADVANC)\n  K42=THETA(ADVAND)",
                    "(0.001,0.1) \t; THETA(ADVANA) K23 \n (0.001,0.1)\t; THETA(ADVANB) K32 \n (0.001,0.1) \t; THETA(ADVANC) K24  \n (0.001,0.1) \t; THETA(ADVAND) K42  \n {K23~WT[2]} \t ;; init for K23~WT"
            ]
    ],
    "K23~WT": [
            ["",
             ""
            ],
            ["*CWTKGONE**THETA(K23~WT)",
                    "(0,0.1) \t; THETA(K23~WT) K23~WT"
            ]
    ],
    "KAETA": [
            ["",
             ""
            ],
            ["*EXP(ETA(KAETA)) ",
                    "$OMEGA ;; 2nd OMEGA block \n0.1\t\t; ETA(KAETA) ETA ON KA"
            ]
    ],
    "V2~WT": [
            ["",
             ""
            ],
            ["*CWTKGONE**THETA(V2~WT)",
                    "(-4,0.8) \t; THETA(V2~WT) POWER volume ~WT "
            ],
            ["*EXP(CWTKGZERO*THETA(V2~WT))",
                    "(-1,0.01) \t; THETA(V2~WT) EXPONENTIAL volume ~WT "
            ]
    ],
    "V2~AGE": [
            ["",
             ""
            ],
            ["*CAGE**THETA(V2~AGE)",
                    "(-4,0.8) \t; THETA(V2~AGE) POWER volume ~AGE "
            ]
    ],

    "V2~SEX": [
            ["",
                    ""
            ],
            ["*EXP(SEX*THETA(V2~SEX))",
                    "(-4,0.1) \t; THETA(V2~SEX) EXPONENTIAL volume~SEX "
            ]
    ],
    "V2~COV2": [
            ["",
                    ""
            ],
            ["*EXP(COV2*THETA(V2~COV2))",
                    "(-4,0.1) \t; THETA(V2~COV2) EXPONENTIAL volume ~COV2 "
            ]
    ],
    "CL~WT": [
            ["",
                    ""
            ],
            ["*CWTKGONE**THETA(CL~WT)",
                    "(-4,.7) \t; THETA(CL~WT) POWER clearance~WT "
            ],
            ["*EXP(CWTKGZERO*THETA(CL~WT))",
                    "(-1,0.01) \t; THETA(CL~WT) EXPONENTIAL clearance~WT "
            ]
    ],
    "CL~AGE": [
            ["",
                    ""
            ],
            ["*CAGE**THETA(CL~AGE)",
                    "(-4,.7) \t; THETA(CL~AGE) POWER clearance~AGE "
            ]
    ],
    "CL~CRCL": [
            ["",
            ""
            ],
            ["*CCRCL**THETA(CL~CRCL)",
                    "(-4,-0.2) \t; THETA(CL~CRCL) POWER clearance~CRCL "
            ]
    ],
    "CL~COV1": [
            ["",
            ""
            ],

            ["*EXP(CCOV1*THETA(CL~COV1))",
                    "(-4,0.1) \t; THETA(CL~COV1) EXPONENTIAL CL~COV1 "
            ]
    ],
    "IOVCL": [
            ["IF(OCC.EQ.1) IOVCL = ETA(IOVCLA) \n  IF(OCC.EQ.2) IOVCL = ETA(IOVCLB) \n  IF(OCC.EQ.3) IOVCL = ETA(IOVCLC)",
                    "$OMEGA BLOCK(1) ; ETA(IOVCLA)\n 0.1 \n $OMEGA BLOCK SAME ; ETA(IOVCLB)\n $OMEGA BLOCK SAME ; ETA(IOVCLC)"
            ],
            ["IOVCL = 0",
            ";; no iov ON CL"
            ]
    ],
    "IOVV": [
            ["IF(OCC.EQ.1) IOVV = ETA(IOVVA) \n  IF(OCC.EQ.2) IOVV = ETA(IOVVB) \n  IF(OCC.EQ.3) IOVV = ETA(IOVVC)",
                    "$OMEGA BLOCK(1) ; ETA(IOVVA)\n 0.1 \n$OMEGA BLOCK SAME ; ETA(IOVVB)\n$OMEGA BLOCK SAME ; ETA(IOVVC)"
            ],
            ["IOVV = 0",
            ";; no iov ON V"
            ]
    ],

    "INITCL": [
            ["THETA(INITCL)",
            "(0.001,2)"
       ],
            ["THETA(INITCL)",
            "(0.001,20)"
    ]
  ],

    "ETAD1LAG": [
            ["",
                    "",
                    ""
            ],
            ["*EXP(ETA(ETALAG))",
                    "",
                    "$OMEGA ;; 3rd OMEGA block \n 0.1 \t\t;; ETA(ETALAG) ETA ON ALAG1"
                    ],
                    ["",
                    "*EXP(ETA(ETALAG1))",
                    "$OMEGA ;;  \n 0.1 \t\t;; ETA(ETALAG1) ETA ON D1"
            ],
            ["*EXP(ETA(ETALAG1))",
                    "*EXP(ETA(ETALAG2))",
                    "$OMEGA  ;; diagonal OMEGA \n 0.1 \t\t;; ETA(ETALAG1) ETA ON ALAG1\n 0.1 \t\t;; ETA(ETALAG2) ETA ON D1"
            ],
            ["*EXP(ETA(ETALAG1))",
                    "*EXP(ETA(ETALAG2))",
                    "$OMEGA BLOCK(2) ;; block OMEGA block \n 0.1 \t\t;; ETA(ETALAG1) ETA ON ALAG1\n 0.01 0.1 \t\t;; ETA(ETALAG2) ETA ON D1"
            ]
    ],
    "D1LAG": [
            ["DROP",
                    "ALAG1=THETA(ALAG){ETAD1LAG[1]}\n;; No D1",
                    "(0.001,0.1) \t; THETA(ALAG) ALAG1   ",
                    "{ETAD1LAG[3]}"
            ],
            ["RATE",
                    " D1=THETA(D1) {ETAD1LAG[1]} ; infusion only",
                    "(0.01,0.2) \t\t;; THETA(D1) D1  ",
                    "{ETAD1LAG[3]}  "
            ],
            ["RATE",
                    " ALAG1=THETA(ALAG){ETAD1LAG[1]}\n  D1=THETA(D1){ETAD1LAG[2]}",
                    "(0.001,0.1,1) \t\t;; THETA(ALAG) ALAG1\n (0.001,0.1,1) ;;THETA(D1) D1",
                    "{ETAD1LAG[3]} \t\t;; D1 and lag, block"
            ]
    ],
    "RESERR": [
            ["*EXP(EPS(RESERRA))+EPS(RESERRB)",
                    " 0.1 \t; EPS(RESERRA) proportional error\n  100 \t; EPS(RESERRB) additive error"
            ],
            ["+EPS(RESERRA)",
                    " 200 \t; EPS(RESERRA) additive error"
            ]
    ]
}

Note again, the use of THETA(parameter identifier), e.g.,

(0.001,0.02)  \t ; THETA(ADVANA) K23

for ALL initial estimate token text (THETA, OMEGA, and SIGMA).

Example 4 tokens file: json

The Options file

The algorithm selection in the options file is GA.

The user should provide an appropriate path for “nmfe_path”. NONMEM version 7.4 and 7.5 are supported.

Note that, to run in the environment used for this example, the directories are set to:

"working_dir": "u:/pyDarwin/example4/working",
"temp_dir": "u:/pyDarwin/example4/rundir",
"output_dir": "u:/pyDarwin/example4/output",

It is recommended that the user set the directories to something appropriate for their environment. If directories are not set, the default is:

{user_dir}\pydarwin\{project_name}

In either case, the folder names are given in the initial and final output to facilitate finding the files and debugging.

{
 "author": "Certara",
 "algorithm": "GA",

 "GA": {
     "crossover_rate": 0.95,
     "elitist_num": 4,
     "mutation_rate": 0.95,
     "attribute_mutation_probability": 0.1,
     "mutate": "flipBit",
     "niche_penalty": 20,
     "selection": "tournament",
     "selection_size": 2,
     "sharing_alpha": 0.1,
     "crossover_operator": "cxOnePoint"
 },

 "random_seed": 11,
 "population_size": 80,
 "num_parallel": 4,
 "num_generations": 12,

 "downhill_period": 5,
 "num_niches": 2,
 "niche_radius": 2,
 "local_2_bit_search": true,
 "final_downhill_search": true,

 "crash_value": 99999999,

 "penalty": {
     "theta": 10,
     "omega": 10,
     "sigma": 10,
     "convergence": 100,
     "covariance": 100,
     "correlation": 100,
     "condition_number": 100,
     "non_influential_tokens": 0.00001
 },

 "remove_run_dir": false,

 "nmfe_path": "c:/nm744/util/nmfe74.bat",
 "model_run_timeout": 1200,

 "postprocess": {
     "use_r": true,
     "post_run_r_code": "{project_dir}/Cmaxppc.r",
     "rscript_path": "c:\\Program Files\\R\\R-4.1.3\\bin\\Rscript.exe",
     "r_timeout": 120,
     "use_python": false
 }
 }

Example 4 options file: json