Establishes an SSH connection, prepares and uploads project files, executes pyDarwin in the background, and optionally monitors the job and downloads the results upon completion.

Usage

run_pyDarwinRemote(
  Host,
  User,
  Password = "",
  KeyPath = Sys.getenv("SSH_PRIVATE_KEY_PATH"),
  SshFlags = character(0),
  LocalTemplatePath,
  LocalTokensPath,
  LocalOptionsPath,
  LocalDirectoryPath = ".",
  RemoteBaseDir = "~/.rdarwin/",
  RemoteInterpreterPath = NULL,
  UseLocalLicense = FALSE,
  Wait = TRUE,
  Flags = c("-u", "-m"),
  MonitoringInterval = 30
)

Arguments

Host

Character string. The hostname or IP address of the remote server.

User

Character string. The username for the SSH connection.

Password

Character string. The password for SSH authentication. Defaults to "", which is appropriate when using key-based authentication. Using keys is strongly recommended over embedding passwords in scripts.

KeyPath

Character string. The path to your private SSH key file. Defaults to the path stored in the SSH_PRIVATE_KEY_PATH environment variable.
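As a small illustration of the KeyPath default, the sketch below sets SSH_PRIVATE_KEY_PATH for the current R session and reads it back with the same Sys.getenv() call shown in Usage. The key path is a placeholder, not a value the package requires.

```r
# Set the variable for the current R session only (the path is a placeholder).
Sys.setenv(SSH_PRIVATE_KEY_PATH = "~/.ssh/id_rsa_cluster")

# This is the expression used as the KeyPath default in the Usage section.
key_path <- Sys.getenv("SSH_PRIVATE_KEY_PATH")
print(key_path)
```

To keep the setting across sessions, the same variable can be placed in a ~/.Renviron file, which R reads at startup.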

SshFlags

Character vector. Additional flags to pass to the underlying ssh::ssh_connect function.

LocalTemplatePath

Character string. The path to the pyDarwin template file. If not provided, defaults to "template.txt" within LocalDirectoryPath.

LocalTokensPath

Character string. The path to the pyDarwin tokens JSON file. If not provided, defaults to "tokens.json" within LocalDirectoryPath.

LocalOptionsPath

Character string. The path to the pyDarwin options JSON file. If not provided, defaults to "options.json" within LocalDirectoryPath.

LocalDirectoryPath

Character string or NULL. The path to the local project directory that contains the pyDarwin input files. Defaults to the current working directory ("."). If NULL, the directory containing LocalOptionsPath is used as the project directory.
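The defaulting rules described for LocalTemplatePath, LocalTokensPath, LocalOptionsPath and LocalDirectoryPath can be sketched as follows. resolve_input_paths() is a hypothetical helper written for this illustration and is not part of the package. It uses NULL to stand in for "not provided", and it assumes that LocalOptionsPath must be given when LocalDirectoryPath is NULL.

```r
# Hypothetical helper (not part of the package): mirrors the documented defaults.
resolve_input_paths <- function(LocalDirectoryPath = ".",
                                LocalTemplatePath = NULL,
                                LocalTokensPath = NULL,
                                LocalOptionsPath = NULL) {
  # A NULL project directory falls back to the folder holding the options file.
  if (is.null(LocalDirectoryPath)) {
    if (is.null(LocalOptionsPath)) {
      # Assumption of this sketch: there is nothing to derive the directory from.
      stop("LocalOptionsPath is required when LocalDirectoryPath is NULL")
    }
    LocalDirectoryPath <- dirname(LocalOptionsPath)
  }

  # Any input file that was not given defaults to its standard name
  # inside the project directory.
  if (is.null(LocalTemplatePath)) {
    LocalTemplatePath <- file.path(LocalDirectoryPath, "template.txt")
  }
  if (is.null(LocalTokensPath)) {
    LocalTokensPath <- file.path(LocalDirectoryPath, "tokens.json")
  }
  if (is.null(LocalOptionsPath)) {
    LocalOptionsPath <- file.path(LocalDirectoryPath, "options.json")
  }

  list(
    directory = LocalDirectoryPath,
    template = LocalTemplatePath,
    tokens = LocalTokensPath,
    options = LocalOptionsPath
  )
}

# Only the project directory is given: all three files use the default names.
paths <- resolve_input_paths("path/to/my/CovariateSearchProject")
print(paths$template)

# Project directory set to NULL: it is derived from the options file location.
from_options <- resolve_input_paths(
  LocalDirectoryPath = NULL,
  LocalOptionsPath = "runs/search1/options.json"
)
print(from_options$directory)
```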

RemoteBaseDir

Character string. The base directory on the remote host under which a new project-specific directory will be created.

RemoteInterpreterPath

Character string or NULL. The full path to the Python interpreter on the remote host (e.g., /usr/bin/python3). If NULL, the function attempts to find a suitable Python interpreter automatically.

UseLocalLicense

Logical. If TRUE, attempts to transfer local Certara license files to the remote host.

Wait

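Logical. If TRUE (the default), the function monitors the remote job's progress and downloads the results upon completion. If FALSE, the function returns after launching the job without monitoring it, and the returned list can be used later with reconnect_pyDarwinJob().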

Flags

Character vector. Command-line flags to pass to the pyDarwin Python module.

MonitoringInterval

Numeric. The interval in seconds between status checks when monitoring a running job (Wait = TRUE).
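To make the role of MonitoringInterval concrete, here is a generic polling loop of the kind such monitoring implies. It is only a sketch and does not reproduce the package's internal code: poll_until_done(), make_fake_job() and the max_checks argument are hypothetical names introduced for this illustration, and the "job" is a counter rather than a remote process.

```r
# Hypothetical polling loop: call is_done() until it returns TRUE,
# sleeping `interval` seconds between status checks.
poll_until_done <- function(is_done, interval = 30, max_checks = Inf) {
  checks <- 0
  repeat {
    checks <- checks + 1
    if (is_done()) {
      return(checks)
    }
    if (checks >= max_checks) {
      stop("Job still running after ", max_checks, " status checks")
    }
    Sys.sleep(interval)
  }
}

# Stand-in for a remote status check: reports completion on the n-th call.
make_fake_job <- function(finish_on_check) {
  calls <- 0
  function() {
    calls <<- calls + 1
    calls >= finish_on_check
  }
}

# interval = 0 keeps the demonstration instant; a real run would wait
# MonitoringInterval seconds between checks.
n_checks <- poll_until_done(make_fake_job(3), interval = 0)
print(n_checks)
```

A shorter interval reports completion sooner at the cost of more frequent status checks against the remote host.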

Value

The return value depends on the Wait parameter:

If Wait = TRUE

On successful completion, a list containing the parsed results, similar to run_pyDarwin(). This may include data frames like $results and character vectors like $FinalResultFile.

If Wait = FALSE

An invisible list containing information needed to reconnect to the job later using reconnect_pyDarwinJob(). The list includes:

  • LocalJobInfoFile: Path to the local JSON file with job details.

  • RemoteProjectDir: The directory on the remote host.

  • RemoteJobPID: The Process ID of the job on the remote host.

  • Host, User, ProjectName

The function throws an error if a critical step fails.
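Because a failed critical step surfaces as an R error, a calling script may want to catch it with tryCatch(). The sketch below uses fake_remote_run(), a stand-in defined here so that the example runs without a server. It only imitates the documented behaviour of stopping on failure, and its error message is a placeholder rather than the package's actual wording.

```r
# Stand-in for run_pyDarwinRemote() (hypothetical): always fails, so the
# error-handling path can be demonstrated without an SSH connection.
fake_remote_run <- function(...) {
  stop("could not connect to remote host")
}

result <- tryCatch(
  fake_remote_run(Host = "cluster.mycompany.com", User = "myuser"),
  error = function(e) {
    # Report the failure and return NULL so the script can decide what to do.
    message("Remote run failed: ", conditionMessage(e))
    NULL
  }
)

if (is.null(result)) {
  message("No results available; check the connection settings and retry.")
}
```

In real use, the call to fake_remote_run() would be replaced by run_pyDarwinRemote() with the same tryCatch() wrapper around it.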

Details

This function automates the entire remote execution workflow. It creates a unique project directory on the remote host to ensure run isolation.

Examples

if (FALSE) { # \dontrun{
# Example of launching a remote job and waiting for the results
remote_results <- run_pyDarwinRemote(
  Host = "cluster.mycompany.com",
  User = "myuser",
  KeyPath = "~/.ssh/id_rsa_cluster",
  LocalDirectoryPath = "path/to/my/CovariateSearchProject"
)

# Example of launching a job in the background
job_info <- run_pyDarwinRemote(
  Host = "cluster.mycompany.com",
  User = "myuser",
  KeyPath = "~/.ssh/id_rsa_cluster",
  LocalDirectoryPath = "path/to/my/CovariateSearchProject",
  Wait = FALSE
)

# You can later use job_info to reconnect to the job
# final_results <- reconnect_pyDarwinJob(JobInfo = job_info)
} # }