Run pyDarwin on a Remote Host via SSH
run_pyDarwinRemote.Rd
Establishes an SSH connection, prepares and uploads project files, executes pyDarwin in the background, and can optionally monitor the job and download results upon completion.
Usage
run_pyDarwinRemote(
  Host,
  User,
  Password = "",
  KeyPath = Sys.getenv("SSH_PRIVATE_KEY_PATH"),
  SshFlags = character(0),
  LocalTemplatePath,
  LocalTokensPath,
  LocalOptionsPath,
  LocalDirectoryPath = ".",
  RemoteBaseDir = "~/.rdarwin/",
  RemoteInterpreterPath = NULL,
  UseLocalLicense = FALSE,
  Wait = TRUE,
  Flags = c("-u", "-m"),
  MonitoringInterval = 30
)
Arguments
- Host
Character string. The hostname or IP address of the remote server.
- User
Character string. The username for the SSH connection.
- Password
Character string. The password for SSH authentication. Defaults to "", which is appropriate when using key-based authentication. Using keys is strongly recommended over embedding passwords in scripts.
- KeyPath
Character string. The path to your private SSH key file. Defaults to the path stored in the SSH_PRIVATE_KEY_PATH environment variable.
- SshFlags
Character vector. Additional flags to pass to the underlying ssh::ssh_connect function.
- LocalTemplatePath
Character string. The path to the pyDarwin template file. If not provided, defaults to "template.txt" within LocalDirectoryPath.
- LocalTokensPath
Character string. The path to the pyDarwin tokens JSON file. If not provided, defaults to "tokens.json" within LocalDirectoryPath.
- LocalOptionsPath
Character string. The path to the pyDarwin options JSON file. If not provided, defaults to "options.json" within LocalDirectoryPath.
- LocalDirectoryPath
Character string or NULL. The path to the local project directory that contains the pyDarwin input files. Defaults to the current working directory ("."). If NULL is provided, the directory containing LocalOptionsPath is used as the project directory.
- RemoteBaseDir
Character string. The base directory on the remote host under which a new project-specific directory will be created.
- RemoteInterpreterPath
Character string or NULL. The full path to the Python interpreter on the remote host (e.g., /usr/bin/python3). If NULL, the function attempts to find a suitable Python interpreter automatically.
- UseLocalLicense
Logical. If TRUE, attempts to transfer local Certara license files to the remote host.
- Wait
Logical. If TRUE (the default), the function monitors the remote job's progress and downloads the results upon completion. If FALSE, the function returns without waiting for the job to finish, and the returned job information can be used to reconnect later (see Value).
- Flags
Character vector. Command-line flags to pass to the pyDarwin Python module.
- MonitoringInterval
Numeric. The interval in seconds between status checks when monitoring a running job (Wait = TRUE).
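The defaulting rules for the local input files can be illustrated with a small base R sketch. It is not the package's implementation: resolve_input_paths is a hypothetical helper, and it uses NULL to stand for "not provided" (an assumption made only for this illustration), but it follows the rules documented for LocalTemplatePath, LocalTokensPath, LocalOptionsPath, and LocalDirectoryPath.

```r
# Hypothetical helper: mirrors the documented defaults, not the package's code.
# NULL stands for "argument not provided" (an assumption of this sketch).
resolve_input_paths <- function(LocalDirectoryPath = ".",
                                LocalTemplatePath = NULL,
                                LocalTokensPath = NULL,
                                LocalOptionsPath = NULL) {
  # When LocalDirectoryPath is NULL, use the directory of the options file.
  if (is.null(LocalDirectoryPath)) {
    if (is.null(LocalOptionsPath)) {
      stop("LocalOptionsPath is required when LocalDirectoryPath is NULL")
    }
    LocalDirectoryPath <- dirname(LocalOptionsPath)
  }

  # Use the explicit path if given, otherwise the default file name
  # inside the project directory.
  default_to <- function(path, file_name) {
    if (is.null(path)) file.path(LocalDirectoryPath, file_name) else path
  }

  list(
    Directory = LocalDirectoryPath,
    Template = default_to(LocalTemplatePath, "template.txt"),
    Tokens = default_to(LocalTokensPath, "tokens.json"),
    Options = default_to(LocalOptionsPath, "options.json")
  )
}

# Only the project directory is given: all three files use default names.
paths <- resolve_input_paths(LocalDirectoryPath = "path/to/project")

# Directory is NULL: it is derived from the options file location.
derived <- resolve_input_paths(
  LocalDirectoryPath = NULL,
  LocalOptionsPath = "proj/options.json"
)
```

In practice this means that passing only LocalDirectoryPath is enough when the project uses the default file names, as in the Examples section.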
Value
The return value depends on the Wait parameter:
- If Wait = TRUE
On successful completion, a list containing the parsed results, similar to run_pyDarwin(). This may include data frames like $results and character vectors like $FinalResultFile.
- If Wait = FALSE
An invisible list containing information needed to reconnect to the job later using reconnect_pyDarwinJob(). The list includes:
  LocalJobInfoFile: Path to the local JSON file with job details.
  RemoteProjectDir: The directory on the remote host.
  RemoteJobPID: The process ID of the job on the remote host.
  Host, User, ProjectName
The function throws an error if a critical step fails.
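Before storing a Wait = FALSE result for later reconnection, it can be useful to confirm that the expected fields are present. The sketch below is one possible check in base R. The field names are taken from the list above; missing_job_fields and the mock values are hypothetical and exist only for illustration.

```r
# Field names come from the Value section; the helper itself is hypothetical.
expected_job_fields <- c(
  "LocalJobInfoFile", "RemoteProjectDir", "RemoteJobPID",
  "Host", "User", "ProjectName"
)

# Returns the names of expected fields that are absent from job_info.
missing_job_fields <- function(job_info) {
  setdiff(expected_job_fields, names(job_info))
}

# Mock object with made-up values, standing in for a real Wait = FALSE result.
mock_job_info <- list(
  LocalJobInfoFile = "job_info.json",
  RemoteProjectDir = "~/.rdarwin/example_project",
  RemoteJobPID = 12345,
  Host = "cluster.mycompany.com",
  User = "myuser",
  ProjectName = "example_project"
)

absent <- missing_job_fields(mock_job_info)
if (length(absent) > 0) {
  stop("Job info is missing fields: ", paste(absent, collapse = ", "))
}
```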
Details
This function automates the entire remote execution workflow. It creates a unique project directory on the remote host to ensure run isolation.
Examples
if (FALSE) { # \dontrun{
# Example of launching a remote job and waiting for the results
remote_results <- run_pyDarwinRemote(
  Host = "cluster.mycompany.com",
  User = "myuser",
  KeyPath = "~/.ssh/id_rsa_cluster",
  LocalDirectoryPath = "path/to/my/CovariateSearchProject"
)

# Example of launching a job in the background
job_info <- run_pyDarwinRemote(
  Host = "cluster.mycompany.com",
  User = "myuser",
  KeyPath = "~/.ssh/id_rsa_cluster",
  LocalDirectoryPath = "path/to/my/CovariateSearchProject",
  Wait = FALSE
)

# You can later use job_info to reconnect to the job
# final_results <- reconnect_pyDarwinJob(JobInfo = job_info)
} # }