Reconnect, Monitor, and Retrieve Results from a Remote pyDarwin Job
reconnect_pyDarwinJob.Rd
This function reconnects to a pyDarwin job previously launched in the background on a remote host. It monitors the job until completion (if a PID is available), then downloads and processes the results.
Usage
reconnect_pyDarwinJob(
  LocalDirectoryPath = ".",
  LocalJobInfoFilePath = NULL,
  OriginalOptionsPath = NULL,
  Password = NULL,
  KeyPath = NULL,
  MonitoringInterval = 30,
  verbose = getOption("verbose", default = FALSE)
)
Arguments
- LocalDirectoryPath
Character string: The base local directory associated with the pyDarwin job. This directory is used to: 1. Locate the job information file (if LocalJobInfoFilePath is NULL), expected as {ProjectName}_remote_job_info.json. 2. Locate the original options.json file (if OriginalOptionsPath is NULL). 3. Serve as the base location for downloading results into a subdirectory (e.g., {LocalDirectoryPath}/{ProjectName}_Results/).
- LocalJobInfoFilePath
Character string (optional): Explicit path to the local JSON file containing information about the remote job (e.g., as created by RunPyDarwinRemote(Wait = FALSE)). If NULL (default), the path is constructed using LocalDirectoryPath and ProjectName (derived from OriginalOptionsPath).
- OriginalOptionsPath
Character string (optional): Explicit path to the original local options.json file that was used when the job was first launched. This is needed to correctly parse results (e.g., engine_adapter) and to derive ProjectName if not available from LocalJobInfoFilePath. If NULL (default), the function attempts to find options.json within LocalDirectoryPath. If not found, the operation will stop.
- Password
Character string. The password for SSH authentication. Defaults to "", which is appropriate when using key-based authentication. Using keys is strongly recommended over embedding passwords in scripts.
- KeyPath
Character string. The path to your private SSH key file. Defaults to the path stored in the SSH_PRIVATE_KEY_PATH environment variable.
- MonitoringInterval
Numeric. The interval in seconds between status checks when monitoring a running job (Wait = TRUE).
- verbose
Logical: Passed to helper functions for verbose output during SSH connection and file downloads. Default: getOption("verbose", default = FALSE).
Value
A list containing parsed results similar to RunPyDarwinRemote(Wait = TRUE) (i.e., results data.frame, FinalResultFile, FinalControlFile, DownloadedResultsDir, DownloadedItems), or the content of the downloaded messages.txt as a character vector if primary result files are not found or parsed successfully. If essential information (like job info or options file) is missing, the function will stop.
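Because the return value can take either of two shapes, calling code may want to branch on its type. The following is a minimal sketch under stated assumptions: the helper name summarize_reconnect_result is hypothetical and not part of the package, and it relies only on the element names listed above (results, FinalResultFile).

```r
# Hypothetical helper (not part of the package): distinguish the two
# documented return shapes of reconnect_pyDarwinJob().
summarize_reconnect_result <- function(res) {
  if (is.character(res)) {
    # Fallback shape: the lines of the downloaded messages.txt
    return(list(parsed = FALSE, messages = res))
  }
  # Primary shape: a list with a `results` data.frame and related elements
  list(
    parsed = TRUE,
    n_rows = nrow(res$results),
    result_file = res$FinalResultFile
  )
}
```

A caller could then check the parsed flag before working with the results data.frame, and otherwise inspect the messages to see why parsing did not succeed.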
Details
This function requires a job information JSON file (typically created by RunPyDarwinRemote when Wait = FALSE) to obtain details like the remote host, user, remote project directory, and optionally the remote process ID (PID).
The ProjectName is crucial. It's primarily derived from the project_name field in the original options file (located via OriginalOptionsPath or within LocalDirectoryPath). If not present in the options file, a fallback derivation uses the parent directory name of the options file. This ProjectName is then used to find the job info file (as {ProjectName}_remote_job_info.json in LocalDirectoryPath) if LocalJobInfoFilePath is not directly provided. If the job info file itself contains a ProjectName, that value may take precedence. Downloaded results are organized locally using this determined ProjectName.
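The derivation order described above can be sketched roughly as follows. This is an illustration only, not the package's actual implementation: the helper names derive_project_name and default_job_info_path are hypothetical, the jsonlite package is assumed to be installed for reading options.json, and the possible override by a ProjectName stored in the job info file is not modeled.

```r
# Hypothetical helpers (not part of the package): mimic the documented
# ProjectName derivation and the default job info file location.
derive_project_name <- function(OriginalOptionsPath) {
  opts <- jsonlite::fromJSON(OriginalOptionsPath)
  name <- opts$project_name
  if (is.null(name) || !nzchar(name)) {
    # Fallback: use the name of the directory that contains options.json
    name <- basename(dirname(normalizePath(OriginalOptionsPath)))
  }
  name
}

default_job_info_path <- function(LocalDirectoryPath, ProjectName) {
  # Expected file name: {ProjectName}_remote_job_info.json
  file.path(LocalDirectoryPath, paste0(ProjectName, "_remote_job_info.json"))
}
```

With an options file at ~/darwin_runs/my_project_run/options.json that lacks a project_name field, this sketch would fall back to "my_project_run" as the project name.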
If RemoteJobPID is available in the job info file, the function will actively monitor the process. If the PID is not available, it will skip active monitoring and proceed directly to attempt downloading any available results.
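The monitoring behavior can be pictured as a simple polling loop. The sketch below is a hedged illustration rather than the package's code: monitor_job is a hypothetical name, and is_running stands in for whatever remote check the package actually performs (for example, querying the PID over SSH), which is not documented here.

```r
# Hypothetical sketch of interval-based monitoring. `is_running` is a
# caller-supplied function that returns TRUE while the job is still alive.
monitor_job <- function(pid, is_running, MonitoringInterval = 30) {
  if (is.null(pid) || is.na(pid)) {
    # No PID recorded: skip monitoring and go straight to the download step
    return(invisible(0L))
  }
  n_checks <- 0L
  while (isTRUE(is_running(pid))) {
    n_checks <- n_checks + 1L
    Sys.sleep(MonitoringInterval)  # wait before the next status check
  }
  # Number of times the job was found still running
  invisible(n_checks)
}
```

A larger MonitoringInterval reduces the number of status checks at the cost of noticing job completion later.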
Examples
if (FALSE) { # \dontrun{
# Assuming 'my_project_remote_job_info.json' and 'options.json' exist in '~/darwin_runs/my_project_run'
# and 'my_project_remote_job_info.json' was created by a previous RunPyDarwinRemote(Wait = FALSE) call.
# Example 1: Specifying only the local directory path
# ProjectName will be derived from options.json within that path.
# Job info file will be sought as {ProjectName}_remote_job_info.json.
try({
  reconnect_pyDarwinJob(
    LocalDirectoryPath = "~/darwin_runs/my_project_run"
  )
})
# Example 2: Specifying paths explicitly
try({
  reconnect_pyDarwinJob(
    LocalDirectoryPath = "~/darwin_runs/my_project_run", # Still used for downloads
    LocalJobInfoFilePath = "~/darwin_runs/my_project_run/my_project_remote_job_info.json",
    OriginalOptionsPath = "~/darwin_runs/my_project_run/options.json",
    KeyPath = "~/.ssh/id_rsa_remote_server"
  )
})
} # }