tfep.utils.cli.launcher.SRunLauncher
- class tfep.utils.cli.launcher.SRunLauncher(n_tasks: int | list[int] | None = None, multiprog: bool = False, multiprog_config_file_path: str = 'srun-job.conf', **kwargs)
Bases: Launcher
Launch a command through SLURM’s srun.
The launcher simply prepends "srun" to each given command, setting the specified number of nodes, tasks per node, and cpus per task. Except for multiprog and multiprog_config_file_path, any parameter can be passed as a list of values, one for each command.
The launcher also supports running multiple commands in parallel using the --multi-prog feature. The launcher assigns contiguous task ranks to each command.
The class also has a GLOBAL_SRUN_OPTIONS attribute holding a dictionary where options for srun that are shared across all executions of srun can be specified.
- Parameters:
n_tasks (int or List[int], optional) – The number of tasks to pass to srun. When multiprog is True, this must be given as a list with length equal to the number of commands.
multiprog (bool, optional) – If True, multiple commands are run in parallel using the --multi-prog argument. In this case, srun is invoked only once, and thus no parameter (n_nodes, n_tasks_per_node, etc.) can be a list except for n_tasks.
multiprog_config_file_path (str, optional) – The file path (relative to the working directory) where the multiprog configuration file is created.
time (str or List[str], optional) – The maximum time before the job step is terminated, as a string in the same format used by SLURM (e.g., '1-00:06:00').
n_nodes (int or List[int], optional) – The number of nodes to pass to srun.
n_tasks_per_node (int or List[int], optional) – The number of tasks per node to pass to srun. Note that n_tasks takes precedence over this.
n_cpus_per_task (int or List[int], optional) – The number of cpus per task to pass to srun.
relative_node_idx (int or List[int], optional) – Run a job step relative to the relative_node_idx-th node (starting from node 0) of the current allocation.
cpu_bind (str or List[str], optional) – How to bind tasks to CPUs (e.g., 'threads'). Corresponds to the srun --cpu-bind option.
distribution (str or List[str], optional) – Specify how to distribute tasks among cores (e.g., 'block:block:fcyclic'). Corresponds to the srun --distribution option.
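The time parameter expects SLURM's days-hours:minutes:seconds layout. As a minimal sketch, assuming the limit is available as a Python timedelta, a string in that layout could be built as follows. The helper name format_slurm_time is hypothetical and is not part of tfep.

```python
from datetime import timedelta


def format_slurm_time(limit: timedelta) -> str:
    """Format a timedelta as a SLURM 'days-hours:minutes:seconds' string.

    Hypothetical helper for illustration; it is not part of tfep.
    """
    total_seconds = int(limit.total_seconds())
    if total_seconds < 0:
        raise ValueError("time limit must be non-negative")
    # Split the total into days, hours, minutes, and seconds.
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# One day and six minutes, the same value as the documented example.
print(format_slurm_time(timedelta(days=1, minutes=6)))  # 1-00:06:00
```

The resulting string can then be passed wherever a SLURM time limit is expected.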
See also
Launcher – Standard launcher class.
Examples
If the number of nodes/tasks/cpus is given as an integer, all srun parallel executions will have the same number of nodes/tasks/cpus.
>>> launcher = SRunLauncher(n_nodes=2, n_tasks_per_node=4, n_cpus_per_task=4)
Multiple commands can be run in parallel either by calling srun twice or by calling it once with the --multi-prog argument, which is designed to support multiple-program multiple-data (MPMD) MPI programs. In the first case, it is possible to specify the configuration for each srun.
For example, this modifies the launcher to run two commands in parallel using the same number of cpus per task but different numbers of nodes and tasks per node.
>>> launcher.n_nodes = [1, 4]
>>> launcher.n_tasks_per_node = [8, 4]
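To make the scalar-versus-list behavior concrete, the sketch below expands a set of options into one srun argument list per command. It is not tfep's implementation: the names build_srun_commands and OPTION_FLAGS and the program names are hypothetical, and the mapping from parameter names to srun flags (--nodes, --ntasks-per-node, --cpus-per-task) is an assumption based on srun's standard options.

```python
# Assumed mapping from launcher parameter names to srun flags.
OPTION_FLAGS = {
    "n_nodes": "--nodes",
    "n_tasks_per_node": "--ntasks-per-node",
    "n_cpus_per_task": "--cpus-per-task",
}


def build_srun_commands(commands, **options):
    """Prepend 'srun' and its options to each command (illustrative only)."""
    full_commands = []
    for idx, command in enumerate(commands):
        srun = ["srun"]
        for name, value in options.items():
            # A list holds one value per command; a scalar is shared by all.
            if isinstance(value, (list, tuple)):
                value = value[idx]
            srun.append(f"{OPTION_FLAGS[name]}={value}")
        full_commands.append(srun + list(command))
    return full_commands


cmds = build_srun_commands(
    [["./prog_a"], ["./prog_b", "--flag"]],
    n_nodes=[1, 4],
    n_tasks_per_node=[8, 4],
    n_cpus_per_task=4,
)
for cmd in cmds:
    print(" ".join(cmd))
# srun --nodes=1 --ntasks-per-node=8 --cpus-per-task=4 ./prog_a
# srun --nodes=4 --ntasks-per-node=4 --cpus-per-task=4 ./prog_b --flag
```

Each resulting list is in the format accepted by subprocess.Popen, so the two commands could be started as separate processes that run in parallel.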
When --multi-prog is used instead, srun is invoked only once. Thus no option can be a list, except for n_tasks, which must be provided as a list and is used to determine the task ranks assigned to each program.
The following example configures the launcher to run three programs on 4 nodes and 7 tasks. It assigns 3 tasks to the second program and 2 tasks to each of the others.
>>> launcher = SRunLauncher(n_nodes=4, n_tasks=[2, 3, 2], multiprog=True)
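The contiguous rank assignment can be illustrated with a short sketch that turns n_tasks into the lines of an srun --multi-prog configuration file, where each line pairs a range of task ranks with a program. The helper name multiprog_config_lines and the program names are hypothetical, and the file that tfep actually writes may differ in detail.

```python
def multiprog_config_lines(n_tasks, commands):
    """Return config lines assigning contiguous task ranks to each command.

    Hypothetical helper for illustration; it is not part of tfep.
    """
    if len(n_tasks) != len(commands):
        raise ValueError("n_tasks needs one entry per command")
    lines = []
    first_rank = 0
    for count, command in zip(n_tasks, commands):
        if count < 1:
            raise ValueError("each command needs at least one task")
        last_rank = first_rank + count - 1
        # A single task is written as one rank, several as 'first-last'.
        ranks = str(first_rank) if count == 1 else f"{first_rank}-{last_rank}"
        lines.append(f"{ranks} {' '.join(command)}")
        # The next command starts right after the last rank used here.
        first_rank = last_rank + 1
    return lines


lines = multiprog_config_lines(
    [2, 3, 2], [["./prog_a"], ["./prog_b", "-v"], ["./prog_c"]]
)
print("\n".join(lines))
# 0-1 ./prog_a
# 2-4 ./prog_b -v
# 5-6 ./prog_c
```

With n_tasks=[2, 3, 2], the seven ranks 0 through 6 are split so that the second program receives ranks 2 to 4 and the other two receive two ranks each.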
- __init__(n_tasks: int | list[int] | None = None, multiprog: bool = False, multiprog_config_file_path: str = 'srun-job.conf', **kwargs)
Methods
__init__([n_tasks, multiprog, ...])
run(*commands, **kwargs) – Run one or more commands with srun.
Attributes
GLOBAL_SRUN_OPTIONS
n_tasks – The number of tasks to pass to srun for each command.
multiprog – Whether the --multi-prog feature should be used to run multiple commands.
multiprog_config_file_path – The file path (relative to the working directory) where the multiprog configuration file is created.
Other keyword arguments for SRunTool.
- multiprog
Whether the --multi-prog feature should be used to run multiple commands.
- multiprog_config_file_path
The file path (relative to the working directory) where the multiprog configuration file is created.
- n_tasks
The number of tasks to pass to srun for each command.
- run(*commands, **kwargs)
Run one or more commands with srun.
The method accepts all keyword arguments supported by tfep.utils.cli.Launcher.run().
- Parameters:
*commands – One or more commands to execute, either in the same list format used by subprocess.Popen or as a CLITool.
**kwargs – Other keyword arguments to pass to Launcher.run.
- Returns:
result – The object encapsulating the results of the process. If multiple processes are run in parallel, this is a list of results, one for each process. Note that when running with multiprog only a single result is returned.
- Return type:
subprocess.CompletedProcess or List[subprocess.CompletedProcess]
See also
tfep.utils.cli.Launcher.run – The parent class method.
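Because run() returns either a single result or a list of results, calling code may want to normalize the value before iterating over it. The following is a minimal sketch, assuming the results are subprocess.CompletedProcess instances as documented above. The helper name as_result_list is hypothetical, and the results are constructed by hand rather than produced by srun.

```python
import subprocess


def as_result_list(result):
    """Wrap a single CompletedProcess in a list; copy lists unchanged."""
    if isinstance(result, subprocess.CompletedProcess):
        return [result]
    return list(result)


# Hand-made results standing in for what a launcher might return.
single = subprocess.CompletedProcess(args=["srun", "./prog_a"], returncode=0)
several = [
    single,
    subprocess.CompletedProcess(args=["srun", "./prog_b"], returncode=1),
]

# The same loop now works for both return shapes.
for res in as_result_list(several):
    print(res.args, res.returncode)

# Collect the processes that exited with a non-zero return code.
failed = [res for res in as_result_list(several) if res.returncode != 0]
```

Normalizing first keeps the error handling identical whether one command was run, several were run in parallel, or multiprog produced a single result.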