Batch Shipyard Slurm Configuration
This page contains in-depth details on how to configure a Slurm configuration file for Batch Shipyard.
Schema
The Slurm schema is as follows:
slurm: storage_account_settings: mystorageaccount location: <Azure region, e.g., eastus> resource_group: my-slurm-rg cluster_id: slurm controller: ssh: username: shipyard ssh_public_key: /path/to/rsa/publickey.pub ssh_public_key_data: ssh-rsa ... ssh_private_key: /path/to/rsa/privatekey generated_file_export_path: null public_ip: enabled: true static: false virtual_network: name: myvnet resource_group: my-vnet-resource-group existing_ok: false address_space: 10.0.0.0/16 subnet: name: my-slurm-controller-subnet address_prefix: 10.0.1.0/24 network_security: ssh: - '*' custom_inbound_rules: myrule: destination_port_range: 5000-5001 protocol: '*' source_address_prefix: - 1.2.3.4 - 5.6.7.0/24 vm_size: STANDARD_D2_V2 vm_count: 2 accelerated_networking: false additional_prep_script: /path/to/some/script-controller.sh login: ssh: username: shipyard ssh_public_key: /path/to/rsa/publickey.pub ssh_public_key_data: ssh-rsa ... ssh_private_key: /path/to/rsa/privatekey generated_file_export_path: null public_ip: enabled: true static: false virtual_network: name: myvnet resource_group: my-vnet-resource-group existing_ok: false address_space: 10.0.0.0/16 subnet: name: my-slurm-login-subnet address_prefix: 10.0.2.0/24 network_security: ssh: - '*' custom_inbound_rules: myrule: destination_port_range: 5000-5001 protocol: '*' source_address_prefix: - 1.2.3.4 - 5.6.7.0/24 vm_size: STANDARD_D4_V2 vm_count: 1 accelerated_networking: false additional_prep_script: /path/to/some/script-login.sh shared_data_volumes: nfs_server: mount_path: /shared store_slurmctld_state: true slurm_options: idle_reclaim_time: 00:15:00 elastic_partitions: partition_1: batch_pools: mypool1: account_service_url: https://... compute_node_type: dedicated max_compute_nodes: 32 weight: 0 features: - arbitrary_constraint_1 reclaim_exclude_num_nodes: 8 mypool2: account_service_url: https://... compute_node_type: low_priority max_compute_nodes: 128 weight: 1 features: - arbitrary_constraint_2 reclaim_exclude_num_nodes: 0 max_runtime_limit: null default: true preempty_type: preempt/partition_prio preempt_mode: requeue over_subscribe: no priority_tier: 10 other_options: [] partition_2: batch_pools: mypool3: account_service_url: https://... compute_node_type: low_priority max_compute_nodes: 256 weight: 2 features: [] reclaim_exclude_num_nodes: 0 max_runtime_limit: 1.12:00:00 default: false unmanaged_partitions: - partition: 'PartitionName=onprem Nodes=onprem-[0-31] Default=No MaxTime=INFINITE State=UP' nodes: - 'NodeName=onprem-[0-31] CPUs=512 Sockets=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=512128 State=UNKNOWN'
The slurm property has the following members:
- (required)
storage_account_settingsis the storage account link to store all Slurm metadata. Anyslurmcommand that must store metadata or actions uses this storage account. - (required)
locationis the Azure region name for the resources, e.g.,eastusornortheurope. - (required)
resource_groupthis is the resource group to use for the Slurm resources. - (required)
cluster_idis the name of the Slurm cluster to create. This is also the DNS label prefix to apply to each virtual machine and resource allocated for the Slurm cluster. It should be unique.
There are two required sections for resources that comprise the Slurm
cluster: controller and login. The controller section specifies the VM
configuration which hosts the Slurm controller (and possibly the Slurm DBD).
The login section specifies the VM configuration which hosts the login nodes
for the Slurm cluster.
Both the controller and login sections have the following identical
configuration properties:
- (required)
sshis the SSH admin user to create on the machine. If you are running Batch Shipyard on Windows, please refer to these instructions on how to generate an SSH keypair for use with Batch Shipyard.- (required)
usernameis the admin user to create on all virtual machines - (optional)
ssh_public_keyis the path to a pre-existing ssh public key to use. If this is not specified, an RSA public/private key pair will be generated for use in your current working directory (with a non-colliding name for auto-generated SSH keys for compute pools, i.e.,id_rsa_shipyard_remotefs). On Windows only, if this is option is not specified, the SSH keys are not auto-generated (unlessssh-keygen.execan be invoked in the current working directory or is in%PATH%). This option cannot be specified withssh_public_key_data. - (optional)
ssh_public_key_datais the raw RSA public key data in OpenSSH format, e.g., a string starting withssh-rsa .... Only one key may be specified. This option cannot be specified withssh_public_key. - (optional)
ssh_private_keyis the path to an existing SSH private key to use against eitherssh_public_keyorssh_public_key_datafor connecting to storage nodes and performing operations that require SSH such as cluster resize and detail status. This option should only be specified if eitherssh_public_keyorssh_public_key_dataare specified. - (optional)
generated_file_export_pathis an optional path to specify for where to create the RSA public/private key pair.
- (required)
- (optional)
public_ipare public IP properties for the virtual machine.- (optional)
enableddesignates if public IPs should be assigned. The default istrue. Note that if public IP is disabled, then you must create an alternate means for accessing the Slurm resource virtual machine through a "jumpbox" on the virtual network. If this property is set tofalse(disabled), then any action requiring SSH, or the SSH command itself, will occur against the private IP address of the virtual machine. - (optional)
staticis to specify if static public IPs should be assigned to each virtual machine allocated. The default isfalsewhich results in dynamic public IP addresses. A "static" FQDN will be provided per virtual machine, regardless of this setting if public IPs are enabled.
- (optional)
- (required)
virtual_networkis the virtual network to use for the Slurm resource.- (required)
nameis the virtual network name - (optional)
resource_groupis the resource group for the virtual network. If this is not specified, the resource group name falls back to the resource group specified in the Slurm resource. - (optional)
existing_okallows use of a pre-existing virtual network. The default isfalse. - (required if creating, optional otherwise)
address_spaceis the allowed address space for the virtual network. - (required)
subnetspecifies the subnet properties.- (required)
nameis the subnet name. - (required)
address_prefixis the subnet address prefix to use for allocation of the Slurm resource virtual machine to.
- (required)
- (required)
- (required)
network_securitydefines the network security rules to apply to the Slurm resource virtual machine.- (required)
sshis the rule for which address prefixes to allow for connecting to sshd port 22 on the virtual machine. In the example,"*"allows any IP address to connect. This is an array property which allows multiple address prefixes to be specified. - (optional)
grafanarule allows grafana HTTPS (443) server port to be exposed to the specified address prefix. Multiple address prefixes can be specified. - (optional)
prometheusrule allows the Prometheus server port to be exposed to the specified address prefix. Multiple address prefixes can be specified. - (optional)
custom_inbound_rulesare custom inbound rules for other services that you need to expose.- (required)
<rule name>is the name of the rule; the example usesmyrule. Each rule name should be unique.- (required)
destination_port_rangeis the ports on each virtual machine that will be exposed. This can be a single port and should be a string. - (required)
source_address_prefixis an array of address prefixes to allow.
- (required)
- (required)
protocolis the protocol to allow. Valid values aretcp,udpand*(which means any protocol).
- (required)
- (required)
- (required)
vm_sizeis the virtual machine instance size to use. - (required)
vm_countis the number of virtual machines to allocate of this instance type. Forcontroller, a value greater than1will create a HA Slurm cluster. Additionally, a value of greater than1will automatically place the VMs in an availability set. - (optional)
accelerated_networkingenables or disables accelerated networking. The default isfalseif not specified. - (optional)
additional_prep_scriptproperty specifies a local file which will be uploaded then executed for additional prep/configuration that should be applied to each Slurm resource.
There are two required sections for specifying how the Slurm
cluster is configured: shared_data_volumes and slurm_options sections.
The shared_data_volumes section configures shared file systems (or
RemoteFS clusters as provisioned by Batch Shipyard). The slurm_options
section configures the Slurm partitions.
The following describes the shared_data_volumes configuration:
- (required) Storage cluster id is a named dictionary key that refers
to a defined storage cluster in the global configuration file (and
subsequently the RemoteFS configuration).
- (required)
mount_pathis the mount path across all Slurm resources and compute nodes. - (required)
store_slurmctld_statedesignates this shared data volume as the volume that hosts the slurmctld state for HA failover.
- (required)
The following describes the slurm_options configuration:
- (required)
idle_reclaim_timespecifies the amount of time required to pass while nodes are idle for them to be reclaimed (or suspended) by Slurm. The format for this property is a timedelta with a string representation of "d.HH:mm:ss". "HH:mm:ss" is required but "d" is optional. - (required)
elastic_partitionsspecifies the Slurm partitions to create for elastic cloud bursting onto Azure Batch- (required) Unique name of the partition
- (required)
batch_poolsspecifies the Batch pools which will be dynamically sized by Batch Shipyard and Slurm. All Batch pools should be pre-allocated (unless using theorchestratecommand in conjunction with using one pool) with 0 nodes.- (required) Batch Pool Id
- (optional)
account_service_urlis the Batch account service URL associated with this Batch pool. Currently, this is restricted to the service url specified in the credentials file. - (required)
compute_node_typeis the compute node type to allocate, can be eitherdedicatedorlow_priority. - (required)
max_compute_nodesis the maximum number of compute nodes that can be allocated. - (required)
weightis this weight for this Batch pool in this partition. See the Slurm documentation for more details. - (optional)
featuresare additional features labeled on this partition. - (optional)
reclaim_exclude_num_nodesis the number of nodes to exclude from reclaiming for this Batch pool.
- (optional)
- (required) Batch Pool Id
- (optional)
max_runtime_limitimposes a maximum runtime limit for this partition. The format for this property is a timedelta with a string representation of "d.HH:mm:ss". "HH:mm:ss" is required but "d" is optional. - (required)
defaultdesignates this partition as the default partition. - (optional)
preempt_typeis the PreemptType setting for preemption - (optional)
preempt_modeis the PreemptMode setting for preemption - (opation)
over_subscribeis the OverSubscribe setting associated with preemption - (optional)
priority_tieris the PriorityTier setting for preemption - (optional)
other_optionsis a sequence of other options to specify on the partition
- (required)
- (required) Unique name of the partition
- (optional)
unmanaged_partitionsspecifies partitions which are not managed by Batch Shipyard but those that you wish to join to the Slurm controller. This is useful for joining on-premises nodes within the same Virtual Network (or peered) to the Slurm cluster. Each sequence member has the properties:- (required)
partitionspecifies the partition entry in the Slurm configuration file. - (required)
nodesis a sequence of Slurm node entries in the Slurm configuration file as it relates to the partition.
- (required)
Slurm with Batch Shipyard Guide
Please see the full guide for relevant terminology and information on how this feature works in Batch Shipyard.
Full template
A full template of a Slurm cluster configuration file can be found here. Note that these templates cannot be used as-is and must be modified to fit your scenario.