Batch Shipyard Pool Configuration
This page contains in-depth details on how to configure the pool configuration file for Batch Shipyard.
Considerations
- Note that the environment variable conventions used below are for Linux. Windows environment variables should follow Windows conventions. For example, the Azure Batch Shared Directory on compute nodes is referenced on Linux as `$AZ_BATCH_NODE_SHARED_DIR`, while on Windows it would be `%AZ_BATCH_NODE_SHARED_DIR%`.
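As a rough illustration of the convention difference, the sketch below rewrites Linux-style variable references into the Windows form. The helper name `to_windows_env_syntax` is hypothetical and is not part of Batch Shipyard; it only changes the variable syntax and leaves path separators untouched.

```python
import re

# Matches $NAME or ${NAME}, where NAME is a typical environment variable name.
_ENV_REF = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def to_windows_env_syntax(path: str) -> str:
    """Rewrite Linux-style $VAR references as Windows-style %VAR%.

    Hypothetical helper for illustration only; path separators are not changed.
    """
    return _ENV_REF.sub(lambda m: "%" + (m.group(1) or m.group(2)) + "%", path)


print(to_windows_env_syntax("$AZ_BATCH_NODE_SHARED_DIR/pooldata"))
# prints: %AZ_BATCH_NODE_SHARED_DIR%/pooldata
```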
Schema
The pool schema is as follows:
pool_specification:
  id: batch-shipyard-pool
  vm_configuration:
    platform_image:
      publisher: Canonical
      offer: UbuntuServer
      sku: 16.04-LTS
      version: latest
      native: false
      license_type: null
    custom_image:
      arm_image_id: /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Compute/galleries/<gallery_name>/images/<image_name>/versions/<version>
      node_agent: <node agent sku id>
      native: false
      license_type: null
  vm_size: STANDARD_D2_V2
  vm_count:
    dedicated: 4
    low_priority: 8
  max_tasks_per_node: 1
  resize_timeout: 00:20:00
  node_fill_type: pack
  autoscale:
    evaluation_interval: 00:15:00
    scenario:
      name: active_tasks
      maximum_vm_count:
        dedicated: 16
        low_priority: 8
      maximum_vm_increment_per_evaluation:
        dedicated: 4
        low_priority: -1
      node_deallocation_option: taskcompletion
      sample_lookback_interval: 00:10:00
      required_sample_percentage: 70
      bias_last_sample: true
      bias_node_type: low_priority
      rebalance_preemption_percentage: 50
      time_ranges:
        weekdays:
          start: 1
          end: 5
        work_hours:
          start: 8
          end: 17
    formula: null
  inter_node_communication_enabled: false
  per_job_auto_scratch: false
  reboot_on_start_task_failed: false
  attempt_recovery_on_unusable: false
  upload_diagnostics_logs_on_unusable: true
  block_until_all_global_resources_loaded: true
  transfer_files_on_pool_creation: false
  input_data:
    azure_batch:
    - destination: $AZ_BATCH_NODE_SHARED_DIR/jobonanotherpool
      exclude:
      - '*.txt'
      include:
      - wd/*.dat
      job_id: jobonanotherpool
      task_id: mytask
    azure_storage:
    - storage_account_settings: mystorageaccount
      remote_path: poolcontainer/dir
      local_path: $AZ_BATCH_NODE_SHARED_DIR/pooldata
      is_file_share: false
      exclude:
      - '*.tmp'
      include:
      - pooldata*.bin
      blobxfer_extra_options: null
  resource_files:
  - blob_source: https://some.url
    file_mode: '0750'
    file_path: path/in/wd/file.bin
  ssh:
    username: shipyard
    expiry_days: 30
    ssh_public_key: /path/to/rsa/publickey.pub
    ssh_public_key_data: ssh-rsa ...
    ssh_private_key: /path/to/rsa/privatekey
    generate_docker_tunnel_script: true
    generated_file_export_path:
    hpn_server_swap: false
    allow_docker_access: false
  rdp:
    username: shipyard
    password: null
    expiry_days: 30
  remote_access_control:
    starting_port: 49000
    allow:
    - 1.2.3.4
    deny:
    - '*'
  virtual_network:
    arm_subnet_id: /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Network/virtualNetworks/<virtual_network_name>/subnets/<subnet_name>
    name: myvnet
    resource_group: resource-group-of-vnet
    create_nonexistant: false
    address_space: 10.0.0.0/16
    subnet:
      name: subnet-for-batch-vms
      address_prefix: 10.0.0.0/20
  public_ips:
  - /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Network/publicIPAddresses/<public_ip_name1>
  - /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Network/publicIPAddresses/<public_ip_name2>
  certificates:
    sha1-thumbprint:
      visibility:
      - task
      - start_task
      - remote_user
  additional_node_prep:
    commands:
      pre: []
      post: []
    environment_variables:
      abc: xyz
    environment_variables_keyvault_secret_id: https://myvault.vault.azure.net/secrets/nodeprepenv
  gpu:
    nvidia_driver:
      source: https://some.url
    ignore_warnings: false
  batch_insights_enabled: false
  prometheus:
    node_exporter:
      enabled: false
      port: 9100
      options: []
    cadvisor:
      enabled: false
      port: 8080
      options: []
  container_runtimes:
    install:
    - kata_containers
    - singularity
    default: null
The pool_specification property has the following members:
- (required) `id` is the compute pool ID. This value can be any combination of alphanumeric characters, hyphens, and underscores, up to 64 characters in length. If this pool specification is used for an `auto_pool`, then the maximum length of this value is 20 characters, which becomes the prefix for the autopool id.
- (required) `vm_configuration` specifies the image configuration for the VM. Either `platform_image` or `custom_image` must be specified. You cannot specify both. Please see the Batch Shipyard Platform Image support doc for more information on which Marketplace images are supported. If using a custom image, please see the Custom Image Guide first.
  - (required for platform image) `platform_image` defines the Marketplace platform image to use:
    - (required for platform image) `publisher` is the publisher name of the Marketplace VM image.
    - (required for platform image) `offer` is the offer name of the Marketplace VM image.
    - (required for platform image) `sku` is the sku name of the Marketplace VM image.
    - (optional) `version` is the image version to use. The default is `latest`.
    - (optional) `native` will convert the platform image to use native Docker container support for Azure Batch, if possible. This can provide better task management (such as job and task termination while tasks are running) and potentially lead to faster compute node provisioning (although not guaranteed), in exchange for some features that are not available in this mode, such as Singularity containers, task-level data ingress, or task-level data egress that is not bound for Azure Storage Blobs, among others. If there is no `native` conversion equivalent for the specified `publisher`, `offer`, `sku`, then no conversion is performed and this option will be force disabled. Note that `native` mode is not compatible with Singularity containers. The default is `false`. Please see the FAQ for more information.
    - (optional) `license_type` specifies the type of on-premises license to be used when deploying the operating system. This activates the Azure Hybrid Use Benefit for qualifying license holders. This only applies to Windows OS types. You must comply with the terms set forth by this program; please consult the FAQ for further information. The only valid value is `windows_server`.
  - (required for custom image) `custom_image` defines the custom image to use. AAD `batch` credentials are required to use custom images for both Batch service and User Subscription modes.
    - (required for custom image) `arm_image_id` defines the Azure Shared Image Gallery resource id to use as the OS image for the pool. The Shared Image resource must be replicated (and have completed replication) to the same region as the Batch account.
    - (required for custom image) `node_agent` is the node agent sku id to use with this custom image. You can view supported base images and their node agent sku ids with the `account images` command.
    - (optional) `native` will opt to use native Docker container support if possible. This provides better task management (such as job and task termination while tasks are running), in exchange for some other features that are not available in this mode, such as task-level data ingress or task-level data egress that is not bound for Azure Storage Blobs. The default is `false`.
    - (optional) `license_type` specifies the type of on-premises license to be used when deploying the operating system. This activates the Azure Hybrid Use Benefit for qualifying license holders. This only applies to Windows OS types. You must comply with the terms set forth by this program; please consult the FAQ for further information. The only valid value is `windows_server`.
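The `id` and `vm_configuration` rules described above can be checked before submitting a configuration. The following is a minimal sketch, assuming the pool specification has already been loaded into a Python dictionary; the function names are hypothetical and this is not how Batch Shipyard itself validates input.

```python
import re

# Alphanumeric characters, hyphens, and underscores only.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def check_pool_id(pool_id: str, for_autopool: bool = False) -> None:
    """Raise ValueError if the pool id breaks the documented rules."""
    max_len = 20 if for_autopool else 64
    if not _ID_PATTERN.fullmatch(pool_id):
        raise ValueError("id may only contain alphanumerics, '-' and '_'")
    if len(pool_id) > max_len:
        raise ValueError(f"id is longer than {max_len} characters")


def select_image_type(vm_configuration: dict) -> str:
    """Return which image type is configured; exactly one must be present."""
    has_platform = vm_configuration.get("platform_image") is not None
    has_custom = vm_configuration.get("custom_image") is not None
    if has_platform == has_custom:
        raise ValueError("specify exactly one of platform_image or custom_image")
    return "platform_image" if has_platform else "custom_image"


check_pool_id("batch-shipyard-pool")
print(select_image_type({"platform_image": {"publisher": "Canonical"}}))
```

Treating a `null` entry the same as a missing one is a choice made for this sketch, not documented behavior.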
- (required) `vm_size` is the Azure Virtual Machine Instance Size. Please note that not all regions have every VM size available.
- (required) `vm_count` is the number of compute nodes to allocate. You may specify a mixed number of compute nodes in the following properties:
  - (optional) `dedicated` is the number of dedicated compute nodes to allocate. These nodes cannot be pre-empted. The default value is `0`.
  - (optional) `low_priority` is the number of low-priority compute nodes to allocate. These nodes may be pre-empted at any time. Workloads that are amenable to `low_priority` nodes are those that do not have strict deadlines for pickup and completion. Optimally, these types of jobs would checkpoint their progress and be able to recover when re-scheduled. The default value is `0`.
- (optional) `resize_timeout` is the amount of time allowed for resize operations (note that creating a pool resizes from 0 to the specified number of nodes). The format for this property is a timedelta with a string representation of "d.HH:mm:ss". "HH:mm:ss" is required, but "d" is optional. If not specified, the default is 15 minutes. This should not be specified (and is ignored) for `autoscale` enabled pools.
- (optional) `max_tasks_per_node` is the maximum number of concurrent tasks that can be running at any one time on a compute node. This defaults to a value of 1 if not specified. The maximum value for the property that Azure Batch will accept is `4 x <# cores per compute node>`. For instance, for a `STANDARD_F2` instance, because the virtual machine has 2 cores, the maximum allowable value for this property would be `8`.
- (optional) `node_fill_type` is the task scheduling compute node fill type policy to apply. `pack`, which is the default, attempts to pack the maximum number of tasks on a node (controlled through `max_tasks_per_node`) before scheduling tasks to another node. `spread` will schedule tasks evenly across compute nodes before packing.
- (optional) `autoscale` designates the autoscale settings for the pool. If specified, the `vm_count` becomes the minimum number of virtual machines for each node type for `scenario` based autoscale.
  - (optional) `evaluation_interval` is the time interval between autoscale evaluations performed by the service. The format for this property is a timedelta with a string representation of "d.HH:mm:ss". "HH:mm:ss" is required, but "d" is optional. If not specified, the default is 15 minutes. The smallest value that can be specified is 5 minutes. Use caution when specifying small `evaluation_interval` values, which can cause pool resizing errors and instability with volatile target counts.
  - (optional) `scenario` is a pre-set autoscale scenario where a formula will be generated with the parameters specified within this property.
    - (required) `name` is the autoscale scenario name to apply. Valid values are `active_tasks`, `pending_tasks`, `workday`, `workday_with_offpeak_max_low_priority`, `weekday`, `weekend`. Please see the autoscale guide for more information about these scenarios.
    - (required) `maximum_vm_count` is the maximum number of compute nodes that can be allocated from an autoscale evaluation. It is useful to have these limits in place to control the top-end scale of the autoscale scenario. Specifying a negative value for either of the following properties will result in effectively no maximum limit.
      - (optional) `dedicated` is the maximum number of dedicated compute nodes that can be allocated.
      - (optional) `low_priority` is the maximum number of low priority compute nodes that can be allocated.
    - (optional) `maximum_vm_increment_per_evaluation` is the maximum number of VMs to add per evaluation. Specifying a non-positive value (i.e., less than or equal to `0`) for either of the following properties will result in effectively no increment limit.
      - (optional) `dedicated` is the maximum increase in dedicated VMs per evaluation.
      - (optional) `low_priority` is the maximum increase in low priority VMs per evaluation.
    - (optional) `node_deallocation_option` is the node deallocation option to apply. When a pool is resized down and a node is selected for removal, this option specifies what action is performed for the running task. The valid values are: `requeue`, `terminate`, `taskcompletion`, and `retaineddata`. The default is `taskcompletion`. Please see this doc for more information.
    - (optional) `sample_lookback_interval` is the time interval to look back for past history for certain scenarios such as autoscale based on active and pending tasks. The format for this property is a timedelta with a string representation of "d.HH:mm:ss". "HH:mm:ss" is required, but "d" is optional. If not specified, the default is 10 minutes.
    - (optional) `required_sample_percentage` is the required percentage of samples that must be present during the `sample_lookback_interval`. If not specified, the default is 70.
    - (optional) `bias_last_sample` will bias the autoscale scenario, if applicable, to use the last sample during history computation. This can be enabled to more quickly respond to changes in history with respect to averages. The default is `true`.
    - (optional) `bias_node_type` will bias the autoscale scenario, if applicable, to favor one type of node over the other when making a decision on how many of each node to allocate. The default is `auto`, or equal weight to both `dedicated` and `low_priority` nodes. Valid values are `null` (or omitting the property), `dedicated`, or `low_priority`.
    - (optional) `rebalance_preemption_percentage` will rebalance the compute nodes to bias for dedicated nodes when the pre-empted node count reaches the indicated threshold percentage of the total current dedicated and low priority nodes. The default is `null`, or no rebalancing is performed.
    - (optional) `time_ranges` defines the time ranges for the day-of-week based scenarios.
      - (optional) `weekdays` defines the days of the week which should be considered weekdays, where `1` = Monday.
        - (optional) `start` defines the inclusive start weekday day of the week as an integer. The default is `1`.
        - (optional) `end` defines the inclusive end weekday day of the week as an integer. The default is `5`.
      - (optional) `work_hours` defines the hours of the day in the work day with a range from `0` to `23`, inclusive.
        - (optional) `start` defines the inclusive start hour of the work day as an integer. The default is `8`.
        - (optional) `end` defines the inclusive end hour of the work day as an integer. The default is `17`.
  - (optional) `formula` is a custom autoscale formula to apply to the pool. If both `formula` and `scenario` are specified, then `formula` is used.
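Two of the numeric rules described above lend themselves to a short worked example: the "d.HH:mm:ss" timedelta format (with the 5 minute minimum for `evaluation_interval`) and the `4 x cores` ceiling on `max_tasks_per_node`. This is a minimal sketch with hypothetical function names; it does not range-check the hour, minute, or second fields, and it is not Batch Shipyard's own parser.

```python
import re
from datetime import timedelta

# Optional "d." day prefix followed by the required HH:mm:ss part.
_TIMEDELTA = re.compile(r"(?:(\d+)\.)?(\d{2}):(\d{2}):(\d{2})")


def parse_timedelta(value: str) -> timedelta:
    """Parse a "d.HH:mm:ss" or "HH:mm:ss" string into a timedelta."""
    match = _TIMEDELTA.fullmatch(value)
    if match is None:
        raise ValueError(f"not in d.HH:mm:ss form: {value!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def check_evaluation_interval(value: str) -> timedelta:
    """Enforce the documented 5 minute minimum for evaluation_interval."""
    interval = parse_timedelta(value)
    if interval < timedelta(minutes=5):
        raise ValueError("evaluation_interval must be at least 5 minutes")
    return interval


def max_tasks_per_node_limit(cores: int) -> int:
    """Largest max_tasks_per_node accepted for a VM with this many cores."""
    return 4 * cores


print(check_evaluation_interval("00:15:00"))  # prints: 0:15:00
print(max_tasks_per_node_limit(2))            # prints: 8 (e.g., STANDARD_F2)
```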
- (optional) `inter_node_communication_enabled` designates if this pool is set up for inter-node communication. This must be set to `true` for any containers that must communicate with each other, such as MPI applications. This property cannot be enabled if there are positive values for both `dedicated` and `low_priority` compute nodes specified above. This property will be force enabled if peer-to-peer replication is enabled.
- (optional) `per_job_auto_scratch` will enable on-demand distributed scratch space creation across all dedicated or low priority nodes in a pool for a job. This scratch will be available at the location `$AZ_BATCH_TASK_DIR/auto_scratch` within the container. The scratch drive is cleaned up automatically on job termination or deletion. This option requires setting the property `inter_node_communication_enabled` to `true`. Note that SSH and BeeGFS communication must be allowed on the virtual network between nodes. Thus, if specifying `virtual_network` and/or `remote_access_control` rules, you must ensure that the internal network traffic is not blocked by NSG rules. This option is only available on a subset of supported Linux distributions. The default, if not specified, is `false`.
- (optional) `reboot_on_start_task_failed` allows Batch Shipyard to reboot the compute node in case there is a transient failure in node preparation (e.g., network timeout, resolution failure or download problem). This defaults to `false`.
- (optional) `attempt_recovery_on_unusable` allows Batch Shipyard to automatically attempt to recover nodes that enter the `unusable` state. Note that enabling this option can lead to an infinite wait on `pool add` or `pool resize` with `--wait`. This defaults to `false` and is ignored for `custom_image`, where the behavior is always `false`.
- (optional) `upload_diagnostics_logs_on_unusable` allows Batch Shipyard to attempt upload of diagnostics logs for nodes that have entered the unusable state during provisioning to the storage account designated under the `batch_shipyard`:`storage_account_settings` global configuration property. Note that this typically will only result in one set of logs being uploaded even if multiple nodes eventually enter this state. These logs can be referenced in conjunction with a support request to provide additional insight into why a compute node failed to provision properly. This defaults to `true`. Note that by setting this property to `true`, these diagnostics logs are not automatically sent to Microsoft and must be included, either indirectly via the SAS URL generated or directly, with support requests.
- (optional) `block_until_all_global_resources_loaded` will block the node from entering ready state until all Docker images are loaded. This defaults to `true`. This option has no effect on `native` container support pools (the behavior will effectively reflect `true` for this property on `native` container support pools).
- (optional) `transfer_files_on_pool_creation` will ingress all `files` specified in the `global_resources` section of the global configuration file when the pool is created. If files are to be ingressed to Azure Blob or File Storage, then data movement operations are overlapped with the creation of the pool. If files are to be ingressed to a shared file system on the compute nodes, then the files are ingressed after the pool is created and the shared file system is ready. Files can be ingressed to both Azure Blob Storage and a shared file system during the same pool creation invocation. If this property is set to `true`, then `block_until_all_global_resources_loaded` will be force disabled. If omitted, this property defaults to `false`.
- (optional) `input_data` is an object containing data that should be ingressed to all compute nodes as part of node preparation. It is important to note that if you are combining this action with `files` and are ingressing data to Azure Blob or File storage as part of pool creation, the blob containers or file shares defined here will be downloaded as soon as the compute node is ready to do so. This may result in the blob container/blobs or file share/files not being ready in time for the `input_data` transfer. It is up to you to ensure that these two operations do not overlap. If there is a possibility of overlap, then you should ingress data defined in `files` prior to pool creation and disable the `transfer_files_on_pool_creation` option above. This object currently supports `azure_batch` and `azure_storage` as members.
  - `azure_batch` contains the following members:
    - (required) `job_id` is the job id of the task
    - (required) `task_id` is the id of the task to fetch files from
    - (optional) `include` is an array of include filters
    - (optional) `exclude` is an array of exclude filters
    - (required) `destination` is the destination path to place the files
  - `azure_storage` contains the following members:
    - (required) `storage_account_settings` contains a storage account link as defined in the credentials config.
    - (required) `remote_path` is required when downloading from Azure Storage. This path on Azure includes either the container or file share path along with all virtual directories.
    - (required) `local_path` is required when downloading from Azure Storage. This specifies where the files should be downloaded to on the compute node. Please note that you should not specify a destination that is on a shared file system. If you require ingressing to a shared file system location like a GlusterFS volume, then use the global configuration `files` property and the `data ingress` command.
    - (optional) `is_file_share` denotes if the `remote_path` is on a file share. This defaults to `false`.
    - (optional) `include` property defines optional include filters.
    - (optional) `exclude` property defines optional exclude filters.
    - (optional) `blobxfer_extra_options` are any extra options to pass to `blobxfer`.
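Several of the boolean options described above interact with each other and with `vm_count`. The sketch below encodes three of those documented interactions against a pool specification held as a Python dictionary. The function name is hypothetical, raising `ValueError` is a choice made for illustration, and Batch Shipyard's actual handling may differ.

```python
import copy


def apply_pool_flag_rules(pool_spec: dict) -> dict:
    """Return a copy of pool_spec with the documented flag interactions applied."""
    spec = copy.deepcopy(pool_spec)
    vm_count = spec.get("vm_count", {})
    dedicated = vm_count.get("dedicated", 0)
    low_priority = vm_count.get("low_priority", 0)
    inter_node = spec.get("inter_node_communication_enabled", False)

    # Inter-node communication cannot be combined with a mixed pool.
    if inter_node and dedicated > 0 and low_priority > 0:
        raise ValueError(
            "inter_node_communication_enabled requires either dedicated "
            "or low_priority nodes, not both"
        )
    # per_job_auto_scratch depends on inter-node communication.
    if spec.get("per_job_auto_scratch", False) and not inter_node:
        raise ValueError(
            "per_job_auto_scratch requires inter_node_communication_enabled"
        )
    # Transferring files at pool creation force disables the blocking option.
    if spec.get("transfer_files_on_pool_creation", False):
        spec["block_until_all_global_resources_loaded"] = False
    return spec


result = apply_pool_flag_rules({
    "vm_count": {"dedicated": 4, "low_priority": 0},
    "inter_node_communication_enabled": True,
    "per_job_auto_scratch": True,
    "transfer_files_on_pool_creation": True,
    "block_until_all_global_resources_loaded": True,
})
print(result["block_until_all_global_resources_loaded"])  # prints: False
```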
- (optional) `resource_files` is an array of resource files that should be downloaded as part of the compute node's preparation. Each array entry contains the following information:
  - `file_path` is the path within the node prep task working directory to place the file on the compute node. This directory can be referenced by the `$AZ_BATCH_NODE_STARTUP_DIR/wd` path.
  - `blob_source` is an accessible HTTP/HTTPS URL. This need not be an Azure Blob Storage URL.
  - `file_mode` is the file mode to set for the file on the compute node. This is optional.
- (optional) `virtual_network` is the property for specifying an ARM-based virtual network resource for the pool. AAD `batch` credentials are required for both Batch service and User Subscription modes. Please see the Virtual Network guide for more information.
  - (required/optional) `arm_subnet_id` is the full ARM resource id to the subnet on the virtual network. This virtual network must already exist and must exist within the same region and subscription as the Batch account. If this value is specified, the other properties of `virtual_network` are ignored. AAD `management` credentials are not strictly required for this case but are recommended to allow address space validation checks.
  - (required/optional) `name` is the name of the virtual network. If `arm_subnet_id` is not specified, this value is required. Note that this requires AAD `management` credentials.
  - (optional) `resource_group` is the resource group containing the virtual network. If the resource group name is not specified here, the `resource_group` specified in the `batch` credentials will be used instead.
  - (optional) `create_nonexistant` specifies if the virtual network and subnet should be created if not found. If not specified, this defaults to `false`.
  - (required if creating, optional otherwise) `address_space` is the allowed address space for the virtual network.
  - (required/optional) `subnet` specifies the subnet properties. This is required if `arm_subnet_id` is not specified, i.e., the virtual network `name` is specified instead.
    - (required) `name` is the subnet name.
    - (required) `address_prefix` is the subnet address prefix to use for allocating Batch compute nodes. The maximum number of compute nodes a subnet can support is 4096, which maps roughly to a 20-bit CIDR mask.
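To see how `address_space` and `address_prefix` relate, the sketch below uses Python's standard `ipaddress` module to confirm that a subnet lies inside the virtual network and to count the addresses in the prefix. The function name is hypothetical. The count is the raw size of the prefix, so it should be read as an upper bound on nodes rather than an exact capacity.

```python
import ipaddress


def check_subnet(address_space: str, address_prefix: str) -> int:
    """Verify the subnet is inside the vnet and return its raw address count."""
    vnet = ipaddress.ip_network(address_space)
    subnet = ipaddress.ip_network(address_prefix)
    if not subnet.subnet_of(vnet):
        raise ValueError(f"{address_prefix} is not within {address_space}")
    return subnet.num_addresses


# Values taken from the schema example: a /20 prefix holds 4096 addresses.
print(check_subnet("10.0.0.0/16", "10.0.0.0/20"))  # prints: 4096
```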
- (optional) `ssh` is the property for creating a user to accommodate SSH sessions to compute nodes. If this property is absent, then an SSH user is not created with pool creation. If you are running Batch Shipyard on Windows, please refer to these instructions on how to generate an SSH keypair for use with Batch Shipyard. This property is ignored for Windows-based pools.
  - (required) `username` is the user to create on the compute nodes.
  - (optional) `expiry_days` is the number of days from now for the account on the compute nodes to expire. The default is 30 days from invocation time.
  - (optional) `ssh_public_key` is the path to an existing SSH public key to use. If not specified, an RSA public/private keypair will be automatically generated if `ssh-keygen` or `ssh-keygen.exe` can be found on the `PATH`. This option cannot be specified with `ssh_public_key_data`.
  - (optional) `ssh_public_key_data` is the raw RSA public key data in OpenSSH format, e.g., a string starting with `ssh-rsa ...`. Only one key may be specified. This option cannot be specified with `ssh_public_key`.
  - (optional) `ssh_private_key` is the path to an existing SSH private key to use against either `ssh_public_key` or `ssh_public_key_data` for connecting to compute nodes. This option should only be specified if either `ssh_public_key` or `ssh_public_key_data` is specified.
  - (optional) `generate_docker_tunnel_script` property directs Batch Shipyard to generate an SSH tunnel script that can be used to connect to the remote Docker engine running on a compute node. This script can only be used on non-Windows systems.
  - (optional) `generated_file_export_path` is the path to export the generated RSA keypair and docker tunnel script to. If omitted, the current directory is used.
  - (experimental) `hpn_server_swap` property enables an OpenSSH server with HPN patches to be swapped with the standard distribution OpenSSH server. This is not supported on all Linux distributions and may be force disabled.
  - (optional) `allow_docker_access` allows this SSH user access to the Docker daemon. The default is `false`.
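The key-related options described above are mutually constrained. A minimal sketch of those constraints follows, assuming the `ssh` section is available as a Python dictionary. The function name is hypothetical, and rejecting a lone `ssh_private_key` outright is a stricter reading of "should only be specified" chosen for this example.

```python
def check_ssh_key_options(ssh: dict) -> None:
    """Raise ValueError if the ssh key options are combined inconsistently."""
    has_key_path = ssh.get("ssh_public_key") is not None
    has_key_data = ssh.get("ssh_public_key_data") is not None
    has_private = ssh.get("ssh_private_key") is not None

    # The public key may be given as a path or as raw data, never both.
    if has_key_path and has_key_data:
        raise ValueError(
            "ssh_public_key and ssh_public_key_data cannot both be specified"
        )
    # A private key only makes sense alongside one of the public key options.
    if has_private and not (has_key_path or has_key_data):
        raise ValueError(
            "ssh_private_key requires ssh_public_key or ssh_public_key_data"
        )


check_ssh_key_options({
    "username": "shipyard",
    "ssh_public_key": "/path/to/rsa/publickey.pub",
    "ssh_private_key": "/path/to/rsa/privatekey",
})
```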
- (optional) `rdp` is the property for creating a user to accommodate RDP login sessions to compute nodes. If this property is absent, then an RDP user is not created with pool creation. This property is ignored for Linux-based pools.
  - (required) `username` is the user to create on the compute nodes.
  - (optional) `expiry_days` is the number of days from now for the account on the compute nodes to expire. The default is 30 days from invocation time.
  - (optional) `password` is the password to associate with the user. Passwords must meet the minimum complexity requirements as required by Azure Batch. If omitted or set to `null`, then a random password is generated and logged during any `pool add` call with this section defined, or `pool user add`.
- (optional) `remote_access_control` is a property to control access to the remote access port (SSH or RDP). If this section is omitted, then the Batch service defaults are applied, which do not apply any network security rules on these ports.
  - (optional) `starting_port` is the starting port for each SSH port on each node to map to the "front-end" load balancer. The default value is `49000` if not specified. Ports from `50000` to `55000` are reserved by the Batch service. You must specify enough space for 1000 ports; e.g., `49500` would not be valid since the range would overlap into the reserved range.
  - (optional) `allow` is a list of allowable address prefixes in CIDR format.
  - (optional) `deny` is a list of address prefixes in CIDR format to deny. `deny` rules have lower priority than `allow` rules. Therefore, you can specify a set of allowable address prefixes and then specify a single deny rule of `*` to deny all other IP addresses from connecting to the remote access port. Take care when specifying `deny` rules when your nodes must make use of SSH or RDP to perform actions between compute nodes.
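The `starting_port` rule described above is simple range arithmetic: the 1000 ports beginning at `starting_port` must not touch the reserved range. The sketch below works through it with a hypothetical function name, and it assumes the reserved range `50000` to `55000` is inclusive at both ends.

```python
RESERVED_START = 50000  # first port reserved by the Batch service
RESERVED_END = 55000    # last reserved port (assumed inclusive)
PORT_SPAN = 1000        # number of ports that must be available


def is_valid_starting_port(starting_port: int) -> bool:
    """True if the 1000-port range avoids the reserved range entirely."""
    last_port = starting_port + PORT_SPAN - 1
    return last_port < RESERVED_START or starting_port > RESERVED_END


print(is_valid_starting_port(49000))  # prints: True  (uses 49000-49999)
print(is_valid_starting_port(49500))  # prints: False (uses 49500-50499)
```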
- (optional) `public_ips` property defines a list of any pre-defined Azure-allocated Public IP addresses that are assigned to the pool. These must not already be bound and there must be a sufficient number of public IPs to cover the number of compute nodes in a pool (or any potential future resizes). These must be fully-qualified ARM Public IP resource ids.
- (optional) `certificates` property defines any certificate references to add on this pool. These certificates must already be present on the Batch account and are only applied to new pool allocations.
  - (required) `sha1-thumbprint` is the actual SHA-1 thumbprint of the certificate to add to the pool.
    - (required) `visibility` is a list of visibility settings to apply to the certificate. Valid values are `node_prep`, `remote_user`, and `task`.
- (optional) `additional_node_prep` defines any additional node preparation commands to execute on node start.
  - (optional) `commands` are the commands to execute.
    - (optional) `pre` is an array of additional commands to execute on the compute node host as part of node preparation which occur prior to the Batch Shipyard node preparation steps. This is particularly useful for preparing platform images with software for custom Linux mounts.
    - (optional) `post` is an array of additional commands to execute on the compute node host as part of node preparation which occur after the Batch Shipyard node preparation steps.
  - (optional) `environment_variables` are environment variables that are set on the Azure Batch start task. Note that environment variables are not expanded and are passed as-is.
  - (optional) `environment_variables_keyvault_secret_id` are any additional environment variables that should be applied to the start task but are stored in KeyVault. The secret stored in KeyVault must be a valid YAML/JSON string, e.g., `{ "env_var_name": "env_var_value" }`.
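Because the KeyVault secret must itself be a YAML/JSON string, it can help to produce and check that string programmatically before storing it. The sketch below handles only the JSON form using the standard library. The function name is hypothetical, and requiring every value to be a string is an assumption made for this example, not a documented rule.

```python
import json


def parse_env_secret(secret_value: str) -> dict:
    """Parse a JSON secret string into a name-to-value mapping."""
    data = json.loads(secret_value)
    if not isinstance(data, dict):
        raise ValueError("secret must be a JSON object of name/value pairs")
    for name, value in data.items():
        # Assumption for this sketch: environment variable values are strings.
        if not isinstance(value, str):
            raise ValueError(f"value for {name!r} is not a string")
    return data


# Build the string to store as the secret, then confirm it round-trips.
secret = json.dumps({"env_var_name": "env_var_value"})
print(secret)  # prints: {"env_var_name": "env_var_value"}
print(parse_env_secret(secret) == {"env_var_name": "env_var_value"})  # prints: True
```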
- (optional) `gpu` property defines additional information for NVIDIA GPU-enabled VMs. If not specified, Batch Shipyard will automatically download the driver for the `vm_size` specified.
  - (optional) `nvidia_driver` property contains the following members:
    - (required) `source` is the source url to download the driver. This should be the silent-installable driver package.
  - (optional) `ignore_warnings` property allows overriding the default behavior of placing the node in start task failed state if, during node prep, there are warnings of possible GPU issues such as infoROM corruption. It is recommended not to set this value to `true`. The default, if not specified, is `false`.
- (optional) `batch_insights_enabled` property enables Batch Insights monitoring for the pool. This provides simple non-realtime, host-based monitoring through Batch Explorer. The default is `false`.
- (optional) `prometheus` properties control whether collectors for metrics to export to Prometheus monitoring are enabled. Note that none of the exporters have their ports mapped (NAT) on the load balancer pool. This means that the Prometheus instance itself must reside on, or be peered with, the virtual network that the compute nodes are in. This ensures that external parties cannot scrape exporter metrics from compute node instances.
  - (optional) `node_exporter` contains options for the Node Exporter metrics exporter.
    - (optional) `enabled` property enables or disables this exporter. Default is `false`.
    - (optional) `port` is the port for Prometheus to connect to scrape. This is the internal port on the compute node.
    - (optional) `options` is a list of options to pass to the node exporter instance running on all nodes. The following collectors are force disabled, in addition to others disabled by default: textfile, mdadm, wifi, xfs, zfs. The infiniband collector is enabled automatically if on an IB/RDMA instance. The nfs collector is enabled automatically if mounting an NFS RemoteFS storage cluster.
  - (optional) `cadvisor` contains options for the cAdvisor metrics exporter.
    - (optional) `enabled` property enables or disables this exporter. Default is `false`.
    - (optional) `port` is the port for Prometheus to connect to scrape. This is the internal port on the compute node.
    - (optional) `options` is a list of options to pass to the cAdvisor instance running on all nodes.
- (optional) `container_runtimes` properties control container runtime behavior on the pool compute nodes.
  - (optional) `install` controls which optional container runtimes to install. Valid values for this option are `kata_containers` and `singularity`. Note that the `runc` container runtime is always installed. The `nvidia` container runtime is automatically installed when allocating a pool with GPUs. `singularity` must be specified if running Singularity containers.
  - (optional) `default` is the default container runtime to use for running Docker containers. This option has no effect on `singularity` containers.
Full template
A full template of a pool configuration file can be found here. Note that these templates cannot be used as-is and must be modified to fit your scenario.