AWS Batch

tip

This guide assumes you have an existing Amazon Web Services (AWS) account.

The AWS Batch service quota for job queues is 50 per account. For more information on AWS Batch service quotas, see AWS Batch service quotas.

There are two ways to create a Seqera Platform compute environment for AWS Batch:

Automatically: this option lets Seqera automatically create the required AWS Batch resources in your AWS account, using an internal tool within Seqera Platform called "Forge". This removes the need to set up your AWS Batch infrastructure manually. Resources can also be automatically deleted when the compute environment is removed from Platform.
Manually: this option lets Seqera use AWS Batch resources that you created previously.

Both options require specific IAM permissions to function correctly, as well as access to an S3 bucket or EFS/FSx file system to store intermediate Nextflow files.

S3 bucket creation

AWS S3 (Simple Storage Service) is a type of object storage. To access input and output files using Seqera products like Studios and Data Explorer, create one or more S3 buckets. An S3 bucket can also be used to store intermediate results of your Nextflow pipelines, as an alternative to using EFS or FSx file systems.

note

Using EFS or FSx as the work directory is incompatible with Studios.

Navigate to the AWS S3 console.
In the top right of the page, select the same region where you plan to create your AWS Batch compute environment.
Select Create bucket.
Enter a unique name for your bucket.
Leave the rest of the options as default and select Create bucket.

note

S3 can be used by Nextflow for the storage of intermediate files. In production pipelines, this can amount to a lot of data. To reduce costs, consider using a retention policy when creating a bucket, such as automatically deleting intermediate files after 30 days. See the AWS documentation for more information.
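
As an illustration of the retention policy described in the note above, the following Python sketch builds an S3 lifecycle configuration that expires objects under a work directory prefix after 30 days. The function name and the `nextflow-workdir/` prefix are hypothetical and chosen for this example only. The sketch only builds the configuration document and does not call AWS.

```python
import json


def work_dir_expiration_rule(prefix: str, days: int = 30) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-nextflow-work-after-{days}-days",
                # Only objects whose key starts with this prefix are affected.
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


# Hypothetical prefix: adjust it to the folder used as the Nextflow work directory.
lifecycle = work_dir_expiration_rule("nextflow-workdir/")
print(json.dumps(lifecycle, indent=2))
```

The resulting document can be saved to a file and applied with `aws s3api put-bucket-lifecycle-configuration`, or passed as the `LifecycleConfiguration` argument of the boto3 `put_bucket_lifecycle_configuration` call. Scoping the rule to a prefix keeps input and output data stored elsewhere in the same bucket from being deleted.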

EFS or FSx file system creation

AWS Elastic File System (EFS) and AWS FSx for Lustre are types of file storage that can be used as a Nextflow work directory to store intermediate files, as an alternative to using S3 buckets.

note

Using EFS or FSx as the work directory is incompatible with Studios.

To use EFS or FSx as your Nextflow work directory, create an EFS or FSx file system in the same region where you plan to create your AWS Batch compute environment.

The creation of an EFS or FSx file system can be done automatically by Seqera when creating the AWS Batch compute environment, or manually by following the steps below. If you let Seqera create the file system automatically, it will also be deleted when the compute environment is removed from Platform, unless the "Dispose Resources" option is disabled in the advanced options.

Creating an EFS file system

To create a new EFS file system manually, visit the EFS console.

Select Create file system.
Optionally give it a name, then select the VPC where your AWS Batch compute environment will be created.
Leave the rest of the options as default and select Create file system.

Creating an FSx file system

To create a new FSx for Lustre file system manually, visit the FSx console.

Select Create file system.
Select FSx for Lustre.
Follow the prompts to configure the file system according to your requirements, then select Next.
Review the configuration and select Create file system.

Make sure the Lustre client is available in the AMIs used by your AWS Batch compute environment to allow mounting FSx file systems.

Required Platform IAM permissions

To create and launch pipelines, explore buckets with Data Explorer, or run Studio sessions with the AWS Batch compute environment, an IAM user with specific permissions must be provided. Some permissions are mandatory for the compute environment to be created and function correctly, while others are optional and used, for example, to provide lists of values to pick from in the Platform UI.

Permissions can be attached directly to an IAM user, or to an IAM role that the IAM user can assume when accessing AWS resources.

A permissive and broad policy with all the required permissions is provided here for a quick start. However, we recommend following the principle of least privilege and only granting the necessary permissions for your use case, as shown in the following sections.

Full permissive policy (for reference)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BatchEnvironmentManagementCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "batch:CreateComputeEnvironment",
        "batch:CreateJobQueue",
        "batch:DeleteComputeEnvironment",
        "batch:DeleteJobQueue",
        "batch:TagResource",
        "batch:UpdateComputeEnvironment",
        "batch:UpdateJobQueue"
      ],
      "Resource": [
        "arn:aws:batch:*:*:compute-environment/TowerForge-*",
        "arn:aws:batch:*:*:job-queue/TowerForge-*"
      ]
    },
    {
      "Sid": "BatchEnvironmentListing",
      "Effect": "Allow",
      "Action": [
        "batch:DescribeComputeEnvironments",
        "batch:DescribeJobDefinitions",
        "batch:DescribeJobQueues",
        "batch:DescribeJobs"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BatchJobsManagementCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "batch:CancelJob",
        "batch:RegisterJobDefinition",
        "batch:SubmitJob",
        "batch:TagResource",
        "batch:TerminateJob"
      ],
      "Resource": [
        "arn:aws:batch:*:*:job-definition/*",
        "arn:aws:batch:*:*:job-queue/TowerForge-*",
        "arn:aws:batch:*:*:job/*"
      ]
    },
    {
      "Sid": "LaunchTemplateManagement",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateLaunchTemplate",
        "ec2:DeleteLaunchTemplate",
        "ec2:DescribeLaunchTemplates",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "PassRolesToBatchCanBeRestricted",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": [
            "batch.amazonaws.com",
            "ec2.amazonaws.com"
          ]
        }
      }
    },
    {
      "Sid": "CloudWatchLogsAccessCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "logs:Describe*",
        "logs:FilterLogEvents",
        "logs:Get*",
        "logs:List*",
        "logs:StartQuery",
        "logs:StopQuery",
        "logs:TestMetricFilter"
      ],
      "Resource": "*"
    },
    {
      "Sid": "OptionalS3PlatformDataAccessCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*",
        "s3:PutObject"
      ],
      "Resource": "*"
    },
    {
      "Sid": "OptionalIAMManagementCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "iam:AddRoleToInstanceProfile",
        "iam:AttachRolePolicy",
        "iam:CreateInstanceProfile",
        "iam:CreateRole",
        "iam:DeleteInstanceProfile",
        "iam:DeleteRole",
        "iam:DeleteRolePolicy",
        "iam:DetachRolePolicy",
        "iam:GetRole",
        "iam:ListAttachedRolePolicies",
        "iam:ListRolePolicies",
        "iam:PutRolePolicy",
        "iam:RemoveRoleFromInstanceProfile",
        "iam:TagInstanceProfile",
        "iam:TagRole"
      ],
      "Resource": [
        "arn:aws:iam::*:role/TowerForge-*",
        "arn:aws:iam::*:instance-profile/TowerForge-*"
      ]
    },
    {
      "Sid": "OptionalFetchOptimizedAMIMetadata",
      "Effect": "Allow",
      "Action": "ssm:GetParameters",
      "Resource": "arn:aws:ssm:*:*:parameter/aws/service/ecs/*"
    },
    {
      "Sid": "OptionalEC2MetadataDescribe",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeAccountAttributes",
        "ec2:DescribeImages",
        "ec2:DescribeInstanceTypeOfferings",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeKeyPairs",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeVpcs"
      ],
      "Resource": "*"
    },
    {
      "Sid": "OptionalFSXManagementCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "fsx:CreateFileSystem",
        "fsx:DeleteFileSystem",
        "fsx:DescribeFileSystems",
        "fsx:TagResource"
      ],
      "Resource": "*"
    },
    {
      "Sid": "OptionalEFSManagementCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "elasticfilesystem:CreateFileSystem",
        "elasticfilesystem:DeleteFileSystem",
        "elasticfilesystem:CreateMountTarget",
        "elasticfilesystem:DeleteMountTarget",
        "elasticfilesystem:DescribeFileSystems",
        "elasticfilesystem:DescribeMountTargets",
        "elasticfilesystem:UpdateFileSystem",
        "elasticfilesystem:PutLifecycleConfiguration",
        "elasticfilesystem:TagResource"
      ],
      "Resource": "*"
    },
    {
      "Sid": "OptionalPipelineSecretsListing",
      "Effect": "Allow",
      "Action": "secretsmanager:ListSecrets",
      "Resource": "*"
    },
    {
      "Sid": "OptionalPipelineSecretsManagementCanBeRestricted",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:DescribeSecret",
        "secretsmanager:DeleteSecret",
        "secretsmanager:CreateSecret"
      ],
      "Resource": "arn:aws:secretsmanager:*:*:secret:tower-*"
    },
    {
      "Sid": "OptionalUserdataCheck",
      "Effect": "Allow",
      "Action": [
        "ec2:GetConsoleOutput"
      ],
      "Resource": "*"
    },
    {
      "Sid": "OptionalLineageIntegrationSQSAndS3",
      "Effect": "Allow",
      "Action": [
        "sqs:CreateQueue",
        "sqs:GetQueueAttributes",
        "sqs:SetQueueAttributes",
        "sqs:GetQueueUrl",
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "s3:CreateBucket",
        "s3:GetBucketNotification",
        "s3:PutBucketNotification",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:sqs:*:*:seqera-lineage-*",
        "arn:aws:s3:::seqera-lineage-*"
      ]
    }
  ]
}

Download aws-batch-full-policy.json

AWS Batch management

The first section of the policy allows Seqera to create, update and delete Batch compute environments ("CE"), job queues ("JQ") and jobs.

If you are required to use manually created CEs and JQs or prefer to manage their lifecycle yourself, you can remove the permissions to manipulate CEs and JQs from the policy. The minimum permissions required are:

batch:DescribeJobs to report job status
batch:DescribeJobDefinitions to list existing job definitions
batch:RegisterJobDefinition to create new job definitions
batch:CancelJob to cancel jobs
batch:SubmitJob to submit jobs
batch:TagResource to tag jobs
batch:TerminateJob to terminate jobs

You can use batch:DescribeJobQueues to list the existing job queues in a drop-down but it's not required if you're specifying manually created job queues. However, it is required when you let Seqera create and manage job queues automatically (using the Forge tool). In this case, the batch:DescribeComputeEnvironments permission must also be added.
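
The rules above can be summarized as a small Python sketch that returns the set of Batch actions for each setup. The function and constant names are hypothetical. The sketch only selects action names: resources and conditions still have to be set per statement, as in the policy fragments in this section.

```python
from typing import List

# Minimum actions listed above for submitting and managing jobs.
MINIMUM_JOB_ACTIONS = [
    "batch:CancelJob",
    "batch:DescribeJobDefinitions",
    "batch:DescribeJobs",
    "batch:RegisterJobDefinition",
    "batch:SubmitJob",
    "batch:TagResource",
    "batch:TerminateJob",
]

# Additional actions from the full policy, needed only when Seqera (Forge)
# creates and manages compute environments and job queues.
FORGE_ACTIONS = [
    "batch:CreateComputeEnvironment",
    "batch:CreateJobQueue",
    "batch:DeleteComputeEnvironment",
    "batch:DeleteJobQueue",
    "batch:DescribeComputeEnvironments",
    "batch:DescribeJobQueues",
    "batch:UpdateComputeEnvironment",
    "batch:UpdateJobQueue",
]


def batch_actions(use_forge: bool, list_job_queues: bool = False) -> List[str]:
    """Return the sorted Batch actions for a manual or Forge-managed setup."""
    actions = set(MINIMUM_JOB_ACTIONS)
    if use_forge:
        actions.update(FORGE_ACTIONS)
    elif list_job_queues:
        # Optional: only used to populate the job queue drop-down.
        actions.add("batch:DescribeJobQueues")
    return sorted(actions)


print(batch_actions(use_forge=False))
```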

You can also restrict permissions based on resource tags. These are defined by users when they set up a pipeline in Platform.

{
  "Sid": "BatchEnvironmentListing",
  "Effect": "Allow",
  "Action": [
    "batch:DescribeJobDefinitions",
    "batch:DescribeJobs"
  ],
  "Resource": "*"
},
{
  "Sid": "BatchJobsManagement",
  "Effect": "Allow",
  "Action": [
    "batch:CancelJob",
    "batch:RegisterJobDefinition",
    "batch:SubmitJob",
    "batch:TagResource",
    "batch:TerminateJob"
  ],
  "Resource": [
    "arn:aws:batch:<REGION>:<ACCOUNT_ID>:job-queue/MyCustomJQ",
    "arn:aws:batch:<REGION>:<ACCOUNT_ID>:job-definition/*",
    "arn:aws:batch:<REGION>:<ACCOUNT_ID>:job/*"
  ],
  "Condition": {
    "StringEqualsIfExists": {
      "aws:ResourceTag/MyCustomTag": "MyCustomValue"
    }
  }
}

warning

Restricting the batch actions using resource tags requires that you set the appropriate tags on each Seqera pipeline when configuring it in Platform. Forgetting to set the tag will cause the pipeline to fail to run.

The job definition and job name resources cannot be restricted to specific names, as Seqera creates job definitions and jobs with dynamic names. Therefore, the wildcard * must be used in the name of these resources. In addition, batch:SubmitJob requires permission on both job definitions and job queues, so make sure to include both ARNs in the Resource array.

If you prefer to let Seqera manage Batch resources for you, you can still restrict the permissions to specific resources in your account ID and region. You can also restrict permissions based on resource tags, as shown with the Condition element in the example above.

note

The quick start policy expects the names of CEs and JQs automatically created by Seqera to start with the TowerForge- prefix, which is the default prefix used by Platform Cloud resources and can't be customized.

Launch template management

Seqera requires the ability to create and manage EC2 launch templates using optimized AMIs identified via AWS Systems Manager (SSM).

note

AWS does not support restricting IAM permissions on EC2 launch templates based on specific resource names or tags. As a result, permission to operate on any resource * must be granted.

Pass role to Batch

The iam:PassRole permission allows Seqera to pass execution IAM roles to AWS Batch to run Nextflow pipelines.

Permissions can be restricted to only allow passing the manually created roles or the roles created by Seqera automatically with the default prefix TowerForge- to the AWS Batch and EC2 services, in a specific account:

{
  "Sid": "PassRolesToBatch",
  "Effect": "Allow",
  "Action": "iam:PassRole",
  "Resource": "arn:aws:iam::<ACCOUNT_ID>:role/TowerForge-*",
  "Condition": {
    "StringEquals": {
      "iam:PassedToService": [
        "batch.amazonaws.com",
        "ec2.amazonaws.com"
      ]
    }
  }
}
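
The statement above only covers roles with the TowerForge- prefix. As a sketch of how manually created roles could be added to the same statement, the following Python function builds the Resource list from an account ID and a list of role names. The function name, the account ID `123456789012`, and the role name `MyBatchExecutionRole` are placeholders for illustration.

```python
import json
from typing import Sequence


def pass_role_statement(account_id: str, manual_role_names: Sequence[str] = ()) -> dict:
    """Build an iam:PassRole statement limited to known roles in one account."""
    # Roles created automatically by Seqera use the TowerForge- prefix.
    resources = [f"arn:aws:iam::{account_id}:role/TowerForge-*"]
    # Manually created roles are listed explicitly by name.
    resources += [f"arn:aws:iam::{account_id}:role/{name}" for name in manual_role_names]
    return {
        "Sid": "PassRolesToBatch",
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": resources,
        "Condition": {
            "StringEquals": {
                "iam:PassedToService": ["batch.amazonaws.com", "ec2.amazonaws.com"]
            }
        },
    }


# Placeholder account ID and role name.
statement = pass_role_statement("123456789012", ["MyBatchExecutionRole"])
print(json.dumps(statement, indent=2))
```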

CloudWatch logs access

Seqera requires access to CloudWatch logs to display relevant log data in the web interface.

The policy can be scoped down to limit access to the specific log group defined on the compute environment in a specific account and region:

{
  "Sid": "CloudWatchLogsAccess",
  "Effect": "Allow",
  "Action": [
    "logs:Describe*",
    "logs:FilterLogEvents",
    "logs:Get*",
    "logs:List*",
    "logs:StartQuery",
    "logs:StopQuery",
    "logs:TestMetricFilter"
  ],
  "Resource": "arn:aws:logs:<REGION>:<ACCOUNT_ID>:log-group:/aws/batch/job/*"
}

S3 access (optional)

Seqera automatically attempts to fetch a list of S3 buckets available in the AWS account connected to Platform, to provide them in a drop-down to be used as Nextflow working directory, and make the compute environment creation smoother. This feature is optional, and users can type the bucket name manually when setting up a compute environment. To allow Seqera to fetch the list of buckets in the account, the s3:ListAllMyBuckets action can be added, and it must have the Resource field set to *, as shown in the generic policy at the beginning of this document. The s3:ListAllMyBuckets action also allows Data Explorer to auto-discover the data repositories accessible to your workspace credentials.

Seqera offers several products to manipulate data on AWS S3 buckets, such as Studios and Data Explorer. If these features are not used, the related permissions can be omitted.

The IAM policy can be scoped down to only allow limited read/write permissions in certain S3 buckets used by Studios/Data Explorer. For each bucket you want to browse, upload to, or download from with Data Explorer, grant s3:GetObject and s3:PutObject on the bucket objects, and s3:ListBucket, s3:GetBucketLocation, s3:GetBucketPolicy, and s3:GetBucketAcl on the bucket itself. In addition, the policy must include permission to check the region and list the content of the S3 bucket used as Nextflow work directory. We also recommend granting the s3:GetObject permission on the work directory path to fetch Nextflow log files.

note

If you opted to create a separate S3 bucket only for Nextflow work directories, there is no need for the IAM user to have read/write access to it. If Seqera is allowed to manage resources (using Batch Forge), the IAM roles automatically created will have the necessary permissions.

If you set up the compute environment manually, you can create the required IAM roles with the necessary permissions as detailed in the manual AWS Batch setup documentation.

{
  "Sid": "S3CheckBucketWorkDirectory",
  "Effect": "Allow",
  "Action": [
    "s3:GetBucketLocation",
    "s3:ListBucket"
  ],
  "Resource": [
    "arn:aws:s3:::example-bucket-used-as-work-directory"
  ]
},
{
  "Sid": "S3ReadOnlyNextflowLogFiles",
  "Effect": "Allow",
  "Action": [
    "s3:GetObject"
  ],
  "Resource": [
    "arn:aws:s3:::example-bucket-used-as-work-directory/path/to/work/directory/*"
  ]
},
{
  "Sid": "S3ReadWriteBucketsForStudiosDataExplorer",
  "Effect": "Allow",
  "Action": [
    "s3:GetObject",
    "s3:GetObjectTagging",
    "s3:GetBucketLocation",
    "s3:GetBucketPolicy",
    "s3:GetBucketAcl",
    "s3:ListBucket",
    "s3:PutObject",
    "s3:PutObjectTagging",
    "s3:DeleteObject"
  ],
  "Resource": [
    "arn:aws:s3:::example-bucket-read-write-studios",
    "arn:aws:s3:::example-bucket-read-write-studios/*",
    "arn:aws:s3:::example-bucket-read-write-data-explorer",
    "arn:aws:s3:::example-bucket-read-write-data-explorer/*"
  ]
}

note

s3:GetBucketLocation allows Data Explorer to resolve each bucket's region. s3:GetBucketPolicy and s3:GetBucketAcl allow it to inspect each bucket's access configuration when it lists and connects to data repositories. If you prefer not to enumerate individual actions, the s3:Get* and s3:List* wildcards shown in the full permissive policy above also cover these actions.
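
When several buckets are involved, the three S3 statements above can be generated rather than edited by hand. The following Python sketch is one way to do this, assuming a single work directory bucket and a list of buckets used by Studios and Data Explorer. The function name and the bucket names in the usage example are hypothetical.

```python
import json
from typing import List, Sequence


def s3_statements(work_bucket: str, work_prefix: str, data_buckets: Sequence[str]) -> List[dict]:
    """Build the scoped S3 statements for one work bucket and several data buckets."""
    prefix = work_prefix.strip("/")
    # Objects directly under the work directory path, or the whole bucket if no path is set.
    object_path = f"{prefix}/*" if prefix else "*"

    data_resources = []
    for bucket in data_buckets:
        data_resources.append(f"arn:aws:s3:::{bucket}")    # bucket-level actions
        data_resources.append(f"arn:aws:s3:::{bucket}/*")  # object-level actions

    return [
        {
            "Sid": "S3CheckBucketWorkDirectory",
            "Effect": "Allow",
            "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{work_bucket}"],
        },
        {
            "Sid": "S3ReadOnlyNextflowLogFiles",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{work_bucket}/{object_path}"],
        },
        {
            "Sid": "S3ReadWriteBucketsForStudiosDataExplorer",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetBucketLocation",
                "s3:GetBucketPolicy",
                "s3:GetBucketAcl",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:PutObjectTagging",
                "s3:DeleteObject",
            ],
            "Resource": data_resources,
        },
    ]


# Hypothetical bucket names.
statements = s3_statements("seqera-bucket", "nextflow-workdir", ["my-data-bucket"])
print(json.dumps(statements, indent=2))
```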

IAM roles for AWS Batch (optional)

Seqera can automatically create the IAM roles needed to interact with AWS Batch and other AWS services. You can opt out of this behavior by creating the required IAM roles manually and providing their ARNs during compute environment creation in Platform: refer to the documentation for more details on how to manually set up IAM roles.

To allow Seqera to create IAM roles but restrict it to your specific account and the default IAM role prefix, use the following statement:

{
  "Sid": "IAMRoleAndProfileManagement",
  "Effect": "Allow",
  "Action": [
    "iam:AddRoleToInstanceProfile",
    "iam:AttachRolePolicy",
    "iam:CreateInstanceProfile",
    "iam:CreateRole",
    "iam:DeleteInstanceProfile",
    "iam:DeleteRole",
    "iam:DeleteRolePolicy",
    "iam:DetachRolePolicy",
    "iam:GetRole",
    "iam:ListAttachedRolePolicies",
    "iam:ListRolePolicies",
    "iam:PutRolePolicy",
    "iam:RemoveRoleFromInstanceProfile",
    "iam:TagInstanceProfile",
    "iam:TagRole"
  ],
  "Resource": [
    "arn:aws:iam::<ACCOUNT_ID>:role/TowerForge-*",
    "arn:aws:iam::<ACCOUNT_ID>:instance-profile/TowerForge-*"
  ]
}

note

The quick start policy expects role names automatically created by Seqera to start with the TowerForge- prefix, which is the default prefix used by Platform Cloud resources and can't be customized.

AWS Systems Manager (optional)

Seqera Platform can interact with AWS Systems Manager (SSM) to identify ECS Optimized AMIs for pipeline execution. This permission is optional, meaning that a custom AMI ID can be provided at compute environment creation, removing the need for this permission.

EC2 describe permissions (optional)

Seqera can interact with EC2 to retrieve information about existing AWS resources in your account, including VPCs, subnets, and security groups. This data is used to populate drop-downs in the Platform UI when creating new compute environments. While these permissions are optional, they are recommended to enhance the user experience. Without these permissions, resource ARNs need to be manually entered in the interface by the user.

note

AWS does not support restricting IAM permissions on EC2 Describe actions based on specific resource names or tags. As a result, permission to operate on any resource * must be granted.

FSx file systems (optional)

Seqera can manage AWS FSx file systems, if needed by the pipelines.

This section of the policy is optional and can be omitted if FSx file systems are not used by your pipelines. The describe actions cannot be restricted to specific resources, so permission to operate on any resource * must be granted. The management actions can be restricted to specific resources, like in the example below.

{
  "Sid": "FSxDescribe",
  "Effect": "Allow",
  "Action": [
    "fsx:DescribeFileSystems"
  ],
  "Resource": "*"
},
{
  "Sid": "FSxManagement",
  "Effect": "Allow",
  "Action": [
    "fsx:CreateFileSystem",
    "fsx:DeleteFileSystem",
    "fsx:TagResource"
  ],
  "Resource": "arn:aws:fsx:<REGION>:<ACCOUNT_ID>:file-system/MyManualFSx"
}

EFS file systems (optional)

Seqera can manage AWS EFS file systems, if needed by the pipelines.

This section of the policy is optional and can be omitted if EFS file systems are not used by your pipelines. The describe actions cannot be restricted to specific resources, so permission to operate on any resource * must be granted. The management actions can be restricted to specific resources, like in the example below.

{
  "Sid": "EFSDescribe",
  "Effect": "Allow",
  "Action": [
    "elasticfilesystem:DescribeFileSystems",
    "elasticfilesystem:DescribeMountTargets"
  ],
  "Resource": "*"
},
{
  "Sid": "EFSManagement",
  "Effect": "Allow",
  "Action": [
    "elasticfilesystem:CreateFileSystem",
    "elasticfilesystem:DeleteFileSystem",
    "elasticfilesystem:CreateMountTarget",
    "elasticfilesystem:DeleteMountTarget",
    "elasticfilesystem:UpdateFileSystem",
    "elasticfilesystem:PutLifecycleConfiguration",
    "elasticfilesystem:TagResource"
  ],
  "Resource": "arn:aws:elasticfilesystem:<REGION>:<ACCOUNT_ID>:file-system/MyManualEFS"
}

Pipeline secrets (optional)

Seqera can synchronize pipeline secrets defined on the Platform workspace with AWS Secrets Manager, which requires additional permissions on the IAM user. If you do not plan to use pipeline secrets, you can omit this section of the policy.

The listing of secrets cannot be restricted, but the management actions can be restricted to only allow managing secrets in a specific account and region, which must be the same region where the pipeline runs. Note that Seqera only creates secrets with the tower- prefix.

{
  "Sid": "PipelineSecretsListing",
  "Effect": "Allow",
  "Action": "secretsmanager:ListSecrets",
  "Resource": "*"
},
{
  "Sid": "PipelineSecretsManagementCanBeRestricted",
  "Effect": "Allow",
  "Action": [
    "secretsmanager:DescribeSecret",
    "secretsmanager:DeleteSecret",
    "secretsmanager:CreateSecret"
  ],
  "Resource": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:tower-*"
}

Additional steps required to use secrets in a pipeline

To use pipeline secrets successfully, manually created IAM roles must be configured following the steps detailed in the documentation.

Userdata script error detection (optional)

Platform can retrieve the EC2 instance console output to detect errors in the userdata script that bootstraps the VM during instance startup. If the userdata script fails, Platform surfaces the failure as a warning on the workflow. Without this permission, userdata script failures are not detected and no warning is shown.

{
  "Sid": "OptionalUserdataCheck",
  "Effect": "Allow",
  "Action": [
    "ec2:GetConsoleOutput"
  ],
  "Resource": "*"
}

Data lineage (optional)

If you enable data lineage in your workspace, add the following permissions to your Platform integration credentials to create the queue infrastructure and bucket notifications used by the lineage service:

{
  "Sid": "LineageIntegrationSQS",
  "Effect": "Allow",
  "Action": [
    "sqs:CreateQueue",
    "sqs:GetQueueAttributes",
    "sqs:SetQueueAttributes",
    "sqs:GetQueueUrl",
    "sqs:ReceiveMessage",
    "sqs:DeleteMessage"
  ],
  "Resource": "arn:aws:sqs:<REGION>:<ACCOUNT_ID>:seqera-lineage-*"
},
{
  "Sid": "LineageIntegrationS3",
  "Effect": "Allow",
  "Action": [
    "s3:CreateBucket",
    "s3:GetBucketNotification",
    "s3:PutBucketNotification",
    "s3:GetBucketLocation"
  ],
  "Resource": "arn:aws:s3:::seqera-lineage-*"
}

If you manage your own EC2 instance role or head job role (rather than letting Seqera create them with Batch Forge), see Manual AWS Batch configuration for additional S3 permissions to add to those roles.

Create the IAM policy

The policy above must be created in the AWS account where the AWS Batch resources need to be created.

Open the AWS IAM console in the account where you want to create the AWS Batch resources.
From the left navigation menu, select Policies under Access management.
Select Create policy.
In the Policy editor section, select the JSON tab.
Following the instructions detailed in the IAM permissions breakdown section, replace the default text in the policy editor area under the JSON tab with a policy adapted to your use case, then select Next.
Enter a name and description for the policy on the Review and create page, then select Create policy.

IAM user creation

For key-based credentials only, Seqera requires an Identity and Access Management (IAM) user to create and manage AWS Batch resources in your AWS account. We recommend creating a separate IAM policy rather than an IAM user inline policy, as the latter only allows 2048 characters, which may not be sufficient for all the required permissions.

In certain scenarios, for example when multiple users need to access the same AWS account and provision AWS Batch resources, an IAM role with the required permissions can be created instead, and the IAM user can assume that role when accessing AWS resources, as detailed in the IAM role creation (optional) section. For Cloud deployments, Seqera Cloud is the user that will manage resources with the permissions you give it, managed through a trust policy.

Depending on whether you choose to let Seqera automatically create the required AWS Batch resources in your account, or prefer to set them up manually, the IAM user must have specific permissions as detailed in the Required Platform IAM permissions section. Alternatively, you can create an IAM role with the required permissions and allow the IAM user to assume that role when accessing AWS resources, as detailed in the IAM role creation (optional) section.
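
To see whether a policy assembled from the statement fragments in this guide would fit within the 2048-character inline policy limit mentioned above, a rough check can be scripted. The following Python sketch wraps a list of statements in a policy document, rejects duplicate Sid values, and measures the size of the compact JSON. The function names are hypothetical, and the compact length is only an approximation of how AWS measures policy size.

```python
import json
from typing import List

INLINE_USER_POLICY_LIMIT = 2048  # figure quoted in this guide


def build_policy(statements: List[dict]) -> dict:
    """Wrap statement fragments in a policy document, rejecting duplicate Sids."""
    sids = [s["Sid"] for s in statements if "Sid" in s]
    duplicates = sorted({sid for sid in sids if sids.count(sid) > 1})
    if duplicates:
        raise ValueError(f"duplicate Sid values: {duplicates}")
    return {"Version": "2012-10-17", "Statement": list(statements)}


def compact_size(policy: dict) -> int:
    """Length of the policy serialized without optional whitespace."""
    return len(json.dumps(policy, separators=(",", ":")))


# Two optional statements taken from this guide.
policy = build_policy([
    {
        "Sid": "PipelineSecretsListing",
        "Effect": "Allow",
        "Action": "secretsmanager:ListSecrets",
        "Resource": "*",
    },
    {
        "Sid": "OptionalUserdataCheck",
        "Effect": "Allow",
        "Action": ["ec2:GetConsoleOutput"],
        "Resource": "*",
    },
])

size = compact_size(policy)
print(f"{size} characters, fits inline limit: {size <= INLINE_USER_POLICY_LIMIT}")
```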

AWS credential options

AWS credentials can be configured in two ways:

Key-based credentials: Access key and secret key with direct IAM permissions. If you provide a role ARN in Assume role, the Generate External ID switch is displayed and External ID generation is optional.
Role-based credentials (recommended): Use role assumption only (no static keys). Paste the IAM role ARN which Seqera must use for accessing your AWS resources in Assume role. External ID is generated automatically when you save.

Use the IAM role ARN which Seqera must use for accessing your AWS resources in Assume role. This field is available for both key-based and role-based credentials. It is optional for key-based credentials and required for role-based credentials.

Existing credentials created before March 2026 continue to work without changes.

Create an IAM user (key-based)

From the AWS IAM console, select Users in the left navigation menu, then select Create User at the top right of the page.
Enter a name for your user (e.g., seqera) and select Next.
Under Permission options, select Attach policies directly, then search for and select the policy created above, and select Next.

Optionally, if you want to use a role ARN and External ID with key-based access, add the following to the user's permission policy. This allows the IAM user to assume a role to manage Batch resources:

{
  "Sid": "AssumeRoleToManageBatchResources",
  "Effect": "Allow",
  "Action": "sts:AssumeRole",
  "Resource": "arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>",
  "Condition": {
    "StringEquals": {
      "sts:ExternalId": "<EXTERNAL_ID>"
    }
  }
}

On the last page, review the user details and select Create user.

The user has now been created. The most up-to-date instructions for creating an IAM user can be found in the AWS documentation.

Obtain IAM user credentials (key-based)

To get the credentials needed to connect Seqera to your AWS account, follow these steps:

From the AWS IAM console, select Users in the left navigation menu, then select the newly created user from the users table.
Select the Security credentials tab, then select Create access key under the Access keys section.
In the Use case dialog that appears, select Command line interface (CLI), then tick the confirmation checkbox at the bottom to acknowledge that you want to proceed with creating an access key, and select Next.
Optionally provide a description for the access key, like the reason for creating it, then select Create access key.
Save the Access key and Secret access key in a secure location as you will need to provide them when creating credentials in Seqera.

IAM role-based credential creation

Rather than attaching permissions directly to the IAM user, you can create an IAM role with the required permissions and allow Seqera Cloud to assume that role when accessing AWS resources. This is useful when multiple third parties access the same AWS account: this way the actual permissions to operate on the resources are only granted to a single centralized role.

From the AWS IAM console, select Roles in the left navigation menu, then select Create role at the top right of the page.

Select Custom trust policy as the trusted entity type in the AWS Console. Allow the Seqera Cloud access role arn:aws:iam::161471496260:role/SeqeraPlatformCloudAccessRole in your trust policy as shown below, then select Next.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::161471496260:role/SeqeraPlatformCloudAccessRole"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<EXTERNAL_ID>"
        }
      }
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::161471496260:role/SeqeraPlatformCloudAccessRole"
      },
      "Action": "sts:TagSession"
    }
  ]
}

On the Permissions page, search for and select the policy created in the Create the IAM policy section, then select Next.
Give the role a name and optionally a description, review the details of the role, optionally provide tags to help you identify the role, then select Create role.

note

The External ID is generated by Seqera when you save your credentials. Complete the following steps to finalize the trust policy:

In Seqera, create new AWS credentials, select Role mode, paste the role ARN, and save. Seqera generates and displays a unique External ID.
Return to the IAM role's trust policy in AWS and replace the <EXTERNAL_ID> placeholder with the generated value.
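
The last step, replacing the placeholder in the trust policy, can also be scripted. The following Python sketch rebuilds the trust policy shown above with a given External ID. The function name and the `example-external-id` value are hypothetical: use the value that Seqera displays after you save the credentials.

```python
import json

# Seqera Cloud access role, as given in this guide.
SEQERA_CLOUD_ROLE = "arn:aws:iam::161471496260:role/SeqeraPlatformCloudAccessRole"


def trust_policy(external_id: str) -> dict:
    """Build the role trust policy with the External ID generated by Seqera."""
    if not external_id or external_id == "<EXTERNAL_ID>":
        raise ValueError("provide the External ID generated by Seqera")
    principal = {"AWS": SEQERA_CLOUD_ROLE}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": principal,
                "Action": "sts:AssumeRole",
                # The role can only be assumed when the matching External ID is supplied.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            },
            {
                "Effect": "Allow",
                "Principal": principal,
                "Action": "sts:TagSession",
            },
        ],
    }


# Hypothetical value for illustration only.
document = trust_policy("example-external-id")
print(json.dumps(document, indent=2))
```

The printed JSON can be pasted into the trust policy editor in the IAM console, or saved to a file and applied with `aws iam update-assume-role-policy`.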

Automatic configuration of Batch resources

Seqera automates the configuration of an AWS Batch compute environment and the queues required for deploying Nextflow pipelines.

caution

AWS Batch creates resources that you may be charged for in your AWS account. See Cloud costs for guidelines to manage cloud resources effectively and prevent unexpected costs.

AWS Batch

After your IAM user or role and S3 bucket have been set up, create a new AWS Batch compute environment in Seqera.

Create a Seqera AWS Batch compute environment

Seqera will create the head and compute job queues and their respective compute environments where jobs will be executed. The job queues are configured with job state limit actions to automatically purge jobs that cannot be scheduled on any node type available for the compute environment. Depending on the provided configuration in the UI, Seqera might also create IAM roles for Nextflow head job execution, CloudWatch log groups, EFS or FSx filesystems, etc.

Select Compute environments from the navigation menu of the Seqera workspace where you want to set up the compute environment.
Select Add compute environment.
Enter a descriptive name for this environment, e.g., AWS Batch Spot (eu-west-1).
Select AWS Batch as the target platform.
From the Credentials drop-down, select existing AWS credentials, or select + to add new credentials. If you're using existing credentials, skip to step 9.

note
You can create multiple credentials in your Seqera environment. See Credentials.
Enter a name, e.g., AWS Credentials.
Under AWS credential mode, select Keys or Role.
For Keys mode:
- Add the Access key and Secret key you previously obtained.
- Optionally paste the IAM role ARN which Seqera must use for accessing your AWS resources in Assume role.
- If you paste a role ARN in Assume role, the Generate External ID switch is displayed. Generating an External ID is optional in Keys mode.
- If Generate External ID is selected, an External ID is automatically generated and shown after you save the credential.
For Role mode:
- Paste the IAM role ARN which Seqera must use for accessing your AWS resources in Assume role.
- External ID is generated automatically when you save the credential.
note
When using AWS keys without an assumed role, the associated AWS user must have been granted permissions to operate on the cloud resources directly. When an assumed role is provided, the IAM user keys are only used to retrieve temporary credentials that impersonate the specified role. This is useful when, for example, multiple IAM users access the same AWS account and the permissions to operate on the resources are granted only to the role.
Select a Region, e.g., eu-west-1 - Europe (Ireland). This region must match the location of the S3 bucket or EFS/FSx file system you plan to use as the work directory.
In the Pipeline work directory field, type or select from the drop-down the S3 bucket previously created, e.g., s3://seqera-bucket. The work directory can be customized to specify a folder inside the bucket where Nextflow intermediate files will be stored, e.g., s3://seqera-bucket/nextflow-workdir. The bucket must be located in the same region chosen in the previous step.

note
When you specify an S3 bucket as your work directory, this bucket is used for the Nextflow cloud cache by default. Seqera adds a cloudcache block to the Nextflow configuration file for all runs executed with this compute environment. This block includes the path to a cloudcache folder in your work directory, e.g., s3://seqera-bucket/cloudcache/.cache. You can specify an alternative cache location with the Nextflow config file field on the pipeline launch form.

Similarly, you can specify a path in an EFS or FSx file system as your work directory. When using EFS or FSx, scroll down to the "EFS file system" or "FSx for Lustre" sections to specify either an existing file system ID or let Seqera create a new one for you automatically. Read the notes in steps 23 and 24 below on how to set up EFS or FSx.

warning
Using an EFS or FSx file system as your work directory is currently incompatible with Studios, and will result in errors with checkpoints and mounted data. Use an S3 bucket as your work directory when using Studios.
Select Enable Wave containers to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See Wave containers for more information.
Select Enable Fusion v2 to allow access to your S3-hosted data via the Fusion v2 virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See Fusion file system for configuration details.
Use Fusion v2 file system
note
The compute recommendations below are based on internal benchmarking performed by Seqera. Benchmark runs of nf-core/rnaseq used profile test_full, consisting of an input dataset with 16 FASTQ files and a total size of approximately 123.5 GB.
We recommend using Fusion with AWS NVMe instances (fast instance storage) as this delivers the fastest performance when compared to environments using only AWS EBS (Elastic Block Store).
1. Use Seqera Platform version 23.1 or later.
2. Use an S3 bucket as the pipeline work directory.
3. Enable Wave containers, Fusion v2, and fast instance storage.
4. Select the Batch Forge config mode.
5. Fast instance storage requires an EC2 instance type that uses NVMe disks. Specify NVMe-based instance types in Instance types under Advanced options. If left unspecified, Platform selects instances from AWS NVMe-based instance type families. See Instance store temporary block storage for EC2 instances for more information.
note
When enabling fast instance storage, do not select the optimal instance type families (c4, m4, r4) for your compute environment as these are not NVMe-based instances. Specify AWS NVMe-based instance types, or leave the Instance types field empty for Platform to select NVMe instances for you.
tip
We recommend selecting 8xlarge or above for large and long-lived production pipelines:

A local temp storage disk of at least 200 GB and a random read speed of 1000 MBps or more. To work with files larger than 100 GB, increase temp storage accordingly (400 GB or more).

Dedicated networking ensures a guaranteed network speed service level compared with "burstable" instances. See Instance network bandwidth for more information.
When using Fusion v2 without fast instance storage, the following EBS settings are applied to optimize file system performance:
- EBS boot disk size is increased to 100 GB
- EBS boot disk type GP3 is selected
- EBS boot disk throughput is increased to 325 MB/s
Extensive benchmarking of Fusion v2 has demonstrated that the increased cost associated with these settings is generally outweighed by the costs saved due to decreased run time.
Select Enable Fusion Snapshots (beta) to enable Fusion to automatically restore jobs that are interrupted when an AWS Spot instance reclamation occurs. Requires Fusion v2. See Fusion Snapshots for more information.
Set the Config mode to Batch Forge to allow Seqera Platform to manage AWS Batch compute environments using the Forge tool.
Select a Provisioning model. To minimize compute costs select Spot. You can specify an allocation strategy and instance types under Advanced options. If advanced options are omitted, Seqera Platform 23.2 and later versions default to BEST_FIT_PROGRESSIVE for On-Demand and SPOT_PRICE_CAPACITY_OPTIMIZED for Spot compute environments.

note
You can create a compute environment that launches either Spot or On-Demand instances. Spot instances can cost as little as 20% of On-Demand instances, and with Nextflow's ability to automatically relaunch failed tasks, Spot is almost always the recommended provisioning model. Note, however, that when choosing Spot instances, Seqera will also create a dedicated queue for running the main Nextflow job using a single On-Demand instance to prevent any execution interruptions.
From Nextflow version 24.10, the default Spot reclamation retry setting changed to 0 on AWS and Google. By default, no internal retries are attempted on these platforms. Spot reclamations now lead to an immediate failure, exposed to Nextflow in the same way as other generic failures (returning, for example, exit code 1 on AWS). Nextflow treats these failures like any other job failure unless you actively configure a retry strategy. For more information, see Spot instance failures and retries.
Enter the Max CPUs, e.g., 64. This is the maximum number of combined CPUs (the sum of all instances' CPUs) AWS Batch will provision at any time.
Select EBS Auto scale (deprecated) to allow the EC2 virtual machines to dynamically expand the amount of available disk space during task execution. This feature is deprecated, and is not compatible with Fusion v2.

note
When you run large AWS Batch clusters (hundreds of compute nodes or more), EC2 API rate limits may cause the deletion of unattached EBS volumes to fail. You should delete volumes that remain active after Nextflow jobs have completed to avoid additional costs. Monitor your AWS account for any orphaned EBS volumes via the EC2 console, or with a Lambda function. See here for more information.
With the optional Enable Fusion mounts (deprecated) feature enabled, S3 buckets specified in Pipeline work directory and Allowed S3 Buckets are mounted as file system volumes in the EC2 instances carrying out the Batch job execution. These buckets can then be accessed at /fusion/s3/<bucket-name>. For example, if the bucket name is s3://imputation-gp2, your pipeline will access it using the file system path /fusion/s3/imputation-gp2. Note: This feature has been deprecated. Consider using Fusion v2 (see above) for enhanced performance and stability.

note
You do not need to modify your pipeline or files to take advantage of this feature. Nextflow will automatically recognize and replace any reference to files prefixed with s3:// with the corresponding Fusion mount paths.
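
For illustration only, the path mapping described for the deprecated Fusion mounts feature can be sketched as a small function. The function name is hypothetical, and the handling of object keys below the bucket name is an assumption based on the bucket-level example above.

```python
FUSION_MOUNT_ROOT = "/fusion/s3"
S3_SCHEME = "s3://"


def fusion_mount_path(s3_uri: str) -> str:
    """Map an s3:// URI to the file system path described in this guide."""
    if not s3_uri.startswith(S3_SCHEME):
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    # Drop the scheme and any trailing slash, keeping the bucket name
    # (and, as an assumption, any key prefix that follows it).
    remainder = s3_uri[len(S3_SCHEME):].rstrip("/")
    if not remainder:
        raise ValueError("The S3 URI does not contain a bucket name")
    return f"{FUSION_MOUNT_ROOT}/{remainder}"


if __name__ == "__main__":
    print(fusion_mount_path("s3://imputation-gp2"))
```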
Select Enable Fargate for head job to run the Nextflow head job with the AWS Fargate container service and speed up pipeline launch. Fargate is a serverless compute engine that enables users to run containers without the need to provision servers or clusters in advance. AWS takes a few minutes to spin up an EC2 instance, whereas jobs can be launched with Fargate in under a minute (depending on container size). We recommend Fargate for most pipeline deployments, but EC2 is more suitable for environments that use GPU instances, custom AMIs, or that require more than 16 vCPUs. If you specify a custom AMI ID in the Advanced options below, this will not be applied to the Fargate-enabled head job. See here for more information on Fargate's limitations.

note
Fargate requires the Fusion v2 file system and a Spot provisioning model. Fargate is not compatible with EFS and FSx file systems.
Select Enable GPUs if you intend to run GPU-dependent workflows in the compute environment. See GPU usage for more information.

note
Seqera only supports NVIDIA GPUs. Select instances with NVIDIA GPUs for your GPU-dependent processes.
Select Use Graviton CPU architecture to execute on Graviton-based EC2 instances (i.e., ARM64 CPU architecture). When enabled, m6g, r6g, and c6g instance types are used by default for compute jobs, but 3rd-generation Graviton instances are also supported. You can specify your own Instance types under Advanced options.

note
Graviton requires Fargate, Wave containers, and Fusion v2 file system to be enabled. This feature is not compatible with GPU-based architecture.
Enter any additional Allowed S3 buckets that your workflows require to read input data or write output data. The Pipeline work directory bucket above is added by default to the list of Allowed S3 buckets.
To use an EFS file system in your pipeline, you can either select Use existing EFS file system and specify an existing EFS instance, or select Create new EFS file system to create one.

To use the EFS file system as the work directory of the compute environment specify <your_EFS_mount_path>/work in the Pipeline work directory field (step 10 of this guide).
- To use an existing EFS file system, enter the EFS file system ID and EFS mount path. This is the path where the EFS volume is accessible to the compute environment. For simplicity, we recommend that you use /mnt/efs as the EFS mount path.
- To create a new EFS file system, enter the EFS mount path. We advise that you specify /mnt/efs as the EFS mount path.
- EFS file systems created by Batch Forge are automatically tagged in AWS with Name=TowerForge-<id>, with <id> being the compute environment ID. Any manually-added resource label with the key Name (capital N) will override the automatically-assigned TowerForge-<id> label.
- A custom EC2 security group needs to be configured to allow the compute environment to access the EFS file system.
  - Visit the AWS Console for Security groups and switch to the region where your workload will run.
  - Select Create security group.
  - Enter a relevant name like seqera-efs-access-sg and description, e.g., EFS access for Seqera Batch compute environment.
  - Empty both Inbound rules and Outbound rules sections by deleting default rules.
  - Optionally add Tags to the security group, then select Create security group.
  - After creating the security group, select it from the security groups list, then select the Inbound rules tab and select Edit inbound rules.
  - Select Add rule and configure the new rule as follows:
    - Type: NFS
    - Source: Custom and enter the security group ID that you're editing (you can search for it by name, e.g., seqera-efs-access-sg). This allows resources associated with the same security group to communicate with each other.
  - Select Save rules to finalize the inbound rule configuration.
  - Repeat the same steps to add an outbound rule to allow all outbound traffic: set type All traffic and destination Anywhere-IPv4/Anywhere-IPv6.
  - See the AWS documentation about EFS security groups for more information.
  - The Security group then needs to be defined in the Advanced options below to allow the compute environment to access the EFS file system.
warning
EFS file systems cannot be used as the work directory for Studios, but can be mounted and used by applications running in Studios.
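
The Name tag behavior described above for file systems created by Batch Forge can be summarized in a short sketch. The function name and the example compute environment ID are hypothetical. The sketch relies on AWS tag keys being case-sensitive, so only a label with the exact key Name replaces the automatic value.

```python
def effective_name_tag(compute_env_id: str, resource_labels: dict) -> str:
    """Return the Name tag applied to a Forge-created file system."""
    automatic = f"TowerForge-{compute_env_id}"
    # A manually added label with the exact key "Name" takes precedence.
    return resource_labels.get("Name", automatic)


if __name__ == "__main__":
    print(effective_name_tag("abc123", {}))
    print(effective_name_tag("abc123", {"Name": "shared-efs"}))
```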
To use an FSx for Lustre file system in your pipeline, you can either select Use existing FSx file system and specify an existing FSx instance, or select Create new FSx file system to create one.

To use the FSx file system as your work directory, specify <your_FSx_mount_path>/work in the Pipeline work directory field (step 10 of this guide).
- To use an existing FSx file system, enter the FSx DNS name and FSx mount path. The FSx mount path is the path where the FSx volume is accessible to the compute environment. For simplicity, we recommend that you use /mnt/fsx as the FSx mount path.
- To create a new FSx file system, enter the FSx size (in GB) and the FSx mount path. We advise that you specify /mnt/fsx as the FSx mount path.
- FSx file systems created by Batch Forge are automatically tagged in AWS with Name=TowerForge-<id>, with <id> being the compute environment ID. Any manually-added resource label with the key Name (capital N) will override the automatically-assigned TowerForge-<id> label.
- A custom EC2 security group needs to be configured to allow the compute environment to access the FSx file system.
  - Visit the AWS Console for Security groups and switch to the region where your workload will run.
  - Select Create security group.
  - Enter a relevant name like seqera-fsx-access-sg and description, e.g., FSx access for Seqera Batch compute environment.
  - Empty both Inbound rules and Outbound rules sections by deleting default rules.
  - Optionally add Tags to the security group, then select Create security group.
  - After creating the security group, select it from the security groups list, then select the Inbound rules tab and select Edit inbound rules.
  - Select Add rule and configure the new rule as follows:
    - Type: Custom TCP
    - Port range: 988
    - Source: Custom and enter the security group ID that you're editing (you can search for it by name, e.g., seqera-fsx-access-sg). This allows resources associated with the same security group to communicate with each other.
  - Repeat the step to add another rule with:
    - Type: Custom TCP
    - Port range: 1018-1023
    - Source: Custom, same as above.
  - Select Save rules to finalize the inbound rule configuration.
  - Repeat the same steps to add an outbound rule to allow all outbound traffic: set type All traffic and destination Anywhere-IPv4/Anywhere-IPv6.
  - See the AWS documentation about FSx security groups for more information.
  - The Security group then needs to be defined in the Advanced options below to allow the compute environment to access the FSx file system.
- You may need to install the Lustre client in the AMI used by your compute environment to access FSx file systems. See Installing the Lustre client for more information.
warning
FSx file systems cannot be used as the work directory for Studios, but can be mounted and used by applications running in Studios.
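
The inbound rules from the EFS and FSx security group instructions above can also be written down as data, which may help if you script the setup instead of using the console. This is a sketch under stated assumptions: the helper names and the security group ID are hypothetical, and the console's NFS rule type is expressed as TCP port 2049.

```python
def self_referencing_rule(group_id: str, from_port: int, to_port: int) -> dict:
    """Allow TCP traffic on a port range from members of the same security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        "UserIdGroupPairs": [{"GroupId": group_id}],
    }


def efs_ingress_rules(group_id: str) -> list:
    # The "NFS" rule type in the console corresponds to TCP port 2049.
    return [self_referencing_rule(group_id, 2049, 2049)]


def fsx_ingress_rules(group_id: str) -> list:
    # FSx for Lustre: TCP 988 plus TCP 1018-1023, as listed in this guide.
    return [
        self_referencing_rule(group_id, 988, 988),
        self_referencing_rule(group_id, 1018, 1023),
    ]


if __name__ == "__main__":
    group_id = "sg-0123456789abcdef0"  # hypothetical security group ID
    for rule in efs_ingress_rules(group_id) + fsx_ingress_rules(group_id):
        print(rule)
```

Each list uses the `IpPermissions` shape of the EC2 `AuthorizeSecurityGroupIngress` API, so it could be passed to, for example, boto3's `authorize_security_group_ingress` call. The outbound rule that allows all traffic is not covered by this sketch.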
Select Dispose resources to automatically delete all AWS resources created by Seqera Platform when you delete the compute environment, including EFS/FSx file systems.
Apply Resource labels to the cloud resources produced by this compute environment. Workspace default resource labels are prefilled.
Expand Staging options to include:
- Optional pre- or post-run Bash scripts that execute before or after the Nextflow pipeline execution in your environment.
- Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Values defined here are pre-filled in the Nextflow config file field in the pipeline launch form. These values can be overridden during pipeline launch.
info
Configuration settings in this field override the same values in the pipeline repository nextflow.config file. See Nextflow config file for more information on configuration priority.
Under Environment variables, add each variable with a Name, Value, and Target Environment:
- Head job: Adds the variable to the Nextflow head job container, which evaluates nextflow.config and submits tasks to the compute backend. Use this target for variables that Nextflow or its plugins read, such as NXF_OPTS, NXF_JVM_ARGS, NXF_PLUGINS_DEFAULT, or proxy settings the head node uses to reach external services.
- Compute job: Adds the variable to the worker containers that run individual pipeline tasks. Use this target for variables your pipeline tools read, such as OPENAI_API_KEY for a process that calls the OpenAI API, registry credentials needed inside the task container, or tool-specific settings like JAVA_HOME.
- Head and Compute jobs: Adds the variable to both the head job and the compute jobs. Use this target for values needed in both places, such as an HTTP proxy used by both Nextflow and task tools, or a credential needed in both the head job and individual compute tasks.
note
For sensitive values such as API keys and tokens, use pipeline secrets instead of custom environment variables. Custom environment variables are stored in the compute environment configuration and cannot be edited after creation. To rotate a value, recreate the compute environment.
Configure any advanced options described in the next section, as needed.
Select Create to finalize the compute environment setup. It will take a few seconds for all the AWS resources to be created before you are ready to launch pipelines.

info

See Launch pipelines to start executing workflows in your AWS Batch compute environment.
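
Several options in the steps above depend on or exclude one another. The following sketch collects those rules in one place so that a planned configuration can be checked before it is entered in the form. The `ForgeOptions` class, its field names, and the message strings are hypothetical and do not correspond to a Seqera API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ForgeOptions:
    """Hypothetical model of the toggles on the compute environment form."""
    wave: bool = False
    fusion_v2: bool = False
    fusion_snapshots: bool = False
    ebs_autoscale: bool = False
    fargate_head_job: bool = False
    gpus: bool = False
    graviton: bool = False
    uses_efs_or_fsx: bool = False
    provisioning_model: str = "spot"  # "spot" or "on-demand"


def find_conflicts(opts: ForgeOptions) -> List[str]:
    """Return the rules from this guide that the given options violate."""
    conflicts = []
    if opts.fusion_v2 and not opts.wave:
        conflicts.append("Fusion v2 requires Wave containers")
    if opts.fusion_snapshots and not opts.fusion_v2:
        conflicts.append("Fusion Snapshots requires Fusion v2")
    if opts.ebs_autoscale and opts.fusion_v2:
        conflicts.append("EBS Auto scale is not compatible with Fusion v2")
    if opts.fargate_head_job:
        if not opts.fusion_v2:
            conflicts.append("Fargate requires Fusion v2")
        if opts.provisioning_model != "spot":
            conflicts.append("Fargate requires the Spot provisioning model")
        if opts.uses_efs_or_fsx:
            conflicts.append("Fargate is not compatible with EFS or FSx")
    if opts.graviton:
        if not (opts.fargate_head_job and opts.wave and opts.fusion_v2):
            conflicts.append("Graviton requires Fargate, Wave containers, and Fusion v2")
        if opts.gpus:
            conflicts.append("Graviton is not compatible with GPUs")
    return conflicts


if __name__ == "__main__":
    planned = ForgeOptions(wave=True, fusion_v2=True, fargate_head_job=True, graviton=True)
    print(find_conflicts(planned) or "No conflicts found")
```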

Advanced options

Seqera Platform compute environments for AWS Batch include advanced options to configure instance types, resource allocation, custom networking, and CloudWatch and ECS agent integration.

Seqera AWS Batch advanced options

Specify the Allocation strategy and indicate any preferred Instance types. AWS applies quotas for the number of running and requested Spot and On-Demand instances per account. AWS will allocate instances from up to 20 instance types, based on those requested for the compute environment. AWS excludes the largest instances when you request more than 20 instance types.

note
If these advanced options are omitted, allocation strategy defaults are BEST_FIT_PROGRESSIVE for On-Demand and SPOT_PRICE_CAPACITY_OPTIMIZED for Spot compute environments.

caution
Platform CLI (known as tw) versions 0.8 and earlier do not support the SPOT_PRICE_CAPACITY_OPTIMIZED allocation strategy in AWS Batch. You cannot currently use the CLI to create or otherwise interact with AWS Batch Spot compute environments that use this allocation strategy.
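
The allocation strategy defaults and the 20 instance type limit described above can be expressed as a short sketch. The function names and the lowercase provisioning model keys are hypothetical. The strategy names are the ones listed in this guide.

```python
from typing import List, Optional

DEFAULT_ALLOCATION_STRATEGY = {
    "on-demand": "BEST_FIT_PROGRESSIVE",
    "spot": "SPOT_PRICE_CAPACITY_OPTIMIZED",
}
# AWS allocates from up to this many of the requested instance types.
MAX_ALLOCATED_INSTANCE_TYPES = 20


def resolve_allocation_strategy(provisioning_model: str, requested: Optional[str] = None) -> str:
    """Return the requested strategy, or the default for the provisioning model."""
    if requested:
        return requested
    return DEFAULT_ALLOCATION_STRATEGY[provisioning_model]


def exceeds_instance_type_limit(instance_types: List[str]) -> bool:
    """True when more distinct instance types are requested than AWS allocates from."""
    return len(set(instance_types)) > MAX_ALLOCATED_INSTANCE_TYPES


if __name__ == "__main__":
    print(resolve_allocation_strategy("spot"))
    print(resolve_allocation_strategy("on-demand"))
```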
Configure a custom networking setup using the VPC ID, Subnets, and Security groups fields.
- If not defined, the default VPC, subnets, and security groups for the selected region will be used.
- When using EFS or FSx file systems, select the security group previously created to allow access to the file system. The VPC ID the security group belongs to needs to match the VPC ID defined for the Seqera Batch compute environment.
You can specify a custom AMI ID.

note
From version 24.2, Seqera supports Amazon Linux 2023 ECS-optimized AMIs, in addition to previously supported Amazon Linux 2 AMIs. AWS-recommended Amazon Linux 2023 AMI names start with al2023-. To learn more about approved versions of the Amazon ECS-optimized AMIs or creating a custom AMI, see this AWS guide.
If a custom AMI is specified and the Enable GPUs option is also selected, the custom AMI will be used instead of the AWS-recommended GPU-optimized AMI.
If you need to debug the EC2 instance provisioned by AWS Batch, specify a Key pair to log in to the instance via SSH.
You can set Min CPUs to be greater than 0, in which case some EC2 instances will remain active. An advantage of this is that pipeline executions will initialize faster.

note
Setting Min CPUs to a value greater than 0 will keep the required compute instances active, even when your pipelines are not running. This will result in additional AWS charges.
Use Head job CPUs and Head job memory to specify the hardware resources allocated for the Nextflow head job. The default head job memory allocation is 4096 MiB.

warning
Setting head job values will also limit the size of any Studio session that can be created in the compute environment.
Use Head job role and Compute job role to grant fine-grained IAM permissions to the Head job and Compute jobs.
Add an execution role ARN to the Batch execution role field to grant the ECS container used by Batch permission to make API calls on your behalf. This is required if the pipeline launched with this compute environment needs access to the secrets stored in this workspace. This field can be ignored if you are not using secrets.
Specify an EBS block size (in GB) in the EBS auto-expandable block size field to control the initial size of the EBS auto-expandable volume. New blocks of this size are added when the volume begins to run out of free space. This feature is deprecated, and is not compatible with Fusion v2.
Enter the Boot disk size (in GB) to specify the size of the boot disk in the VMs created by this compute environment.
If you're using Spot instances, you can also specify the Cost percentage, which is the maximum allowed price of a Spot instance as a percentage of the On-Demand price for that instance type. Spot instances will not be launched until the current Spot price is below the specified cost percentage.
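
As a worked example of the Cost percentage setting, the sketch below computes the price ceiling and checks whether a Spot instance would be launched. The prices are hypothetical values chosen for illustration, and the function names are not part of any AWS or Seqera API.

```python
def max_spot_price(on_demand_price: float, cost_percentage: float) -> float:
    """Highest hourly Spot price allowed for an instance type."""
    if not 0 < cost_percentage <= 100:
        raise ValueError("cost_percentage must be greater than 0 and at most 100")
    return on_demand_price * cost_percentage / 100


def spot_can_launch(current_spot_price: float, on_demand_price: float, cost_percentage: float) -> bool:
    """Spot instances launch only while the Spot price is below the ceiling."""
    return current_spot_price < max_spot_price(on_demand_price, cost_percentage)


if __name__ == "__main__":
    # Hypothetical On-Demand price of 0.50 USD per hour with a 40% cost percentage.
    print(max_spot_price(0.50, 40))
    print(spot_can_launch(0.15, 0.50, 40))
    print(spot_can_launch(0.25, 0.50, 40))
```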
Use AWS CLI tool path to specify the location of the aws CLI.
Specify a CloudWatch Log group for the awslogs driver to stream log entries to an existing log group in CloudWatch.
Specify a custom ECS agent configuration for the ECS agent parameters used by AWS Batch. This is appended to the /etc/ecs/ecs.config file in each cluster node.

note
Altering this file may result in a malfunctioning Batch Forge compute environment. See Amazon ECS container agent configuration to learn more about the available parameters.
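
The custom ECS agent configuration is a set of KEY=value lines. As a minimal sketch, the helper below renders such lines from a dictionary. The function name is hypothetical, and the two parameters shown are examples of ECS container agent settings, not recommended values.

```python
def render_ecs_config(params: dict) -> str:
    """Render ECS agent parameters as KEY=value lines, one per line."""
    lines = []
    for key, value in params.items():
        text = str(value)
        # Reject input that would produce malformed or extra lines.
        if not key or any(ch.isspace() for ch in key) or "=" in key:
            raise ValueError(f"Invalid parameter name: {key!r}")
        if "\n" in text:
            raise ValueError(f"Value for {key} must be a single line")
        lines.append(f"{key}={text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_ecs_config({
        "ECS_IMAGE_PULL_BEHAVIOR": "once",
        "ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION": "1h",
    }))
```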

Manual configuration of Batch resources

This section is for users with a pre-configured AWS environment: follow the AWS Batch queue and compute environment creation instructions to set up the required AWS Batch resources in your account.

An S3 bucket or EFS/FSx file system is required to store Nextflow intermediate files when using Seqera with AWS Batch.

Refer to the IAM user creation section to ensure that your IAM user has the necessary permissions to run pipelines in Seqera Platform. Remove any permissions that are not required for your use case.

Seqera manual compute environment

With your AWS environment and resources set up and your user permissions configured, create an AWS Batch compute environment in Seqera.

caution

AWS Batch creates resources that you may be charged for in your AWS account. See Cloud costs for guidelines to manage cloud resources effectively and prevent unexpected costs.

Select Compute environments from the navigation menu of the Seqera workspace where you want to set up the compute environment.
Select Add compute environment.
Enter a descriptive name for this environment, e.g., AWS Batch Spot (eu-west-1).
Select AWS Batch as the target platform.
From the Credentials drop-down, select existing AWS credentials, or select + to add new credentials. If you're using existing credentials, skip to step 9.

note
You can create multiple credentials in your Seqera environment. See Credentials.
Enter a name, e.g., AWS Credentials.
Under AWS credential mode, select Keys or Role.
For Keys mode:
- Add the Access key and Secret key you previously obtained.
- Optionally paste the IAM role ARN which Seqera must use for accessing your AWS resources in Assume role.
- If you paste a role ARN in Assume role, the Generate External ID switch is displayed. Generating an External ID is optional in Keys mode.
- If Generate External ID is selected, an External ID is automatically generated and shown after you save the credential.
For Role mode:
- Paste the IAM role ARN which Seqera must use for accessing your AWS resources in Assume role.
- External ID is generated automatically when you save the credential.
note
When using AWS keys without an assumed role, the associated AWS user must have been granted permissions to operate on the cloud resources directly. When an assumed role is provided, the IAM user keys are only used to retrieve temporary credentials that impersonate the specified role. This is useful when, for example, multiple IAM users access the same AWS account and the permissions to operate on the resources are granted only to the role.
Select a Region, e.g., eu-west-1 - Europe (Ireland). This region must match the region where your S3 bucket or EFS/FSx work directory is located to avoid high data transfer costs.
In the Pipeline work directory field, enter or select from the drop-down the S3 bucket previously created, e.g., s3://seqera-bucket. This bucket must be in the same region chosen in the previous step to avoid incurring high data transfer costs. The work directory can be customized to specify a folder inside the bucket, e.g., s3://seqera-bucket/nextflow-workdir.

note
When you specify an S3 bucket as your work directory, this bucket is used for the Nextflow cloud cache by default. Seqera adds a cloudcache block to the Nextflow configuration file for all runs executed with this compute environment. This block includes the path to a cloudcache folder in your work directory, e.g., s3://seqera-bucket/cloudcache/.cache. You can specify an alternative cache location with the Nextflow config file field on the pipeline launch form.

Similarly, you can specify a path in an EFS or FSx file system as your work directory. When using EFS or FSx, scroll down to the "EFS file system" or "FSx for Lustre" sections to specify either an existing file system ID or let Seqera create a new one for you automatically. Read the notes in steps 23 and 24 below on how to set up EFS or FSx.

warning
Using an EFS or FSx file system as your work directory is currently incompatible with Studios, and will result in errors with checkpoints and mounted data. Use an S3 bucket as your work directory when using Studios.
Select Enable Wave containers to facilitate access to private container repositories and provision containers in your pipelines using the Wave containers service. See Wave containers for more information.
Select Enable Fusion v2 to allow access to your S3-hosted data via the Fusion v2 virtual distributed file system. This speeds up most data operations. The Fusion v2 file system requires Wave containers to be enabled. See Fusion file system for configuration details.
Use Fusion v2 file system
note
The compute recommendations below are based on internal benchmarking performed by Seqera. Benchmark runs of nf-core/rnaseq used profile test_full, consisting of an input dataset with 16 FASTQ files and a total size of approximately 123.5 GB.
We recommend using Fusion with AWS NVMe instances (fast instance storage) as this delivers the fastest performance when compared to environments using only AWS EBS (Elastic Block Store).
1. Use Seqera Platform version 23.1 or later.
2. Use an S3 bucket as the pipeline work directory.
3. Enable Wave containers, Fusion v2, and fast instance storage.
4. Select the Batch Forge config mode.
5. Fast instance storage requires an EC2 instance type that uses NVMe disks. Specify NVMe-based instance types in Instance types under Advanced options. If left unspecified, Platform selects instances from AWS NVMe-based instance type families. See Instance store temporary block storage for EC2 instances for more information.
note
When enabling fast instance storage, do not select the optimal instance type families (c4, m4, r4) for your compute environment as these are not NVMe-based instances. Specify AWS NVMe-based instance types, or leave the Instance types field empty for Platform to select NVMe instances for you.
tip
We recommend selecting 8xlarge or above for large and long-lived production pipelines:

A local temp storage disk of at least 200 GB and a random read speed of 1000 MBps or more. To work with files larger than 100 GB, increase temp storage accordingly (400 GB or more).

Dedicated networking ensures a guaranteed network speed service level compared with "burstable" instances. See Instance network bandwidth for more information.
When using Fusion v2 without fast instance storage, the following EBS settings are applied to optimize file system performance:
- EBS boot disk size is increased to 100 GB
- EBS boot disk type GP3 is selected
- EBS boot disk throughput is increased to 325 MB/s
Extensive benchmarking of Fusion v2 has demonstrated that the increased cost associated with these settings is generally outweighed by the costs saved due to decreased run time.
Select Enable Fusion Snapshots (beta) to enable Fusion to automatically restore jobs that are interrupted when an AWS Spot instance reclamation occurs. Requires Fusion v2. See Fusion Snapshots for more information.
Set the Config mode to Manual.
Enter the Head queue created following the instructions. This is the name of the AWS Batch queue where the Nextflow main job will run.
Enter the Compute queue, which is the name of the AWS Batch queue where tasks will be submitted.
Apply Resource labels to the cloud resources produced by this compute environment. Workspace default resource labels are prefilled.
Expand Staging options to include:
- Optional pre- or post-run Bash scripts that execute before or after the Nextflow pipeline execution in your environment.
- Global Nextflow configuration settings for all pipeline runs launched with this compute environment. Values defined here are pre-filled in the Nextflow config file field in the pipeline launch form. These values can be overridden during pipeline launch.
info
Configuration settings in this field override the same values in the pipeline repository nextflow.config file. See Nextflow config file for more information on configuration priority.
Under Environment variables, add each variable with a Name, Value, and Target Environment:
- Head job: Adds the variable to the Nextflow head job container, which evaluates nextflow.config and submits tasks to the compute backend. Use this target for variables that Nextflow or its plugins read, such as NXF_OPTS, NXF_JVM_ARGS, NXF_PLUGINS_DEFAULT, or proxy settings the head node uses to reach external services.
- Compute job: Adds the variable to the worker containers that run individual pipeline tasks. Use this target for variables your pipeline tools read, such as OPENAI_API_KEY for a process that calls the OpenAI API, registry credentials needed inside the task container, or tool-specific settings like JAVA_HOME.
- Head and Compute jobs: Adds the variable to both the head job and the compute jobs. Use this target for values needed in both places, such as an HTTP proxy used by both Nextflow and task tools, or a credential needed in both the head job and individual compute tasks.
note
For sensitive values such as API keys and tokens, use pipeline secrets instead of custom environment variables. Custom environment variables are stored in the compute environment configuration and cannot be edited after creation. To rotate a value, recreate the compute environment.
Configure any advanced options described in the next section, as needed.
Select Create to finalize the compute environment setup.

info

See Launch pipelines to start executing workflows in your AWS Batch compute environment.

Advanced options

Seqera compute environments for AWS Batch include advanced options to configure resource allocation, execution roles, custom AWS CLI tool paths, and CloudWatch integration.

Configure a custom networking setup using the VPC ID, Subnets, and Security groups fields.
- If not defined, the default VPC, subnets, and security groups for the selected region will be used.
- When using EFS or FSx file systems, select the security group previously created to allow access to the file system. The VPC ID the security group belongs to needs to match the VPC ID defined for the Seqera Batch compute environment.
Use Head job CPUs and Head job memory to specify the hardware resources allocated for the Nextflow head job. The default head job memory allocation is 4096 MiB.
Use Head job role and Compute job role to grant fine-grained IAM permissions to the head job and compute jobs.
Add an execution role ARN to the Batch execution role field to grant the ECS container used by Batch permission to make API calls on your behalf. This is required if the pipeline launched with this compute environment needs access to the secrets stored in this workspace. This field can be ignored if you are not using secrets.
Use AWS CLI tool path to specify the location of the aws CLI.
Specify a CloudWatch Log group for the awslogs driver to stream log entries to an existing log group in CloudWatch.
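The networking option above requires that any security group used for EFS or FSx access belongs to the same VPC as the compute environment. The following is a minimal sketch of how that could be checked before creating the compute environment. The function name and its parameters are hypothetical; it assumes a client object that exposes the EC2 `describe_security_groups` call, such as the one returned by `boto3.client("ec2")`.

```python
def security_groups_outside_vpc(ec2_client, vpc_id, security_group_ids):
    """Return the IDs of security groups that do not belong to vpc_id.

    ec2_client is assumed to behave like a boto3 EC2 client. The function
    name and argument names are illustrative, not part of Seqera Platform.
    """
    response = ec2_client.describe_security_groups(
        GroupIds=list(security_group_ids)
    )
    # Each returned group reports the VPC it was created in under "VpcId".
    return [
        group["GroupId"]
        for group in response["SecurityGroups"]
        if group.get("VpcId") != vpc_id
    ]
```

An empty result means every listed security group is in the expected VPC. Any returned ID points to a group that would need to be replaced or recreated in the VPC selected for the compute environment.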

caution

Seqera is designed to terminate compute resources when a Nextflow pipeline completes or is canceled. However, due to external factors (including user-defined workflow logic, transient cloud faults, or abnormal pipeline exits), residual resources may persist. While Seqera provides visibility to detect and resolve these states, customers are responsible for final resource cleanup and ensuring compute environments operate according to Platform expectations.

With Nextflow v24.10 and later, compute jobs are identifiable by their Seqera workflow ID. Search the AWS console, CLI, or API for jobs prefixed with a given workflow ID to check their status and perform additional cleanup in edge cases.
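As an illustration of the search described above, the following sketch lists the jobs in one queue whose names begin with a workflow ID. It assumes the workflow ID prefix appears in the AWS Batch job name, and that `batch_client` behaves like the client returned by `boto3.client("batch")`. The function names are hypothetical. The AWS Batch `ListJobs` operation accepts a `JOB_NAME` filter, where a trailing asterisk matches any job name starting with the given string.

```python
def find_jobs_by_workflow_id(batch_client, job_queue, workflow_id):
    """Collect job summaries in job_queue whose names start with workflow_id.

    Assumes the Seqera workflow ID is the prefix of the AWS Batch job name.
    """
    request = {
        "jobQueue": job_queue,
        # A trailing asterisk turns the JOB_NAME filter into a prefix match.
        "filters": [{"name": "JOB_NAME", "values": [f"{workflow_id}*"]}],
    }
    jobs = []
    while True:
        page = batch_client.list_jobs(**request)
        jobs.extend(page.get("jobSummaryList", []))
        token = page.get("nextToken")
        if not token:
            return jobs
        # Ask for the next page of results on the following iteration.
        request["nextToken"] = token


def unfinished(jobs):
    """Keep only jobs that have not reached a terminal AWS Batch status."""
    terminal = {"SUCCEEDED", "FAILED"}
    return [job for job in jobs if job.get("status") not in terminal]
```

Jobs returned by `unfinished` after a pipeline has completed or been canceled are candidates for manual review. Run the search against both the head queue and the compute queue.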
