Skip to main content
Version: 23.4

Manual AWS Batch configuration

This page describes how to set up AWS roles and Batch queues manually for the deployment of Nextflow workloads with Seqera Platform.

tip

Manual AWS Batch configuration is only necessary if you don't use Batch Forge.

Batch Forge automatically creates the AWS Batch queues required for your workflow executions.

Complete the following procedures to configure AWS Batch manually:

  1. Create a user policy.
  2. Create the instance role policy.
  3. Create the AWS Batch service role.
  4. Create an EC2 Instance role.
  5. Create an EC2 SpotFleet role.
  6. Create a launch template.
  7. Create the AWS Batch compute environments.
  8. Create the AWS Batch queue.

Create a user policy

Create the policy for the user launching Nextflow jobs:

  1. In the IAM Console, select Create policy from the Policies page.

  2. Create a new policy with the following content:

    {
    "Version": "2012-10-17",
    "Statement": [
    {
    "Sid": "Stmt1530313170000",
    "Effect": "Allow",
    "Action": [
    "batch:CancelJob",
    "batch:RegisterJobDefinition",
    "batch:DescribeComputeEnvironments",
    "batch:DescribeJobDefinitions",
    "batch:DescribeJobQueues",
    "batch:DescribeJobs",
    "batch:ListJobs",
    "batch:SubmitJob",
    "batch:TerminateJob"
    ],
    "Resource": ["*"]
    }
    ]
    }
  3. Save with it the name seqera-user.

Create the instance role policy

Create the policy with a role that allows Seqera to submit Batch jobs on your EC2 instances:

  1. In the IAM Console, select Create policy from the Policies page.

  2. Create a new policy with the following content:

    {
    "Version": "2012-10-17",
    "Statement": [
    {
    "Sid": "VisualEditor0",
    "Effect": "Allow",
    "Action": [
    "batch:DescribeJobQueues",
    "batch:CancelJob",
    "batch:SubmitJob",
    "batch:ListJobs",
    "batch:DescribeComputeEnvironments",
    "batch:TerminateJob",
    "batch:DescribeJobs",
    "batch:RegisterJobDefinition",
    "batch:DescribeJobDefinitions",
    "batch:TagResource",
    "ecs:DescribeTasks",
    "ec2:DescribeInstances",
    "ec2:DescribeInstanceTypes",
    "ec2:DescribeInstanceAttribute",
    "ecs:DescribeContainerInstances",
    "ec2:DescribeInstanceStatus",
    "logs:Describe*",
    "logs:Get*",
    "logs:List*",
    "logs:Create*",
    "logs:Put*",
    "logs:StartQuery",
    "logs:StopQuery",
    "logs:TestMetricFilter",
    "logs:FilterLogEvents"
    ],
    "Resource": "*"
    }
    ]
    }
  3. Save it with the name seqera-batchjob.

Create the Batch Service role

Create a service role used by AWS Batch to launch EC2 instances on your behalf:

  1. In the IAM Console, select Create role from the Roles page.
  2. Select AWS service as the trusted entity type, and Batch as the service.
  3. On the next page, the AWSBatchServiceRole is already attached. No further permissions are needed for this role.
  4. Enter seqera-servicerole as the role name and add an optional description and tags if needed, then select Create.

Create an EC2 instance role

Create a role that controls which AWS resources the EC2 instances launched by AWS Batch can access:

  1. In the IAM Console, select Create role from the Roles page.
  2. Select AWS service as the trusted entity type, EC2 as the service, and EC2 - Allows EC2 instances to call AWS services on your behalf as the use case.
  3. Select Next: Permissions. Search for the following policies to attach to the role:
    • AmazonEC2ContainerServiceforEC2Role
    • AmazonS3FullAccess (you may want to use a custom policy to allow access only on specific S3 buckets)
    • seqera-batchjob (the instance role policy created above)
  4. Enter seqera-instancerole as the role name and add an optional description and tags if needed, then select Create.

Create an EC2 SpotFleet role

The EC2 SpotFleet role allows you to use Spot instances when you run jobs in AWS Batch. Create a role for the creation and launch of Spot fleets — Spot instances with similar compute capabilities (i.e., vCPUs and RAM):

  1. In the IAM Console, select Create role from the Roles page.
  2. Select AWS service as the trusted entity type, EC2 as the service, and EC2 - Spot Fleet Tagging as the use case.
  3. On the next page, the AmazonEC2SpotFleetTaggingRole is already attached. No further permissions are needed for this role.
  4. Enter seqera-fleetrole as the role name and add an optional description and tags if needed, then select Create.

Create a launch template

Create a launch template to configure the EC2 instances deployed by Batch jobs:

  1. In the EC2 Console, select Create launch template from the Launch templates page.

  2. Scroll down to Advanced details and paste the following in the User data field:

    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="//"

    --//
    Content-Type: text/cloud-config; charset="us-ascii"

    #cloud-config
    write_files:
    - path: /root/custom-ce.sh
    permissions: 0744
    owner: root
    content: |
    #!/usr/bin/env bash
    yum install -q -y jq sed wget unzip nvme-cli lvm2
    wget -q https://amazoncloudwatch-agent.s3.amazonaws.com/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
    rpm -U ./amazon-cloudwatch-agent.rpm
    rm -f ./amazon-cloudwatch-agent.rpm
    curl -s https://nf-xpack.seqera.io/amazon-cloudwatch-agent/custom-v0.1.json \
    # | sed 's/custom-id/<your custom ID>/g' \
    > /opt/aws/amazon-cloudwatch-agent/bin/config.json
    /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
    -a fetch-config \
    -m ec2 \
    -s \
    -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
    mkdir -p /scratch/fusion
    NVME_DISKS=($(nvme list | grep 'Amazon EC2 NVMe Instance Storage' | awk '{ print $1 }'))
    NUM_DISKS=${#NVME_DISKS[@]}
    if (( NUM_DISKS > 0 )); then
    if (( NUM_DISKS == 1 )); then
    mkfs -t xfs ${NVME_DISKS[0]}
    mount ${NVME_DISKS[0]} /scratch/fusion
    else
    pvcreate ${NVME_DISKS[@]}
    vgcreate scratch_fusion ${NVME_DISKS[@]}
    lvcreate -l 100%FREE -n volume scratch_fusion
    mkfs -t xfs /dev/mapper/scratch_fusion-volume
    mount /dev/mapper/scratch_fusion-volume /scratch/fusion
    fi
    fi
    chmod a+w /scratch/fusion
    mkdir -p /etc/ecs
    echo ECS_IMAGE_PULL_BEHAVIOR=once >> /etc/ecs/ecs.config
    echo ECS_ENABLE_AWSLOGS_EXECUTIONROLE_OVERRIDE=true >> /etc/ecs/ecs.config
    systemctl stop docker
    ## install AWS CLI
    mkdir -p /home/ec2-user
    curl -s https://nf-xpack.seqera.io/miniconda-awscli/miniconda-awscli.tar.gz \
    | tar xz -C /home/ec2-user
    export PATH=$PATH:/home/ec2-user/miniconda/bin
    ln -s /home/ec2-user/miniconda/bin/aws /usr/bin/aws
    systemctl start docker
    systemctl enable --now --no-block ecs
    echo "1258291200" > /proc/sys/vm/dirty_bytes
    echo "629145600" > /proc/sys/vm/dirty_background_bytes

    runcmd:
    - bash /root/custom-ce.sh

    --//--
  3. To prepend a custom identifier to the CloudWatch log streams for AWS resources created by your manual compute environment, uncomment the | sed 's/custom-id/<your custom ID>/g' \ line and replace <your custom ID> with your custom ID. If ommitted, this defaults to custom-id.

  4. Save the template with the name seqera-launchtemplate.

Create the Batch compute environments

caution

AWS Graviton instances (ARM64 CPU architecture) are not supported in manual compute environments. To use Graviton instances, create your AWS Batch compute environment with Batch Forge.

Nextflow makes use of two job queues during workflow execution:

  • A head queue to run the Nextflow application
  • A compute queue where Nextflow will submit job executions

While the compute queue can use a compute environment with Spot instances, the head queue requires an on-demand compute environment. If you intend to use an on-demand compute environment for compute jobs, the same job queue can be used for both head and compute.

note

Spot instances can significantly reduce your AWS compute costs, provided your workflow compute tasks can run on ephemeral instances.

Create a compute environment for each queue in the AWS Batch console:

The head queue requires an on-demand compute environment. Do not select Use Spot instances during compute environment creation.

  1. In the Batch Console, select Create on the Compute environments page.
  2. Select Amazon EC2 as the compute environment configuration.
    note

    Seqera AWS Batch compute environments created with Batch Forge support using Fargate for the head job, but manual compute environments must use EC2.

  3. Enter a name of your choice, and apply the seqera-servicerole and seqera-instancerole.
  4. Enter vCPU limits and instance types, if needed.
    note

    To use the same queue for both head and compute tasks, you must assign sufficient resources to your compute environment.

  5. Expand Additional configuration and select the seqera-launchtemplate from the Launch template dropdown.
  6. Configure VPCs, subnets, and security groups on the next page as needed.
  7. Review your configuration and select Create compute environment.

Create the Batch queue

Create a Batch queue to be associated with each compute environment.

note

You only need to create one queue if you intend to use on-demand instances for your workflow compute tasks. Compute environments with Spot instances require separate queues for the head and compute tasks.

  1. Go to the Batch Console.
  2. Create a new queue.
  3. Associate the queue with the head queue compute environment created in the previous section.
  4. Save it with a name of your choice.

Use the AWS resources created on this page to create your manual AWS Batch compute environment.