Version: 23.4.0

Manual AWS Batch configuration

This page describes how to set up AWS roles and Batch queues manually for the deployment of Nextflow workloads with Seqera Platform.

Manual AWS Batch configuration is only necessary if you don't use Batch Forge.

Batch Forge automatically creates the AWS Batch queues required for your workflow executions.

Complete the following procedures to configure AWS Batch manually:

  1. Create a user policy.
  2. Create the instance role policy.
  3. Create the AWS Batch service role.
  4. Create an EC2 instance role.
  5. Create an EC2 SpotFleet role.
  6. Create a launch template.
  7. Create the AWS Batch compute environments.
  8. Create the AWS Batch queue.

Create a user policy

Create the policy for the user launching Nextflow jobs:

  1. In the IAM Console, select Create policy from the Policies page.

  2. Create a new policy with the following content:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1530313170000",
          "Effect": "Allow",
          "Action": [
            "batch:CancelJob",
            "batch:RegisterJobDefinition",
            "batch:DescribeComputeEnvironments",
            "batch:DescribeJobDefinitions",
            "batch:DescribeJobQueues",
            "batch:DescribeJobs",
            "batch:ListJobs",
            "batch:SubmitJob",
            "batch:TerminateJob"
          ],
          "Resource": ["*"]
        }
      ]
    }
  3. Save it with the name seqera-user.
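
The same policy can also be created from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials that are allowed to create IAM policies, and that the JSON above has been saved to a local file. The function name and the default file name seqera-user.json are illustrative and not part of Seqera Platform.

```shell
#!/usr/bin/env bash
# Sketch: create the seqera-user policy with the AWS CLI instead of the console.
# Assumption: the policy JSON shown above is saved locally. The function name
# and the default file name are illustrative only.
create_user_policy() {
  local policy_file="${1:-seqera-user.json}"
  aws iam create-policy \
    --policy-name seqera-user \
    --policy-document "file://${policy_file}"
}
```

To use it, call `create_user_policy ./seqera-user.json` from the directory that contains the policy file.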

Create the instance role policy

Create the policy to attach to the instance role. This policy allows Seqera to submit Batch jobs on your EC2 instances:

  1. In the IAM Console, select Create policy from the Policies page.

  2. Create a new policy with the following content:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": [
            "batch:DescribeJobQueues",
            "batch:CancelJob",
            "batch:SubmitJob",
            "batch:ListJobs",
            "batch:DescribeComputeEnvironments",
            "batch:TerminateJob",
            "batch:DescribeJobs",
            "batch:RegisterJobDefinition",
            "batch:DescribeJobDefinitions",
            "batch:TagResource",
            "ecs:DescribeTasks",
            "ec2:DescribeInstances",
            "ec2:DescribeInstanceTypes",
            "ec2:DescribeInstanceAttribute",
            "ecs:DescribeContainerInstances",
            "ec2:DescribeInstanceStatus",
            "logs:Describe*",
            "logs:Get*",
            "logs:List*",
            "logs:StartQuery",
            "logs:StopQuery",
            "logs:TestMetricFilter",
            "logs:FilterLogEvents"
          ],
          "Resource": "*"
        }
      ]
    }
  3. Save it with the name seqera-batchjob.

Create the Batch service role

Create a service role used by AWS Batch to launch EC2 instances on your behalf:

  1. In the IAM Console, select Create role from the Roles page.
  2. Select AWS service as the trusted entity type, and Batch as the service.
  3. On the next page, the AWSBatchServiceRole policy is already attached. No further permissions are needed for this role.
  4. Enter seqera-servicerole as the role name and add an optional description and tags if needed, then select Create.
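
The console steps above can be approximated with the AWS CLI. The sketch below assumes the AWS CLI is installed and configured, and that a trust policy naming batch.amazonaws.com as the trusted service matches what the console generates for the Batch use case. The function names and the trust policy file path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: create the Batch service role from the command line.
# Assumption: this trust policy mirrors the one the console generates when
# Batch is selected as the trusted service. File path and names are illustrative.
write_batch_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "batch.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

create_service_role() {
  local trust_file="${1:-batch-trust.json}"
  write_batch_trust_policy "$trust_file"
  # Create the role with the trust policy, then attach the managed policy.
  aws iam create-role \
    --role-name seqera-servicerole \
    --assume-role-policy-document "file://${trust_file}"
  aws iam attach-role-policy \
    --role-name seqera-servicerole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AWSBatchServiceRole
}
```

Calling `create_service_role` writes the trust policy to batch-trust.json in the current directory and then runs both IAM commands.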

Create an EC2 instance role

Create a role that controls which AWS resources the EC2 instances launched by AWS Batch can access:

  1. In the IAM Console, select Create role from the Roles page.

  2. Select AWS service as the trusted entity type, EC2 as the service, and EC2 - Allows EC2 instances to call AWS services on your behalf as the use case.

  3. Select Next: Permissions. Search for the following policies to attach to the role:

    • AmazonEC2ContainerServiceforEC2Role
    • AmazonS3FullAccess (you may want to use a custom policy to allow access only on specific S3 buckets)
    • seqera-batchjob (the instance role policy created above)
  4. Enter seqera-instancerole as the role name and add an optional description and tags if needed, then select Create.
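
If you prefer a custom policy over AmazonS3FullAccess, one option is to generate a policy limited to a single bucket. This is a sketch only: the action list is a common read and write baseline and is an assumption, not a list confirmed by Seqera, so adjust it to what your pipelines need. The function name and the bucket name in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write an S3 policy limited to one bucket, as an alternative to
# AmazonS3FullAccess. Assumption: the actions below are a generic read/write
# baseline, not a list taken from the Seqera documentation.
write_bucket_policy() {
  local bucket="$1" out_file="$2"
  cat > "$out_file" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::${bucket}"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": ["arn:aws:s3:::${bucket}/*"]
    }
  ]
}
EOF
}
```

For example, `write_bucket_policy my-work-bucket s3-policy.json` writes a policy file that can be pasted into the IAM Console policy editor and attached to the instance role in place of AmazonS3FullAccess.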

Create an EC2 SpotFleet role

The EC2 SpotFleet role allows you to use Spot instances when you run jobs in AWS Batch. Create a role for the creation and launch of Spot fleets, which are groups of Spot instances with similar compute capabilities (such as vCPUs and RAM):

  1. In the IAM Console, select Create role from the Roles page.
  2. Select AWS service as the trusted entity type, EC2 as the service, and EC2 - Spot Fleet Tagging as the use case.
  3. On the next page, the AmazonEC2SpotFleetTaggingRole policy is already attached. No further permissions are needed for this role.
  4. Enter seqera-fleetrole as the role name and add an optional description and tags if needed, then select Create.

Create a launch template

Create a launch template to configure the EC2 instances deployed by Batch jobs:

  1. In the EC2 Console, select Create launch template from the Launch templates page.
  2. Scroll down to Advanced details and paste the following in the User data field:

    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="//"

    --//
    Content-Type: text/cloud-config; charset="us-ascii"

    #cloud-config
    write_files:
      - path: /root/tower-forge.sh
        permissions: 0744
        owner: root
        content: |
          #!/usr/bin/env bash
          exec > >(tee /var/log/tower-forge.log|logger -t TowerForge -s 2>/dev/console) 2>&1
          ##
          yum install -q -y jq sed wget unzip nvme-cli lvm2
          wget -q https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
          rpm -U ./amazon-cloudwatch-agent.rpm
          rm -f ./amazon-cloudwatch-agent.rpm
          curl -s https://nf-xpack.seqera.io/amazon-cloudwatch-agent/config-v0.3.json \
            | sed 's/$FORGE_ID/<your prefix>/g' \
            > /opt/aws/amazon-cloudwatch-agent/bin/config.json
          /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
            -a fetch-config \
            -m ec2 \
            -s \
            -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
          mkdir -p /scratch/fusion
          NVME_DISKS=($(nvme list | grep 'Amazon EC2 NVMe Instance Storage' | awk '{ print $1 }'))
          NUM_DISKS=${#NVME_DISKS[@]}
          if (( NUM_DISKS > 0 )); then
            if (( NUM_DISKS == 1 )); then
              mkfs -t xfs ${NVME_DISKS[0]}
              mount ${NVME_DISKS[0]} /scratch/fusion
            else
              pvcreate ${NVME_DISKS[@]}
              vgcreate scratch_fusion ${NVME_DISKS[@]}
              lvcreate -l 100%FREE -n volume scratch_fusion
              mkfs -t xfs /dev/mapper/scratch_fusion-volume
              mount /dev/mapper/scratch_fusion-volume /scratch/fusion
            fi
          fi
          chmod a+w /scratch/fusion
          mkdir -p /etc/ecs
          echo ECS_IMAGE_PULL_BEHAVIOR=once >> /etc/ecs/ecs.config
          echo ECS_ENABLE_AWSLOGS_EXECUTIONROLE_OVERRIDE=true >> /etc/ecs/ecs.config
          systemctl stop docker
          ## install AWS CLI v2
          curl "https://awscli.amazonaws.com/awscli-exe-linux-$(arch).zip" -o "awscliv2.zip"
          unzip -q awscliv2.zip
          sudo ./aws/install
          systemctl start docker
          systemctl enable --now --no-block ecs
          echo "1258291200" > /proc/sys/vm/dirty_bytes
          echo "629145600" > /proc/sys/vm/dirty_background_bytes

    runcmd:
      - bash /root/tower-forge.sh

    --//--
  3. In the line | sed 's/$FORGE_ID/<your prefix>/g' \, replace <your prefix> with your desired prefix to be added to the AWS resources created by your compute environment.
  4. Save the template with the name seqera-launchtemplate.
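
To see what the prefix substitution does before you edit the user data, the sed expression can be tried locally. This is a small sketch: the sample input line and the prefix my-prefix are hypothetical stand-ins, since the real input is the CloudWatch agent configuration downloaded by the script.

```shell
#!/usr/bin/env bash
# Sketch: how the sed expression in the user data substitutes the prefix.
# The sample input line and the prefix value are hypothetical stand-ins.
apply_prefix() {
  local prefix="$1"
  # In a basic regular expression, "$" is literal unless it ends the pattern,
  # so this matches the text $FORGE_ID rather than expanding a shell variable.
  sed "s/\$FORGE_ID/${prefix}/g"
}

printf '%s\n' '{"log_group_name": "$FORGE_ID"}' | apply_prefix my-prefix
# prints {"log_group_name": "my-prefix"}
```

Because the expression uses / as its delimiter, a prefix that contains a / character would need to be escaped or the delimiter changed.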

Create the Batch compute environments

Nextflow makes use of two job queues during workflow execution:

  • A head queue to run the Nextflow application
  • A compute queue where Nextflow will submit job executions

While the compute queue can use a compute environment with Spot instances, the head queue requires an on-demand compute environment. If you intend to use an on-demand compute environment for compute jobs, the same job queue can be used for both head and compute.

Spot instances can significantly reduce your AWS compute costs, provided your workflow compute tasks can run on ephemeral instances.

Create a compute environment for each queue in the AWS Batch console:

The head queue requires an on-demand compute environment. Do not select Use Spot instances during compute environment creation.

  1. In the Batch Console, select Create on the Compute environments page.

  2. Select Amazon EC2 as the compute environment configuration.

    Seqera AWS Batch compute environments created with Batch Forge support using Fargate for the head job, but manual compute environments must use EC2.

  3. Enter a name of your choice, and apply the seqera-servicerole and seqera-instancerole.

  4. Enter vCPU limits and instance types, if needed.

    To use the same queue for both head and compute tasks, you must assign sufficient resources to your compute environment.

  5. Expand Additional configuration and select the seqera-launchtemplate from the Launch template dropdown.

  6. Configure VPCs, subnets, and security groups on the next page as needed.

  7. Review your configuration and select Create compute environment.

Create the Batch queue

Create a Batch queue to be associated with each compute environment.

You only need to create one queue if you intend to use on-demand instances for your workflow compute tasks. Compute environments with Spot instances require separate queues for the head and compute tasks.

  1. Go to the Batch Console.
  2. Create a new queue.
  3. Associate the queue with the head queue compute environment created in the previous section.
  4. Save it with a name of your choice.
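
The queue can also be created with the AWS CLI. The sketch below assumes the AWS CLI is installed and configured, and that the compute environment already exists. The function name, the queue name, and the compute environment name in the usage note are hypothetical, and the priority value of 1 is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch: create a Batch job queue attached to one compute environment.
# Assumption: the compute environment named in the second argument already
# exists. The function name and the priority value are illustrative.
create_job_queue() {
  local queue_name="$1" compute_env="$2"
  aws batch create-job-queue \
    --job-queue-name "$queue_name" \
    --state ENABLED \
    --priority 1 \
    --compute-environment-order "order=1,computeEnvironment=${compute_env}"
}
```

For example, `create_job_queue seqera-head-queue seqera-head-ce` creates a head queue. Run it a second time with the Spot compute environment if you use a separate compute queue.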

Use the AWS resources created on this page to create your manual AWS Batch compute environment.