Manual AWS Batch configuration
This page describes how to set up AWS roles and Batch queues manually for the deployment of Nextflow workloads with Seqera Platform.
Manual AWS Batch configuration is only necessary if you don't use Batch Forge.
Batch Forge automatically creates the AWS Batch queues required for your workflow executions.
Complete the following procedures to configure AWS Batch manually:
- Create a user policy.
- Create the instance role policy.
- Create the AWS Batch service role.
- Create an EC2 Instance role.
- Create an EC2 SpotFleet role.
- Create a launch template.
- Create the AWS Batch compute environments.
- Create the AWS Batch queue.
Create a user policy
Create the policy for the user launching Nextflow jobs:
-
In the IAM Console, select Create policy from the Policies page.
-
Create a new policy with the following content:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Stmt1530313170000",
"Effect": "Allow",
"Action": [
"batch:CancelJob",
"batch:RegisterJobDefinition",
"batch:DescribeComputeEnvironments",
"batch:DescribeJobDefinitions",
"batch:DescribeJobQueues",
"batch:DescribeJobs",
"batch:ListJobs",
"batch:SubmitJob",
"batch:TerminateJob"
],
"Resource": ["*"]
}
]
} -
Save with it the name
seqera-user
.
Create the instance role policy
Create the policy with a role that allows Seqera to submit Batch jobs on your EC2 instances:
-
In the IAM Console, select Create policy from the Policies page.
-
Create a new policy with the following content:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"batch:DescribeJobQueues",
"batch:CancelJob",
"batch:SubmitJob",
"batch:ListJobs",
"batch:DescribeComputeEnvironments",
"batch:TerminateJob",
"batch:DescribeJobs",
"batch:RegisterJobDefinition",
"batch:DescribeJobDefinitions",
"batch:TagResource",
"ecs:DescribeTasks",
"ec2:DescribeInstances",
"ec2:DescribeInstanceTypes",
"ec2:DescribeInstanceAttribute",
"ecs:DescribeContainerInstances",
"ec2:DescribeInstanceStatus",
"logs:Describe*",
"logs:Get*",
"logs:List*",
"logs:Create*",
"logs:Put*",
"logs:StartQuery",
"logs:StopQuery",
"logs:TestMetricFilter",
"logs:FilterLogEvents"
],
"Resource": "*"
}
]
} -
Save it with the name
seqera-batchjob
.
Create the Batch Service role
Create a service role used by AWS Batch to launch EC2 instances on your behalf:
- In the IAM Console, select Create role from the Roles page.
- Select AWS service as the trusted entity type, and Batch as the service.
- On the next page, the
AWSBatchServiceRole
is already attached. No further permissions are needed for this role. - Enter
seqera-servicerole
as the role name and add an optional description and tags if needed, then select Create.
Create an EC2 instance role
Create a role that controls which AWS resources the EC2 instances launched by AWS Batch can access:
- In the IAM Console, select Create role from the Roles page.
- Select AWS service as the trusted entity type, EC2 as the service, and EC2 - Allows EC2 instances to call AWS services on your behalf as the use case.
- Select Next: Permissions. Search for the following policies to attach to the role:
AmazonEC2ContainerServiceforEC2Role
AmazonS3FullAccess
(you may want to use a custom policy to allow access only on specific S3 buckets)seqera-batchjob
(the instance role policy created above)
- Enter
seqera-instancerole
as the role name and add an optional description and tags if needed, then select Create.
Create an EC2 SpotFleet role
The EC2 SpotFleet role allows you to use Spot instances when you run jobs in AWS Batch. Create a role for the creation and launch of Spot fleets — Spot instances with similar compute capabilities (i.e., vCPUs and RAM):
- In the IAM Console, select Create role from the Roles page.
- Select AWS service as the trusted entity type, EC2 as the service, and EC2 - Spot Fleet Tagging as the use case.
- On the next page, the
AmazonEC2SpotFleetTaggingRole
is already attached. No further permissions are needed for this role. - Enter
seqera-fleetrole
as the role name and add an optional description and tags if needed, then select Create.
Create a launch template
Create a launch template to configure the EC2 instances deployed by Batch jobs:
- AWS Batch with Fusion v2
- AWS Batch without Fusion v2
-
In the EC2 Console, select Create launch template from the Launch templates page.
-
Scroll down to Advanced details and paste the following in the User data field:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"
--//
Content-Type: text/cloud-config; charset="us-ascii"
#cloud-config
write_files:
- path: /root/custom-ce.sh
permissions: 0744
owner: root
content: |
#!/usr/bin/env bash
yum install -q -y jq sed wget unzip nvme-cli lvm2
wget -q https://amazoncloudwatch-agent.s3.amazonaws.com/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
rpm -U ./amazon-cloudwatch-agent.rpm
rm -f ./amazon-cloudwatch-agent.rpm
curl -s https://nf-xpack.seqera.io/amazon-cloudwatch-agent/custom-v0.1.json \
# | sed 's/custom-id/<your custom ID>/g' \
> /opt/aws/amazon-cloudwatch-agent/bin/config.json
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-s \
-c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
mkdir -p /scratch/fusion
NVME_DISKS=($(nvme list | grep 'Amazon EC2 NVMe Instance Storage' | awk '{ print $1 }'))
NUM_DISKS=${#NVME_DISKS[@]}
if (( NUM_DISKS > 0 )); then
if (( NUM_DISKS == 1 )); then
mkfs -t xfs ${NVME_DISKS[0]}
mount ${NVME_DISKS[0]} /scratch/fusion
else
pvcreate ${NVME_DISKS[@]}
vgcreate scratch_fusion ${NVME_DISKS[@]}
lvcreate -l 100%FREE -n volume scratch_fusion
mkfs -t xfs /dev/mapper/scratch_fusion-volume
mount /dev/mapper/scratch_fusion-volume /scratch/fusion
fi
fi
chmod a+w /scratch/fusion
mkdir -p /etc/ecs
echo ECS_IMAGE_PULL_BEHAVIOR=once >> /etc/ecs/ecs.config
echo ECS_ENABLE_AWSLOGS_EXECUTIONROLE_OVERRIDE=true >> /etc/ecs/ecs.config
systemctl stop docker
## install AWS CLI
mkdir -p /home/ec2-user
curl -s https://nf-xpack.seqera.io/miniconda-awscli/miniconda-awscli.tar.gz \
| tar xz -C /home/ec2-user
export PATH=$PATH:/home/ec2-user/miniconda/bin
ln -s /home/ec2-user/miniconda/bin/aws /usr/bin/aws
systemctl start docker
systemctl enable --now --no-block ecs
echo "1258291200" > /proc/sys/vm/dirty_bytes
echo "629145600" > /proc/sys/vm/dirty_background_bytes
runcmd:
- bash /root/custom-ce.sh
--//-- -
To prepend a custom identifier to the CloudWatch log streams for AWS resources created by your manual compute environment, uncomment the
| sed 's/custom-id/<your custom ID>/g' \
line and replace<your custom ID>
with your custom ID. If ommitted, this defaults tocustom-id
. -
Save the template with the name
seqera-launchtemplate
.
-
In the EC2 Console, select Create launch template from the Launch templates page.
-
Scroll down to Advanced details and paste the following in the User data field:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"
--//
Content-Type: text/cloud-config; charset="us-ascii"
#cloud-config
write_files:
- path: /root/custom-ce.sh
permissions: 0744
owner: root
content: |
#!/usr/bin/env bash
yum install -q -y jq sed wget unzip nvme-cli lvm2
wget -q https://amazoncloudwatch-agent.s3.amazonaws.com/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
rpm -U ./amazon-cloudwatch-agent.rpm
rm -f ./amazon-cloudwatch-agent.rpm
curl -s https://nf-xpack.seqera.io/amazon-cloudwatch-agent/custom-v0.1.json \
# | sed 's/custom-id/<your custom ID>/g' \
> /opt/aws/amazon-cloudwatch-agent/bin/config.json
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-s \
-c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
mkdir -p /etc/ecs
echo ECS_IMAGE_PULL_BEHAVIOR=once >> /etc/ecs/ecs.config
echo ECS_ENABLE_AWSLOGS_EXECUTIONROLE_OVERRIDE=true >> /etc/ecs/ecs.config
systemctl stop docker
## install AWS CLI v2
mkdir -p /home/ec2-user
curl -s https://nf-xpack.seqera.io/miniconda-awscli/miniconda-awscli.tar.gz \
| tar xz -C /home/ec2-user
export PATH=$PATH:/home/ec2-user/miniconda/bin
ln -s /home/ec2-user/miniconda/bin/aws /usr/bin/aws
systemctl start docker
systemctl enable --now --no-block ecs
echo "1258291200" > /proc/sys/vm/dirty_bytes
echo "629145600" > /proc/sys/vm/dirty_background_bytes
runcmd:
- bash /root/custom-ce.sh
--//-- -
To prepend a custom identifier to the CloudWatch log streams for AWS resources created by your manual compute environment, uncomment the
| sed 's/custom-id/<your custom ID>/g' \
line and replace<your custom ID>
with your custom ID. If ommitted, this defaults tocustom-id
. -
Save the template with the name
seqera-launchtemplate
.
Create the Batch compute environments
AWS Graviton instances (ARM64 CPU architecture) are not supported in manual compute environments. To use Graviton instances, create your AWS Batch compute environment with Batch Forge.
Nextflow makes use of two job queues during workflow execution:
- A head queue to run the Nextflow application
- A compute queue where Nextflow will submit job executions
While the compute queue can use a compute environment with Spot instances, the head queue requires an on-demand compute environment. If you intend to use an on-demand compute environment for compute jobs, the same job queue can be used for both head and compute.
Spot instances can significantly reduce your AWS compute costs, provided your workflow compute tasks can run on ephemeral instances.
Create a compute environment for each queue in the AWS Batch console:
- Head queue with on-demand instances
- Compute queue with Spot instances
The head queue requires an on-demand compute environment. Do not select Use Spot instances during compute environment creation.
- In the Batch Console, select Create on the Compute environments page.
- Select Amazon EC2 as the compute environment configuration.
note
Seqera AWS Batch compute environments created with Batch Forge support using Fargate for the head job, but manual compute environments must use EC2.
- Enter a name of your choice, and apply the
seqera-servicerole
andseqera-instancerole
. - Enter vCPU limits and instance types, if needed.
note
To use the same queue for both head and compute tasks, you must assign sufficient resources to your compute environment.
- Expand Additional configuration and select the
seqera-launchtemplate
from the Launch template dropdown. - Configure VPCs, subnets, and security groups on the next page as needed.
- Review your configuration and select Create compute environment.
Create this compute environment to use Spot instances for your workflow compute tasks. This compute environment cannot be assigned to the Nextflow head job queue.
- In the Batch Console, select Create on the Compute environments page.
- Select Amazon EC2 as the compute environment configuration.
- Enter a name of your choice, and apply the
seqera-servicerole
andseqera-instancerole
. - Select Enable using Spot instances to use Spot instances and save computing costs.
- Select the
seqera-fleetrole
and enter vCPU limits and instance types, if needed. - Expand Additional configuration and select the
seqera-launchtemplate
from the Launch template dropdown. - Configure VPCs, subnets, and security groups on the next page as needed.
- Review your configuration and select Create compute environment.
Create the Batch queue
Create a Batch queue to be associated with each compute environment.
You only need to create one queue if you intend to use on-demand instances for your workflow compute tasks. Compute environments with Spot instances require separate queues for the head and compute tasks.
- Head queue
- Compute queue
- Go to the Batch Console.
- Create a new queue.
- Associate the queue with the head queue compute environment created in the previous section.
- Save it with a name of your choice.
- Go to the Batch Console.
- Create a new queue.
- Associate the queue with the compute queue environment created in the previous section.
- Save it with a name of your choice.
Use the AWS resources created on this page to create your manual AWS Batch compute environment.