## Google Cloud Storage
With Fusion, you can run Nextflow pipelines using the local executor and Google Cloud Storage. This is useful for scaling your pipeline execution vertically with a large compute instance, without allocating a large storage volume for temporary pipeline data.
### Nextflow CLI
This configuration requires Docker or a similar container engine to run pipeline tasks.
- Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your service account JSON key file to grant Nextflow and Fusion access to your storage credentials. See Credentials for more information. An example export command is included in the combined example after these steps.
- Add the following to your `nextflow.config` file:

  ```groovy
  wave.enabled = true
  docker.enabled = true
  tower.accessToken = '<PLATFORM_ACCESS_TOKEN>'
  fusion.enabled = true
  fusion.exportStorageCredentials = true
  ```

  Replace `<PLATFORM_ACCESS_TOKEN>` with your Platform access token.
- Run the pipeline with the `nextflow run` command:

  ```bash
  nextflow run <PIPELINE_SCRIPT> -w gs://<GCS_BUCKET>/work
  ```

  Replace the following:
  - `<PIPELINE_SCRIPT>`: your pipeline Git repository URI.
  - `<GCS_BUCKET>`: your Google Cloud Storage bucket to which you have read-write access.
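Putting the steps together, a minimal end-to-end invocation might look like the following. The key path, pipeline repository, and bucket name are placeholders for illustration:

```bash
# Grant Nextflow and Fusion access to your storage credentials
# (hypothetical path to a service account JSON key file).
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json

# Run a pipeline with the work directory on Google Cloud Storage
# (hypothetical pipeline repository and bucket name).
nextflow run https://github.com/nextflow-io/hello -w gs://my-bucket/work
```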
To achieve optimal performance, set up an SSD volume as the temporary directory.
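For example, with Docker as the container engine, one way to do this is to mount the SSD volume as the `/tmp` directory of task containers using Nextflow's `docker.temp` option. This is a minimal sketch assuming the SSD is mounted on the host at the hypothetical path `/mnt/ssd`:

```groovy
// Hypothetical host path where the SSD volume is mounted; adjust to your setup.
// docker.temp mounts this path as the /tmp directory inside task containers.
docker.temp = '/mnt/ssd'
```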
The `fusion.exportStorageCredentials` option leaks credentials into the task launcher script created by Nextflow. Use it for testing and development purposes only.