Nextflow

Nextflow is a bioinformatics workflow management system that enables the development of scalable and reproducible scientific workflows. It supports deploying workflows on a variety of execution platforms and allows pipelines to be written in the most common scripting languages.

For more information, see the official Nextflow documentation.

Select Input Parameters

To run a pipeline, the user must set two parameters:

  • Input folder: mounts the folder containing the pipeline source code and input files.

  • Pipeline script: selects the Nextflow pipeline script, i.e. a .nf file containing the workflow instructions (a minimal example is sketched after this list).
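
As an illustration, a minimal pipeline script (the file name main.nf and its content are purely an example) could look as follows:

#!/usr/bin/env nextflow

// Illustrative example: a single process that echoes a greeting
params.greeting = 'Hello from UCloud'

process SAY_HELLO {
    input:
    val message

    output:
    stdout

    script:
    """
    echo '${message}'
    """
}

workflow {
    SAY_HELLO(Channel.of(params.greeting)).view()
}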

Initialization

For information on how to use the Initialization parameter, please refer to the Initialization: Bash Script, Initialization: Conda Packages, and Initialization: PyPI Packages sections of the documentation.

Create a Conda environment

The user can also install the required software dependencies via Conda by specifying the packages, or the path(s) to the environment YAML file(s), directly in the pipeline script. In this case, the pipeline must be launched with the option -with-conda.
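
For instance, Conda packages can be requested per process with the conda directive; the process name and the package names/versions below are purely illustrative:

process ALIGN_READS {
    // Illustrative Conda packages; replace with the pipeline's actual dependencies
    conda 'bioconda::samtools=1.17 bioconda::bwa=0.7.17'

    // Alternatively, point to an environment YAML file, e.g.:
    // conda '/work/my-project/environment.yml'

    script:
    """
    samtools --version
    """
}

The job is then launched with the -with-conda option on the nextflow run command line, as described above.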

Configure SSH Access

The app provides optional support for SSH access from an external client. An SSH public key must be uploaded using the corresponding panel in the Resources section of the UCloud side menu.

By checking Enable SSH server, a random port is opened for the connection. The connection command is shown in the job progress view page.
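
The command has the general form sketched below, where the placeholders must be replaced by the user name, server address, and port reported in the job progress view:

ssh <user>@<server-address> -p <port>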

Import a Configuration File

The parameter Configuration is used to upload a Nextflow configuration file. This is a plain-text file containing a set of properties defined using the syntax:

name = value

More information about configuration settings in Nextflow can be found in the official documentation.
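
As a sketch, a small configuration file could set pipeline parameters and default process resources; all names and values below are only examples:

params {
    outdir = 'results'
}

process {
    cpus   = 2
    memory = '4 GB'
}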

Using Slurm

In multi-node Nextflow jobs on UCloud, Slurm is used as the default executor. It is configured in a dedicated configuration profile, which is automatically updated to make full use of the resources allocated to the UCloud job.

A Slurm cluster is started by default in Nextflow jobs. If this cluster is not needed, the user can prevent it from starting by setting the optional parameter Start Slurm cluster to false.

If a Slurm cluster is started, but not used, it will remain idle throughout the job.

Note

The Slurm configuration profile is not used in single-node jobs. Instead, the local executor is used unless otherwise specified by the user. Therefore, users can safely set the optional parameter Start Slurm cluster to false in single-node jobs.

Slurm configuration profile

The Slurm configuration profile is specified with the following information:

  • cpus: Set to the total number of logical CPUs (per node) for the chosen machine type.

  • memory: Set to the total memory (per node) for the chosen machine type.

  • time: Set effectively to infinity (99999h).

  • queue: Set to CLOUD, which is the default Slurm partition.

Furthermore, the Slurm configuration profile contains the following two clusterOptions:

  • --nodes=1-<num-nodes> where <num-nodes> is the number of nodes allocated to the UCloud job.

  • --gres=gpu:<gpu-type>:<num-gpu> where <gpu-type> is the type of GPU (e.g., h100), and <num-gpu> is the number of GPUs (per node) allocated to the UCloud job. The option is only included if the UCloud job runs on a GPU machine type.
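
For reference, these settings correspond roughly to a profile of the following form, where the profile name and all concrete values are purely illustrative (the actual profile is generated automatically from the resources allocated to the UCloud job):

profiles {
    slurm {
        process {
            executor       = 'slurm'
            cpus           = 64
            memory         = 512.GB
            time           = 99999.h
            queue          = 'CLOUD'
            clusterOptions = '--nodes=1-4 --gres=gpu:h100:4'
        }
    }
}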

Adding cluster options

The user can customize the Slurm configuration profile with any native Slurm option. This can be done using the Cluster options parameter.

Options passed via this parameter are appended to the pre-defined clusterOptions in the Slurm configuration profile (see above).

The user must specify the additional cluster options as a single string, with the options separated by single spaces:

--option1=value1 --option2=value2 --option3=value3

In case of duplicate options, Slurm only uses the value from the last occurrence of the given option. This means that the user can override the default clusterOptions (i.e., --nodes and --gres) by adding these options with the desired values using the Cluster options optional parameter.
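
For example, on a GPU machine a user could restrict each task to a single node and a single GPU by passing the following string (the GPU type and counts are illustrative); since these options appear after the defaults, they take precedence:

--nodes=1-1 --gres=gpu:h100:1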

Warning

Users should not use the Cluster options parameter to override cpus, memory, time, or queue since doing so can lead to undefined behavior. Use the Configuration optional parameter for this instead.
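
For instance, to change the default time limit, a configuration file with the following content (the value is only an example) could be uploaded via the Configuration parameter:

process {
    time = '48h'
}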

Interactive Mode

The Interactive mode parameter is used to start an interactive job session where the user can open a terminal window from the job progress page and execute shell commands.