02 - running MitoPilot modules

Running the MitoPilot workflow

Now that our test project is set up, it’s time to run the analysis workflow. MitoPilot utilizes Nextflow to analyze all of the samples in parallel. This is far more efficient than running each sample one at a time in a serial workflow.

To generate the Nextflow command for the Assemble module, click the UPDATE button at the top of the GUI.

Fig. 13

There are two ways to run this Nextflow command on Hydra.

Method 1: Using the GUI

With the Update window open, click the Start Nextflow button to run the analysis pipeline within the MitoPilot GUI. This requires you to keep the GUI open while the pipeline is running.

Note

We recommend only using this method for small numbers of samples. If your internet connection is interrupted, the analysis pipeline will stop. You can resume an interrupted run, but it will waste time and can occasionally cause bugs.

Fig. 14

And we’re off! You should see Nextflow output being printed the to Progress window. This window will continually update as the pipeline progresses. As long as the gears in the top right are spinning, the pipeline is still running.

The Assemble pipeline module has 3 steps: preprocessing (filtering) the raw FASTQ files with fastp, assembling the mitogenomes using GetOrganelle (or MitoFinder), and calculating coverage maps for the assemblies using bowtie2.

Nextflow distributes the analyses across multiple batch jobs on the Hydra cluster. This can allow you to run hundreds of samples simultaneously. The Progress window will update as samples finish each step.

You can use Hydra’s job scheduler to check on the status of MitoPilot jobs. Some samples may finish faster than others.

Fig. 15

Some samples may fail at certain steps (though this is unlikely for the test dataset). Failures are often due to running out of memory, so MitoPilot will retry failed samples with more RAM. The Nextflow log tracks the number of failed samples and retries. We describe some common errors here.

Fig. 16

The test dataset should take a few minutes to finish. When it’s done, your Progress window will look something like this.

Fig. 17

Scroll all the way to the bottom of the Progress window to find some summary info about the run.

Fig. 18

Click Close to return to the samples table.

Method 2: using a batch job

If you have a large number of samples to process (more than a few dozen), we recommend running the Assemble and Annotate MitoPilot modules as batch jobs.

You will still need to generate the Nextflow command using the MitoPilot GUI. But instead of clicking the Start Nextflow button, copy the Nextflow command and create a batch submission script.

We have provided a template below. You may wish to modify the job name (-N) and the log file name (-o).

#!/bin/sh
#$ -N MitoPilot_assembly # MODIFY THIS IF DESIRED
#$ -o MitoPilot_assembly.log # MODIFY THIS IF DESIRED
#$ -cwd -j y
#$ -q lTWFM.sq
#$ -l wfmq
#$ -pe mthread 2
#$ -S /bin/sh

echo + `date` job $JOB_NAME started in $QUEUE with jobID=$JOB_ID on $HOSTNAME

source ~/.bashrc

module load tools/java/21.0.2 # required for Nextflow on Hydra

# EXAMPLE NEXTFLOW COMMAND
nextflow -log /pool/public/genomics/macguigand/MitoPilot/NMNH_workshop_2025/test_project_01/.logs/nextflow.log run /home/macguigand/R/x86_64-redhat-linux-gnu-library/4.4/MitoPilot/nextflow -c /pool/public/genomics/macguigand/MitoPilot/NMNH_workshop_2025/test_project_01/.config -entry WF1 

echo = `date` job $JOB_NAME done
Important

You must use the options -q lTWFM.sq and -l wfmq. This is a special Hydra queue for workflow managers like Nextflow. You must also include -pe mthread 2 in order to have enough available RAM for building the MitoPilot Singularity image.

Move the submission script into your MitoPilot project directory. In the above example, /pool/public/genomics/macguigand/MitoPilot/NMNH_workshop_2025/test_project_01. Then submit the job using qsub MY_SCRIPT_NAME.sh.

You can monitor the progress of this job using the qstat command and by checking on the log files. Once the job is done, you can relaunch the GUI to inspect the results. The same approach can be used for the Annotate module.