NovaSeq Sequencer

This page describes how data from Illumina NovaSeq sequencers is laid out on disk and how it is ingested.

NovaSeq Run Structure

NovaSeq runs typically have the following directory structure:

220102_A00001_0001_AHGV7DRXX/
├── Data/
├── InterOp/
├── Logs/
├── Recipe/
├── RTAComplete.txt
├── RunInfo.xml
├── RunParameters.xml
└── SampleSheet.csv
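
The run folder name itself carries metadata. As a minimal sketch, assuming the usual Illumina convention of `YYMMDD_<instrument>_<run number>_<side><flowcell>` (the function name `parse_run_name` and the regular expression are illustrative, not part of the ingest code), it could be split like this:

```python
import re
from datetime import datetime

# Assumed naming convention: YYMMDD_<instrument>_<run number>_<side A|B><flowcell ID>
RUN_NAME_RE = re.compile(
    r"^(?P<date>\d{6})_(?P<instrument>[A-Za-z0-9]+)_(?P<run_number>\d+)"
    r"_(?P<side>[AB])(?P<flowcell>[A-Za-z0-9]+)$"
)


def parse_run_name(name: str) -> dict:
    """Split a run folder name into its components (hypothetical helper)."""
    match = RUN_NAME_RE.match(name.rstrip("/"))
    if match is None:
        raise ValueError(f"Unexpected run folder name: {name!r}")
    return {
        "date": datetime.strptime(match["date"], "%y%m%d").date(),
        "instrument": match["instrument"],
        "run_number": int(match["run_number"]),
        "side": match["side"],
        "flowcell": match["flowcell"],
    }


info = parse_run_name("220102_A00001_0001_AHGV7DRXX/")
print(info["instrument"], info["flowcell"])
```

Folder names that do not follow the assumed convention raise `ValueError` rather than being guessed at.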

Required Files

The following files are required for processing NovaSeq runs:

RunInfo.xml: Contains information about the run, including the run ID, instrument, flowcell, and read configuration.
RunParameters.xml: Contains parameters used for the run, including chemistry, application version, and experiment name.
SampleSheet.csv: Contains information about the samples in the run, including sample IDs, indices, and projects.
RTAComplete.txt: Marker file written when Real-Time Analysis (RTA) has finished, indicating that the run has completed.
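
A check for these files might look like the sketch below. The helper name `missing_required_files` is hypothetical, and the actual validation may check more than file presence.

```python
from pathlib import Path

# The four files listed above as required for processing.
REQUIRED_FILES = (
    "RunInfo.xml",
    "RunParameters.xml",
    "SampleSheet.csv",
    "RTAComplete.txt",
)


def missing_required_files(run_dir) -> list:
    """Return the names of required files that are absent from run_dir."""
    run_dir = Path(run_dir)
    return [name for name in REQUIRED_FILES if not (run_dir / name).is_file()]
```

An empty list means every required file is present. For example, a run that is still being sequenced would typically return `["RTAComplete.txt"]`.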

Metadata Extraction

The following metadata is extracted from NovaSeq runs:

Run ID
Instrument ID
Flowcell ID
Date
Chemistry
Application version
Sample count
Read configuration
Flow cell mode
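
As an illustration, the fields that come from RunInfo.xml could be read with the standard library as sketched below. The element and attribute names (`Run`, `Flowcell`, `Instrument`, `Date`, `Reads/Read`, `NumCycles`, `IsIndexedRead`) reflect the layout commonly seen in Illumina RunInfo.xml files and should be treated as assumptions. The sample XML is made up for the example, and `parse_run_info` is a hypothetical helper. Chemistry, application version, flow cell mode, and sample count come from other files and are not covered here.

```python
import xml.etree.ElementTree as ET

# Made-up sample following the commonly seen RunInfo.xml layout (assumption).
SAMPLE_RUN_INFO = """<RunInfo Version="5">
  <Run Id="220102_A00001_0001_AHGV7DRXX" Number="1">
    <Flowcell>HGV7DRXX</Flowcell>
    <Instrument>A00001</Instrument>
    <Date>1/2/2022 10:00:00 AM</Date>
    <Reads>
      <Read Number="1" NumCycles="151" IsIndexedRead="N"/>
      <Read Number="2" NumCycles="8" IsIndexedRead="Y"/>
      <Read Number="3" NumCycles="8" IsIndexedRead="Y"/>
      <Read Number="4" NumCycles="151" IsIndexedRead="N"/>
    </Reads>
  </Run>
</RunInfo>"""


def parse_run_info(xml_text: str) -> dict:
    """Extract run ID, instrument, flowcell, date, and read configuration."""
    run = ET.fromstring(xml_text).find("Run")
    if run is None:
        raise ValueError("No <Run> element found")
    reads = [
        {
            "number": int(read.get("Number")),
            "cycles": int(read.get("NumCycles")),
            "is_index": read.get("IsIndexedRead") == "Y",
        }
        for read in run.findall("Reads/Read")
    ]
    return {
        "run_id": run.get("Id"),
        "instrument_id": run.findtext("Instrument"),
        "flowcell_id": run.findtext("Flowcell"),
        "date": run.findtext("Date"),  # kept as the raw string from the file
        "reads": reads,
    }


metadata = parse_run_info(SAMPLE_RUN_INFO)
print(metadata["run_id"], [r["cycles"] for r in metadata["reads"]])
```

The date is deliberately left unparsed, since its format can differ between instrument software versions.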

Workflow

The NovaSeq ingest workflow performs the following steps:

1. Find completed NovaSeq runs
2. Validate the run structure
3. Extract metadata from the run files
4. Upload the run to iRODS
5. Add metadata to the iRODS collection
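
The order of these steps can be sketched as a small driver. This is an outline under assumptions, not the actual workflow code: `find_completed_runs` treats the presence of RTAComplete.txt as the completion signal, and `validate`, `extract`, `upload`, and `annotate` are hypothetical callables supplied by the caller, so no particular iRODS client API is implied.

```python
from pathlib import Path
from typing import Callable, Iterator


def find_completed_runs(root) -> Iterator[Path]:
    """Yield run directories under root that contain RTAComplete.txt."""
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir() and (entry / "RTAComplete.txt").is_file():
            yield entry


def ingest(
    root,
    validate: Callable,  # run_dir -> bool
    extract: Callable,   # run_dir -> metadata dict
    upload: Callable,    # run_dir -> identifier of the uploaded collection
    annotate: Callable,  # (collection, metadata) -> None
) -> list:
    """Run the five steps in order and return the ingested run directories."""
    ingested = []
    for run_dir in find_completed_runs(root):
        if not validate(run_dir):
            continue  # skip runs with an incomplete structure
        metadata = extract(run_dir)
        collection = upload(run_dir)
        annotate(collection, metadata)
        ingested.append(run_dir)
    return ingested
```

Passing the steps in as callables keeps the ordering logic separate from the storage backend, which also makes the sequence easy to exercise in tests with stand-in functions.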

For more information, see the Workflows section.