Quickstart¶
This guide will help you get started with the Rodrunner project quickly. We'll cover the basic steps to ingest sequencer data into iRODS and query metadata.
Prerequisites¶
Before starting, make sure you have:
- Installed the project following the Installation Guide
- Configured the project following the Configuration Guide
- Access to an iRODS server (or using the Docker setup)
- Some sequencer data to ingest (or using the sample data)
Using the Docker Setup¶
If you're using the Docker setup, you can use the sample data included in the project.
Step 1: Start the Docker containers¶
docker-compose up -d
Step 2: Access the API¶
The API will be available at http://localhost:8000. You can check the API documentation at http://localhost:8000/docs.
Ingesting Sequencer Data¶
There are two ways to ingest sequencer data:
- Using the API
- Running the workflows directly
Using the API¶
Step 1: Check available workflows¶
curl http://localhost:8000/workflows/list
This will return a list of available workflows, including the ingest workflows.
Step 2: Run the ingest workflow¶
curl -X POST http://localhost:8000/workflows/run \
-H "Content-Type: application/json" \
-d '{
"workflow_name": "Ingest Sequencer Runs",
"parameters": {
"sequencer_type": "miseq",
"root_dir": "/data/sequencer/miseq"
}
}'
This will start the ingest workflow for MiSeq data and return a flow run ID.
Step 3: Check the workflow status¶
curl http://localhost:8000/workflows/status/{flow_run_id}
Replace {flow_run_id} with the ID returned from the previous step.
Running Workflows Directly¶
You can also run the workflows directly using Python:
from rodrunner.config import get_config
from rodrunner.workflows.ingest import ingest_sequencer_runs
# Load the configuration
config = get_config()
# Run the ingest workflow
results = ingest_sequencer_runs(
config=config,
sequencer_type="miseq",
root_dir="/data/sequencer/miseq"
)
# Print the results
for result in results:
print(f"Run: {result['run_dir']}, Success: {result['success']}")
Querying Metadata¶
Once you've ingested data, you can query the metadata using the API.
Search for Metadata by Key and Value¶
curl "http://localhost:8000/metadata/search?key=run_id&value=220101_M00001_0001_000000000-A1B2C"
This will return all objects with the specified metadata key and value.
Get Metadata for a Specific Object¶
curl "http://localhost:8000/metadata/object?path=/tempZone/home/rods/sequencer/miseq/220101_M00001_0001_000000000-A1B2C"
This will return the metadata for the specified object.
Using the Python Client¶
You can also use the Python client to interact with iRODS directly:
from rodrunner.config import get_config
from rodrunner.irods.client import iRODSClient
# Load the configuration
config = get_config()
# Create an iRODS client
irods_client = iRODSClient(config.irods)
# Check if a collection exists
if irods_client.collection_exists("/tempZone/home/rods/sequencer/miseq"):
print("Collection exists")
# Get a collection
collection = irods_client.get_collection("/tempZone/home/rods/sequencer/miseq")
# Print metadata
for meta in collection.metadata.items():
print(f"{meta.name}: {meta.value}")
Next Steps¶
Now that you've completed the quickstart, you can:
- Explore the User Guide for more detailed information
- Learn about the Filesystem Operations
- Learn about the iRODS Client
- Learn about the Parsers
- Learn about the Workflows
- Learn about the API