Home » BioCentric » The story of NVIDIA Parabricks

The story of NVIDIA Parabricks

Introduction

NVIDIA Parabricks is a cutting-edge genomic analysis platform that uses GPU acceleration to dramatically reduce the time required for computational tasks in next-generation sequencing (NGS) workflows. Tailored for applications in precision medicine, genomics research, and healthcare, Parabricks accelerates essential bioinformatics tools like the Genome Analysis Toolkit (GATK) for tasks such as variant calling, alignment, and haplotyping. By harnessing NVIDIA GPUs, Parabricks reduces genomic analysis time from days to hours, enabling efficient processing of large-scale genomic data.

History of Parabricks

Parabricks was founded in 2015 in Ann Arbor, Michigan, as a startup with a mission to tackle the computational challenges associated with analyzing massive genomic datasets. The company’s GPU-optimized pipelines addressed these bottlenecks, delivering unprecedented speed without compromising accuracy. Its early success in accelerating genomic workflows caught the attention of key players in technology and healthcare.

In December 2019, NVIDIA acquired Parabricks to bolster its healthcare and life sciences portfolio. This acquisition was part of NVIDIA’s broader strategy to leverage GPU technology for AI and accelerated computing in healthcare. Parabricks was integrated into NVIDIA’s Clara platform, a comprehensive healthcare computing framework supporting applications such as medical imaging, drug discovery, and genomics.

nvidia parabricks

Enhanced Capabilities Post-Acquisition

Under NVIDIA’s leadership, Parabricks has seen significant advancements. The platform now supports a wide array of bioinformatics tools and workflows, including:

  • Joint calling for population studies.
  • Structural variant detection.
  • Oncogenomics data analysis
  • RNA sequencing pipelines.

These enhancements have enabled researchers to gain faster insights into clinical and research applications such as genetic mutation identification, rare disease diagnosis, and cancer research. By dramatically reducing costs and time, Parabricks has transformed genomic analysis, making it more accessible and impactful for healthcare providers and researchers.

Key Pipelines and Tools

Parabricks offers a suite of tools categorized into six main domains:

1. FASTQ and BAM File Processing

Includes tools like applybsqr, bam2fq, bqsr, and markdup for quality improvement and preprocessing.

2. Variant Calling

Supports germline and somatic variant detection using tools such as HaplotypeCaller and Mutect.

3. RNA Processing

Optimized for RNA sequencing with STAR aligner and HaplotypeCaller.

4. Results Quality Control

Provides tools like bammetrics for comprehensive data quality assessment.

5. Variant Processing

Enables dbSNP annotation and other downstream analyses.

6. gVCF File Processing

Handles genotype and index gVCF files for population-scale studies.

Specialized Pipelines

Germline Pipeline

Used for analyzing non-reproductive cell data, this pipeline includes options for sex inference and alignment using BWA-MEM, followed by variant calling with HaplotypeCaller.

Somatic Pipeline

Designed for tumor and non-tumor genome analysis, it employs BWA-MEM for alignment and GATK Mutect for mutation detection.

RNA Pipeline

Focused on RNA sequencing, it leverages the STAR aligner for mapping and HaplotypeCaller for variant discovery.

Hardware and Deployment Options

Parabricks pipelines can run on local servers or cloud platforms such as AWS, GCP, OCI, and Azure. The latest version (v4.3.1-1) supports the NVIDIA Grace Hopper superchip, offering enhanced performance for large-scale genomic analysis. It is even possible to run Parabricks on gaming hardware, as long as the minimum requirements are met.

New Applications in Research

Parabricks is widely used in genomics research, particularly in cancer studies. Notable examples of new and upcoming features include:

  • Accelerating DeepVariant pipelines for long-read Hi-Fi whole-genome sequencing data.
  • Reducing processing times in Nanopore WGS data using GPU-optimized tools.

Conclusion

NVIDIA Parabricks represents a paradigm shift in genomics, bridging advanced computing and healthcare. Its ability to deliver real-time, cost-effective insights is revolutionizing genomic research, enabling breakthroughs in precision medicine and beyond. With ongoing innovations, Parabricks continues to redefine the boundaries of what’s possible in genomic analysis.