The story of NVIDIA Parabricks
Introduction
NVIDIA Parabricks is a cutting-edge genomic analysis platform that uses GPU acceleration to dramatically reduce the time required for computational tasks in next-generation sequencing (NGS) workflows. Tailored for applications in precision medicine, genomics research, and healthcare, Parabricks accelerates essential bioinformatics tools like the Genome Analysis Toolkit (GATK) for tasks such as variant calling, alignment, and haplotyping. By harnessing NVIDIA GPUs, Parabricks reduces genomic analysis time from days to hours, enabling efficient processing of large-scale genomic data.
History of Parabricks
Parabricks was founded in 2015 in Ann Arbor, Michigan, as a startup with a mission to tackle the computational challenges associated with analyzing massive genomic datasets. The company’s GPU-optimized pipelines addressed these bottlenecks, delivering unprecedented speed without compromising accuracy. Its early success in accelerating genomic workflows caught the attention of key players in technology and healthcare.
In December 2019, NVIDIA acquired Parabricks to bolster its healthcare and life sciences portfolio. This acquisition was part of NVIDIA’s broader strategy to leverage GPU technology for AI and accelerated computing in healthcare. Parabricks was integrated into NVIDIA’s Clara platform, a comprehensive healthcare computing framework supporting applications such as medical imaging, drug discovery, and genomics.
Enhanced Capabilities Post-Acquisition
Under NVIDIA’s leadership, Parabricks has seen significant advancements. The platform now supports a wide array of bioinformatics tools and workflows, including:
- Joint calling for population studies.
- Structural variant detection.
- Oncogenomics data analysis
- RNA sequencing pipelines.
These enhancements have enabled researchers to gain faster insights into clinical and research applications such as genetic mutation identification, rare disease diagnosis, and cancer research. By dramatically reducing costs and time, Parabricks has transformed genomic analysis, making it more accessible and impactful for healthcare providers and researchers.
Key Pipelines and Tools
Parabricks offers a suite of tools categorized into six main domains:
1. FASTQ and BAM File Processing
Includes tools like applybsqr, bam2fq, bqsr, and markdup for quality improvement and preprocessing.
2. Variant Calling
Supports germline and somatic variant detection using tools such as HaplotypeCaller and Mutect.
3. RNA Processing
Optimized for RNA sequencing with STAR aligner and HaplotypeCaller.
4. Results Quality Control
Provides tools like bammetrics for comprehensive data quality assessment.
5. Variant Processing
Enables dbSNP annotation and other downstream analyses.
6. gVCF File Processing
Handles genotype and index gVCF files for population-scale studies.
Specialized Pipelines
Germline Pipeline
Used for analyzing non-reproductive cell data, this pipeline includes options for sex inference and alignment using BWA-MEM, followed by variant calling with HaplotypeCaller.
Somatic Pipeline
Designed for tumor and non-tumor genome analysis, it employs BWA-MEM for alignment and GATK Mutect for mutation detection.
RNA Pipeline
Focused on RNA sequencing, it leverages the STAR aligner for mapping and HaplotypeCaller for variant discovery.
Hardware and Deployment Options
Parabricks pipelines can run on local servers or cloud platforms such as AWS, GCP, OCI, and Azure. The latest version (v4.3.1-1) supports the NVIDIA Grace Hopper superchip, offering enhanced performance for large-scale genomic analysis. It is even possible to run Parabricks on gaming hardware, as long as the minimum requirements are met.
New Applications in Research
Parabricks is widely used in genomics research, particularly in cancer studies. Notable examples of new and upcoming features include:
- Accelerating DeepVariant pipelines for long-read Hi-Fi whole-genome sequencing data.
- Reducing processing times in Nanopore WGS data using GPU-optimized tools.
Conclusion
NVIDIA Parabricks represents a paradigm shift in genomics, bridging advanced computing and healthcare. Its ability to deliver real-time, cost-effective insights is revolutionizing genomic research, enabling breakthroughs in precision medicine and beyond. With ongoing innovations, Parabricks continues to redefine the boundaries of what’s possible in genomic analysis.
BioCentric Services
- Home
- About BioCentric
- Consultancy and Research
- BioIT Infrastructure
- Bioinformatics Training
- Contact BioCentric
Posts and News
- March 2023
- January 2023
- November 2022
- March 2022
- February 2021
- July 2020
- March 2020
- February 2020
- November 2019
- October 2019
- September 2019
- July 2019
- June 2019
- March 2019
- February 2019
- January 2019
- December 2018
- November 2018
- September 2018
- August 2018
- July 2018
- June 2018
- May 2018
- April 2018
- March 2018
- February 2018
- December 2017
- November 2017
- September 2017