OLD Blog – OpenCRAVAT

Calibrated Classification Package: Applying ACMG/AMP Guidelines to Computational Predictors in OpenCRAVAT

Sep 19, 2025

Variant interpretation is central to genomic medicine, but one of its most debated areas is how to use computational evidence. The ACMG/AMP guidelines specify criteria such as PP3 (supporting pathogenicity) and BP4 (supporting benignity), but until recently, applying these criteria consistently across predictive tools has been a challenge. The Calibrated Classification Package in OpenCRAVAT was built to address precisely this need.

Computational tools like REVEL, BayesDel, and CADD have long been used to predict the functional impact of genetic variants. However, each tool outputs scores on its own scale, and there has been no universally accepted way to map these scores to ACMG/AMP categories. This lack of calibration has led to inconsistent application of PP3 and BP4 across labs and pipelines.

In 2022, the ClinGen Sequence Variant Interpretation (SVI) Working Group published a standardized procedure for calibrating computational predictors to ACMG/AMP evidence strengths (Pejaver et al., AJHG 2022). The Calibrated Classification Package implements this procedure within OpenCRAVAT, providing an open-source, ready-to-use solution for clinical curators and researchers.
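To make the idea of calibration concrete, here is a minimal sketch of how a predictor score could be mapped to a PP3 or BP4 evidence strength once thresholds are known. The thresholds, the label strings, and the function name `classify_score` are all hypothetical placeholders for an imaginary predictor scored between 0 and 1. They are not the published values for REVEL, BayesDel, CADD, or any other tool, and this is not the package's actual code.

```python
from typing import List, Tuple

# Hypothetical cutoffs for an imaginary predictor scored in [0, 1].
# Placeholders for illustration only; NOT published calibration values.
# Pathogenic cutoffs are checked from strongest to weakest.
PATHOGENIC_THRESHOLDS: List[Tuple[float, str]] = [
    (0.90, "PP3_Strong"),
    (0.75, "PP3_Moderate"),
    (0.60, "PP3_Supporting"),
]
# Benign cutoffs are also checked from strongest to weakest.
BENIGN_THRESHOLDS: List[Tuple[float, str]] = [
    (0.05, "BP4_Strong"),
    (0.15, "BP4_Moderate"),
    (0.30, "BP4_Supporting"),
]


def classify_score(score: float) -> str:
    """Return the strongest evidence label whose cutoff the score meets."""
    for cutoff, label in PATHOGENIC_THRESHOLDS:
        if score >= cutoff:
            return label
    for cutoff, label in BENIGN_THRESHOLDS:
        if score <= cutoff:
            return label
    # Scores between the benign and pathogenic ranges carry no evidence.
    return "Indeterminate"


if __name__ == "__main__":
    for s in (0.95, 0.65, 0.50, 0.10):
        print(s, classify_score(s))
```

Because each predictor has its own scale, a real implementation would need one threshold table per tool, taken from the published calibration rather than from the placeholders above.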

What the Package Provides

The package delivers pathogenicity and benignity strength-of-evidence classifications for seven widely used variant effect predictors. These predictors were rigorously evaluated using ClinVar variants, excluding variants used for training, to ensure an unbiased performance assessment.

By embedding calibrated predictors into the OpenCRAVAT ecosystem, this package makes it easier to produce reproducible, evidence-based variant interpretations that are directly linked to community standards. The Calibrated Classification Package moves us closer to harmonizing computational evidence in variant interpretation, helping curators, researchers, and clinicians apply ACMG/AMP criteria with greater precision and reproducibility.

Bringing Genetic Diversity to the Forefront: Moez Dawood on the AllofUs 250k Annotator for OpenCRAVAT

Aug 15, 2025

Moez Dawood, an MD/PhD student at Baylor College of Medicine, has recently developed the AllofUs 250k annotator for OpenCRAVAT, a genomic annotation platform designed to handle both single variant lookups and large-scale data processing. His work brings one of the most diverse genomic resources in the world directly into variant annotation pipelines, making it easier for researchers and clinical laboratories to incorporate ancestry-specific allele frequency data into their analyses.

Moez just completed his PhD in the Genetics and Genomics program, working with Dr. Richard Gibbs and Dr. James Lupski. His research spans the full spectrum of rare disease genomics, from patient recruitment and consent to sequencing, functional genomics, and variant interpretation. By working across the entire pipeline, he has developed a strong sense of how genomic tools need to be built to work at scale and to work equitably. His motivation for this annotator was straightforward: the All of Us Research Program has intentionally recruited participants from diverse genetic ancestries, and the data produced is uniquely positioned to improve rare variant interpretation.

The All of Us dataset currently includes genomic data from 250,000 participants, with allele counts, allele numbers, and allele frequencies broken down by calculated genetic ancestry groups such as African, East Asian, South Asian, Admixed American, Middle Eastern, and European. While other databases remain invaluable, they have historically underrepresented certain populations. Moez saw an opportunity to fill that gap by making the All of Us Variant Annotation Table directly available through OpenCRAVAT. Instead of being limited to searching a website or downloading unwieldy data files, researchers can now integrate this ancestry-aware frequency information directly into their own annotation workflows.
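As a small illustration of how these per-ancestry fields relate, an allele frequency is the allele count divided by the allele number (the total alleles called at that site in that group). The sketch below computes it per group and reports the group with the highest frequency. The group keys (`afr`, `eas`, `eur`, `mid`), the function names, and the example numbers are hypothetical and do not reflect the annotator's actual column names or data.

```python
from typing import Dict, Optional, Tuple


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """Allele frequency = allele count / allele number; None if no calls."""
    if an <= 0:
        return None
    return ac / an


def max_ancestry_frequency(
    counts: Dict[str, Tuple[int, int]]
) -> Tuple[Optional[str], Optional[float]]:
    """Return (group, frequency) for the ancestry group with the highest AF.

    `counts` maps a group label to an (allele_count, allele_number) pair.
    Groups with no called alleles are skipped.
    """
    best_group: Optional[str] = None
    best_af: Optional[float] = None
    for group, (ac, an) in counts.items():
        af = allele_frequency(ac, an)
        if af is not None and (best_af is None or af > best_af):
            best_group, best_af = group, af
    return best_group, best_af


if __name__ == "__main__":
    # Made-up counts for a single variant, for illustration only.
    example = {"afr": (12, 4000), "eas": (0, 1000), "eur": (3, 6000), "mid": (0, 0)}
    print(max_ancestry_frequency(example))
```

Looking at the maximum across groups, rather than a single pooled frequency, is one way a variant that is common in one ancestry but rare overall can still be recognized as common.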

Developing the annotator required both persistence and technical problem-solving. The sequencing for All of Us is performed by three genome centers, including Baylor, and the raw data is harmonized by the All of Us Data Coordinating Center. After obtaining approval for the project, Moez downloaded the variant annotation tables, reformatted them, converted them into a SQL database, and performed quality control before building the OpenCRAVAT module.
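The conversion step can be pictured roughly as follows: load reformatted rows into a SQL table, then index the columns used for lookup so that per-variant queries stay fast. This sketch assumes SQLite via Python's standard `sqlite3` module and a hypothetical five-column schema (`chrom`, `pos`, `ref`, `alt`, `af`); the post does not describe the schema or tooling Moez actually used.

```python
import sqlite3
from typing import Iterable, Optional, Tuple

# Hypothetical row layout: chromosome, position, ref allele, alt allele, AF.
Row = Tuple[str, int, str, str, float]


def build_db(rows: Iterable[Row], path: str = ":memory:") -> sqlite3.Connection:
    """Create a variants table, bulk-insert rows, and index the lookup key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE variants "
        "(chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, af REAL)"
    )
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?)", rows)
    # Index created after loading, so inserts are not slowed by index updates.
    conn.execute("CREATE INDEX idx_variant ON variants (chrom, pos, ref, alt)")
    conn.commit()
    return conn


def lookup_af(
    conn: sqlite3.Connection, chrom: str, pos: int, ref: str, alt: str
) -> Optional[float]:
    """Return the stored allele frequency for a variant, or None if absent."""
    row = conn.execute(
        "SELECT af FROM variants WHERE chrom = ? AND pos = ? AND ref = ? AND alt = ?",
        (chrom, pos, ref, alt),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Made-up rows for illustration only.
    conn = build_db([("chr1", 100, "A", "G", 0.01), ("chr2", 200, "C", "T", 0.25)])
    print(lookup_af(conn, "chr1", 100, "A", "G"))
    print(lookup_af(conn, "chr1", 101, "A", "G"))
```

A basic quality-control pass on such a table might compare row counts against the source files and spot-check a handful of known variants with `lookup_af`.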

Once the data was prepared, integrating it into OpenCRAVAT was straightforward thanks to the platform’s modular architecture and Python backend. The annotator installs like any other module and requires no external dependencies. Moez has since deployed it both locally and in cloud environments, enabling large-scale use cases. The data footprint is significant, around 140 GB, but the indexing and structure allow it to perform efficiently even on complex datasets.

The annotator is already in active use for rare disease research. In Moez’s work, it adds confidence during variant filtering by providing more accurate frequency information for underrepresented populations, helping to rule out common variants that may not be flagged in other databases. The benefit extends beyond rare disease, as the dataset can be applied to population genetics studies, adult-onset disease research, and the discovery of novel gene–disease associations. Moez points out that adult genetics is still relatively underexplored compared to pediatric genetics, and biobanks like All of Us and the UK Biobank are now providing unprecedented opportunities to study genetic architecture in large adult cohorts.

Early validation has shown the utility of the resource. Comparing allele frequencies in ClinVar variants between All of Us and gnomAD revealed expected agreement for common variants, while highlighting key differences in rare and ultra-rare sites. In practical terms, this means there are thousands of variants per genome that lack frequency annotations in gnomAD but are annotated in All of Us, giving researchers actionable information they did not previously have. Moez and his colleagues have presented these findings at the 2025 American College of Medical Genetics and Genomics annual meeting and are preparing a manuscript to describe the implementation and use cases in more detail.

The AllofUs 250k annotator is part of a broader effort by Moez and his collaborators to build a flexible, scalable annotation platform using OpenCRAVAT. Alongside this project, they have released a companion annotator for the Regeneron Million Exome Project and have integrated large language model datasets for additional analytical depth. While Moez will soon return to medical school to complete his final year, he plans to remain active in developing and maintaining genomic resources for the research community.

Researchers and clinical labs can download the AllofUs 250k annotator from the OpenCRAVAT module store and begin incorporating ancestry-specific allele frequency data into their own workflows. Those who use the tool are encouraged to share their findings with the All of Us community and the OpenCRAVAT team. Moez is also happy to connect directly and can be reached at mdawood@bcm.edu.

The All of Us Research Program and its expanding genomic dataset represent a significant step toward inclusive precision medicine. By bringing this data into widely used annotation platforms, tools like the AllofUs 250k annotator make it possible for the benefits of diversity in genomic research to translate directly into better science, more accurate variant interpretation, and ultimately improved patient care.
