Realtime Variant Warehouse
The engine behind Mosaic: a population-scale variant warehouse that supports complex annotation, phenotype, and genotype queries in realtime.
From the creators of iobio.io
Population Scale
Stores tens of thousands of whole exomes, enabling powerful queries that cut across all of an organization's data.
Realtime Querying
Designed to support realtime visualization and analytics, our variant warehouse can execute most variant-focused queries in under 1 second. Unlike most solutions, which rely on fixed sample sets, ours is truly realtime: the sample set can be defined on the fly or derived from other criteria, including phenotypes.
Genotype & Phenotype Querying
Create complex queries that combine genotype, phenotype, annotation, and metadata criteria. For example: give me all the variants in ((gene A) or (gene B) or (random region C)) that have a gnomAD allele frequency less than 0.01, are labeled as 'Pathogenic' in ClinVar, and are SNPs. Now let me sort by AlternateAlleleFrequency. Now let me select the samples that are homozygous for these variants from individuals who are smokers and older than 60.
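To make the example query concrete, here is a minimal sketch of how it might be written in SQL. Everything in it is an assumption made for illustration: the table and column names (`variants`, `samples`, `genotypes`, `alt_allele_count`, and so on) are hypothetical and are not the warehouse's actual schema, the rows are made-up toy data, and an in-memory SQLite database stands in for Postgres so the snippet runs on its own. SNPs are approximated as variants whose reference and alternate alleles are both one base long, and "homozygous" is taken to mean two copies of the alternate allele.

```python
import sqlite3

# SQLite stands in for Postgres here; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variants (
    id INTEGER PRIMARY KEY, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT,
    gene_symbol TEXT, gnomad_af REAL, clinvar_significance TEXT,
    alternate_allele_frequency REAL
);
CREATE TABLE samples (id INTEGER PRIMARY KEY, smoker INTEGER, age INTEGER);
CREATE TABLE genotypes (variant_id INTEGER, sample_id INTEGER,
                        alt_allele_count INTEGER);
""")

# Made-up toy rows, chosen so that each filter excludes at least one variant.
conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", [
    (1, "chr1", 100, "A", "G", "GENE_A", 0.001, "Pathogenic", 0.20),
    (2, "chr1", 200, "C", "T", "GENE_B", 0.005, "Pathogenic", 0.05),
    (3, "chr2", 1500, "G", "A", None, 0.002, "Pathogenic", 0.10),    # region C only
    (4, "chr1", 300, "AT", "A", "GENE_A", 0.001, "Pathogenic", 0.30),  # not a SNP
    (5, "chr1", 400, "T", "C", "GENE_A", 0.20, "Pathogenic", 0.40),  # too common
    (6, "chr3", 500, "G", "C", "GENE_D", 0.001, "Pathogenic", 0.01),  # wrong gene
    (7, "chr1", 600, "G", "C", "GENE_B", 0.001, "Benign", 0.02),     # not pathogenic
])
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", [
    (1, 1, 65), (2, 1, 45), (3, 0, 70), (4, 1, 72),
])
conn.executemany("INSERT INTO genotypes VALUES (?, ?, ?)", [
    (1, 1, 2),  # sample 1: homozygous, smoker, 65
    (2, 2, 2),  # sample 2: homozygous, smoker, but only 45
    (3, 3, 2),  # sample 3: homozygous, 70, but non-smoker
    (2, 4, 1),  # sample 4: heterozygous only
    (5, 4, 2),  # sample 4: homozygous, but variant 5 fails the filter
])

# (gene A OR gene B OR region C) AND rare in gnomAD AND Pathogenic AND SNP.
VARIANT_FILTER = """
    SELECT id FROM variants
    WHERE (gene_symbol IN ('GENE_A', 'GENE_B')
           OR (chrom = 'chr2' AND pos BETWEEN 1000 AND 2000))
      AND gnomad_af < 0.01
      AND clinvar_significance = 'Pathogenic'
      AND length(ref) = 1 AND length(alt) = 1
"""

# Step 1: the matching variants, sorted by alternate allele frequency.
variant_ids = [row[0] for row in conn.execute(f"""
    SELECT id FROM variants
    WHERE id IN ({VARIANT_FILTER})
    ORDER BY alternate_allele_frequency
""")]

# Step 2: samples homozygous for any of those variants,
# restricted to smokers older than 60.
sample_ids = [row[0] for row in conn.execute(f"""
    SELECT DISTINCT s.id
    FROM samples s
    JOIN genotypes g ON g.sample_id = s.id
    WHERE g.variant_id IN ({VARIANT_FILTER})
      AND g.alt_allele_count = 2
      AND s.smoker = 1 AND s.age > 60
    ORDER BY s.id
""")]

print(variant_ids, sample_ids)
```

Note that the sample set in the second step is not fixed ahead of time: it is derived at query time from the phenotype columns, which is the on-the-fly behavior described under Realtime Querying.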
Custom Annotations
Add your own custom annotations alongside a host of industry-standard annotations without sacrificing scale or speed.
- ClinVar Significance
- Gene Consequence
- Gene Impact
- Gene Symbol
- gnomAD AF Popmax
- gnomAD Allele Count
- gnomAD Allele Frequency
- gnomAD Allele Number
- gnomAD Homozygous Count
- gnomAD Popmax
Solid, Secure Foundation
The variant warehouse is built on top of Postgres, which means it's rock-solid, secure, and has millions of developer-hours behind it. The ecosystem of plugins and documentation available for Postgres makes the variant warehouse highly extensible, so it can be customized to fit your organization's requirements now and in the future. Additionally, the variant warehouse (1) is transactional, which helps ensure that complicated data operations, or operations over a flaky or distributed network, don't leave the system in a bad state; (2) has a robust multi-tenant model built in, so there won't be concurrency issues; and (3) uses SQL, so it should be easy to integrate with existing pipelines, workflows, and applications.
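As a rough illustration of what the transactional guarantee means in practice, the sketch below loads a batch of rows so that either every row is stored or none is. It is not the warehouse's loader: the `variants` table and the `load_batch` helper are hypothetical, and an in-memory SQLite database stands in for Postgres so the snippet runs on its own. The same all-or-nothing pattern applies to a Postgres transaction.

```python
import sqlite3

# SQLite stands in for Postgres here; table and helper names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variants (id INTEGER PRIMARY KEY, "
    "chrom TEXT NOT NULL, pos INTEGER NOT NULL)"
)
conn.commit()


def load_batch(conn, rows):
    """Insert every row in the batch, or none of them if any row fails."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if the block raises.
        with conn:
            conn.executemany(
                "INSERT INTO variants (id, chrom, pos) VALUES (?, ?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


# First batch is valid and is committed.
ok = load_batch(conn, [(1, "chr1", 100), (2, "chr1", 200)])

# Second batch fails on its last row (duplicate id 1), so the row with
# id 3 that was inserted just before it is rolled back as well.
failed = load_batch(conn, [(3, "chr2", 300), (1, "chr2", 400)])

row_count = conn.execute("SELECT COUNT(*) FROM variants").fetchone()[0]
print(ok, failed, row_count)
```

After the failed batch the table still holds only the two rows from the first batch, so a load that is interrupted partway does not leave partial data behind.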
Running the Variant Warehouse
Having Postgres as a foundation means it is easy to run locally, in the cloud, or even as a managed cloud service (e.g. AWS RDS). The variant warehouse is relatively fast to load, taking on the order of the time needed to read the VCF file in full. It requires extra disk space because it is optimized for speed, whereas VCF files are optimized for disk usage. Although the variant warehouse will require much more space than the corresponding VCF files, the increase in disk usage for your entire project (including alignment data) will be relatively small (< 5%).