The ATAV data browser is a web user interface that gives everyone within the network direct access to variant-level data from the full data set in the ATAV database. It supports searching for variants by gene, region, or variant ID. The gene or region view lists variants with their allele count, allele frequency, number of samples, effect, gene, and related fields. The variant view displays a set of annotations (effect, gene, transcript, PolyPhen) and details about variant carriers (gender, phenotype, and quality metrics). It links to public data resources such as Ensembl, gnomAD, and ClinVar, and directly integrates additional annotations via APIs (e.g. the Genoox Franklin API for clinical variant interpretation). Advanced filters include a maximum allele frequency threshold to restrict a search to rare or ultra-rare variants, restriction to high-quality variants, and restriction to a certain phenotype. In contrast to many other platforms, the data browser shows newly added sample data in real time and therefore evolves rapidly as more samples are sequenced.
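
The advanced filters described above can be illustrated with a minimal Python sketch. It is not the browser's actual implementation: the Variant record, its field names, the filter_variants function, and the sample values are all hypothetical, and allele frequency is assumed to be the allele count divided by twice the number of samples (diploid, autosomal sites).

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Variant:
    # Hypothetical record; field names are illustrative, not the ATAV schema.
    variant_id: str
    gene: str
    allele_count: int
    sample_count: int
    high_quality: bool
    phenotypes: FrozenSet[str]

    @property
    def allele_frequency(self) -> float:
        # Assumption: diploid autosomal site, so two alleles per sample.
        if self.sample_count == 0:
            return 0.0
        return self.allele_count / (2 * self.sample_count)


def filter_variants(
    variants: Iterable[Variant],
    max_af: Optional[float] = None,
    high_quality_only: bool = False,
    phenotype: Optional[str] = None,
) -> List[Variant]:
    """Keep variants that pass every filter that is switched on."""
    kept = []
    for v in variants:
        if max_af is not None and v.allele_frequency > max_af:
            continue  # too common for a rare-variant search
        if high_quality_only and not v.high_quality:
            continue  # fails the quality restriction
        if phenotype is not None and phenotype not in v.phenotypes:
            continue  # no carrier with the requested phenotype
        kept.append(v)
    return kept


# Made-up example data, for illustration only.
EXAMPLE_VARIANTS = [
    Variant("1-1000-A-G", "GENE1", 2, 1000, True, frozenset({"epilepsy"})),
    Variant("1-2000-C-T", "GENE1", 300, 1000, True, frozenset({"control"})),
    Variant("1-3000-G-A", "GENE1", 1, 1000, False, frozenset({"epilepsy"})),
]

rare_hq = filter_variants(EXAMPLE_VARIANTS, max_af=0.01, high_quality_only=True)
```

Each filter is optional and the filters combine with a logical AND, which mirrors how a rare-variant search might be narrowed further by quality or phenotype.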

Data Generation

Sequencing of DNA was performed by the Institute for Genomic Medicine. Samples were either exome sequenced or whole-genome sequenced on Illumina HiSeq 2000, HiSeq 2500, or NovaSeq 6000 sequencers according to standard protocols.

The Illumina lane-level FASTQ files were aligned to the Human Reference Genome (NCBI Build 37) using the Illumina DRAGEN alignment tool. Picard was used to remove duplicate reads and process the resulting lane-level BAM files, merging them into a single sample-level BAM file that was used for variant calling and coverage binning. GATK was used to recalibrate base quality scores, realign around indels, and call variants. All variants were annotated against Ensembl 87 using ClinEff.