- The Illumina sequencing pipeline will be used to convert the images to compressed fastq sequencing data, which will be made available from the Experiments record.
- Alignments using Novoalign will be performed upon request. These are needed prior to any further downstream analysis.
- The USeq project, developed by the core, is a good starting point for additional analysis. The Usage guide contains, among other things, detailed tutorials for ChIP-seq and RNA-seq analysis.
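As a quick sanity check on delivered data, the compressed fastq files can be read with a few lines of Python. This is a minimal sketch that assumes gzip compression and the common four-line record layout (no line-wrapped sequences). The function name `read_fastq` is illustrative and is not part of the Illumina pipeline or USeq.

```python
import gzip
from typing import Iterator, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) for each record in a gzipped fastq file.

    Assumes four lines per record: '@name', sequence, '+', quality.
    """
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:  # empty string means end of file
                break
            header = header.rstrip()
            seq = handle.readline().rstrip()
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().rstrip()
            # Basic integrity checks on each record.
            if not header.startswith("@") or len(seq) != len(qual):
                raise ValueError(f"malformed record near {header!r}")
            yield header[1:], seq, qual
```

Counting reads is then `sum(1 for _ in read_fastq(path))`, where `path` points at a downloaded file.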
Exome/Whole Genome Analysis Services
- The Utah Bioinformatics Shared Resource (UBSR) provides several analysis services for whole genome and exome resequencing projects. The following automated low-level pipeline is run using the Tomato Framework on a 156-core cluster managed by CHPC for the UBSR. Run times for a low-level analysis are typically 1-2 days, although the queue of jobs typically pushes delivery to ~1 week.
A low-level, first-pass analysis entails:
- Alignment to a reference genome using Novoalign, one of the better gapped aligners available.
- Generation of BAM alignment files from filtered, sorted SAM files using the Picard package, for visualization in genome browsers and downstream analysis.
- Estimation of sequencing read depth over repeat-masked CCDS exons. Generation of read coverage tracks for visualization in genome browsers using the USeq package.
- SNP and INDEL variant detection via GATK.
- Variant classification is performed using VAAST for human samples and ANNOVAR for other species.
- HCI's GNomEx LIMS and Analysis Project Center is used extensively by the UBSR to organize, annotate, and distribute the raw sequencing data (fastq), alignment (sam/bam), and higher level analysis datasets to the investigator. Analysis is made available for visualization in the Integrated Genome Browser or the UCSC Genome Browser using the UBSR's GenoPub server.
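To make the read-depth step above concrete, here is a minimal sketch of how per-exon depth could be summarized from aligned read intervals. It is not the USeq implementation. The function names, the half-open coordinate convention, and the `min_depth` threshold of 10 are assumptions made for illustration.

```python
from typing import List, Tuple

# Half-open [start, end) coordinates on a single chromosome (assumed convention).
Interval = Tuple[int, int]


def exon_depths(reads: List[Interval], exon: Interval) -> List[int]:
    """Return the per-base read depth across one exon."""
    start, end = exon
    depth = [0] * (end - start)
    for r_start, r_end in reads:
        # Clip each read to the exon; non-overlapping reads give an empty range.
        lo, hi = max(r_start, start), min(r_end, end)
        for pos in range(lo, hi):
            depth[pos - start] += 1
    return depth


def summarize(reads: List[Interval], exon: Interval,
              min_depth: int = 10) -> Tuple[float, float]:
    """Return (mean depth, fraction of exon bases with depth >= min_depth)."""
    depth = exon_depths(reads, exon)
    mean = sum(depth) / len(depth)
    frac = sum(d >= min_depth for d in depth) / len(depth)
    return mean, frac
```

The per-base loop keeps the logic easy to follow. For whole exomes, a production tool would work from indexed BAM files rather than in-memory interval lists.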
See the Exome Tutorial to learn how you can use core resources and TomatoFarmer to process your own exomes.
The UBSR has some, though limited, bandwidth to support higher-level analysis services for creating ranked lists of genes likely to be altered in affected samples. This entails working closely with a bioinformatician over several weeks to develop appropriate variant filters that incorporate additional information about the disease into the analysis (e.g. recessive/dominant/autosomal/sex-linked, penetrance, pedigrees, known variants, linkage analysis, gene expression, etc.).
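As an illustration of the kind of variant filter described above, the following sketch keeps variants whose genotypes fit a simple recessive or dominant model and ranks genes by the number of surviving variants. It is a toy example under stated assumptions, not the UBSR's method: the `Variant` type, the function names, full penetrance, and treating a missing genotype as zero alternate alleles are all hypothetical simplifications.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Tuple


class Variant(NamedTuple):
    gene: str
    genotypes: Dict[str, int]  # sample name -> alt-allele count (0, 1, or 2)


def fits_model(v: Variant, affected: List[str], unaffected: List[str],
               model: str) -> bool:
    """Check whether a variant's genotypes fit a fully penetrant model."""
    # Missing genotypes are treated as 0 alt alleles (an assumption).
    if model == "recessive":
        return (all(v.genotypes.get(s, 0) == 2 for s in affected)
                and all(v.genotypes.get(s, 0) < 2 for s in unaffected))
    if model == "dominant":
        return (all(v.genotypes.get(s, 0) >= 1 for s in affected)
                and all(v.genotypes.get(s, 0) == 0 for s in unaffected))
    raise ValueError(f"unknown model: {model!r}")


def rank_genes(variants: List[Variant], affected: List[str],
               unaffected: List[str], model: str) -> List[Tuple[str, int]]:
    """Rank genes by how many of their variants fit the inheritance model."""
    counts = Counter(v.gene for v in variants
                     if fits_model(v, affected, unaffected, model))
    return counts.most_common()
```

Real filters would layer further evidence on top of this, such as known variants, linkage intervals, or expression data, which is why they are developed case by case with a bioinformatician.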
Illumina Pipeline Software Versions
The history of updates to the Illumina Pipeline Software is listed here.