
Guidelines for Genome Assembly Sequencing Projects

The High Throughput Genomics Core Facility supports genome assembly projects by preparing short insert (220 bp) libraries and long insert (Mate Pair) libraries. Sequence reads from these two classes of libraries can be used with the ALLPATHS-LG software for genome assembly. The supported library preparation kits and recommendations for preparing samples for genome assembly sequencing projects are described below:
  1. Library Prep kits supported by the High Throughput Genomics Core Facility for genome assembly projects are described below:
    1. The Illumina TruSeq DNA PCR-Free Sample Prep (180 bp mean insert size) protocol recommends an input of 2000 ng of DNA (in a volume of 100 ul) to construct a library with an average insert size of 180 bp, enabling an average 70 bp overlap when performing 2 x 125 cycle paired end sequencing. Insert sizes range from approximately 100 to 300 bp, and nearly 80% of the read pairs will have some overlap. Libraries are constructed without PCR to minimize the bias routinely introduced by amplification.
    2. The Illumina TruSeq Nano DNA Sample Prep (200 bp mean insert size) protocol recommends an input of 200 ng of DNA (in a volume of 100 ul) to construct a library with an average insert size of 220 bp, enabling an average 50 bp overlap when performing 2 x 125 cycle paired end sequencing. Eight cycles of PCR are required to complete construction of the library.
    3. The Nextera MatePair Sample Preparation Kit requires 4 ug of column-purified, EDTA-free, high molecular weight genomic DNA in a volume of 100-200 ul. The kit uses a transposase activity to fragment the DNA while simultaneously adding a biotinylated junction adapter that marks the DNA ends. Discrete size ranges of fragmented DNA (3 kb, 5 kb and 10 kb) are purified on a Sage Science ELF. Additional customized size ranges can also be defined by the researcher. The purified, fragmented DNA is circularized and then sheared in a Covaris AFA instrument to yield sub-fragments, some of which contain the biotinylated junction adapter and therefore the joined ends of the size-selected DNA. These junction-containing sub-fragments are enriched, and sequencing adapters are added to them to complete library preparation.
  2. Purification of Genomic DNA: Genomic DNA should be purified using a column-based purification protocol such as the Qiagen DNeasy Blood and Tissue kit (cat#69504), or one of the DNA purification kits from Zymo Research (Quick-gDNA MiniPrep, Quick-gDNA Blood MiniPrep, or ZR Genomic DNA-Tissue MiniPrep). We recommend eluting the DNA in 10 mM Tris pH 8 rather than in the Qiagen elution buffer (Buffer AE) or the Zymo Research Elution Buffer, both of which contain EDTA. Omitting EDTA from the elution buffer is required for use of the SureSelect QXT system, as the transposase activity of the QXT kit fragments genomic DNA inefficiently in the presence of EDTA.
  3. Avoid Organic Extraction Methods: We recommend that you avoid organic extraction methods (such as phenol or TRIzol) and other methods that include an alcohol precipitation step. The quality of DNA purified by these protocols tends to be lower due to co-precipitation of contaminants. Furthermore, organic carryover can inhibit the enzymatic reactions used in library preparation.
  4. Assessment of DNA Concentration: The concentration of genomic DNA can be determined using the Qubit dsDNA BR assay (Invitrogen cat#Q32850) or the Qubit dsDNA HS assay (Invitrogen cat#Q32851). These assays use a fluorescent dye that is highly selective for double-stranded DNA over RNA, and together they can quantify samples in a concentration range from 10 pg/ul to 1000 ng/ul. In contrast, a NanoDrop measurement often overestimates the concentration of purified DNA, as any RNA that co-purifies with the DNA will contribute to the absorbance measurement at 260 nm.
  5. Assessment of DNA Quality: The quality of genomic DNA can be assessed by running an aliquot of the sample (approximately 10-100 ng) on a 1% agarose gel stained with SYBR Safe DNA Gel Stain (Invitrogen cat#S33102). High quality, intact genomic DNA should appear as a high molecular weight band (>10,000 bp) with no lower molecular weight smear. Low molecular weight smearing can indicate the presence of RNA.
  6. Library QC: Sequencing libraries will be validated by running an aliquot on an Agilent 2200 TapeStation and by defining the molarity of the library using a qPCR assay (Kapa Biosystems Library Quantification Kit for Illumina). After normalizing library concentrations following the completion of these two assays, libraries can be pooled in preparation for sequencing.
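
The overlap figures quoted for the short insert libraries follow from simple arithmetic: for paired end reads, the expected overlap is twice the read length minus the insert size, limited to the insert size itself and never below zero. The Python sketch below illustrates this; the function name is hypothetical and is not part of any facility software.

```python
def expected_overlap(read_length, insert_size):
    """Return the number of bases covered by both reads of a pair.

    Illustrative helper (hypothetical name). Assumes both reads have the
    same length and that insert_size excludes adapter sequence.
    """
    raw = 2 * read_length - insert_size
    # Reads cannot overlap by more than the insert itself, and an insert
    # longer than both reads combined leaves a gap (overlap of zero).
    return max(0, min(insert_size, raw))


# 2 x 125 cycle sequencing of a 180 bp mean insert library
print(expected_overlap(125, 180))  # 70
# 2 x 125 cycle sequencing of a 200 bp mean insert library
print(expected_overlap(125, 200))  # 50
```

Inserts of 250 bp or longer give no overlap at this read length, which is why only a portion of the read pairs in a library with a 100 to 300 bp insert range overlap.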
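
Before submitting a sample, it can help to check that the Qubit concentration is high enough to deliver the recommended DNA input within the stated volume. The following Python sketch encodes the input amounts and volumes listed for the three kits; the dictionary and function names are hypothetical, chosen only for illustration.

```python
# Recommended input per kit as (mass in ng, maximum sample volume in ul),
# taken from the kit descriptions in these guidelines.
KIT_INPUT = {
    "TruSeq DNA PCR-Free": (2000.0, 100.0),
    "TruSeq Nano DNA": (200.0, 100.0),
    "Nextera Mate Pair": (4000.0, 200.0),
}


def volume_needed_ul(required_ng, conc_ng_per_ul):
    """Volume of sample (ul) that contains the required mass of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return required_ng / conc_ng_per_ul


def meets_input(kit, conc_ng_per_ul):
    """True if the required mass fits within the kit's maximum volume."""
    required_ng, max_volume_ul = KIT_INPUT[kit]
    return volume_needed_ul(required_ng, conc_ng_per_ul) <= max_volume_ul


# A 50 ng/ul sample supplies 2000 ng in 40 ul, so it fits in 100 ul.
print(volume_needed_ul(2000.0, 50.0))            # 40.0
print(meets_input("TruSeq DNA PCR-Free", 50.0))  # True
# A 10 ng/ul sample would need 200 ul, which exceeds 100 ul.
print(meets_input("TruSeq DNA PCR-Free", 10.0))  # False
```

The concentration passed in should come from a Qubit measurement, since an absorbance-based reading may count co-purified RNA.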
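
As an illustration of the pooling step, the Python sketch below computes the volume of each library needed so that every library contributes the same molar amount to the pool. It relies on the identity 1 nM = 1 fmol/ul. The function name, the library names, and the 20 fmol target are assumptions made for the example and do not describe the facility's actual pooling procedure.

```python
def equimolar_pool_volumes(molarity_nm, fmol_per_library):
    """Return the volume (ul) of each library that contains fmol_per_library.

    molarity_nm maps a library name to its qPCR-derived molarity in nM.
    Because 1 nM equals 1 fmol/ul, volume (ul) = fmol / nM.
    """
    volumes = {}
    for name, nm in molarity_nm.items():
        if nm <= 0:
            raise ValueError("molarity must be positive for " + name)
        volumes[name] = fmol_per_library / nm
    return volumes


# Hypothetical libraries and molarities, pooled at 20 fmol each.
pool = equimolar_pool_volumes({"libA": 10.0, "libB": 4.0}, 20.0)
print(pool)  # {'libA': 2.0, 'libB': 5.0}
```

More dilute libraries contribute proportionally larger volumes, so very low molarities may call for a smaller per-library target to keep the pool volume practical.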