
Software:DAS2

DAS2

  • What is DAS/2? It's all about programmatically distributing massive genomic datasets to your genome browser or other DAS/2-aware analysis application. Read Gregg Helt's PowerPoint presentation and visit the DAS web site.
  • The Integrated Genome Browser is fully DAS/2 compliant; see IGB.

GenoPub and the GenoViz DAS/2 Server

  • The GenoViz DAS/2 server can now be managed using HCI's GenoPub (Genomic Annotation Publisher), a Flex/Flash web application that stores information about each annotation in a database. It is a user-friendly web-based tool for adding, describing, and organizing annotations. The tool is also used to manage users and groups and to specify annotation security. Access to annotations is brokered by the DAS/2 server so that only authorized users can reach these resources.


GenoPub

  • To try out our GenoPub instance
    • Log in as guest to browse (login: guest, password: guest), or authenticate using your GenoPub account if you wish to add public or private annotations.
    • Contact us for an account.

How To Upload Data Annotation Tracks Into GenoPub

  • GenoPub supports the bam, bai, bar, bed, bgn, bgr, bps, bp1, bp2, brs, cyt, ead, gff, gtf, psl, useq, bnib, and fasta file formats. Use the Text2USeq and Wig2USeq apps to convert many text-based file formats to the binary, preindexed, compressed xxx.useq format. This conversion is critical because text annotation files are loaded into memory, so their size on our server is restricted to 10,000 lines. Convert to xxx.useq or xxx.bar format to enable unrestricted data uploading and distribution.
  • To upload bam files, use the Picard tools (http://picard.sourceforge.net) to sort your bam file by coordinate, build an index, and validate both. Then upload both the xxx.bam and its corresponding xxx.bam.bai index file simultaneously into the same Annotation in GenoPub. Note that the C-based SamTools seem to generate bam and bai files that don’t always pass the Java-based Picard validation, so be sure to run the Picard ValidateSamFile app before uploading to GenoPub.
    • Example:
    • (Optional but recommended) Run the USeq SamFixer to remove poor quality reads, merge multiple lanes of sam files, and if appropriate convert RNA-Seq splice junctions to genomic coordinates
      • java -jar -Xmx1G ~/Software/USeq_7.0.1/Apps/SamFixer -f 7940X13_81BNAABXX_8.novo.sam.gz -s 7940X13_81BNAABXX.sam -c 46
    • Make sorted BAM file with index
      • java -jar -Xmx10G ~/Software/picard-tools-1.36/SortSam.jar SO=coordinate CREATE_INDEX=true TMP_DIR=. CREATE_MD5_FILE=true I=7940X13_81BNAABXX_8.sam O=7940X13_81BNAABXX_8.bam
    • Validate the bam file and index (ESSENTIAL)
      • java -jar -Xmx10G ~/Software/picard-tools-1.36/ValidateSamFile.jar VALIDATE_INDEX=true I=7940X13_81BNAABXX_8.bam
  • Common problems with bam files and DAS/2
    • Chromosomes listed as 1, 2, 3, 4...: these must be named chr1, chr2, chr3...; check the sam header.
  • Once your data is properly formatted, launch GenoPub, navigate to the appropriate genome build and into a subfolder for your group (create it if needed using the New Folder button). Then use the New Annotation button to create a record for your data track. If you have many similar datasets, you can clone a prior record and add a new dataset to it using the Duplicate button. Annotations can be dragged between folders. You can use this feature to create multiple views/groupings of the same data (e.g. by patient, by cell line, by factor, etc.).
  • Lastly, hit the Refresh DAS/2 Server button in GenoPub to add your annotations, fire up IGB (restart it if it is already running so that it loads the new annotations from GenoPub), and navigate to the UofU DAS2 server to see your data. If your datasets aren't public, you'll need to provide your GenoPub authentication credentials to IGB when prompted.
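
Two of the pitfalls described above, the 10,000-line limit on text annotation files and chromosome names that lack the chr prefix, can be checked before uploading. The following is a minimal shell sketch and is not part of GenoPub, USeq, or Picard. The function names exceeds_text_limit and unprefixed_chroms are hypothetical, and the default limit of 10,000 is simply the figure quoted above for our server.

```shell
# Hypothetical helper: succeed (exit 0) when a text annotation file has more
# lines than the limit, i.e. when it should be converted to xxx.useq or xxx.bar.
#   $1: path to the text file
#   $2: optional line limit (assumed default: 10000, the figure quoted above)
exceeds_text_limit() {
    awk -v limit="${2:-10000}" 'END { exit (NR > limit) ? 0 : 1 }' "$1"
}

# Hypothetical helper: read SAM text on stdin and print every @SQ sequence
# name (the SN: tag) that does not start with "chr".
unprefixed_chroms() {
    awk -F '\t' '
        /^@SQ/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)          # strip the "SN:" tag
                    if (name !~ /^chr/) print name
                }
            }
        }
        !/^@/ { exit }                            # stop at the first alignment record
    '
}
```

For example, `unprefixed_chroms < 7940X13_81BNAABXX_8.sam` prints nothing when every @SQ entry already uses chr-style names, and `exceeds_text_limit myTrack.bed` (myTrack.bed being a placeholder file name) succeeds when the file is too large to upload as text.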

Viewing GenoPub Tracks in the UCSC Genome Browser

To view GenoPub annotation tracks in the UCSC Genome Browser:

  • Upload your data tracks to GenoPub as xxx.useq (see above) or xxx.bam and xxx.bai files
  • Make sure the UCSC Name field is correct on the associated Genome Version; see UCSC Releases.
  • Select a data track and click the UCSC Browser button. This posts the linked data as a Custom Track to UCSC.
  • By default, annotations are sent to the public UCSC Genome Browser. This can be changed in GenoPub's User preference panel.
  • If needed, xxx.useq files will be converted on the fly to UCSC xxx.bb or xxx.bw. This can take some time depending on the size of the dataset. To avoid the delay, upload xxx.bb/bw files along with the xxx.useq files. See the USeq2UCSCBig app.

Security Warning! UCSC maintains caches of the uploaded data slices and the URLs to the GenoPub data. By default GenoPub deletes the links after 7 days. During that time, UCSC could download the entire file. The URLs are unique and can't be deconstructed to access other GenoPub data.

Bulk Upload of Annotation Tracks Into GenoPub

  • Transfer files to a directory accessible by the GenoPub instance
  • Set their permissions to something GenoPub can write to (e.g. chmod -R 770 /path2/BulkUploads/)
  • Launch GenoPub and make a dummy annotation in a particular genome build under the root
  • Set the dummy annotation's visibility and any properties you want to clone into all the annotations for upload
  • Create a tab delimited file containing 4 columns, one row per annotation:
    • Name of the annotation in GenoPub. This can be just the name or a path and name; if the grouping folders don't already exist, they will be created. For annotations with multiple files (e.g. xxx.bam and xxx.bam.bai), use the same name and path.
    • Full path file name for where it resides on the server
    • A summary
    • A description


Example:

/Hot Lab/Exomes/Default SNPs/Patient3415	/home/fdt/bulkUploads/patient3415SNPs.bed	SNP calls from GATK, default settings	First pass analysis of six fingered crooks.
/Hot Lab/Exomes/Alignments/Patient3415	/home/fdt/bulkUploads/patient3415.bam	Novoalignments	101bp Novoalignments from 1 lane HiSeq per Agilent V2 Exome capture
/Hot Lab/Exomes/Alignments/Patient3415	/home/fdt/bulkUploads/patient3415.bam.bai	Novoalignments	101bp Novoalignments from 1 lane HiSeq per Agilent V2 Exome capture
  • Save this text file with the xxx.bulkUpload extension and upload it into the dummy annotation. This will trigger the bulk uploader in GenoPub.
  • Lastly, hit the Refresh DAS/2 Server button to load the annotations into the server for distribution.
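
Because the bulk upload file depends on exactly four tab-separated columns per row, it can help to generate and check it from the shell rather than by hand. The sketch below is one possible approach and is not shipped with GenoPub. The function names bulk_row and check_bulk_file are hypothetical, and the check only counts columns, since how GenoPub itself validates the file is not described here.

```shell
# Hypothetical helper: print one tab-delimited row for a xxx.bulkUpload file.
#   $1: annotation name (or path and name)   $2: full path of the file on the server
#   $3: summary                              $4: description
bulk_row() {
    if [ "$#" -ne 4 ]; then
        echo "bulk_row: expected 4 fields, got $#" >&2
        return 1
    fi
    tab=$(printf '\t')
    case "$1$2$3$4" in
        *"$tab"*)
            echo "bulk_row: fields must not contain tab characters" >&2
            return 1
            ;;
    esac
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: report every row that does not have exactly 4 columns.
# Exits non-zero if any such row is found.
check_bulk_file() {
    awk -F '\t' '
        NF != 4 { printf "line %d: %d columns\n", NR, NF; bad = 1 }
        END     { exit (bad ? 1 : 0) }
    ' "$1"
}

# Usage (names and paths taken from the example above):
#   bulk_row "/Hot Lab/Exomes/Alignments/Patient3415" \
#            "/home/fdt/bulkUploads/patient3415.bam" \
#            "Novoalignments" \
#            "101bp Novoalignments from 1 lane HiSeq per Agilent V2 Exome capture" \
#            >> patient3415.bulkUpload
#   check_bulk_file patient3415.bulkUpload
```

Quoting each field keeps names with spaces, such as "Hot Lab", in a single column.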

Installation