
Tomato Manual

Parameters

Parameters are specified in the command file ("cmd.txt") for pipelines. Example:

@align -g hg19 -i *.gz

In the above example:

Pipeline name: @align
Pipeline parameters: -g hg19 -i *.gz

Basic Parameters

-i INPUT

Required

Input files. Inputs are generally raw sequence files in FASTQ format (optionally gzipped). You can supply multiple inputs. An input can be a regular file (like "a123.fq.gz"), a folder that contains input files, or a pattern that matches certain files, like "a*.fq.gz". You can think of it as an "$ls INPUT" operation that lists the qualifying files. For example, if you use the following meta-command in "cmd.txt":

@align -g hg19 -i A*.txt.gz

Then any file in the current job directory that starts with "A" and ends with ".txt.gz", like "A_1.txt.gz" or "A_1_B2.txt.gz", will be aligned to the reference genome.

If you want to process some samples as a group, you must put these samples into one sub-folder and use the folder name as input.

-i dir_my_case_samples

Here a group means samples that share some common feature, such as a group of normal samples or a group of breast cancer patients. The difference between grouped and non-grouped inputs is that all the individual alignments from grouped samples are merged into one single alignment for recalibration, while the alignments of non-grouped samples are NOT merged. This affects all pipelines except the @align pipeline.

All files matched by one input must have the same file suffix, which implies the same file type. For example, if you have A_1.txt.gz, A_2.txt.gz and A.sam in your job folder, the following meta-command will fail:

@snp -g hg19 -i A*

This is because the input "A*" matches all of "A_1.txt.gz", "A_2.txt.gz" and "A.sam". You must separate inputs of different types. The correct command is:

@snp -g hg19 -i A*.txt.gz A*.sam
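
The same-suffix rule can be illustrated with a small Python sketch. This is not Tomato's own code: the function names are hypothetical, and the suffix is assumed to be everything from the first dot of the file name onward, which the manual does not state explicitly.

```python
from fnmatch import fnmatchcase


def suffix_of(name):
    """Return everything from the first dot on, e.g. 'A_1.txt.gz' -> '.txt.gz'.

    Assumption: this is how a "file suffix" is defined for the check.
    """
    dot = name.find(".")
    return name[dot:] if dot != -1 else ""


def check_input(pattern, filenames):
    """Return the files matching one input pattern, or raise on mixed suffixes."""
    matched = sorted(f for f in filenames if fnmatchcase(f, pattern))
    suffixes = {suffix_of(f) for f in matched}
    if len(suffixes) > 1:
        raise ValueError(f"{pattern!r} matches mixed file types: {sorted(suffixes)}")
    return matched


job_files = ["A_1.txt.gz", "A_2.txt.gz", "A.sam"]
print(check_input("A*.txt.gz", job_files))  # the two FASTQ files
print(check_input("A*.sam", job_files))     # the SAM file
```

Calling check_input("A*", job_files) raises an error in this sketch, which mirrors the failing "@snp -g hg19 -i A*" command above.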

You can also use SAM or BAM files as inputs. Tomato will automatically skip the alignment step for SAM/BAM inputs.

Example:
@snpindel -g hg19 -i *.bam

-g GENOME_BUILD

Required

Reference genome build.

Example: @align -g mm9 -i A_1.txt.gz

-s SAMPLE

Required for Control/Case analysis using VarScan or for group variant calling using GATK

1. The SAMPLE must be a text file with two columns separated by a TAB.
2. The first column defines the control samples, separated by commas for multiple files.
3. The second column defines the case samples, separated by commas for multiple files.

For example, you have the following paired-end sequencing files to analyze:

A1_1.txt.gz     A1_2.txt.gz     A2_1.txt.gz     A2_2.txt.gz
B1_1.txt.gz     B1_2.txt.gz     B2_1.txt.gz     B2_2.txt.gz

So in total you have 4 samples. The two samples that start with "A" are case samples, and the other two samples that start with "B" are control samples. For Tumor/Normal analysis with VarScan, the sample.txt is a single line with the control files in the first column and the case files in the second column, separated by a TAB:

B1_1.txt.gz,B1_2.txt.gz,B2_1.txt.gz,B2_2.txt.gz	A1_1.txt.gz,A1_2.txt.gz,A2_1.txt.gz,A2_2.txt.gz


If you want to call variants in the B group only, the sample.txt has only one column:

B1_1.txt.gz,B1_2.txt.gz,B2_1.txt.gz,B2_2.txt.gz

This will instruct Tomato to do:

novoalign  B1_1.txt.gz,B1_2.txt.gz > B1.sam
novoalign  B2_1.txt.gz,B2_2.txt.gz > B2.sam
HaplotypeCaller.jar -I B1.sam -I B2.sam -o B.vcf

Without the "-s sample.txt", Tomato will do:

novoalign  B1_1.txt.gz,B1_2.txt.gz > B1.sam
novoalign  B2_1.txt.gz,B2_2.txt.gz > B2.sam
HaplotypeCaller.jar -I B1.sam -o B1.vcf
HaplotypeCaller.jar -I B2.sam -o B2.vcf
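
As an illustration of the sample.txt layout described above, the following Python sketch splits one line into its control and case file lists. The function name is hypothetical and this is only a sketch of the documented format, not the parser Tomato actually uses.

```python
def parse_sample_line(line):
    """Split one sample.txt line into (control_files, case_files).

    The first TAB-separated column holds the control files and the optional
    second column holds the case files; files are separated by commas.
    """
    columns = line.rstrip("\n").split("\t")
    if not 1 <= len(columns) <= 2:
        raise ValueError("expected one or two TAB-separated columns")
    control = [f.strip() for f in columns[0].split(",") if f.strip()]
    case = []
    if len(columns) == 2:
        case = [f.strip() for f in columns[1].split(",") if f.strip()]
    return control, case


two_columns = "B1_1.txt.gz,B1_2.txt.gz\tA1_1.txt.gz,A1_2.txt.gz"
control, case = parse_sample_line(two_columns)
print(control)  # control (first column)
print(case)     # case (second column)
```

With a one-column line, the returned case list is empty, which corresponds to the group-calling use described above.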

-r

Optional

Reserve all intermediate files.

Example: @align -g mm9 -i A_1.txt.gz -r

Caution: The pipeline generates many intermediate files that are unnecessary in most cases, and these files may occupy a huge amount of disk space on the cluster node. If the total size of all files exceeds the available disk space on the cluster node during processing, your job will fail.


For all default pipelines, the parameters "-g" (reference genome) and "-i" (input files) are required.

Overriding Parameters

These parameters override the default settings in the pipeline, or add new parameters to them.

-novoalign [ parameters ]

You can use this parameter to pass your own settings to novoalign when you are using a pipeline that uses novoalign internally.

Reserved parameters: '-d', '-f' and '-o'

(You can NOT specify reserved parameters in the "[ parameters ]".)


Example:

@align -g hg19 -i *.gz -novoalign [ -r None -t 50 ]

-bwase [ parameters ]

You can use this parameter to pass your own settings to BWA when you are using a pipeline that uses BWA internally for single-end alignment.

Reserved parameters: '-r'

Example:

@align -g hg19 -i *.gz -bwase [ -a 123 -b 456 -c 789 ]

-bwape [ parameters ]

You can use this parameter to pass your own settings to BWA when you are using a pipeline that uses BWA internally for paired-end alignment.

Reserved parameters: '-r'

Example:

@align -g hg19 -i *.gz -bwape [ -a 123 -b 456 -c 789 ]

-UnifiedGenotyper [ parameters ]

Reserved parameters: '-R', '-I' and '-o'

-HaplotypeCaller [ parameters ]

Reserved parameters: '-R', '-I' and '-o'

-VariantRecalibrator [ parameters ]

Reserved parameters: '-R', '-input', '-recalFile' and '-tranchesFile'

-ApplyRecalibration [ parameters ]

Reserved parameters: '-R', '-input', '-o', '-recalFile' and '-tranchesFile'

-SelectVariants [ parameters ]

Reserved parameters: '-R', '--variant' and '-o'

-VAAST [ parameters ]

Reserved parameters: NONE

-annovar [ parameters ]

Reserved parameters: NONE
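
The reserved-parameter lists above can be checked before a job is submitted. The Python sketch below copies the lists from this section; the function name is hypothetical and the simple whitespace tokenization is an assumption, not Tomato's actual parsing.

```python
# Reserved flags per overriding parameter, copied from the lists above.
RESERVED = {
    "-novoalign": {"-d", "-f", "-o"},
    "-bwase": {"-r"},
    "-bwape": {"-r"},
    "-UnifiedGenotyper": {"-R", "-I", "-o"},
    "-HaplotypeCaller": {"-R", "-I", "-o"},
    "-VariantRecalibrator": {"-R", "-input", "-recalFile", "-tranchesFile"},
    "-ApplyRecalibration": {"-R", "-input", "-o", "-recalFile", "-tranchesFile"},
    "-SelectVariants": {"-R", "--variant", "-o"},
    "-VAAST": set(),
    "-annovar": set(),
}


def reserved_conflicts(tool, override):
    """Return the reserved flags found in the text between '[' and ']'."""
    tokens = override.split()
    return sorted(t for t in tokens if t in RESERVED[tool])


print(reserved_conflicts("-novoalign", "-r None -t 50"))  # no conflicts
print(reserved_conflicts("-novoalign", "-o SAM -t 50"))   # '-o' is reserved
```

The comparison is case-sensitive, so "-r" passed to novoalign does not collide with the reserved "-R" of the GATK tools.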

Deprecated Parameters

These parameters from Tomato1 are now deprecated in Tomato2.

-bam

You can NOT generate the BAM alignment in the pipeline by using "@align ... -bam". Instead, you should use or construct a new pipeline.

-annovar for @annot pipeline

You can NOT generate the ANNOVAR report in the "@annot" pipeline by using "@annot ... -annovar". Instead, you should use or construct a new pipeline.

-vaast for @annot pipeline

You can NOT generate the VAAST report in the "@annot" pipeline by using "@annot ... -vaast". Instead, you should use or construct a new pipeline.

Header lines

Tomato uses a few special header lines that start with '#' for special purposes.

#e MY_EMAIL OPTION

Here "#e" means "email". It is reserved for your email address. Tomato will send email to this address when the job finishes or fails. Please note that there is a separating space between "#e" and your email address.

Example: 
#e u12345@utah.edu

Attention: An email address is now required for all jobs. This allows better management of jobs and disk space.

By default, Tomato notifies you of all events by email. You can selectively receive emails with OPTION.

"#e name@address.com -abcefn"

-a  mail will be sent on every event (equal to -bcef).
-b  mail will be sent when the job begins execution.
-c  mail will be sent when the job completes successfully.
-e  mail will be sent when the job raises an exception (the job may or may not have failed).
-f  mail will be sent when the job fails.
-n  mail will NOT be sent (reverse of -a).

Example:

1. receive all emails

(default, no options specified)

#e name@address.com

or

#e name@address.com -a

2. receive email only when the job Begins execution.

#e name@address.com -b

3. receive email only when the job Failed.

#e name@address.com -f

4. receive email only when the job Begins and Failed.

#e name@address.com -bf

or

#e name@address.com -b -f

5. do NOT receive any emails.

#e name@address.com -n

Please note that '-n' suppresses all other options, so '-abcefn' is equal to '-n'.
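
The option rules above can be summarized in a short Python sketch. The function name is hypothetical, and the code only restates the documented behaviour (no options or '-a' means all events, '-n' wins over everything else); it is not Tomato's implementation.

```python
ALL_EVENTS = {"b", "c", "e", "f"}  # begin, completed, exception, failed


def email_events(header):
    """Return the set of events that trigger an email for one '#e' header line."""
    parts = header.split()
    if len(parts) < 2 or parts[0] != "#e":
        raise ValueError("expected '#e ADDRESS [OPTIONS]'")
    # Accept both "-bf" and "-b -f" by merging all option letters.
    flags = set("".join(p.lstrip("-") for p in parts[2:]))
    unknown = flags - ALL_EVENTS - {"a", "n"}
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    if "n" in flags:
        return set()            # '-n' suppresses all other options
    if not flags or "a" in flags:
        return set(ALL_EVENTS)  # default, or '-a'
    return flags


print(sorted(email_events("#e name@address.com -b -f")))
print(sorted(email_events("#e name@address.com -abcefn")))
```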

#l MY_LAB

Here "#l" means "lab" (it is a lower-case "L", not the number "1"). It is reserved for sending results to GNomEx and generating an Analysis Report directly. Please note that all outputs, including the log.txt file, are stored in GNomEx. Check your email for the name of the Analysis created by Tomato for this job.

Example: 
#l Brad Cairns

Please make sure the lab's name is correct (case-sensitive, as defined in GNomEx). A single typo will make your job fail. To avoid this, you can use the lab number as shown in Appendix A.

For example, if you are from the "Brad Cairns" lab, whose number is 23, just use:
#l 23
If you are from "U of U Bioinformatics Shared Resource", it will be:
#l 244

It is strongly recommended to use the lab number instead of the full lab name.

#a ANALYSIS_NAME

Here "#a" means "append". It is reserved for sending results to an existing GNomEx Analysis.

Example: 
#a A404

This header tells Tomato to append results to the existing GNomEx Analysis A404. To avoid overwriting existing files, Tomato will create a subfolder under GNomEx Analysis A404 and use the time of creation as the new folder name. For example, the folder name could be "02_07_2012_13_43_12", which means the folder was created at "Feb/07/2012, 13:43:12".
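
The folder name in this example follows a month_day_year_hour_minute_second layout. A minimal Python sketch of that naming, assuming this layout holds in general (the function name is hypothetical):

```python
from datetime import datetime


def subfolder_name(created):
    """Format a creation time as MM_DD_YYYY_HH_MM_SS."""
    return created.strftime("%m_%d_%Y_%H_%M_%S")


print(subfolder_name(datetime(2012, 2, 7, 13, 43, 12)))  # 02_07_2012_13_43_12
```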

Please note that this header will supersede "#l" if you use them together.

Example: 
#l 244
#a A404

In the above example, "#l" will NOT take effect. No new GNomEx Analysis will be created. All results go to the existing A404.

##COMMENT

This header is reserved for a general description of the job. If you also specify "#l", these lines will appear as the "description" of the Analysis Report in GNomEx.

Example:

#e my_name@my_organization.com
#l 244
##This is brief description of this job for GNomEx.
##you can write anything in here and it will appear in the "Description" tab in the GNomEx Analysis
##End of description
@align -g hg19 -i *.gz -r

As a general rule, you should always give your email address as the first header line (putting it first is not required, but highly recommended).
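
To make the roles of the header lines concrete, here is a Python sketch that sorts the lines of a "cmd.txt" into headers, description and commands. The function name and the returned dictionary keys are hypothetical, and unknown '#' lines are simply treated as commands here; Tomato's real handling may differ.

```python
def parse_cmd_file(text):
    """Split the text of a cmd.txt into header values, description and commands."""
    job = {"email": None, "lab": None, "append_to": None,
           "description": [], "commands": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):          # description line
            job["description"].append(line[2:].strip())
        elif line.startswith("#e "):       # email address, plus any options
            job["email"] = line[3:].strip()
        elif line.startswith("#l "):       # lab name or lab number
            job["lab"] = line[3:].strip()
        elif line.startswith("#a "):       # existing GNomEx Analysis
            job["append_to"] = line[3:].strip()
        else:                              # meta-command or raw command
            job["commands"].append(line)
    if job["append_to"] is not None:
        job["lab"] = None                  # "#a" supersedes "#l"
    return job


example = """#e my_name@my_organization.com
#l 244
##This is brief description of this job for GNomEx.
@align -g hg19 -i *.gz -r
"""
print(parse_cmd_file(example)["commands"])
```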

Job Control

Begin

To start your job, you have to create a 'b' file, which means "Begin". It is a dummy file that acts as a signal to Tomato.

$touch b

Tomato will start downloading (and running) your job after 'seeing' this file in your job folder. After Tomato starts your job, the 'b' file is automatically deleted from your job folder. So when the 'b' file has disappeared, Tomato has already started running your job.

Abort

To terminate a running job (your job is running on CHPC but has not finished yet), you can create a dummy 'a' file, which means "Abort".

$touch a

Tomato will terminate your job and clean up all leftover files. You will NOT get a notification email in this case (because you already know what will happen).

Dry run

You can create a 'd' file which means "Dry run".

$touch d

Tomato will NOT start this job; instead, it will email you the full list of commands for this job.

Number of Jobs

These rules apply to all general users.

1. At any time, each user can run at most 6 jobs.

2. The first 3 jobs have a walltime limit of 240 hours and the remaining 3 jobs have a walltime limit of 24 hours.

Watch job progress

If Tomato accepts your job, it will create a log file named "log.txt" in your job folder. You will see the "b" file disappear first, and then a new "log.txt" appear. During the whole processing period, Tomato updates this log file whenever a new event happens, so you can monitor this file to follow the progress of your job. A typical command is:

$tail -qF log.txt

This will continuously print new lines of log.txt as Tomato updates it.

Standard output and error

By default, Tomato will also create two files, "stdout.txt" for standard output and "stderr.txt" for standard error, which capture the output of your commands in "cmd.txt". For example, if you have the following lines in "cmd.txt":

echo "hello,world"
echo "byebye"
touch foo.txt
thisdoesnotexist  
echo "all done"
touch bar.txt

Then after this job runs on Tomato, "stdout.txt" will be like:

hello,world
byebye

the "stderr.txt" will be like:

bash: thisdoesnotexist: command not found

Please note that "stdout.txt" does NOT contain the output ("all done") of the command echo "all done", because the fourth command, "thisdoesnotexist", failed, and no command after a failed command is executed. This fail mechanism guarantees that if one "pipe" in a "pipeline" is broken, the whole pipeline is cancelled. A pipeline is like a highly organized assembly line: a bad part from one worker should not be passed to the next worker, to avoid a bad final product. However, any good parts produced before the failed worker (command) will be transferred back to your original job folder in case you want to reuse them. In this case, you will get "foo.txt" but no "bar.txt".
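
The stop-at-first-failure behaviour can be sketched in Python as follows. This is only an illustration of the idea, using the standard subprocess module; the function name is hypothetical and the way Tomato really runs commands on the cluster is not described in this manual.

```python
import subprocess


def run_fail_fast(commands):
    """Run shell commands in order and stop at the first non-zero exit status.

    Returns (stdout_text, stderr_text, index_of_failed_command_or_None).
    """
    stdout_parts, stderr_parts = [], []
    for index, command in enumerate(commands):
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        stdout_parts.append(result.stdout)
        stderr_parts.append(result.stderr)
        if result.returncode != 0:
            # Later commands are never started.
            return "".join(stdout_parts), "".join(stderr_parts), index
    return "".join(stdout_parts), "".join(stderr_parts), None


out, err, failed_at = run_fail_fast([
    'echo "hello,world"',
    'echo "byebye"',
    "thisdoesnotexist",
    'echo "all done"',
])
print(out, end="")  # output of the two commands that ran before the failure
print(failed_at)    # position of the failed command (counting from 0)
```

The exact wording of the error text collected in the second return value depends on the shell that runs the commands.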

"stderr.txt" also serves as a bug tracker. Tomato will write the error message to the "stderr.txt" once the job is failed. That helps to locate and fix the problem.

Advanced features

Tomato runs on CHPC and scans a special directory, JOB_DIR, on the HCI servers. There are three important paths in Tomato:

JOB_DIR = /tomato/job/

APP_DIR = /tomato/app/

DATA_DIR = /tomato/data/

The JOB_DIR is the path that "tomato" remotely scans for possible jobs. Therefore, you have to create your job folder under JOB_DIR.

>cd /tomato/job/
>mkdir hello
>cd hello
>ln -s /PATH/TO/MY_INPUTS/* .
>vim cmd.txt
>touch b

The APP_DIR has all applications (including novoalign, picard, etc.) used by Tomato for the different pipelines.

The DATA_DIR has all pre-built library files (including hg19.fasta, mm9.fasta, etc.) and supplementary files used by the applications in APP_DIR.

Generally, Tomato only uses the applications that are already installed in APP_DIR. However, sometimes you may need to run other applications in Tomato. To do so, you have two choices:

1. Copy your application to APP_DIR, but only if this application will be used very often in the future (consider it a system-level application).

2. Copy your application to your job folder (consider it a user-level application).


Example: you have a Java application named "hello.jar" that can clean a SAM file, and you want to use it in Tomato.

1. #copy hello.jar to APP_DIR
    $cp hello.jar /tomato/app/
2. #use hello.jar in Tomato
    $pwd
    /tomato/job/foo

    $ls
    A.sam

    $cat > cmd.txt
    hello.jar A.sam > B.sam

3. #Make the b file
    $touch b

Please note: do NOT use "java -jar" to run your "hello.jar" file in Tomato. Tomato determines how to execute a command by its suffix and file type. If the command is a native Linux application or a compiled binary executable, Tomato will execute it directly. If it is like "hello.jar ...", the real command executed would be like "java -jar -Xmx20g /PATH/TO/hello.jar ...". If it is like "hello.py ...", the real command executed would be like "python hello.py ...". Files that end with ".pl" (Perl scripts) and ".sh" (bash scripts) are also supported.

To summarize, the user applications that Tomato supports include:

1. Executable Linux binary applications

2. Shell script applications that end with ".sh"

3. Python script applications that end with ".py"

4. Perl script applications that end with ".pl"

5. Java applications that end with ".class"

6. Java jar applications that end with ".jar"
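
The suffix-based dispatch described above might look like the following Python sketch. Only the ".jar" and ".py" launchers are stated in this manual; the launchers for ".pl", ".sh" and ".class" are assumptions, the function name is hypothetical, and the sketch does not resolve the full path to the application as Tomato does.

```python
# Launcher prefixes by suffix. ".jar" and ".py" follow the manual;
# ".pl" and ".sh" are assumed to use perl and bash.
LAUNCHERS = {
    ".jar": ["java", "-jar", "-Xmx20g"],
    ".py": ["python"],
    ".pl": ["perl"],  # assumption
    ".sh": ["bash"],  # assumption
}


def real_command(command_line):
    """Return the argument list that would actually be executed."""
    program, *args = command_line.split()
    if program.endswith(".class"):
        # Assumption: the JVM is given the class name without its suffix.
        return ["java", program[: -len(".class")]] + args
    for suffix, launcher in LAUNCHERS.items():
        if program.endswith(suffix):
            return launcher + [program] + args
    return [program] + args  # native binary: run directly


print(real_command("hello.jar A.sam"))
print(real_command("novoalign -d hg19.nix"))
```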

Tomato automatically synchronizes APP_DIR and DATA_DIR between the HCI server and the clusters. So in the above example, you only copy your application to APP_DIR, yet Tomato is able to run it on a cluster node, because all applications under APP_DIR are automatically uploaded to the clusters and indexed. You do not need to care about the full path to your application and data; just run your application as if all the files were in the same folder. You can also copy your library files or any other commonly used files to DATA_DIR if needed.

$cp /PATH/TO/MYHOME/bv3.nov.illumina.nix /tomato/data/

Then other users can share and reuse your library file "bv3.nov.illumina.nix" in their own jobs. This helps to reduce duplicated library files.

$ echo '@align -g bv3 -i *.gz' > cmd.txt


However, commands (and library file names) are case-sensitive, so be careful about typos when writing your commands in "cmd.txt".

Pairing and grouping reads file

Some sequencing files are paired-end while others are single-end, depending on the job. Suppose you have the following files in your working directory:

$ll
A_1_1.txt.gz
A_1_2.txt.gz
B_1_1.txt.gz
B_2_2.txt.gz
C_1_1.txt.gz
C_1_2.txt.gz

And only one line in "cmd.txt"

@align -g hg19 -i *.txt.gz

In this case, Tomato will try to pair these files. Tomato assumes that all input sequencing files have a name like "XXXX_N.Y", where N is a number. In the above example, Tomato will take "A_1_1.txt.gz" and "A_1_2.txt.gz" as paired sequencing files because these two files have the common prefix "A_1" (the common prefix is used as the name of the output). But Tomato will take "B_1_1.txt.gz" and "B_2_2.txt.gz" as two independent files because their prefixes are not the same: "B_1" and "B_2". If you rename the two files to "B_1.txt.gz" and "B_2.txt.gz", they will be paired.

For the above example, the final commands (not real code, just for demonstration) executed on the cluster will be like:

novoalign -o SAM -d hg19.nix -f  A_1_1.txt.gz A_1_2.txt.gz > A_1.sam 2>A_1.log
novoalign -o SAM -d hg19.nix -f  B_1_1.txt.gz > B_1.sam 2>B_1.log
novoalign -o SAM -d hg19.nix -f  B_2_2.txt.gz > B_2.sam 2>B_2.log
novoalign -o SAM -d hg19.nix -f  C_1_1.txt.gz C_1_2.txt.gz > C_1.sam 2>C_1.log

The automatic pairing is designed to work on "standard" file names that have the pattern "X_N.Y" (please note the underscore before N). X is called the prefix and Y is called the suffix. If both X and Y of two files are equal, and one file's N is 1 and the other file's N is 2, then these two files are processed as paired-end files.

For example:

A_1_1.txt.gz and A_1_2.txt.gz are paired. Here both X are "A_1" and both Y are ".txt.gz". One N is 1 and the other N is 2.

A_1_1.txt.gz and A_2_2.txt.gz are NOT paired. The X are different.

A_1_1.txt.gz and A_1_2.txt are NOT paired. The Y are different.

A_1_1.txt.gz and A_1_3.txt.gz are NOT paired. The N are not 1 and 2.

A1.txt and A2.txt are NOT paired because the underscore is missing.
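
The pairing rule above can be expressed as a short Python sketch. It is an illustration of the "X_N.Y" rule as documented, not Tomato's own code; the function name and the regular expression are assumptions, with Y taken to start at the first dot after N.

```python
import re
from collections import defaultdict

# X_N.Y: prefix X, an underscore, a number N, then the suffix Y (from the dot on).
NAME = re.compile(r"^(?P<prefix>.+)_(?P<n>\d+)(?P<suffix>\..+)$")


def pair_reads(filenames):
    """Return a sorted list of tuples: (read1, read2) for pairs, (file,) otherwise."""
    groups = defaultdict(dict)
    jobs = []
    for name in sorted(filenames):
        m = NAME.match(name)
        if m is None:
            jobs.append((name,))  # no "_N" part, e.g. "A1.txt"
        else:
            groups[(m["prefix"], m["suffix"])][int(m["n"])] = name
    for members in groups.values():
        if set(members) == {1, 2}:
            jobs.append((members[1], members[2]))
        else:
            jobs.extend((name,) for name in members.values())
    return sorted(jobs)


files = ["A_1_1.txt.gz", "A_1_2.txt.gz", "B_1_1.txt.gz",
         "B_2_2.txt.gz", "C_1_1.txt.gz", "C_1_2.txt.gz"]
for job in pair_reads(files):
    print(job)
```

For the six files listed earlier in this section, the sketch yields two pairs (A and C) and two single files (the B files), matching the demonstration commands above.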


If you specify directories as input, like "-i DIR_A DIR_B", then all files under the same directory are processed as a group.

All the individual alignments will be merged into one alignment for recalibration. For example:

>ls DIR_A
A_1_1.txt.gz
A_1_2.txt.gz
A_2_1.txt.gz
A_2_2.txt.gz
A_3_1.txt.gz
A_3_2.txt.gz

The pseudo-commands will be like:

novoalign A_1_1.txt.gz A_1_2.txt.gz > A_1.sam
novoalign A_2_1.txt.gz A_2_2.txt.gz > A_2.sam
novoalign A_3_1.txt.gz A_3_2.txt.gz > A_3.sam
cat A_1.sam A_2.sam A_3.sam> A.sam
recalibrate A.sam
process A.sam
SNP A.sam > A.vcf

Output and GNomEx

By default, Tomato will send results back to your original job folder. However, you can tell Tomato to send results to GNomEx directly by putting the special header line "#l LAB_NUMBER_OR_LAB_NAME" in your "cmd.txt". The benefit of sending results to GNomEx is that you get an automatically generated report.

For example, to create an Analysis in GNomEx for this job, you can write your "cmd.txt" as:

#e my_email@gmail.com
#l 244
##my_description_of_this_job
##my_other_description of this job
##all the words I want to say about this job will appear in GNomEx's Analysis
@align -g hg19 -i *.gz

Once the job is done, you should receive an email that gives the name of this analysis, like "A371". Then you can log into GNomEx and enter "A371" in "Look up Experiment or Analysis" at the top left corner of GNomEx's header. There you can view the report and download all the result files.

Image:tomato_gnomex_4.jpg

Naming rules for library files

By default, Tomato uses Novoalign to process all alignments. You can create your own genome index files and put them under DATA_DIR to share with other people. A genome index file MUST be named "NAME.nov.PLATFORM.nix".

Example:

You want to create the index for the latest Chimpanzee genome.

1. Download the genome data from UCSC in fasta format like "panTro3.fa"

$wget http://hgdownload.cse.ucsc.edu/goldenPath/panTro3/bigZips/panTro3.fa.gz
$gunzip panTro3.fa.gz
$mv panTro3.fa chimp3.fasta

2. Create the index using novoindex for the Illumina platform.

$novoindex -k 14 -s 1 chimp3.nov.illumina.nix chimp3.fasta

If you want to create the index for the SOLiD ColorSpace platform, use:

$novoindex -c -k 14 -s 1 chimp3.nov.solid.nix chimp3.fasta

If you want to create the index for bisulphite, use:

$novoindex -m -k 14 -s 1 chimp3.nov.bisulphite.nix chimp3.fasta

3. Move the index file to DATA_DIR

$mv chimp3.nov.illumina.nix /tomato/data/

4. Use this new index in your "cmd.txt"

@align -g chimp3 -i *.gz

Apart from the genome index file for novoalign, there are no naming constraints on other prepared files; you can use any Unix-compatible names you want. However, one general rule is that any file under DATA_DIR (/tomato/data) should be of repeated, public use in the future by you and/or other users. If a file is only used for a one-time job, just put it in your job folder.
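
The "NAME.nov.PLATFORM.nix" naming rule can be checked with a small Python sketch. The function names are hypothetical, and it is assumed here that neither NAME nor PLATFORM contains a dot.

```python
import re

# NAME.nov.PLATFORM.nix, assuming NAME and PLATFORM contain no dots.
INDEX_NAME = re.compile(r"^(?P<name>[^.]+)\.nov\.(?P<platform>[^.]+)\.nix$")


def index_filename(genome, platform):
    """Build the index file name for a genome build and a platform."""
    return f"{genome}.nov.{platform}.nix"


def parse_index_filename(filename):
    """Return (NAME, PLATFORM), or raise if the name breaks the naming rule."""
    m = INDEX_NAME.match(filename)
    if m is None:
        raise ValueError(f"{filename!r} is not named NAME.nov.PLATFORM.nix")
    return m["name"], m["platform"]


print(index_filename("chimp3", "illumina"))           # chimp3.nov.illumina.nix
print(parse_index_filename("hg19.nov.illumina.nix"))
```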

Limitations

1. Disk space. Each job is dispatched to one cluster node for processing. For maximal performance, all files (including input files, library files, sequence files and all other supplementary files) that are necessary to run the job are copied to that node's local disk. All intermediate and resulting files are also stored on that local disk. However, the local disk on each node has only ~400GB of free space, so the total size of your job is subject to that limit.

2. Maximum running time. The maximum running time assigned to each job is 240 hours (10 days) on a single cluster node. This is enough for most jobs. If your job has not finished within 240 hours, it will be terminated and the existing result files will be returned to you. You may split your big job into several small jobs and submit them one by one. The walltime for each job is adjusted at runtime if there is a scheduled downtime at CHPC. Assume it is now 2012/03/05 1:00PM and CHPC is scheduled to go down at 2012/03/06 2:00PM; then there are 25 hours until CHPC is down, and your job will get 25 hours to complete. If a job is submitted at 2012/03/05 3:00PM, it will get 23 hours. As usual, if the job does not finish in 23 hours, it will be terminated.


3. Tomato is just a framework. The internal pipelines only handle common cases and have some limitations. For example, many applications in the pipelines use default parameter settings, which may not be appropriate for your project. If you know what you are doing, please use raw commands in "cmd.txt". Please note: do not put paths in front of applications and library files.

For example, you can write your "cmd.txt" like:
novoalign -r Random -k -o SAM $'@RG\tID:tomato\tPL:illumina\tLB:libtmp\tSM:sample\tCN:HCI' -d hg19.nov.illumina.nix -f A/A_1.txt.gz A/A_2.txt.gz >A/A.sam
FixMateInformation.jar INPUT=A/A.sam OUTPUT=A/A.mate.bam TMP_DIR=/scratch/ibrix/chpc_gen/HCIHiSeqPipeline SO=coordinate VERBOSITY=ERROR QUIET=true VALIDATION_STRINGENCY=SILENT
BuildBamIndex.jar INPUT=A/A.mate.bam VERBOSITY=ERROR QUIET=true
MarkDuplicates.jar INPUT=A/A.mate.bam OUTPUT=A/A.mate.dup.bam M=A/A.mate.bam.duplicate VERBOSITY=ERROR QUIET=true REMOVE_DUPLICATES=true ASSUME_SORTED=true TMP_DIR=/scratch/ibrix/chpc_gen/HCIHiSeqPipeline
SortSam.jar INPUT=A/A.mate.dup.bam OUTPUT=A/A.mate.dup.sort.bam CREATE_INDEX=true SO=coordinate COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=5000000 VERBOSITY=ERROR QUIET=true TMP_DIR=/scratch/ibrix/chpc_gen/HCIHiSeqPipeline
...OTHER COMMANDS...

In this case, you can treat Tomato as a proxy that runs your commands on our cluster and returns the results to you.
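
The walltime adjustment described under limitation 2 amounts to taking the smaller of the 240-hour limit and the time left until the scheduled downtime. A minimal Python sketch, with a hypothetical function name and ignoring the shorter 24-hour limit that applies to a user's 4th to 6th jobs:

```python
from datetime import datetime, timedelta

MAX_WALLTIME = timedelta(hours=240)


def walltime(submitted, next_downtime=None):
    """Return the walltime a job gets, capped by the next scheduled downtime."""
    if next_downtime is None:
        return MAX_WALLTIME
    return min(MAX_WALLTIME, next_downtime - submitted)


downtime = datetime(2012, 3, 6, 14, 0)  # 2012/03/06 2:00PM
print(walltime(datetime(2012, 3, 5, 13, 0), downtime).total_seconds() / 3600)  # 25.0
print(walltime(datetime(2012, 3, 5, 15, 0), downtime).total_seconds() / 3600)  # 23.0
```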

Appendix

A: List of labs and associated numbers

1 Adam Richardson

2 Agilent Genomics

3 Alana Welm

4 Albert Park

5 Alfred Cheung

6 Amnon Schlegel

7 Anderson Mayfield

8 Andy Weyrich

9 Anne Moon

10 Anne Rowley

11 Annika Svensson

12 Antonis Rokas

13 Balamurali Ambati

14 Baldomero Olivera

15 Barbara Graves

16 Beddhu Srinivasan

17 Ben Fitzpatrick

18 Ben Major

19 Betty Leibold

20 Bill Carrol

21 BioMicro Inc

22 Bob Weiss

23 Brad Cairns

24 Bradley Katz

25 Brandon Bentz

26 Brenda Bass

27 Brett Milash

28 Brian Dalley

29 Bryan Welm

30 C. Matthew Peterson

31 CJ Tsai

32 Carl Thummel

33 Carl Wittwer

34 Cathy Petti

35 Charles Murtaugh

36 Charles Parker

37 Chi-Bin Chien

38 Chris Ireland

39 Chris Lowe

40 Christian Con Yost

41 Christof Westenfelder

42 Christopher Gregg

43 Cicely Jette

44 Client

45 Colin Dale

46 Colin Thacker

47 Corrie Moreau

48 Courtney Weber

49 Curt Hagedorn

50 Dale Abel

51 Daniel Burgess

52 Daniel Dunn

53 Dave Viskochil

54 David Bearss

55 David Bull

56 David Gaffney

57 David Grunwald

58 David Jones

59 David Joyner

60 David Stillman

61 David Virshup

62 Dean Li

63 Dean Tantin

64 Deb Neklason

65 Denise Dearing

66 Dennis Winge

67 Diana Stafforini

68 Don Ayer

69 Don Blumenthal

70 Don McClain

71 Doug Grossman

72 Douglas Carrell

73 Dyamid Inc

74 Ed Levine

75 Eli Adashi

76 Elizabeth Sexton

77 Eric Huang

78 Eric Schmidt

79 Erik Andrulis

80 Erik Jorgensen

81 Estelle Harris

82 Frank Zhan

83 Gabrielle Kardon

84 Gary Drews

85 Gary Schoenwolf

86 George Rogers

87 Gerald Krueger

88 Gerald Spangrude

89 Gernot Presting

90 Gordon Lark

91 Grzegorz Bulaj

92 Guido Tricot

93 Guy Zimmerman

94 Hans Albertsen

95 Ingo Titze

96 Isabelle Carre

97 Ivor Benjamin

98 Jackie Panko

99 James Kushner

100 James Roach

101 Janis Weis

102 Jared Rutter

103 Jason Schwartz

104 Jason Stajich

105 Jay Agarwal

106 Jean-Marc Lalouel

107 Jeff McDonald

108 Jeff Shen

109 Jerry Kaplan

110 Jim Metherall

111 Jindrich Kopecek

112 JoAnn Ferrini

113 Jody Rosenblatt

114 Joe Yost

115 Joel Griffiths

116 John Atkins

117 John Hoidal

118 John Kriesel

119 John McDonald

120 John Phillips

121 John Weis

122 Josef Prchal

123 Joshua Schiffman

124 Joshua Udall

125 Julie Korenberg

126 June Round

127 Kael Fischer

128 Kania Stephen

129 Karen Zempolich

130 Karin Chen

131 Karl Voelkerding

132 Kathleen Light

133 Ken Norman

134 Ken Woycechowsky

135 Kent Golic

136 Kent Lai

137 Kevin Flanigan

138 Kevin Strait

139 Kim Davis

140 Kim Hanson

141 Kojo Elenitoba-Johnson

142 Kurt Albertine

143 Kurt Schibler

144 Larry Kraiss

145 Laurence Meyer

146 Leslie Sieburth

147 Li Wang

148 Lisa Cannon-Albright

149 Lor Randall

150 Lorise Gahring

151 Luca Brunelli

152 Lynn Jorde

153 Lyon Robison

154 Margaret Yu

155 Mario Capecchi

156 Mark Elstad

157 Mark Leppert

158 Mark Metzstein

159 Mark Yandell

160 Martin Tristani-Firouzi

161 Marty Slattery

162 Mary Beckerle

163 Mary Bronner

164 Matt Firpo

165 Matt Horton

166 Matt Mulvey

167 Matt Williams

168 Maureen Condic

169 Maurine Hobbs

170 Meg DeAngelis

171 Melissa Deadmond

172 Michael Deininger

173 Michael Deininger

174 Michael Franklin

175 Michael Lefevre

176 Michael McIntosh

177 Michel Slotman

178 Microarray Core Facility

179 Mike Howard

180 Mike Shapiro

181 Mike White

182 Min Zhang

183 Monica Vetter

184 Morton/Wang

185 Myriad Research

186 Nick Trede

187 Nicola Camp

188 Pat McAllister

189 Patrick Abbot

190 Patrick Tresco

191 Paul Shami

192 Perry Renshaw

193 Perry Ridge

194 Peter Beal

195 Phil Bernard

196 Philip Moos

197 Pinar Bayrak_Toydemir

198 Ping Xu

199 Plant Biotechnology Panjab University

200 Randall Burt

201 Randall Moon

202 Randy Jensen

203 Raoul Nelson

204 Ray Lee

205 Ray Warters

206 Ray White

207 Reid Robison

208 Richard Ajioka

209 Richard Dorsky

210 Richard Orlandi

211 Robert Fujinami

212 Robert Lane

213 Rong Mao

214 Runlin Ma

215 Sabine Fuhrmann

216 Sancy Leachman

217 Schickwann Tsai

218 Scott Edwards

219 Scott Summers

220 Sean Esplin

221 Sean Tavtigian

222 Shannon Odelberg

223 Sherrie Perkins

224 Stefan Pulst

225 Stephen DiFazio

226 Stephen Lessnick

227 Stephen Palumbi

228 Steve Blair

229 Steve Guthery

230 Steve Hunt

231 Steve Prescott

232 Steven Gray

233 Sunil Sharma

234 Susan Thibeault

235 Suzanne Mansour

236 TRAC

237 Teri Mauch

238 Thai Cao

239 Thomas Buckley

240 Tim Formosa

241 Tissue Resource and Application Lab

242 Tom McIntyre

243 Tom Parks

244 U of U Bioinformatics Shared Resource

245 Utah Autism Research Project

246 Vadim Gladyshev

247 Veronica Hinman

248 Vicente Planelles

249 Wenping Qiu

250 Wolf Samlowski

251 Wolfgang Baehr

252 Xinjian Chen

253 Yukio Saijoh

254 Zebrafish Molecular Genetics Core

255 Katie Ullman

256 Mahesh Chandrasekharan

257 Ivana De Domenico

258 Melinda Angus-Hill

259 Sankar Swaminathan

260 Bing-Jian Feng