Software:AgilentFilter

The AgilentFilter program is a tool for filtering and reformatting the raw data .txt files from the Agilent microarray platform. Have you ever wanted to remove the positive and negative control spots from a data set, or extract just the gene annotations and pertinent data columns from a set of Agilent array files, or filter out any weak or low quality spots from an Agilent data set? AgilentFilter can do all of this and more.

The AgilentFilter program can run on Windows PCs, Macs, or computers running the Linux operating system. AgilentFilter requires the Java Runtime Environment, which is available here.

Launch AgilentFilter via WebStart

Click one of the links below to launch the AgilentFilter program. AgilentFilter requires quite a bit of computer memory. Choose the largest option that will fit within your computer's RAM:

Using AgilentFilter

To use the AgilentFilter program, you need to select some Agilent input data files, specify an output file, choose some annotation, data, and quality control values to output, optionally select some filtering options, and choose a format for your output file. Each of these actions is performed on a separate "tab" in the program.

Choosing input data files

Click on the "Input Files" tab, and then the "Add..." button. Browse around your files to find the first data file you'd like to filter, and click the "Open" button. Click the "Add..." button again to add more files.

You can also drag and drop input files into the AgilentFilter program.

Choosing an output file

Click on the "Output File" tab, and then the "Select..." button. Locate the folder or directory where you'd like the output to be placed, and type the output file name into the "Save As:" box. If the output file already exists and you'd like to overwrite it, simply select the existing file and click "Save".

Annotation columns

You can specify which gene annotation columns to include in your output file using check boxes on the "Annotation Columns" tab. These values will be the first columns in your output file, and will be drawn from the first data file in your list of input files.

Data and quality control columns

The checkboxes on the "Data Columns" and "Quality Control Columns" tabs allow you to choose which data and quality control values to include in your output file. These columns are drawn from all of your input data files.

Data summarization

Data can be presented at the spot level (i.e. no summarization), probe level (averaged for each probe on the array), or gene level (averaged for each gene). The data summarization controls are on the "Filters & Format" tab.

Spot level data is output exactly as it appears in the input files. In addition, the FeatureNum column is added to the output file.

When files are summarized at the probe level, data and quality values are averaged for each unique probe on the array. Also, the ProbeName column is added to the output.

When files are summarized at the gene level, data and quality values are averaged for each unique transcript on the array (i.e. for each unique SystematicName value). The SystematicName column is added to the output.
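
The probe-level and gene-level averaging described above can be pictured with a short Python sketch. This is not AgilentFilter's own code: the summarize function, the probe and transcript names, and the choice to average directly over spots at the gene level are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

def summarize(rows, key, value_columns):
    """Average value_columns over all rows that share the same value in the key column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    summary = []
    for name, members in groups.items():
        averaged = {key: name}  # the grouping column is carried into the output
        for col in value_columns:
            averaged[col] = mean(float(m[col]) for m in members)
        summary.append(averaged)
    return summary

# Hypothetical spot-level rows (identifiers are made up for this example).
spots = [
    {"FeatureNum": 1, "ProbeName": "A_23_P1", "SystematicName": "NM_0001", "gProcessedSignal": 100.0},
    {"FeatureNum": 2, "ProbeName": "A_23_P1", "SystematicName": "NM_0001", "gProcessedSignal": 200.0},
    {"FeatureNum": 3, "ProbeName": "A_23_P2", "SystematicName": "NM_0001", "gProcessedSignal": 600.0},
]

probe_level = summarize(spots, "ProbeName", ["gProcessedSignal"])      # one row per ProbeName
gene_level = summarize(spots, "SystematicName", ["gProcessedSignal"])  # one row per SystematicName
```

Here the two spots of the first probe collapse into one probe-level row, and all three spots collapse into a single gene-level row because they share a SystematicName.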

Filtering

Various filtering options are available on the "Filters & Format" tab. By default the positive and negative controls are removed. The IsFound, IsPosAndSignif, and IsWellAboveBG values for the red and green channels can be used to filter by progressively more stringent intensity thresholds. The IsFeatNonUnifOL and IsBGNonUnifOL controls enable removal of non-uniform features. Population outliers can be removed with the IsFeatPopnOL and IsBGPopnOL controls. Manually flagged features can be removed using the IsManualFlag control. Saturated features are removed using the IsSaturated flags.
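
As a rough illustration of what these filters do to a data set, the following Python sketch keeps only spots whose flag columns have the required values. The keep_spot function is hypothetical, and the control-spot test assumes a ControlType column in which 0 marks a non-control spot, which this page does not state.

```python
def keep_spot(row, remove_controls=True, require=None):
    """Return True if a spot passes the control filter and every required flag value."""
    require = require or {}
    # Assumption: ControlType is 0 for ordinary spots and nonzero for controls.
    if remove_controls and int(row.get("ControlType", 0)) != 0:
        return False
    return all(int(row[flag]) == wanted for flag, wanted in require.items())

# Hypothetical spots: an ordinary one, a control, and a non-uniform outlier.
rows = [
    {"FeatureNum": 1, "ControlType": 0, "gIsFeatNonUnifOL": 0, "gIsWellAboveBG": 1},
    {"FeatureNum": 2, "ControlType": 1, "gIsFeatNonUnifOL": 0, "gIsWellAboveBG": 1},
    {"FeatureNum": 3, "ControlType": 0, "gIsFeatNonUnifOL": 1, "gIsWellAboveBG": 1},
]

# Require a uniform feature that is well above background in the green channel.
settings = {"gIsFeatNonUnifOL": 0, "gIsWellAboveBG": 1}
kept = [r["FeatureNum"] for r in rows if keep_spot(r, require=settings)]
```

Only the first spot survives: the second is a control and the third is flagged as a non-uniformity outlier.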

Data transformations

Data columns can be transformed to the log (base 2) scale using the "Log2 transform data" checkbox, and can be normalized by checking the "Normalize quantiles" checkbox.
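
For readers unfamiliar with these two transformations, here is a minimal Python sketch of each. It shows the standard textbook form of quantile normalization (ties are not treated specially) and is not necessarily how AgilentFilter implements it; the function names are invented for this example.

```python
import math

def log2_transform(values):
    """Convert positive intensities to the log (base 2) scale."""
    return [math.log2(v) for v in values]

def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted arrays at each rank."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of values")
    sorted_arrays = [sorted(a) for a in arrays]
    # Mean across arrays of the smallest values, the second smallest, and so on.
    rank_means = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # spot indices from lowest to highest value
        result = [0.0] * n
        for rank, idx in enumerate(order):
            result[idx] = rank_means[rank]
        normalized.append(result)
    return normalized

logged = log2_transform([8.0, 1.0])
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

In this sketch the arrays must contain the same number of values, because the rank means are computed position by position across the sorted arrays.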

Formatting

The only output format available in the current version of AgilentFilter is the tab-delimited text format. More formats will be added in future releases.
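
To make the shape of the output concrete, the sketch below assembles a tab-delimited table in Python with annotation columns taken from the first input file and data columns taken from every input file. The build_output function, the "file:column" header naming, and the sample values are assumptions for illustration; the actual column headers written by AgilentFilter may differ.

```python
import csv
import io

def build_output(files, annotation_columns, data_columns):
    """files maps each file name to its list of row dicts, all in the same spot order."""
    names = list(files)
    first = files[names[0]]  # annotation values come from the first input file
    header = list(annotation_columns) + [
        f"{name}:{col}" for name in names for col in data_columns
    ]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for i, row in enumerate(first):
        line = [row[c] for c in annotation_columns]
        for name in names:  # data values come from every input file
            line += [files[name][i][c] for c in data_columns]
        writer.writerow(line)
    return out.getvalue()

# Hypothetical input: two files with one spot each.
files = {
    "a.txt": [{"GeneName": "TP53", "gProcessedSignal": 10.5}],
    "b.txt": [{"GeneName": "TP53", "gProcessedSignal": 12.0}],
}
table = build_output(files, ["GeneName"], ["gProcessedSignal"])
```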

Suggested Settings

Annotation Columns

For use with GeneSifter, select the SystematicName and GeneName columns. For other analyses choose whichever columns you want.

Data Columns

  • Agilent Two-color experiments: gProcessedSignal and rProcessedSignal
  • Agilent One-color experiments: gProcessedSignal
  • Agilent miRNA arrays: gTotalGeneSignal

Quality Control Columns

  • Agilent Two-color experiments: gIsWellAboveBG and rIsWellAboveBG
  • Agilent One-color experiments: gIsWellAboveBG
  • Agilent miRNA arrays: gIsGeneDetected

Filters and Format

Gene Expression Arrays

  • Data Summarization: Probe Level
  • Filters:
    • Remove control spots
    • Require gIsFeatNonUnifOL=0, rIsFeatNonUnifOL=0
    • Require gIsFeatPopnOL=0, rIsFeatPopnOL=0
  • Data transformations: Log2 transform data

miRNA Arrays

  • Data Summarization: Gene Level
  • Filters: Remove control spots
  • Data transformations: None, because some of the intensity values will be less than 0. If you want log-transformed data, filter the data files first, then adjust these values in another program (Excel, for example) before applying the log transformation.
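
One possible way to make the adjustment mentioned for miRNA arrays is to raise every value below a chosen floor up to that floor before taking logs. The Python sketch below does this; the floor of 1.0 is an arbitrary assumption, not a recommendation from this page, and other adjustments may suit your analysis better.

```python
import math

def floor_then_log2(values, floor=1.0):
    """Raise values below the floor up to the floor, then take log base 2."""
    # Assumption: 1.0 is used as the floor so that floored values map to log2 = 0.
    return [math.log2(max(v, floor)) for v in values]

adjusted = floor_then_log2([-3.2, 0.0, 8.0])
```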

Questions?

Contact Brett Milash in the Bioinformatics Core.

Revision History

  • 02/04/2011 - Added support for miRNA arrays and drag-and-drop to select input files.
  • 01/22/2009 - Added log2 transformation and quantile normalization. Note: quantile normalization will fail if the arrays have different numbers of missing values.
  • 08/18/2008 - Added 3.0 Gb version.
  • 08/05/2008 - Includes progress indicator, cancel button, and support for one-color gene expression experiments.
  • 02/06/2008 - Initial version.