Hierarchical clustering in R

You can generate hierarchical cluster diagrams (dendrograms) of your microarray data in R with only a few commands. Starting with your data in a tab-delimited text file (a file produced by the AgilentFilter, for example), run the following commands in R:

Read the data file into a data frame.

df=read.delim("your_data_file")

Extract the intensity values from the data frame and convert them into a matrix. In this example the intensity columns contain the word "Signal" in their names, so I use the "grep" function to identify those columns, and store the column numbers in the variable "datacolumns".

datacolumns=grep("Signal",names(df))
m=as.matrix(df[datacolumns])
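
As a rough illustration of the column selection, the sketch below builds a small stand-in data frame in place of a real data file. The column names (ProbeName, Sample1.Signal, Sample1.Flag and so on) are invented for the example and are not taken from any particular AgilentFilter output.

```r
# Toy data frame standing in for a real microarray file.
# All column names here are hypothetical.
df <- data.frame(
  ProbeName      = c("p1", "p2", "p3"),
  Sample1.Signal = c(10.5, 200.1, 35.0),
  Sample2.Signal = c(12.0, 180.7, 40.2),
  Sample1.Flag   = c(0, 0, 1)
)

# grep returns the positions of the column names that match the pattern
datacolumns <- grep("Signal", names(df))

# Keep only the matching columns and convert them to a matrix
m <- as.matrix(df[datacolumns])

# Sanity checks: the matrix should be numeric, with one column per sample
print(datacolumns)
print(is.numeric(m))
print(dim(m))
```

If is.numeric(m) returns FALSE, the pattern probably matched a non-numeric column, and a more specific pattern is needed before going on to the distance calculation.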

Transpose the matrix. The "dist" function used in the next step computes distances between the rows of a matrix, but matrix "m" contains one column for each sample, so the samples have to be moved into the rows.

n=t(m)

Calculate the distances between the samples.

d=dist(n,method="euclidean")
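
To see why the transpose matters, here is a minimal sketch on a made-up 3-probe, 3-sample matrix. The sample names A, B and C and the values are assumptions chosen so that the distances are easy to check by hand.

```r
# Hypothetical matrix: rows are probes, columns are samples A, B and C
m <- cbind(A = c(1, 2, 3),
           B = c(1, 2, 4),
           C = c(10, 20, 30))

# After transposing, each row is one sample
n <- t(m)

# dist compares rows, so this yields one distance per pair of samples
d <- dist(n, method = "euclidean")
print(d)

# The same distances as a full symmetric matrix
dm <- as.matrix(d)
print(dm["A", "B"])
```

Samples A and B differ only in the third probe (3 versus 4), so their Euclidean distance is 1, while sample C lies far from both. Without the transpose, dist would have compared the probes with each other instead of the samples.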

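Do the clustering. In R 3.1.0 and later, the method that older versions called "ward" is named "ward.D"; a second variant, "ward.D2", implements Ward's criterion when given ordinary (unsquared) Euclidean distances.

h=hclust(d,method="ward.D")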

Plot the result.

plot(h)
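
The whole procedure can be tried without a real data file. The sketch below is one possible end-to-end run under stated assumptions: the intensities are simulated, the sample names (A1, A2, B1, B2) are invented, and the method name "ward.D" assumes R 3.1.0 or later, where it is the current name for the method older versions called "ward". The dendrogram is written to a temporary PDF file rather than drawn on screen.

```r
set.seed(1)  # make the simulated values reproducible
probes <- 50

# Hypothetical data: two samples per group, group B shifted upward by 5
sim <- data.frame(
  ProbeName = paste0("probe", seq_len(probes)),
  A1.Signal = rnorm(probes, mean = 8),
  A2.Signal = rnorm(probes, mean = 8),
  B1.Signal = rnorm(probes, mean = 13),
  B2.Signal = rnorm(probes, mean = 13)
)

# Write the simulated data as a tab-delimited text file
infile <- tempfile(fileext = ".txt")
write.table(sim, infile, sep = "\t", quote = FALSE, row.names = FALSE)

# The steps from the text, applied to that file
df <- read.delim(infile)
datacolumns <- grep("Signal", names(df))
m <- as.matrix(df[datacolumns])
d <- dist(t(m), method = "euclidean")
h <- hclust(d, method = "ward.D")

# Save the dendrogram to a PDF file
outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
plot(h)
dev.off()
```

With data simulated this way, the two A samples should join each other before either joins a B sample. To draw the dendrogram in an interactive session instead, drop the pdf and dev.off calls and call plot(h) directly.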