Sunday, January 7

Empirical Study of Gene Ontology based Microarray Clustering


This thesis project studies several existing similarity measures over the Gene Ontology and introduces a new measure that is combined with Euclidean distance for microarray analysis. The combined measures incorporate both expression data and known biological information from the Gene Ontology, so that they better reflect the actual biological relationships between gene products. To adapt the similarity measure to the Gene Ontology, an On-The-Fly probability is first defined to calculate the probability of a term within the current problem space. A similarity measure between a term and a set of terms is then defined, as well as a similarity measure between sets of terms. The measures are compared by clustering a dataset whose correct clustering is known. The results of the comparison are analyzed, and conclusions are drawn about the similarity measures.
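As a rough illustration of what combining a Gene Ontology similarity with Euclidean distance could look like, the sketch below blends the two into a single distance. The abstract does not state how the thesis combines them, so the weighted sum, the weight alpha, the normalisation by max_expr_dist, and the function names are all assumptions made for this example and not the thesis's actual definition. The GO similarity is taken as an already computed number in [0, 1].

```python
import math


def euclidean(x, y):
    """Plain Euclidean distance between two equal-length expression profiles."""
    if len(x) != len(y):
        raise ValueError("expression profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_distance(expr_a, expr_b, go_similarity, max_expr_dist, alpha=0.5):
    """Hypothetical blend of expression distance and GO dissimilarity.

    Assumptions (not taken from the thesis):
      - go_similarity is a precomputed value in [0, 1]
      - the expression distance is scaled into [0, 1] by max_expr_dist
      - the two parts are mixed linearly with weight alpha
    """
    if not 0.0 <= go_similarity <= 1.0:
        raise ValueError("go_similarity must lie in [0, 1]")
    if max_expr_dist <= 0:
        raise ValueError("max_expr_dist must be positive")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Scale the expression distance into [0, 1], clipping anything larger.
    expr_term = min(euclidean(expr_a, expr_b) / max_expr_dist, 1.0)
    # Turn the similarity into a dissimilarity so both terms point the same way.
    go_term = 1.0 - go_similarity
    return alpha * expr_term + (1.0 - alpha) * go_term


# Two genes with expression distance 5 (scaled to 0.5) and GO similarity 0.8.
d = combined_distance([0.0, 0.0], [3.0, 4.0], go_similarity=0.8, max_expr_dist=10.0)
```

With alpha set to 1 the result reduces to the scaled expression distance alone, and with alpha set to 0 it depends only on the Gene Ontology term, which makes it easy to see how much each source of information contributes to a clustering.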

