The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data

Milnthorpe, Andrew T and Soloviev, Mikhail

(2012)

Milnthorpe, Andrew T and Soloviev, Mikhail (2012) The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data. PLoS One, 7 (3).

Abstract

EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for determining the suitability of EST data for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and for quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding "tissue-specific" genes, and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin. It can be applied for tissue typing and quality control of libraries as small as a few hundred total ESTs. Furthermore, we can detect correlations between related tissues (e.g. brain and peripheral nervous tissue, heart and muscle tissues) and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation, and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging.
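
The tissue-typing idea in the abstract, correlating a library's EST counts against a small expression matrix, can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not state which correlation measure is used, so Pearson correlation is assumed here, and the cluster names, tissue profiles and the function names `pearson` and `rank_tissues` are hypothetical placeholders rather than real UniGene data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no tissue signal
    return cov / sqrt(var_x * var_y)


def rank_tissues(library_counts, reference_matrix, cluster_ids):
    """Rank reference tissues by correlation with one library.

    library_counts: dict cluster id -> EST count (missing clusters count as 0)
    reference_matrix: dict tissue -> dict cluster id -> expression value
    cluster_ids: ordered list of the marker clusters in the matrix
    """
    library_vector = [library_counts.get(c, 0) for c in cluster_ids]
    scores = []
    for tissue, profile in reference_matrix.items():
        tissue_vector = [profile.get(c, 0) for c in cluster_ids]
        scores.append((tissue, pearson(library_vector, tissue_vector)))
    # Highest correlation first: the top entry is the candidate tissue of origin.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four marker clusters, three reference tissues.
clusters = ["cluster_A", "cluster_B", "cluster_C", "cluster_D"]
reference = {
    "brain": {"cluster_A": 9, "cluster_B": 8, "cluster_C": 0, "cluster_D": 1},
    "heart": {"cluster_A": 0, "cluster_B": 1, "cluster_C": 9, "cluster_D": 7},
    "muscle": {"cluster_A": 1, "cluster_B": 0, "cluster_C": 7, "cluster_D": 9},
}
library = {"cluster_A": 5, "cluster_B": 4, "cluster_D": 1}

ranked = rank_tissues(library, reference, clusters)
for tissue, r in ranked:
    print(f"{tissue}: r = {r:.3f}")
```

In a sketch like this, a library whose best correlation is low or close to that of several unrelated tissues would be a candidate for closer inspection, for example as a pooled, normalised or mis-annotated library; any threshold for that decision would have to come from the paper itself.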

Information about this Version

This is a Submitted version
This version's date is: 2012
This item is not peer reviewed

Link to this Version

https://repository.royalholloway.ac.uk/items/969a9dcb-5de9-85de-d016-15d92b1eb4c9/6/

Item Type: Journal Article
Title: The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
Authors: Milnthorpe, Andrew T; Soloviev, Mikhail
Departments: Faculty of Science\Biological Science

Identifiers

DOI: http://dx.doi.org/10.1371/journal.pone.0032966

Deposited by Research Information System (atira) on 22-Jul-2014 in Royal Holloway Research Online. Last modified on 22-Jul-2014.

