Normalised Affymetrix expression data are biased by G-stacks

Shanahan, Hugh, Harrison, Andrew, Upton, Graham and Memon, Farhat

(2012)

Shanahan, Hugh, Harrison, Andrew, Upton, Graham and Memon, Farhat (2012) Normalised Affymetrix expression data are biased by G-stacks. Nucleic Acids Research, 40 (8).

Abstract

Probes whose sequences contain runs of four or more guanines (G-stacks) can exhibit a level of hybridisation that is unrelated to the expression level of the mRNA they are intended to measure. The most likely cause is the formation of G-quadruplexes, in which guanines on different probes form Hoogsteen hydrogen bonds with one another; probes with G-stacks are capable of forming these structures. We demonstrate that, for a specific microarray data set using the Human HG_U133A Affymetrix GeneChip and RMA normalisation, there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalisation pipelines; two of them, FARMS and PLIER, minimised the bias to some extent. We estimate that approximately 15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The benefit of eliminating these probes from any analysis of such affected data sets outweighs the resulting increase in signal noise.
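The abstract defines a G-stack as a run of four or more guanines in a probe sequence and recommends excluding such probes from analysis. A minimal sketch of that screening step is given below. It is an illustration only, not the authors' pipeline: the function names, the `(probe_set_id, sequence)` input layout, and the example sequences are all assumptions made here, and real probe sequences would have to come from the array's annotation files.

```python
import re
from collections import Counter

# G-stack as defined in the abstract: 4 or more consecutive guanines.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(sequence: str) -> bool:
    """Return True if the probe sequence contains a run of >= 4 guanines."""
    return G_STACK.search(sequence.upper()) is not None


def count_g_stack_probes(probes):
    """Count G-stack probes per probe set.

    `probes` is an iterable of (probe_set_id, sequence) pairs; this layout
    is a hypothetical one chosen for the example.
    """
    counts = Counter()
    for probe_set_id, sequence in probes:
        counts[probe_set_id] += int(has_g_stack(sequence))
    return dict(counts)


def drop_g_stack_probes(probes):
    """Return only the probes that contain no G-stack."""
    return [(ps, seq) for ps, seq in probes if not has_g_stack(seq)]


# Made-up probe set IDs and sequences, for illustration only.
example = [
    ("set_A", "ACGTGGGGACGTACGTACGTACGTA"),  # contains GGGG
    ("set_A", "ACGTACGTACGTACGTACGTACGTA"),  # no guanine run
    ("set_B", "TTGGGTTGGGTTGGGTTGGGTTGGG"),  # runs of 3 only, not a G-stack
]
print(count_g_stack_probes(example))  # {'set_A': 1, 'set_B': 0}
```

The per-probe-set count matters because the abstract reports that the bias grows as the number of G-stack probes in a probe set increases; filtering with `drop_g_stack_probes` before summarisation corresponds to the elimination step the abstract recommends.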

Information about this Version

This is a Submitted version
This version's date is: 1/4/2012
This item is not peer reviewed

Link to this Version

https://repository.royalholloway.ac.uk/items/4441e97f-9fff-5366-b9e1-242c1c1f7c60/1/

Item Type: Journal Article
Title: Normalised Affymetrix expression data are biased by G-stacks
Authors: Shanahan, Hugh; Harrison, Andrew; Upton, Graham; Memon, Farhat
Uncontrolled Keywords: microarray, G-quadruplex
Departments: Faculty of Science\Computer Science

Identifiers

DOI: http://dx.doi.org/10.1093/nar/gkr1230

Deposited by Research Information System (atira) on 10-Jul-2012 in Royal Holloway Research Online. Last modified on 10-Jul-2012

