
Even for gene expression alone, there are many different types of microarray.  Each type may have one or more preprocessing methods, which typically include identification of foreground and background, background correction, normalization and possibly summarization of probes into probesets.  This preprocessing can have profound effects on downstream analysis.  Even when the normalized data are similar (highly correlated), different normalizations of the same data often lead to different lists of differentially expressed genes, different gene clusters and so on.
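To make the comparison concrete, here is a minimal sketch that applies two common normalizations, median centering and quantile normalization, to the same simulated log2 intensities and then compares the results. Everything in it is an illustrative assumption rather than part of any particular microarray pipeline: the data are random numbers, the function names (median_center, quantile_normalize, top_probes) are made up for this example, and "differential expression" is reduced to a crude difference of group means.

```python
import numpy as np

def median_center(log_expr):
    """Subtract each array's (column's) median so every array has median 0."""
    return log_expr - np.median(log_expr, axis=0, keepdims=True)

def quantile_normalize(log_expr):
    """Give every array (column) the same empirical distribution.

    Ties are broken by order of appearance, which is a simplification.
    """
    order = np.argsort(log_expr, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")      # rank of each probe within its array
    reference = np.sort(log_expr, axis=0).mean(axis=1)    # mean of the k-th smallest values
    return reference[ranks]

def top_probes(normalized, k=20):
    """Crude ranking: largest absolute difference between the two group means."""
    diff = normalized[:, 2:].mean(axis=1) - normalized[:, :2].mean(axis=1)
    return set(np.argsort(-np.abs(diff))[:k].tolist())

# Simulated data (assumption): 200 probes on 4 arrays, 2 per group,
# with an arbitrary scale and shift for each array.
rng = np.random.default_rng(0)
scale = np.array([1.0, 1.1, 0.9, 1.0])
shift = np.array([0.0, 0.5, -0.3, 0.2])
log_expr = rng.normal(8.0, 2.0, size=(200, 4)) * scale + shift

mc = median_center(log_expr)
qn = quantile_normalize(log_expr)

correlation = np.corrcoef(mc.ravel(), qn.ravel())[0, 1]
overlap = len(top_probes(mc) & top_probes(qn))
print(f"correlation between normalized values: {correlation:.3f}")
print(f"probes shared by the two top-20 lists: {overlap}")
```

The two printed numbers are the point of the exercise: the correlation shows how similar the normalized values are, and the overlap shows how far the two top-20 lists agree. With real data the same comparison can be made using whichever normalizations are available for the platform.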

The plethora of methods is one of the reasons that bioinformatics analyses are not always reproducible.  It is important to keep a script of what you have done (including the version of any annotation files that you used) so that you can reproduce your analysis and pass it on to anyone else who needs it.  It is also important to think about whether you expect the assumptions behind the preprocessing to hold for your study.  For example, the assumptions behind standard normalizations will not hold if the treatments might enhance RNA degradation or shut down transcription, because these methods generally assume that most genes do not change and that overall expression levels are comparable across arrays.  Think things through, preferably before taking the tissue samples but certainly before trying to interpret statistical results.
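One way to keep such a record alongside the analysis script is sketched below. It is only an illustration under stated assumptions: the function names (sha256_of, package_version, provenance_record), the settings dictionary and the file name annotation.tsv are all hypothetical, and a temporary file stands in for a real annotation file so that the example runs on its own.

```python
import hashlib
import json
import platform
import tempfile
from datetime import datetime, timezone
from importlib import metadata
from pathlib import Path

def sha256_of(path):
    """Checksum of a file, so the exact annotation version can be verified later."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def package_version(name):
    """Installed version of a package, or a placeholder if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"

def provenance_record(annotation_path, packages, settings):
    """Collect what is needed to rerun the preprocessing the same way."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "python": platform.python_version(),
        "packages": {name: package_version(name) for name in packages},
        "annotation_file": Path(annotation_path).name,
        "annotation_sha256": sha256_of(annotation_path),
        "settings": settings,
    }

# Stand-in annotation file (assumption): a real analysis would point at the
# actual annotation file that was used.
content = b"probe_id\tgene_symbol\n"
with tempfile.TemporaryDirectory() as workdir:
    annotation_path = Path(workdir) / "annotation.tsv"
    annotation_path.write_bytes(content)
    record = provenance_record(
        annotation_path,
        packages=["numpy"],
        settings={"background_correction": "none", "normalization": "quantile"},
    )

print(json.dumps(record, indent=2))
```

Saving this record as a JSON file next to the results makes it possible to check later whether a rerun used the same annotation file and the same preprocessing choices.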