Raw data, the output of an array scanner, is usually stored as a flat
text file. The raw data file records detailed statistics of the fluorescence
intensities: the measurement values as well as quality measures.
The first step in microarray data analysis is to extract data from
the raw data files for each individual array and to merge the raw
data into one or a few files that are suitable for further analysis.
Most researchers extract only one or two columns from each raw
data file and merge those columns across arrays.
MicroHelper provides a tool for merging raw data, provided that the raw data
files all share an identical format. Perl Example 5 --- data merge
also provides a simple Perl script for merging raw data.
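
The merge step can also be sketched in a few lines of Python. This is only an illustration, not the MicroHelper tool or the Perl script mentioned above. It assumes tab-delimited files with a header row, and the column names "ID" and "Intensity" and the function name merge_raw_data are hypothetical placeholders for whatever a particular scanner writes.

```python
import csv
import io

def merge_raw_data(sources, id_col="ID", value_col="Intensity"):
    """Merge one column from several raw data tables into a single table.

    sources maps an array name to an open text file (or any iterable of
    lines) holding tab-delimited data with a header row. The column names
    are assumptions; adjust them to the scanner's actual output.
    """
    merged_ids = None
    columns = {}
    for name, handle in sources.items():
        rows = list(csv.DictReader(handle, delimiter="\t"))
        ids = [row[id_col] for row in rows]
        if merged_ids is None:
            merged_ids = ids
        elif ids != merged_ids:
            # Refuse to merge arrays whose identifier columns do not line up.
            raise ValueError(f"{name}: identifier column differs from the first array")
        columns[name] = [row[value_col] for row in rows]

    # One header row, then one row per spot with a value from every array.
    table = [[id_col] + list(columns)]
    for i, spot_id in enumerate(merged_ids or []):
        table.append([spot_id] + [columns[name][i] for name in columns])
    return table

# Two tiny in-memory "files" stand in for real raw data files.
array1 = io.StringIO("ID\tIntensity\tFlag\ng1\t100\t0\ng2\t250\t0\n")
array2 = io.StringIO("ID\tIntensity\tFlag\ng1\t120\t0\ng2\t240\t1\n")
table = merge_raw_data({"array1": array1, "array2": array2})
for row in table:
    print("\t".join(row))
```

Stopping with an error when the identifier columns disagree is a deliberate choice in this sketch: a silent merge of misaligned rows is much harder to detect later than a failed merge.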
Common mistakes to avoid
1. Make sure all data have the identical format.
2. Data will be in a different format if a text file has been opened
and saved as an Excel file.
3. Check the Godlist --- gene info columns. The Godlist may differ between arrays for
the following reasons:
- Mistakes have been found and corrected in the most recent arrays, but not in earlier ones.
- New clones/controls have been added for recent printing.
- Technicians may have changed their minds on whether to include controls in the raw data file.
- Different arrays are actually used for the experiments.
4. For data that have been saved into a database and then extracted from the database, uncaught
mistakes in the data extraction process may corrupt the data.
5. Refrain from using copy and paste to move data. Copy and paste is error-prone: both humans
and machines can make mistakes in the process.
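
Checks 1 and 3 in the list above can be automated before any merge is attempted. The following Python sketch compares every file against the first one and reports differences in the header, the row count, and the identifier column. It assumes tab-delimited text with a header row; the column name "ID" and the function name find_format_problems are hypothetical.

```python
import csv
import io

def find_format_problems(sources, id_col="ID"):
    """Compare every array against the first one and list the differences.

    sources maps an array name to an open text file (or any iterable of
    lines). The identifier column name is an assumption.
    """
    problems = []
    reference = None  # (name, header, ids) of the first usable array
    for name, handle in sources.items():
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        if id_col not in header:
            problems.append(f"{name}: no '{id_col}' column in header")
            continue
        idx = header.index(id_col)
        ids = [row[idx] if len(row) > idx else "" for row in reader]
        if reference is None:
            reference = (name, header, ids)
            continue
        ref_name, ref_header, ref_ids = reference
        if header != ref_header:
            problems.append(f"{name}: header differs from {ref_name}")
        if len(ids) != len(ref_ids):
            problems.append(f"{name}: {len(ids)} rows, {ref_name} has {len(ref_ids)}")
        else:
            changed = sum(a != b for a, b in zip(ids, ref_ids))
            if changed:
                problems.append(f"{name}: {changed} identifiers differ from {ref_name}")
    return problems

# The newer array has a corrected clone name and an added control.
old = io.StringIO("ID\tIntensity\ng1\t100\ng2\t250\n")
new = io.StringIO("ID\tIntensity\ng1\t120\ng2b\t240\nctrl1\t15\n")
problems = find_format_problems({"old": old, "new": new})
for line in problems:
    print(line)
```

An empty result means only that the headers and identifier columns agree. It does not rule out the other problems in the list, such as values corrupted during extraction from a database.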