Read Illumina expression data directly from IDAT files
Read Illumina BeadArray data from IDAT and manifest (.bgx) files for gene expression platforms.
read.idat(idatfiles, bgxfile, dateinfo = FALSE, annotation = "Symbol", tolerance = 0L, verbose = TRUE)
idatfiles |
character vector specifying idat files to be read in. |
bgxfile |
character string specifying bead manifest file (.bgx) to be read in. |
dateinfo |
logical. Should date and software version information be read in? |
annotation |
character vector of annotation columns to be read from the manifest file. |
tolerance |
integer. The number of probe ID discrepancies allowed between the manifest and any of the IDAT files. |
verbose |
logical. Should progress messages be sent to standard output? |
Illumina's BeadScan/iScan software outputs probe intensities in IDAT format (encrypted XML files) and uses probe information stored in a platform-specific manifest file (.bgx).
These files can be processed using the low-level functions readIDAT and readBGX from the illuminaio package (Smith et al. 2013).
The read.idat function provides a convenient way to read these files into R and to store them in an EListRaw-class object.
The function serves a similar purpose to read.ilmn, which reads text files exported by Illumina's GenomeStudio software, but it reads the IDAT files directly without any need to convert them first to text.
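The two routes described above can be sketched side by side. This is only an illustration: it assumes the working directory holds IDAT files and exactly one .bgx manifest, and that the limma and illuminaio packages are installed. The guards make the block do nothing when those assumptions fail.

```r
# Sketch only: assumes IDAT and .bgx files sit in the working directory.
idatfiles <- dir(pattern = "\\.idat$", ignore.case = TRUE)
bgxfile   <- dir(pattern = "\\.bgx$",  ignore.case = TRUE)

have_files <- length(idatfiles) > 0 && length(bgxfile) == 1
have_pkgs  <- requireNamespace("limma", quietly = TRUE) &&
              requireNamespace("illuminaio", quietly = TRUE)

if (have_files && have_pkgs) {
  # Low-level route: parse one IDAT file and the manifest separately.
  idat <- illuminaio::readIDAT(idatfiles[1])
  bgx  <- illuminaio::readBGX(bgxfile)
  str(idat, max.level = 1)
  str(bgx,  max.level = 1)

  # High-level route: a single call returns an EListRaw object.
  x <- limma::read.idat(idatfiles, bgxfile)
  print(class(x))
}
```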
The function reads information on control probes as well as for regular probes.
Probe types are indicated in the Status column of the genes component of the EListRaw object.
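As a small illustration of how the Status column might be used, the sketch below separates regular probes from control probes. It works on a toy stand-in rather than real read.idat output; the probe IDs, intensities and the Status labels ("regular", "negative", "housekeeping") are assumptions made for the example.

```r
# Toy stand-in for the genes and E components of an EListRaw object.
# Probe IDs, intensities and Status labels are made up for illustration.
genes <- data.frame(
  Probe_Id = c("ILMN_A", "ILMN_B", "ILMN_C", "ILMN_D"),
  Status   = c("regular", "regular", "negative", "housekeeping"),
  stringsAsFactors = FALSE
)
E <- matrix(c(120, 95, 60, 2000, 130, 90, 55, 2100), nrow = 4,
            dimnames = list(genes$Probe_Id, c("s1", "s2")))

# Count probes of each type.
status_counts <- table(genes$Status)

# Keep only the regular (non-control) probes.
is_regular <- genes$Status == "regular"
E_regular  <- E[is_regular, , drop = FALSE]
```

With a real object x returned by read.idat, the same idea would apply to x$genes$Status and x$E.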
The annotation argument specifies probe annotation columns to be extracted from the manifest file.
The manifest typically contains the following columns:
"Species", "Source", "Search_Key", "Transcript", "ILMN_Gene", "Source_Reference_ID", "RefSeq_ID", "Unigene_ID", "Entrez_Gene_ID", "GI", "Accession", "Symbol", "Protein_Product", "Probe_Id", "Array_Address_Id", "Probe_Type", "Probe_Start", "Probe_Sequence", "Chromosome", "Probe_Chr_Orientation", "Probe_Coordinates", "Cytoband", "Definition", "Ontology_Component", "Ontology_Process", "Ontology_Function", "Synonyms", "Obsolete_Probe_Id".
Note that "Probe_Id" and "Array_Address_Id" are always extracted and do not need to be included in the annotation argument.
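A request for several annotation columns might look like the sketch below. It assumes IDAT files and a single .bgx manifest in the working directory and an installed limma package; the choice of columns is arbitrary, and the guard skips the call when the files are absent.

```r
# Sketch only: the selection of annotation columns is an arbitrary example.
idatfiles <- dir(pattern = "\\.idat$", ignore.case = TRUE)
bgxfile   <- dir(pattern = "\\.bgx$",  ignore.case = TRUE)
wanted    <- c("Symbol", "Chromosome", "Entrez_Gene_ID")

# Probe_Id and Array_Address_Id are extracted regardless of the request,
# so these are the columns one would expect to find at a minimum.
expected_cols <- union(c("Probe_Id", "Array_Address_Id"), wanted)

if (length(idatfiles) > 0 && length(bgxfile) == 1 &&
    requireNamespace("limma", quietly = TRUE)) {
  x <- limma::read.idat(idatfiles, bgxfile, annotation = wanted)
  # Report any expected column that is absent from the genes component.
  print(setdiff(expected_cols, colnames(x$genes)))
}
```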
If more than tolerance probes in the manifest cannot be found in an IDAT file then the function will return an error.
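The tolerance rule can be pictured with a toy check. The helper below is hypothetical and is not limma's implementation; it simply counts manifest IDs missing from an IDAT file and stops when the count exceeds the tolerance. The array addresses are made up.

```r
# Hypothetical helper: NOT limma's code, only the rule stated above.
check_tolerance <- function(manifest_ids, idat_ids, tolerance = 0L) {
  n_missing <- sum(!(manifest_ids %in% idat_ids))
  if (n_missing > tolerance) {
    stop(n_missing, " manifest probes not found in IDAT file (tolerance = ",
         tolerance, ")")
  }
  invisible(n_missing)
}

manifest_ids <- c(10008L, 10010L, 10014L, 10017L)  # made-up array addresses
idat_ids     <- c(10008L, 10010L, 10014L)          # one address is absent

# One missing probe is accepted when tolerance = 1 ...
n_ok <- check_tolerance(manifest_ids, idat_ids, tolerance = 1L)

# ... but raises an error under the default tolerance of 0.
fails <- inherits(
  try(check_tolerance(manifest_ids, idat_ids, tolerance = 0L), silent = TRUE),
  "try-error"
)
```

Raising tolerance in a real call is only sensible when the mismatch is understood, for example a manifest version that differs slightly from the one used at scan time.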
An EListRaw object with the following components:
E |
numeric matrix of raw intensities. |
other$NumBeads |
numeric matrix of same dimensions as E. |
other$STDEV |
numeric matrix of same dimensions as E. |
genes |
data.frame of probe annotation.
This includes the |
targets |
data.frame of sample information.
This includes the IDAT file names plus other columns if dateinfo = TRUE. |
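To make the layout of the returned components concrete, the sketch below builds a plain list with the same component names and checks that their shapes agree. Everything in it is made up for illustration: the values, the probe IDs, and the targets column name IDATfile are assumptions, and a real EListRaw object is not a bare list.

```r
# Toy list mirroring the documented component names; all values are made up.
E <- matrix(c(120, 95, 60, 130, 90, 55), nrow = 3,
            dimnames = list(NULL, c("s1.idat", "s2.idat")))
x <- list(
  E       = E,
  other   = list(NumBeads = matrix(20, 3, 2), STDEV = matrix(5, 3, 2)),
  genes   = data.frame(Probe_Id = c("ILMN_A", "ILMN_B", "ILMN_C"),
                       Array_Address_Id = c(10008L, 10010L, 10014L)),
  targets = data.frame(IDATfile = c("s1.idat", "s2.idat"))  # assumed name
)

# The extra matrices match E, genes has one row per probe,
# and targets has one row per array.
layout_ok <- identical(dim(x$other$NumBeads), dim(x$E)) &&
  identical(dim(x$other$STDEV), dim(x$E)) &&
  nrow(x$genes) == nrow(x$E) &&
  nrow(x$targets) == ncol(x$E)
```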
Matt Ritchie
Smith ML, Baggerly KA, Bengtsson H, Ritchie ME, Hansen KD (2013). illuminaio: An open source IDAT parsing tool. F1000 Research 2, 264. http://f1000research.com/articles/2-264/
read.ilmn imports gene expression data output by GenomeStudio.
neqc performs normexp by control background correction, log transformation and quantile between-array normalization for Illumina expression data.
propexpr estimates the proportion of expressed probes in a microarray.
detectionPValues computes detection p-values from the negative controls.
## Not run:
idatfiles <- dir(pattern="idat")
bgxfile <- dir(pattern="bgx")
x <- read.idat(idatfiles, bgxfile)
x$other$Detection <- detectionPValues(x)
propexpr(x)
y <- neqc(x)
## End(Not run)