Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

readGFF

Reads a file in GFF format


Description

Reads a file in GFF format and creates a data frame or DataFrame object from it. This is a low-level function that should not be called by user code.

Usage

readGFF(filepath, version=0,
        columns=NULL, tags=NULL, filter=NULL, nrows=-1,
        raw_data=FALSE)

GFFcolnames(GFF1=FALSE)

Arguments

filepath

A single string containing the path or URL to the file to read. Alternatively can be a connection.

version

readGFF should do a pretty descent job at detecting the GFF version. Use this argument only if it doesn't or if you want to force it to parse and import the file as if its 9-th column was in a different format than what it really is (e.g. specify version=1 on a GTF or GFF3 file to interpret its 9-th column as the "group" column of a GFF1 file). Supported versions are 1, 2, and 3.

columns

The standard GFF columns to load. All of them are loaded by default.

tags

The tags to load. All of them are loaded by default.

filter
nrows

-1 or the maximum number of rows to read in (after filtering).

raw_data
GFF1

Value

A DataFrame with columns corresponding to those in the GFF.

Author(s)

H. Pages

See Also

Examples

## Standard GFF columns.
GFFcolnames()
GFFcolnames(GFF1=TRUE)  # "group" instead of "attributes"

tests_dir <- system.file("tests", package="rtracklayer")
test_gff3 <- file.path(tests_dir, "genes.gff3")

## Load everything.
df0 <- readGFF(test_gff3)
head(df0)

## Load some tags only (in addition to the standard GFF columns).
my_tags <- c("ID", "Parent", "Name", "Dbxref", "geneID")
df1 <- readGFF(test_gff3, tags=my_tags)
head(df1)

## Load no tags (in that case, the "attributes" standard column
## is loaded).
df2 <- readGFF(test_gff3, tags=character(0))
head(df2)

## Load some standard GFF columns only (in addition to all tags).
my_columns <- c("seqid", "start", "end", "strand", "type")
df3 <- readGFF(test_gff3, columns=my_columns)
df3
table(df3$seqid, df3$type)
makeGRangesFromDataFrame(df3, keep.extra.columns=TRUE)

## Combine use of 'columns' and 'tags' arguments.
readGFF(test_gff3, columns=my_columns, tags=c("ID", "Parent", "Name"))
readGFF(test_gff3, columns=my_columns, tags=character(0))

## Use the 'filter' argument to load only features of type "gene"
## or "mRNA" located on chr10.
my_filter <- list(type=c("gene", "mRNA"), seqid="chr10")
readGFF(test_gff3, filter=my_filter)
readGFF(test_gff3, columns=my_columns, tags=character(0), filter=my_filter)

rtracklayer

R interface to genome annotation files and the UCSC genome browser

v1.50.0
Artistic-2.0 + file LICENSE
Authors
Michael Lawrence, Vince Carey, Robert Gentleman
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.