TwoBitFile-class

2bit Files


Description

These functions support the import and export of the UCSC 2bit compressed sequence format. The main advantage is speed of subsequence retrieval, because only the sequence in the requested intervals is loaded. Compared to the FASTA (FA) format supported by Rsamtools, 2bit offers the additional feature of masking and also has better support in Java (and thus most genome browsers). The supporting TwoBitFile class is a reference to a TwoBit file.

Usage

## S4 method for signature 'TwoBitFile,ANY,ANY'
import(con, format, text,
           which = as(seqinfo(con), "GenomicRanges"), ...)
## S4 method for signature 'TwoBitFile'
getSeq(x, which = as(seqinfo(x), "GenomicRanges"))
import.2bit(con, ...)

## S4 method for signature 'ANY,TwoBitFile,ANY'
export(object, con, format, ...)
## S4 method for signature 'DNAStringSet,TwoBitFile,ANY'
export(object, con, format)
## S4 method for signature 'DNAStringSet,character,ANY'
export(object, con, format, ...)
export.2bit(object, con, ...)

Arguments

con

A path, URL or TwoBitFile object. Connections are not supported. For the functions ending in .2bit, the file format is indicated by the function name. For the export and import methods, the format must be indicated another way. If con is a path or URL, either the file extension or the format argument needs to be “twoBit” or “2bit”.

object,x

The object to export, either a DNAStringSet or something coercible to a DNAStringSet, like a character vector.

format

If not missing, should be “twoBit” or “2bit” (case insensitive).

text

Not supported.

which

A range data structure coercible to IntegerRangesList, like a GRanges, or a TwoBitFile. Only the intervals in the file overlapping the given ranges are returned. By default, the value is the TwoBitFile itself. Its Seqinfo object is extracted and coerced to an IntegerRangesList that represents the entirety of the file.

...

Arguments to pass down to other methods. For import, the flow eventually reaches the TwoBitFile method on import. For export, the TwoBitFile methods on export are the sink.
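
The interplay of con, format and which can be sketched as follows. This is a minimal illustration, not part of the original help page: the copied file name test_copy.dat is hypothetical and exists only to show a path whose extension does not identify the format, and the default which is written out as it appears in the Usage section.

```r
library(rtracklayer)

# Example file shipped with rtracklayer (the same one used in the Examples).
path <- system.file("tests", "test.2bit", package = "rtracklayer")

# Hypothetical copy whose extension says nothing about the format.
copy <- file.path(tempdir(), "test_copy.dat")
file.copy(path, copy, overwrite = TRUE)

# The extension is uninformative, so the format is named explicitly.
seqs <- import(copy, format = "2bit")

# The default 'which', spelled out as in the Usage section: the whole file.
tbf <- TwoBitFile(path)
whole <- import(tbf, which = as(seqinfo(tbf), "GenomicRanges"))

# A narrower 'which': only bases 10 to 30 of the first sequence are read.
first <- GRanges(names(seqs)[1], IRanges(10, 30))
part <- import(tbf, which = first)
width(part)
```

Passing a TwoBitFile as con removes the need for format altogether, since the class itself identifies the file type.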

Value

For import, a DNAStringSet.

TwoBitFile objects

A TwoBitFile object, an extension of RTLFile, is a reference to a TwoBit file. To cast a path, URL or connection to a TwoBitFile, pass it to the TwoBitFile constructor.

A TwoBit file embeds the sequence information, which can be retrieved with the following:

seqinfo(x): Gets the Seqinfo object indicating the lengths of the sequences for the intervals in the file. No circularity or genome information is available.
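
As a rough sketch of how the embedded sequence information might be used, the lengths reported by seqinfo can drive a targeted import. The variable names (tbf, si, tail_range) are illustrative only; seqlengths, seqnames, isCircular and genome are the standard Seqinfo accessors.

```r
library(rtracklayer)

tbf <- TwoBitFile(system.file("tests", "test.2bit", package = "rtracklayer"))

si <- seqinfo(tbf)
seqlengths(si)   # one length per sequence stored in the file
isCircular(si)   # circularity flags; the 2bit format does not store them
genome(si)       # genome identifiers; the 2bit format does not store them

# Use the stored length to request the last 10 bases of the first sequence.
len <- seqlengths(si)[[1]]
tail_range <- GRanges(seqnames(si)[1], IRanges(len - 9, len))
tail_seq <- import(tbf, which = tail_range)
tail_seq
```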

Note

The 2bit format only supports A, C, G, T and N (via an internal mask). To export sequences with additional IUPAC ambiguity codes, first pass the object through replaceAmbiguities from the Biostrings package.
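
The workaround described in this note could look roughly like the following. The two sequences are made up for illustration, and the output file name clean.2bit is an arbitrary choice; replaceAmbiguities is called with its default replacement letter, N.

```r
library(Biostrings)
library(rtracklayer)

# Made-up sequences containing IUPAC ambiguity codes (R, Y and S).
seqs <- DNAStringSet(c(seq1 = "ACGTRYACGT", seq2 = "GGSNNACCTT"))

# Replace every ambiguity code with N so that only A, C, G, T and N remain.
clean <- replaceAmbiguities(seqs)
clean

# The cleaned object can now be written as 2bit and read back.
out <- file.path(tempdir(), "clean.2bit")
export(clean, out)
import(out)
```

Note that this step discards the original ambiguity information, so it is worth keeping the unmodified sequences if they are needed later.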

Author(s)

Michael Lawrence

See Also

export-methods in the BSgenome package for exporting a BSgenome object as a twoBit file.

Examples

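library(rtracklayer)

## Locate the example 2bit file shipped with rtracklayer
test_path <- system.file("tests", package = "rtracklayer")
test_2bit <- file.path(test_path, "test.2bit")

## Import from a path; the format is detected from the file extension
test <- import(test_2bit)
test

## Import through a TwoBitFile reference
test_2bit_file <- TwoBitFile(test_2bit)
import(test_2bit_file) # the whole file

## Import only the requested intervals
which_range <- IRanges(c(10, 40), c(30, 42))
which <- GRanges(names(test), which_range)
import(test_2bit, which = which)

## Sequence names and lengths stored in the file
seqinfo(test_2bit_file)

## Not run: 
## Export a DNAStringSet to a new 2bit file
test_2bit_out <- file.path(tempdir(), "test_out.2bit")
export(test, test_2bit_out)

## just a character vector
test_char <- as.character(test)
export(test_char, test_2bit_out)

## End(Not run)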

rtracklayer

R interface to genome annotation files and the UCSC genome browser

v1.50.0
Artistic-2.0 + file LICENSE
Authors
Michael Lawrence, Vince Carey, Robert Gentleman
