Finding the nearest range/position neighbor
The nearest, precede, follow, distance and distanceToNearest methods for IntegerRanges objects and subclasses.
## S4 method for signature 'IntegerRanges,IntegerRanges_OR_missing'
nearest(x, subject, select = c("arbitrary", "all"))

## S4 method for signature 'IntegerRanges,IntegerRanges_OR_missing'
precede(x, subject, select = c("first", "all"))

## S4 method for signature 'IntegerRanges,IntegerRanges_OR_missing'
follow(x, subject, select = c("last", "all"))

## S4 method for signature 'IntegerRanges,IntegerRanges_OR_missing'
distanceToNearest(x, subject, select = c("arbitrary", "all"))

## S4 method for signature 'IntegerRanges,IntegerRanges'
distance(x, y)

## S4 method for signature 'Pairs,missing'
distance(x, y)
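x | The query ranges.
subject | The subject ranges.
select | Logic for handling ties. By default, all the methods select a single interval (arbitrary for nearest, first for precede, last for follow). Use "all" to get every match.
y | For the distance method, the ranges compared pair-wise with x.
hits | The hits between x and subject.
... | Additional arguments for methods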
nearest:
The conventional nearest neighbor finder. Returns an integer vector containing the index of the nearest neighbor range in subject for each range in x. If there is no nearest neighbor (if subject is empty), NAs are returned.
Here is roughly how it proceeds, for a range xi in x:
Find the ranges in subject that overlap xi. If a single range si in subject overlaps xi, si is returned as the nearest neighbor of xi. If there are multiple overlaps, one of the overlapping ranges is chosen arbitrarily.
If no ranges in subject overlap with xi, then the range in subject with the shortest distance from its end to the start of xi or its start to the end of xi is returned.
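The two steps above can be sketched in base R. This is only an illustration under stated assumptions, not the IRanges implementation: nearest_sketch is a hypothetical helper, ranges are passed as plain start/end vectors, zero-width ranges are not handled, and "arbitrary" is resolved by taking the first overlapping range.

```r
# Hypothetical helper (not part of IRanges): nearest neighbor for closed
# integer ranges given as start/end vectors.
nearest_sketch <- function(x_start, x_end, s_start, s_end) {
  vapply(seq_along(x_start), function(i) {
    # An empty subject has no nearest neighbor.
    if (length(s_start) == 0L) return(NA_integer_)
    # Step 1: prefer a subject range that overlaps xi (first one found).
    overlaps <- s_start <= x_end[i] & s_end >= x_start[i]
    if (any(overlaps)) return(which(overlaps)[1L])
    # Step 2: otherwise pick the subject range with the smallest gap to xi.
    gap <- pmax(pmax(x_start[i], s_start) - pmin(x_end[i], s_end) - 1, 0)
    which.min(gap)
  }, integer(1))
}

# Same ranges as the nearest() example on this page.
nearest_sketch(c(1, 3, 9), c(2, 7, 10), c(3, 5, 12), c(3, 6, 12))
```

Because ties are broken arbitrarily, a real call to nearest may return a different but equally valid index when several subject ranges overlap or are equally close.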
precede:
For each range in x, precede returns the index of the interval in subject that is directly preceded by the query range. Overlapping ranges are excluded. NA is returned when there are no qualifying ranges in subject.
follow:
The opposite of precede, this function returns the index of the range in subject that a query range in x directly follows. Overlapping ranges are excluded. NA is returned when there are no qualifying ranges in subject.
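As a rough illustration of these two definitions, the base R sketch below works on plain start/end vectors. The helpers precede_sketch and follow_sketch are hypothetical names, not IRanges functions. They assume that "directly preceded" means the subject range with the smallest start beyond the end of the query, and that "directly follows" means the subject range with the largest end before the start of the query. Ties are resolved as first for precede and last for follow, mirroring the default select values.

```r
# Hypothetical helpers (not part of IRanges), for closed integer ranges.
precede_sketch <- function(x_end, s_start) {
  vapply(x_end, function(e) {
    ok <- which(s_start > e)              # subject ranges strictly after the query
    if (length(ok) == 0L) return(NA_integer_)
    ok[which.min(s_start[ok])]            # closest one; first on ties
  }, integer(1))
}

follow_sketch <- function(x_start, s_end) {
  vapply(x_start, function(s) {
    ok <- which(s_end < s)                # subject ranges strictly before the query
    if (length(ok) == 0L) return(NA_integer_)
    cand <- ok[s_end[ok] == max(s_end[ok])]
    cand[length(cand)]                    # closest one; last on ties
  }, integer(1))
}

# Same ranges as the precede()/follow() example on this page:
# query = [1,3], [3,7], [9,10]; subject = [3,3], [2,13], [10,12].
precede_sketch(c(3, 7, 10), c(3, 2, 10))
follow_sketch(c(1, 3, 9), c(3, 13, 12))
```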
distanceToNearest:
Returns the distance for each range in x to its nearest neighbor in subject.
distance:
Returns the distance for each range in x to the range in y.
The distance method differs from others documented on this page in that it is symmetric; y cannot be missing. If x and y are not the same length, the shorter is recycled to match the length of the longer. The select argument is not available for distance because comparisons are made in a pair-wise fashion. The return value has the length of the longer of x and y.
The distance calculation changed in BioC 2.12 to accommodate zero-width ranges in a consistent and intuitive manner. The new distance can be explained by a block model where a range is represented by a series of blocks of size 1. Blocks are adjacent to each other and there is no gap between them. A visual representation of IRanges(4,7) would be

+-----+-----+-----+-----+
   4     5     6     7

The distance between two consecutive blocks is 0L (prior to Bioconductor 2.12 it was 1L). The distance calculation now returns the size of the gap between two ranges.
This change to distance affects the notion of overlaps in that we no longer say:
x and y overlap <=> distance(x, y) == 0
Instead we say
x and y overlap => distance(x, y) == 0
or
x and y overlap or are adjacent <=> distance(x, y) == 0
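The block model can be written down as a short base R sketch. gap_distance is a hypothetical helper that takes plain start/end vectors. It is meant only to illustrate the "size of the gap" definition and the pair-wise recycling described above, not to reproduce the package code.

```r
# Hypothetical helper (not part of IRanges): number of unit blocks that lie
# strictly between two ranges; 0 when they overlap or are adjacent.
gap_distance <- function(x_start, x_end, y_start, y_end) {
  max_start <- pmax(x_start, y_start)
  min_end <- pmin(x_end, y_end)
  as.integer(pmax(max_start - min_end - 1, 0))
}

gap_distance(1, 5, 6, 10)   # adjacent ranges: 0
gap_distance(1, 5, 3, 7)    # overlapping ranges: 0
gap_distance(1, 5, 8, 10)   # blocks 6 and 7 lie in between: 2

# Pair-wise comparison with recycling: one range against three.
gap_distance(1, 5, c(6, 8, 20), c(10, 10, 25))   # 0 2 14
```

The first two calls both return 0, which is why a distance of 0 no longer implies an overlap.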
selectNearest:
Selects the hits that have the minimum distance within those for each query range. Ties are possible and can be broken with breakTies.
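The selection rule can be illustrated in base R on a plain table of hits. select_nearest_sketch is a hypothetical helper that assumes the distances have already been computed and are passed in alongside the query and subject indices. It does not use the Hits class and does not break ties.

```r
# Hypothetical helper (not part of IRanges): keep, for each query index,
# only the hits whose distance equals that query's minimum distance.
select_nearest_sketch <- function(query, subject, distance) {
  keep <- distance == ave(distance, query, FUN = min)
  out <- data.frame(query = query[keep],
                    subject = subject[keep],
                    distance = distance[keep])
  out[order(out$query), , drop = FALSE]   # sorted by query
}

# Query 1 has hits at distances 0 and 3; query 2 has two hits tied at 5.
select_nearest_sketch(query    = c(2, 1, 1, 2),
                      subject  = c(1, 1, 2, 3),
                      distance = c(5, 0, 3, 5))
```

Both tied hits for query 2 are kept, which is the situation in which a tie-breaking step becomes relevant.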
For nearest, precede and follow, an integer vector of indices in subject, or a Hits object if select="all".
For distanceToNearest, a Hits object with an elementMetadata column of the distance between the pair. Access the distance with the mcols accessor.
For distance, an integer vector of distances between the ranges in x and y.
For selectNearest, a Hits object, sorted by query.
M. Lawrence
The IntegerRanges and Hits classes.
The GenomicRanges and GRanges classes in the GenomicRanges package.
findOverlaps for finding just the overlapping ranges.
GenomicRanges methods for precede, follow, nearest, distance and distanceToNearest are documented at ?nearest-methods or ?precede,GenomicRanges,GenomicRanges-method
## ------------------------------------------
## precede() and follow()
## ------------------------------------------
query <- IRanges(c(1, 3, 9), c(3, 7, 10))
subject <- IRanges(c(3, 2, 10), c(3, 13, 12))

precede(query, subject)     # c(3L, 3L, NA)
precede(IRanges(), subject) # integer()
precede(query, IRanges())   # rep(NA_integer_, 3)
precede(query)              # c(3L, 3L, NA)

follow(query, subject)      # c(NA, NA, 1L)
follow(IRanges(), subject)  # integer()
follow(query, IRanges())    # rep(NA_integer_, 3)
follow(query)               # c(NA, NA, 2L)

## ------------------------------------------
## nearest()
## ------------------------------------------
query <- IRanges(c(1, 3, 9), c(2, 7, 10))
subject <- IRanges(c(3, 5, 12), c(3, 6, 12))

nearest(query, subject) # c(1L, 1L, 3L)
nearest(query)          # c(2L, 1L, 2L)

## ------------------------------------------
## distance()
## ------------------------------------------
## adjacent
distance(IRanges(1,5), IRanges(6,10)) # 0L
## overlap
distance(IRanges(1,5), IRanges(3,7))  # 0L
## zero-width
sapply(-3:3, function(i) distance(shift(IRanges(4,3), i), IRanges(4,3)))