pid

Percent Sequence Identity


Description

Calculates the percent sequence identity for a pairwise sequence alignment.

Usage

pid(x, type="PID1")

Arguments

x

a PairwiseAlignments object.

type

the type of percent sequence identity to compute. One of "PID1", "PID2", "PID3", or "PID4". See Details for more information.

Details

Since there is no universal definition of percent sequence identity, the pid function can calculate this statistic according to any of the following definitions, selected with the type argument:

"PID1":

100 * (identical positions) / (aligned positions + internal gap positions)

"PID2":

100 * (identical positions) / (aligned positions)

"PID3":

100 * (identical positions) / (length of the shorter sequence)

"PID4":

100 * (identical positions) / (average length of the two sequences)
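To make the difference between the four definitions concrete, the formulas above can be applied by hand to a set of counts. The following is a minimal base R sketch, not the Biostrings implementation: the helper name pid_sketch, its argument names, and the counts passed to it are all hypothetical and chosen only for illustration. It also assumes that "aligned positions" counts only columns where both sequences have a residue, so gap positions enter the denominator separately.

```r
# Hypothetical helper (not part of Biostrings): applies the four
# formulas from Details to hand-supplied counts.
pid_sketch <- function(identical, aligned, internal_gaps, len1, len2) {
  c(
    PID1 = 100 * identical / (aligned + internal_gaps),
    PID2 = 100 * identical / aligned,
    PID3 = 100 * identical / min(len1, len2),
    PID4 = 100 * identical / mean(c(len1, len2))
  )
}

# Made-up counts for illustration only: 12 identical positions,
# 14 aligned positions, 2 internal gap positions, and sequences
# of length 16 and 23.
pid_values <- pid_sketch(identical = 12, aligned = 14, internal_gaps = 2,
                         len1 = 16, len2 = 23)
print(round(pid_values, 2))
```

With the same number of identical positions, the four definitions give different percentages because only the denominator changes, which is why the type should always be stated when reporting a value.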

Value

A numeric vector containing the specified sequence identity measures.

Author(s)

P. Aboyoun

References

A. May, Percent Sequence Identity: The Need to Be Explicit, Structure 2004, 12(5):737.

G. Raghava and G. Barton, Quantification of the variation in percentage identity for protein sequence alignments, BMC Bioinformatics 2006, 7:415.

Examples

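library(Biostrings)

# Two DNA sequences of different lengths
s1 <- DNAString("AGTATAGATGATAGAT")
s2 <- DNAString("AGTAGATAGATGGATGATAGATA")

# Align with the default scoring; pid() uses type = "PID1" by default
palign1 <- pairwiseAlignment(s1, s2)
palign1
pid(palign1)

# Align the same sequences with a custom nucleotide substitution matrix
palign2 <- pairwiseAlignment(
  s1, s2,
  substitutionMatrix =
    nucleotideSubstitutionMatrix(match = 2, mismatch = 10, baseOnly = TRUE)
)
palign2
pid(palign2, type = "PID4")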

Biostrings

Efficient manipulation of biological strings

v2.58.0
Artistic-2.0
Authors
H. Pagès, P. Aboyoun, R. Gentleman, and S. DebRoy
