taxize: scrapenames – R documentation

taxize

scrapenames

Resolve names using Global Names Recognition and Discovery.

Description

Uses the Global Names Recognition and Discovery service, see http://gnrd.globalnames.org/

Note: this function sometimes returns data and sometimes does not. The API that this function uses is extremely buggy.

Usage

scrapenames(
  url = NULL,
  file = NULL,
  text = NULL,
  engine = NULL,
  unique = NULL,
  verbatim = NULL,
  detect_language = NULL,
  all_data_sources = NULL,
  data_source_ids = NULL,
  return_content = FALSE,
  ...
)

Arguments

`url`	An encoded URL for a web page, PDF, Microsoft Office document, or image file, see examples
`file`	When using multipart/form-data as the content-type, a file may be sent. This should be a path to your file on your machine.
`text`	Type: string. Text content; best used with a POST request, see examples
`engine`	(optional) (integer) Default: 0. Either 1 for TaxonFinder, 2 for NetiNeti, or 0 for both. If absent, both engines are used.
`unique`	(optional) (logical) If `TRUE` (default), response has unique names without offsets.
`verbatim`	(optional) Type: boolean. If `TRUE`, response excludes verbatim strings. Default: `FALSE`.
`detect_language`	(optional) Type: boolean, When `TRUE` (default), NetiNeti is not used if the language of incoming text is determined not to be English. When `FALSE`, NetiNeti will be used if requested.
`all_data_sources`	(optional) Type: boolean. Resolve found names against all available Data Sources.
`data_source_ids`	(optional) Type: string. Pipe separated list of data source ids to resolve found names against. See list of Data Sources http://resolver.globalnames.org/data_sources
`return_content`	(logical) If `TRUE`, return the OCR'ed text as a string in the `x$meta$content` slot. Default: `FALSE`
`...`	Further args passed to crul::verb-GET
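
The `data_source_ids` description above calls for a pipe-separated string. As a minimal sketch (the variable names `ids` and `ids_string` are illustrative and not part of taxize), such a string can be built from a numeric vector with base R:

```r
# Illustrative data source ids, the same two ids used in the Examples section
ids <- c(1, 169)

# Collapse the ids into a single pipe-separated string
ids_string <- paste(ids, collapse = "|")
ids_string
```

The argument description does not say whether a vector is also accepted directly, as the Examples suggest; passing the string form simply follows the documented type.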

Details

Exactly one of `url`, `file`, or `text` must be specified.
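
The rule above can be checked before making a request. The following is a sketch only: `check_one_input()` is a hypothetical helper, not a taxize function, and it does not reflect how `scrapenames()` validates its inputs internally.

```r
# Hypothetical helper (not part of taxize): verifies that exactly one of
# url, file, or text is non-NULL, mirroring the rule stated in Details.
check_one_input <- function(url = NULL, file = NULL, text = NULL) {
  supplied <- c(
    url  = !is.null(url),
    file = !is.null(file),
    text = !is.null(text)
  )
  if (sum(supplied) != 1) {
    stop("Specify exactly one of url, file, or text; got ",
         sum(supplied), call. = FALSE)
  }
  # Return the name of the single input that was supplied
  names(supplied)[supplied]
}

check_one_input(text = "A spider named Pardosa moesta Banks, 1892")
```

Calling the helper with no inputs, or with more than one, raises an error instead of sending an ambiguous request.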

Value

A list of length two: the first element is metadata, the second is the data as a data.frame.
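
Because the service does not always return data, it can help to check the result before using it. The sketch below assumes only the two-element shape described above and accesses elements by position; `get_names_df()` and the mock object are hypothetical and exist purely for illustration.

```r
# Hypothetical helper (not part of taxize): pulls the data.frame out of a
# two-element result and returns NULL when no usable rows came back.
get_names_df <- function(res) {
  if (!is.list(res) || length(res) < 2) return(NULL)
  df <- res[[2]]  # second element: the data, per the Value section
  if (!is.data.frame(df) || nrow(df) == 0) return(NULL)
  df
}

# Mock result standing in for scrapenames() output; the column name
# "name" is made up here and may not match the real response.
mock <- list(list(), data.frame(name = "Pardosa moesta"))
get_names_df(mock)
```

With a real call, the same helper could be applied as `get_names_df(scrapenames(text = "..."))`, falling back to a retry or a message when it returns `NULL`.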

Author(s)

Scott Chamberlain

Examples

## Not run: 
# Get data from a website using its URL
scrapenames('https://en.wikipedia.org/wiki/Spider')
scrapenames('https://en.wikipedia.org/wiki/Animal')
scrapenames('https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095068')
scrapenames('https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0080498')
scrapenames('http://ucjeps.berkeley.edu/cgi-bin/get_JM_treatment.pl?CARYOPHYLLACEAE')

# Scrape names from a pdf at a URL
url <- 'https://journals.plos.org/plosone/article/file?id=
10.1371/journal.pone.0058268&type=printable'
scrapenames(url = sub('\n', '', url))

# With arguments
scrapenames(url = 'https://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf',
  unique=TRUE)
scrapenames(url = 'https://en.wikipedia.org/wiki/Spider',
  data_source_ids=c(1, 169))

# Get data from a file
speciesfile <- system.file("examples", "species.txt", package = "taxize")
scrapenames(file = speciesfile)

nms <- paste0(names_list("species"), collapse="\n")
file <- tempfile(fileext = ".txt")
writeLines(nms, file)
scrapenames(file = file)

# Get data from text string
scrapenames(text='A spider named Pardosa moesta Banks, 1892')

# return OCR content
scrapenames(url='https://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf',
  return_content = TRUE)

## End(Not run)

taxize

Taxonomic Information from Around the Web

v0.9.100

MIT + file LICENSE

Authors

Scott Chamberlain [aut] (<https://orcid.org/0000-0003-1444-9135>), Eduard Szoecs [aut], Zachary Foster [aut, cre], Zebulun Arendsee [aut], Carl Boettiger [ctb], Karthik Ram [ctb], Ignasi Bartomeus [ctb], John Baumgartner [ctb], James O'Donnell [ctb], Jari Oksanen [ctb], Bastian Greshake Tzovaras [ctb], Philippe Marchand [ctb], Vinh Tran [ctb], Maëlle Salmon [ctb], Gaopeng Li [ctb], Matthias Grenié [ctb], rOpenSci [fnd] (https://ropensci.org/)

Initial release
