Resolve names using Global Names Recognition and Discovery.
Uses the Global Names Recognition and Discovery service, see http://gnrd.globalnames.org/
Note: this function sometimes returns data and sometimes does not. The API that this function uses is extremely buggy.
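Because the service may intermittently return nothing, callers may want to retry a request a few times before giving up. The following is a minimal sketch of that idea. `with_retries` is a hypothetical helper, not part of taxize, and the retry count and wait time are arbitrary assumptions.

```r
# Hypothetical helper (not part of taxize): call `fun(...)` up to `max_tries`
# times and return the first non-empty result. Returns NULL if every attempt
# errors or yields an empty result.
with_retries <- function(fun, ..., max_tries = 3, wait = 1) {
  for (i in seq_len(max_tries)) {
    res <- tryCatch(fun(...), error = function(e) NULL)
    if (!is.null(res) && length(res) > 0) {
      return(res)
    }
    # Pause before the next attempt, but not after the last one
    if (i < max_tries) Sys.sleep(wait)
  }
  NULL
}

# Usage (needs taxize and network access), left commented out:
# out <- with_retries(taxize::scrapenames,
#                     text = "A spider named Pardosa moesta Banks, 1892")
```

A NULL result here only means that no attempt succeeded; it says nothing about whether the input actually contains names.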
scrapenames(
  url = NULL,
  file = NULL,
  text = NULL,
  engine = NULL,
  unique = NULL,
  verbatim = NULL,
  detect_language = NULL,
  all_data_sources = NULL,
  data_source_ids = NULL,
  return_content = FALSE,
  ...
)
url |
An encoded URL for a web page, PDF, Microsoft Office document, or image file, see examples |
file |
When using multipart/form-data as the content-type, a file may be sent. This should be a path to your file on your machine. |
text |
Type: string. Text content; best used with a POST request, see examples |
engine |
(optional) (integer) Default: 0. Either 1 for TaxonFinder, 2 for NetiNeti, or 0 for both. If absent, both engines are used. |
unique |
(optional) (logical) If |
verbatim |
(optional) Type: boolean, If |
detect_language |
(optional) Type: boolean, When |
all_data_sources |
(optional) Type: boolean. Resolve found names against all available Data Sources. |
data_source_ids |
(optional) Type: string. Pipe separated list of data source ids to resolve found names against. See list of Data Sources http://resolver.globalnames.org/data_sources |
return_content |
(logical) Return OCR'ed text. Returns text string in |
... |
Further args passed to crul::verb-GET |
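The pipe-separated form described for data_source_ids can be built from a vector of IDs with base R. This is only a sketch of the string format. The examples on this page pass a numeric vector directly, so collapsing by hand may only matter when the string form is needed explicitly. The IDs 1 and 169 are taken from those examples.

```r
# Collapse a vector of data source IDs into one pipe-separated string
ids <- c(1, 169)
ids_param <- paste(ids, collapse = "|")
ids_param
```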
Exactly one of url, file, or text must be specified.
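The input constraint can be illustrated with a small check. This is a sketch of the rule only, not taxize's internal code. `check_one_input` is a hypothetical name.

```r
# Hypothetical check (not taxize's internal code): exactly one of
# url, file, or text may be non-NULL. Returns the name of the supplied input.
check_one_input <- function(url = NULL, file = NULL, text = NULL) {
  supplied <- c(
    url  = !is.null(url),
    file = !is.null(file),
    text = !is.null(text)
  )
  if (sum(supplied) != 1) {
    stop("Specify exactly one of url, file, or text", call. = FALSE)
  }
  names(supplied)[supplied]
}

check_one_input(text = "A spider named Pardosa moesta Banks, 1892")
```

Calling it with no inputs, or with more than one, raises an error.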
A list of length two: the first element is metadata, the second is the data as a data.frame.
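A result of this shape can be unpacked by position. The sketch below uses a mock object so that it runs without network access. The field and column names in the mock are placeholders, not the service's actual output.

```r
# Mock result with the documented shape: metadata first, data.frame second.
# The contents are placeholders for illustration only.
out <- list(
  list(status = "placeholder"),
  data.frame(name = "Pardosa moesta", stringsAsFactors = FALSE)
)

metadata <- out[[1]]
found <- out[[2]]

# Guard against an empty result before using the data
if (is.data.frame(found) && nrow(found) > 0) {
  print(found)
} else {
  message("No names returned")
}
```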
Scott Chamberlain
## Not run:
# Get data from a website using its URL
scrapenames('https://en.wikipedia.org/wiki/Spider')
scrapenames('https://en.wikipedia.org/wiki/Animal')
scrapenames('https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095068')
scrapenames('https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0080498')
scrapenames('http://ucjeps.berkeley.edu/cgi-bin/get_JM_treatment.pl?CARYOPHYLLACEAE')

# Scrape names from a pdf at a URL
url <- 'https://journals.plos.org/plosone/article/file?id=
10.1371/journal.pone.0058268&type=printable'
scrapenames(url = sub('\n', '', url))

# With arguments
scrapenames(url = 'https://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf',
  unique=TRUE)
scrapenames(url = 'https://en.wikipedia.org/wiki/Spider',
  data_source_ids=c(1, 169))

# Get data from a file
speciesfile <- system.file("examples", "species.txt", package = "taxize")
scrapenames(file = speciesfile)

nms <- paste0(names_list("species"), collapse="\n")
file <- tempfile(fileext = ".txt")
writeLines(nms, file)
scrapenames(file = file)

# Get data from text string
scrapenames(text='A spider named Pardosa moesta Banks, 1892')

# return OCR content
scrapenames(url='https://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf',
  return_content = TRUE)

## End(Not run)