Paginator client
A client to help you paginate
See HttpClient() for information on parameters.
a list, with objects of class HttpResponse().
Responses are returned in the order the requests are made.
Supported now:
limit_offset
: the most common way (in my experience), so it is the default.
This method involves setting how many records to fetch and which record to
start at for each request. We send these query parameters for you
(see the sketch following this list).
page_perpage
: set the page to fetch and (optionally) how many records
to get per page.
Supported later, hopefully:
link_headers
: link headers are URLs for the next/previous/last
request, given in the response headers from the server. This is relatively
uncommon, though it is recommended by JSON:API and is implemented by a
well-known API (GitHub).
cursor
: this works by a single string given back in each response, to
be passed in the subsequent request, and so on until no more records
remain. This is common in Solr.
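To make the limit_offset mechanics concrete, here is a minimal sketch (not
crul's internal code) of how a total limit of 50 with a chunk of 10
translates into per-request query parameters, using the default parameter
names limit and offset:

# sketch only: how chunked limit/offset query parameters could be built
limit <- 50   # total records wanted
chunk <- 10   # records per request
offsets <- seq(0, limit - 1, by = chunk)   # 0, 10, 20, 30, 40
# one set of query parameters per request:
# limit=10&offset=0, limit=10&offset=10, ..., limit=10&offset=40
lapply(offsets, function(off) list(limit = chunk, offset = off))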
http_req
an object of class HttpClient
by
(character) how to paginate. One of 'limit_offset' (default) or 'page_perpage'. In the future 'link_headers' and 'cursor' may be supported. See Details.
chunk
(numeric/integer) the number of records to request per chunk,
e.g., 10 means each request fetches 10 records. The number is passed
through format() to prevent larger numbers from being converted to
scientific notation (see the format() example after this list).
limit_param
(character) the name of the limit parameter. Default: limit
offset_param
(character) the name of the offset parameter. Default: offset
limit
(numeric/integer) the maximum number of records wanted. The number
is passed through format() to prevent larger numbers from being
converted to scientific notation.
page_param
(character) the name of the page parameter. Default: NULL
per_page_param
(character) the name of the per page parameter. Default: NULL
progress
(logical) print a progress bar, using utils::txtProgressBar.
Default: FALSE.
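As a small illustration of the format() note above (whether crul passes
scientific = FALSE internally is an assumption here), large numbers would
otherwise end up in scientific notation when pasted into a query string:

x <- 1e6
as.character(x)                 # "1e+06" -- scientific notation
format(x, scientific = FALSE)   # "1000000"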
print()
print method for Paginator
objects
Paginator$print(x, ...)
x
self
...
ignored
new()
Create a new Paginator
object
Paginator$new(
  client,
  by = "limit_offset",
  limit_param = NULL,
  offset_param = NULL,
  limit = NULL,
  chunk = NULL,
  page_param = NULL,
  per_page_param = NULL,
  progress = FALSE
)
client
an object of class HttpClient, from a call to HttpClient$new()
by
(character) how to paginate. One of 'limit_offset' (default) or 'page_perpage'. In the future 'link_headers' and 'cursor' may be supported. See Details.
limit_param
(character) the name of the limit parameter. Default: limit
offset_param
(character) the name of the offset parameter. Default: offset
limit
(numeric/integer) the maximum records wanted
chunk
(numeric/integer) the number of records to request per chunk, e.g., 10 means each request fetches 10 records
page_param
(character) the name of the page parameter.
per_page_param
(character) the name of the per page parameter.
progress
(logical) print a progress bar, using utils::txtProgressBar.
Default: FALSE.
A new Paginator
object
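A minimal construction sketch; the Crossref base URL and its "rows" and
"offset" parameter names are taken from the Examples section below:

con <- HttpClient$new(url = "https://api.crossref.org")
cc <- Paginator$new(
  client = con, by = "limit_offset",
  limit_param = "rows", offset_param = "offset",
  limit = 50, chunk = 10
)
cc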
get()
make a paginated GET request
Paginator$get(path = NULL, query = list(), ...)
path
URL path, appended to the base URL
query
query terms, as a named list. Any numeric values are
passed through format() to prevent larger numbers from being
converted to scientific notation
...
For retry, the options to be passed on to the method
implementing the requested verb, including curl options. Otherwise,
curl options; only those in the acceptable set from curl::curl_options(),
except the following: httpget, httppost, post, postfields, postfieldsize,
and customrequest
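A usage sketch following the Examples section: the call below issues the
chunked GET requests against the Crossref API, after which the results can
be inspected with the accessor methods documented further below.

# assumes the Paginator `cc` constructed in the new() sketch above
cc$get('works', query = list(query = "NSF"))
cc$status_code()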
post()
make a paginated POST request
Paginator$post(
  path = NULL,
  query = list(),
  body = NULL,
  encode = "multipart",
  ...
)
path
URL path, appended to the base URL
query
query terms, as a named list. Any numeric values are
passed through format() to prevent larger numbers from being
converted to scientific notation
body
body as an R list
encode
one of form, multipart, json, or raw
...
For retry, the options to be passed on to the method
implementing the requested verb, including curl options. Otherwise,
curl options; only those in the acceptable set from curl::curl_options(),
except the following: httpget, httppost, post, postfields, postfieldsize,
and customrequest
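A hypothetical sketch showing where body and encode fit in a paginated
POST; the base URL, path, and body fields here are made up, not a real
paginating endpoint (the same shape applies to put(), patch(), and
delete() below).

con2 <- HttpClient$new(url = "https://example.com")  # hypothetical API
pp <- Paginator$new(client = con2, limit_param = "limit",
  offset_param = "offset", limit = 20, chunk = 5)
pp$post(path = "search", body = list(q = "sunflower"), encode = "json")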
put()
make a paginated PUT request
Paginator$put(
  path = NULL,
  query = list(),
  body = NULL,
  encode = "multipart",
  ...
)
path
URL path, appended to the base URL
query
query terms, as a named list. Any numeric values are
passed through format() to prevent larger numbers from being
converted to scientific notation
body
body as an R list
encode
one of form, multipart, json, or raw
...
For retry, the options to be passed on to the method
implementing the requested verb, including curl options. Otherwise,
curl options; only those in the acceptable set from curl::curl_options(),
except the following: httpget, httppost, post, postfields, postfieldsize,
and customrequest
patch()
make a paginated PATCH request
Paginator$patch(
  path = NULL,
  query = list(),
  body = NULL,
  encode = "multipart",
  ...
)
path
URL path, appended to the base URL
query
query terms, as a named list. Any numeric values are
passed through format() to prevent larger numbers from being
converted to scientific notation
body
body as an R list
encode
one of form, multipart, json, or raw
...
For retry, the options to be passed on to the method
implementing the requested verb, including curl options. Otherwise,
curl options; only those in the acceptable set from curl::curl_options(),
except the following: httpget, httppost, post, postfields, postfieldsize,
and customrequest
delete()
make a paginated DELETE request
Paginator$delete(
  path = NULL,
  query = list(),
  body = NULL,
  encode = "multipart",
  ...
)
path
URL path, appended to the base URL
query
query terms, as a named list. Any numeric values are
passed through format() to prevent larger numbers from being
converted to scientific notation
body
body as an R list
encode
one of form, multipart, json, or raw
...
For retry, the options to be passed on to the method
implementing the requested verb, including curl options. Otherwise,
curl options; only those in the acceptable set from curl::curl_options(),
except the following: httpget, httppost, post, postfields, postfieldsize,
and customrequest
head()
make a paginated HEAD request
Paginator$head(path = NULL, ...)
path
URL path, appended to the base URL
...
For retry, the options to be passed on to the method
implementing the requested verb, including curl options. Otherwise,
curl options; only those in the acceptable set from curl::curl_options(),
except the following: httpget, httppost, post, postfields, postfieldsize,
and customrequest
Note: it is not yet clear whether a paginated HEAD request makes sense.
responses()
list responses
Paginator$responses()
a list of HttpResponse
objects, empty list before requests made
status_code()
Get HTTP status codes for each response
Paginator$status_code()
numeric vector, empty numeric vector before requests made
status()
List HTTP status objects
Paginator$status()
a list of http_code
objects, empty list before requests made
parse()
parse content
Paginator$parse(encoding = "UTF-8")
encoding
(character) the encoding to use in parsing. Default: "UTF-8"
character vector, empty character vector before requests made
content()
Get raw content for each response
Paginator$content()
raw list, empty list before requests made
times()
curl request times
Paginator$times()
list of named numeric vectors, empty list before requests made
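A typical inspection workflow after a paginated request, following the
Examples section (Crossref example; jsonlite assumed for parsing JSON
bodies):

cc$responses()                           # list of HttpResponse objects
cc$status_code()                         # one status code per request
cc$times()                               # curl timings per request
lapply(cc$parse(), jsonlite::fromJSON)   # parsed bodies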
url_fetch()
get the URL that would be sent (i.e., before executing the request). The only things that change the URL are the path and query parameters; the body and any curl options do not change the URL
Paginator$url_fetch(path = NULL, query = list())
path
URL path, appended to the base URL
query
query terms, as a named list. Any numeric values are
passed through format() to prevent larger numbers from being
converted to scientific notation
URLs (character)
\dontrun{
cli <- HttpClient$new(url = "https://api.crossref.org")
cc <- Paginator$new(client = cli, limit_param = "rows",
  offset_param = "offset", limit = 50, chunk = 10)
cc$url_fetch('works')
cc$url_fetch('works', query = list(query = "NSF"))
}
clone()
The objects of this class are cloneable with this method.
Paginator$clone(deep = FALSE)
deep
Whether to make a deep clone.
## Not run: 
if (interactive()) {
# limit/offset approach
con <- HttpClient$new(url = "https://api.crossref.org")
cc <- Paginator$new(client = con, limit_param = "rows",
  offset_param = "offset", limit = 50, chunk = 10)
cc
cc$get('works')
cc
cc$responses()
cc$status()
cc$status_code()
cc$times()
# cc$content()
cc$parse()
lapply(cc$parse(), jsonlite::fromJSON)

# page/per page approach (with no per_page param allowed)
conn <- HttpClient$new(url = "https://discuss.ropensci.org")
cc <- Paginator$new(client = conn, by = "page_perpage",
  page_param = "page", per_page_param = "per_page",
  limit = 90, chunk = 30)
cc
cc$get('c/usecases/l/latest.json')
cc$responses()
lapply(cc$parse(), jsonlite::fromJSON)

# page/per_page
conn <- HttpClient$new('https://api.inaturalist.org')
cc <- Paginator$new(conn, by = "page_perpage", page_param = "page",
  per_page_param = "per_page", limit = 90, chunk = 30)
cc
cc$get('v1/observations', query = list(taxon_name="Helianthus"))
cc$responses()
res <- lapply(cc$parse(), jsonlite::fromJSON)
res[[1]]$total_results
vapply(res, "[[", 1L, "page")
vapply(res, "[[", 1L, "per_page")
vapply(res, function(w) NROW(w$results), 1L)
## another
ccc <- Paginator$new(conn, by = "page_perpage", page_param = "page",
  per_page_param = "per_page", limit = 500, chunk = 30,
  progress = TRUE)
ccc
ccc$get('v1/observations', query = list(taxon_name="Helianthus"))
res2 <- lapply(ccc$parse(), jsonlite::fromJSON)
vapply(res2, function(w) NROW(w$results), 1L)

# progress bar
(con <- HttpClient$new(url = "https://api.crossref.org"))
cc <- Paginator$new(client = con, limit_param = "rows",
  offset_param = "offset", limit = 50, chunk = 10, progress = TRUE)
cc
cc$get('works')
}
## End(Not run)

## ------------------------------------------------
## Method `Paginator$url_fetch`
## ------------------------------------------------

## Not run: 
cli <- HttpClient$new(url = "https://api.crossref.org")
cc <- Paginator$new(client = cli, limit_param = "rows",
  offset_param = "offset", limit = 50, chunk = 10)
cc$url_fetch('works')
cc$url_fetch('works', query = list(query = "NSF"))
## End(Not run)