RDS: RDS.bootstrap.intervals – R documentation

RDS.bootstrap.intervals

RDS Bootstrap Interval Estimates

Description

This function computes an interval estimate for one or more categorical variables. It optionally uses attributes of the RDS data set to determine the type of estimator and type of uncertainty estimate to use.

Usage

RDS.bootstrap.intervals(
  rds.data,
  outcome.variable,
  weight.type = NULL,
  uncertainty = NULL,
  N = NULL,
  subset = NULL,
  confidence.level = 0.95,
  number.of.bootstrap.samples = NULL,
  fast = TRUE,
  useC = TRUE,
  ci.type = "t",
  control = control.rds.estimates(),
  to.factor = FALSE,
  cont.breaks = 3,
  ...
)
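Several arguments in the signature default to `NULL` and are resolved at call time. The following base R sketch restates the default rules given in the argument descriptions for `weight.type`, `uncertainty`, and `N`. The helper names (`resolve_weight_type`, `resolve_uncertainty`, `resolve_N`) are hypothetical and are not part of the RDS package, and the sketch ignores any selection the package may make from other attributes of the data set.

```r
# Hypothetical helpers that mirror the documented defaults; not RDS code.

# weight.type: NULL falls back to "Gile's SS".
resolve_weight_type <- function(weight.type = NULL) {
  if (is.null(weight.type)) "Gile's SS" else weight.type
}

# uncertainty: if not supplied, it follows from the estimator.
resolve_uncertainty <- function(weight.type, uncertainty = NULL) {
  if (!is.null(uncertainty)) return(uncertainty)
  switch(weight.type,
    "RDS-I" = , "RDS-I (DS)" = , "RDS-II" = "Salganik",
    "Arithmetic Mean" = "SRS",
    "Gile's SS" = "Gile",
    stop("no documented default for weight.type = ", weight.type)
  )
}

# N: explicit value, else the population.size.mid attribute, else 1000.
resolve_N <- function(N = NULL, rds.data = NULL) {
  if (!is.null(N)) return(N)
  mid <- attr(rds.data, "population.size.mid")
  if (is.null(mid)) 1000 else mid
}

wt <- resolve_weight_type()
print(c(weight.type = wt, uncertainty = resolve_uncertainty(wt)))
print(resolve_N())
```

An explicitly supplied `uncertainty` always wins in this sketch, which matches the description of the default as something that is only "usually determined by `weight.type`".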

Arguments

`rds.data`	An `rds.data.frame` that indicates recruitment patterns by a pair of attributes named “id” and “recruiter.id”.
`outcome.variable`	A string giving the name of the variable in the `rds.data` that contains a categorical or numeric variable to be analyzed.
`weight.type`	A string giving the type of estimator to use. The options are `"Gile's SS"`, `"RDS-I"`, `"RDS-II"`, `"RDS-I (DS)"`, and `"Arithmetic Mean"`. If `NULL` it defaults to `"Gile's SS"`.
`uncertainty`	A string giving the type of uncertainty estimator to use. The options are `"SRS"`, `"Gile"` and `"Salganik"`. This is usually determined by `weight.type` to be consistent with the estimator's origins. The estimators `"RDS-I"`, `"RDS-I (DS)"`, and `"RDS-II"` default to `"Salganik"`, `"Arithmetic Mean"` defaults to `"SRS"`, and `"Gile's SS"` defaults to the `"Gile"` bootstrap.
`N`	An estimate of the number of members of the population being sampled. If `NULL` it is read as the `population.size.mid` attribute of the `rds.data` frame. If that is missing it defaults to 1000.
`subset`	An optional criterion by which to subset `rds.data`. It is a character string giving an R expression which, when evaluated, subsets the data. For example, it can be something like `"seed > 0"` to exclude seeds. It can also be the name of a logical vector of the same length as the outcome variable, where `TRUE` means the observation is included in the analysis. If `NULL` then no subsetting is done.
`confidence.level`	The confidence level for the confidence intervals. The default is 0.95 for 95%.
`number.of.bootstrap.samples`	The number of bootstrap samples to take in estimating the uncertainty of the estimator. If `NULL` it defaults to the number necessary to compute the standard error to accuracy 0.001.
`fast`	Use a fast bootstrap where the weights are reused from the estimator rather than being recomputed for each bootstrap sample.
`useC`	Use a C-level implementation of Gile's bootstrap (rather than the R-level one). The two implementations should give computationally equivalent estimators and differ only in speed.
`ci.type`	Type of confidence interval to use, if possible. If "t", use lower and upper confidence interval values based on the standard deviation of the bootstrapped values and a t multiplier. If "pivotal", use lower and upper confidence interval values based on the basic bootstrap (also called the pivotal confidence interval). If "quantile", use lower and upper confidence interval values based on the quantiles of the bootstrap sample. If "proportion", use "t" unless the estimated proportion is less than 0.15 or the bounds are outside [0,1]. In that case, try "quantile" and constrain the bounds to be compatible with [0,1].
`control`	A list of control parameters for algorithm tuning. Constructed using `control.rds.estimates`.
`to.factor`	Force the outcome variable to be a factor.
`cont.breaks`	For continuous variates, some bootstrap procedures require categorical data. In these cases, in order to construct each bootstrap replicate, the outcome variable is split into `cont.breaks` categories.
`...`	Additional arguments for RDS.*.estimates.
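
The interval rules described for `ci.type` can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: `boot_ci` is a hypothetical helper, the bootstrap replicates are simulated, and the degrees of freedom used for the t multiplier (`length(boot) - 1`) are an assumption because the description does not state them.

```r
# Illustrative only: the four ci.type rules applied to a vector of
# bootstrap replicates. `boot_ci` is hypothetical, not an RDS function.
boot_ci <- function(estimate, boot, level = 0.95,
                    type = c("t", "pivotal", "quantile", "proportion"),
                    df = length(boot) - 1) {  # df choice is an assumption
  type <- match.arg(type)
  alpha <- 1 - level
  # Lower and upper quantiles of the bootstrap replicates
  q <- unname(quantile(boot, c(alpha / 2, 1 - alpha / 2)))
  # Estimate plus/minus t multiplier times bootstrap standard deviation
  t_ci <- estimate + c(-1, 1) * qt(1 - alpha / 2, df) * sd(boot)
  switch(type,
    t = t_ci,
    # Basic (pivotal) bootstrap: reflect the quantiles around the estimate
    pivotal = c(2 * estimate - q[2], 2 * estimate - q[1]),
    quantile = q,
    # "t" unless the estimate is small or the bounds leave [0, 1];
    # then fall back to quantiles clipped to [0, 1]
    proportion = if (estimate < 0.15 || t_ci[1] < 0 || t_ci[2] > 1) {
      pmin(pmax(q, 0), 1)
    } else {
      t_ci
    }
  )
}

set.seed(1)
boot <- rbeta(500, 20, 80)  # simulated replicates of a proportion near 0.2
est <- 0.2
for (type in c("t", "pivotal", "quantile", "proportion")) {
  print(round(boot_ci(est, boot, type = type), 3))
}
```

The "pivotal" and "quantile" intervals use the same two bootstrap quantiles; the pivotal form reflects them around the point estimate, so the two differ when the bootstrap distribution is skewed.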

Value

An object of class rds.interval.estimate summarizing the inference. The confidence interval and standard error are based on the bootstrap procedure. In addition, the object has an attribute bsresult that provides details of the bootstrap procedure. The contents of the bsresult attribute depend on the uncertainty used. If uncertainty=="Salganik" then bsresult is a vector of standard deviations of the bootstrap samples. If uncertainty=="Gile" then bsresult is a list with components for the bootstrap point estimate, the bootstrap samples themselves and the standard deviations of the bootstrap samples. If uncertainty=="SRS" then bsresult is NULL.
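
As a minimal sketch of inspecting the returned object, the following reuses the `fauxmadrona` call from the Examples section and reads the bsresult attribute with base R's `attr()`. It assumes the RDS package is installed and does nothing otherwise. The variable names `est` and `bs` are illustrative only.

```r
# Sketch: look at the class and the bsresult attribute of the result.
# Runs only if the RDS package is available.
if (requireNamespace("RDS", quietly = TRUE)) {
  library(RDS)
  data(fauxmadrona)

  est <- RDS.bootstrap.intervals(rds.data = fauxmadrona,
                                 weight.type = "RDS-II",
                                 uncertainty = "Salganik",
                                 outcome.variable = "disease",
                                 N = 1000,
                                 number.of.bootstrap.samples = 50)

  print(class(est))             # class of the returned object
  bs <- attr(est, "bsresult")   # details of the bootstrap procedure
  str(bs)                       # structure depends on `uncertainty`
}
```

Using `str()` rather than indexing into `bs` keeps the sketch valid for every uncertainty type, since the attribute's structure differs between them and is `NULL` for "SRS".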

References

Gile, Krista J. (2011). Improved Inference for Respondent-Driven Sampling Data with Application to HIV Prevalence Estimation. Journal of the American Statistical Association, 106, 135-146.

Gile, Krista J. and Handcock, Mark S. (2010). Respondent-Driven Sampling: An Assessment of Current Methodology. Sociological Methodology, 40, 285-327.

Examples

## Not run: 
library(RDS)

# RDS-II estimate of "disease" with the Salganik bootstrap
data(fauxmadrona)
RDS.bootstrap.intervals(rds.data = fauxmadrona, weight.type = "RDS-II",
                        uncertainty = "Salganik", outcome.variable = "disease",
                        N = 1000, number.of.bootstrap.samples = 50)

# "HCG" estimator with "HCG" uncertainty on the fauxtime data
data(fauxtime)
RDS.bootstrap.intervals(rds.data = fauxtime, weight.type = "HCG",
                        uncertainty = "HCG", outcome.variable = "var1",
                        N = 1000, number.of.bootstrap.samples = 10)

## End(Not run)

RDS

Respondent-Driven Sampling

v0.9-3

LGPL-2.1

Authors

Mark S. Handcock [aut, cre], Krista J. Gile [aut], Ian E. Fellows [aut], W. Whipple Neely [aut]

Initial release

2021-03-11