
MA.estimates

MA Estimates


Description

This function computes the sequential sampling (MA) estimates for a categorical or numeric variable.

Usage

MA.estimates(
  rds.data,
  trait.variable,
  seed.selection = "degree",
  number.of.seeds = NULL,
  number.of.coupons = NULL,
  number.of.iterations = 3,
  N = NULL,
  M1 = 25,
  M2 = 20,
  seed = 1,
  initial.sampling.probabilities = NULL,
  MPLE.samplesize = 50000,
  SAN.maxit = 5,
  SAN.nsteps = 2^19,
  sim.interval = 10000,
  number.of.cross.ties = NULL,
  max.degree = NULL,
  parallel = 1,
  parallel.type = snow::getClusterOption("type"),
  full.output = FALSE,
  verbose = TRUE
)

Arguments

rds.data

An rds.data.frame that indicates recruitment patterns by a pair of attributes named “id” and “recruiter.id”.

trait.variable

A string giving the name of the variable in the rds.data that contains a categorical or numeric variable to be analyzed.

seed.selection

An estimate of the mechanism guiding the choice of seeds. The choices are

"allwithtrait"

indicating that all the seeds had the trait;

"random"

indicating that the seeds are treated as if they were a simple random sample of individuals from the population;

"sample"

indicating that the seeds are taken as those in the sample (and resampled for the population with that composition if necessary);

"degree"

indicating that the probability of being a seed is proportional to the degree of the individual;

"allwithtraitdegree"

indicating that all the seeds had the trait and the probability of being a seed is proportional to the degree of the respondent.
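
Since a run can be long, it may help to check the seed.selection value before calling the function. The following is a minimal sketch using base R's match.arg(); the helper name check_seed_selection is hypothetical and is not part of the RDS package.

```r
# Hypothetical helper (not part of RDS): validate a seed.selection value
# against the five choices documented above.
check_seed_selection <- function(seed.selection = c("degree", "allwithtrait",
                                                    "random", "sample",
                                                    "allwithtraitdegree")) {
  # match.arg() returns the first choice when the argument is missing and
  # stops with an error when the value is not one of the choices.
  match.arg(seed.selection)
}

check_seed_selection()           # "degree", the documented default
check_seed_selection("random")   # "random"
```

An unrecognised value such as "bogus" stops with an error instead of being passed on silently.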

number.of.seeds

The number of seeds chosen to initiate the sampling.

number.of.coupons

The number of coupons given to each respondent.

number.of.iterations

The number of iterations used at the core of the algorithm.

N

An estimate of the number of members of the population being sampled. If NULL it is read as the pop.size.mid attribute of the rds.data frame. If that is missing it defaults to 1000.
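
The fallback order for N described above can be sketched in base R as follows. The helper resolve_population_size and the toy data frame are hypothetical, and the attribute name pop.size.mid is taken from the description; the name stored on a real rds.data.frame may differ, so inspect attributes(rds.data) before relying on it.

```r
# Hypothetical helper (not part of RDS): resolve the population size using
# the fallback order described for the N argument.
resolve_population_size <- function(rds.data, N = NULL) {
  if (!is.null(N)) return(N)                  # 1. an explicit N wins
  n_attr <- attr(rds.data, "pop.size.mid")    # 2. attribute named in the text
  if (!is.null(n_attr)) return(n_attr)
  1000                                        # 3. documented default
}

toy <- data.frame(id = 1:3)                   # toy stand-in, not a real rds.data.frame
resolve_population_size(toy)                  # 1000
attr(toy, "pop.size.mid") <- 5000
resolve_population_size(toy)                  # 5000
resolve_population_size(toy, N = 800)         # 800
```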

M1

The number of networked populations generated at each iteration.

M2

The number of (full) RDS samples generated for each networked population at each iteration.
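
As a rough guide to the computational cost, the number of simulated RDS samples follows from M1, M2 and number.of.iterations. The arithmetic below uses the documented defaults and assumes that a fresh set of networked populations and samples is generated at every iteration, as the descriptions above suggest.

```r
M1 <- 25                     # networked populations per iteration (default)
M2 <- 20                     # RDS samples per networked population (default)
number.of.iterations <- 3    # default

samples_per_iteration <- M1 * M2
total_samples <- samples_per_iteration * number.of.iterations

samples_per_iteration   # 500
total_samples           # 1500
```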

seed

The random number seed used to initiate the computations.

initial.sampling.probabilities

Initial sampling probabilities for the algorithm. If missing, they are taken as proportional to degree, which is almost always the best starting value.
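
To illustrate what "proportional to degree" means, the sketch below normalises a made-up degree vector so that the probabilities sum to one. Whether the function expects normalised values, and in what order, is an assumption here and is not stated in this page.

```r
degree <- c(2, 5, 10, 3)              # made-up reported degrees for four respondents
init_probs <- degree / sum(degree)    # probabilities proportional to degree

init_probs        # 0.10 0.25 0.50 0.15
sum(init_probs)   # 1
```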

MPLE.samplesize

Number of samples to take in the computation of the maximum pseudolikelihood estimator (MPLE) of the working model parameter. The default is almost always sufficient.

SAN.maxit

A ceiling on the number of simulated annealing iterations.

SAN.nsteps

Number of MCMC proposals for all the annealing runs combined.

sim.interval

Number of MCMC steps between each of the M1 sampled networks per iteration.

number.of.cross.ties

The expected number of ties between those with the trait and those without. If missing, it is computed from the respondents' reports of the number of ties they have to population members who have the trait (i.e. ties.to.trait.variable) and who do not have the trait (i.e. ties.not.to.trait.variable).

max.degree

Imposes a ceiling on degree size.

parallel

Number of processors to use in the computations. The default is 1, that is no parallel processing.

parallel.type

The type of cluster to start, e.g. 'sock' or 'MPI'.

full.output

Logical. If TRUE, more detailed output is returned; see Value.

verbose

Should verbose diagnostics be printed while the algorithm is running?

Value

If trait.variable is numeric then the model-assisted estimate of the mean is returned, otherwise a vector of proportion estimates is returned. If full.output=TRUE this leads to:

If full.output=FALSE this leads to an object of class rds.interval.estimate which is a list with components

  • estimate: the numerical point estimate of the proportion of the trait.variable.

  • interval: a matrix with six columns and one row per category of trait.variable:

    • point estimate: The HT estimate of the population mean.

    • 95% Lower Bound: Lower 95% confidence bound

    • 95% Upper Bound: Upper 95% confidence bound

    • Design Effect: The design effect of the RDS

    • s.e.: standard error

    • n: count of the number of sample values with that value of the trait

rds.data

An rds.data.frame that indicates recruitment patterns by a pair of attributes named “id” and “recruiter.id”.

N

An estimate of the number of members of the population being sampled. If NULL it is read as the pop.size.mid attribute of the rds.data frame. If that is missing it defaults to 1000.

M1

The number of networked populations generated at each iteration.

M2

The number of (full) RDS populations generated for each networked population at each iteration.

seed

The random number seed used to initiate the computations.

seed.selection

An estimate of the mechanism guiding the choice of seeds. The choices are

"allwithtrait"

indicating that all the seeds had the trait;

"random"

indicating that the seeds are treated as if they were a simple random sample of individuals from the population;

"sample"

indicating that the seeds are taken as those in the sample (and resampled for the population with that composition if necessary);

"degree"

indicating that the probability of being a seed is proportional to the degree of the individual;

"allwithtraitdegree"

indicating that all the seeds had the trait and the probability of being a seed is proportional to the degree of the respondent.

number.of.seeds

The number of seeds chosen to initiate the sampling.

number.of.coupons

The number of coupons given to each respondent.

number.of.iterations

The number of iterations used at the core of the algorithm.

outcome.variable

The name of the outcome variable.

weight.type

The type of weighting used (i.e. MA).

uncertainty

The type of weighting used (i.e. MA).

details

A list of other diagnostic output from the computations.

varestBS

Output from the bootstrap procedure. A list with two elements: var is the bootstrap variance, and BSest is the vector of bootstrap estimates themselves.

coefficient

Estimate of the parameter of the ERGM for the network.

Author(s)

Krista J. Gile with help from Mark S. Handcock

References

Gile, Krista J., 2011. Improved Inference for Respondent-Driven Sampling Data with Application to HIV Prevalence Estimation, Journal of the American Statistical Association, 106, 135-146.

Gile, Krista J., Handcock, Mark S., 2010. Respondent-driven Sampling: An Assessment of Current Methodology, Sociological Methodology, 40, 285-327.

See Also

Examples

## Not run: 
library(RDS)
# Load the example data set and compute the MA estimates for trait "X".
data(faux)
MA.estimates(rds.data = faux, trait.variable = "X")

## End(Not run)

RDS

Respondent-Driven Sampling

v0.9-3
LGPL-2.1
Authors
Mark S. Handcock [aut, cre], Krista J. Gile [aut], Ian E. Fellows [aut], W. Whipple Neely [aut]
Initial release
2021-03-11
