tTestN

Sample Size for a One- or Two-Sample t-Test


Description

Compute the sample size necessary to achieve a specified power for a one- or two-sample t-test, given the scaled difference and significance level.

Usage

tTestN(delta.over.sigma, alpha = 0.05, power = 0.95, 
    sample.type = ifelse(!is.null(n2), "two.sample", "one.sample"), 
    alternative = "two.sided", approx = FALSE, n2 = NULL, round.up = TRUE, 
    n.max = 5000, tol = 1e-07, maxiter = 1000)

Arguments

delta.over.sigma

numeric vector specifying the ratio of the true difference δ (δ = μ - μ_0 for the one-sample case and δ = μ_1 - μ_2 for the two-sample case) to the population standard deviation (σ). This is also called the “scaled difference”.

alpha

numeric vector of numbers between 0 and 1 indicating the Type I error level associated with the hypothesis test. The default value is alpha=0.05.

power

numeric vector of numbers between 0 and 1 indicating the power associated with the hypothesis test. The default value is power=0.95.

sample.type

character string indicating whether to compute the sample size based on a one-sample or two-sample hypothesis test. When sample.type="one.sample", the computed sample size is based on a hypothesis test for a single mean. When sample.type="two.sample", the computed sample size is based on a hypothesis test for the difference between two means. The default value is sample.type="one.sample" unless the argument n2 is supplied.

alternative

character string indicating the kind of alternative hypothesis. The possible values are:

  • "two.sided" (the default). H_a: μ ≠ μ_0 for the one-sample case and H_a: μ_1 ≠ μ_2 for the two-sample case.

  • "greater". H_a: μ > μ_0 for the one-sample case and H_a: μ_1 > μ_2 for the two-sample case.

  • "less". H_a: μ < μ_0 for the one-sample case and H_a: μ_1 < μ_2 for the two-sample case.

approx

logical scalar indicating whether to base the sample size computation on an approximation to the non-central t-distribution when evaluating the power. The default value is FALSE.

n2

numeric vector of sample sizes for group 2. The default value is NULL in which case it is assumed that the sample sizes for groups 1 and 2 are equal. This argument is ignored when sample.type="one.sample". Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.

round.up

logical scalar indicating whether to round up the values of the computed sample size(s) to the next largest integer. The default value is TRUE.

n.max

positive integer greater than 1 indicating the maximum sample size when sample.type="one.sample" or the maximum sample size for group 1 when sample.type="two.sample". The default value is n.max=5000.

tol

numeric scalar indicating the tolerance to use in the uniroot search algorithm. The default value is tol=1e-7.

maxiter

positive integer indicating the maximum number of iterations to pass to the uniroot function. The default value is maxiter=1000.

Details

Formulas for the power of the t-test for specified values of the sample size, scaled difference, and Type I error level are given in the help file for tTestPower. The function tTestN uses the uniroot search algorithm to determine the required sample size(s) for specified values of the power, scaled difference, and Type I error level.
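The search described above can be illustrated with base R alone. The following is a minimal sketch, not the EnvStats source: it assumes the standard non-central t formulation of power for a two-sided, one-sample t-test, and the names one_sample_power and sketch_n are hypothetical helpers introduced only for this illustration.

```r
# Sketch only: hypothetical helpers, not part of EnvStats.
# Power of a two-sided, one-sample t-test, assuming the usual
# non-central t formulation with df = n - 1 and ncp = sqrt(n) * delta/sigma.
one_sample_power <- function(n, delta_over_sigma, alpha = 0.05) {
  df <- n - 1
  ncp <- sqrt(n) * delta_over_sigma
  crit <- qt(1 - alpha / 2, df)
  # Probability of rejecting in the upper tail plus the lower tail
  pt(crit, df, ncp = ncp, lower.tail = FALSE) + pt(-crit, df, ncp = ncp)
}

# Find the (fractional) n at which the power equals the target,
# then round up to the next integer.
sketch_n <- function(delta_over_sigma, alpha = 0.05, power = 0.95,
                     n_max = 1000, tol = 1e-7, maxiter = 1000) {
  root <- uniroot(
    function(n) one_sample_power(n, delta_over_sigma, alpha) - power,
    lower = 2, upper = n_max, tol = tol, maxiter = maxiter
  )$root
  ceiling(root)
}

n_req <- sketch_n(delta_over_sigma = 0.5, power = 0.9)
n_req
```

Because the power is increasing in n for a fixed non-zero scaled difference, the rounded-up root is the smallest integer sample size that meets the target under these assumptions. uniroot stops with an error if the target power is not bracketed between n = 2 and n_max.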

Value

When sample.type="one.sample", tTestN returns a numeric vector of sample sizes.

When sample.type="two.sample" and n2 is not supplied, equal sample sizes for each group are assumed and tTestN returns a numeric vector of sample sizes indicating the required sample size for each group.

When sample.type="two.sample" and n2 is supplied, tTestN returns a list with two components called n1 and n2, specifying the sample sizes for each group.
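To make the list-valued return for a fixed n2 more concrete, here is a minimal base R sketch of the same kind of search for the group 1 sample size. It is an illustration under assumptions, not the EnvStats implementation: it assumes a pooled-variance two-sample t-test with df = n1 + n2 - 2 and non-centrality parameter (δ/σ) / sqrt(1/n1 + 1/n2), and two_sample_power and sketch_n1 are hypothetical names.

```r
# Sketch only: hypothetical helpers, not part of EnvStats.
# Power of a two-sided, pooled-variance two-sample t-test.
two_sample_power <- function(n1, n2, delta_over_sigma, alpha = 0.05) {
  df <- n1 + n2 - 2
  ncp <- delta_over_sigma / sqrt(1 / n1 + 1 / n2)
  crit <- qt(1 - alpha / 2, df)
  pt(crit, df, ncp = ncp, lower.tail = FALSE) + pt(-crit, df, ncp = ncp)
}

# Hold n2 fixed and search for the group 1 size that reaches the target power.
sketch_n1 <- function(delta_over_sigma, n2, alpha = 0.05, power = 0.95,
                      n_max = 1000, tol = 1e-7) {
  root <- uniroot(
    function(n1) two_sample_power(n1, n2, delta_over_sigma, alpha) - power,
    lower = 2, upper = n_max, tol = tol
  )$root
  # Mirror the documented shape of the result: a list with n1 and n2
  list(n1 = ceiling(root), n2 = n2)
}

res <- sketch_n1(delta_over_sigma = 1, n2 = 20)
res
```

With n2 held fixed, the attainable power is bounded no matter how large n1 becomes, so for a small n2 the target may be out of reach; in that case uniroot in this sketch stops with an error because the endpoints do not bracket a root.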

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

Examples

  # Look at how the required sample size for the one-sample t-test 
  # increases with increasing required power:

  seq(0.5, 0.9, by = 0.1) 
  #[1] 0.5 0.6 0.7 0.8 0.9 

  tTestN(delta.over.sigma = 0.5, power = seq(0.5, 0.9, by = 0.1)) 
  #[1] 18 22 27 34 44

  #----------

  # Repeat the last example, but compute the sample size based on the 
  # approximation to the power instead of the exact method:

  tTestN(delta.over.sigma = 0.5, power = seq(0.5, 0.9, by = 0.1), 
    approx = TRUE) 
  #[1] 18 22 27 34 45

  #==========

  # Look at how the required sample size for the two-sample t-test 
  # decreases with increasing scaled difference:

  seq(0.5, 2, by = 0.5) 
  #[1] 0.5 1.0 1.5 2.0 

  tTestN(delta.over.sigma = seq(0.5, 2, by = 0.5), sample.type = "two") 
  #[1] 105  27  13   8

  #----------

  # Look at how the required sample size for the two-sample t-test decreases 
  # with increasing values of Type I error:

  tTestN(delta.over.sigma = 0.5, alpha = c(0.001, 0.01, 0.05, 0.1), 
    sample.type="two") 
  #[1] 198 145 105  88

  #----------

  # For the two-sample t-test, compare the total sample size required to 
  # detect a scaled difference of 1 for equal sample sizes versus the case 
  # when the sample size for the second group is constrained to be 20.  
  # Assume a 5% significance level and 95% power.  Note that for the case 
  # of equal sample sizes, a total of 54 samples (27+27) are required, 
  # whereas when n2 is constrained to be 20, a total of 62 samples 
  # (42 + 20) are required.

  tTestN(1, sample.type="two") 
  #[1] 27 

  tTestN(1, n2 = 20)
  #$n1
  #[1] 42
  #
  #$n2
  #[1] 20

  #==========

  # Modifying the example on pages 21-4 to 21-5 of USEPA (2009), determine the 
  # required sample size to detect a mean aldicarb level greater than the MCL 
  # of 7 ppb at the third compliance well with a power of 95%, assuming the 
  # true mean is 10 or 14.  Use the estimated standard deviation from the 
  # first four months of data to estimate the true population standard 
  # deviation, use a Type I error level of alpha=0.01, and assume an 
  # upper one-sided alternative (third compliance well mean larger than 7).  
  # (The data are stored in EPA.09.Ex.21.1.aldicarb.df.) 
  # Note that the required sample size changes from 11 to 5 as the true mean 
  # increases from 10 to 14.


  EPA.09.Ex.21.1.aldicarb.df
  #   Month   Well Aldicarb.ppb
  #1      1 Well.1         19.9
  #2      2 Well.1         29.6
  #3      3 Well.1         18.7
  #4      4 Well.1         24.2
  #5      1 Well.2         23.7
  #6      2 Well.2         21.9
  #7      3 Well.2         26.9
  #8      4 Well.2         26.1
  #9      1 Well.3          5.6
  #10     2 Well.3          3.3
  #11     3 Well.3          2.3
  #12     4 Well.3          6.9

  sigma <- with(EPA.09.Ex.21.1.aldicarb.df, 
    sd(Aldicarb.ppb[Well == "Well.3"]))

  sigma
  #[1] 2.101388

  tTestN(delta.over.sigma = (c(10, 14) - 7)/sigma, 
    alpha = 0.01, sample.type="one", alternative="greater") 
  #[1] 11  5


  # Clean up
  #---------
  rm(sigma)
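As an independent cross-check that does not require EnvStats, the base R function power.t.test from the stats package solves the same kind of problem and returns a fractional sample size. The sketch below mirrors the two-sample example with a scaled difference of 0.5; setting sd = 1 so that delta plays the role of the scaled difference is an assumption of this illustration, and small differences from tTestN are possible because the two functions need not use identical conventions.

```r
# Cross-check with base R (stats package); sd = 1 makes delta the scaled difference.
fit <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.95,
                    type = "two.sample", alternative = "two.sided")

# power.t.test reports a fractional per-group n; round up to get an integer.
n_per_group <- ceiling(fit$n)
n_per_group
```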

EnvStats

Package for Environmental Statistics, Including US EPA Guidance

v2.4.0
GPL (>= 3)
Authors
Steven P. Millard [aut], Alexander Kowarik [ctb, cre] (<https://orcid.org/0000-0001-8598-4130>)
Initial release
2020-10-20