Bootstrapped and normal confidence intervals for raw and composite correlations
Although normal theory provides confidence intervals for correlations, these intervals are particularly problematic with Synthetic Aperture Personality Assessment (SAPA) data, where the individual items are Massively Missing at Random. Bootstrapped confidence intervals are found for Pearson, Spearman, Kendall, tetrachoric, or polychoric correlations and for scales made from those correlations. If given a correlation matrix and sample size(s), normal theory confidence intervals are provided.
corCi(x, keys = NULL, n.iter = 100, p = 0.05, overlap = FALSE, poly = FALSE, method = "pearson", plot = TRUE, minlength = 5, n = NULL, ...)
cor.ci(x, keys = NULL, n.iter = 100, p = 0.05, overlap = FALSE, poly = FALSE, method = "pearson", plot = TRUE, minlength = 5, n = NULL, ...)
x |
The raw data, or a correlation matrix if not doing bootstrapping |
keys |
If NULL, then the confidence intervals of the raw correlations are found. Otherwise, composite scales are formed from the keys applied to the correlation matrix (in a logic similar to |
n.iter |
The number of iterations to bootstrap over. This will be very slow if using tetrachoric or polychoric correlations. |
p |
The upper and lower confidence region will include 1-p of the distribution. |
overlap |
If TRUE, the correlation between overlapping scales is corrected for item overlap. |
poly |
If FALSE, then find the correlations using the method specified (defaults to Pearson). If TRUE, the polychoric correlations will be found (slowly). Because the polychoric function uses multiple cores (if available), and corCi does as well, the number of cores used is options("mc.cores")^2. |
method |
"pearson","spearman", "kendall" |
plot |
Show the correlation plot with correlations scaled by the probability values. To show the matrix in terms of the confidence intervals, use |
minlength |
The minimum length to use when abbreviating the labels of the confidence intervals. Defaults to 5. |
n |
If finding confidence intervals from a correlation matrix, specify the sample size n. |
... |
Other parameters for axis (e.g., cex.axis to change the font size, srt to rotate the numbers in the plot) |
If given a correlation matrix, then confidence intervals are found based upon the sample sizes using the conventional r2z Fisher transformation (fisherz) and the normal distribution.
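The normal-theory interval described above can be sketched in base R. This is only an illustration of the Fisher transformation approach, not the package's own code: `r_ci` is a hypothetical helper name, and the standard error 1/sqrt(n - 3) is the usual textbook approximation, assumed here.

```r
# Normal-theory confidence interval for a correlation via the Fisher r-to-z transform.
# r_ci is a hypothetical helper written for this illustration.
r_ci <- function(r, n, p = 0.05) {
  z  <- atanh(r)            # Fisher z transform of r
  se <- 1 / sqrt(n - 3)     # approximate standard error of z (assumed)
  q  <- qnorm(1 - p / 2)    # two-sided normal critical value
  c(lower = tanh(z - q * se), r = r, upper = tanh(z + q * se))
}

r_ci(0.5, n = 213)
```

Because the interval is built on the z scale and then transformed back, it is asymmetric around r whenever r is not zero.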
If given raw data, correlations are found. If keys are specified (the normal case), then composite scales based upon the correlations are found and reported. This is the same procedure as done using cluster.cor or scoreItems.
Then (with raw data) the data are recreated n.iter times by sampling subjects (rows) with replacement and the correlations (and composite scales) are found again (and again and again). Means and standard deviations of these values are calculated based upon the Fisher Z transform of the correlations. Summary statistics include the original correlations and their confidence intervals. For those who want the complete set of replications, those are available as an object in the resulting output.
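The resampling logic can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the function's actual implementation: the data are simulated, the names `dat`, `one_rep`, and `lower_tri` are hypothetical, and pairwise deletion is assumed for the missing values.

```r
set.seed(42)                                  # reproducible illustration only
n <- 200
x <- rnorm(n)
dat <- data.frame(x = x, y = 0.5 * x + rnorm(n), w = rnorm(n))
dat$y[sample(n, 40)] <- NA                    # simulated missingness

n_iter <- 100
p <- 0.05
lower_tri <- function(R) R[lower.tri(R)]      # unique off-diagonal correlations

# One bootstrap replicate: resample subjects (rows) with replacement, recompute r
one_rep <- function(d) {
  idx <- sample(nrow(d), replace = TRUE)
  lower_tri(cor(d[idx, ], use = "pairwise.complete.obs"))
}
reps <- t(replicate(n_iter, one_rep(dat)))    # n_iter rows, one column per correlation

z      <- atanh(reps)                         # Fisher z of every replicate
z_mean <- colMeans(z)
z_sd   <- apply(z, 2, sd)
q      <- qnorm(1 - p / 2)

ci <- data.frame(
  lower.emp  = apply(reps, 2, quantile, probs = p / 2),      # empirical quantile
  lower.norm = tanh(z_mean - q * z_sd),                      # normal approximation
  estimate   = lower_tri(cor(dat, use = "pairwise.complete.obs")),
  upper.norm = tanh(z_mean + q * z_sd),
  upper.emp  = apply(reps, 2, quantile, probs = 1 - p / 2)
)
ci
```

The sketch reports both kinds of bounds side by side, so the normal approximation on the z scale can be compared with the empirical quantiles of the replicates.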
Although particularly useful for SAPA (https://www.sapa-project.org/) type data where we have lots of missing data, this will work for any normal data set as well.
Although the correlations are shown automatically as a cor.plot, it is possible to show the upper and lower confidence intervals by using cor.plot.upperLowerCi. This will also return, invisibly, a matrix for printing with the lower and upper bounds of the correlations shown below and above the diagonal (see the first example).
rho |
The original (composite) correlation matrix. |
means |
Mean (of Fisher transformed) correlation retransformed back to the r units |
sds |
Standard deviation of Fisher transformed correlations |
ci |
Mean +/- alpha/2 of the z scores as well as the alpha/2 and 1-alpha/2 quantiles. These are labeled as lower.emp(irical), lower.norm(al), upper.norm and upper.emp. |
replicates |
The observed replication values so one can do one's own estimates |
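As one example of such an estimate, the replication values can be used to build a percentile interval for the difference between two correlations. The sketch below is self-contained and uses a simulated stand-in matrix `reps`. It assumes one row per bootstrap iteration and one column per correlation, which should be checked (for example with str()) against the actual returned object. The column names are hypothetical.

```r
set.seed(1)
# Simulated stand-in for the replication values: 100 iterations x 3 correlations.
# Layout and column names are assumptions made for this illustration.
reps <- cbind(r12 = tanh(rnorm(100, atanh(0.5), 0.07)),
              r13 = tanh(rnorm(100, atanh(0.3), 0.07)),
              r23 = tanh(rnorm(100, atanh(0.1), 0.07)))

# Percentile interval for the difference between two correlations
diff_12_13 <- reps[, "r12"] - reps[, "r13"]
quantile(diff_12_13, probs = c(0.025, 0.975))
```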
William Revelle
For SAPA type data, see Revelle, W., Wilt, J., and Rosenthal, A. (2010) Personality and Cognition: The Personality-Cognition Link. In Gruszka, A. and Matthews, G. and Szymura, B. (Eds.) Handbook of Individual Differences in Cognition: Attention, Memory and Executive Control, Springer.
make.keys, cluster.cor, and scoreItems for forming synthetic correlation matrices from composites of item correlations. See scoreOverlap for correcting for item overlap in scales. See also corr.test for standard significance testing of correlation matrices. See also lowerCor for finding and printing correlation matrices, as well as lowerMat for displaying them. Also see cor.plot.upperLowerCi for displaying the confidence intervals graphically.
#find confidence intervals of a correlation matrix with specified sample size
ci <- corCi(Thurstone[1:6,1:6],n=213)
ci  #show them
R <- cor.plot.upperLowerCi(ci)  #show them graphically
R   #show them as a matrix

#confidence intervals by bootstrapping requires raw data
corCi(psychTools::bfi[1:200,1:10])  # just the first 10 variables

#The keys have overlapping scales
keys <- list(agree=c("-A1","A2","A3","A4","A5"),
             conscientious=c("C1","C2","C3","-C4","-C5"),
             extraversion=c("-E1","-E2","E3","E4","E5"),
             neuroticism=c("N1","N2","N3","N4","N5"),
             openness=c("O1","-O2","O3","O4","-O5"),
             alpha=c("-A1","A2","A3","A4","A5","C1","C2","C3","-C4","-C5","N1","N2","N3","N4","N5"),
             beta=c("-E1","-E2","E3","E4","E5","O1","-O2","O3","O4","-O5"))

#do not correct for item overlap
rci <- corCi(psychTools::bfi[1:200,],keys,n.iter=10,main="correlation with overlapping scales")
#also shows the graphic -note the overlap

#correct for overlap
rci <- cor.ci(psychTools::bfi[1:200,],keys,overlap=TRUE,n.iter=10,main="Correct for overlap")

#show the confidence intervals
ci <- cor.plot.upperLowerCi(rci)  #to show the upper and lower confidence intervals
ci  #print the confidence intervals in matrix form