parameters: principal_components – R documentation

principal_components

Principal Component Analysis (PCA)

Description

This function performs a principal component analysis (PCA) and returns the loadings as a data frame.

Usage

principal_components(
  x,
  n = "auto",
  rotation = "none",
  sort = FALSE,
  threshold = NULL,
  standardize = TRUE,
  ...
)

closest_component(x)

rotated_data(x)

## S3 method for class 'parameters_efa'
predict(object, newdata = NULL, names = NULL, keep_na = TRUE, ...)

Arguments

`x`	A data frame or a statistical model.
`n`	Number of components to extract. If `n="all"`, then `n` is set to the number of variables minus 1 (`ncol(x)-1`). If `n="auto"` (default) or `n=NULL`, the number of components is selected through `n_factors` or `n_components`. In `reduce_parameters`, can also be `"max"`, in which case it will select all the components that are maximally pseudo-loaded (i.e., correlated) by at least one variable.
`rotation`	If not `"none"`, the PCA / FA will be computed using the psych package. Possible options include `"varimax"`, `"quartimax"`, `"promax"`, `"oblimin"`, `"simplimax"`, or `"cluster"` (and more). See `fa` for details.
`sort`	Sort the loadings.
`threshold`	A value between 0 and 1 indicates the (absolute) loading value below which loadings are removed. An integer higher than 1 indicates the n strongest loadings to retain. Can also be `"max"`, in which case it will only display the maximum loading per variable (the simplest structure).
`standardize`	A logical value indicating whether the variables should be standardized (centered and scaled) to have unit variance before the analysis (in general, such scaling is advisable).
`...`	Arguments passed to or from other methods.
`object`	An object of class `parameters_pca` or `parameters_efa`.
`newdata`	An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used.
`names`	Optional character vector to name columns of the returned data frame.
`keep_na`	Logical, if `TRUE`, predictions also return observations with missing values from the original data, hence the number of rows of predicted data and original data is equal.

Details

Complexity

Complexity represents the number of latent components needed to account for the observed variables. A perfect simple structure solution has a complexity of 1, in that each item loads on only one factor, whereas a solution with evenly distributed items has a complexity greater than 1 (Hofmann, 1978; Pettersson and Turkheimer, 2010).
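
As a rough illustration, Hofmann's index is commonly written per item as the squared sum of squared loadings divided by the sum of the fourth powers of the loadings. The base R sketch below applies that formula to a small made-up loadings matrix. The loading values and the helper name `item_complexity` are assumptions for illustration only and are not part of the parameters package.

```r
# Hypothetical loadings: 3 items (rows) on 2 components (columns)
loadings <- matrix(
  c(0.8, 0.0,   # item1 loads on one component only
    0.5, 0.5,   # item2 loads evenly on both components
    0.7, 0.2),  # item3 loads mostly on the first component
  ncol = 2, byrow = TRUE,
  dimnames = list(c("item1", "item2", "item3"), c("PC1", "PC2"))
)

# Hofmann's index per item (illustrative helper, not a package function):
# (sum of squared loadings)^2 / (sum of fourth powers of loadings)
item_complexity <- function(l) rowSums(l^2)^2 / rowSums(l^4)

complexity <- item_complexity(loadings)
print(round(complexity, 2))
```

Under this formula, an item that loads on a single component gets a complexity of exactly 1, and an item that loads equally on two components gets a complexity of 2.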

Uniqueness

Uniqueness represents the variance that is 'unique' to the variable and not shared with other variables. It is equal to 1 – communality (variance that is shared with other variables). A uniqueness of 0.20 suggests that 20% of that variable's variance is not shared with other variables in the overall factor model. The greater the uniqueness, the lower the relevance of the variable in the factor model.
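
The relationship between communality and uniqueness can be sketched in base R. The example assumes standardized variables and uses a made-up loadings matrix; the values are illustrative and do not come from the package.

```r
# Hypothetical loadings for 3 standardized items on 2 components
loadings <- matrix(
  c(0.8, 0.4,
    0.6, 0.2,
    0.3, 0.1),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("item1", "item2", "item3"), c("PC1", "PC2"))
)

# Communality: variance of each item explained by the retained components
communality <- rowSums(loadings^2)

# Uniqueness: the remaining share of each item's variance
uniqueness <- 1 - communality
print(uniqueness)
```

Here `item1` has a communality of 0.80 and therefore a uniqueness of 0.20, matching the interpretation given above.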

MSA

MSA represents the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (Kaiser and Rice, 1974) for each item. It indicates whether there is enough data for each factor to give reliable results for the PCA. The value should be > 0.6, and desirable values are > 0.8 (Tabachnick and Fidell, 2013).
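
For intuition, the per-item measure can be approximated in base R from the correlation matrix and the partial correlations derived from its inverse. This is a minimal sketch of the textbook KMO formula, not the code used by the package; the helper name `kmo_msa` is hypothetical.

```r
# Per-item KMO measure of sampling adequacy for a numeric data frame
# (illustrative helper, assuming an invertible correlation matrix)
kmo_msa <- function(data) {
  r <- cor(data)
  r_inv <- solve(r)  # inverse of the correlation matrix

  # Partial correlation of each item pair, controlling for all other items
  q <- -cov2cor(r_inv)

  # Only off-diagonal elements enter the formula
  diag(r) <- 0
  diag(q) <- 0

  # Squared correlations relative to squared correlations plus
  # squared partial correlations, summed per item
  colSums(r^2) / (colSums(r^2) + colSums(q^2))
}

msa <- kmo_msa(mtcars[, 1:7])
print(round(msa, 2))
```

Values close to 1 mean that an item's correlations with the others are large relative to its partial correlations, which is the situation the thresholds above are meant to detect.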

PCA or FA?

There is a simplified rule of thumb that may help to decide whether to run a factor analysis or a principal component analysis:

Run factor analysis if you assume or wish to test a theoretical model of latent factors causing observed variables.
Run principal component analysis if you simply want to reduce your correlated observed variables to a smaller set of important independent composite variables.

(Source: CrossValidated)

Computing Item Scores

Use get_scores() to compute scores for the "subscales" represented by the extracted principal components. get_scores() takes the results from principal_components() and extracts the variables for each component found by the PCA. Then, for each of these "subscales", raw means are calculated (which equals adding up the single items and dividing by the number of items). This results in a sum score for each component from the PCA, which is on the same scale as the original, single items that were used to compute the PCA. Alternatively, use predict() to back-predict scores for each component; it accepts newdata or a vector of names for the components.
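
The raw-mean computation described above can be sketched in base R. The item-to-component assignment below is made up for illustration and only mimics the shape of what closest_component() is described as returning; the sketch is not the implementation of get_scores().

```r
# Items used for the (hypothetical) PCA
items <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat")]

# Made-up assignment of each item to a component index (an assumption,
# standing in for the result of closest_component())
assignment <- c(mpg = 1, cyl = 1, disp = 1, hp = 2, drat = 2)
components <- sort(unique(assignment))

# For each component, average the items assigned to it, row by row
subscale_means <- sapply(components, function(k) {
  rowMeans(items[, names(assignment)[assignment == k], drop = FALSE])
})
colnames(subscale_means) <- paste0("Component_", components)

print(head(subscale_means))
```

The mtcars columns are on very different scales, so these particular means are not meaningful in themselves; with items that share a response scale, each score stays on that same scale.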

Value

A data frame of loadings.

Note

There is a summary() method that prints the eigenvalues and (explained) variance for each extracted component. closest_component() returns a numeric vector with the assigned component index for each column of the original data frame. rotated_data() returns the rotated data, including missing values, so that it matches the original data frame. There is also a plot() method implemented in the see package.

References

Kaiser, H. F., and Rice, J. (1974). Little jiffy, mark iv. Educational and Psychological Measurement, 34(1), 111-117
Hofmann, R. (1978). Complexity and simplicity as objective indices descriptive of factor solutions. Multivariate Behavioral Research, 13:2, 247-250, doi: 10.1207/s15327906mbr1302_9
Pettersson, E., & Turkheimer, E. (2010). Item selection, evaluation, and simple structure in personality data. Journal of research in personality, 44(4), 407-420, doi: 10.1016/j.jrp.2010.03.002
Tabachnick, B. G., and Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston: Pearson Education.

Examples

library(parameters)
if (require("psych")) {
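  # Extract all components; hide loadings with an absolute value below 0.2
  principal_components(mtcars[, 1:7], n = "all", threshold = 0.2)
  # Two components with oblimin rotation; show only the maximum loading per variable
  principal_components(mtcars[, 1:7],
    n = 2, rotation = "oblimin",
    threshold = "max", sort = TRUE
  )
  # Two components; retain only the 2 strongest loadings
  principal_components(mtcars[, 1:7], n = 2, threshold = 2, sort = TRUE)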

  pca <- principal_components(mtcars[, 1:5], n = 2, rotation = "varimax")
  pca # Print loadings
  summary(pca) # Print information about the factors
  predict(pca, names = c("Component1", "Component2")) # Back-predict scores

  # which variables from the original data belong to which extracted component?
  closest_component(pca)
}

# Automated number of components
principal_components(mtcars[, 1:4], n = "auto")

parameters

Processing of Model Parameters

v0.13.0

GPL-3

Authors

Daniel Lüdecke [aut, cre] (<https://orcid.org/0000-0002-8895-3206>, @strengejacke), Dominique Makowski [aut] (<https://orcid.org/0000-0001-5375-9967>), Mattan S. Ben-Shachar [aut] (<https://orcid.org/0000-0002-4287-4801>), Indrajeet Patil [aut] (<https://orcid.org/0000-0003-1995-6531>, @patilindrajeets), Søren Højsgaard [aut], Zen J. Lau [ctb], Vincent Arel-Bundock [ctb] (<https://orcid.org/0000-0003-1995-6531>, @vincentab), Jeffrey Girard [ctb] (<https://orcid.org/0000-0002-7359-3746>, @jeffreymgirard)
