readingSkills

Reading Skills


Description

A toy data set illustrating the spurious correlation between reading skills and shoe size in school-children.

Usage

data("readingSkills")

Format

A data frame with 200 observations on the following 4 variables.

nativeSpeaker

a factor with levels no and yes, where yes indicates that the child is a native speaker of the language of the reading test.

age

age of the child in years.

shoeSize

shoe size of the child in cm.

score

raw score on the reading test.

Details

In this artificial data set, which was generated by means of a linear model, age and nativeSpeaker are actual predictors of the score, while the spurious correlation between score and shoeSize arises only because both depend on age.
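
The mechanism can be reproduced with a small simulation in base R. The sketch below is not the code that generated readingSkills: the coefficients, noise levels, age range and the object name sim are assumptions chosen purely for illustration.

```r
# Illustrative simulation only: NOT the generating process of readingSkills.
# All coefficients, ranges and noise levels below are arbitrary assumptions.
set.seed(1)
n <- 200
age <- runif(n, min = 5, max = 11)   # age in years
nativeSpeaker <- factor(sample(c("no", "yes"), n, replace = TRUE))

# shoe size depends on age only
shoeSize <- 15 + 1.5 * age + rnorm(n, sd = 1)
# score depends on age and nativeSpeaker, but not on shoeSize
score <- 10 + 4 * age + 5 * (nativeSpeaker == "yes") + rnorm(n, sd = 3)

sim <- data.frame(nativeSpeaker, age, shoeSize, score)

# marginal correlation between score and shoeSize,
# induced entirely by the shared dependence on age
r_marginal <- cor(sim$score, sim$shoeSize)

# coefficient of shoeSize once age and nativeSpeaker are in the model
fit <- lm(score ~ nativeSpeaker + age + shoeSize, data = sim)
b_shoe <- coef(fit)["shoeSize"]

print(r_marginal)
print(summary(fit)$coefficients)
```

In such a setup the marginal correlation is large, while the shoeSize coefficient should lie close to zero once age is included in the model.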

The true predictors can be identified, e.g., by means of partial correlations, standardized beta coefficients in linear models or the conditional random forest variable importance, but not by means of the standard random forest variable importance (see example).
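
As a minimal sketch of the first two approaches, the partial correlation can be computed from regression residuals and the standardized coefficients from a linear model on scaled variables. This assumes the party package is installed; the object names (r_marginal, r_partial, rs_std, beta_std) are illustrative and not part of the package.

```r
# Assumes the party package is installed; it ships the readingSkills data.
data("readingSkills", package = "party")

# marginal correlation between score and shoeSize
r_marginal <- cor(readingSkills$score, readingSkills$shoeSize)

# partial correlation given age and nativeSpeaker: correlate the residuals
# of both variables after regressing each on the control variables
res_score <- resid(lm(score ~ age + nativeSpeaker, data = readingSkills))
res_shoe  <- resid(lm(shoeSize ~ age + nativeSpeaker, data = readingSkills))
r_partial <- cor(res_score, res_shoe)

# standardized beta coefficients: scale the numeric variables, then fit lm
# (the factor nativeSpeaker enters as an unscaled dummy variable)
rs_std <- readingSkills
num <- sapply(rs_std, is.numeric)
rs_std[num] <- lapply(rs_std[num], function(x) as.numeric(scale(x)))
beta_std <- coef(lm(score ~ ., data = rs_std))

print(c(marginal = r_marginal, partial = r_partial))
print(round(beta_std, 3))
```

Comparing r_marginal with r_partial shows how much of the association between score and shoeSize remains once age and nativeSpeaker are controlled for.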

Examples

library("party")
data("readingSkills")

set.seed(290875)
readingSkills.cf <- cforest(score ~ ., data = readingSkills,
    control = cforest_unbiased(mtry = 2, ntree = 50))

# standard (unconditional) permutation importance
varimp(readingSkills.cf)
# the same modulo random variation, using the pre-1.0-0 permutation scheme
varimp(readingSkills.cf, pre1.0_0 = TRUE)

# conditional importance, may take a while...
varimp(readingSkills.cf, conditional = TRUE)

party

A Laboratory for Recursive Partytioning

v1.3-10
GPL-2
Authors
Torsten Hothorn [aut, cre] (<https://orcid.org/0000-0001-8301-0471>), Kurt Hornik [aut], Carolin Strobl [aut], Achim Zeileis [aut] (<https://orcid.org/0000-0003-0918-3766>)
Initial release
2022-04-25
