Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

GSS7402

US General Social Survey 1974–2002


Description

Cross-section data for 9120 women taken from every fourth year of the US General Social Survey between 1974 and 2002 to investigate the determinants of fertility.

Usage

data("GSS7402")

Format

A data frame containing 9120 observations on 10 variables.

kids

Number of children. This is coded as a numerical variable but note that the value 8 actually encompasses 8 or more children.

age

Age of respondent.

education

Highest year of school completed.

year

GSS year for respondent.

siblings

Number of brothers and sisters.

agefirstbirth

Woman's age at birth of first child.

ethnicity

factor indicating ethnicity. Is the individual Caucasian ("cauc") or not ("other")?

city16

factor. Did the respondent live in a city (with population > 50,000) at age 16?

lowincome16

factor. Was the income below average at age 16?

immigrant

factor. Was the respondent (or both parents) born abroad?

Details

This subset of the US General Social Survey (GSS) for every fourth year between 1974 and 2002 has been selected by Winkelmann and Boes (2009) to investigate the determinants of fertility. To do so they typically restrict their empirical analysis to the women for which the completed fertility is (assumed to be) known, employing the common cutoff of 40 years. Both, the average number of children borne to a woman and the probability of being childless, are of interest.

Source

Online complements to Winkelmann and Boes (2009).

References

Winkelmann, R., and Boes, S. (2009). Analysis of Microdata, 2nd ed. Berlin and Heidelberg: Springer-Verlag.

See Also

Examples

## completed fertility subset
data("GSS7402", package = "AER")
gss40 <- subset(GSS7402, age >= 40)

## Chapter 1
## exploratory statistics
gss_kids <- prop.table(table(gss40$kids))
names(gss_kids)[9] <- "8+"

gss_zoo <- as.matrix(with(gss40, cbind(
  tapply(kids, year, mean),
  tapply(kids, year, function(x) mean(x <= 0)),
  tapply(education, year, mean))))
colnames(gss_zoo) <- c("Number of children",
  "Proportion childless", "Years of schooling")
gss_zoo <- zoo(gss_zoo, sort(unique(gss40$year)))

## visualizations instead of tables
barplot(gss_kids,
  xlab = "Number of children ever borne to women (age 40+)",
  ylab = "Relative frequencies")

library("lattice")
trellis.par.set(theme = canonical.theme(color = FALSE))
print(xyplot(gss_zoo[,3:1], type = "b", xlab = "Year"))


## Chapter 3, Example 3.14
## Table 3.1
gss40$nokids <- factor(gss40$kids <= 0, levels = c(FALSE, TRUE), labels = c("no", "yes"))
gss40$trend <- gss40$year - 1974
nokids_p1 <- glm(nokids ~ 1, data = gss40, family = binomial(link = "probit"))
nokids_p2 <- glm(nokids ~ trend, data = gss40, family = binomial(link = "probit"))
nokids_p3 <- glm(nokids ~ trend + education + ethnicity + siblings,
  data = gss40, family = binomial(link = "probit"))
lrtest(nokids_p1, nokids_p2, nokids_p3)


## Chapter 4, Figure 4.4
library("effects")
nokids_p3_ef <- effect("education", nokids_p3, xlevels = list(education = 0:20))
plot(nokids_p3_ef, rescale.axis = FALSE, ylim = c(0, 0.3))


## Chapter 8, Example 8.11
kids_pois <- glm(kids ~ education + trend + ethnicity + immigrant + lowincome16 + city16,
  data = gss40, family = poisson)
library("MASS")
kids_nb <- glm.nb(kids ~ education + trend + ethnicity + immigrant + lowincome16 + city16,
  data = gss40)
lrtest(kids_pois, kids_nb)


## More examples can be found in:
## help("WinkelmannBoes2009")

AER

Applied Econometrics with R

v1.2-10
GPL-2 | GPL-3
Authors
Christian Kleiber [aut] (<https://orcid.org/0000-0002-6781-4733>), Achim Zeileis [aut, cre] (<https://orcid.org/0000-0003-0918-3766>)
Initial release
2022-06-13

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.