Predict with a survival forest forest
Gets estimates of the conditional survival function S(t, x) using a trained survival forest (estimated using Kaplan-Meier).
## S3 method for class 'survival_forest' predict(object, newdata = NULL, failure.times = NULL, num.threads = NULL, ...)
object |
The trained forest. |
newdata |
Points at which predictions should be made. If NULL, makes out-of-bag predictions on the training set instead (i.e., provides predictions at Xi using only trees that did not use the i-th training example). Note that this matrix should have the number of columns as the training matrix, and that the columns must appear in the same order. |
failure.times |
A vector of failure times to make predictions at. If NULL, then the failure times used for training the forest is used. Default is NULL. |
num.threads |
Number of threads used in training. If set to NULL, the software automatically selects an appropriate amount. |
... |
Additional arguments (currently ignored). |
Vector of predictions.
# Train a standard survival forest.
n <- 2000
p <- 5
X <- matrix(rnorm(n * p), n, p)
failure.time <- exp(0.5 * X[, 1]) * rexp(n)
censor.time <- 2 * rexp(n)
Y <- pmin(failure.time, censor.time)
D <- as.integer(failure.time <= censor.time)
s.forest <- survival_forest(X, Y, D)
# Predict using the forest.
X.test <- matrix(0, 3, p)
X.test[, 1] <- seq(-2, 2, length.out = 3)
s.pred <- predict(s.forest, X.test)
# Plot the survival curve.
plot(NA, NA, xlab = "failure time", ylab = "survival function",
xlim = range(s.pred$failure.times),
ylim = c(0, 1))
for(i in 1:3) {
lines(s.pred$failure.times, s.pred$predictions[i,], col = i)
s.true = exp(-s.pred$failure.times / exp(0.5 * X.test[i, 1]))
lines(s.pred$failure.times, s.true, col = i, lty = 2)
}
# Predict on out-of-bag training samples.
s.pred <- predict(s.forest)
# Plot the survival curve for the first five individuals.
matplot(s.pred$failure.times, t(s.pred$predictions[1:5, ]),
xlab = "failure time", ylab = "survival function (OOB)",
type = "l", lty = 1)
# Train the forest on a less granular grid.
failure.summary <- summary(Y[D == 1])
events <- seq(failure.summary["Min."], failure.summary["Max."], by = 0.1)
s.forest.grid <- survival_forest(X, Y, D, failure.times = events)
s.pred.grid <- predict(s.forest.grid)
matpoints(s.pred.grid$failure.times, t(s.pred.grid$predictions[1:5, ]),
type = "l", lty = 2)Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.