Determine the Optimal Cutpoint for Continuous Variables
Determine the optimal cutpoint for one or more continuous variables at once, using the maximally selected rank statistics from the 'maxstat' R package. This is an outcome-oriented method that provides the cutpoint value corresponding to the most significant relation with the outcome (here, survival).
surv_cutpoint(): Determine the optimal cutpoint for each variable using 'maxstat'.
surv_categorize(): Divide the values of each variable based on the cutpoint returned by surv_cutpoint().
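To make the idea of an outcome-oriented cutpoint concrete, the sketch below scans candidate cutpoints of a simulated marker and keeps the one with the largest log-rank statistic. It is a simplified illustration only, not the 'maxstat' algorithm: 'maxstat' works with standardized rank statistics and can approximate a p-value that accounts for the search over cutpoints, and the sketch does neither. The helper scan_cutpoints() and the simulated data are hypothetical; only survival::survdiff() and survival::Surv() are real package functions.

```r
library(survival)

# Hypothetical helper (not part of survminer or maxstat): scan candidate
# cutpoints and keep the one with the largest log-rank chi-square statistic.
scan_cutpoints <- function(time, event, x, minprop = 0.1) {
  # Restrict candidates so that roughly at least `minprop` of the
  # observations fall on each side of the cutpoint.
  bounds <- quantile(x, probs = c(minprop, 1 - minprop), names = FALSE)
  candidates <- sort(unique(x[x >= bounds[1] & x < bounds[2]]))

  # Log-rank test comparing observations above vs. not above each candidate.
  stats <- vapply(candidates, function(cp) {
    group <- x > cp
    survdiff(Surv(time, event) ~ group)$chisq
  }, numeric(1))

  best <- which.max(stats)
  list(cutpoint = candidates[best], statistic = stats[best])
}

# Simulated data (assumption): the hazard is higher when the marker exceeds 0.
set.seed(1)
n <- 200
marker <- rnorm(n)
time <- rexp(n, rate = ifelse(marker > 0, 0.3, 0.1))
event <- rbinom(n, 1, 0.8)

res <- scan_cutpoints(time, event, marker)
res$cutpoint   # selected cutpoint
res$statistic  # log-rank chi-square at that cutpoint
```

Because the cutpoint is chosen to maximize the statistic, an unadjusted p-value from the final two-group comparison would be optimistic; this is the reason to rely on 'maxstat' rather than a naive scan like this one.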
surv_cutpoint(
  data,
  time = "time",
  event = "event",
  variables,
  minprop = 0.1,
  progressbar = TRUE
)

surv_categorize(x, variables = NULL, labels = c("low", "high"))

## S3 method for class 'surv_cutpoint'
summary(object, ...)

## S3 method for class 'surv_cutpoint'
print(x, ...)

## S3 method for class 'surv_cutpoint'
plot(x, variables = NULL, ggtheme = theme_classic(), bins = 30, ...)

## S3 method for class 'plot_surv_cutpoint'
print(x, ..., newpage = TRUE)
data |
a data frame containing survival information (time, event) and continuous variables (e.g.: gene expression data). |
time, event |
column names containing time and event data, respectively. Event values should be 0 or 1. |
variables |
a character vector containing the names of the variables of interest, for which we want to estimate the optimal cutpoint. |
minprop |
the minimal proportion of observations per group. |
progressbar |
logical value. If TRUE, show a progress bar. The progress bar is shown only when the number of variables > 5. |
x, object |
an object of class surv_cutpoint |
labels |
labels for the levels of the resulting category. |
... |
other arguments. For plots, see ?ggpubr::ggpar |
ggtheme |
function, ggplot2 theme name. Default value is theme_classic. Allowed values include ggplot2 official themes. See ?ggplot2::ggtheme. |
bins |
Number of bins for histogram. Defaults to 30. |
newpage |
open a new page. See |
surv_cutpoint(): returns an object of class 'surv_cutpoint', which is a list with the following components:
maxstat results for each variable (see ?maxstat::maxstat)
cutpoint: a data frame containing the optimal cutpoint of each variable. Rows are variable names and columns are c("cutpoint", "statistic").
data: a data frame containing the survival data and the original data for the specified variables.
minprop: the minimal proportion of observations per group.
not_numeric: contains data for non-numeric variables, in the context where the user provided categorical variable names in the argument variables.
Methods defined for surv_cutpoint objects are summary, print and plot.
surv_categorize(): returns an object of class 'surv_categorize', which is a data frame containing the survival data and the categorized variables.
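As a rough illustration of the categorization step, the base R sketch below splits a numeric vector at a given cutpoint and assigns the two labels. The helper categorize_by_cutpoint() is hypothetical and is not the survminer implementation; in particular, placing values equal to the cutpoint in the first ("low") group is an assumption made here.

```r
# Hypothetical helper, not the survminer implementation.
# Assumption: values strictly above the cutpoint receive the second label,
# and all other values (including ties with the cutpoint) receive the first.
categorize_by_cutpoint <- function(x, cutpoint, labels = c("low", "high")) {
  ifelse(x > cutpoint, labels[2], labels[1])
}

expr <- c(1.2, 3.4, 2.0, 5.1)
categorize_by_cutpoint(expr, cutpoint = 2.0)
# [1] "low"  "high" "low"  "high"
```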
Alboukadel Kassambara, alboukadel.kassambara@gmail.com
# 0. Load the package and some data
library("survminer")
data(myeloma)
head(myeloma)

# 1. Determine the optimal cutpoint of variables
res.cut <- surv_cutpoint(myeloma, time = "time", event = "event",
                         variables = c("DEPDC1", "WHSC1", "CRIM1"))
summary(res.cut)

# 2. Plot cutpoint for DEPDC1
# palette = "npg" (nature publishing group), see ?ggpubr::ggpar
plot(res.cut, "DEPDC1", palette = "npg")

# 3. Categorize variables
res.cat <- surv_categorize(res.cut)
head(res.cat)

# 4. Fit survival curves and visualize
library("survival")
fit <- survfit(Surv(time, event) ~ DEPDC1, data = res.cat)
ggsurvplot(fit, data = res.cat, risk.table = TRUE, conf.int = TRUE)