
h2o.train_segments

H2O Segmented-Data Bulk Model Training


Description

Provides a set of functions to train a group of models on different segments (subpopulations) of the training set.

Usage

h2o.train_segments(
  algorithm,
  segment_columns,
  segment_models_id,
  parallelism = 1,
  ...
)

Arguments

algorithm

Name of the algorithm to use for training the segment models (gbm, randomForest, kmeans, glm, deeplearning, naivebayes, psvm, xgboost, pca, svd, targetencoder, aggregator, word2vec, coxph, isolationforest, stackedensemble, glrm, gam).

segment_columns

A list of columns to segment-by. H2O will group the training (and validation) dataset by the segment-by columns and train a separate model for each segment (group of rows).
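To make "a separate model for each segment" concrete, here is a minimal conceptual sketch in base R. It is an analogy only, not how H2O implements segmentation: it splits the built-in iris data by Species and fits one linear model per group, with lm standing in for the H2O algorithm.

```r
# Conceptual analogy in base R only; NOT H2O's implementation.
# Split the data by the segment column: one data frame per Species value.
segments <- split(iris, iris$Species)

# Fit a separate model on each segment (lm is a stand-in for the H2O algorithm).
segment_fits <- lapply(segments, function(rows) {
  lm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data = rows)
})

# One fitted model per segment, named after the segment value.
names(segment_fits)
```

Each element of segment_fits was trained only on the rows of its own segment, which is the same idea h2o.train_segments applies to the training frame.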

segment_models_id

Identifier for the returned collection of Segment Models. If not specified it will be automatically generated.

parallelism

Level of parallelism of bulk model building: the maximum number of models each H2O node will build in parallel. Defaults to 1.
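As a hedged sketch of how parallelism and segment_models_id might be set together, the wrapper below uses only the arguments documented on this page. The function name train_iris_segments and the identifier "iris_species_models" are hypothetical, chosen for illustration. Calling the function assumes the h2o package is installed and that a cluster can be started with h2o.init().

```r
# Hypothetical helper; the name and the model-collection id are illustrative.
# Nothing from h2o is evaluated until the function is called.
train_iris_segments <- function(parallelism = 2) {
  h2o::h2o.init()
  iris_hf <- h2o::as.h2o(iris)
  h2o::h2o.train_segments(
    algorithm = "gbm",
    segment_columns = "Species",
    segment_models_id = "iris_species_models",  # hypothetical identifier
    parallelism = parallelism,  # max models built in parallel per H2O node
    x = 1:3, y = 4,
    training_frame = iris_hf,
    ntrees = 5,
    max_depth = 4
  )
}
```

With a running cluster, models <- train_iris_segments() followed by as.data.frame(models) would mirror the example in the Examples section.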

...

Used to pass the training_frame, x, and y parameters, along with any non-default parameter values, to the algorithm. See the documentation of the specific algorithm (h2o.gbm, h2o.glm, h2o.kmeans, h2o.deeplearning) for the available parameters.

Details

Starts segmented-data bulk model training for a given algorithm and set of parameters.

Examples

## Not run: 
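library(h2o)
h2o.init()

# Upload the built-in iris data to the H2O cluster
iris_hf <- as.h2o(iris)

# Train one GBM per Species value; columns 1-3 are predictors,
# column 4 (Petal.Width) is the response
models <- h2o.train_segments(algorithm = "gbm",
                             segment_columns = "Species",
                             x = 1:3, y = 4,
                             training_frame = iris_hf,
                             ntrees = 5,
                             max_depth = 4)

# View the collection of segment models as a data frame
as.data.frame(models)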

## End(Not run)

h2o

R Interface for the 'H2O' Scalable Machine Learning Platform

v3.32.1.2
Apache License (== 2.0)
Authors
Erin LeDell [aut, cre], Navdeep Gill [aut], Spencer Aiello [aut], Anqi Fu [aut], Arno Candel [aut], Cliff Click [aut], Tom Kraljevic [aut], Tomas Nykodym [aut], Patrick Aboyoun [aut], Michal Kurka [aut], Michal Malohlava [aut], Ludi Rehak [ctb], Eric Eckstrand [ctb], Brandon Hill [ctb], Sebastian Vidrio [ctb], Surekha Jadhawani [ctb], Amy Wang [ctb], Raymond Peck [ctb], Wendy Wong [ctb], Jan Gorecki [ctb], Matt Dowle [ctb], Yuan Tang [ctb], Lauren DiPerna [ctb], Tomas Fryda [ctb], H2O.ai [cph, fnd]
Initial release
2021-04-29
