h2o.tf_idf

Computes TF-IDF values for each word in given documents.


Description

Computes term frequency-inverse document frequency (TF-IDF) values for each word in the given documents.

Usage

h2o.tf_idf(
  frame,
  document_id_col,
  text_col,
  preprocess = TRUE,
  case_sensitive = TRUE
)

Arguments

frame

frame of documents, or of already tokenized words, for which TF-IDF values should be computed.

document_id_col

index or name of a column containing document IDs.

text_col

index or name of a column containing documents if 'preprocess = TRUE' or words if 'preprocess = FALSE'.

preprocess

whether the input text should be pre-processed, that is, split from documents into words, before TF-IDF values are computed. Defaults to 'TRUE'.

case_sensitive

whether input data should be treated as case sensitive. Defaults to 'TRUE'.
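
A minimal sketch of how the arguments above fit together, wrapped in a hypothetical helper `tf_idf_demo()` so that nothing runs until it is called. The column names `DocID`, `Document` and `Word` and the sample data are illustrative assumptions, not part of the package; the call needs the h2o package and a local H2O cluster that `h2o.init()` can start or reach.

```r
# Hypothetical helper: the name, data and column names are illustrative only.
tf_idf_demo <- function() {
  h2o::h2o.init()  # start or connect to a local H2O cluster

  # One document per row: use with preprocess = TRUE.
  docs <- h2o::as.h2o(data.frame(
    DocID = c(0, 1, 2),
    Document = c("A B C", "A a a Z", "C c B C"),
    stringsAsFactors = FALSE
  ))
  # The parser may read short strings as categorical, so force string type.
  docs$Document <- h2o::h2o.ascharacter(docs$Document)

  # One word per row: use with preprocess = FALSE.
  words <- h2o::as.h2o(data.frame(
    DocID = c(0, 0, 1),
    Word = c("A", "B", "A"),
    stringsAsFactors = FALSE
  ))
  words$Word <- h2o::h2o.ascharacter(words$Word)

  list(
    from_documents = h2o::h2o.tf_idf(docs, "DocID", "Document",
                                     preprocess = TRUE, case_sensitive = FALSE),
    from_words = h2o::h2o.tf_idf(words, "DocID", "Word", preprocess = FALSE)
  )
}
```

Calling `tf_idf_demo()` should return two frames, one built from whole documents and one from pre-tokenized words. Columns are passed by name here; an index works as well according to the argument descriptions.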

Value

resulting frame with TF-IDF values. Row format: documentID, word, TF, IDF, TF-IDF
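
To make the row format concrete, here is a base R sketch that builds the same five columns for a small in-memory corpus. It does not reproduce H2O's implementation: the whitespace tokenizer, the function name `tf_idf_sketch`, the column name `TF_IDF`, and in particular the smoothed IDF formula `log((N + 1) / (DF + 1))` are assumptions chosen for illustration.

```r
# Illustrative only: not H2O's code. The IDF formula below is an assumption.
tf_idf_sketch <- function(doc_ids, texts, case_sensitive = TRUE) {
  if (!case_sensitive) texts <- tolower(texts)
  tokens <- strsplit(texts, "\\s+")  # naive whitespace tokenizer

  # Long format: one row per (document, word occurrence).
  words <- data.frame(
    documentID = rep(doc_ids, lengths(tokens)),
    word = unlist(tokens),
    TF = 1,
    stringsAsFactors = FALSE
  )

  # TF: number of times each word occurs in each document.
  res <- aggregate(TF ~ documentID + word, data = words, FUN = sum)
  res$word <- as.character(res$word)

  # DF: number of documents containing the word. Each (document, word)
  # pair appears exactly once in res, so a plain count is enough.
  doc_freq <- table(res$word)
  n_docs <- length(unique(doc_ids))

  res$IDF <- log((n_docs + 1) / (as.numeric(doc_freq[res$word]) + 1))
  res$TF_IDF <- res$TF * res$IDF
  res[order(res$documentID, res$word), ]
}

result <- tf_idf_sketch(c(1, 2), c("a b a", "b c"))
```

Under this assumed formula, a word that occurs in every document (here `b`) gets an IDF of zero, so its TF-IDF is zero regardless of how often it appears.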


h2o

R Interface for the 'H2O' Scalable Machine Learning Platform

v3.32.1.2
Apache License (== 2.0)
Authors
Erin LeDell [aut, cre], Navdeep Gill [aut], Spencer Aiello [aut], Anqi Fu [aut], Arno Candel [aut], Cliff Click [aut], Tom Kraljevic [aut], Tomas Nykodym [aut], Patrick Aboyoun [aut], Michal Kurka [aut], Michal Malohlava [aut], Ludi Rehak [ctb], Eric Eckstrand [ctb], Brandon Hill [ctb], Sebastian Vidrio [ctb], Surekha Jadhawani [ctb], Amy Wang [ctb], Raymond Peck [ctb], Wendy Wong [ctb], Jan Gorecki [ctb], Matt Dowle [ctb], Yuan Tang [ctb], Lauren DiPerna [ctb], Tomas Fryda [ctb], H2O.ai [cph, fnd]
Initial release
2021-04-29
