
text_one_hot

One-hot encode a text into a list of word indexes in a vocabulary of size n.


Description

Encodes a text as a list of word indexes in a vocabulary of size n. Despite the name, the result is a list of integers, one per word, rather than a set of one-hot vectors.

Usage

text_one_hot(
  input_text,
  n,
  filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n",
  lower = TRUE,
  split = " ",
  text = NULL
)

Arguments

input_text

Input text (string).

n

Size of the vocabulary (integer).

filters

Sequence of characters to filter out such as punctuation. Default includes basic punctuation, tabs, and newlines.

lower

Whether to convert the input to lowercase.

split

Separator on which the text is split into words (string).

text

Kept for compatibility purposes. Use input_text instead.
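
To make the roles of filters, lower, and split concrete, here is a minimal base-R sketch of the preprocessing they describe. The helper toy_tokenize is hypothetical and is not part of keras. It assumes that each filtered character is replaced by the split marker before the text is split, which may differ in detail from what the package does internally.

```r
# Hypothetical helper, not part of keras: illustrates how the arguments
# shape the words that are later encoded.
toy_tokenize <- function(input_text,
                         filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n",
                         lower = TRUE,
                         split = " ") {
  if (lower) input_text <- tolower(input_text)
  # Assumption: every filtered character is replaced by the split marker.
  for (ch in strsplit(filters, "")[[1]]) {
    input_text <- gsub(ch, split, input_text, fixed = TRUE)
  }
  words <- strsplit(input_text, split, fixed = TRUE)[[1]]
  # Drop empty strings left behind by consecutive separators.
  words[nzchar(words)]
}

tokens_lower <- toy_tokenize("Hello, World!")
tokens_cased <- toy_tokenize("Hello, World!", lower = FALSE)
```

With the defaults, punctuation is removed and case is folded, so "Hello" and "hello" would end up as the same word. With lower = FALSE they remain distinct.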

Value

List of integers in [1, n]. Each integer encodes a word. The mapping is not guaranteed to be unique, so different words may share the same index.
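
The sketch below illustrates why uniqueness is not guaranteed when words are mapped into a fixed range of n indexes. The function toy_word_index and its hash (the sum of a word's code points, folded into 1..n) are hypothetical and chosen only for readability. They are not the hash keras uses, so the indexes will not match the output of text_one_hot.

```r
# Hypothetical toy hash, not the one keras uses: sum of Unicode code points,
# folded into the range 1..n.
toy_word_index <- function(words, n) {
  vapply(
    words,
    function(w) as.integer(sum(utf8ToInt(w)) %% n) + 1L,
    integer(1),
    USE.NAMES = FALSE
  )
}

idx <- toy_word_index(c("ab", "ba", "cat"), n = 10)
# "ab" and "ba" contain the same characters, so this toy hash assigns
# them the same index: a collision.
collided <- idx[1] == idx[2]
```

Any mapping from an open-ended set of words into n slots can collide like this. Choosing n well above the number of distinct words makes collisions less likely but does not rule them out.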

keras

R Interface to 'Keras'

v2.4.0
MIT + file LICENSE
Authors
Daniel Falbel [ctb, cph, cre], JJ Allaire [aut, cph], François Chollet [aut, cph], RStudio [ctb, cph, fnd], Google [ctb, cph, fnd], Yuan Tang [ctb, cph] (<https://orcid.org/0000-0001-5243-233X>), Wouter Van Der Bijl [ctb, cph], Martin Studer [ctb, cph], Sigrid Keydana [ctb]
Initial release