Tokenizes an orthographic transcription.
This function calls the G2P webservice to split a transcription into tokens (words). In addition to tokenization, G2P normalizes numbers and other special words. A call to this function is usually followed by a call to runBASwebservice_g2pForPronunciation. This function requires an internet connection.
runBASwebservice_g2pForTokenization( handle, transcriptionAttributeDefinitionName, language, orthoAttributeDefinitionName = "ORT", params = list(), patience = 0, resume = FALSE, verbose = TRUE )
handle |
emuDB handle |
transcriptionAttributeDefinitionName |
name of the attribute (not level!) containing an orthographic transcription. |
language |
language(s) to be used. If you pass a single string (e.g. "deu-DE"), this language will be used for all bundles. Alternatively, you can select the language for every bundle individually. To do so, you must pass a data frame with the columns session, bundle, language. This data frame must contain one row for every bundle in your emuDB. Up-to-date lists of the languages accepted by all webservices can be found here: https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help |
orthoAttributeDefinitionName |
attribute name for orthographic words |
params |
named list of parameters to be passed on to the webservice. It is your own responsibility to ensure that these parameters are compatible with the webservice API (see https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help). Some options accepted by the API (e.g. output format) cannot be set when calling a webservice from within emuR, and will be overridden. If file parameters are used please wrap the file path in |
patience |
If a web service call fails, it is repeated a further n times, with n being the value of patience. Must be set to a value between 0 and 3. |
resume |
If a previous call to this function has failed (and you think you have fixed the issue that caused the error), you can set resume=TRUE to recover any progress made up to that point. This will only work if your R temporary directory has not been deleted or emptied in the meantime. |
verbose |
Display progress bars and other information |
All necessary level, link and attribute definitions are created in the process.
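The per-bundle form of the language argument can be sketched as follows. This is a minimal illustration, not part of the package documentation: the session names, bundle names and the session-to-language mapping are hypothetical, and the bundle listing is built by hand here so the snippet runs without a database. With a loaded emuDB, the session and bundle names would instead come from the database itself, and the language codes should be checked against the webservice help page linked above.

```r
# Hypothetical bundle listing (assumption: three bundles in two sessions).
bundles <- data.frame(
  session = c("0000", "0000", "0001"),
  name    = c("rec_a", "rec_b", "rec_c"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping from session name to language code.
session_language <- c("0000" = "deu-DE", "0001" = "eng-GB")

# Build the data frame expected by the language argument:
# one row per bundle, with the columns session, bundle, language.
language_table <- data.frame(
  session  = bundles$session,
  bundle   = bundles$name,
  language = unname(session_language[bundles$session]),
  stringsAsFactors = FALSE
)

# Every bundle must have a language, and no bundle may be missing.
stopifnot(
  nrow(language_table) == nrow(bundles),
  !anyNA(language_table$language)
)
```

The resulting language_table can then be passed as the language argument in place of a single string.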
Other BAS webservice functions:
runBASwebservice_all(), runBASwebservice_chunker(), runBASwebservice_g2pForPronunciation(), runBASwebservice_maus(), runBASwebservice_minni(), runBASwebservice_pho2sylCanonical(), runBASwebservice_pho2sylSegmental()
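A typical call sequence might look like the sketch below. It is wrapped in a helper function so that nothing is sent to the webservice until the helper is called. The helper name tokenize_then_pronounce and the attribute name "transcription" are hypothetical; the handle is assumed to be an already loaded emuDB whose bundles carry an orthographic transcription in that attribute.

```r
# Hypothetical helper: tokenize, then derive pronunciations.
# Requires the emuR package and an internet connection when called.
tokenize_then_pronounce <- function(handle,
                                    transcription_attribute = "transcription",
                                    language = "deu-DE") {
  # Step 1: split the transcription into word tokens (written to "ORT").
  emuR::runBASwebservice_g2pForTokenization(
    handle = handle,
    transcriptionAttributeDefinitionName = transcription_attribute,
    language = language,
    orthoAttributeDefinitionName = "ORT",
    patience = 3,      # retry a failed call up to three more times
    resume = FALSE,
    verbose = TRUE
  )

  # Step 2: the usual follow-up call, working on the tokens from step 1.
  emuR::runBASwebservice_g2pForPronunciation(
    handle = handle,
    orthoAttributeDefinitionName = "ORT",
    language = language
  )

  invisible(handle)
}
```

If the first call fails partway through and the cause has been fixed, the same tokenization call can be repeated with resume = TRUE, provided the R temporary directory has not been emptied in the meantime.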