udpipe (0.2.1)

Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit.

https://github.com/bnosac/udpipe
http://cran.r-project.org/web/packages/udpipe

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at . The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at .

Maintainer: Jan Wijffels
Author(s): Jan Wijffels [aut, cre, cph], BNOSAC [cph], Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic [cph], Milan Straka [cph], Jana Strakov [cph]

License: MPL-2.0

Uses: data.table, Matrix, Rcpp, topicmodels, knitr

Released 6 days ago.