textTinyR (1.0.1)

Text Processing for Small or Big Data Files.


Processes large text files efficiently in batches. To this end, it provides functions for splitting, parsing, and tokenizing text and for creating a vocabulary. It also includes functions for building either a document-term matrix or a term-document matrix and for extracting information from them (term associations, most frequent terms). Finally, it includes functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and for working with sparse matrices. The source code is written in 'C++11' and exposed to R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
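
The workflow described above (read a file in batches, tokenize, build a vocabulary, then build a document-term matrix) can be illustrated with a minimal, language-agnostic sketch. It is written in plain Python and does not use or reproduce the textTinyR API. Every name in it (`read_batches`, `tokenize`, `build_vocabulary`, `document_term_matrix`) is hypothetical, and the tokenizer is an assumed, simplistic one that keeps only lowercase alphabetic runs.

```python
from collections import Counter
from itertools import islice
import io
import re


def read_batches(handle, batch_size):
    """Yield lists of at most batch_size lines, so the whole file is never in memory."""
    while True:
        batch = list(islice(handle, batch_size))
        if not batch:
            return
        yield batch


def tokenize(line):
    """Assumed tokenizer: lowercase the line and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", line.lower())


def build_vocabulary(handle, batch_size=2):
    """Count token frequencies one batch at a time."""
    vocab = Counter()
    for batch in read_batches(handle, batch_size):
        for line in batch:
            vocab.update(tokenize(line))
    return vocab


def document_term_matrix(docs, terms):
    """Dense count matrix: one row per document, one column per term."""
    index = {term: j for j, term in enumerate(terms)}
    rows = []
    for doc in docs:
        row = [0] * len(terms)
        for token in tokenize(doc):
            if token in index:
                row[index[token]] += 1
        rows.append(row)
    return rows


# A tiny in-memory "file" with one document per line (illustrative data only).
text = "the cat sat\nthe dog sat\nthe cat ran\n"
vocab = build_vocabulary(io.StringIO(text))
terms = sorted(vocab)
dtm = document_term_matrix(text.splitlines(), terms)
tdm = [list(col) for col in zip(*dtm)]  # term-document matrix is the transpose
```

Here `vocab.most_common(n)` gives the most frequent terms. A real implementation would store the matrix in a sparse format, since most entries are zero for realistic vocabularies.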

Maintainer: Lampros Mouselimis
Author(s): Lampros Mouselimis <mouselimislampros@gmail.com>

License: GPL-3

Uses: data.table, Matrix, R6, Rcpp, testthat, knitr, rmarkdown, covr

Released almost 3 years ago.