textreuse (0.1.0)

Detect Text Reuse and Document Similarity.

https://github.com/ropensci/textreuse
http://cran.r-project.org/web/packages/textreuse

Tools for measuring similarity among documents and detecting passages that have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity and dissimilarity functions; pairwise comparisons; minhash and locality-sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
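
To make the terms in the description concrete, the following is a minimal, language-agnostic sketch written in Python. It is not the package's R API: the function names (shingles, jaccard, minhash_signature, estimate_similarity), the choice of BLAKE2b as the hash, and the sample sentences are all assumptions made for illustration. It shows word-level shingled n-grams, the Jaccard similarity between two shingle sets, and a minhash signature whose agreement rate approximates that similarity.

```python
import hashlib


def shingles(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def _seeded_hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed (illustrative choice)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=200):
    """One minimum hash value per seeded hash function.

    Assumes tokens is non-empty; min() raises ValueError on an empty set.
    """
    return [min(_seeded_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical sample documents that differ by a single word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

set_a = shingles(doc_a)
set_b = shingles(doc_b)
sig_a = minhash_signature(set_a)
sig_b = minhash_signature(set_b)

exact = jaccard(set_a, set_b)                  # 4 shared shingles out of 10 distinct: 0.4
approx = estimate_similarity(sig_a, sig_b)     # noisy estimate of the same quantity

print(f"exact Jaccard: {exact:.2f}")
print(f"minhash estimate: {approx:.2f}")
```

The signatures have a fixed length regardless of document size, which is what makes minhash useful before a locality-sensitive hashing step: candidate pairs can be found by comparing short signatures rather than full shingle sets. A larger num_hashes tightens the estimate at the cost of more hashing work.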

Maintainer: Lincoln Mullen
Author(s): Lincoln Mullen [aut, cre]

License: MIT + file LICENSE

Uses: assertthat, digest, dplyr, NLP, Rcpp, RcppProgress, stringr, tidyr, testthat, knitr, rmarkdown
Reverse suggests: textrank

Released over 3 years ago.