textreuse (0.1.4)

Detect Text Reuse and Document Similarity.

https://github.com/ropensci/textreuse
http://cran.r-project.org/web/packages/textreuse

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
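
To make the description above more concrete, here is a minimal sketch of shingled word n-grams combined with a Jaccard similarity score. It is written in plain Python for illustration only and is not the package's own R API. The function names, the regular-expression tokenizer, and the sample sentences are all assumptions made for this example.

```python
import re


def shingle_ngrams(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Assumption for this sketch: words are runs of word characters,
    compared case-insensitively.
    """
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical sample documents that differ by a single word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

score = jaccard_similarity(shingle_ngrams(doc_a), shingle_ngrams(doc_b))
print(score)  # 4 shared shingles out of 10 distinct ones: 0.4
```

Changing one word alters every shingle that overlaps it, which is why a single substitution in a nine-word sentence already lowers the score to 0.4. Minhash and locality sensitive hashing, also named in the description, are ways of approximating this kind of comparison without scoring every pair of documents directly.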

Maintainer: Lincoln Mullen
Author(s): Lincoln Mullen [aut, cre]

License: MIT + file LICENSE

Uses: assertthat, digest, dplyr, NLP, Rcpp, RcppProgress, stringr, tidyr, testthat, knitr, rmarkdown, covr

Released 4 months ago.


4 previous versions

Related packages: corpora, gsubfn, kernlab, languageR, lsa, tm, wordnet, zipfR, RWeka, RKEA, openNLP, skmeans, tau, tm.plugin.mail, lda, textcat, topicmodels, tm.plugin.dc, textir, movMF (20 best matches, based on common tags).