thr3ads.net - R packages - [R] [R-pkgs] tm 0.1 uploaded to CRAN [Jan 2007]

Ingo Feinerer

2007-Jan-11 10:52 UTC

[R] [R-pkgs] tm 0.1 uploaded to CRAN

Dear useRs,

a first version of tm has just been released on CRAN.

tm provides a sophisticated framework for text mining applications
within R.

It offers functionality for managing text documents, abstracts the
process of document manipulation, and eases the use of heterogeneous
text formats in R. Advanced metadata management is implemented for
collections of text documents to ease the handling of large,
metadata-enriched document sets.

The package ships with native support for handling
   *) the Reuters 21578 dataset,
   *) the Reuters Corpus Volume 1 dataset,
   *) Gmane RSS feeds,
   *) e-mails, and
   *) several classic file formats (e.g. plain text or CSV text).

tm provides easy access to preprocessing and manipulation mechanisms, such as
   *) whitespace removal,
   *) stemming, or
   *) conversion between file formats (e.g., Reuters21578 to plain
   text).
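
To illustrate what a whitespace-removal step does, here is a minimal
sketch in base R. It does not call any tm function; the helper name
strip_whitespace and the sample documents are assumptions made up for
this example, and tm's own interface may look different.

```r
# Base-R sketch only; no tm functions are used here.
# Sample documents (invented for illustration).
docs <- c(doc1 = "  Text   mining \t in  R ",
          doc2 = "Crude\n\noil   prices")

# Hypothetical helper: collapse every run of whitespace into one space,
# then drop a leading or trailing space if one remains.
strip_whitespace <- function(x) {
  x <- gsub("[[:space:]]+", " ", x)
  gsub("^ | $", "", x)
}

clean <- strip_whitespace(docs)
print(clean)
```

The same pattern (a function applied to every document in a collection)
is how a transformation such as stemming could be slotted in.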

Furthermore, a generic filter architecture is available in order to
   *) filter documents by certain criteria, or
   *) perform full-text search.
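
The idea of a full-text filter can be sketched in base R as follows.
This is not tm's filter interface; the function search_fulltext and the
sample documents are hypothetical and only show the principle of
selecting documents by a predicate.

```r
# Base-R sketch only; no tm functions are used here.
# Sample documents (invented for illustration).
docs <- c(d1 = "Crude oil prices rose sharply",
          d2 = "Wheat exports fell",
          d3 = "OPEC discusses oil output")

# Hypothetical helper: keep the documents whose text contains the
# query term as a literal substring, ignoring case.
search_fulltext <- function(docs, term) {
  docs[grepl(tolower(term), tolower(docs), fixed = TRUE)]
}

hits <- search_fulltext(docs, "oil")
print(names(hits))
```

Any other criterion (author, date, length) could replace the grepl()
call, as long as it yields one TRUE or FALSE per document.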

The package supports the export of document collections to
term-document matrices, as frequently used in the text mining
literature. This allows the straightforward integration of existing
methods for classification, clustering, visualization, etc.
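
For readers unfamiliar with the term, a term-document matrix can be
built by hand in base R roughly like this. The sketch assumes simple
whitespace tokenization and invented sample documents; it does not use
tm's export function, which may differ in interface and output class.

```r
# Base-R sketch only; no tm functions are used here.
# Sample documents (invented for illustration).
docs <- c(d1 = "oil prices rise",
          d2 = "oil output falls",
          d3 = "wheat prices fall")

# Tokenize on whitespace after lower-casing.
tokens <- strsplit(tolower(docs), "[[:space:]]+")
names(tokens) <- names(docs)

# Vocabulary: every distinct term across all documents.
terms <- sort(unique(unlist(tokens)))

# Rows are terms, columns are documents, cells are term frequencies.
tdm <- vapply(tokens,
              function(tok) as.integer(table(factor(tok, levels = terms))),
              integer(length(terms)))
rownames(tdm) <- terms

print(tdm)
```

A plain matrix like this can be handed to existing base functions, for
example dist(t(tdm)) followed by hclust() for clustering documents.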

The package is designed in a modular way to enable easy integration of
new file formats, parsers, transformations and filter operations.

Best regards,

Ingo Feinerer

_______________________________________________
R-packages mailing list
R-packages at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-packages
