thr3ads.net - R help - [R] Regarding your distributed text mining with tm [Oct 2010]


Shivani Rao

2010-Oct-23 01:47 UTC

[R] Regarding your distributed text mining with tm

Hello,
I have already been using R for text mining. I now want to use R for
large-scale text processing and for experiments with topic modeling. I
started reading tutorials and working through some of them. Here is my
understanding of each of the tools:

1) R text mining toolbox (tm): meant for local (client-side) text
processing; it uses the XML library
2) Hive: Hadoop interactive; provides the framework to call map/reduce and
also provides the DFS interface for storing files on the DFS
3) RHIPE: R and Hadoop Integrated Processing Environment
4) Elastic MapReduce with R: a MapReduce framework for those who do not have
their own clusters
5) Distributed Text Mining with R: an attempt to make a seamless move from
local to server-side processing, from R-tm to R-distributed-tm

I have the following questions about the above packages:

1) Do Hive, RHIPE, and the distributed text mining toolbox all require you
to have your own cluster?

2) If I have just one computer, how would the DFS work in the case of Hive?

3) Are we facing a problem of duplicated effort across the above
packages?

I am hoping to get some insight into the above questions in the next few
days. A timely response would be very helpful.

Thanks and Regards,
Shivani

-- 
Research Scholar,
School of Electrical and Computer Engineering
Purdue University
West Lafayette IN
web.ics.purdue.edu/~sgrao <http://web.ics.purdue.edu/%7Esgrao>



