thr3ads.net - R help - [R] How to read plain text documents into a vector? [Oct 2009]

Richard Liu

2009-Oct-13 06:30 UTC

[R] How to read plain text documents into a vector?

I'm new to R.  I'm working with the text mining package tm.  I have several
plain text documents in a directory, and I would like to read all the files
with extension .txt in that directory into a vector, one text document per
vector element.  That is, v[1] would be the first document, v[2] the second,
etc.

I know how to read the documents into a tm Corpus, but that's not what I
want to do.  I would think that this kind of operation should be elementary
and the first step in any text mining.

Thanks,
Richard
-- 
View this message in context:
http://www.nabble.com/How-to-read-plain-text-documents-into-a-vector--tp25867792p25867792.html
Sent from the R help mailing list archive at Nabble.com.

Dieter Menne

2009-Oct-13 06:43 UTC

Richard Liu wrote:
> I'm new to R.  I'm working with the text mining package tm.  I have
> several plain text documents in a directory, and I would like to read all
> the files with extension .txt in that directory into a vector, one text
> document per vector element.  That is, v[1] would be the first document,
> v[2] the second, etc.
>
> I know how to read the documents into a tm Corpus, but that's not what I
> want to do.  I would think that this kind of operation should be
> elementary and the first step in any text mining.

Reading in a non-structured file is not that common in R, so tm provides
special methods. The vignette tm.pdf that ships with tm explains them on the
first page.

Dieter


Paul Hiemstra

2009-Oct-13 09:09 UTC

Richard Liu wrote:
> I'm new to R.  I'm working with the text mining package tm.  I have several
> plain text documents in a directory, and I would like to read all the files
> with extension .txt in that directory into a vector, one text document per
> vector element.  That is, v[1] would be the first document, v[2] the second,
> etc.
>
> I know how to read the documents into a tm Corpus, but that's not what I
> want to do.  I would think that this kind of operation should be elementary
> and the first step in any text mining.
>
> Thanks,
> Richard

Hi Richard,

Try something along these lines:

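# full.names = TRUE returns paths that can be opened from any working directory;
# pattern restricts the listing to files ending in .txt
file_list = list.files("/where/are/the/files", pattern = "\\.txt$", full.names = TRUE)
obj_list = lapply(file_list, FUN = yourfunction)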

yourfunction is probably either read.table or some read function from
the tm package, so obj_list will become a list of either data.frames or
tm objects.
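
For the character vector asked for in the original question (one whole document per element), a minimal sketch along the same lines could use readLines and paste from base R. The helper name read_txt_dir and the temporary demo directory are hypothetical and only there to make the example self-contained; they are not from this thread or from tm.

```r
# Hypothetical helper (not from the thread): read every .txt file in `dir`
# into a character vector, one whole document per element.
read_txt_dir <- function(dir) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  # readLines() returns one string per line; paste() joins the lines
  # back into a single string per file.
  docs <- vapply(
    files,
    function(f) paste(readLines(f, warn = FALSE), collapse = "\n"),
    character(1)
  )
  names(docs) <- basename(files)
  docs
}

# Self-contained demo: two small files in a temporary directory.
demo_dir <- file.path(tempdir(), "txt_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("first document", "second line"), file.path(demo_dir, "a.txt"))
writeLines("second document", file.path(demo_dir, "b.txt"))

v <- read_txt_dir(demo_dir)
length(v)  # number of documents read
v[1]       # text of the first document
```

Using vapply with character(1) rather than lapply gives a plain character vector instead of a list, so v[1], v[2], and so on index documents directly.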

cheers,
Paul

-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
