li li
2013-May-07 22:46 UTC
[R] how to read numeric vector as factors using read.table.ffdf
I have a big data set that includes character variables with many distinct values. I'm trying to use ff to read the data and then use biglm.big.matrix to build linear models. However, big.matrix converts all character vectors to factors, and the character labels are lost. I therefore decided to create a lookup table outside of R for my character columns and use numbers to represent the different levels in R. However, I do not know how to tell read.table.ffdf that these columns should be treated as factors instead of numerics. Please help. Thanks.
MacQueen, Don
2013-May-08 19:30 UTC
[R] how to read numeric vector as factors using read.table.ffdf
Please read the posting guide. Your question is far from clear.

First you're apparently unhappy because character vectors are being converted to factors. Then later you ask how to tell a function that some numeric data should be considered factors. Which is it that you want? A simple short example would help.

> help.search('read.table.ffdf')
No vignettes or demos or help files found with alias or concept or title
matching 'read.table.ffdf' using regular expression matching.

> help.search('biglm.big.matrix')
No vignettes or demos or help files found with alias or concept or title
matching 'biglm.big.matrix' using regular expression matching.

You need to provide information about where these apparent functions come from.

Regardless, if after using read.table.ffdf you have a data frame, you can convert any column, numeric or character, to a factor using the factor() function. You can also convert factors back to characters using, for example, format().

In other 'read' functions, such as read.table, when character data is converted to factors, the "labels" are not lost. Perhaps the documentation for read.table.ffdf describes how to prevent conversion to factor.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 5/7/13 3:46 PM, "li li" <zgscgh01 at gmail.com> wrote:

>I have a big data set that includes character variables of many different
>values. I'm trying to use ff to read the data and then use
>biglm.big.matrix
>to build linear models. However, since big.matrix will convert all
>character vectors to factors and the character labels will be lost. I
>decided to create a lookup table outside of R for my character columns and
>use numbers to represent different levels for R. However, I do not know
>how
>to tell read.table.ffdf these columns should be considered factors instead
>of numerics. Please help. thanks.
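To make the factor() suggestion concrete, here is a minimal base-R sketch, not a tested ff workflow. The column names, the inline CSV text, and the lookup table are all hypothetical stand-ins for the poster's data. It shows two routes: requesting a factor at read time through colClasses, and converting a numeric code column afterwards with factor() and a lookup table. The ff documentation should be checked for whether read.table.ffdf forwards colClasses to read.table in the same way; that is assumed here, not verified.

```r
# Hypothetical data: integer codes stand in for the original character labels.
csv_text <- "region_code,sales
1,10.5
2,7.25
1,3.0
3,8.0"

# Hypothetical lookup table maintained outside the data (code -> label).
lookup <- data.frame(code  = c(1, 2, 3),
                     label = c("north", "south", "west"),
                     stringsAsFactors = FALSE)

# Route 1: request a factor at read time; the levels are the codes as text.
dat <- read.csv(text = csv_text, colClasses = c("factor", "numeric"))

# Route 2: read the codes as numbers, then convert with factor(),
# attaching the labels from the lookup table so they are not lost.
dat2 <- read.csv(text = csv_text)
dat2$region_code <- factor(dat2$region_code,
                           levels = lookup$code,
                           labels = lookup$label)

# Converting the factor back to character recovers the labels.
region_labels <- as.character(dat2$region_code)
```

Route 2 keeps the labels inside R as factor levels, so a separate lookup step is only needed once, at conversion time. as.character() is used here for the reverse conversion; format(), mentioned in the reply above, is an alternative.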