Yeh, Richard C
2007-Jan-22 19:01 UTC
[R] Example function for bigglm (biglm) data input from file
This is a commented example function for use as the data argument to the bigglm function (biglm package), for when you want to read the data from a file (instead of a URL), or rescale or modify the data before fitting the model. I hope it may be of help to someone out there.

make.data <- function (filename, chunksize, ...) {
    conn <- NULL
    function (reset=FALSE) {
        if (reset) {
            if (!is.null(conn)) {
                close(conn)
            }
            # This is for a file.
            # For other methods, see: help("connections")
            # and replace the following definition of conn
            # (and possibly the read.table call).
            conn <<- file (description=filename, open="r")
        } else {
            # It's best that the file you use has no header
            # line, because when you use the connection to
            # read each excerpt, any header won't get re-read.
            # If you choose to skip the first line, then the
            # first line of each excerpt will be skipped.
            rval <- read.table (conn, nrows=chunksize,
                                skip=0, header=FALSE, ...)
            if (nrow(rval)==0) {
                # Then we have reached the end of the input.
                # Clean up:
                close(conn)
                conn <<- NULL
                rval <- NULL
            } else {
                # We did not reach the end of the input,
                # so this function will return data.
                # Here, you can define any derived fields
                # or put instructions to rescale input data
                # that you want done after the data are read
                # but before they are used for fitting.
                # For example:
                rval$rescaled_column <- rval$original_column / 1000000.0
                # If you don't want to do anything like this,
                # then delete this "else" clause, and make
                # the end of the function resemble the URL
                # example in bigglm.
            }
            return(rval)
        }
    }
}

a <- make.data (
    filename = "myfile",
    chunksize = 1000000,
    # In our definition of make.data, any remaining
    # arguments get passed to the read.table function by
    # the ... argument.
    # Define column types:
    colClasses = c("character", "character", "integer",
                   "numeric", "numeric"),
    # Define the column names in the call:
    # (recall that we cannot rely on the file header)
    col.names = c("fromState", "toState", "first",
                  "original_column", "second")
)

library(biglm)
bigglm (formula = toState ~ 1 + first + rescaled_column,
        data = a,
        family = binomial(link='logit'),
        weights = ~second)
summary(.Last.value)
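The calling pattern that such a data function has to support can be checked by hand before involving bigglm. The sketch below is an illustration and not part of the original post: make.reader, the temporary file, and its two columns x and y are hypothetical names. It assumes that the caller first invokes the function with reset=TRUE and then with reset=FALSE until NULL comes back. It also relies on col.names being supplied, so that read.table returns a zero-row data frame at the end of the input instead of raising an error.

```r
# Same closure pattern as make.data, minus the rescaling step.
# (make.reader is a hypothetical name used only for this sketch.)
make.reader <- function(filename, chunksize, ...) {
  conn <- NULL
  function(reset = FALSE) {
    if (reset) {
      # (Re)open the file so the next call returns the first chunk.
      if (!is.null(conn)) close(conn)
      conn <<- file(description = filename, open = "r")
      return(invisible(NULL))
    }
    # Read the next chunk from wherever the connection currently is.
    rval <- read.table(conn, nrows = chunksize, header = FALSE, ...)
    if (nrow(rval) == 0) {
      # End of input: clean up and signal completion with NULL.
      close(conn)
      conn <<- NULL
      return(NULL)
    }
    rval
  }
}

# Hypothetical test data: 25 rows, two columns, no header line.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:25, y = (1:25) * 1000000),
            file = tmp, row.names = FALSE, col.names = FALSE)

reader <- make.reader(tmp, chunksize = 10,
                      colClasses = c("integer", "numeric"),
                      col.names = c("x", "y"))

# Drive the reader by hand: reset once, then pull chunks until NULL.
reader(reset = TRUE)
chunk.sizes <- integer(0)
while (!is.null(chunk <- reader(reset = FALSE))) {
  chunk.sizes <- c(chunk.sizes, nrow(chunk))
}
print(chunk.sizes)
unlink(tmp)
```

If the chunk sizes add up to the number of rows in the file, the same function should be safe to pass as the data argument, where it will be reset and re-read on every pass over the data.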
stephenb
2010-May-25 21:00 UTC
[R] Example function for bigglm (biglm) data input from file
Richard, do you have an example for an ODBC connection?

Thank you,
Stephen