Hello.

readBin is designed to read a batch of data with the same spec, e.g. read 10000 floats into a vector. In practice I read into a data frame, not a vector. For each row of the data frame, I need to read an integer and a float:

    for (i in 1:1000) {
      dataframe$int[i] <- readBin(con, integer(), size=2)
      dataframe$float[i] <- readBin(con, numeric(), size=4)
    }

And I need to read 100 such data files, ending up with a for loop inside a for loop. Something feels wrong here, as it is said that if you use a double FOR you are not speaking R.

What is the R way of doing this? I can think of writing the content of the loop into a function and vectorizing it, but the result would be a list of lists, not exactly a data frame, and the list grows incrementally, which is inefficient since I know the size of my data frame at the outset.

I am a new learner, not speaking half of the R vocabulary; kindly provide some hints please :)

Best.
On 13-08-01 4:36 AM, Zhang Weiwu wrote:
> Hello. readBin is designed to read a batch of data with the same spec, e.g.
> read 10000 floats into a vector. In practise I read into data frame, not
> vector. For each data frame, I need to read a integer and a float.
>
> for (i in 1:1000) {
>   dataframe$int[i] <- readBin(con, integer(), size=2)
>   dataframe$float[i] <- readBin(con, numeric(), size=4)
> }
>
> And I need to read 100 such data files, ending up with a for loop in a for
> loop. Something feels wrong here, as it is being said if you use double-FOR
> you are not speaking R.
>
> What is the R way of doing this? I can think of writing the content of the
> loop into a function, and vectorize it -- But, the result would be a list of
> list, not exactly data-frame, and the list grows incrementally, which is
> inefficient, since I know the size of my data frame at the outset. I am a
> new learner, not speaking half of R vocabulary, kindly provide some hint
> please:)

I don't think there are any functions to do this directly. I'd probably use the loop (since the time to read 1000 entries would be small). If it took longer, what I might do is read the file as raw bytes, then read the integer and float vectors from subsets of the bytes. For example, the following untested code:

    # readBin reads n = 1 item by default, so request every byte:
    # 1000 records of 6 bytes (2-byte integer + 4-byte float) each
    rawvec <- readBin(con, "raw", n = 6 * 1000)
    n <- length(rawvec) / 6
    i <- 0:(n-1)

    # Bytes 1-2 of each record hold the integer.
    # Using sort here is inefficient, but I'm lazy...
    indices <- sort( c(6*i + 1, 6*i + 2) )
    rcon <- rawConnection(rawvec[indices])
    int <- readBin(rcon, "integer", n = n, size = 2)
    close(rcon)

    # Bytes 3-6 of each record hold the float.
    indices <- sort( c(6*i + 3, 6*i + 4, 6*i + 5, 6*i + 6) )
    rcon <- rawConnection(rawvec[indices])
    float <- readBin(rcon, "numeric", n = n, size = 4)
    close(rcon)

    dataframe <- data.frame(int=int, float=float)

The other way to do this is to read the data in a C function, using .Call or .C to get it into R.

Duncan Murdoch
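The raw-bytes idea above can be tried end to end with a small self-contained sketch. It is only an illustration under stated assumptions: the temporary file and the three sample records are made up, the records are assumed to be exactly 6 bytes (a 2-byte signed integer followed by a 4-byte float) in the platform's byte order, and a 6-row matrix is swapped in for the sort-based index vectors. It also passes the raw vector straight to readBin, which accepts one in place of a connection.

```r
# Build a small demo file of interleaved records: int16, then float32.
# (The temp file and the sample values are made up for illustration.)
path <- tempfile(fileext = ".bin")
ints <- c(1L, -2L, 300L)
floats <- c(0.5, 1.25, -8)   # values chosen to be exact as 4-byte floats
out <- file(path, "wb")
for (k in seq_along(ints)) {
  writeBin(ints[k], out, size = 2)
  writeBin(floats[k], out, size = 4)
}
close(out)

# Read the whole file as raw bytes in one call.
rawvec <- readBin(path, "raw", n = file.info(path)$size)
n <- length(rawvec) %/% 6    # 6 bytes per record

# One column per record: rows 1-2 hold the integer, rows 3-6 the float.
m <- matrix(rawvec, nrow = 6)

# as.vector() on a sub-matrix is column-major, so the bytes of each
# record stay next to each other and in their original order.
int   <- readBin(as.vector(m[1:2, , drop = FALSE]), "integer", n = n, size = 2)
float <- readBin(as.vector(m[3:6, , drop = FALSE]), "numeric", n = n, size = 4)

dataframe <- data.frame(int = int, float = float)
unlink(path)
```

If the files were written on a machine with a different byte order, the endian argument of readBin would need to be set accordingly.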
On 01.08.2013 10:36, Zhang Weiwu wrote:
> Hello. readBin is designed to read a batch of data with the same spec,
> e.g. read 10000 floats into a vector. In practise I read into data
> frame, not vector. For each data frame, I need to read a integer and a
> float.
>
> for (i in 1:1000) {
>   dataframe$int[i] <- readBin(con, integer(), size=2)
>   dataframe$float[i] <- readBin(con, numeric(), size=4)
> }

Ideally one would read bunches of identical types within R. This seems not to be possible here, hence I'd suggest reading it via some C code.

Best,
Uwe Ligges

>
> And I need to read 100 such data files, ending up with a for loop in a
> for loop. Something feels wrong here, as it is being said if you use
> double-FOR you are not speaking R.
>
> What is the R way of doing this? I can think of writing the content of
> the loop into a function, and vectorize it -- But, the result would be a
> list of list, not exactly data-frame, and the list grows incrementally,
> which is inefficient, since I know the size of my data frame at the
> outset. I am a new learner, not speaking half of R vocabulary, kindly
> provide some hint please:)
>
> Best.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
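For the "100 such data files" part of the question, one possible pure-R shape is sketched below. It is not taken from either reply: read_records is a hypothetical helper name, the two temporary files are made-up demo data, and each record is assumed to be a 2-byte signed integer followed by a 4-byte float in the platform's byte order. The helper returns a complete data frame per file, lapply stands in for the outer loop, and the pieces are bound together once at the end, so nothing grows row by row.

```r
# Hypothetical helper: read one file of 6-byte records
# (int16 followed by float32) into a data frame.
read_records <- function(path) {
  rawvec <- readBin(path, "raw", n = file.info(path)$size)
  m <- matrix(rawvec, nrow = 6)   # one column per record
  data.frame(
    int   = readBin(as.vector(m[1:2, , drop = FALSE]), "integer",
                    n = ncol(m), size = 2),
    float = readBin(as.vector(m[3:6, , drop = FALSE]), "numeric",
                    n = ncol(m), size = 4)
  )
}

# Demo data: two small temp files with three records each
# (made up for illustration).
paths <- replicate(2, tempfile(fileext = ".bin"))
for (p in paths) {
  out <- file(p, "wb")
  for (k in 1:3) {
    writeBin(as.integer(k), out, size = 2)
    writeBin(k / 2, out, size = 4)
  }
  close(out)
}

# lapply() replaces the outer for loop; each element is already a data
# frame, and rbind is called once rather than once per file.
frames <- lapply(paths, read_records)
all_data <- do.call(rbind, frames)
unlink(paths)
```

With real data, paths would instead come from something like list.files() on the directory holding the 100 files.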