Hi Jim,
this is exactly the answer I was looking for. Many thanks. I didn't know R had a pack function, as in Perl.
To answer your earlier question, I am trying to update legacy code to read a binary file of unknown size, over a network, slice it up into rows each containing an integer, an integer, a long, a short, a float and a float, and stuff the rows into a matrix.
Best regards,
Philippe

> On 17 Sep 2016, at 20:38, jim holtman <jholtman at gmail.com> wrote:
>
> Here is an example of how to do it:
>
> x <- 1:10  # integer values
> xf <- seq(1.0, 2, by = 0.1)  # floating point
>
> setwd("d:/temp")
>
> # create file to write to
> output <- file('integer.bin', 'wb')
> writeBin(x, output)  # write integer
> writeBin(xf, output)  # write reals
> close(output)
>
> library(pack)
> library(readr)
>
> # read all the data at once
> allbin <- read_file_raw('integer.bin')
>
> # decode the data into a list
> (result <- unpack("V V V V V V V V V V d d d d d d d d d d", allbin))
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Sat, Sep 17, 2016 at 11:04 AM, Ismail SEZEN <sezenismail at gmail.com> wrote:
> I noticed same issue but didn't care much :)
>
> On Sat, Sep 17, 2016, 18:01 jim holtman <jholtman at gmail.com> wrote:
> Your example was not reproducible. Also how do you "break" out of the
> "while" loop?
>
> On Sat, Sep 17, 2016 at 8:05 AM, Philippe de Rochambeau <phiroc at free.fr> wrote:
>
> > Hello,
> > the following function, which stores numeric values extracted from a
> > binary file, into an R matrix, is very slow, especially when the said file
> > is several MB in size.
> > Should I rewrite the function in inline C or in C/C++ using Rcpp? If the
> > latter case is true, how do you "readBin" in Rcpp (I'm a total Rcpp
> > newbie)?
> > Many thanks.
> > Best regards,
> > phiroc
> >
> > -------------
> >
> > # inputPath is something like http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
> >
> > PLTreader <- function(inputPath){
> >     URL <- file(inputPath, "rb")
> >     PLT <- matrix(nrow=0, ncol=6)
> >     compteurDePrints = 0
> >     compteurDeLignes <- 0
> >     maxiPrints = 5
> >     displayData <- FALSE
> >     while (TRUE) {
> >         periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >         eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >         dword1 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
> >         dword2 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
> >         if (dword1 < 0) {
> >             dword1 = dword1 + 2^32-1;
> >         }
> >         eventDate = (dword2*2^32 + dword1)/1000
> >         repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
> >         exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
> >         loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
> >         PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
> >     } # end while
> >     return(PLT)
> >     close(URL)
> > }
> >
> > ----------------
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
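
A side note on the dword1 correction in the quoted function: as far as I know, readBin() only honours signed=FALSE for sizes 1 and 2, so a 4-byte read comes back signed, and the usual two's-complement correction adds 2^32 rather than 2^32-1. A minimal sketch of that correction follows; to_unsigned32 is a hypothetical helper name, not something from the thread.

```r
# Hypothetical helper: map a signed 32-bit value onto its unsigned equivalent.
# The result is a double, because R integers cannot hold values above 2^31 - 1.
to_unsigned32 <- function(v) ifelse(v < 0, v + 2^32, v)

# Four 0xff bytes are the unsigned value 4294967295, but read back as -1.
raw_bytes <- as.raw(c(0xff, 0xff, 0xff, 0xff))
signed_value <- readBin(raw_bytes, integer(), size = 4, endian = "little")
unsigned_value <- to_unsigned32(signed_value)

print(signed_value)
print(format(unsigned_value, scientific = FALSE))
```

Non-negative values pass through unchanged, so the helper can be applied to a whole vector of dwords at once.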
The only difference between Jim's example (quoted in my previous message) and my program is that the former assumes that the file only contains one row of 10 ints + 10 floats, whereas my program doesn't know in advance how many rows the file contains, unless it downloads it first and computes the potential number of rows based on its size.
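
One way to deal with the unknown size, sketched under the assumption that the connection can simply be read until it runs dry: pull the bytes in chunks as raw and derive the row count afterwards. The helper name read_all_raw, the chunk size and the temporary test file are all made up for illustration; the 26-byte record size is my reading of the layout described in the thread (4 + 4 + 8 + 2 + 4 + 4 bytes).

```r
# Hypothetical helper: read every byte from an open binary connection.
read_all_raw <- function(con, chunk_size = 65536L) {
  chunks <- list()
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break   # zero-length result: nothing left to read
    chunks[[length(chunks) + 1L]] <- chunk
  }
  if (length(chunks) == 0L) raw(0) else do.call(c, chunks)
}

# Demo on a temporary local file holding three dummy 26-byte records.
record_size <- 26L
tmp <- tempfile(fileext = ".bin")
writeBin(as.raw(rep(0:25, times = 3)), tmp)

con <- file(tmp, "rb")
bytes <- read_all_raw(con, chunk_size = 16L)
close(con)

# The number of complete rows follows from the byte count.
n_rows <- length(bytes) %/% record_size
print(n_rows)
```

The same helper should work on a url() connection opened in "rb" mode, though I have only reasoned about the local-file case here.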
Please find below code that attempts to read ints, longs and floats from a
binary file (which is a simplification of my original program).
Please disregard the R inefficiencies, such as using rbind, for now.
I've also included Java code to generate the binary file.
The output shows that, at one point, anInt becomes undefined. Unfortunately, I
couldn't find the correct R function to determine whether anInt is undefined or
not, as is.null, is.nan, and is.infinite don't work.
Any help would be much appreciated.
Many thanks in advance.
Philippe
-------
[1] "anInt = 1"
[1] "is.null FALSE"
[1] "is.nan FALSE"
[1] "is.infinite FALSE"
[1] "aLong = 2"
[1] "aFloat = 3.44440007209778"
[1] "--------------------------"
[1] "anInt = 2"
[1] "is.null FALSE"
[1] "is.nan FALSE"
[1] "is.infinite FALSE"
[1] "aLong = 22"
[1] "aFloat = 13.4644002914429"
[1] "--------------------------"
[1] "anInt = 3"
[1] "is.null FALSE"
[1] "is.nan FALSE"
[1] "is.infinite FALSE"
[1] "aLong = 55"
[1] "aFloat = 45.4444007873535"
[1] "--------------------------"
[1] "anInt = "
[1] "is.null FALSE"
[1] "is.nan "
[1] "is.infinite "
[1] "aLong = "
[1] "aFloat = "
[1] "--------------------------"
     [,1]      [,2]      [,3]
[1,] 1         2         3.4444
[2,] 2         22        13.4644
[3,] 3         55        45.4444
[4,] Integer,0 Integer,0 Numeric,0
-----------
---------------------
readFile <- function(inputPath) {
    URL <- file(inputPath, "rb")
    PLT <- matrix(nrow=0, ncol=3)
    counte <- 0
    max <- 4
    while (counte < max) {
        anInt <- readBin(con=URL, what=integer(), size=4, n=1,
                         endian="big")
        print(paste("anInt =", anInt))
        #if (! (anInt == 0)) { print(paste("empty int")); break }
        print(paste("is.null ", is.null(anInt)))
        print(paste("is.nan ", is.nan(anInt)))
        print(paste("is.infinite ", is.infinite(anInt)))
        aLong <- readBin(URL, integer(), size=8, n=1, endian="big")
        print(paste("aLong =", aLong))
        aFloat <- readBin(URL, numeric(), size=4, n=1, endian="big")
        print(paste("aFloat =", aFloat))
        print("--------------------------")
        PLT <- rbind(PLT, list(anInt, aLong, aFloat))
        counte <- counte + 1
    } # end while
    close(URL)
    PLT
}
fichier <- "/Users/philippe/Desktop/datatests/data0.bin"
PLT2 <- readFile(fichier)
print(PLT2)
---------------------
import java.io.*;

public class Main {

    Main() {
        writeData();
    }

    public static void main(String[] args) {
        new Main();
    }

    public void writeData() {
        final String path = "/Users/philippe/Desktop/datatests/data0.bin";
        DataOutputStream dos;
        try {
            dos = new DataOutputStream(new BufferedOutputStream(new
                    FileOutputStream(path)));
            // big endian write! ("high byte first"), see
            // https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html
            dos.writeInt(1);
            dos.writeLong(2L);
            dos.writeFloat(3.4444F);
            dos.writeInt(2);
            dos.writeLong(22L);
            dos.writeFloat(13.4644F);
            dos.writeInt(3);
            dos.writeLong(55L);
            dos.writeFloat(45.4444F);
            dos.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
---------------------
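
A note on the "undefined" anInt in the output above: it looks consistent with readBin() returning a zero-length vector (integer(0)) once the connection is exhausted, which would explain why is.null() is FALSE while is.nan() and is.infinite() print nothing. A minimal sketch, assuming that is what is happening, tests length() instead. The record layout is simplified to one 4-byte int plus one 4-byte float, and read_records and the temporary file are hypothetical names used only for illustration.

```r
# Write three big-endian (int, float) records to a temporary file.
tmp <- tempfile(fileext = ".bin")
out <- file(tmp, "wb")
for (i in 1:3) {
  writeBin(i, out, size = 4, endian = "big")        # 4-byte int
  writeBin(i * 1.5, out, size = 4, endian = "big")  # 4-byte float
}
close(out)

# Hypothetical reader: stop as soon as readBin() returns a zero-length vector.
read_records <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  rows <- list()
  repeat {
    anInt <- readBin(con, integer(), size = 4, n = 1, endian = "big")
    if (length(anInt) == 0L) break   # integer(0): end of file reached
    aFloat <- readBin(con, numeric(), size = 4, n = 1, endian = "big")
    rows[[length(rows) + 1L]] <- c(anInt, aFloat)
  }
  do.call(rbind, rows)               # bind once at the end, not per row
}

m <- read_records(tmp)
print(m)
```

Collecting the rows in a list and binding once also avoids growing the matrix with rbind on every iteration.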
On Sun, 18 Sep 2016, 19:04 Philippe de Rochambeau <phiroc at free.fr> wrote:

> > On 17 Sep 2016, at 20:45, Philippe de Rochambeau <phiroc at free.fr> wrote:
> >
> > To answer your earlier question, I am trying to update legacy code to
> > read a binary file with unknown size, over a network, slice it up into rows
> > each containing an integer, an integer, a long, a short, a float and a
> > float, and stuff the rows into a matrix.

It's possible to read all rows fast as raw(), then parse in a vectorised way with matrix indexing to group the bytes appropriately. There is an example on the mailing list somewhere, but otherwise I can show an example if that's of interest.

Cheers, Mike

--
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia
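
For readers following along, the raw-plus-matrix-indexing idea can be sketched roughly as below. This is not Mike's code, only one possible reading of it, and it assumes the little-endian 26-byte layout from the original question (int, int, 8-byte non-negative long, short, float, float). The names make_record and parse_records and the test values are invented for illustration.

```r
# Hypothetical helper: build one 26-byte little-endian record for testing.
make_record <- function(period, event, millis, rep_num, exp_val, loss) {
  c(writeBin(as.integer(c(period, event)), raw(), size = 4, endian = "little"),
    as.raw((millis %/% 256^(0:7)) %% 256),   # 8-byte long, low byte first
    writeBin(as.integer(rep_num), raw(), size = 2, endian = "little"),
    writeBin(c(exp_val, loss), raw(), size = 4, endian = "little"))
}

# Hypothetical parser: one matrix column per record, one row per byte offset.
parse_records <- function(bytes, record_size = 26L) {
  n <- length(bytes) %/% record_size
  m <- matrix(bytes[seq_len(n * record_size)], nrow = record_size)
  # Pull the byte rows of one field from every record and decode them at once.
  field <- function(rows, what, size) {
    readBin(as.vector(m[rows, , drop = FALSE]), what, size = size, n = n,
            endian = "little")
  }
  # The long is rebuilt from its bytes as a double (exact below 2^53).
  long_bytes <- matrix(as.integer(m[9:16, , drop = FALSE]), nrow = 8)
  millis <- colSums(long_bytes * 256^(0:7))
  cbind(periodIndex = field(1:4, integer(), 4),
        eventId     = field(5:8, integer(), 4),
        eventDate   = millis / 1000,
        repNum      = field(17:18, integer(), 2),
        exp         = field(19:22, numeric(), 4),
        loss        = field(23:26, numeric(), 4))
}

bytes <- c(make_record(1, 10, 1474142400000, 3, 1.5, 2.25),
           make_record(2, 20, 5000000000, -4, 0.5, 100))
out <- parse_records(bytes)
print(out)
```

Each field is decoded with a single readBin() call over all records, so the cost no longer grows with a per-row loop or a per-row rbind.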