Leeds, Mark (IED)
2006-Nov-21 09:22 UTC
[R] Is there any way to know when a field is blank
I have many text files in the format below. In certain rare instances,
such as the one below, there can be nothing in one of the fields, so
a double comma is written. I won't know when this happens because I am
reading in many, many files sequentially.
# TEXT FILE
2004-02-10 00:01:31.00000,,105.60000000
2004-02-10 00:01:32.00001,,105.60000000
2004-02-10 00:01:45.00000,,105.60000000
2004-02-10 00:01:49.00000,,105.61000000
2004-02-10 00:02:08.00000,,105.60000000
2004-02-10 00:02:15.00000,,105.60000000
2004-02-10 00:02:23.00000,,105.60000000
2004-02-10 00:02:41.00000,,105.60000000
2004-02-10 00:03:09.00000,,105.59000000
2004-02-10 00:03:16.00000,,105.60000000
2004-02-10 00:03:19.00000,,105.59000000
2004-02-10 00:03:25.00000,,105.60000000
2004-02-10 00:03:39.00000,,105.59000000
2004-02-10 00:03:52.00000,,105.60000000
2004-02-10 00:03:54.00000,,105.60000000
# LINES OF CODE
fxdata<-read.zoo(file=fxfile,FUN=as.POSIXct,sep=",",col.names=c("date","bid","ask"))
fxdata<-fxdata[( fxdata[,"bid"] > 0.0 ) & ( fxdata[,"ask"] > 0.0 ),]
aggfxdata<-as.zoo(aggregatebyminutes(zooobj=fxdata,aggtimeframe=aggtimeframe))
#=========================================================================================
Even with the double comma being there, the fxdata<-read.zoo line and
the fxdata<-fxdata line still work, but then on
the aggfxdata<-as.zoo line, I get the error:
"Error in rep.int(seq(1:d[i]), prod(d[seq(length = i - 1)]) * rep.int(1, :
invalid number of copies in rep()"
This error is reasonable because the routine, aggregatebyminutes,
probably has a problem with nothing being in the bid field.
My question is whether there is some way that I can know that nothing
is in the bid field, so that I can skip this file altogether and go on to
the next one?
I'm not showing the details of the function because I'm not interested
in the error. I am only interested in knowing
that the "bid" field is empty.
I ask only because I am unsure how often this double comma/missing field
scenario can happen so it would
be better to automate the skipping of the file.
Thanks.
Leeds, Mark (IED) wrote:
> My question is if there is some way that I can know that nothing
> is in the bid field so that I can skip this file altogether and go onto
> the next one ?

You could avoid the as.zoo() part when dim(fxdata)[1] is equal to zero
with something like this:

library(zoo)
fxdata <- read.zoo(file=fxfile, FUN=as.POSIXct, sep=",",
                   col.names=c("date","bid","ask"))
# Keep only rows where both bid and ask are positive
fxdata <- fxdata[(fxdata[,"bid"] > 0.0) & (fxdata[,"ask"] > 0.0),]
if(dim(fxdata)[1] == 0) {
  # Nothing survived the filter, so skip the aggregation step
  cat("\n All missing! \n")
} else {
  aggfxdata <- as.zoo(aggregatebyminutes(zooobj=fxdata,
                                         aggtimeframe=aggtimeframe))
}

I'm not sure how to know without reading the file whether you have this
problem, but once you have read it and know that dim(fxdata)[1] == 0,
you can remove fxdata with rm().

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
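Another way to approach the original question (knowing that nothing is in the bid field before handing the data to read.zoo) might be to read each file once with base R and test whether the column is entirely NA. The following is only a minimal sketch under stated assumptions: field_is_blank() is a hypothetical helper, not part of zoo; the files are assumed to have the three header-less columns shown in the question; and the non-blank bid values in the demo are invented for illustration.

```r
# Hypothetical helper: TRUE when every value in `column` is missing.
# Assumes the header-less date,bid,ask layout shown in the question.
field_is_blank <- function(path, column = "bid") {
  raw <- read.csv(path, header = FALSE,
                  col.names = c("date", "bid", "ask"),
                  na.strings = c("", "NA"),   # treat empty fields as NA
                  strip.white = TRUE,
                  stringsAsFactors = FALSE)
  all(is.na(raw[[column]]))
}

# Two small temporary files for demonstration.
# The bid values in full_file are made up; only blank_file mirrors the post.
blank_file <- tempfile(fileext = ".csv")
full_file  <- tempfile(fileext = ".csv")
writeLines(c("2004-02-10 00:01:31.00000,,105.60000000",
             "2004-02-10 00:01:32.00001,,105.60000000"), blank_file)
writeLines(c("2004-02-10 00:01:31.00000,105.58000000,105.60000000",
             "2004-02-10 00:01:32.00001,105.59000000,105.60000000"), full_file)

for (fxfile in c(blank_file, full_file)) {
  if (field_is_blank(fxfile, "bid")) {
    cat("Skipping", basename(fxfile), ": bid field is empty\n")
    next
  }
  # read.zoo(), the bid/ask filter, and aggregatebyminutes() would go here
  cat("Processing", basename(fxfile), "\n")
}
```

This reads each file twice (once for the check, once in read.zoo), which is probably acceptable for small tick files. If that cost matters, the dim(fxdata)[1] == 0 test after filtering, as suggested in the reply, avoids the extra read.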