Luca Meyer
2014-Jan-02 06:17 UTC
[R] How to verify char variables contain at least one value
Happy new year fellows,

I am trying to do something I believe should be fairly straightforward but I cannot find my way out.

My dataset d2 is 26 rows by 245 columns, exclusively char variables. I would like to check whether at least one column from V13 to V239 (they are in numerical sequence) has been filled in, so I try

d2$check <- c(d2$V13:d2$V239)

and/or

d2$check <- paste(d2$V13:d2$V239,sep="")

but I get (translated from Italian):

Error in d2$V13:d2$V239 : argument NA/NaN

I have tried nchar but the same error occurs. I have also tried to run the above functions on a smaller variable subset (V13, V14, V15, see below for details), just to double check in case some variable was erroneously in another format, but the same error occurs.

> d2$V13
 [1] "" "" "" "" "" "" "da -5.1% a -10%" ""
 [9] "" "" "" "" "" "" "" ""
[17] "" "" "" "" "" "" "" ""
[25] "" ""
> d2$V14
 [1] "" "" "" "" "" "" "da -10.1% a -15%" ""
 [9] "" "" "" "" "" "" "" ""
[17] "" "" "" "" "" "" "" ""
[25] "" ""
> d2$V15
 [1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""

Can anyone suggest an alternative function for me to create a variable that checks whether there is at least one value for each of the 26 records I need to analyze?

Thank you in advance,

Luca
Jim Lemon
2014-Jan-02 07:28 UTC
[R] How to verify char variables contain at least one value
On 01/02/2014 05:17 PM, Luca Meyer wrote:
> Can anyone suggest an alternative function for me to create a variable that
> checks whether there is at least one value for each of the 26 records I
> need to analyze?

Hi Luca,
Perhaps you are looking for something like this:

d2check<-unlist(apply(as.matrix(d2[,paste("V",13:239,sep="")]),1,nchar))
# to test for any non empty rows
any(d2check)

Jim
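A per-record variant of the nchar() idea above might look like the sketch below. It is only an illustration: the three-row d2 and the shortened column range V13 to V15 are made-up stand-ins for the real 26 x 245 data, and the result is stored in d2$check as in the original question.

```r
# Toy stand-in for d2 (hypothetical data): 3 rows, columns V13..V15 only
d2 <- data.frame(V13 = c("", "da -5.1% a -10%", ""),
                 V14 = c("", "da -10.1% a -15%", ""),
                 V15 = c("", "", "x"),
                 stringsAsFactors = FALSE)
cols <- paste("V", 13:15, sep = "")   # would be 13:239 on the real data

# nchar() on a character matrix gives the string length of every cell
lens <- nchar(as.matrix(d2[, cols]))

# One logical per row: TRUE if at least one cell in that row is non-empty
d2$check <- rowSums(lens > 0) > 0
d2$check   # row 1: FALSE, row 2: TRUE, row 3: TRUE
```

Collapsing with rowSums() keeps one answer per record, whereas a single any() over all lengths only says whether anything at all was filled in.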
Gerrit Eichner
2014-Jan-02 07:37 UTC
[R] How to verify char variables contain at least one value
Hello, Luca,

also a happy new year!

It's not quite clear to me what you want to do, but note first that the ":"-operator is a short-cut for seq() with by = 1 (look at ?seq), and that it usually (!) does not work on columns of data frames. Exception: when used for the argument select of function subset().

Second, you seem to want to check in each row of d2 if there is any entry different from "", right? So, does

> apply( subset( d2, select = V13:V239), 1, function( x) any( x != ""))

what you want?

Hth -- Gerrit

On Thu, 2 Jan 2014, Luca Meyer wrote:

> Can anyone suggest an alternative function for me to create a variable that
> checks whether there is at least one value for each of the 26 records I
> need to analyze?
>
> Thank you in advance,
>
> Luca
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
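To make the point about ":" concrete, here is a small sketch. It assumes a made-up four-column d2 (V12 to V15) in place of the real data, and it uses the select argument of subset(), which is the argument that accepts a range of column names.

```r
# Toy stand-in for d2 (hypothetical data), with one column outside the range
d2 <- data.frame(V12 = c("a", "b", "c"),
                 V13 = c("", "da -5.1% a -10%", ""),
                 V14 = c("", "da -10.1% a -15%", ""),
                 V15 = c("", "", ""),
                 stringsAsFactors = FALSE)

# ":" builds a numeric sequence from two numbers ...
3:6

# ... so two character columns cannot serve as its endpoints:
# d2$V13:d2$V15   # this is the call that stops with the NA/NaN argument error

# Inside select=, column names stand for their positions, so V13:V15 works
block <- subset(d2, select = V13:V15)

# Row-wise test: is any entry different from ""?
d2$check <- apply(block, 1, function(x) any(x != ""))
d2$check   # row 1: FALSE, row 2: TRUE, row 3: FALSE
```

V12 is non-empty in every row but is left out by select, so it does not affect the result.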
Hi,
If I understand correctly, you could also try:

set.seed(48)
d2 <- as.data.frame(matrix(sample(c("",letters[1:2]),26*245,replace=TRUE),26,245))
d2[3,] <- ""
names1 <- paste0("V",13:239)
res <- d2[rowSums(d2[,names1]=="") < ncol(d2[,names1]),names1]

A.K.

On Thursday, January 2, 2014 1:20 AM, Luca Meyer <lucam1968 at gmail.com> wrote:

Can anyone suggest an alternative function for me to create a variable that checks whether there is at least one value for each of the 26 records I need to analyze?

Thank you in advance,
Luca
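The rowSums() idea in the reply above filters out the empty records. If the goal is instead a flag column, as in the original question, a sketch along the same lines could be the following. The simulated d2 is the same stand-in as in the reply; stringsAsFactors = FALSE is an added assumption so that the columns are character, as described for the real data.

```r
set.seed(48)
# Simulated stand-in for d2; stringsAsFactors = FALSE is an assumption here
d2 <- as.data.frame(matrix(sample(c("", letters[1:2]), 26 * 245, replace = TRUE),
                           26, 245),
                    stringsAsFactors = FALSE)
d2[3, ] <- ""                       # force one completely empty record
names1 <- paste0("V", 13:239)

# Count the non-empty cells per row, then flag rows that have at least one
d2$check <- rowSums(d2[, names1] != "") > 0

which(!d2$check)                    # includes row 3 by construction
```

Keeping all 26 rows and adding d2$check leaves the choice of what to do with empty records for a later step.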