Hello,

I have a vector which looks like

x$ART
...
[35415] 00              01-1;02-1;05-1;
[35417] 01-1;           01-1;02-1;
[35419] 01-1;           00
[35421] 01-1;04-1;      05-1;
[35423] 02-1;           01-1;02-1;
[35425] 01-1;02-1;      <NA>
[35427] 01-1;           <NA>
...

I received the vector in this format. To explain it: there are several categories (00, 01, 02, etc.), each followed by its count (the value after the "-"). I need to split each value and create a new data frame column (or vector) for each category, with the count placed in the corresponding cell. I know that this vector has 7 categories (00-06) plus NA values, but not every case (row) contains all the categories, as you can see. How can I do such a split?

In the end I should get x$ART_00, x$ART_01, x$ART_03, ... with their values. Where the input is <NA>, all the categories should be <NA> as well.

Maybe someone can help.

Thank you,

Best regards
Johannes
Hi, Johannes,

maybe

X <- unlist( strsplit( as.character( x$ART), split = ";", fixed = TRUE))
X <- strsplit( X, split = "-", fixed = TRUE)

X <- sapply( X, function( x)
                 if( length(x) == 2)
                   rep( x[1], as.numeric( x[2])) else x[1]
            )

table(X, useNA = "always")

comes close to what you want.

Hth -- Gerrit

On Thu, 19 Jan 2012, Johannes Radinger wrote:

> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
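A side note on the tabulation suggested above: when the counts differ from 1, sapply() returns a list rather than a flat vector, and table() may then not behave as intended. The following is a minimal sketch of the same idea using unlist(lapply(...)), which always flattens. The sample vector `art` is made up for illustration and is not the original data. Like the suggestion it is based on, it gives totals over the whole vector, not one column per category and row.

```r
# Hypothetical sample data in the format described in the question
art <- c("00", "01-1;02-3;04-1;", "01-2;02-1;", NA)

# Split each entry at ";" and flatten to one "category-count" piece per element
pieces <- unlist(strsplit(art, split = ";", fixed = TRUE))

# Split each piece at "-" into category and (optional) count
parts <- strsplit(pieces, split = "-", fixed = TRUE)

# Repeat each category by its count; a piece without a count is kept once.
# unlist(lapply(...)) yields a flat vector even when the counts differ.
cats <- unlist(lapply(parts, function(p) {
  if (length(p) == 2) rep(p[1], as.numeric(p[2])) else p[1]
}))

# Overall totals per category, including missing values
totals <- table(cats, useNA = "always")
print(totals)
```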
Hello all,

I think I am now on the way to splitting the vector the way I want, using for loops. I have reached a point where I am stuck, so maybe someone can help me out.

Remember, the result I am looking for should look like this (for the input vector I want to split, see var3 below):

  var1 var2 var3_00 var3_01 var3_02 var3_04
     1    A       1       0       0       0
     2    B       0       1       3       1
     3    C       0       2       1       0
     4    D       0       0       0      12
     5    E      NA      NA      NA      NA

The input and my approach so far follow. It is probably not the most elegant solution, but I think it will get me where I want. I am very open to your improvements:

var1 <- seq(1,5)
var2 <- c("A","B","C","D","E")
var3 <- c("00","01-1;02-3;04-1","01-2;02-1","01-0;04-2",NA)
x <- data.frame(var1,var2,var3)

#create new columns and prefill with 0
x$var3_01 <- 0
x$var3_02 <- 0
x$var3_03 <- 0
x$var3_04 <- 0

a <- strsplit(as.character(x$var3), split = ";", fixed = TRUE)

for (i in 1:length(a)){
  A <- length(a[[i]])
  for (j in 1:A){
    column <- (unlist(strsplit((a[[i]][j]), split="-",fixed=TRUE))[1])
    if(column!="00"){
      value <- (unlist(strsplit((a[[i]][j]), split="-",fixed=TRUE))[2])
      print(column)
      print(value)
      if(is.na(column)) {
        x$var3_01[i] <- NA
        x$var3_02[i] <- NA
        x$var3_03[i] <- NA
        x$var3_04[i] <- NA
      }
      else if(column %in% c("01","02","03","04")) {
        #print(paste("x$var3_",column,sep=""))
        (paste("x$var3_",column,sep=""))[i]<- as.numeric(value)
      }
      else print("Problem with category")
    }
  }
}

I think there is a problem with (paste("x$var3_",column,sep=""))[i], which is not recognized correctly because it is interpreted as a string.

Thank you...
Best regards,
/johannes

-------- Original Message --------
> Date: Thu, 19 Jan 2012 13:42:24 +0100 (MET)
> From: Gerrit Eichner <Gerrit.Eichner at math.uni-giessen.de>
> To: Johannes Radinger <JRadinger at gmx.at>
> CC: R-help at r-project.org
> Subject: Re: [R] Split values in vector
>
> [...]
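On the assignment problem raised above: paste() only builds a character string, so the result cannot be assigned to as if it were the column. A column can, however, be addressed by its name as an index. The snippet below is a minimal sketch of that idea; the small data frame and the values of `column`, `value` and `i` are made up for illustration.

```r
# Small hypothetical data frame with pre-filled count columns
x <- data.frame(var1 = 1:3, var3_01 = 0, var3_02 = 0)

column <- "02"   # category code parsed from a string such as "02-3"
value  <- "3"    # count parsed from the same string
i      <- 2      # row being processed

# paste() only builds the character string "var3_02"; it is not the column.
# The string can be used as a column index to reach the column itself:
col_name <- paste("var3_", column, sep = "")
x[i, col_name] <- as.numeric(value)

# Equivalent form using [[ ]] with the column name
x[[col_name]][i] <- as.numeric(value)

print(x)
```

Both forms change only row `i` of the named column and leave the other columns untouched.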
Hello again,

Now I have managed to do everything correctly. The code now looks like:

var1 <- seq(1,5)
var2 <- c("A","B","C","D","E")
var3 <- c("00","01-1;02-3;04-1","01-2;02-1","01-0;04-2",NA)
x <- data.frame(var1,var2,var3)

#create new columns and prefill with 0
x$var3_01 <- 0
x$var3_02 <- 0
x$var3_03 <- 0
x$var3_04 <- 0

#split each entry at ";" into its "category-count" pieces
a <- strsplit(as.character(x$var3), split = ";", fixed = TRUE)

for (i in 1:length(a)){
  A <- length(a[[i]])
  for (j in 1:A){
    #category code: the part before the "-"
    column <- (unlist(strsplit((a[[i]][j]), split="-",fixed=TRUE))[1])
    #"00" rows keep the prefilled zeros; NA entries are handled below
    if(column!="00"|is.na(column)){
      #count: the part after the "-"
      value <- (unlist(strsplit((a[[i]][j]), split="-",fixed=TRUE))[2])
      if(is.na(column)) {
        x$var3_01[i] <- NA
        x$var3_02[i] <- NA
        x$var3_03[i] <- NA
        x$var3_04[i] <- NA
      }
      else if(column %in% c("01","02","03","04")) {
        #address the target column by its name
        x[i,paste("var3_",column,sep="")]<- as.numeric(value)
      }
      else print("Problem with category")
    }
  }
}
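For comparison, the same split can be written without explicit row indexing. This is only a sketch under the assumptions of the example data in this thread (categories "01" to "04", "00" meaning no counts, NA propagating to every category); the helper name `parse_counts` is hypothetical and not part of any package. As in the loop version, no var3_00 column is created.

```r
# Hypothetical helper: turn one "cat-count;cat-count" string into a
# named numeric vector with one entry per category in `cats`
parse_counts <- function(s, cats) {
  out <- setNames(rep(0, length(cats)), cats)
  if (is.na(s)) {
    out[] <- NA            # missing input: every category is NA
    return(out)
  }
  for (piece in strsplit(s, ";", fixed = TRUE)[[1]]) {
    kv <- strsplit(piece, "-", fixed = TRUE)[[1]]
    # pieces without a count (e.g. "00") or with an unknown category are skipped
    if (length(kv) == 2 && kv[1] %in% cats) out[kv[1]] <- as.numeric(kv[2])
  }
  out
}

var3 <- c("00", "01-1;02-3;04-1", "01-2;02-1", "01-0;04-2", NA)
cats <- c("01", "02", "03", "04")

# One row per input string, one column per category
counts <- t(sapply(var3, parse_counts, cats = cats, USE.NAMES = FALSE))
colnames(counts) <- paste("var3_", cats, sep = "")

x <- data.frame(var1 = 1:5, var2 = c("A", "B", "C", "D", "E"), var3, counts)
print(x)
```

Keeping the parsing in one small function makes it easy to test on single strings before applying it to the full vector.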