Hi,

A column of my df looks like

A
10/20/30
40/20
60/10/10/5
80/10

I want to split it such that the last column holds the last composition and, if there are not enough values, the middle columns get 0s. That way my df would look like

A1 A2 A3 A4
10 20  0 30
40  0  0 20
60 10 10  5
80  0  0 10

How can I do that?
Hello,

Try the following.

A <- c(
"10/20/30",
"40/20",
"60/10/10/5",
"80/10")

fun <- function(X){
    xname <- deparse(substitute(X))
    s <- strsplit(X, "/")
    n <- max(sapply(s, length))
    tmp <- numeric(n)

    f <- function(x){
        x <- as.numeric(x)
        m <- length(x)
        tmp[n] <- x[m]
        tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
        tmp
    }

    res <- do.call(rbind, lapply(s, f))
    colnames(res) <- paste0(xname, 1:ncol(res))
    data.frame(res)
}
fun(A)

Hope this helps,

Rui Barradas

On 31-08-2012 17:10, Sapana Lohani wrote:
> Hi,
>
> A column of my df looks like
>
> A
> 10/20/30
> 40/20
> 60/10/10/5
> 80/10
>
> I want to split it such that the last column has the last composition and if there are not enough the middle columns get the 0s. That way my df would look like
>
> A1 A2 A3 A4
> 10 20  0 30
> 40  0  0 20
> 60 10 10  5
> 80  0  0 10
>
> How can I do that ??
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hello,

It means you should update your version of R. paste0 was introduced in R 2.15.0 and is the shorter equivalent of paste(..., sep = "").

I'm sending you a small correction, just in case some of the elements of A contain only 1 value.

fun <- function(X){
    xname <- deparse(substitute(X))
    s <- strsplit(X, "/")
    n <- max(sapply(s, length))
    tmp <- numeric(n)

    f <- function(x){
        x <- as.numeric(x)
        m <- length(x)
        if(m > 1){
            tmp[n] <- x[m]
            tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
        }else tmp[1] <- x
        tmp
    }

    res <- do.call(rbind, lapply(s, f))
    # one name per column; seq_along(s) would give one name per row
    colnames(res) <- paste(xname, seq_len(ncol(res)), sep = "")
    data.frame(res)
}

Rui Barradas

On 31-08-2012 21:58, Sapana Lohani wrote:
> Hi Rui,
>
> I am getting an error message from the code you sent. It says "Error in fun(A) : could not find function "paste0"
>
> I am new to R so I could not fix it. Can you help me fix that?
>
> Thanks
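For reference, the padding rule discussed in this exchange (leading values stay on the left, the last value moves to the last column, zeros fill the gap) can be sketched without paste0, so it should also run on R versions older than 2.15.0. The name split_last and its sep and prefix arguments are hypothetical and not from the thread, and the sketch assumes every element contains at least one value. Unlike the corrected function above, a lone value lands in the last column here. Column names are generated from the number of columns, so the result does not depend on how many rows the input has.

```r
# Hypothetical helper, not from the thread: split "a/b/c" strings so that the
# last piece always lands in the last column and the gap is filled with 0s.
split_last <- function(x, sep = "/", prefix = "A") {
  parts <- strsplit(x, sep, fixed = TRUE)
  n <- max(sapply(parts, length))          # widest row decides the column count

  pad <- function(p) {
    p <- as.numeric(p)
    m <- length(p)
    out <- numeric(n)                      # start from all zeros
    out[seq_len(m - 1)] <- p[seq_len(m - 1)]  # leading pieces stay on the left
    out[n] <- p[m]                         # last piece goes to the last column
    out
  }

  res <- do.call(rbind, lapply(parts, pad))
  # name the columns by counting columns, not input elements
  colnames(res) <- paste(prefix, seq_len(ncol(res)), sep = "")
  data.frame(res)
}

A <- c("10/20/30", "40/20", "60/10/10/5", "80/10")
out <- split_last(A)
print(out)

# five input rows still give four columns named A1..A4
out5 <- split_last(c(A, "70/30"))
print(dim(out5))
```

Passing the prefix explicitly is a design choice of this sketch; the function in the thread derives it from the argument name with deparse(substitute(X)) instead.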
Hi,

Try this:

dat1<-read.table(text="
10/20/30
40/20
60/10/10/5
80/10
",sep="",header=FALSE,stringsAsFactors=FALSE)
# innermost gsub handles 4 parts, then 3 parts, then 2 parts;
# once a string has been rewritten it has no "/" left, so later patterns skip it
dat2<-gsub("(.*)/(.*)","\\1 0 0 \\2",
      gsub("(.*)/(.*)/(.*)","\\1 \\2 0 \\3",
      gsub("(.*)/(.*)/(.*)/(.*)","\\1 \\2 \\3 \\4",dat1$V1)))
dat3<-data.frame(do.call(rbind,strsplit(dat2, " ")))
colnames(dat3)<-paste0("A",1:4)
dat3
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10

A.K.

----- Original Message -----
From: Sapana Lohani <lohani.sapana at ymail.com>
To: R help <r-help at r-project.org>
Sent: Friday, August 31, 2012 12:10 PM
Subject: [R] splits with 0s in middle columns
Hi,

You can also use ifelse:

dat1<-read.table(text="
10/20/30
40/20
60/10/10/5
80/10
",sep="",header=FALSE,stringsAsFactors=FALSE)
dat2<-ifelse(nchar(gsub("[^/]","",dat1$V1))==1,
        gsub("(.*)/(.*)","\\1 0 0 \\2",dat1$V1),
      ifelse(nchar(gsub("[^/]","",dat1$V1))==2,
        gsub("(.*)/(.*)/(.*)","\\1 \\2 0 \\3",dat1$V1),
        gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4",dat1$V1)))
dat3<-data.frame(do.call(rbind,strsplit(dat2, " ")))
colnames(dat3)<-paste0("A",1:4)
dat3
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10

A.K.
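One point worth checking with the regex-based answers: strsplit returns text, so the columns of dat3 are character (or factor, with the data.frame defaults of R versions before 4.0.0) rather than numeric. A minimal sketch of the conversion, assuming the space-separated strings have the shape shown in the output above; stringsAsFactors = FALSE is set explicitly so that as.numeric sees the digits and not factor codes.

```r
# Space-separated strings as produced by the gsub step in the posts above
dat2 <- c("10 20 0 30", "40 0 0 20", "60 10 10 5", "80 0 0 10")

# Keep the pieces as character so they can be converted safely
dat3 <- data.frame(do.call(rbind, strsplit(dat2, " ")),
                   stringsAsFactors = FALSE)
colnames(dat3) <- paste("A", 1:4, sep = "")

# Convert every column to numeric while keeping the data.frame structure
dat3[] <- lapply(dat3, as.numeric)

str(dat3)
print(rowSums(dat3))
```

With numeric columns, arithmetic such as rowSums works directly on the result.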
Hello,

You don't need a new function; what you need is to prepare your data in such a way that the function can process it.

A <- c("percent (10/20/30)", "percent (40/20)", "percent (60/10/10/5)",
"percent (80/10)")
B <- gsub("\\(|\\)|percent| ", "", A)
fun(B)

Also, please use dput to post the data examples,

dput(A)
c("percent (10/20/30)", "percent (40/20)", "percent (60/10/10/5)",
"percent (80/10)")

Then copy&paste in your post.

Rui Barradas

On 02-09-2012 04:22, Sapana Lohani wrote:
> Dear Rui,
>
> The new code works fine for what I wanted. I have another similar column but it looks like
>
> A
> percent (10/20/30)
> percent (40/20)
> percent (60/10/10/5)
> percent (80/10)
>
> I want a similar split but delete the percent in the front. The output should look like
>
> A1 A2 A3 A4
> 10 20  0 30
> 40  0  0 20
> 60 10 10  5
> 80  0  0 10
>
> Could you please make the small change in the code that you gave me. It must be a small edit but I could not figure that out. FYI the code that worked was
>
> fun <- function(X){
>     xname <- deparse(substitute(X))
>     s <- strsplit(X, "/")
>     n <- max(sapply(s, length))
>     tmp <- numeric(n)
>
>     f <- function(x){
>         x <- as.numeric(x)
>         m <- length(x)
>         if(m > 1){
>             tmp[n] <- x[m]
>             tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
>         }else tmp[1] <- x
>         tmp
>     }
>
>     res <- do.call(rbind, lapply(s, f))
>     colnames(res) <- paste(xname, seq_along(s), sep = "")
>     data.frame(res)
> }
>
> fun(A)
>
> Thank you so very much.
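As a quick illustration of why dput output is convenient for posting: the printed text is valid R code, so a reader can paste it back and get an identical object. The sketch below simulates the copy and paste with capture.output and parse; that round trip is only for demonstration and is not something the thread itself does. It then applies the cleaning step from the message above.

```r
A <- c("percent (10/20/30)", "percent (40/20)", "percent (60/10/10/5)",
       "percent (80/10)")

# The text that dput prints, i.e. what would be pasted into a post
txt <- capture.output(dput(A))

# What a reader gets after pasting that text into an R session
A2 <- eval(parse(text = txt))
print(identical(A, A2))

# The cleaning step from the message: drop parentheses, "percent" and spaces
B <- gsub("\\(|\\)|percent| ", "", A2)
print(B)
```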
Hello,

You should Cc the list; there are others presenting solutions.

What's going on should be obvious: your data example had "percent" in it, and your data file has "slope"! How could you expect it to work?

Just in case, I'm changing the regular expression to remove everything but slashes and digits.

dat <- read.csv("test.csv")
B <- gsub("[^/[:digit:]]+", "", dat$Composition_percent_part)

Rui Barradas

On 03-09-2012 01:38, Sapana Lohani wrote:
> Hello Rui,
>
> I do not know what's wrong with my data, so I am sending you the whole column I wanted to split. Could you please have a look and tell me what the error is? I am totally stuck at this point of my analysis.
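A small sketch of what the character class [^/[:digit:]]+ does: it matches every run of characters that are neither a slash nor a digit, so the label in front of the values no longer matters. The input strings below are made up for illustration, since the contents of test.csv are not shown in the thread. One caveat under that assumption: a digit inside the label itself would be kept and glued to the first value.

```r
# Hypothetical inputs; the real column in test.csv is not shown in the thread
x <- c("percent (10/20/30)", "slope (40/20)", "60/10/10/5")

# Delete everything that is not "/" or a digit
cleaned <- gsub("[^/[:digit:]]+", "", x)
print(cleaned)

# Caveat: digits in the label survive and merge with the first value
merged <- gsub("[^/[:digit:]]+", "", "slope2 (40/20)")
print(merged)
```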
Hi,

It's working for me. Try this:

dat1<-read.csv("test.csv")
dat2<-na.omit(dat1)
nrow(dat1)
#[1] 635
nrow(dat2)
#[1] 627
B<-gsub(" slope{0,1}\\s\\((.*)\\)","\\1",dat2$Composition_percent_part)
fun1<-function(x){
    C<-gsub("(.*)/(.*)","\\1 0 0 \\2",
       gsub("(.*)/(.*)/(.*)","\\1 \\2 0 \\3",
       gsub("(.*)/(.*)/(.*)/(.*)","\\1 \\2 \\3 \\4",x)))
    res<-data.frame(do.call(rbind,strsplit(C," ")))
    colnames(res)<-paste("A",1:4,sep="")
    res
}
fun1(B)
head(fun1(B),15)
#   A1 A2 A3 A4
#1  60 25  0 15
#2  60 25  0 15
#3  40 35 15 10
#4  40 35 15 10
#5  40 35 15 10
#6  40 35 15 10
#7  40 35 15 10
#8  50 40  0 10
#9  50 40  0 10
#10 50 40  0 10
#11 50 40  0 10
#12 50 30  0 20
#13 50 30  0 20
#14 50 30  0 20
#15 50 30  0 20

#or,
fun1<-function(x){
    B<-gsub(" slope{0,1}\\s\\((.*)\\)","\\1",x)
    C<-gsub("(.*)/(.*)","\\1 0 0 \\2",
       gsub("(.*)/(.*)/(.*)","\\1 \\2 0 \\3",
       gsub("(.*)/(.*)/(.*)/(.*)","\\1 \\2 \\3 \\4",B)))
    res<-data.frame(do.call(rbind,strsplit(C," ")))
    colnames(res)<-paste("A",1:4,sep="")
    res
}
fun1(dat2$Composition_percent_part)
head(fun1(dat2$Composition_percent_part),5)
#  A1 A2 A3 A4
#1 60 25  0 15
#2 60 25  0 15
#3 40 35 15 10
#4 40 35 15 10
#5 40 35 15 10

A.K.

________________________________
From: Sapana Lohani <lohani.sapana at ymail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Sunday, September 2, 2012 8:39 PM
Subject: Re: [R] splits with 0s in middle columns

Hi Arun,

I do not know what's wrong with my data, so I am sending you the whole column I wanted to split. Could you please have a look and tell me what the error is? I am totally stuck at this point of my analysis.

________________________________
From: arun <smartpink111 at yahoo.com>
To: Rui Barradas <ruipbarradas at sapo.pt>
Cc: Sapana Lohani <lohani.sapana at ymail.com>; R help <r-help at r-project.org>
Sent: Sunday, September 2, 2012 1:54 PM
Subject: Re: [R] splits with 0s in middle columns

Hi,

You can also try this as a function:

fun1<-function(x){
    B<-gsub("percent{0,1}\\s\\((.*)\\)","\\1",x)
    C<-gsub("(.*)/(.*)","\\1 0 0 \\2",
       gsub("(.*)/(.*)/(.*)","\\1 \\2 0 \\3",
       gsub("(.*)/(.*)/(.*)/(.*)","\\1 \\2 \\3 \\4",B)))
    dat1<-data.frame(do.call(rbind,strsplit(C," ")))
    colnames(dat1)<-paste0("A",1:4)
    dat1
}
fun1(A)
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10

A.K.