Hi,

A column of my df looks like

A
10/20/30
40/20
60/10/10/5
80/10

I want to split it so that the last column holds the last component and, if there are not enough components, the middle columns are filled with 0s. That way my df would look like

A1 A2 A3 A4
10 20  0 30
40  0  0 20
60 10 10  5
80  0  0 10

How can I do that?
Hello,
Try the following.
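A <- c(
  "10/20/30",
  "40/20",
  "60/10/10/5",
  "80/10")

fun <- function(X){
    # Use the name of the argument as the column prefix (here "A").
    xname <- deparse(substitute(X))
    s <- strsplit(X, "/")
    # The longest entry determines the number of columns.
    n <- max(sapply(s, length))
    tmp <- numeric(n)  # template row of zeros

    f <- function(x){
        x <- as.numeric(x)
        m <- length(x)
        tmp[n] <- x[m]  # last component goes into the last column
        tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]  # the rest fill from the left
        tmp
    }

    res <- do.call(rbind, lapply(s, f))
    colnames(res) <- paste0(xname, 1:ncol(res))
    data.frame(res)
}
fun(A)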
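To see what the padding step inside f does for a single entry, here is a minimal sketch for "40/20" with four target columns. The value n = 4 is set by hand for illustration; in fun it comes from the longest entry.

```r
# One entry from the example, split into its components
x <- as.numeric(strsplit("40/20", "/")[[1]])
n <- 4                 # assumed number of target columns
m <- length(x)         # number of components actually present
tmp <- numeric(n)      # row of zeros
tmp[n] <- x[m]         # last component goes into the last column
tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]  # remaining components fill from the left
tmp
# [1] 40  0  0 20
```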
Hope this helps,
Rui Barradas
On 31-08-2012 17:10, Sapana Lohani wrote:
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hello,
The error message means you should update your version of R. paste0 was introduced in
R 2.15.0 and is the shorter equivalent of
paste(..., sep = "")
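On an R version older than 2.15.0, one workaround (besides updating) is to define the missing function yourself. A minimal sketch, assuming nothing else in your session is already named paste0:

```r
# Define paste0 only if this R version does not provide it
if (!exists("paste0")) {
    paste0 <- function(...) paste(..., sep = "")
}
paste0("A", 1:4)
```

On a current R installation the if block does nothing and the built-in paste0 is used.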
I'm also sending you a small correction, in case some of the entries in
A have only one component.
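fun <- function(X){
    xname <- deparse(substitute(X))
    s <- strsplit(X, "/")
    n <- max(sapply(s, length))
    tmp <- numeric(n)

    f <- function(x){
        x <- as.numeric(x)
        m <- length(x)
        if(m > 1){
            tmp[n] <- x[m]
            tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
        }else tmp[1] <- x  # a single component stays in the first column
        tmp
    }

    res <- do.call(rbind, lapply(s, f))
    # Name the columns by column count; seq_along(s) would count the rows
    # and fail whenever the number of rows differs from the number of columns.
    colnames(res) <- paste(xname, seq_len(ncol(res)), sep = "")
    data.frame(res)
}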
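On real data the number of rows usually differs from the number of columns, so the column names have to be built from ncol(res). The following is a minimal sketch of a variant with an explicit column prefix. The names split_pad, prefix and sep are hypothetical and not part of the code above, and the extra entries "50" and "70/30" are made up to exercise the single-component case.

```r
# Hypothetical helper: same padding rule as fun(), explicit column prefix
split_pad <- function(x, prefix = "A", sep = "/") {
    s <- strsplit(x, sep, fixed = TRUE)
    n <- max(sapply(s, length))          # widest entry sets the column count
    pad <- function(p) {
        p <- as.numeric(p)
        m <- length(p)
        out <- numeric(n)
        if (m > 1) {
            out[n] <- p[m]                               # last component last
            out[seq_len(m - 1)] <- p[seq_len(m - 1)]     # others from the left
        } else {
            out[1] <- p                                  # single component first
        }
        out
    }
    res <- do.call(rbind, lapply(s, pad))
    colnames(res) <- paste(prefix, seq_len(ncol(res)), sep = "")
    data.frame(res)
}

x <- c("10/20/30", "40/20", "60/10/10/5", "80/10", "50", "70/30")
res <- split_pad(x)
res
```

This sketch does not handle empty strings or NA entries; those would need to be filtered out first.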
Rui Barradas
On 31-08-2012 21:58, Sapana Lohani wrote:
> Hi Rui,
>
> I am getting an error message from the code you sent. It says: Error in fun(A) : could not find function "paste0"
>
> I am new to R, so I could not fix it. Can you help me fix that?
>
> Thanks
>
>
>
>
> ________________________________
> From: Rui Barradas <ruipbarradas at sapo.pt>
> To: Sapana Lohani <lohani.sapana at ymail.com>
> Cc: r-help <r-help at r-project.org>
> Sent: Friday, August 31, 2012 12:35 PM
> Subject: Re: [R] splits with 0s in middle columns
Hi,
Try this:
dat1<-read.table(text="
10/20/30
40/20
60/10/10/5
80/10
",sep="",header=FALSE,stringsAsFactors=FALSE)
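# The innermost call handles four components, the next three, the outermost two;
# an entry that has already been rewritten contains no "/" and is left alone.
dat2 <- gsub("(.*)/(.*)", "\\1 0 0 \\2",
        gsub("(.*)/(.*)/(.*)", "\\1 \\2 0 \\3",
        gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4", dat1$V1)))
dat3 <- data.frame(do.call(rbind, strsplit(dat2, " ")))
colnames(dat3) <- paste0("A", 1:4)
dat3
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10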
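One thing to be aware of with this approach: strsplit returns character strings, so the columns of dat3 are character (or factor, depending on the R version and stringsAsFactors), not numeric. A minimal sketch of converting them, using a small stand-in for dat3 rather than the full result:

```r
# Stand-in for dat3 with factor columns, as older R versions would create
dat3 <- data.frame(A1 = c("10", "40"), A2 = c("20", "0"),
                   stringsAsFactors = TRUE)
# Go through as.character first so factor codes are not mistaken for values
dat3[] <- lapply(dat3, function(col) as.numeric(as.character(col)))
str(dat3)
```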
A.K.
Hi,
You can also use ifelse:
dat1<-read.table(text="
10/20/30
40/20
60/10/10/5
80/10
",sep="",header=FALSE,stringsAsFactors=FALSE)
# Number of "/" separators in each entry
nslash <- nchar(gsub("[^/]", "", dat1$V1))
dat2 <- ifelse(nslash == 1, gsub("(.*)/(.*)", "\\1 0 0 \\2", dat1$V1),
        ifelse(nslash == 2, gsub("(.*)/(.*)/(.*)", "\\1 \\2 0 \\3", dat1$V1),
               gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4", dat1$V1)))
dat3 <- data.frame(do.call(rbind, strsplit(dat2, " ")))
colnames(dat3) <- paste0("A", 1:4)
dat3
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10
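The ifelse version branches on how many "/" separators each entry contains. A short sketch of that counting step on its own, with an equivalent count obtained from strsplit for comparison (the variable names nslash and parts are introduced here for illustration):

```r
x <- c("10/20/30", "40/20", "60/10/10/5", "80/10")
# Delete everything that is not a slash, then count what is left
nslash <- nchar(gsub("[^/]", "", x))
# Alternative: count the components after splitting
parts <- sapply(strsplit(x, "/"), length)
nslash
parts
```

The number of components is always one more than the number of separators.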
A.K.
Hello,
You don't need a new function; what you need is to prepare your data in
such a way that the function can process it.
A <- c("percent (10/20/30)", "percent (40/20)",
       "percent (60/10/10/5)", "percent (80/10)")
# Strip the parentheses, the word "percent" and the spaces.
B <- gsub("\\(|\\)|percent| ", "", A)
fun(B)
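For reference, a sketch of what the cleaning step produces, alongside an alternative pattern that keeps only the text inside the parentheses. The second pattern is not from the thread; it is one possible variant that does not depend on the label being "percent". Note also that fun builds its column names from the name of its argument, so fun(B) gives columns B1 to B4 rather than A1 to A4.

```r
A <- c("percent (10/20/30)", "percent (40/20)",
       "percent (60/10/10/5)", "percent (80/10)")
B  <- gsub("\\(|\\)|percent| ", "", A)   # pattern from the post
B2 <- sub(".*\\((.*)\\).*", "\\1", A)    # assumed alternative: keep the parenthesised part
B
identical(B, B2)
```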
Also, please use dput to post data examples:

dput(A)
c("percent (10/20/30)", "percent (40/20)", "percent (60/10/10/5)",
"percent (80/10)")

Then copy and paste the output into your post.
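dput writes an R expression that recreates the object, which is why pasted dput output can be run directly by anyone on the list. A minimal sketch of the round trip, with capture.output and parse standing in for the copy and paste step:

```r
A <- c("percent (10/20/30)", "percent (40/20)",
       "percent (60/10/10/5)", "percent (80/10)")
txt <- capture.output(dput(A))   # the text you would paste into a post
A2 <- eval(parse(text = txt))    # what a reader gets by running that text
identical(A, A2)
```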
Rui Barradas
On 02-09-2012 04:22, Sapana Lohani wrote:
> Dear Rui,
>
> The new code works fine for what I wanted. I have another similar column, but it looks like
>
> A
> percent (10/20/30)
> percent (40/20)
> percent (60/10/10/5)
> percent (80/10)
>
> I want a similar split, but with the "percent" at the front deleted. The output should look like
>
> A1 A2 A3 A4
> 10 20  0 30
> 40  0  0 20
> 60 10 10  5
> 80  0  0 10
>
> Could you please make the small change to the code that you gave me? It must be a small edit, but I could not figure it out. FYI, the code that worked was the corrected fun you sent, called as
>
> fun(A)
>
> Thank you so very much.
Hello,
You should Cc the list; there are others presenting solutions.
What's going on should be obvious: your data example had "percent" in it, and your data file has "slope"!
How could you expect it to work?
Just in case, I'm changing the regular expression to remove everything
but slashes and digits.

dat <- read.csv("test.csv")
B <- gsub("[^/[:digit:]]+", "", dat$Composition_percent_part)
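The character class [^/[:digit:]] matches anything that is neither a slash nor a digit, so the gsub call strips labels, spaces and parentheses in one pass. A brief sketch on made-up strings follows (the real contents of test.csv are not shown in the thread). The last entry illustrates a caveat: digits that occur in the label itself survive and merge with the first component.

```r
# Hypothetical entries; only the "(a/b/c)" layout is taken from the thread
x <- c("slope (60/25/15)", "percent (40/35/15/10)", "0-3 slope (50/40/10)")
B <- gsub("[^/[:digit:]]+", "", x)
B
```

If labels in the real data can contain digits, extracting only the parenthesised part would be a safer choice.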
Rui Barradas
On 03-09-2012 01:38, Sapana Lohani wrote:
> Hello Rui,
>
> I do not know what's wrong with my data, so I am sending you the whole column I wanted to split. Could you please have a look and tell me what the error is? I am totally stuck at this point of my analysis.
>
>
>
> ________________________________
> From: Rui Barradas <ruipbarradas at sapo.pt>
> To: Sapana Lohani <lohani.sapana at ymail.com>
> Cc: r-help <r-help at r-project.org>
> Sent: Sunday, September 2, 2012 10:05 AM
> Subject: Re: [R] splits with 0s in middle columns
Hi,
It's working for me.
Try this:
dat1 <- read.csv("test.csv")
dat2 <- na.omit(dat1)

nrow(dat1)
#[1] 635
nrow(dat2)
#[1] 627
B <- gsub(" slope{0,1}\\s\\((.*)\\)", "\\1", dat2$Composition_percent_part)
fun1 <- function(x){
    C <- gsub("(.*)/(.*)", "\\1 0 0 \\2",
         gsub("(.*)/(.*)/(.*)", "\\1 \\2 0 \\3",
         gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4", x)))
    res <- data.frame(do.call(rbind, strsplit(C, " ")))
    colnames(res) <- paste("A", 1:4, sep="")
    res
}
fun1(B)
head(fun1(B), 15)
#   A1 A2 A3 A4
#1  60 25  0 15
#2  60 25  0 15
#3  40 35 15 10
#4  40 35 15 10
#5  40 35 15 10
#6  40 35 15 10
#7  40 35 15 10
#8  50 40  0 10
#9  50 40  0 10
#10 50 40  0 10
#11 50 40  0 10
#12 50 30  0 20
#13 50 30  0 20
#14 50 30  0 20
#15 50 30  0 20
# or,
fun1 <- function(x){
    B <- gsub(" slope{0,1}\\s\\((.*)\\)", "\\1", x)
    C <- gsub("(.*)/(.*)", "\\1 0 0 \\2",
         gsub("(.*)/(.*)/(.*)", "\\1 \\2 0 \\3",
         gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4", B)))
    res <- data.frame(do.call(rbind, strsplit(C, " ")))
    colnames(res) <- paste("A", 1:4, sep="")
    res
}
fun1(dat2$Composition_percent_part)
head(fun1(dat2$Composition_percent_part), 5)
#  A1 A2 A3 A4
#1 60 25  0 15
#2 60 25  0 15
#3 40 35 15 10
#4 40 35 15 10
#5 40 35 15 10
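The na.omit call in the code above drops rows with missing values before the split (635 rows became 627 in the run shown). A small sketch of that step on made-up data; the three values below are hypothetical and only the column name is taken from the thread.

```r
# Hypothetical stand-in for the contents of test.csv
dat1 <- data.frame(
    Composition_percent_part = c("slope (60/25/15)", NA, "slope (50/40/10)"),
    stringsAsFactors = FALSE)
dat2 <- na.omit(dat1)   # removes the row containing NA
nrow(dat1)
nrow(dat2)
```

Without this step, an NA entry would pass through gsub as NA and produce a row of different length in strsplit, which rbind cannot combine cleanly with the four-column rows.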
A.K.
________________________________
From: Sapana Lohani <lohani.sapana at ymail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Sunday, September 2, 2012 8:39 PM
Subject: Re: [R] splits with 0s in middle columns
Hi Arun,
I do not know what's wrong with my data, so I am sending you the whole column I wanted to split. Could you please have a look and tell me what the error is? I am totally stuck at this point of my analysis.
________________________________
From: arun <smartpink111 at yahoo.com>
To: Rui Barradas <ruipbarradas at sapo.pt>
Cc: Sapana Lohani <lohani.sapana at ymail.com>; R help <r-help at r-project.org>
Sent: Sunday, September 2, 2012 1:54 PM
Subject: Re: [R] splits with 0s in middle columns
Hi,
You can also try this as a function:
fun1 <- function(x){
    # Keep only the text inside the parentheses after "percent"
    B <- gsub("percent{0,1}\\s\\((.*)\\)", "\\1", x)
    C <- gsub("(.*)/(.*)", "\\1 0 0 \\2",
         gsub("(.*)/(.*)/(.*)", "\\1 \\2 0 \\3",
         gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4", B)))
    dat1 <- data.frame(do.call(rbind, strsplit(C, " ")))
    colnames(dat1) <- paste0("A", 1:4)
    dat1
}
fun1(A)
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10
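One detail of the pattern above: the quantifier {0,1} applies only to the character just before it, so percent{0,1} matches "percen" followed by an optional "t", not an optional word "percent". A short sketch of the difference on made-up inputs; the grouped pattern is an assumed alternative, not something posted in the thread.

```r
x <- c("percent (10/20/30)", "percen (40/20)", "(60/10/10/5)")
# As posted: only the final "t" is optional
r1 <- sub("percent{0,1}\\s\\((.*)\\)", "\\1", x)
# Grouped: the whole word (with the space after it) is optional
r2 <- sub("(percent\\s)?\\((.*)\\)", "\\2", x)
r1
r2
```

For the data in the thread, where every entry starts with "percent", both patterns give the same result.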
A.K.