siddu479
2012-Oct-14 16:13 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com.
Jeff Newmiller
2012-Oct-14 17:16 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
Not clear to me what your successive rows indicate in your output. However, you may get something similar to what you want using the reshape2 library: library(reshape2) dta1 <- read.csv( text"Tool,Step_Number,Data1,Data2 A,1,0,1 A,2,3,2 A,3,2,3 B,1,3,2 B,2,1,2 B,3,3,2 ") # put into long form dta1l <- melt(dta1,c("Tool","Step_Number")) # re-cast to wide form # assuming first row is for tool A, second row is for tool B, etc. dta2 <- dcast(dta1l,Tool~variable+Step_Number) On Sun, 14 Oct 2012, siddu479 wrote:> Hello all, > I have a .csv file like below. > Tool,Step_Number,Data1,Data2... etc up to 100 columns. > A,1,0,1 > A,2,3,1 > A,3,2,1 > . > . > B,1,3,2 > B,2,1,2 > B,3,3,2 > . > . > ...... so on upto 50 rows > where the column "*Tool*" has distinct steps in second column > "*Step_Number*",but both have same entries in Step_Number column. > I want the output like below. > > Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so > on > A,0,1,A,3,1,A,2,1 > B,3,2,B,1,2,B,3,2 > ...... > so on. > > basically I am transposing entire data based on a specific column row values > and renaming the column headers. > I have a shell script based on awk which can do this task, but the script is > taking exceptionally higher processing time. > > So I am looking for a script in R which can save the time. > "Please revert to me if the problem description is not clear." > > Regards > Sidda > > > > > ----- > Sidda > Business Analyst Lead > Applied Materials Inc. > > -- > View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >--------------------------------------------------------------------------- Jeff Newmiller The ..... ..... Go Live... DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
arun
2012-Oct-14 17:23 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
Hi, If you don't want the column "Tool" to be repeated, try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 A 1 0 1 A 2 3 1 A 3 2 1 B 1 3 2 B 2 1 2 B 3 3 2 ",sep="",header=TRUE,stringsAsFactors=FALSE)??? ?with(dat1,aggregate(cbind(Data1,Data2),list(Tool=Tool),function(x) x)) ? Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 #1??? A?????? 0?????? 3?????? 2?????? 1?????? 1?????? 1 #2??? B?????? 3?????? 1?????? 3?????? 2?????? 2?????? 2A.K. ----- Original Message ----- From: siddu479 <onlyfordigitalstuff at gmail.com> To: r-help at r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Rui Barradas
2012-Oct-14 18:22 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
Hello, Try the following. dta1 <- read.csv( text"Tool,Step_Number,Data1,Data2 A,1,0,1 A,2,3,2 A,3,2,3 B,1,3,2 B,2,1,2 B,3,3,2 ") sp <- split(dta1[-2], dta1$Tool) dta2 <- do.call(rbind, lapply(sp, function(x) as.vector(unlist(t(x))))) dta2 <- data.frame(dta2, stringsAsFactors = FALSE) idx <- rep(c(FALSE, TRUE, TRUE), ncol(dta2) %/% 3) dta2[, idx] <- sapply(dta2[, idx], as.integer) Hope this helps, Rui Barradas Em 14-10-2012 17:13, siddu479 escreveu:> Hello all, > I have a .csv file like below. > Tool,Step_Number,Data1,Data2... etc up to 100 columns. > A,1,0,1 > A,2,3,1 > A,3,2,1 > . > . > B,1,3,2 > B,2,1,2 > B,3,3,2 > . > . > ...... so on upto 50 rows > where the column "*Tool*" has distinct steps in second column > "*Step_Number*",but both have same entries in Step_Number column. > I want the output like below. > > Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so > on > A,0,1,A,3,1,A,2,1 > B,3,2,B,1,2,B,3,2 > ...... > so on. > > basically I am transposing entire data based on a specific column row values > and renaming the column headers. > I have a shell script based on awk which can do this task, but the script is > taking exceptionally higher processing time. > > So I am looking for a script in R which can save the time. > "Please revert to me if the problem description is not clear." > > Regards > Sidda > > > > > ----- > Sidda > Business Analyst Lead > Applied Materials Inc. > > -- > View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
arun
2012-Oct-14 18:29 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
HI, If you need the "Tool" column repeated, Try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 Data3 A 1 0 1 2 A 2 3 1 4? A 3 2 1 3 B 1 3 2 1 B 2 1 2 3 B 3 3 2 0 ",sep="",header=TRUE,stringsAsFactors=FALSE)?? ? dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x)) dat3 #Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 Data3.1 Data3.2 Data3.3 #1??? A?????? 0?????? 3?????? 2?????? 1?????? 1?????? 1?????? 2?????? 4?????? 3 #2??? B?????? 3?????? 1?????? 3?????? 2?????? 2?????? 2?????? 1?????? 3?????? 0 dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))) colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4))) dat4[grepl("Data.*",colnames(dat4))]<-sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.character(x))) ?dat4 #? Tool Data1.2 Data1.3 Data1.4 Tool Data2.2 Data2.3 Data2.4 Tool Data3.2 #1??? A?????? 0?????? 3?????? 2??? A?????? 1?????? 1?????? 1??? A?????? 2 #2??? B?????? 3?????? 1?????? 3??? B?????? 2?????? 2?????? 2??? B?????? 1 ? Data3.3 Data3.4 1?????? 4?????? 3 2?????? 3?????? 0 ?str(dat4) #'data.frame':??? 2 obs. of? 12 variables: # $ Tool?? : Factor w/ 2 levels "A","B": 1 2 # $ Data1.2: num? 0 3 # $ Data1.3: num? 3 1 ?#$ Data1.4: num? 2 3 ?#$ Tool?? : Factor w/ 2 levels "A","B": 1 2 ?#$ Data2.2: num? 1 2 ?#$ Data2.3: num? 1 2 ?#$ Data2.4: num? 1 2 # $ Tool?? : Factor w/ 2 levels "A","B": 1 2 # $ Data3.2: num? 2 1 # $ Data3.3: num? 4 3 # $ Data3.4: num? 3 0 A.K. ----- Original Message ----- From: siddu479 <onlyfordigitalstuff at gmail.com> To: r-help at r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
arun
2012-Oct-14 20:25 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
HI, Just a modification of the column names.? I noticed that the "Data." column starts with "2".? Try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 Data3 A 1 0 1 2 A 2 3 1 4 A 3 2 1 3 B 1 3 2 1 B 2 1 2 3 B 3 3 2 0 ",sep="",header=TRUE,stringsAsFactors=FALSE)? ? dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x)) dat3 dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))) colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4))) dat4[grepl("Data.*",colnames(dat4))]<-sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.character(x))) colnames(dat4)[colnames(dat4)%in%colnames(dat4[grep("Data",colnames(dat4))])]<-paste(rep(names(lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))),each=length(unique(dat1$Step_Number))),"_",1:length(unique(dat1$Step_Number)),sep="") dat4 # Tool Data1_1 Data1_2 Data1_3 Tool Data2_1 Data2_2 Data2_3 Tool Data3_1 #1??? A?????? 0?????? 3?????? 2??? A?????? 1?????? 1?????? 1??? A?????? 2 #2??? B?????? 3?????? 1?????? 3??? B?????? 2?????? 2?????? 2??? B?????? 1 ?# Data3_2 Data3_3 #1?????? 4?????? 3 #2?????? 3?????? 0 A.K. ----- Original Message ----- From: siddu479 <onlyfordigitalstuff at gmail.com> To: r-help at r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
siddu479
2012-Oct-15 02:43 UTC
[R] transforming a .csv file column names as per a particular column rows using R code
Hi Jeff Newmiller, The script given by you is perfect for my requirement. The rest of the scripts given by other authors are failing to format the file header which are giving like eachdatacolumn.1 or eachdatacolumn.2 etc, whereas I need them as eachdatacolumn_firstactualstepnumber, eachdatacolumn_secondactualstepnumber etc. in my output. ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137p4646188.html Sent from the R help mailing list archive at Nabble.com.
Apparently Analagous Threads
- Excluding all missing values with dcast ("reshape2" package)
- listing the files in a directory using regular expressions
- Getting error while running unix commands within R using system() function
- Excluding all teh columns from a data frame if the standard deviation of that column is zero(0).
- why the same values cannot be judged to be the same in R