Dear R Forum, I have a data.frame as df = data.frame(date = c("2013-04-15", "2013-04-14", "2013-04-13", "2013-04-12", "2013-04-11"), ABC_f = c(62.80739769,81.04525895,84.65712455,12.78237251,57.61345256), LMN_d = c(21.16794336,54.6580401,63.8923307,87.59880367,87.07693716), XYZ_p = c(55.8885464,94.1358684,84.0089114,98.99746696,64.71083712), LMN_a = c(56.6768395,25.81530198,40.12268441,35.74175237,47.95892209), ABC_e = c(11.36783959,62.29651784,47.63481552,32.27820673,52.12561419), LMN_c = c(45.4484695,17.72362438,36.7690054,68.58912931,35.80767235), XYZ_zz = c(85.74755089,63.48582415,81.61107212,58.1572924,27.44132817), PQR = c(71.22867519,95.09994812,83.62437819,30.18524735,25.81804865), ABC_d c(38.71089816,93.48216193,93.14432203,78.2738731,31.87170019), ABC_m = c(40.28473769,43.97076327,47.38761559,97.33573412,22.06884976))> dfdate ABC_f LMN_d XYZ_p LMN_a ABC_e 1 2013-04-15 62.80740 21.16794 55.88855 56.67684 11.36784 2 2013-04-14 81.04526 54.65804 94.13587 25.81530 62.29652 3 2013-04-13 84.65712 63.89233 84.00891 40.12268 47.63482 4 2013-04-12 12.78237 87.59880 98.99747 35.74175 32.27821 5 2013-04-11 57.61345 87.07694 64.71084 47.95892 52.12561 LMN_c XYZ_zz PQR ABC_d ABC_m 1 45.44847 85.74755 71.22868 38.71090 40.28474 2 17.72362 63.48582 95.09995 93.48216 43.97076 3 36.76901 81.61107 83.62438 93.14432 47.38762 4 68.58913 58.15729 30.18525 78.27387 97.33573 5 35.80767 27.44133 25.81805 31.87170 22.06885 I need to identify columns with same labels and along-with the dates in the first column, save the columns in different csv files. E.g. in the above data frame, I have 4 columns beginning with ABC so I need to save these four columns with the date in the first column as ABC.csv, then LMN_d, LMN_a, LMN_c in the LMN.csv file as date, LMN_a, LMN_c, LMN_d and so on. In my actual data.frame, I won't be aware how many such rates combinations are available. If there is no matching column as "PQR", the PQR.csv file should have only date and PQR column. Kindly guide how do I split the data.frame and save the respective csv files. Regards Katherine [[alternative HTML version deleted]]
Hint: nm <- substring(names(df). 1,3) gives the first 3 letters of the names, assuming this is the info needed for classifying the names -- you were not explicit about this. If some sort of pattern is used, ?grep may be what you need. You can then pick columns from df by e.g. loopingt through unique(nm)... etc. -- Bert On Fri, Apr 26, 2013 at 6:21 AM, Katherine Gobin <katherine_gobin at yahoo.com> wrote:> Dear R Forum, > > I have a data.frame as > > df = data.frame(date = c("2013-04-15", "2013-04-14", "2013-04-13", "2013-04-12", "2013-04-11"), > ABC_f = c(62.80739769,81.04525895,84.65712455,12.78237251,57.61345256), > LMN_d = c(21.16794336,54.6580401,63.8923307,87.59880367,87.07693716), > XYZ_p = c(55.8885464,94.1358684,84.0089114,98.99746696,64.71083712), > LMN_a = c(56.6768395,25.81530198,40.12268441,35.74175237,47.95892209), > ABC_e = c(11.36783959,62.29651784,47.63481552,32.27820673,52.12561419), > LMN_c = c(45.4484695,17.72362438,36.7690054,68.58912931,35.80767235), > XYZ_zz = c(85.74755089,63.48582415,81.61107212,58.1572924,27.44132817), > PQR = c(71.22867519,95.09994812,83.62437819,30.18524735,25.81804865), > ABC_d > c(38.71089816,93.48216193,93.14432203,78.2738731,31.87170019), > ABC_m = c(40.28473769,43.97076327,47.38761559,97.33573412,22.06884976)) > > >> df > date ABC_f LMN_d XYZ_p LMN_a ABC_e > 1 2013-04-15 62.80740 21.16794 55.88855 56.67684 11.36784 > 2 2013-04-14 81.04526 54.65804 94.13587 25.81530 62.29652 > 3 2013-04-13 84.65712 63.89233 84.00891 40.12268 47.63482 > 4 2013-04-12 12.78237 87.59880 98.99747 35.74175 32.27821 > 5 2013-04-11 57.61345 87.07694 64.71084 47.95892 52.12561 > LMN_c XYZ_zz PQR ABC_d ABC_m > 1 45.44847 85.74755 71.22868 38.71090 40.28474 > 2 17.72362 63.48582 95.09995 93.48216 43.97076 > 3 36.76901 81.61107 83.62438 93.14432 47.38762 > 4 68.58913 58.15729 30.18525 78.27387 > 97.33573 > 5 35.80767 27.44133 25.81805 31.87170 22.06885 > > I need to identify columns with same labels and along-with the dates in the first column, save the columns in different csv files. > > E.g. in the above data frame, I have 4 columns beginning with ABC so I need to save these four columns with the date in the first column as ABC.csv, then LMN_d, LMN_a, LMN_c in the LMN.csv file as date, LMN_a, LMN_c, LMN_d and so on. In my actual data.frame, I won't be aware how many such rates combinations are available. If there is no matching column as "PQR", the PQR.csv file should have only date and PQR column. > > Kindly guide how do I split the data.frame and save the respective csv files. > > Regards > > Katherine > > > > > > > > > > > > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Hi, You can do this: ?lst1<-lapply(split(colnames(df)[-1],gsub("_.*","",colnames(df)[-1])),function(x) {x1<-cbind(date=df[,1],df[,x]);colnames(x1)[-1]<- x;x1}) ?lst1 #$ABC ?# ????? date??? ABC_f??? ABC_e??? ABC_d??? ABC_m #1 2013-04-15 62.80740 11.36784 38.71090 40.28474 #2 2013-04-14 81.04526 62.29652 93.48216 43.97076 #3 2013-04-13 84.65712 47.63482 93.14432 47.38762 #4 2013-04-12 12.78237 32.27821 78.27387 97.33573 #5 2013-04-11 57.61345 52.12561 31.87170 22.06885 # #$LMN ?# ????? date??? LMN_d??? LMN_a??? LMN_c #1 2013-04-15 21.16794 56.67684 45.44847 #2 2013-04-14 54.65804 25.81530 17.72362 #3 2013-04-13 63.89233 40.12268 36.76901 #4 2013-04-12 87.59880 35.74175 68.58913 #5 2013-04-11 87.07694 47.95892 35.80767 # #$PQR ?# ?? date????? PQR #[1,]??? 5 71.22868 #[2,]??? 4 95.09995 #[3,]??? 3 83.62438 #[4,]??? 2 30.18525 #[5,]??? 1 25.81805 # #$XYZ ?# ????? date??? XYZ_p?? XYZ_zz #1 2013-04-15 55.88855 85.74755 #2 2013-04-14 94.13587 63.48582 #3 2013-04-13 84.00891 81.61107 #4 2013-04-12 98.99747 58.15729 #5 2013-04-11 64.71084 27.44133 ?lapply(seq_along(lst1),function(i) write.csv(lst1[[i]],file=paste0(names(lst1[i]),".csv"),row.names=FALSE)) A.K. ----- Original Message ----- From: Katherine Gobin <katherine_gobin at yahoo.com> To: r-help at r-project.org Cc: Sent: Friday, April 26, 2013 9:21 AM Subject: [R] Splitting data.frame and saving to csv files Dear R Forum, I have a data.frame as df = data.frame(date = c("2013-04-15", "2013-04-14", "2013-04-13", "2013-04-12", "2013-04-11"), ABC_f = c(62.80739769,81.04525895,84.65712455,12.78237251,57.61345256), LMN_d = c(21.16794336,54.6580401,63.8923307,87.59880367,87.07693716), XYZ_p = c(55.8885464,94.1358684,84.0089114,98.99746696,64.71083712), LMN_a = c(56.6768395,25.81530198,40.12268441,35.74175237,47.95892209), ABC_e = c(11.36783959,62.29651784,47.63481552,32.27820673,52.12561419), LMN_c = c(45.4484695,17.72362438,36.7690054,68.58912931,35.80767235), XYZ_zz = c(85.74755089,63.48582415,81.61107212,58.1572924,27.44132817), PQR = c(71.22867519,95.09994812,83.62437819,30.18524735,25.81804865), ABC_d c(38.71089816,93.48216193,93.14432203,78.2738731,31.87170019), ABC_m = c(40.28473769,43.97076327,47.38761559,97.33573412,22.06884976))> df??????? date??? ABC_f??? LMN_d??? XYZ_p??? LMN_a??? ABC_e 1 2013-04-15 62.80740 21.16794 55.88855 56.67684 11.36784 2 2013-04-14 81.04526 54.65804 94.13587 25.81530 62.29652 3 2013-04-13 84.65712 63.89233 84.00891 40.12268 47.63482 4 2013-04-12 12.78237 87.59880 98.99747 35.74175 32.27821 5 2013-04-11 57.61345 87.07694 64.71084 47.95892 52.12561 ???? LMN_c?? XYZ_zz????? PQR??? ABC_d??? ABC_m 1 45.44847 85.74755 71.22868 38.71090 40.28474 2 17.72362 63.48582 95.09995 93.48216 43.97076 3 36.76901 81.61107 83.62438 93.14432 47.38762 4 68.58913 58.15729 30.18525 78.27387 97.33573 5 35.80767 27.44133 25.81805 31.87170 22.06885 I need to identify columns with same labels and along-with the dates in the first column, save the columns in different csv files. E.g. in the above data frame, I have 4 columns beginning with ABC so I need to save these four columns with the date in the first column as ABC.csv, then LMN_d, LMN_a, LMN_c in the LMN.csv file as date, LMN_a, LMN_c, LMN_d and so on. In my actual data.frame, I won't be aware how many such rates combinations are available. If there is no matching column as "PQR", the PQR.csv file should have only date and PQR column. Kindly guide how do I split the data.frame and save the respective csv files. Regards Katherine ??? [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.