Nick Duncan
2012-Dec-12 17:00 UTC
[R] Adding a value to one dataframe from another dataframe
Dear All,

The problem I have is as follows. I have attribute data in a number of data.frames identified 1:58 (the column Null is just there to stop it becoming a list, which caused me trouble; the data I am interested in is in column 1, although other data frames have multiple categories/cols). The attribute data frames are always organised with 58 rows.

             ID Null
    1        NA    1
    2  86.11111    2
    3  88.88889    3
    4 100.00000    4
    5  80.55556    5
    6  97.22222    6

In another data frame I have clusters in which the subjects 1:58 appear in different combinations.

      SCD.TD cluster
    1      1       4
    2      1      17
    3      2       4
    4      2      13
    5      2      17
    6      3       4

The ID attribute of subject 2 is 86.11111, and that subject appears in three clusters. The data frames have different sizes but the same number of subjects (row 1 is always the same subject).

I need to analyse the clusters by the values of a number of attributes. For example, I would like to be able to get descriptive statistics of attribute ID by cluster. I'd like to have a general approach if possible. There are NAs in the attribute data frames, and they should be carried over. What I need to get to is below:

      SCD.TD cluster       ID
    1      1       4       NA
    2      1      17       NA
    3      2       4 86.11111
    4      2      13 86.11111
    5      2      17 86.11111
    6      3       4 88.88889

From that point I can treat the cluster as a factor, and presumably the analysis is straightforward from there.

Many thanks in advance for any assistance.

Nick
Hi,

Try ?merge() or ?join()

    dat1 <- read.table(text="
             ID Null
    1        NA    1
    2  86.11111    2
    3  88.88889    3
    4 100.00000    4
    5  80.55556    5
    6  97.22222    6
    ", sep="", header=TRUE)

    dat2 <- read.table(text="
      SCD.TD cluster
    1      1       4
    2      1      17
    3      2       4
    4      2      13
    5      2      17
    6      3       4
    ", sep="", header=TRUE)

    merge(dat2, dat1, by.x="SCD.TD", by.y="Null", all=TRUE)
    #   SCD.TD cluster        ID
    # 1      1       4        NA
    # 2      1      17        NA
    # 3      2       4  86.11111
    # 4      2      13  86.11111
    # 5      2      17  86.11111
    # 6      3       4  88.88889
    # 7      4      NA 100.00000
    # 8      5      NA  80.55556
    # 9      6      NA  97.22222

A.K.
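
A follow-up sketch, not part of the original reply: merging with all=TRUE also returns rows for subjects that appear in no cluster (rows 7 to 9 in the output shown in the reply), whereas the requested result has only the six cluster rows. Assuming the Null column really does hold the subject number, either a left join (all.x=TRUE) or a direct lookup with match() should give the requested table, and tapply() is one way to get per-cluster statistics afterwards. The names attr_df and clus_df are made up for this illustration, and the data are just the six example rows from the question.

```r
# Example data copied from the question (hypothetical object names).
attr_df <- data.frame(
  ID   = c(NA, 86.11111, 88.88889, 100, 80.55556, 97.22222),
  Null = 1:6                      # assumed to be the subject number
)
clus_df <- data.frame(
  SCD.TD  = c(1, 1, 2, 2, 2, 3),  # subject number
  cluster = c(4, 17, 4, 13, 17, 4)
)

# Option 1: left join. all.x = TRUE keeps every cluster row and drops
# attribute rows for subjects that are in no cluster.
merged <- merge(clus_df, attr_df,
                by.x = "SCD.TD", by.y = "Null", all.x = TRUE)

# Option 2: look the attribute up by subject number. This keeps the
# original row order of clus_df and carries NAs over unchanged.
clus_df$ID <- attr_df$ID[match(clus_df$SCD.TD, attr_df$Null)]

# Descriptive statistics of ID by cluster, ignoring missing values.
mean_by_cluster <- tapply(clus_df$ID, clus_df$cluster, mean, na.rm = TRUE)
n_by_cluster    <- tapply(!is.na(clus_df$ID), clus_df$cluster, sum)

print(clus_df)
print(mean_by_cluster)
print(n_by_cluster)
```

For attribute frames with several columns, the same match() index can be reused for each column of interest, or merge() can be left to bring all of them across at once.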