Rainer M. Krug
2007-Mar-05 09:04 UTC
[R] Identifying last record in individual growth data over different time intervalls
Hi

I have a data frame t which contains size measurements of individual plants, identified by the field "plate". It contains, among others, a field "year" indicating the year in which the individual was measured, and the "height". The number of measurements per plant ranges from 1 to 4, taken in different years.

My problem is that I need the LAST measurement of each plant. I only came up with the solution below, which is probably way too complicated, but I can't think of another one.

Does anybody have an idea how to do this more effectively?

Finally I would like to have a data.frame t2 which only contains the entries of the last measurements.

Thanks in advance,

Rainer

> unlist(
    sapply(
      split(t, t$plate),
      function(i)
      {
        i[i$year==max(i$year),]$id
      }
    )
  )
     15      20      33      43      44      47      64     D72    S200    S201
2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017
   S202    S203    S204    S205    S206    S207    S208    S209    S210    S211
2004095 2006019 2006020 2006021 2006022 2006023 2006024 2006025 2006026 2006027
   S212    S213    S214    S215    S216    S217    S218    S219    S220    S222
2006028 2006029 2006030 2006031 2006032 2006033 2006034 2006035 2006036 2006037
   S223    S224    S225    S226    S227    S228    S229    S230    S231    S232
2006038 2006039 2006040 2006041 2006042 2006043 2006044 2006045 2006046 2006047
>
> t
             id plate year height
2004007 2004007    15 2004   0.40
2005024 2005024    15 2005   0.43
2006001 2006001    15 2006   0.44
2004012 2004012    20 2004   0.90
2005026 2005026    20 2005   0.94
2006003 2006003    20 2006   0.98
2004025 2004025    33 2004   0.15
2005027 2005027    33 2005   0.15
2006005 2006005    33 2006   0.16
2004035 2004035    43 2004   0.26
2005038 2005038    43 2005   0.30
2006007 2006007    43 2006   0.38
2004036 2004036    44 2004   0.32
2005030 2005030    44 2005   0.39
2006008 2006008    44 2006   0.46
2004039 2004039    47 2004   0.50
2005025 2005025    47 2005   0.55
2006009 2006009    47 2006   0.63
2004055 2004055    64 2004   0.45
2005029 2005029    64 2005   0.58
2006014 2006014    64 2006   0.67
2006015 2006015   D72 2006   0.30
2004093 2004093  S200 2004   0.68
2005040 2005040  S200 2005   0.74
2006016 2006016  S200 2006   0.84
2004094 2004094  S201 2004   0.46
2005041 2005041  S201 2005   0.49
2006017 2006017  S201 2006   0.53
2004095 2004095  S202 2004   0.17
2004096 2004096  S203 2004   0.23
2005032 2005032  S203 2005   0.23
2006019 2006019  S203 2006   0.23
2004097 2004097  S204 2004   0.25
2005031 2005031  S204 2005   0.29
2006020 2006020  S204 2006   0.41
2004098 2004098  S205 2004   0.22
2005039 2005039  S205 2005   0.26
2006021 2006021  S205 2006   0.37
2004099 2004099  S206 2004   0.19
2005035 2005035  S206 2005   0.25
2006022 2006022  S206 2006   0.37
2004100 2004100  S207 2004   0.29
2005003 2005003  S207 2005   0.36
2006023 2006023  S207 2006   0.41
2004101 2004101  S208 2004   0.17
2005005 2005005  S208 2005   0.20
2006024 2006024  S208 2006   0.16
2004102 2004102  S209 2004   0.16
2005008 2005008  S209 2005   0.19
2006025 2006025  S209 2006   0.24
2004103 2004103  S210 2004   0.09
2005007 2005007  S210 2005   0.14
2006026 2006026  S210 2006   0.15
2004104 2004104  S211 2004   0.12
2005006 2005006  S211 2005   0.12
2006027 2006027  S211 2006   0.22
2004105 2004105  S212 2004   0.61
2005011 2005011  S212 2005   0.71
2006028 2006028  S212 2006   0.81
2004106 2004106  S213 2004   0.28
2005010 2005010  S213 2005   0.37
2006029 2006029  S213 2006   0.44
2004107 2004107  S214 2004   0.47
2005009 2005009  S214 2005   0.59
2006030 2006030  S214 2006   0.67
2004108 2004108  S215 2004   0.43
2005004 2005004  S215 2005   0.53
2006031 2006031  S215 2006   0.66
2004109 2004109  S216 2004   0.35
2005019 2005019  S216 2005   0.38
2006032 2006032  S216 2006   0.41
2004110 2004110  S217 2004   0.20
2005018 2005018  S217 2005   0.21
2006033 2006033  S217 2006   0.32
2004111 2004111  S218 2004   0.19
2005014 2005014  S218 2005   0.21
2006034 2006034  S218 2006   0.27
2004112 2004112  S219 2004   0.21
2005034 2005034  S219 2005   0.24
2006035 2006035  S219 2006   0.24
2004113 2004113  S220 2004   0.19
2005021 2005021  S220 2005   0.19
2006036 2006036  S220 2006   0.25
2004114 2004114  S222 2004   0.34
2005020 2005020  S222 2005   0.35
2006037 2006037  S222 2006   0.46
2005013 2005013  S223 2005   0.04
2006038 2006038  S223 2006   0.04
2005012 2005012  S224 2005   0.13
2006039 2006039  S224 2006   0.14

--
NEW EMAIL ADDRESS AND ADDRESS:

Rainer.Krug at uct.ac.za

RKrug at sun.ac.za WILL BE DISCONTINUED END OF MARCH

Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)

Leslie Hill Institute for Plant Conservation
University of Cape Town
Rondebosch 7701
South Africa

Fax: +27 - (0)86 516 2782
Fax: +27 - (0)21 650 2440 (w)
Cell: +27 - (0)83 9479 042

Skype: RMkrug

email: Rainer.Krug at uct.ac.za
Rainer at krugs.de
jim holtman
2007-Mar-05 15:58 UTC
[R] Identifying last record in individual growth data over different time intervalls
What is wrong with the method that you have? It looks reasonably efficient. As with other languages, there are always other ways of doing it. Here is another to consider, but it is basically the same:

> sapply(split(t, t$plate), function(x) x$id[which.max(x$year)])
     15      20      33      43      44      47      64     D72    S200    S201    S202    S203    S204
2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 2004095 2006019 2006020
   S205    S206    S207    S208    S209    S210    S211    S212    S213    S214    S215    S216    S217
2006021 2006022 2006023 2006024 2006025 2006026 2006027 2006028 2006029 2006030 2006031 2006032 2006033
   S218    S219    S220    S222    S223    S224
2006034 2006035 2006036 2006037 2006038 2006039
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
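The call above returns only the id of each plate's last record. A minimal sketch of how the same which.max() idea might be extended to build the requested data frame t2, assuming a small subset of rows copied from the data in this thread (the name last_rows is introduced here for illustration only):

```r
# Assumption: a five-row subset of the thread's data, same column layout
t <- data.frame(
  id     = c(2004007, 2005024, 2006001, 2006015, 2004095),
  plate  = c("15", "15", "15", "D72", "S202"),
  year   = c(2004, 2005, 2006, 2006, 2004),
  height = c(0.40, 0.43, 0.44, 0.30, 0.17),
  stringsAsFactors = FALSE
)

# For each plate, keep the whole row that has the latest year
last_rows <- lapply(split(t, t$plate), function(x) x[which.max(x$year), ])

# Stack the one-row data frames into a single data frame
t2 <- do.call(rbind, last_rows)
print(t2)
```

Note that which.max() returns the position of the first maximum, so a plate measured twice in its latest year would contribute only one row here.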
jim holtman
2007-Mar-05 16:03 UTC
[R] Identifying last record in individual growth data over different time intervalls
If you are worried about efficiency and the data frame is large or complex, you could work with the row indices only, which would be more efficient:

> sapply(split(seq(nrow(t)), t$plate), function(x) t$id[x][which.max(t$year[x])])
     15      20      33      43      44      47      64     D72    S200    S201    S202    S203    S204
2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 2004095 2006019 2006020
   S205    S206    S207    S208    S209    S210    S211    S212    S213    S214    S215    S216    S217
2006021 2006022 2006023 2006024 2006025 2006026 2006027 2006028 2006029 2006030 2006031 2006032 2006033
   S218    S219    S220    S222    S223    S224
2006034 2006035 2006036 2006037 2006038 2006039
>
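The index-only idea can also return complete rows rather than just ids. The following is a sketch under the assumption of a small subset of the thread's data; idx is a name introduced here for illustration and is not part of the original reply:

```r
# Assumption: a five-row subset of the thread's data, same column layout
t <- data.frame(
  id     = c(2004007, 2005024, 2006001, 2006015, 2004095),
  plate  = c("15", "15", "15", "D72", "S202"),
  year   = c(2004, 2005, 2006, 2006, 2004),
  height = c(0.40, 0.43, 0.44, 0.30, 0.17),
  stringsAsFactors = FALSE
)

# Split the row numbers (not the data frame) by plate, then pick,
# within each group, the row number whose year is the largest
idx <- sapply(
  split(seq_len(nrow(t)), t$plate),
  function(i) i[which.max(t$year[i])]
)

# Subset the original data frame once, using the selected row numbers
t2 <- t[idx, ]
print(t2)
```

Only integer vectors are split and the data frame is subset a single time at the end, which is the point of the index-based variant.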
Chris Stubben
2007-Mar-05 17:37 UTC
[R] Identifying last record in individual growth data over different time intervalls
> Finally I would like to have a data.frame t2 which only contains the
> entries of the last measurements.
>

You could also use aggregate to get the max year per plate, then join that back to the original dataframe using merge on year and plate (common columns in both dataframes).

x<-data.frame(id=(1:8), plate=c(15,15,15,20,20,33,43,43),
   year=c(2004,2005,2006,2004,2005,2004,2005,2006),
   height=c(0.40,0.43,0.44,0.90,0.94,0.15,0.30,0.38))

merge(x, aggregate(list(year=x$year), list(plate=x$plate), max))

  plate year id height
1    15 2006  3   0.44
2    20 2005  5   0.94
3    33 2004  6   0.15
4    43 2006  8   0.38
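For comparison, the same rows can be selected without a join. The sketch below is not from the original message: it uses base R's ave() to attach each plate's maximum year to every row and then filters on it. It reuses the example data frame x from above; latest and t2 are names chosen here for illustration.

```r
# Example data from the message above
x <- data.frame(
  id     = 1:8,
  plate  = c(15, 15, 15, 20, 20, 33, 43, 43),
  year   = c(2004, 2005, 2006, 2004, 2005, 2004, 2005, 2006),
  height = c(0.40, 0.43, 0.44, 0.90, 0.94, 0.15, 0.30, 0.38)
)

# ave() returns a vector as long as x$year holding, for every row,
# the maximum year within that row's plate
latest <- ave(x$year, x$plate, FUN = max)

# Keep only the rows whose year equals their plate's maximum
t2 <- x[x$year == latest, ]
print(t2)
```

Like the merge() version, this keeps every row that is tied for the latest year within a plate. Unlike merge(), it leaves the columns in their original order and keeps the original row names.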
Rainer M. Krug
2007-Mar-06 06:53 UTC
[R] Identifying last record in individual growth data over different time intervalls
Hi Chris

Chris Stubben wrote:
>> Finally I would like to have a data.frame t2 which only contains the
>> entries of the last measurements.
>
> You could also use aggregate to get the max year per plate then join that back
> to the original dataframe using merge on year and plate (common columns in both
> dataframes).

Thanks for the idea to use aggregate and merge - as I like SQL, this seems to be a nice approach.

Rainer