Hello there,

I have a data frame, a small version of which could look like the following:

  x1 x2
1  A  1
2  B  2
3  B  3

Now I need to remove rows which are duplicate in x1, i.e. in the example above I would remove row 3.

I have an ugly solution with for and while loops and ifs. ... And of course my data set is much larger and my solution takes a long time.

Any ideas what could be the best way to do this in R?

Better yet: I actually would like to sort of collapse row 2 and 3 in the example above by replacing 2 and 3 with a new row 2 which has in x2 the mean of old x2 of row 2 and 3 (maybe this is poorly said).

Anyways, thanks a lot in advance for suggestions.

-- D

Here is one way to get the means:

> x
  x1 x2
1  A  1
2  B  2
3  B  3
> aggregate(x$x2, list(x$x1), mean)
  Group.1   x
1       A 1.0
2       B 2.5
>

On 10/2/07, Dieter Best <dieterbest_2000 at yahoo.com> wrote:
> Better yet: I actually would like to sort of collapse row 2 and 3 in the example above by replacing 2 and 3 with a new row 2 which has in x2 the mean of old x2 of row 2 and 3 (maybe this is poorly said).
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH

What is the problem you are trying to solve?

?aggregate

x <- data.frame(a = as.factor(c("A", "B", "B", "C", "B", "A", "D")),
                b = c(3, 2, 1, 1, 2, 3, 7))
aggregate(x[, 2], list(x[, 1]), mean)

--- Dieter Best <dieterbest_2000 at yahoo.com> wrote:
> Better yet: I actually would like to sort of collapse row 2 and 3 in the example above by replacing 2 and 3 with a new row 2 which has in x2 the mean of old x2 of row 2 and 3 (maybe this is poorly said).

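The aggregate() calls above label their result columns Group.1 and x. If the original column names should survive the collapse, the formula interface of aggregate() is one option. A minimal sketch, assuming the data frame from the question is stored under the name DF (that name is an assumption made here for illustration):

```r
# Rebuild the small example from the question (the name DF is assumed).
DF <- data.frame(x1 = c("A", "B", "B"), x2 = c(1, 2, 3))

# Collapse rows that share a value of x1, replacing x2 by its group mean.
# The formula interface keeps the column names x1 and x2 in the result.
collapsed <- aggregate(x2 ~ x1, data = DF, FUN = mean)
print(collapsed)
```

Note that the formula method drops rows with missing values by default (na.action = na.omit), which may or may not be what is wanted on a larger data set.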
Try this:

> DF[!duplicated(DF$x1), ]
  x1 x2
1  A  1
2  B  2
> # or
> subset(DF, !duplicated(x1))
  x1 x2
1  A  1
2  B  2

On 10/2/07, Dieter Best <dieterbest_2000 at yahoo.com> wrote:
> Now I need to remove rows which are duplicate in x1, i.e. in the example above I would remove row 3.
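
duplicated() marks every occurrence of a value after the first, so the calls above keep the first row of each group. A small sketch of what the logical vector looks like, and of keeping the last row per group instead via the fromLast argument; DF is rebuilt here from the question's example so the snippet runs on its own, and the names first_kept and last_kept are made up for illustration:

```r
# Rebuild the small example from the question.
DF <- data.frame(x1 = c("A", "B", "B"), x2 = c(1, 2, 3))

# TRUE marks a value of x1 that has already appeared earlier in the vector.
dup <- duplicated(DF$x1)

# Keep the first row of each x1 group (rows 1 and 2 here).
first_kept <- DF[!dup, ]

# Keep the last row of each x1 group instead (rows 1 and 3 here).
last_kept <- DF[!duplicated(DF$x1, fromLast = TRUE), ]

print(first_kept)
print(last_kept)
```

Both variants are vectorised, so they should avoid the explicit for and while loops mentioned in the question.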