Yi
2010-Jun-29 01:30 UTC
[R] How to delete rows based on replicate values in one column with some extra calcuation
Hi, folks,
Please let me illustrate the problem with the following code:
first=c('u','b','e','k','j','c','u','f','c','e')
second=c('usa','Brazil','England','Korea','Japan','China','usa','France','China','England')
third=1:10
data=data.frame(first,second,third)
## The values in the first column are unique codes for those in the second column,
## so 'u' is only for usa. Duplicated values appear in the same rows for the
## first and second columns.
## Now I want to delete the duplicate rows that have the same value in the first
## (second) column and sum up the values in the third column for those rows.
mm=melt(data,id='first')
sum=cast(mm,first~variable,sum)
### This does not work. ><
But the expected dataframe is like this:
1 u third 8
2 b third 2
3 e third 13
4 k third 4
5 j third 5
6 c third 15
8 f third 8
Thanks in advance.
Yi
Nikhil Kaza
2010-Jun-29 02:58 UTC
[R] How to delete rows based on replicate values in one column with some extra calcuation
aggregate(data$third, by=list(data$first), sum)

or

require(reshape)
cast(melt(data), ~first, sum)

On Jun 28, 2010, at 9:30 PM, Yi wrote:
> first=c('u','b','e','k','j','c','u','f','c','e')
> second=c('usa','Brazil','England','Korea','Japan','China','usa','France','China','England')
> third=1:10
> data=data.frame(first,second,third)
David Winsemius
2010-Jun-29 03:16 UTC
[R] How to delete rows based on replicate values in one column with some extra calcuation
On Jun 28, 2010, at 9:30 PM, Yi wrote:
> Hi, folks,
>
> Please let me address the problem by the following codes:
>
> first=c('u','b','e','k','j','c','u','f','c','e')
> second=c('usa','Brazil','England','Korea','Japan','China','usa','France','China','England')
> third=1:10
> data=data.frame(first,second,third)
>
> ## You may understand values in the first column are the unique codes for
> those in the second column.
> ####So 'u' is only for usa. Replicate values appear the same rows for the
> first and second columns.
> ### Now I want to delete replicate rows with the same values in first
> (second) rows
> ####and sum up values in the third column for the same values.

> with( data[!duplicated(data$first), ], tapply(third, first, sum) )
b c e f j k u
2 6 3 8 5 4 1

Cannot quite figure out how you got your supposed correct result if you are removing duplicates. So maybe you are incorrectly stating the problem and want:

> with( data, tapply(third, first, sum) )
 b  c  e  f  j  k  u
 2 15 13  8  5  4  8

--
David

David Winsemius, MD
West Hartford, CT
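For readers who want a data frame rather than a named vector, here is a minimal base-R sketch along the lines of the answers above. It assumes the goal is one row per code with the third column summed. The name `totals` and the reordering step are illustrative and do not come from the thread.

```r
# Sample data from the original post
first  <- c('u','b','e','k','j','c','u','f','c','e')
second <- c('usa','Brazil','England','Korea','Japan','China','usa','France','China','England')
third  <- 1:10
data   <- data.frame(first, second, third)

# Sum 'third' within each (first, second) pair; duplicated rows collapse to one
totals <- aggregate(third ~ first + second, data = data, FUN = sum)

# Optional (illustrative): restore the order in which each code first appears
totals <- totals[match(unique(data$first), totals$first), ]
rownames(totals) <- NULL

print(totals)
```

Grouping on both first and second keeps the country name in the result. This relies on the stated assumption that each code maps to exactly one country. If that does not hold, the pairs would be split into separate rows.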