As an R newbie, I have another problem: I have a big data frame where each row corresponds to one entry and each column is a field. For instance, I have the columns ID and time, and many more. I'd like to get a data frame in which each ID is included only once (some users with the same ID might have several entries, but I'd like to keep only one). When I use unique() I only get a list of the levels (the different IDs). Could someone help me out and tell me how to get a data frame with only one entry for each ID?

--
View this message in context: http://r.789695.n4.nabble.com/Delete-rows-with-duplicate-field-tp2123939p2123939.html
Sent from the R help mailing list archive at Nabble.com.
Did you try unique(x), where x is the data frame?

Lanna Jin
Try unique(dataset[, 1:a]), where a is the number of columns that you have. Using 1:a applies unique() to all columns.

Lanna Jin
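To illustrate what applying unique() to all columns does, here is a minimal sketch on a made-up data frame. The name x follows the earlier suggestion and the columns ID and time come from the question, but the values are invented for illustration. unique() on a data frame drops only rows that are identical in every column, so an ID whose rows differ in time still appears more than once.

```r
# Hypothetical example data: ID 1 appears three times,
# once as an exact duplicate of another row.
x <- data.frame(ID   = c(1, 1, 1, 2, 3),
                time = c(10, 10, 20, 30, 40))

# unique() on a data frame compares whole rows,
# so only the exact duplicate (ID 1, time 10) is dropped.
all_cols <- unique(x)
print(all_cols)

# ID 1 still occurs twice because its remaining rows differ in 'time'.
print(table(all_cols$ID))

# unique() on the ID column alone returns a vector of IDs, not a data frame.
ids <- unique(x$ID)
print(ids)
```

Neither call keeps exactly one complete row per ID, which appears to be what the question asks for.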
I don't want to apply unique() to all columns, just to the ID column.
names()

Lanna Jin
Could you please elaborate a little more on that?
Hi:
Here are three solutions; since this question comes up fairly often, you can find
other solutions in the R-help archives.
(1) Use functions from base R: split the data frame by ID, extract the first
record from each split and slurp them together with rbind():
> do.call(rbind, lapply(split(df, df$ID), head, 1))
ID x
1 1 -0.7736769
2 2 -0.7906979
3 3 0.3889229
4 4 -1.2277544
5 5 0.2820819
(2) Use the plyr package and function ddply():
> library(plyr)
> ddply(df, .(ID), head, 1)
ID x
1 1 -0.7736769
2 2 -0.7906979
3 3 0.3889229
4 4 -1.2277544
5 5 0.2820819
(3) A third solution using package doBy:
> library(doBy)
> df[firstobs(~ ID, data = df), ]
ID x
1 1 -0.7736769
2 2 -0.7906979
5 3 0.3889229
10 4 -1.2277544
12 5 0.2820819
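For anyone who wants to try the base R approach without the original data, here is a minimal sketch. The data frame below is hypothetical: the IDs are arranged so that the first row of each ID falls at positions 1, 2, 5, 10 and 12, and the x values are random, so they will not match the output shown above. The match() line is an assumption about how to get such positions in base R, not a description of what firstobs() does internally.

```r
# Hypothetical data: 12 rows, 5 IDs, with repeated IDs (x values are random).
set.seed(1)
df <- data.frame(ID = c(1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5),
                 x  = rnorm(12))

# Solution (1): split by ID, keep the first row of each piece, bind the pieces.
first_by_split <- do.call(rbind, lapply(split(df, df$ID), head, 1))
print(first_by_split)

# Positions of the first row for each ID in the original data frame.
first_pos <- match(unique(df$ID), df$ID)
print(first_pos)

# Indexing with those positions keeps the original row names.
first_by_pos <- df[first_pos, ]
print(first_by_pos)
```

Both results hold the same rows; they differ only in the row names attached to them.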
HTH,
Dennis
On Mon, May 3, 2010 at 6:04 AM, someone <vonhoffen@t-online.de> wrote:
>
> As an R newbie, I have another problem:
> I have a big data frame where each row corresponds to one entry and each
> column is a field.
> For instance, I have the columns ID and time, and many more.
> I'd like to get a data frame in which each ID is included only once (some
> users with the same ID might have several entries, but I'd like to keep
> only one).
> When I use unique() I only get a list of the levels (the different IDs).
> Could someone help me out and tell me how to get a data frame with only
> one entry for each ID?
How about

yourdata[!duplicated(yourdata$ID), ]

? See ?duplicated for more information.

HTH,
Jorge
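A minimal sketch of the duplicated() approach, assuming a small made-up data frame (the name yourdata and the columns ID and time follow the thread; the values are invented). It also shows two variations that go beyond the reply above: keeping the last entry per ID with fromLast = TRUE, and sorting by time first to keep the earliest entry.

```r
# Hypothetical data: ID and time come from the question, the values are invented.
yourdata <- data.frame(ID   = c("a", "b", "a", "c", "b", "a"),
                       time = c(5, 3, 1, 4, 2, 6))

# duplicated() flags every repeat of an ID after its first occurrence.
print(duplicated(yourdata$ID))

# Keep the first row seen for each ID.
first_rows <- yourdata[!duplicated(yourdata$ID), ]
print(first_rows)

# Keep the last row seen for each ID instead.
last_rows <- yourdata[!duplicated(yourdata$ID, fromLast = TRUE), ]
print(last_rows)

# Keep the row with the smallest time per ID:
# sort by time first, then drop the later repeats.
sorted   <- yourdata[order(yourdata$time), ]
earliest <- sorted[!duplicated(sorted$ID), ]
print(earliest)
```

Which row survives depends only on row order, so sorting the data frame beforehand is one way to control which entry is kept for each ID.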