As an R noob I am having another problem: I have a big data frame where each row corresponds to one entry and each column is a field. For instance, I have the columns ID and time and many more. I'd like to get a data frame where each ID is included only once (some users with that ID might have several entries, but I'd like to keep only one). When I use unique() I only get a list of the levels (the different IDs). Could someone help me out and tell me how to get the data frame with only one entry for each ID?

--
View this message in context: http://r.789695.n4.nabble.com/Delete-rows-with-duplicate-field-tp2123939p2123939.html
Sent from the R help mailing list archive at Nabble.com.
Did you try unique(x), where x is the data frame?

Lanna Jin
Try unique(dataset[, 1:a]), where a is the number of columns that you have. Indexing with 1:a applies unique() to all columns.

Lanna Jin
I don't want to apply unique() to all columns, just to the ID column.
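To make the distinction in this exchange concrete, here is a minimal sketch with made-up data (the data frame `x` and its values are assumptions for illustration, not the poster's data). It shows that unique() on a whole data frame drops only rows that are identical in every column, while unique() on the ID column returns a plain vector of IDs rather than a data frame.

```r
# Hypothetical example data: ID 1 appears three times, ID 2 twice.
x <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 3),
  time = c(10, 10, 25, 5, 7, 12)
)

# unique() on the whole data frame removes only rows that are
# identical in every column (here, the repeated row ID = 1, time = 10).
all_cols <- unique(x)
nrow(all_cols)   # 5: IDs 1 and 2 still occur more than once

# unique() on the ID column alone returns a vector of distinct IDs,
# not a data frame, so the other columns are lost.
ids <- unique(x$ID)
is.data.frame(ids)   # FALSE
```

Neither call gives one full row per ID, which is why the replies further down index the rows instead.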
names()

Lanna Jin
Could you please elaborate a little more on that?
Hi:

Here are three solutions; since this question comes up fairly often, you can find other solutions in the R-help archives.

(1) Use functions from base R: split the data frame by ID, extract the first record from each split and slurp them together with rbind():

> do.call(rbind, lapply(split(df, df$ID), head, 1))
  ID          x
1  1 -0.7736769
2  2 -0.7906979
3  3  0.3889229
4  4 -1.2277544
5  5  0.2820819

(2) Use the plyr package and function ddply():

> library(plyr)
> ddply(df, .(ID), head, 1)
  ID          x
1  1 -0.7736769
2  2 -0.7906979
3  3  0.3889229
4  4 -1.2277544
5  5  0.2820819

(3) A third solution using package doBy:

> library(doBy)
> df[firstobs(~ ID, data = df), ]
   ID          x
1   1 -0.7736769
2   2 -0.7906979
5   3  0.3889229
10  4 -1.2277544
12  5  0.2820819

HTH,
Dennis

On Mon, May 3, 2010 at 6:04 AM, someone wrote:
> as a r noob i am having another problem:
> i have a big dataframe where each row corresponds to one entry and each
> column is a field...
> [...]
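The data frame `df` used in the three solutions is not shown in the message. The following is a sketch of the base R approach (1) on made-up data; the ID layout is an assumption chosen so that the first row of each ID falls at positions 1, 2, 5, 10 and 12, matching the row names in the doBy output, and the `x` values are random rather than the ones printed above.

```r
# Hypothetical data: 12 rows, 5 IDs, some repeated (values are made up).
set.seed(1)
df <- data.frame(
  ID = c(1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5),
  x  = rnorm(12)
)

# split() cuts df into one data frame per ID, head(., 1) keeps the
# first row of each piece, and rbind() stacks the pieces back together.
first_per_id <- do.call(rbind, lapply(split(df, df$ID), head, 1))

# One row per ID; the row names are taken from the ID groups.
first_per_id$ID
rownames(first_per_id)
```

Because split() groups by the sorted ID values, the result comes back ordered by ID rather than in the original row order.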
How about

yourdata[!duplicated(yourdata$ID), ]

? See ?duplicated for more information.

HTH,
Jorge
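A short sketch of how the duplicated() approach behaves, using made-up data (the `ID` and `time` values are assumptions for illustration). It also shows two variations that may be useful when "which entry to keep" matters: the `fromLast` argument of duplicated(), and sorting before dropping duplicates.

```r
# Hypothetical data: IDs "a" and "b" each appear twice.
yourdata <- data.frame(
  ID   = c("a", "b", "a", "c", "b"),
  time = c(30, 10, 20, 40, 5)
)

# duplicated() flags every occurrence of an ID after its first one.
duplicated(yourdata$ID)   # FALSE FALSE TRUE FALSE TRUE

# Keep the first row seen for each ID (original rows 1, 2, 4).
first_rows <- yourdata[!duplicated(yourdata$ID), ]

# Keep the last row seen for each ID instead (original rows 3, 4, 5).
last_rows <- yourdata[!duplicated(yourdata$ID, fromLast = TRUE), ]

# Keep the row with the smallest time per ID:
# sort by ID and time first, then drop the later duplicates.
sorted   <- yourdata[order(yourdata$ID, yourdata$time), ]
earliest <- sorted[!duplicated(sorted$ID), ]
```

All three results keep every column and the original row names, so it is easy to trace which entries were retained.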