Tuatara
2010-Jan-20 02:47 UTC
[R] Deleting rows based on duplicate entries in one columns in a data matrix
Hi everybody,

I would like to delete rows based on duplicate entries in column 3 in the data matrix X (size 60000 x 57). I have tried the unique(x) command as

> data <- X[unique(X[,3]),]

however, for some reason the command introduces a lot of NA's into the dataset. So, now I'm looking for a function that eliminates rows, if they have duplicate values in column 3.

Does anyone have an idea how to do this?

Cheers and thanks a lot for the help,
Fran
--
View this message in context: http://n4.nabble.com/Deleting-rows-based-on-duplicate-entries-in-one-columns-in-a-data-matrix-tp1018110p1018110.html
Sent from the R help mailing list archive at Nabble.com.
Henrique Dallazuanna
2010-Jan-20 03:09 UTC
[R] Deleting rows based on duplicate entries in one columns in a data matrix
Use duplicated instead of unique.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
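A likely explanation for the NAs, sketched on made-up data: unique() returns the distinct values of column 3, not row positions, so using its result as a row index picks rows by number, and any value larger than the number of rows yields an all-NA row when X is a data frame. duplicated() instead returns one TRUE/FALSE per row, which can be negated and used as a row mask. The toy data frame and the names bad and good below are assumptions for illustration only; the original X was not posted.

```r
# Hypothetical toy data standing in for X; column 3 ("id") has repeated values
X <- data.frame(a = 1:5, b = letters[1:5], id = c(10, 2, 10, 2, 7))

# unique() gives the distinct VALUES of column 3 (here 10, 2, 7)
vals <- unique(X[, 3])

# duplicated() gives one logical per row, TRUE for repeats of an earlier value
mask <- duplicated(X[, 3])

# Values misused as row numbers: rows 10 and 7 do not exist in a 5-row
# data frame, so those positions come back as rows filled with NA
bad <- X[vals, ]

# Negated logical mask: keeps the first row seen for each value in column 3
good <- X[!mask, ]

print(bad)
print(good)
```

With a plain matrix rather than a data frame, an out-of-range row number raises a "subscript out of bounds" error instead of producing NA rows, so the NAs reported in the question suggest that X was stored as a data frame.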
Gabor Grothendieck
2010-Jan-20 03:10 UTC
[R] Deleting rows based on duplicate entries in one columns in a data matrix
See: ?duplicated
Tuatara
2010-Jan-20 03:23 UTC
[R] Deleting rows based on duplicate entries in one columns in a data matrix
Thanks for the swift replies. I have found this to work for my purpose:

data <- subset(X, !duplicated(X[,3]))
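For comparison, a small sketch of the subset() approach next to plain logical indexing, using a made-up 4 x 3 matrix rather than the original data. Both forms keep the first row for each distinct value in column 3; the fromLast argument of duplicated() keeps the last row instead. The variable names are hypothetical.

```r
# Hypothetical 4 x 3 matrix (filled column by column); column 3 is 9, 9, 4, 9
X <- matrix(c(1, 2, 3, 4,
              5, 6, 7, 8,
              9, 9, 4, 9), ncol = 3)

# subset() on a matrix accepts a logical vector selecting rows
via_subset <- subset(X, !duplicated(X[, 3]))

# Equivalent direct indexing; drop = FALSE keeps the result a matrix
# even if only one row survives
via_index <- X[!duplicated(X[, 3]), , drop = FALSE]

# Keep the LAST occurrence of each value in column 3 instead of the first
keep_last <- X[!duplicated(X[, 3], fromLast = TRUE), , drop = FALSE]

print(via_subset)
print(keep_last)
```

Note that only column 3 decides which rows are dropped; rows that share a value there are removed even when their other columns differ.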