similar to: Select Original and Duplicates

Displaying 20 results from an estimated 30000 matches similar to: "Select Original and Duplicates"

2012 Jul 23
1
duplicated() variation that goes both ways to capture all duplicates
Dear all, The trouble with the current duplicated() function in R is that it can report duplicates while searching fromFirst _or_ fromLast, but not both ways. Often users will want to identify and extract all the copies of the item that has duplicates, not only the duplicates themselves. To take the example from the man page: > data(iris) > iris[duplicated(iris), ] ##duplicates while
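The usual base-R way to get both directions is to combine the two passes of duplicated(); a minimal sketch on the same iris example from the man page:

data(iris)
# TRUE for every row that has at least one identical row anywhere in the data frame
dup_all <- duplicated(iris) | duplicated(iris, fromLast = TRUE)
iris[dup_all, ]   # all copies, not just the later ones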
2012 Sep 27
3
Keep rows in a dataset if one value in a column is duplicated
Hi, I have a data set of observations by either one person or a pair of people. I want to keep only the pair observations, and was using the code below until it gave me the error "$ operator is invalid for atomic vectors". I am just beginning to learn R, so I apologize if the code is really rough. Basically I want to keep all the rows in the data set for which the value of
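A base-R sketch of one way to keep only the groups that occur more than once; the column name id and the toy data are made up, since the preview cuts off before the real names:

obs <- data.frame(id = c(1, 2, 2, 3, 4, 4), value = c(10, 11, 12, 13, 14, 15))
n_per_id <- ave(seq_along(obs$id), obs$id, FUN = length)   # how many rows each id has
pairs_only <- obs[n_per_id > 1, ]                          # keep ids that appear more than once
pairs_only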
2012 Sep 10
4
Identifying duplicate rows?
Hi, I am trying to identify duplicate values in a column in a data frame. The duplicated function identifies the duplicate rows in the data frame but it only does this for the second record, not both records. Is there a way to mark both rows in the data frame as TRUE? dfA$dups<-duplicated(dfA$Value) dfA Site State Value dups 929 VA 73 FALSE 929 VA 73 TRUE 930 VA 76 FALSE 930 VA 76 TRUE 931
2008 Jan 10
5
Extracting last time value
I have a dataframe as follows: Date time value 20110620 11:18:00 7 20110620 11:39:00 9 20110621 11:41:00 8 20110621 11:40:00 6 20110622 14:05:00 8 20110622 14:06:00 6 For every date, I want to extract the row that has the greatest time. Therefore, ending up like: 20110620 11:39:00 9 20110621 11:41:00 8 20110622 14:06:00 6 I am using for loops (for every date, find largest time value) to do
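A loop-free sketch, assuming the columns are named Date, time and value as shown and that time is an "HH:MM:SS" character string (which sorts correctly as text):

d <- data.frame(Date  = c(20110620, 20110620, 20110621, 20110621, 20110622, 20110622),
                time  = c("11:18:00", "11:39:00", "11:41:00", "11:40:00", "14:05:00", "14:06:00"),
                value = c(7, 9, 8, 6, 8, 6))
# within each Date, keep the row whose time sorts last
latest <- do.call(rbind, lapply(split(d, d$Date),
                                function(g) g[order(g$time, decreasing = TRUE)[1], ]))
latest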
2011 Jan 20
6
Identify duplicate numbers and to increase a value
Hi everybody. I want to identify duplicate numbers and increase each one by 0.01 for each time it is duplicated. Example: x=c(1,2,3,5,6,2,8,9,2,2) I want to do this: 1 2 + 0.01 3 5 6 2 + 0.02 8 9 2 + 0.03 2 + 0.04 I am trying to get something like this: 1 2.01 3 5 6 2.02 8 9 2.03 2.04 So far I only know how to identify the duplicated numbers rbind(x, duplicated(x) |
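One sketch of the increment idea with ave(): number the occurrences of each value, then add the increment only where the value occurs more than once:

x <- c(1, 2, 3, 5, 6, 2, 8, 9, 2, 2)
occ     <- ave(x, x, FUN = seq_along)    # 1st, 2nd, 3rd ... occurrence of each value
dup_any <- ave(x, x, FUN = length) > 1   # TRUE for values that occur more than once
x + ifelse(dup_any, occ * 0.01, 0)
# 1.00 2.01 3.00 5.00 6.00 2.02 8.00 9.00 2.03 2.04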
2012 Oct 23
10
How to pick columns from a ragged array?
I have a large dataset (~1 million rows) of three variables: ID (patient's name), DATE (of appointment) and DIAGNOSIS (given on that date). Patients may have been assigned more than one diagnosis at any one appointment - leading to two rows, same ID and DATE but different DIAGNOSIS. The diagnoses may change between appointments. I want to subset the data in two ways: - define groups
2008 Nov 16
4
duplicate values
Hi R users, I have the following dataframe: Datetime Temperature and many more columns 1 2008-6-1 00:00:00 5 2 2008-6-1 02:00:00 5 3 2008-6-1 03:00:00 6 4 2008-6-1 03:00:00 0 5 2008-6-1 04:00:00 6 6 2008-6-1 04:00:00 0 7 2008-6-1 05:00:00 7 8 2008-6-1 06:00:00
2013 Jan 18
5
select rows with identical columns from a data frame
I have a data frame with several columns. I want to select the rows with no NAs (as with complete.cases) and all columns identical. E.g., for --8<---------------cut here---------------start------------->8--- > f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40)) > f a b c 1 1 1 1 2 NA NA NA 3 NA 3 5 4 4 40 40 --8<---------------cut
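A sketch combining complete.cases() with a per-row uniqueness test:

f <- data.frame(a = c(1, NA, NA, 4), b = c(1, NA, 3, 40), c = c(1, NA, 5, 40))
same <- apply(f, 1, function(r) length(unique(r)) == 1)  # all columns carry the same value
f[complete.cases(f) & same, ]                            # row 1 only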
2008 Jul 09
3
randomly select duplicated entries
Using this data as an example dat <- read.table(textConnection("Id myvar 12 1 12 2 12 6 34 9 34 4 34 8 65 15 65 23"), header = TRUE) closeAllConnections() how can I create another data set that does not have duplicate entries for 'Id', where the included values are randomly selected from the available ones? Thanks! Juliet
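One base-R sketch: split by Id and sample a single row from each piece (dat as defined in the post):

set.seed(1)   # only to make the example reproducible
picked <- do.call(rbind, lapply(split(dat, dat$Id),
                                function(g) g[sample(nrow(g), 1L), ]))
picked        # one randomly chosen row per Id, no duplicate Ids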
2012 Mar 01
2
read.table issue with "#"
Hello, > > The problem is that I get the following error because anything after the > # is ignored. > > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, > : > line 6 did not have 500 elements > > R thinks that line 6 has only 2 elements because of the #. > Use 'readLines' instead, followed by 'strsplit'. In the
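readLines() followed by strsplit() works; another option is to tell read.table() not to treat '#' as a comment character at all, since its comment.char argument defaults to "#". The file name below is only a placeholder:

# read '#' as ordinary data rather than as the start of a comment
dat <- read.table("myfile.txt", header = TRUE, comment.char = "")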
2013 Jun 10
2
please check this
Hi, Try this: which(duplicated(res10Percent)) # [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379 #[20] 413 415 417 441 459 461 477 479 505 res10PercentSub1<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==1) #most of the duplicated are dummy==1 res10PercentSub0<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==0)
2012 Oct 20
2
Help with programming a tricky algorithm
Hi All, I'm a little stumped by the following problem. I've got a dataset with the following structure: idxy ix iy country (other variables) 1 1 1 c1 x1 2 1 2 c1 x2 3 1 3 c1 x3 . . . . . 3739 55 67 c7 x3739 3740 55 68 c7 x3740 where ix and
2009 May 14
4
Duplicates and duplicated
Hi everybody. I want to identify not only the duplicate number but also the original number that has been duplicated. Example: x=c(1,2,3,4,4,5,6,7,8,9) y=duplicated(x) rbind(x,y) gives: [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] x 1 2 3 4 4 5 6 7 8 9 y 0 0 0 0 1 0 0 0 0 0 i.e. the second 4 [,5] is a duplicate. What I want is
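A sketch that flags the first copy of a value that recurs separately from its later copies:

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
later_copy   <- duplicated(x)                                  # 2nd, 3rd, ... occurrence
first_of_dup <- duplicated(x, fromLast = TRUE) & !later_copy   # the original that gets copied
rbind(x, original = first_of_dup, duplicate = later_copy)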
2012 Aug 13
5
How can I get the Ids with Duplicated key and corresponding Ids with original key?
In the following example Id 4 is a duplicate of Id 1. I want both Ids like this (the duplicate and the Id it duplicates). Can anyone help? df <- data.frame( "Publication" = c(1, 2, 3, 1, 4, 5, 2, 3), "Reference" = c("a", "b", "c", "a", "d", "e", "b", "c"), "Id"= c(1, 2, 3, 4,
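A match()-based sketch; the Id vector is cut off in the preview, so it is assumed here to run 1 to 8:

df <- data.frame(Publication = c(1, 2, 3, 1, 4, 5, 2, 3),
                 Reference   = c("a", "b", "c", "a", "d", "e", "b", "c"),
                 Id          = 1:8)
key   <- paste(df$Publication, df$Reference)  # the combination that defines a duplicate
first <- match(key, key)                      # row of the first occurrence of each key
dups  <- which(first != seq_along(key))       # rows that repeat an earlier key
data.frame(DupId = df$Id[dups], OrigId = df$Id[first[dups]])
#   DupId OrigId
# 1     4      1
# 2     7      2
# 3     8      3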
2012 Nov 12
5
Matrix to data frame conversion
I have a matrix which I wanted to convert to a data frame. As I could not succeed, I resorted to exporting it to csv and reimporting it. Why did I fail in the attempt, and how can I achieve what I wanted without this roundabout? The original matrix: > str(comb_model0) num [1:90, 1:4] 3.5938 0.0274 0.0342 0.0135 0.0207 ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:90]
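as.data.frame() converts a matrix directly and keeps the dimnames, so the csv round trip should not be needed; a minimal sketch assuming comb_model0 is the matrix shown in the str() output:

comb_model0_df <- as.data.frame(comb_model0)  # 90 rows, 4 columns, dimnames become row/column names
str(comb_model0_df)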
2017 Dec 13
3
match and new columns
Hi all, I have a data frame tdat <- read.table(textConnection("A B C Y A12 B03 C04 0.70 A23 B05 C06 0.05 A14 B06 C07 1.20 A25 A23 A12 3.51 A16 A25 A14 2,16"),header = TRUE) I want to match tdat$B with tdat$A and populate the column values of tdat$A (col A and col B) in the newly created columns (col D and col E). Please find my attempt and the desired output below. Desired output
2017 Dec 13
2
match and new columns
Thank you Rui, I did not get the desired result. Here is the output from your script A B C Y D E 1 A12 B03 C04 0.70 0 0 2 A23 B05 C06 0.05 0 0 3 A14 B06 C07 1.20 0 0 4 A25 A23 A12 3.51 1 1 5 A16 A25 A14 2,16 4 4 On Wed, Dec 13, 2017 at 4:36 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote: > Hello, > > Here is one way. > > tdat$D <- ifelse(tdat$B %in% tdat$A,
2020 Feb 29
3
dput()
I think Robin knows about FAQ 7.31/floating point (author of 'Brobdingnag', among other numerical packages). I agree that this is surprising (to me). To reframe this question: is there a way to get an *exact* ASCII representation of a numeric value (i.e., guaranteeing the restored value is identical() to the original)? .deparseOpts has "digits17": Real and finite complex
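A sketch of the two relevant deparse()/dput() control options: "digits17" prints 17 significant digits, which is enough to round-trip a double, and "hexNumeric" writes the exact binary representation:

x <- 0.1 + 0.2
s17  <- deparse(x, control = "digits17")    # e.g. "0.30000000000000004"
identical(eval(parse(text = s17)), x)       # TRUE
shex <- deparse(x, control = "hexNumeric")  # exact hexadecimal floating-point literal
identical(eval(parse(text = shex)), x)      # TRUE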
2012 Nov 27
3
loop command to matrix
Dear useRs, Extremely sorry for a basic question. I have a matrix of 19 rows and 365 columns. What I want to do is the following: first I want to leave out column number 1 and calculate the row-wise mean of the remaining columns, which will give me 19 values in one column, and then subtract these values from the column I left out, i.e. col=1. Then I want to leave out column 2
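A loop-free sketch of the leave-one-column-out idea, using a made-up matrix m of the stated size:

m <- matrix(rnorm(19 * 365), nrow = 19, ncol = 365)
# for each column j: column j minus the row-wise mean of all the other columns
res <- sapply(seq_len(ncol(m)), function(j) m[, j] - rowMeans(m[, -j]))
dim(res)   # 19 x 365, one leave-one-out comparison per original column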
2017 Dec 14
1
match and new columns
Hi Bill, I put stringsAsFactors = FALSE but it still did not work. tdat <- read.table(textConnection("A B C Y A12 B03 C04 0.70 A23 B05 C06 0.05 A14 B06 C07 1.20 A25 A23 A12 3.51 A16 A25 A14 2,16"),header = TRUE ,stringsAsFactors = FALSE) tdat$D <- 0 tdat$E <- 0 tdat$D <- (ifelse(tdat$B %in% tdat$A, tdat$A[tdat$B], 0)) tdat$E <- (ifelse(tdat$B %in% tdat$A, tdat$A[tdat$C], 0))
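The failing step is that tdat$A[tdat$B] uses the character values of B as subscripts rather than as values to look up; match() returns the matching positions instead. The desired output is cut off in the preview, so exactly which columns to copy from the matched row is an assumption here (tdat as read in above):

idx <- match(tdat$B, tdat$A)                   # row in which each B value appears in column A, NA if none
tdat$D <- ifelse(is.na(idx), 0, tdat$A[idx])   # value of A in the matched row
tdat$E <- ifelse(is.na(idx), 0, tdat$B[idx])   # value of B in the matched row
tdat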