thr3ads.net - similar to: "Select Original and Duplicates"

Displaying 20 results from an estimated 30000 matches similar to: "Select Original and Duplicates"

duplicated() variation that goes both ways to capture all duplicates

2012 Jul 23

Dear all, The trouble with the current duplicated() function in R is that it can report duplicates while searching fromFirst _or_ fromLast, but not both ways. Often users will want to identify and extract all the copies of the item that has duplicates, not only the duplicates themselves. To take the example from the man page:
> data(iris)
> iris[duplicated(iris), ] ##duplicates while
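One common way to get every copy, including the first occurrence, is to combine a forward and a backward scan. This is a minimal sketch using only the documented `fromLast` argument of `duplicated()`; the variable name `all_dups` is made up for illustration and is not taken from the thread.

```r
# Flag every row that has an identical copy elsewhere in the data frame,
# including the first occurrence, by OR-ing a forward and a backward scan.
data(iris)
all_dups <- duplicated(iris) | duplicated(iris, fromLast = TRUE)

# Rows flagged by the forward scan alone are a subset of these.
iris[all_dups, ]
```

The same expression works on a single column or a plain vector, since `duplicated()` accepts both.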

Keep rows in a dataset if one value in a column is duplicated

2012 Sep 27

Hi, I have a data set of observations by either one person or a pair of people. I want to only keep the pair observations, and was using the code below until it gave me the error " $ operator is invalid for atomic vectors". I am just beginning to learn R, so I apologize if the code is really rough. Basically I want to keep all the rows in the data set for which the value of

Identifying duplicate rows?

2012 Sep 10

Hi, I am trying to identify duplicate values in a column in a data frame. The duplicated function identifies the duplicate rows in the data frame but it only does this for the second record, not both records. Is there a way to mark both rows in the data frame as TRUE?
dfA$dups<-duplicated(dfA$Value)
dfA
Site State Value dups
929 VA 73 FALSE
929 VA 73 TRUE
930 VA 76 FALSE
930 VA 76 TRUE
931

Extracting last time value

2008 Jan 10

I have a dataframe as follows:
Date time value
20110620 11:18:00 7
20110620 11:39:00 9
20110621 11:41:00 8
20110621 11:40:00 6
20110622 14:05:00 8
20110622 14:06:00 6
For every date, I want to extract the row that has the greatest time. Therefore, ending up like:
20110620 11:39:00 9
20110621 11:41:00 8
20110622 14:07:00 6
I am using for loops (for every date, find largest time value) to do
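A loop-free sketch of the task described above: sort by date and time, then keep the last row of each date with `duplicated(..., fromLast = TRUE)`. It assumes the times are zero-padded HH:MM:SS strings, so that sorting them as text is chronological. The name `last_per_date` is illustrative. Note that the sample input has no 14:07:00 row, so for 20110622 this returns the 14:06:00 row.

```r
# Sample data copied from the question.
df <- read.table(text = "
Date time value
20110620 11:18:00 7
20110620 11:39:00 9
20110621 11:41:00 8
20110621 11:40:00 6
20110622 14:05:00 8
20110622 14:06:00 6", header = TRUE, stringsAsFactors = FALSE)

# Sort by date, then by time (zero-padded strings sort chronologically).
df <- df[order(df$Date, df$time), ]

# Scanning from the end, the first row seen for each date is its latest one.
last_per_date <- df[!duplicated(df$Date, fromLast = TRUE), ]
last_per_date
```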

Identify duplicate numbers and to increase a value

2011 Jan 20

Hi everybody. I want to identify duplicate numbers and to increase the value by 0.01 for each time that it is duplicated. Example:
x=c(1,2,3,5,6,2,8,9,2,2)
I want to do this:
1 2 + 0.01 3 5 6 2 + 0.02 8 9 2 + 0.03 2 + 0.04
I am trying to get something like this:
1 2.01 3 5 6 2.02 8 9 2.03 2.04
Actually I just know the way to identify the duplicated numbers rbind(x, duplicated(x) |
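The desired output above can be reproduced with `ave()`, which applies a function within groups of equal values. This is one possible sketch, not the answer given in the thread; `occurrence`, `group_size` and `result` are illustrative names.

```r
x <- c(1, 2, 3, 5, 6, 2, 8, 9, 2, 2)

# Running count of each value: 1 for its first appearance, 2 for its second, ...
occurrence <- ave(x, x, FUN = seq_along)

# How many times each value appears in total.
group_size <- ave(x, x, FUN = length)

# Only values that appear more than once get the 0.01-per-occurrence offset.
result <- ifelse(group_size > 1, x + 0.01 * occurrence, x)
result
```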

How to pick colums from a ragged array?

2012 Oct 23

I have a large dataset (~1 million rows) of three variables: ID (patient's name), DATE (of appointment) and DIAGNOSIS (given on that date). Patients may have been assigned more than one diagnosis at any one appointment - leading to two rows, same ID and DATE but different DIAGNOSIS. The diagnoses may change between appointments. I want to subset the data in two ways: - define groups

duplicate values

2008 Nov 16

Hei R Users, I have the following dataframe:
Datetime Temperature and many more columns
1 2008-6-1 00:00:00 5
2 2008-6-1 02:00:00 5
3 2008-6-1 03:00:00 6
4 2008-6-1 03:00:00 0
5 2008-6-1 04:00:00 6
6 2008-6-1 04:00:00 0
7 2008-6-1 05:00:00 7
8 2008-6-1 06:00:00

select rows with identical columns from a data frame

2013 Jan 18

I have a data frame with several columns. I want to select the rows with no NAs (as with complete.cases) and all columns identical. E.g., for
> f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40))
> f
   a  b  c
1  1  1  1
2 NA NA NA
3 NA  3  5
4  4 40 40
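A possible sketch for this selection, assuming all columns are numeric as in the example: compare every column against the first and combine the result with `complete.cases()`. The names `m`, `same` and `selected` are illustrative.

```r
f <- data.frame(a = c(1, NA, NA, 4), b = c(1, NA, 3, 40), c = c(1, NA, 5, 40))

# Work on a matrix so the comparison with the first column recycles column-wise.
m <- as.matrix(f)

# TRUE where every column equals the first; rows containing NA give NA here.
same <- rowSums(m == m[, 1]) == ncol(m)

# `%in% TRUE` turns NA into FALSE, so the row index never contains NA.
selected <- f[complete.cases(f) & (same %in% TRUE), ]
selected
```

On the example data only the first row passes both conditions.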

randomly select duplicated entries

2008 Jul 09

Using this data as an example
dat <- read.table(textConnection("Id myvar
12 1
12 2
12 6
34 9
34 4
34 8
65 15
65 23"), header = TRUE)
closeAllConnections()
how can I create another data set that does not have duplicate entries for 'Id', but the included values are randomly selected from the available ones? Thanks! Juliet
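One way to sketch this is to shuffle the rows and then keep the first row seen for each Id, which makes every row within an Id equally likely to be chosen. The `set.seed()` call is only there to make the draw reproducible, and `shuffled` and `picked` are illustrative names.

```r
dat <- read.table(text = "Id myvar
12 1
12 2
12 6
34 9
34 4
34 8
65 15
65 23", header = TRUE)

set.seed(1)  # assumption: fixed seed only so the example is repeatable

# Put the rows in random order, then drop all but the first row per Id.
shuffled <- dat[sample(nrow(dat)), ]
picked <- shuffled[!duplicated(shuffled$Id), ]

# Optional: restore Id order for readability.
picked <- picked[order(picked$Id), ]
picked
```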

read.table issue with "#"

2012 Mar 01

Hello,
> > The problem is that I get the following error because anything after the # is ignored.
> >
> > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 6 did not have 500 elements
> >
> > R thinks that line 6 has only 2 elements because of the #.
>
> Use 'readLines' instead, followed by 'strsplit'. In the
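Besides the `readLines`/`strsplit` route suggested in the reply, `read.table()` has a documented `comment.char` argument, and setting it to an empty string turns comment handling off. The sketch below uses a made-up two-line input (`txt`) rather than the poster's file, which is not shown in the excerpt.

```r
# Hypothetical input in which "#" is part of the data, not a comment marker.
txt <- "x#1 10 20\ny#2 30 40"

# Default behaviour: everything from "#" to the end of the line is dropped.
default_read <- read.table(text = txt)

# With comment handling disabled, all three fields per line are kept.
full_read <- read.table(text = txt, comment.char = "")

ncol(default_read)
ncol(full_read)
```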

please check this

2013 Jun 10

Hi, Try this:
which(duplicated(res10Percent))
# [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379
#[20] 413 415 417 441 459 461 477 479 505
res10PercentSub1<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==1) #most of the duplicated are dummy==1
res10PercentSub0<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==0)

Help with programming a tricky algorithm

2012 Oct 20

Hi All, I'm a little stumped by the following problem. I've got a dataset with the following structure:
idxy ix iy country (other variables)
1 1 1 c1 x1
2 1 2 c1 x2
3 1 3 c1 x3
. . . . .
3739 55 67 c7 x3739
3740 55 68 c7 x3740
where ix and

Duplicates and duplicated

2009 May 14

Hi everybody. I want to identify not only the duplicate number but also the original number that has been duplicated. Example:
x=c(1,2,3,4,4,5,6,7,8,9)
y=duplicated(x)
rbind(x,y)
gives:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    4    5    6    7    8     9
y    0    0    0    0    1    0    0    0    0     0
i.e. the second 4 [,5] is a duplicate. What I want is
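For a plain vector like this one, a short alternative to scanning in both directions is to test membership in the set of duplicated values. This is a sketch under the assumption that "original plus duplicates" means every element whose value occurs more than once.

```r
x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)

# x[duplicated(x)] holds the values that occur more than once;
# %in% then flags every element carrying one of those values.
y <- x %in% x[duplicated(x)]

rbind(x, y)
```

Here both 4s (positions 4 and 5) are flagged, not just the second one.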

How can I get the Ids with Duplicated key and corresponding Ids with original key?

2012 Aug 13

In this following example Id 4 is duplicated with Id 1. Like this I want both Ids (Duplicated and Duplicated with). Can anyone help? df <- data.frame( "Publication" = c(1, 2, 3, 1, 4, 5, 2, 3), "Reference" = c("a", "b", "c", "a", "d", "e", "b", "c"), "Id"= c(1, 2, 3, 4,

Matrix to data frame conversion

2012 Nov 12

I have a matrix which I wanted to convert to a data frame. I could not succeed and resorted to exporting it to csv and reimporting it. Why did I fail in the attempt, and how can I achieve what I wanted without this roundabout route? The original matrix:
> str(comb_model0)
 num [1:90, 1:4] 3.5938 0.0274 0.0342 0.0135 0.0207 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:90]

match and new columns

2017 Dec 13

Hi all, I have a data frame
tdat <- read.table(textConnection("A B C Y
A12 B03 C04 0.70
A23 B05 C06 0.05
A14 B06 C07 1.20
A25 A23 A12 3.51
A16 A25 A14 2,16"),header = TRUE)
I want to match tdat$B with tdat$A and populate the column values of tdat$A (col A and col B) in the newly created columns (col D and col E). Please find my attempt and the desired output below. Desired output

match and new columns

2017 Dec 13

Thank you Rui, I did not get the desired result. Here is the output from your script
  A   B   C   Y    D E
1 A12 B03 C04 0.70 0 0
2 A23 B05 C06 0.05 0 0
3 A14 B06 C07 1.20 0 0
4 A25 A23 A12 3.51 1 1
5 A16 A25 A14 2,16 4 4
On Wed, Dec 13, 2017 at 4:36 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Here is one way.
>
> tdat$D <- ifelse(tdat$B %in% tdat$A,

dput()

2020 Feb 29

I think Robin knows about FAQ 7.31/floating point (author of 'Brobdingnag', among other numerical packages). I agree that this is surprising (to me). To reframe this question: is there a way to get an *exact* ASCII representation of a numeric value (i.e., guaranteeing the restored value is identical() to the original)? .deparseOpts has "digits17": Real and finite complex

loop command to matrix

2012 Nov 27

Dear UseRs, Extremely sorry for a basic question. I have a matrix of 19 rows and 365 columns. What I want to do is the following... First I want to leave out column number 1 and calculate the row-wise mean of the remaining columns, which will obviously give me 365 values in one column, and then subtract these values from the column I left out, i.e. col=1. Then I want to leave out column 2
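The question is cut off, but the part that is visible (for each column, subtract the row-wise mean of all the other columns) can be sketched without a loop. The matrix `m` below is random stand-in data with the stated dimensions, and `loo_mean` and `result` are illustrative names. A row-wise mean has one value per row, so each column of the result holds 19 values.

```r
set.seed(42)  # assumption: random stand-in data, seeded for repeatability
m <- matrix(rnorm(19 * 365), nrow = 19, ncol = 365)

# Leave-one-out row means: entry [i, j] is the mean of row i without column j.
# rowSums(m) has one value per row and recycles down each column of m.
loo_mean <- (rowSums(m) - m) / (ncol(m) - 1)

# Column j minus the row-wise mean of the remaining columns, for every j at once.
result <- m - loo_mean

# Spot check against the explicit calculation for the first column.
all.equal(result[, 1], m[, 1] - rowMeans(m[, -1]))
```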

match and new columns

2017 Dec 14

Hi Bill, I put stringsAsFactors = FALSE but it still did not work.
tdat <- read.table(textConnection("A B C Y
A12 B03 C04 0.70
A23 B05 C06 0.05
A14 B06 C07 1.20
A25 A23 A12 3.51
A16 A25 A14 2,16"),header = TRUE ,stringsAsFactors = FALSE)
tdat$D <- 0
tdat$E <- 0
tdat$D <- (ifelse(tdat$B %in% tdat$A, tdat$A[tdat$B], 0))
tdat$E <- (ifelse(tdat$B %in% tdat$A, tdat$A[tdat$C], 0))
