thr3ads.net - similar to: "how to create duplicated ID in multi-records per subject dataset"

Displaying 20 results from an estimated 4000 matches similar to: "how to create duplicated ID in multi-records per subject dataset"

Replace values in a vector

2009 Dec 03

Hi all, I have a vector like this: x <- c(0.7, 0.1, 0, 0.2, 0.2, 0, 0, 0, 0, 0.4, 0, 0.8, 1.8). I would like to replace each zero with the nearest preceding non-zero value. The returned vector should look like this: y <- c(0.7, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.4, 0.8, 1.8). How can I do this in R without using a for loop? Thank you
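One loop-free way to do this in base R, offered as a sketch rather than the answer given in the thread: record the position of every non-zero element, carry the largest position seen so far forward with cummax(), and index x with the result. The names pos and idx are illustrative. The sketch assumes the first element is non-zero; a leading zero would produce index 0 and silently shorten the result. The output has 13 elements, like x, whereas the y quoted in the post lists only 12.

```r
x <- c(0.7, 0.1, 0, 0.2, 0.2, 0, 0, 0, 0, 0.4, 0, 0.8, 1.8)

# Position of each non-zero element, 0 where x is zero
pos <- seq_along(x) * (x != 0)

# Carry the most recent non-zero position forward
idx <- cummax(pos)

# Look up the value stored at that position
y <- x[idx]
```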

last observation carried forward +1

2011 Sep 30

Hi R-helpers I'm looking for a vectorised function which does missing value replacement as in last observation carried forward in the zoo package but instead of a locf, I would like the locf function to add +1 to each time a missing value occurred. See below for an example. > require(zoo) > x <- 5:15 > x[4:7] <- NA > coredata(na.locf(zoo(x))) [1] 5 6 7 7 7 7 7 12 13

using newdata in survfit with categorical variable

2008 Nov 11

Hi R-helpers, I was trying to put gender='Male' in newdata to create an expected survival curve for a pseudo cohort by using survfit based on Cox regression. My code is shown below: fit <- coxph(Surv(end, status2)~gender, data=wlwsn1) summary(fit) coef exp(coef) se(coef) z p genderMale 0.204 1.23 0.0912 2.23 0.025

Choose between duplicated rows

2012 Apr 14

Dear r experts, Sorry for this basic question, but I can't seem to find a solution? I have this data frame: df <- data.frame(id = c("id1", "id1", "id1", "id2", "id2", "id2"), A = c(11905, 11907, 11907, 11829, 11829, 11829), v1 = c(NA, 3, NA,1,2,NA), v2 = c(NA,2,NA, 2, NA,NA), v3 = c(NA,1,NA,1,NA,NA), v4 = c("N",

all duplicated wanted

2012 Aug 03

Hi, Has anyone been able to figure out how to print all duplicated observations? I have a dataset, with patients ID, and other lab records. Some patients have multiple lab records, but 'duplicated' ID will only show me the duplicates, not the original observation. How can I print both the original one and the duplicates? Thanks
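A common base R idiom for this, shown as a sketch on made-up data: duplicated() flags only the second and later occurrences, so combining it with a second pass using fromLast = TRUE also flags the first occurrence. The data frame labs and its columns id and result are hypothetical stand-ins for the poster's dataset.

```r
# Hypothetical lab records; ids 102 and 104 occur more than once
labs <- data.frame(
  id     = c(101, 102, 102, 103, 104, 104, 104),
  result = c(5.1, 4.8, 5.0, 6.2, 3.9, 4.1, 4.0)
)

# TRUE for every row whose id appears more than once,
# including the first occurrence
dup <- duplicated(labs$id) | duplicated(labs$id, fromLast = TRUE)

all_dups <- labs[dup, ]
```

Here all_dups keeps both records for 102 and all three for 104, and drops 101 and 103.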

replace NA-values

2010 Jun 21

Dear list, I'm trying to replace NA-values with the preceding values in that column. This code works, but I am sure there is a more elegant way... df <- data.frame(id = c("A1", NA, NA, NA, "B1", NA, NA, "C1", NA, NA, NA, NA), value = c(1:12)) rn <- c(rownames(df[!is.na(df$id),]), nrow(df)+1) rn <-
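A shorter base R alternative, given as a sketch and not as a completion of the truncated code above: keep the non-missing ids, then use a running count of non-missing entries to pick which one applies to each row. The name filled is illustrative, and the sketch assumes the first id is not NA.

```r
df <- data.frame(
  id = c("A1", NA, NA, NA, "B1", NA, NA, "C1", NA, NA, NA, NA),
  value = 1:12,
  stringsAsFactors = FALSE
)

# TRUE where an id is present
filled <- !is.na(df$id)

# cumsum(filled) counts how many ids have been seen so far, which is
# the index of the most recent non-missing id
df$id <- df$id[filled][cumsum(filled)]
```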

For loop gets exponentially slower as dataset gets larger...

2006 Jan 03

I am running R 2.1.1 in a Microsoft Windows XP environment. I have a matrix with three vectors (“columns”) and ~2 million “rows”. The three vectors are date_, id, and price. The data is ordered (sorted) by code and date_. (The matrix contains daily prices for several thousand stocks, and has ~2 million “rows”. If a stock did not trade on a particular date, its price is set to “NA”)

cummax / cummin for complex numbers

2014 Jul 14

Dear all, in R 3.1.0, this is happening: > cummin(c(1+1i,2-3i,4+5i)) Error in cummin(c(1 + (0+1i), 2 - (0+3i), 4 + (0+5i))) : 'cummax' not defined for complex numbers > cummax(c(1+1i,2-3i,4+5i)) Error in cummax(c(1 + (0+1i), 2 - (0+3i), 4 + (0+5i))) : 'cummin' not defined for complex numbers It may be fixed in R-devel, but I thought I'd mention it to make sure

multiple paired t-tests without loops

2010 Apr 24

I am new to R and I suspect my problem is easily solved, but I haven't been able to figure it out without using loops. I am trying to implement Blair & Karniski's (1993) permutation test. I've included a sample data frame below. This data frame represents the conditional means (C1, C2) for 3 subjects in 2 consecutive samples of a continuous data set (e.g. ERP waveform).

The function cummax() seems to have a bug.

2015 May 17

Hi, The function cummax() seems to have a bug. > x <- c(NA, 0) > storage.mode(x) <- "integer" > cummax(x) [1] NA 0 The correct result of this case should be NA NA. The mistake in [https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L130-L136] may be the reason. Best Regards, Dongcan -- Dongcan Jiang Team of Search Engine & Web Mining School of Electronic

Removing rows that are duplicates but column values are in reversed order

2013 Apr 12

Hi, From your example data,
dat1 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
c e 12
", sep="", header=TRUE, stringsAsFactors=FALSE)
# it is easier to get the output you wanted
dat1[!duplicated(dat1$value),]
#   id1 id2 value
# 1   a   b    10
# 2   c   d    11
# 4   c   e    12
But, if you have cases like the one
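The reply is cut off here, so what follows is only a sketch of one way to handle the case named in the thread title, not the respondent's continuation. Instead of relying on the value column, it sorts the two ids within each row so that (a, b) and (b, a) produce the same key, then drops rows whose key has already been seen. The names key and dedup are illustrative.

```r
dat1 <- data.frame(
  id1 = c("a", "c", "b", "c"),
  id2 = c("b", "d", "a", "e"),
  value = c(10, 11, 10, 12),
  stringsAsFactors = FALSE
)

# Sort the two ids within each row; apply() returns one column per
# input row, so transpose back to one row per input row
key <- t(apply(dat1[, c("id1", "id2")], 1, sort))

# duplicated() on a matrix compares whole rows
dedup <- dat1[!duplicated(key), ]
```

On this data the third row (b, a) is dropped as a reversed copy of the first, leaving rows 1, 2 and 4.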

A more efficient way to roll values in an irregular time series dataset?

2010 Nov 08

Does anyone recommend a more efficient way to "roll" values in a time series dataset? I merged a bunch of different time series datasets (10's of thousands of them) whose observation dates and sampling interval differ. Some time series observations are reported at the beginning of the month, some at the end, some on Mondays, some on Wednesday, some annually, etc. In the

Seeking a more efficient way to find partition maxima

2008 Jan 07

Hi. Suppose I have a vector that I partition into disjoint, contiguous subvectors. For example, let v = c(1,4,2,6,7,5), partition it into three subvectors, v1 = v[1:3], v2 = v[4], v3 = v[5:6]. I want to find the maximum element of each subvector. In this example, max(v1) is 4, max(v2) is 6, max(v3) is 7. If I knew that the successive subvector maxima would never decrease, as in the example,
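As a baseline for the example in this post, the partition can be encoded as a vector of subvector lengths and the maxima taken per group with tapply(). This is a plain sketch that makes no claim about speed on large inputs, which is what the poster was asking about; the names lens and grp are illustrative.

```r
v <- c(1, 4, 2, 6, 7, 5)

# Lengths of the contiguous subvectors v[1:3], v[4], v[5:6]
lens <- c(3, 1, 2)

# Group label for each element of v: 1 1 1 2 3 3
grp <- rep(seq_along(lens), times = lens)

# Maximum within each group, returned as a plain vector
maxima <- as.vector(tapply(v, grp, max))
```

For this input maxima is 4, 6, 7, matching the values stated in the post.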

using "na.locf" from package zoo to fill NA gaps

2012 Jul 02

Hi everybody, I have a small question about the function "na.locf" from the package "zoo". I saw in the help that this function is able to fill NA gaps with the last value before the NA gap (or with the next value). But is it possible to fill my NA gaps according to the last AND the next value at the same time? Actually, I want R to fill my gaps with the method of

LOCF - Last Observation Carried Forward

2003 Nov 14

Hi! Is there a possibility in R to carry out LOCF (Last Observation Carried Forward) analysis or to create a new data frame (array, matrix) with LOCF? Or some helpful functions, packages? Karl

cumsum on chron objects

2005 May 17

Hi, Is there some alternative to cumsum for chron objects? I have data frames that contain some chron objects that look like this: DateTime 13/10/03 12:30:35 NA NA NA 15/10/03 16:30:05 NA NA ... and I've been trying to replace the NA's so that a date/time sequence is created starting with the preceding available value. Because the number of rows with NA's following each available

Creating regularly spaced time series from irregular one

2010 Feb 22

Hello, I have a series of intraday (high-frequency) price data in the form of POSIX timestamp followed by the value. I successfully loaded that into an "its" package object. I would like to create from it a regularly spaced time series of prices (for example 1min, 5min, etc. apart) so I could calculate returns. There is an interpolation function locf() that for timestamp with value NA uses last

Simple loop code

2010 Apr 29

Hi fellow R Users, I find that I typically rewrite my code for each specific column of data, which is by no means efficient, and I am struggling to break out of this bad habit and utilise some of the excellent things R can do! I have tried to look at 'for' but I don't really follow it, and I wondered if anyone could help with a simple example using my script so I could follow this and build

Zoo series to a date time stamp that is regular

2010 Jun 28

NOTE: I will provide data if necessary, but I didn't want to clutter everyone's mailbox. All: I have a time series with level and temperature data for 11 sites for each of three bases, so I will have to do this more than once. The time series are zoo objects with index values in chron format. The problem is that the date and times should be at even 15 min intervals,

Number of replications of a term

2006 Jan 24

Hello, Is there a simple and fast function that returns a vector of the number of replications for each object of a vector? For example: I have a vector of IDs: ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5"). I want the function to return the following vector where each term is the number of replicates for the
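The post is truncated before the expected output, so this sketch assumes the goal is one count per element of ids, aligned with the input. In base R, ave() applies a function within groups and returns a result of the same length as its first argument; the name counts is illustrative.

```r
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# For each element, the number of times its ID occurs in the vector
counts <- ave(seq_along(ids), ids, FUN = length)
```

If one count per distinct ID is wanted instead, table(ids) gives that directly.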
