thr3ads.net - similar to: "performance issue with merge"

Displaying 20 results from an estimated 3000 matches similar to: "performance issue with merge"

2009 Dec 10

problem with data processing in R

Hi, I'm stuck parsing data into R for a heatmap representation. The data looks like: 1 id1 x1 x2 x3 .... x20 2 id1 x1 x2 x3 .... x20 3 id1 x1 x2 x3 .... x20 4 id1 x1 x2 x3 .... x20 ......... 348 id2 x1 x2 x3 .... x20 349 id2 x1 x2 x3 .... x20 350 id2 x1 x2 x3 .... x20 351 id2 x1 x2 x3 .... x20 ......... The data is sorted by ID (id1, id2, ..., id40) and I like to

2008 Sep 04

how to match or merge data.frames in this case...

Hi, I'm trying to match two data frames in order to replace the boundary values in a dataframe.1 with values in dataframe.2 ONLY where the pair id1 id2 matches between the two data frames. Eg. > dataframe.1 ... id1 id2 boundary 3307 1095 1108 438.691 3308 1095 1109 438.691 3309 1095 1121 438.691 3310 1096 1109 438.691 ... 3345 1108 1120 438.691 3346 1108 1121 438.691 3347 1108

2006 Mar 05

Wishlist: merge and subset to keep attributes (PR#8658)

Full_Name: Ulrike Grömping Version: 2.2.1 OS: Windows Submission from: (NULL) (84.190.139.94) When importing data from SPSS, it is a nice feature of the package foreign that it allows (option use.value.labels=F) to work with the original SPSS codes while keeping the value labels as information in an attribute. Unfortunately, after merging or subsetting, these attributes disappear. The code

2010 Nov 08

Fuzzy merge using timestamps

Greetings Supreme Council of R Masters, Like a toddler, I have gotten my head stuck in the banisters of R ... again. Let it be known I am still a neophyte in the R-community forum world, so please don't flame me too badly. I have two sets of data, each with a set of timestamps. I would like to somehow merge the datasets based on the timestamps and an individual identifier. That is there are

2008 Apr 09

fuzzy merge

Hi, I would like to merge two data frames. It is just that I want the merging to be done with some kind of a fuzzy criterion. Let me explain. My first data frame looks like this : ID1 time1 dt 1 2008-01-02 13:11 10 2 2008-01-02 14:20 20 3

2014 Jan 16

Doubt in simple merge

Dear R community, I have two data sets called "Elder" and "Younger". This is my code for a simple merge. Elder <- data.frame( ID=c("ID1","ID2","ID3"), age=c(38,35,31)) Younger <- data.frame( ID=c("ID4","ID5","ID3"), age=c(29,21,31)) mer <- merge(Elder,Younger,by="ID", all=T) Output I am
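The snippet above is cut off before the actual question, so the following is only a sketch of what the quoted call does, not a reconstruction of the poster's problem. It reuses the two data frames from the post; the names by_id and by_both are invented here for illustration. Merging on ID alone with all = TRUE keeps every ID from both inputs and, because both inputs also have an age column, returns it twice as age.x and age.y. Merging on both columns is one possible way to get a single age column.

```r
# Data frames as quoted in the post above.
Elder   <- data.frame(ID = c("ID1", "ID2", "ID3"), age = c(38, 35, 31))
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"), age = c(29, 21, 31))

# Full outer join on ID only: the shared 'age' column is suffixed
# as age.x (from Elder) and age.y (from Younger).
by_id <- merge(Elder, Younger, by = "ID", all = TRUE)

# Illustrative alternative (not from the post): join on both columns,
# which keeps a single 'age' column.
by_both <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)

print(by_id)
print(by_both)
```

Whether the second form is appropriate depends on whether age is meant to be part of the key; if the same ID could carry different ages in the two inputs, it would produce two rows for that ID.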

2006 Jan 25

lazy evaluation (was RE: Number of replications of a term)

From: Thomas Lumley
>
> On Wed, 25 Jan 2006, Ray Brownrigg wrote:
>
> > There's an even faster one, which nobody seems to have mentioned yet:
> >
> > rep(l <- rle(ids)$lengths, l)
>
> I considered this but it wasn't clear to me from the initial post that each ID occupied a contiguous section of the vector.
>
> Also, lazy
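The caveat in the reply above, that the rle() one-liner assumes each ID occupies a contiguous run, can be seen in a small sketch. The vectors ids_sorted and ids_mixed and the helper names run_counts and total_counts are made up for illustration; the table()-based version is just one ordering-independent alternative and is not taken from the thread.

```r
# Hypothetical ID vectors, invented for this example.
ids_sorted <- c("a", "a", "a", "b", "c", "c")  # each ID is contiguous
ids_mixed  <- c("a", "b", "a", "a", "c", "c")  # "a" is split into two runs

# The quoted approach: repeat each run length by itself.
run_counts <- function(ids) {
  l <- rle(ids)$lengths
  rep(l, l)
}

# Ordering-independent alternative: look up each ID's total count.
# Assumes ids is a character vector.
total_counts <- function(ids) {
  as.vector(table(ids)[ids])
}

run_counts(ids_sorted)    # run lengths equal total counts here
total_counts(ids_sorted)
run_counts(ids_mixed)     # counts runs, not totals, once IDs are interleaved
total_counts(ids_mixed)
```

On the sorted vector the two functions agree; on the interleaved one they differ, which is the situation the reply was worried about.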

2007 Dec 08

help for segmented package

Hi, I am trying to find m breakpoints of a linear regression model. I used the segmented package. It works fine for a small number of predictors and breakpoints (3 r.v., 3 points). However, my model has 14 variables, and it does not work even for just one breakpoint. The error message is always that the estimated breakpoints are out of range. Since my problem is a time-related problem. So I

2010 Feb 17

extract the data that match

Hi r-users, I would like to extract the data that match. Attached is my data: I'm interested in matching the value in column 'intg' with the value in column 'rand_no' > cbind(z=z,intg=dd,rand_no = rr) z intg rand_no [1,] 0.00 0.000 0.001 [2,] 0.01 0.000 0.002 [3,] 0.02 0.000 0.002 [4,] 0.03 0.000 0.003 [5,] 0.04 0.000 0.003 [6,]

2007 Nov 02

applying duplicated, unique and match to lists?

Dear R developers, While improving duplicated.array() and friends and developing equivalents for the new ff package for large datasets I came across two questions: 1) is it safe to use duplicated.default(), unique.default() and match() on arbitrary lists? If so, we can speed up duplicated.array and friends considerably by using list() instead of paste(collapse="\r") 2) while

2009 Jul 31

merging two data frames with columns of different length

Dear all, I am trying to merge two data frames based on a common column, but for this common column the two data frames do not have the same length and associated information. I checked previous examples on the list but was not able to apply them in my case... Does anyone know how to do that? Below is my code with the expected result: # data frame 1 Id1 <- c(1,1,1,2,2,2,3,3,3) Habit1 <-

2003 Nov 17

Generalized linear model

Hi all! I am fitting a Poisson model, using the following command: > fit2<-glm(canc~id1+year1+time+lnpa,family=poisson) where 'id1', 'year1' and 'time' are factors. I defined them with: > id1<-C(factor(id1), treatment) and 'lnpa' is a continuous variable. The 'summary' function gives me all the effects estimates, that is, for id1, I

2012 Apr 14

Choose between duplicated rows

Dear r experts, Sorry for this basic question, but I can't seem to find a solution? I have this data frame: df <- data.frame(id = c("id1", "id1", "id1", "id2", "id2", "id2"), A = c(11905, 11907, 11907, 11829, 11829, 11829), v1 = c(NA, 3, NA,1,2,NA), v2 = c(NA,2,NA, 2, NA,NA), v3 = c(NA,1,NA,1,NA,NA), v4 = c("N",

2013 Apr 12

Removing rows that are duplicates but column values are in reversed order

Hi, From your example data,
dat1<- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
c e 12
",sep="",header=TRUE,stringsAsFactors=FALSE)
#it is easier to get the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12
But, if you have cases like the one

2008 Jul 09

Parsing

Dear R users, I have a big text file formatted like this: x x_string y y_string id1 id1_string id2 id2_string z z_string w w_string stuff stuff stuff stuff stuff stuff stuff stuff stuff // x x_string1 y y_string1 z z_string1 w w_string1 stuff stuff stuff stuff stuff stuff stuff stuff stuff // x x_string2 y y_string2 id1

2010 Sep 07

average columns of data frame corresponding to replicates

Hi Group, I have a data frame below. Within this data frame there are samples (columns) that are measured more than once. Samples are indicated by "idx". So "id1" is present in columns 1, 3, and 5. Not every id is repeated. I would like to create a new data frame so that the repeated ids are averaged. For example, in the new data frame, columns 1, 3, and 5 of the original

2011 May 25

Subtracting rows by id

Dear R users, I have two datasets: id1 <- c(rep(1,10), rep(2,10), rep(3,10)) value1 <- sample(1:100, 30, replace=TRUE) dataset1 <- cbind(id1,value1) id2 <- c(1,2,3) subtract.value <- c(1,3,5) dataset2 <- cbind(id2, subtract.value) I want to subtract the number of rows in the subtract.value that corresponds to the id value in dataset1. So for the 1 in id1, I want to

2013 May 15

x and y lengths differ

I have a problem with R. I am trying to compute the confidence interval for my df. When I try to create the plot I get this error: Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ. I tried this code: library(dplR) df.rwi <- detrend(rwl = df, method = "Spline",nyrs=NULL) write.table(df.rwi,file="rwi.txt",quote=FALSE,row.names=TRUE)

2007 Dec 13

zpool version 3 & Uberblock version 9 , zpool upgrade only half succeeded?

We are currently experiencing a huge performance drop on our zfs storage server. We have 2 pools: pool 1, stor, is a raidz out of 7 iscsi nodes; home is a local mirror pool. Recently we had some issues with one of the storage nodes, and because of that the pool was degraded. Since we did not succeed in bringing this storage node back online (on zfs level) we upgraded our nashead from opensolaris b57

2010 Feb 15

[LLVMdev] Measurements of the new inlinehint attribute

Friday I enabled the inlinehint function attribute in the inliner. It mostly affects the performance of -Os compiled code. I have made some measurements on the SPEC test suite to show what it means. I made three runs of the nightly tests. The baseline represents -Os with no inlinehint: make TEST=nightly OPTFLAGS=-Os EXTRA_LOPT_OPTIONS=-inlinehint-threshold=0
