Displaying 20 results from an estimated 3000 matches similar to: "performance issue with merge"
2009 Dec 10
1
problem with data processing in R
Hi,
I'm stuck with parsing data into R for heatmap representation.
The data looks like:
1 id1 x1 x2 x3 .... x20
2 id1 x1 x2 x3 .... x20
3 id1 x1 x2 x3 .... x20
4 id1 x1 x2 x3 .... x20
.........
348 id2 x1 x2 x3 .... x20
349 id2 x1 x2 x3 .... x20
350 id2 x1 x2 x3 .... x20
351 id2 x1 x2 x3 .... x20
.........
The data is sorted by ID (id1, id2, ..., id40) and I would like to
2008 Sep 04
1
how to match or merge data.frames in this case...
Hi,
I'm trying to match two data frames in order to replace the boundary values
in dataframe.1 with values in dataframe.2 ONLY where the pair (id1, id2)
matches between the two data frames.
Eg.
> dataframe.1
... id1 id2 boundary
3307 1095 1108 438.691
3308 1095 1109 438.691
3309 1095 1121 438.691
3310 1096 1109 438.691
...
3345 1108 1120 438.691
3346 1108 1121 438.691
3347 1108
2006 Mar 05
2
Wishlist: merge and subset to keep attributes (PR#8658)
Full_Name: Ulrike Grömping
Version: 2.2.1
OS: Windows
Submission from: (NULL) (84.190.139.94)
When importing data from SPSS, it is a nice feature of the package foreign that
it allows (option use.value.labels=F) to work with the original SPSS codes while
keeping the value labels as information in an attribute. Unfortunately, after
merging or subsetting, these attributes disappear.
The code
2010 Nov 08
2
Fuzzy merge using timestamps
Greetings Supreme Council of R Masters,
Like a toddler, I have gotten my head stuck in the banisters of R ... again.
Let it be known I am still a neophyte in the R-community forum world, so
please don't flame me too bad.
I have two sets of data, each with a set of timestamps. I would like to
somehow merge the datasets based on the timestamps and an individual
identifier. That is there are
2008 Apr 09
2
fuzzy merge
Hi,
I would like to merge two data frames. It is just that I want the merging to be done with some kind of a fuzzy criterion. Let me explain.
My first data frame looks like this :
ID1 time1 dt
1 2008-01-02 13:11 10
2 2008-01-02 14:20 20
3
2014 Jan 16
1
Doubt in simple merge
Dear R community
I have two data sets called "Elder" and "Younger".
This is my code for simple merge.
Elder <- data.frame(
ID=c("ID1","ID2","ID3"),
age=c(38,35,31))
Younger <- data.frame(
ID=c("ID4","ID5","ID3"),
age=c(29,21,31))
mer <- merge(Elder,Younger,by="ID", all=TRUE)
Output I am
2006 Jan 25
0
lazy evaluation (was RE: Number of replications of a term)
From: Thomas Lumley
>
> On Wed, 25 Jan 2006, Ray Brownrigg wrote:
>
> > There's an even faster one, which nobody seems to have mentioned yet:
> >
> > rep(l <- rle(ids)$lengths, l)
>
> I considered this but it wasn't clear to me from the initial post that
> each ID occupied a contiguous section of the vector.
>
> Also, lazy
2007 Dec 08
0
help for segmented package
Hi,
I am trying to find m breakpoints of a linear regression model. I
used the segmented package. It works fine for a small number of
predictors and breakpoints (3 r.v., 3 points). However, my model has
14 variables, and it does not work even for just one breakpoint!
The error message is always that the estimated breakpoints are out of range.
Since my problem is a time-related problem, I
2010 Feb 17
2
extract the data that match
Hi r-users,
I would like to extract the data that match. Attached is my data:
I'm interested in matching the value in column 'intg' with the value in column 'rand_no'
> cbind(z=z,intg=dd,rand_no = rr)
z intg rand_no
[1,] 0.00 0.000 0.001
[2,] 0.01 0.000 0.002
[3,] 0.02 0.000 0.002
[4,] 0.03 0.000 0.003
[5,] 0.04 0.000 0.003
[6,]
2007 Nov 02
0
applying duplicated, unique and match to lists?
Dear R developers,
While improving duplicated.array() and friends and developing equivalents for the new ff package for large datasets I came across two questions:
1) is it safe to use duplicated.default(), unique.default() and match() on arbitrary lists? If so, we can speed up duplicated.array and friends considerably by using list() instead of paste(collapse="\r")
2) while
2009 Jul 31
2
merging two data frames with columns of different length
Dear all,
I am trying to merge two data frames based on a common column, but for this
common column the two data frames do not have the same length and associated
information. I checked previous examples on the list but was not able to apply
them in my case... Does anyone know how to do that? Below is my code with the
expected result:
# data frame 1
Id1 <- c(1,1,1,2,2,2,3,3,3)
Habit1 <-
2003 Nov 17
1
Generalized linear model
Hi all!
I am fitting a Poisson model, using the following command:
> fit2<-glm(canc~id1+year1+time+lnpa,family=poisson)
where 'id1', 'year1' and 'time' are factors. I defined them with:
> id1<-C(factor(id1), treatment)
and 'lnpa' is a continuous variable.
The 'summary' function gives me all the effects estimates, that is, for id1,
I
2012 Apr 14
3
Choose between duplicated rows
Dear r experts,
Sorry for this basic question, but I can't seem to find a solution.
I have this data frame:
df <- data.frame(id = c("id1", "id1", "id1", "id2", "id2", "id2"), A =
c(11905, 11907, 11907, 11829, 11829, 11829), v1 = c(NA, 3, NA,1,2,NA), v2 =
c(NA,2,NA, 2, NA,NA), v3 = c(NA,1,NA,1,NA,NA), v4 = c("N",
2013 Apr 12
1
Removing rows that are duplicates but column values are in reversed order
Hi,
From your example data,
dat1<- read.table(text="
id1   id2   value
a     b     10
c     d     11
b     a     10
c     e     12
",sep="",header=TRUE,stringsAsFactors=FALSE)
#it is easier to get the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12
But, if you have cases like the one
2008 Jul 09
2
Parsing
Dear R users,
I have a big text file formatted like this:
x x_string
y y_string
id1 id1_string
id2 id2_string
z z_string
w w_string
stuff stuff stuff
stuff stuff stuff
stuff stuff stuff
//
x x_string1
y y_string1
z z_string1
w w_string1
stuff stuff stuff
stuff stuff stuff
stuff stuff stuff
//
x x_string2
y y_string2
id1
2010 Sep 07
1
average columns of data frame corresponding to replicates
Hi Group,
I have a data frame below. Within this data frame there are samples
(columns) that are measured more than once. Samples are indicated by
"idx". So "id1" is present in columns 1, 3, and 5. Not every id is
repeated. I would like to create a new data frame so that the repeated
ids are averaged. For example, in the new data frame, columns 1, 3,
and 5 of the original
2011 May 25
1
Subtracting rows by id
Dear R users,
I have two datasets:
id1 <- c(rep(1,10), rep(2,10), rep(3,10))
value1 <- sample(1:100, 30, replace=TRUE)
dataset1 <- cbind(id1,value1)
id2 <- c(1,2,3)
subtract.value <- c(1,3,5)
dataset2 <- cbind(id2, subtract.value)
I want to subtract the number of rows in the subtract.value that
corresponds to the id value in dataset1. So for the 1 in id1, I want
to
2013 May 15
1
x and y lengths differ
I have a problem with R. I try to compute the confidence interval for my
df. When I want to create the plot I have this problem: Error in
xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ.
I try this code:
library(dplR)
df.rwi <- detrend(rwl = df, method = "Spline",nyrs=NULL)
write.table(df.rwi,file="rwi.txt",quote=FALSE,row.names=TRUE)
2007 Dec 13
0
zpool version 3 & Uberblock version 9 , zpool upgrade only half succeeded?
We are currently experiencing a huge performance drop on our zfs storage server.
We have 2 pools: stor is a raidz built from 7 iscsi nodes, and home is a local mirror pool. Recently we had some issues with one of the storage nodes, and because of that the pool was degraded. Since we did not succeed in bringing this storage node back online (at the zfs level), we upgraded our nashead from opensolaris b57
2010 Feb 15
0
[LLVMdev] Measurements of the new inlinehint attribute
Friday I enabled the inlinehint function attribute in the inliner. It mostly affects the performance of -Os compiled code. I have made some measurements on the SPEC test suite to show what it means.
I made three runs of the nightly tests. The baseline represents -Os with no inlinehint:
make TEST=nightly OPTFLAGS=-Os EXTRA_LOPT_OPTIONS=-inlinehint-threshold=0