thr3ads.net - R help - [R] Fastest way to compare a single value with all values in one column of a data frame [Jan 2013]

Dimitri Liakhovitski

2013-Jan-29 21:11 UTC

[R] Fastest way to compare a single value with all values in one column of a data frame

Hello!

I have a large data frame x:
x<-data.frame(item=letters[1:5],a=1:5,b=11:15)  # in actuality, x has 1000 rows
x$item<-as.character(x$item)
I also have a small data frame y with just 1 row:
y<-data.frame(item="f",a=3,b=10)
y$item<-as.character(y$item)

I have to decide if y$a is larger than the smallest of all the values in
x$a. If it is, I want y to replace the whole row in x that has the lowest
value in column a.
This is how I'd do it.

if(y$a>min(x$a)){
  whichmin<-which(x$a==min(x$a))
  x[whichmin,]<-y[1,]
}


I am wondering if there is a faster way of doing it. What would be the
fastest possible way? Unfortunately, I'd have to do it many, many times.

Thank you very much!

-- 
Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>

nalluri pratap

2013-Jan-30 12:32 UTC

[R] Fastest way to compare a single value with all values in one column of a data frame

Hi Dimitri,
 
Does this help?
 
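k1<-data.frame(item=sample(letters,10,replace=TRUE),a=1:10,b=11:20)
k2<-data.frame(item="f",a=3,b=10)
# named replace_min so that it does not mask base::merge
replace_min<-function(y,x)
{
if(y$a>min(x$a))
{
  x<-rbind(x,y)              # append y as a new last row
  x<-x[-which.min(x$a),]     # drop the (first) row with the smallest a
}
return(x)
}
replace_min(k2,k1)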
 
Or, for what may be a much faster way, have a look at the sqldf package ("library(sqldf)").


Jessica Streicher

2013-Jan-30 12:38 UTC

[R] Fastest way to compare a single value with all values in one column of a data frame

If you wanted this for all rows in x whose a is smaller than y$a, I'd use

x[x$a < y$a,] <- y

For just the smallest:

x[intersect(which(x$a < y$a),which.min(x$a)),] <- y
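As a minimal sketch of the single-row case, assuming the column names from the original question: the position of the minimum can be looked up once with which.min and reused for both the comparison and the assignment, so x$a is scanned only once. The function name replace_first_min is made up for illustration and does not appear in the thread. When several rows tie for the minimum, only the first is replaced.

```r
# Hypothetical helper (name not from the thread): replace the first
# minimum-`a` row of x with the one-row data frame y when y$a is larger.
replace_first_min <- function(x, y) {
  i <- which.min(x$a)        # position of the first minimum in x$a
  if (y$a > x$a[i]) {
    x[i, ] <- y              # overwrite that single row in place
  }
  x                          # returned unchanged when y$a is not larger
}

x <- data.frame(item = letters[1:5], a = 1:5, b = 11:15, stringsAsFactors = FALSE)
y <- data.frame(item = "f", a = 3, b = 10, stringsAsFactors = FALSE)
x2 <- replace_first_min(x, y)
x2[1, ]                      # the row that held the minimum now holds y
```

The explicit if also avoids assigning into an empty row selection when y$a is not larger than the minimum.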



arun

2013-Jan-30 14:22 UTC

[R] Fastest way to compare a single value with all values in one column of a data frame

Hi,
I guess you could also use:


x[match(min(x$a),x$a[x$a<y$a]),]<- y
x
#  item a  b
#1    f 3 10
#2    b 2 12
#3    c 3 13
#4    d 4 14
#5    e 5 15
A.K.




arun

2013-Jan-30 16:03 UTC

[R] Fastest way to compare a single value with all values in one column of a data frame

HI,

Sorry, my previous solution doesn't work.
This should work for your dataset:
set.seed(1851)
x<- data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
y<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
x[x$a%in%which.min(x[x$a<y$a,]$a),]<- y #if there are multiple minimum values

set.seed(1241)
x1<- data.frame(item=sample(letters[1:10],1e4,replace=TRUE),a=sample(1:30,1e4,replace=TRUE),b=sample(1:100,1e4,replace=TRUE),stringsAsFactors=F)
y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
length(x1$a[x1$a==1])
#[1] 330
system.time({x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1})
#   user  system elapsed
#  0.000   0.000   0.001
length(x1$a[x1$a==1])
#[1] 0


#For some reason, it is not working when the multiple number of minimum values > some value

set.seed(1241)
x1<- data.frame(item=sample(letters[1:10],1e5,replace=TRUE),a=sample(1:30,1e5,replace=TRUE),b=sample(1:100,1e5,replace=TRUE),stringsAsFactors=F)
y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
length(x1$a[x1$a==1])
#[1] 3404
x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1
length(x1$a[x1$a==1])
#[1] 3404 #not getting replaced

#However, if I try:
set.seed(1241)
x1<- data.frame(item=sample(letters[1:10],1e6,replace=TRUE),a=sample(1:5000,1e6,replace=TRUE),b=sample(1:100,1e6,replace=TRUE),stringsAsFactors=F)
y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
length(x1$a[x1$a==1])
#[1] 208
system.time(x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1)
#   user  system elapsed
#  0.124   0.016   0.138
length(x1$a[x1$a==1])
#[1] 0


#Tried Jessica's solution:
set.seed(1851)
x<- data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
y<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
x[intersect(which(x$a < y$a),which.min(x$a)),] <- y
x
#   item  a  b
#1     a  8 25
#2     a 10 26
#3     f  3 10 #replaced
#4     e 15 26
#5     b 13 20
#6     a  5 23
#7     d  4 29
#8     e  2 24
#9     c  7 30
#10    e 14 24
#11    d  2 20
#12    e 10 21
#13    c 13 27
#14    d 12 23
#15    b 11 26
#16    e  5 22
#17    c  1 26  #it is not replaced
#18    a  8 21
#19    e 10 26
#20    c  2 22



A.K.
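One possible reason for the inconsistent results above, offered as a sketch and not as a diagnosis of those exact runs: which.min returns a position within the vector it is given, not a value, so x$a %in% which.min(...) compares the values of a against a position. It selects the minimum rows only when that position happens to equal the minimum value. The small vector below is made up to illustrate this.

```r
a  <- c(5, 2, 4, 1, 2)        # made-up values; the minimum (1) is in row 4
ya <- 3                       # stands in for y$a

pos <- which.min(a[a < ya])   # a[a < ya] is c(2, 1, 2), so pos is a position
hit <- a %in% pos             # TRUE where a equals that position, not the minimum

# Comparing against the minimum value itself selects the intended row(s),
# including all ties.
sel <- a == min(a) & a < ya
which(hit)
which(sel)
```

With a data frame, x[x$a == min(x$a) & x$a < y$a, ] would select the rows to replace in the same way; it is worth checking that the selection is non-empty before assigning.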





