Wladimir Eremeev
2005-Apr-05 11:51 UTC
[R] problems with subset (misunderstanding somewhere)
Dear r-help,

I have the following function defined:

    cubic.distance<-function(x1,y1,z1,x2,y2,z2) {
      max(c(abs(x1-x2),abs(y1-y2),abs(z1-z2)))
    }

I have a data frame from which I make subsets.

When I call

    subset(dataframe,cubic.distance(tb19h,tb37v,tb19v,190,210,227)<=2)

I have the result with 0 rows.

However, the data frame contains the row (among others, that suit)

    tb19v tb19h tb37v
    226.6 189.3 208.4

Call

    cubic.distance(189.3,208.4,226.6,190,210,227)

gives

    [1] 1.6

Next call:

    > cubic.distance(189.3,208.4,226.6,190,210,227)<=2
    [1] TRUE

It seems to me, that I have made errors somewhere in calls.
Could you, please, be so kind, to tell me, where they are?

Thank you.

--
Best regards
Wladimir Eremeev                    mailto:wl at eimb.ru

=========================================================================
Research Scientist, PhD
Space Monitoring & Ecoinformation Systems Sector,
Institute of Ecology, Russian Academy of Sciences
Leninsky Prospect 33, Moscow, Russia, 119071
Phone: (095) 135-9972; Fax: (095) 135-9972
Prof Brian Ripley
2005-Apr-05 12:35 UTC
[R] problems with subset (misunderstanding somewhere)
On Tue, 5 Apr 2005, Wladimir Eremeev wrote:

> Dear r-help,
>
> I have the following function defined:
>
> cubic.distance<-function(x1,y1,z1,x2,y2,z2) {
>   max(c(abs(x1-x2),abs(y1-y2),abs(z1-z2)))
> }
>
> I have a data frame from which I make subsets.
>
> When I call
>   subset(dataframe,cubic.distance(tb19h,tb37v,tb19v,190,210,227)<=2)
> I have the result with 0 rows.
>
> However, the data frame contains the row (among others, that suit)
>   tb19v tb19h tb37v
>   226.6 189.3 208.4
>
> Call
>   cubic.distance(189.3,208.4,226.6,190,210,227)
> gives
> [1] 1.6
>
> Next call:
>> cubic.distance(189.3,208.4,226.6,190,210,227)<=2
> [1] TRUE
>
> It seems to me, that I have made errors somewhere in calls.
> Could you, please, be so kind, to tell me, where they are?

Your function finds the maximum distance over all rows (it is passed
vectors). Replace max() by pmax() for a logical result for each row.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
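The pmax() suggestion above might look like the following minimal sketch. One detail is an assumption about how the advice is meant to be applied: the c() wrapper has to go as well, because pmax() compares its arguments element by element and would simply return a single concatenated vector unchanged. The two-row data frame is hypothetical; its first row is the one quoted in the question and its second row is invented.

```r
# Row-wise version: pmax() takes the parallel (element-wise) maximum of its
# arguments, so the result has one distance per row of the data frame.
cubic.distance <- function(x1, y1, z1, x2, y2, z2) {
  pmax(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2))
}

# Hypothetical data: row 1 is taken from the question, row 2 is invented
# and lies far away from the target point (190, 210, 227).
dataframe <- data.frame(tb19v = c(226.6, 240.0),
                        tb19h = c(189.3, 150.0),
                        tb37v = c(208.4, 260.0))

# One distance per row.
d <- cubic.distance(dataframe$tb19h, dataframe$tb37v, dataframe$tb19v,
                    190, 210, 227)

# The condition is now a logical vector with one element per row,
# so subset() can select rows individually.
result <- subset(dataframe,
                 cubic.distance(tb19h, tb37v, tb19v, 190, 210, 227) <= 2)
print(result)
```

With this data only the first row should be selected, since the second row is 50 units away along tb37v.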
On 05-Apr-05 Wladimir Eremeev wrote:

> Dear r-help,
>
> I have the following function defined:
>
> cubic.distance<-function(x1,y1,z1,x2,y2,z2) {
>   max(c(abs(x1-x2),abs(y1-y2),abs(z1-z2)))
> }
>
> I have a data frame from which I make subsets.
>
> When I call
>   subset(dataframe,cubic.distance(tb19h,tb37v,tb19v,190,210,227)<=2)
> I have the result with 0 rows.
>
> However, the data frame contains the row (among others, that suit)
>   tb19v tb19h tb37v
>   226.6 189.3 208.4

Did you test the function cubic.distance? As written, I think it
will always return a single value, since max() returns the maximum
of *all* the values, not by rows (even if you use cbind() rather
than c()).

If you redefine the function as

    cubic.distance<-function(x1,y1,z1,x2,y2,z2) {
      apply(cbind(abs(x1-x2),abs(y1-y2),abs(z1-z2)),1,max)
    }

I think you will find it does what you want (if I have understood
your problem correctly).

Example (with the function defined as above):

    x<-cbind(rnorm(10,190,1),rnorm(10,210,1),rnorm(10,227,1))
    x<-cbind(rnorm(10,190,2),rnorm(10,210,2),rnorm(10,227,2))
    colnames(x)<-c("tb19h","tb37v","tb19v")
    x.df<-as.data.frame(x)
    (cubic.distance(x.df$tb19h,x.df$tb37v,x.df$tb19v,190,210,227)<=2)
    #[1] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE

    subset(x.df,cubic.distance(tb19h,tb37v,tb19v,190,210,227)<=2)
    #      tb19h    tb37v    tb19v
    #3  189.3930 211.4345 226.3436
    #4  189.4521 208.8493 228.0324
    #9  188.2441 210.4914 226.4521
    #10 191.4781 211.5234 226.1837

With your definition, you would have got the *single* result FALSE,
since there is at least one case where the distance > 2, so the
max > 2, so the subset criterion evaluates to FALSE, so no rows
are selected.

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 05-Apr-05                                       Time: 14:29:22
------------------------------ XFMail ------------------------------
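The difference between the two definitions can be checked on fixed data rather than random draws. The sketch below is only illustrative: the names cubic.distance.max and cubic.distance.apply and the three-row data frame are hypothetical, chosen so that the results can be worked out by hand.

```r
# Definition from the question: max() collapses all values to one number.
cubic.distance.max <- function(x1, y1, z1, x2, y2, z2) {
  max(c(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2)))
}

# Row-wise definition from this reply: max() is applied to each row
# of the three-column matrix of absolute differences.
cubic.distance.apply <- function(x1, y1, z1, x2, y2, z2) {
  apply(cbind(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2)), 1, max)
}

# Hypothetical data: rows 1 and 2 are within 2 units of (190, 210, 227),
# row 3 is far away.
x.df <- data.frame(tb19h = c(189.3, 191.0, 150.0),
                   tb37v = c(208.4, 209.5, 260.0),
                   tb19v = c(226.6, 228.5, 240.0))

# A single number: the largest difference found anywhere in the data.
single <- with(x.df, cubic.distance.max(tb19h, tb37v, tb19v, 190, 210, 227))

# One number per row.
rowwise <- with(x.df, cubic.distance.apply(tb19h, tb37v, tb19v, 190, 210, 227))

print(single)
print(rowwise)

# Only the row-wise version lets subset() keep the two close rows.
close.rows <- subset(x.df,
                     cubic.distance.apply(tb19h, tb37v, tb19v, 190, 210, 227) <= 2)
print(close.rows)
```

The one distant row is enough to push the single maximum above 2, which is why the original definition rejects every row at once.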
Wladimir Eremeev
2005-Apr-05 13:50 UTC
[R] problems with subset (misunderstanding somewhere)
Ted,

TH> Did you test the function cubic.distance?

Yes, I did.

TH> As written, I think it
TH> will always return a single value,

Yes, here was the misunderstanding.
Subset required a vector, and I gave it a scalar.
Prof. Ripley has already shown my mistake.

--
Best regards
Wladimir Eremeev                    mailto:wl at eimb.ru

=========================================================================
Research Scientist, PhD
Space Monitoring & Ecoinformation Systems Sector,
Institute of Ecology, Russian Academy of Sciences
Leninsky Prospect 33, Moscow, Russia, 119071
Phone: (095) 135-9972; Fax: (095) 135-9972
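What happens when subset() is handed a length-one condition can be seen in a small sketch. The data frame toy and its column a are hypothetical. The condition is used as a logical row index, so a single FALSE selects no rows and a single TRUE is recycled to select all of them; neither case raises an error, which is probably why the mistake was hard to spot.

```r
# Hypothetical four-row data frame.
toy <- data.frame(a = 1:4)

# max(a) is 4, so the condition is the single value FALSE: no rows.
none <- subset(toy, max(a) <= 2)

# The condition is the single value TRUE, recycled over every row.
all.rows <- subset(toy, max(a) <= 4)

# A condition with one element per row selects rows individually.
some <- subset(toy, a <= 2)

print(nrow(none))
print(nrow(all.rows))
print(nrow(some))
```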