thr3ads.net - R help - [R] Indexing by logical vectors [Jul 2010]

Christian Raschke

2010-Jul-19 23:16 UTC

[R] Indexing by logical vectors

Dear R-Listers,

My question concerns indexing vectors by logical vectors that are based 
on the original vector. Consider the following simple example to 
hopefully make clear what I mean:

a <- rnorm(10)
a[a<0] <- NA

However, I am now working with multiple data frames that I received, 
where each of them has nicely descriptive, yet long names(). In my 
scripts there are many instances where operations similar to the one 
above are required. Again a simple example:


some.data.frame <- data.frame(some.long.variable.name=rnorm(10), 
some.other.long.variable.name=rnorm(10))

some.data.frame$some.other.long.variable.name[some.data.frame$some.other.long.variable.name
< 0] <- NA


The fact that the names are so long makes things not very readable in 
the script and hard to debug. Is there a way in R to refer to the "self"
of whatever is being indexed? I am looking for something like

some.data.frame$some.other.long.variable.name[.self < 0] <- NA

that would accomplish the same result as above. Or is there another 
concise, but less messy way to do this? I prefer not attaching the 
data.frames and partial matching makes things even more messy since many 
names() are very similar. I know I could just rename everything, but I'd 
like to learn if there is an easy or obvious way to do this in R that I 
have missed so far.

I would appreciate any advice, and I apologize if this topic has been 
discussed before.


 > sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-redhat-linux-gnu

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


-- 
Christian Raschke
Department of Economics
and
ISDS Research Lab (HSRG)
Louisiana State University
crasch2 at lsu.edu

David Winsemius

2010-Jul-19 23:46 UTC

[R] Indexing by logical vectors

On Jul 19, 2010, at 7:16 PM, Christian Raschke wrote:
> Dear R-Listers,
>
> My question concerns indexing vectors by logical vectors that are  
> based on the original vector. Consider the following simple example  
> to hopefully make clear what I mean:
>
> a <- rnorm(10)
> a[a<0] <- NA
>
> However, I am now working with multiple data frames that I received,  
> where each of them has nicely descriptive, yet long names(). In my  
> scripts there are many instances where operations similar to the one  
> above are required. Again a simple example:
>
>
> some.data.frame <- data.frame(some.long.variable.name=rnorm(10),  
> some.other.long.variable.name=rnorm(10))
>
> some.data.frame$some.other.long.variable.name[some.data.frame 
> $some.other.long.variable.name < 0] <- NA
>
>
> The fact that the names are so long makes things not very readable  
> in the script and hard to debug. Is there a way in R to refer to the  
> "self" of whatever is being indexed? I am looking for something like
>
> some.data.frame$some.other.long.variable.name[.self < 0] <- NA

There is an alternative, "is.na()<-", which I think is a bit more
readable:

is.na($some.other.long.variable.name) <- some.data.frame 
$some.other.long.variable.name < 0

But do _not_ do:

with(some.data.frame, is.na(some.other.long.variable.name) <-  
some.other.long.variable.name < 0 )
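
A minimal sketch of why the with() form is best avoided, assuming a small made-up data frame df with one column x (neither is from the thread). with() evaluates its expression in a temporary environment built from the data frame, so the assignment changes a copy and the original is left untouched; within() returns the modified copy, which can then be assigned back.

```r
# Illustrative data only; 'df' and 'x' are hypothetical names.
df <- data.frame(x = c(-1, 2, -3))

# The assignment happens on a temporary copy of x, so df is unchanged.
with(df, is.na(x) <- x < 0)
unchanged <- !any(is.na(df$x))

# within() evaluates the same expression but returns the modified data frame.
df2 <- within(df, is.na(x) <- x < 0)
```

Here df still holds its negative values after the with() call, while df2 has NA in their place.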

-- 
David.
David Winsemius, MD
West Hartford, CT

Bill.Venables at csiro.au

2010-Jul-20 00:12 UTC

[R] Indexing by logical vectors

As far as I know the answer to your question is "No", but there are
things you can do to improve the readability of your code.  One thing I find
useful is to avoid using "$" as much as possible and to favour things
like with() and within().

The first thing you might do is think about choosing shorter names, of course. 
If that's not possible, you could try something like this.

ensureNN <- function(x) {  # "ensure non-negative"
	is.na(x[x < 0]) <- TRUE
	x
} 

some.data.frame <- within(some.data.frame, {
  some.long.variable.name <- ensureNN(some.long.variable.name)
  some.other.long.variable.name <- ensureNN(some.other.long.variable.name)
})

Of course if you wanted to do this to all variables in a data frame you could do

some.data.frame <- data.frame(lapply(some.data.frame, ensureNN))

and it all happens, no questions asked.  (I can see a generic function emerging
here, perhaps...)
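
One possible shape for such a more general helper, offered only as a sketch: na_where() and its pred argument are hypothetical names, not something from the thread or from base R. It takes the condition as a function, so the same helper covers "negative", "above a cutoff", and so on. The example data are made up.

```r
# Hypothetical helper: set to NA every element for which pred(x) is TRUE.
na_where <- function(x, pred) {
  x[which(pred(x))] <- NA  # which() ignores any NA produced by pred(x)
  x
}

# Illustrative data with the long names used in the thread.
some.data.frame <- data.frame(
  some.long.variable.name       = c(-1, 2, -3),
  some.other.long.variable.name = c(4, -5, 6)
)

# Apply to every column; assigning into some.data.frame[] keeps it a data frame.
some.data.frame[] <- lapply(some.data.frame, na_where,
                            pred = function(x) x < 0)
```

This assumes every column is numeric; columns of other types would need a predicate that makes sense for them.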

W.


Christian Raschke

2010-Jul-20 04:18 UTC

[R] Indexing by logical vectors

On Mon, 2010-07-19 at 19:46 -0400, David Winsemius wrote:
> On Jul 19, 2010, at 7:16 PM, Christian Raschke wrote:
> 
> > Dear R-Listers,
> >
> > My question concerns indexing vectors by logical vectors that are  
> > based on the original vector. Consider the following simple example  
> > to hopefully make clear what I mean:
> >
> > a <- rnorm(10)
> > a[a<0] <- NA
> >
> > However, I am now working with multiple data frames that I received,  
> > where each of them has nicely descriptive, yet long names(). In my  
> > scripts there are many instances where operations similar to the one  
> > above are required. Again a simple example:
> >
> >
> > some.data.frame <- data.frame(some.long.variable.name=rnorm(10),  
> > some.other.long.variable.name=rnorm(10))
> >
> > some.data.frame$some.other.long.variable.name[some.data.frame 
> > $some.other.long.variable.name < 0] <- NA
> >
> >
> > The fact that the names are so long makes things not very readable  
> > in the script and hard to debug. Is there a way in R to refer to the  
> > "self" of whatever is being indexed? I am looking for something like
> >
> > some.data.frame$some.other.long.variable.name[.self < 0] <- NA
> 
> There is an alternative, "is.na()<-" which I think is a bit more
> readable:
> 
> is.na($some.other.long.variable.name) <- some.data.frame 
> $some.other.long.variable.name < 0
Thanks, David!

As written, this throws an error. However,

is.na(some.data.frame$some.other.long.variable.name) <- some.data.frame$some.other.long.variable.name < 0

works, but does not seem like much of an improvement to me. 
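
For what it is worth, something close to the ".self" idea can be sketched with a user-defined replacement function. This is an assumption-laden illustration, not a base R feature and not a suggestion made in the thread: the name `na_where<-` is hypothetical, and the right-hand side is a function that receives the column being modified. The data are made up.

```r
# Hypothetical replacement function: the value on the right-hand side is a
# predicate, and elements for which it returns TRUE are set to NA.
`na_where<-` <- function(x, value) {
  x[which(value(x))] <- NA  # which() ignores any NA produced by value(x)
  x
}

# Illustrative data with the long names used in the thread.
some.data.frame <- data.frame(
  some.long.variable.name       = c(-1, 2, -3),
  some.other.long.variable.name = c(4, -5, 6)
)

# The long name appears only once; '.self' is just an ordinary argument name.
na_where(some.data.frame$some.other.long.variable.name) <- function(.self) .self < 0
```

Only the named column is changed; the other column keeps its negative values.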