thr3ads.net - R help - [R] correct way to subset a vector [Jul 2009]


Juliet Hannah

2009-Jul-09 15:40 UTC

[R] correct way to subset a vector

Hi,

#make example data
dat <- data.frame(matrix(rnorm(15),ncol=5))
colnames(dat) <- c("ab","cd","ef","gh","ij")

If I want to get a subset of the data for the middle 3 columns, and I
know the names of the start column and the end column, I can do this:

mysub <- subset(dat,select=c(cd:gh))

If I wanted to do this just on the column names, without subsetting
the data, how could I do this?

mynames <- colnames(dat);

#mynames
#[1] "ab" "cd" "ef" "gh" "ij"

Is there an easy way to create the vector c("cd","ef","gh") as I did
above using something similar to cd:gh?

Thanks,

Juliet

Steve Lianoglou

2009-Jul-09 15:54 UTC

[R] correct way to subset a vector

Hi,

On Jul 9, 2009, at 11:40 AM, Juliet Hannah wrote:
> Hi,
>
> #make example data
> dat <- data.frame(matrix(rnorm(15),ncol=5))
> colnames(dat) <- c("ab","cd","ef","gh","ij")
>
> If I want to get a subset of the data for the middle 3 columns, and I
> know the names of the start column and the end column, I can do this:
>
> mysub <- subset(dat,select=c(cd:gh))
>
> If I wanted to do this just on the column names, without subsetting
> the data, how could I do this?
>
> mynames <- colnames(dat);
>
> #mynames
> #[1] "ab" "cd" "ef" "gh" "ij"
>
> Is there an easy way to create the vector c("cd","ef","gh") as I did
> above using something similar to cd:gh?

How about just indexing your mynames vector by position? E.g.:

R> mynames[2:4]
[1] "cd" "ef" "gh"

R> dat[, mynames[2:4]]
           cd         ef          gh
1  1.7745386  1.0958930 -0.07213304
2  0.7480372 -0.1364458 -0.62848211
3 -0.5477843  1.5811382 -0.74404103
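
If the positions are not known in advance, they can be looked up from the names. The following is only a sketch, assuming both names are present in the vector and the start name comes before the end name; it uses base R's match():

```r
mynames <- c("ab", "cd", "ef", "gh", "ij")

# match() returns the position of the first occurrence of each name
idx <- match(c("cd", "gh"), mynames)
idx
# [1] 2 4

# build the consecutive run of names between the two positions
mynames[idx[1]:idx[2]]
# [1] "cd" "ef" "gh"
```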

-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

Contact Info: http://cbio.mskcc.org/~lianos

Marc Schwartz

2009-Jul-09 16:17 UTC

[R] correct way to subset a vector

On Jul 9, 2009, at 10:40 AM, Juliet Hannah wrote:
> Hi,
>
> #make example data
> dat <- data.frame(matrix(rnorm(15),ncol=5))
> colnames(dat) <- c("ab","cd","ef","gh","ij")
>
> If I want to get a subset of the data for the middle 3 columns, and I
> know the names of the start column and the end column, I can do this:
>
> mysub <- subset(dat,select=c(cd:gh))
>
> If I wanted to do this just on the column names, without subsetting
> the data, how could I do this?
>
> mynames <- colnames(dat);
>
> #mynames
> #[1] "ab" "cd" "ef" "gh" "ij"
>
> Is there an easy way to create the vector c("cd","ef","gh") as I did
> above using something similar to cd:gh?
>
> Thanks,
>
> Juliet

Using the same presumption that the desired values are consecutive in  
the vector:

# Use which() to get the indices for the start and end of the subset
 > mynames[which(mynames == "cd"):which(mynames == "gh")]
[1] "cd" "ef" "gh"

You can encapsulate that in a function:

subset.vector <- function(x, start, end)
{
   x[which(x == start):which(x == end)]
}

 > subset.vector(mynames, "cd", "gh")
[1] "cd" "ef" "gh"
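
The which() version assumes that both names exist, appear once each, and are in order. A more defensive variant might look like the sketch below; names_between() is a hypothetical helper name, not a base R function, and the error messages are illustrative only:

```r
# names_between() is a hypothetical helper, not part of base R
names_between <- function(x, start, end) {
  i <- match(start, x)  # position of the first occurrence, or NA
  j <- match(end, x)
  if (is.na(i) || is.na(j)) {
    stop("both 'start' and 'end' must be present in 'x'")
  }
  if (i > j) {
    stop("'start' must not come after 'end'")
  }
  x[i:j]
}

mynames <- c("ab", "cd", "ef", "gh", "ij")
names_between(mynames, "cd", "gh")
# [1] "cd" "ef" "gh"
```

Using match() instead of which() means a duplicated name resolves to its first occurrence rather than producing a multi-element index.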

Note that you can also do this:

 > names(subset(dat, select = cd:gh))
[1] "cd" "ef" "gh"

but that actually goes through the process of subsetting the data  
frame first, which potentially introduces a lot of overhead and memory  
use if the data frame is large. It also presumes that the desired  
vector is a subset of the column names of the initial data frame.
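
If the subset() syntax is still preferred, one possible way to limit that overhead is to apply it to a zero-row slice of the data frame, which keeps the column names but none of the data. This is a sketch under that assumption, not a benchmarked recommendation:

```r
set.seed(1)
dat <- data.frame(matrix(rnorm(15), ncol = 5))
colnames(dat) <- c("ab", "cd", "ef", "gh", "ij")

# a zero-row slice retains all columns (and their names) but no rows
empty <- dat[0, , drop = FALSE]

res <- names(subset(empty, select = cd:gh))
res
# [1] "cd" "ef" "gh"
```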

To use the same sequence based approach as is used in  
subset.data.frame(), you can do what is used internally within that  
function:

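subset.vector <- function(x, select)
{
   # map each name to its position, e.g. list(ab = 1, cd = 2, ...)
   nl <- as.list(seq_along(x))
   names(nl) <- x
   # evaluate the unquoted 'select' expression with names bound to positions
   vars <- eval(substitute(select), nl)
   x[vars]
}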

 > subset.vector(mynames, select = cd:gh)
[1] "cd" "ef" "gh"
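
Because the select expression is evaluated with each name bound to its position, other index expressions should work as well as cd:gh. The sketch below restates the function under the hypothetical name select_names() so that it is self-contained, and assumes the names are syntactically valid and unique:

```r
# select_names() is a hypothetical name for the approach shown above
select_names <- function(x, select) {
  nl <- as.list(seq_along(x))  # positions 1..length(x)
  names(nl) <- x               # each name now refers to its position
  x[eval(substitute(select), nl, parent.frame())]
}

mynames <- c("ab", "cd", "ef", "gh", "ij")

# combine a single name with a range
select_names(mynames, c(ab, ef:gh))
# [1] "ab" "ef" "gh"

# a negative index drops a name
select_names(mynames, -ab)
# [1] "cd" "ef" "gh" "ij"
```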

BTW, well done on recognizing that you can use the sequence of column  
names for the 'select' argument. A lot of folks, even experienced  
useRs, don't realize that you can do that...  :-)

HTH,

Marc Schwartz
