thr3ads.net - R help - [R] Creating subset using selected columns [Jun 2013]

Suparna Mitra

2013-Jun-16 07:20 UTC

[R] Creating subset using selected columns

Hello R experts,
I need help creating a subset of a data file. I know that with the subset
command it is very easy to select many different columns or to apply a
threshold. But my data file is big, and I don't want to identify the column
numbers or names manually. I am trying to find a way to automate this.

For example I have a file with about 1500 columns from TRFLP intensity
data.


And the column names are like:
 [1] "Sample.Name"   "Marker"        "RE"            "Dye"           "Allele.1"      "Size.1"        "Height.1"      "Peak.Area.1"   "Data.Point.1"
[10] "Allele.2"      "Size.2"        "Height.2"      "Peak.Area.2"   "Data.Point.2"  "Allele.3"      "Size.3"        "Height.3"      "Peak.Area.3"
[19] "Data.Point.3"  "Allele.4"      "Size.4"        "Height.4"      "Peak.Area.4"   "Data.Point.4"  "Allele.5"      "Size.5"        "Height.5"
[28] "Peak.Area.5"   "Data.Point.5"  "Allele.6"      "Size.6"        "Height.6"      "Peak.Area.6"   "Data.Point.6"  "Allele.7"      "Size.7"
[37] "Height.7"      "Peak.Area.7"   "Data.Point.7"  "Allele.8"      "Size.8"        "Height.8"      "Peak.Area.8"   "Data.Point.8"  "Allele.9"
[46] "Size.9"        "Height.9"      "Peak.Area.9"   "Data.Point.9"  "Allele.10"     "Size.10"       "Height.10"     "Peak.Area.10"  "Data.Point.10"
.....

Suppose I want to create a subset containing all the columns whose names
start with Peak.Area (like Peak.Area.* in a Unix shell).
How can I do that in R?
Thanks a lot for the help.
Best wishes,
Mitra


Rui Barradas

2013-Jun-16 08:40 UTC

[R] Creating subset using selected columns

Hello,

You could try something like the following.
The example below assumes your data.frame is named 'dat'


cnums <- grep("Peak\\.Area", colnames(dat))
subdat <- dat[cnums]

See ?regexp for the regular expressions used by ?grep.
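
As a self-contained sketch of the same idea, the snippet below builds a small toy data.frame (the column values are made up, and the stricter anchored pattern is an assumption about how the real names look, not part of the original suggestion) and selects only the Peak.Area columns:

```r
# Toy data.frame standing in for the real TRFLP table (values are made up).
dat <- data.frame(
  Sample.Name = c("s1", "s2"),
  Allele.1    = c(1, 2),
  Peak.Area.1 = c(100, 200),
  Height.1    = c(10, 20),
  Peak.Area.2 = c(300, 400),
  stringsAsFactors = FALSE
)

# "^" anchors the match at the start of the name, "\\." matches a literal dot,
# and "[0-9]+$" requires the name to end in one or more digits.
cnums  <- grep("^Peak\\.Area\\.[0-9]+$", colnames(dat))
subdat <- dat[cnums]

# Inspect which columns were kept.
colnames(subdat)
```

The unanchored pattern "Peak\\.Area" would also match any column that merely contains that text; anchoring is only worth the extra typing if such columns might exist in the real file.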

Hope this helps,

Rui Barradas


Rainer Schuermann

2013-Jun-16 08:42 UTC

[R] Creating subset using selected columns

Suppose your data.frame is called x; try:
x[ which( substr( colnames( x ), 1, 4 ) == "Peak" ) ]
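
One thing to keep in mind is that comparing only the first four characters selects every column whose name begins with "Peak", not just the Peak.Area ones. The sketch below illustrates this on a toy data.frame; the Peak.Height.1 column is hypothetical and does not appear in the original data, and startsWith() assumes R 3.3.0 or later:

```r
# Toy data.frame; Peak.Height.1 is a hypothetical column added for illustration.
x <- data.frame(
  Sample.Name   = c("s1", "s2"),
  Peak.Area.1   = c(100, 200),
  Peak.Height.1 = c(10, 20),
  Peak.Area.2   = c(300, 400),
  stringsAsFactors = FALSE
)

# Prefix test on the first 4 characters: also picks up Peak.Height.1.
by_peak <- x[which(substr(colnames(x), 1, 4) == "Peak")]

# Testing the full "Peak.Area" prefix is more specific.
by_area <- x[startsWith(colnames(x), "Peak.Area")]

colnames(by_peak)
colnames(by_area)
```

For the column names shown in the question the two selections should coincide, since no other name there starts with "Peak".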


