thr3ads.net - R help - [R] looping through a data frame [Jun 2006]

Ivan Baxter

2006-Jun-23 18:38 UTC

[R] looping through a data frame

Hi- I am having trouble with the syntax of looping through the rows
and columns of a data frame.

I have a table with 17 observations for 84 lines at n=5-10 per line. So 
the table is ~700x17.

I want to pull out the median and stdev for each line and put it in a 
dataframe with rowname = linename.

So I have tried the following....
#read in the table
input.table <- read.table(file = "First_run_all.txt", header = TRUE)
#pull out the line names
line.run <- unique(input.table$Line)
#pull out the column names except for Line
el.names <- names(input.table)[2:18]


#now I want to calculate the median for each line for each column. The
#code below would work for a matrix
calc.frame.med <- matrix(ncol = length(el.names), nrow = length(line.run),
                         dimnames = list(line.run, el.names))
for(i in seq_along(el.names)){
    for(j in seq_along(line.run)){
       calc.frame.med[j,i] <- median(input.table[input.table$Line == line.run[j],
                                                 el.names[i]])
    }
}


#however, it won't allow me to pull stuff out based on the row names,
#will it?
batch1.med <- calc.frame.med[rownames(calc.frame.med) == batch1,]
#doesn't work.
#It seems like I want to create the data as a matrix and then be able to
#treat it like a data.frame.
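
A note on the row-name lookup above: a matrix can be indexed by row name, so one likely reason the line fails is the comparison itself. Assuming batch1 is a character vector holding several line names (the post does not say what it contains), `==` compares element by element and recycles the shorter vector, so most matches are missed. A minimal sketch with made-up names and values, using %in% or direct name indexing instead:

```r
# Toy stand-in for calc.frame.med: 4 lines x 2 elements (values are made up)
calc.frame.med <- matrix(1:8, nrow = 4,
                         dimnames = list(c("A", "B", "C", "D"), c("V2", "V3")))

# Hypothetical batch: a character vector of line names
batch1 <- c("B", "D")

# `==` recycles batch1 against the row names, so only some rows match
by.equals <- calc.frame.med[rownames(calc.frame.med) == batch1, , drop = FALSE]

# %in% tests membership; indexing directly by name also works on a matrix
by.in   <- calc.frame.med[rownames(calc.frame.med) %in% batch1, , drop = FALSE]
by.name <- calc.frame.med[batch1, , drop = FALSE]

# as.data.frame() keeps the row names if a data frame is needed afterwards
calc.frame.df <- as.data.frame(calc.frame.med)
```

Here by.in and by.name both hold rows B and D, while by.equals picks up only one of them.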

Can anyone set me straight on the right way to do this?

Thanks

Ivan



-- 
**************************************************************
Ivan Baxter
Research Scientist
Bindley Bioscience Center
Purdue University
765-543-7288
ibaxter at purdue.edu

jim holtman

2006-Jun-23 20:46 UTC

[R] looping through a data frame

It looks like you want the column means for each unique instance of Line.
If that is so, try this: (Line has unique numbers in the range 1:5)
> str(x)
`data.frame':   25 obs. of  18 variables:
 $ Line: num  5 3 2 1 1 4 3 5 4 1 ...
 $ V2  : num  0.3861 0.0134 0.3824 0.8697 0.3403 ...
 $ V3  : num  0.4776 0.8612 0.4381 0.2448 0.0707 ...
 $ V4  : num  0.892 0.864 0.390 0.777 0.961 ...
 $ V5  : num  0.655 0.353 0.270 0.993 0.633 ...
 $ V6  : num  0.454 0.511 0.208 0.229 0.596 ...
 $ V7  : num  0.615 0.557 0.329 0.453 0.500 ...
 $ V8  : num  0.895 0.644 0.741 0.605 0.903 ...
 $ V9  : num  0.268 0.219 0.517 0.269 0.181 ...
 $ V10 : num  0.5110 0.2576 0.0465 0.4179 0.8540 ...
 $ V11 : num  0.762 0.933 0.471 0.604 0.485 ...
 $ V12 : num  0.192 0.257 0.181 0.477 0.771 ...
 $ V13 : num  0.6737 0.0949 0.4926 0.4616 0.3752 ...
 $ V14 : num  0.954 0.812 0.782 0.268 0.762 ...
 $ V15 : num  0.4861 0.0638 0.7845 0.4183 0.9810 ...
 $ V16 : num  0.420 0.334 0.865 0.177 0.493 ...
 $ V17 : num  0.659 0.185 0.954 0.898 0.944 ...
 $ V18 : num  0.4058 0.0853 0.9326 0.8384 0.8794 ...
> sapply(split(seq(nrow(x)), x$Line), function(z) colMeans(x[z,2:18]))
            1         2         3         4         5
V2  0.6127840 0.5086587 0.5788833 0.4644615 0.4309832
V3  0.4767014 0.4742475 0.3711332 0.4924043 0.5234278
V4  0.5809474 0.4480547 0.7011737 0.4015890 0.4967741
V5  0.6308344 0.1973047 0.5139931 0.5954598 0.4103783
V6  0.5686456 0.2733915 0.4741611 0.6590434 0.3368377
V7  0.3733256 0.5335852 0.6860015 0.3432356 0.4859149
V8  0.5975514 0.5355630 0.6758798 0.4619429 0.6510002
V9  0.6814301 0.6151856 0.5076237 0.4173341 0.2176028
V10 0.6799704 0.3197591 0.3102719 0.4209485 0.4051600
V11 0.5540828 0.4474840 0.4946577 0.2194847 0.3836363
V12 0.5000410 0.1509925 0.3744429 0.2316218 0.3495196
V13 0.4898115 0.5852952 0.4697099 0.4346127 0.5736597
V14 0.4135897 0.7071779 0.4640510 0.5645719 0.7029126
V15 0.5346258 0.5340159 0.4429280 0.5265885 0.4918243
V16 0.4354619 0.7265643 0.4439110 0.4037036 0.4708805
V17 0.7393969 0.6011346 0.4725786 0.5430598 0.6076132
V18 0.4639411 0.6589378 0.4020718 0.5948647 0.2981538
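
The call above returns means, while the original question asked for the median and standard deviation per line, with the line names as row names. A minimal sketch of one way to get that with aggregate(), assuming Line is the first column of the data frame; the helper name summarise_by_line and the simulated data are illustrative only and do not come from the thread:

```r
# Simulated stand-in for the poster's table: Line plus 17 numeric columns
set.seed(1)
x <- data.frame(Line = rep(1:5, each = 5),
                matrix(runif(25 * 17), nrow = 25))
names(x)[-1] <- paste0("V", 2:18)

# Hypothetical helper: apply `fun` to every column within each Line.
# Assumes the grouping column is named Line and sits in column 1.
summarise_by_line <- function(df, fun) {
  out <- aggregate(df[, -1], by = list(Line = df$Line), FUN = fun)
  rownames(out) <- out$Line   # line names become the row names
  out$Line <- NULL            # drop the now-redundant grouping column
  out
}

med <- summarise_by_line(x, median)  # one row per Line, one column per variable
sdv <- summarise_by_line(x, sd)

# Rows can now be pulled out by line name
med[c("1", "3"), ]
```

The result is a data frame rather than a matrix, so both row-name indexing and `$` column access work on it.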




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
