Hi Umesh,
I use the plyr package for this sort of thing:
library(plyr)
daply(dataframe, .(ped), myfun)
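
For reference, the same split-apply-combine idea can be sketched in base R on toy data. This is only an illustration: `toy` and `group_stat` are hypothetical stand-ins for your data frame and myfun, and the statistic is deliberately simpler than yours.

```r
# Toy data: two ped levels, three rows each (hypothetical values).
toy <- data.frame(
  ped = c(1, 1, 1, 2, 2, 2),
  y   = c(3, 4, 5, 9, 12, 10),
  x1  = c(1, 0, 1, 1, 1, 0)
)

# Hypothetical stand-in for myfun: takes the rows of ONE ped level
# and returns a single number (mean of y among rows where x1 == 1).
group_stat <- function(d) {
  sum(d$y * d$x1, na.rm = TRUE) / sum(d$x1, na.rm = TRUE)
}

# split() cuts the data frame into one piece per ped level;
# sapply() calls group_stat on each piece and collects the results
# into a vector named by ped level.
res <- sapply(split(toy, toy$ped), group_stat)
print(res)

# The plyr call does the same job in one step, if plyr is installed.
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::daply(toy, "ped", group_stat))
}
```

Either way, the function only ever sees the rows for a single ped level, so it needs no changes as long as it reads everything from its data frame argument.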
Best,
Ista
On Tue, Mar 22, 2011 at 3:48 AM, Umesh Rosyara <rosyaraur at gmail.com> wrote:
> Dear R-experts
>
> Excuse me for asking an easy question, but I need help.
>
> For days I have been working with a large dataset where operations are
> needed within a component of the dataset. Here is my question:
>
> I have a big dataset with variables x1, ..., x1000 or so. What I need to do is
> work on 4 consecutive variables at a time to calculate a statistic and output
> it. So far so good. There are more vector operations inside the function to do
> this. My question this time is that I want to do this separately for each level
> of a factor (in the following example it is ped; thus if there are 20 ped, I
> want an output with 20 statistics, so that I can work further on them).
>
> #data generation
> ped <- c(1,1,1,1,1, 1,1,1,1,1, 2,2,2,2,2, 2,2,2,2,2) # I have 20 ped
> fd <- c(1,1,1,1,1, 2,2,2,2,2, 3,3,3,3,3, 4,4,4,4,4) # I have ~100 fd
> iid <- c(1:20) # number can go up to 2000
> mid <- c(0,0,1,1,1, 0,0,6,6,6, 0,0,11,11,11, 0,0,16,16,16)
> fid <- c(0,0,2,2,2, 0,0,7,7,7, 0,0,12,12,12, 0,0,17,17,17)
> y <- c(3,4,5,6,7, 3,4,8,9,8, 2,3,3,6,7, 9,12,10,8,12)
> x1 <- c(1,1,1,0,0, 1,0,1,1,0, 0,1,1,0,1, 1,1,1,0,0)
> x2 <- c(1,1,1,0,0, 1,0,1,1,0, 0,1,1,1,0, 1,1,0,1,0)
> x3 <- c(1,0,0,1,1, 1,1,1,1,1, 1,1,1,1,0, 1,1,0,1,0)
> x4 <- c(1,1,1,1,0, 0,1,1,0,0, 0,1,0,0,0, 0,0,1,1,1)
> # I have more X variables, potentially >1000, but I need to work four at a
> # time
> dataframe <- data.frame(ped, fd, iid, mid, fid, y, x1, x2, x3, x4)
>
> myfun <- function(dataframe) {
>   namemat <- matrix(c(1:4), nrow = 1)
>   smyfun <- function(x) {
>     x <- as.vector(x)
>     K1 <- dataframe$x1 * 0.23
>     K2 <- dataframe$x2 * 0.98
>     # just an example; there are long vector calculations in the real dataset
>     kt1 <- K1 * K2
>     kt2 <- K1 / K2
>     Qni <- (K1 * (kt1 - 0.25) + K2 * (kt2 - 0.25))
>     y <- dataframe$y
>     yg <- mean(y, na.rm = TRUE) # mean of trait Y
>     dvm <- (y - yg) # deviation of phenotypic value from mean
>     sumdvm <- abs(sum(dvm, na.rm = TRUE))
>     yQni <- y * Qni
>     sumyQni <- abs(sum(yQni, na.rm = TRUE))
>     npt <- (sumdvm / sumyQni)
>     return(npt)
>   }
>   npt1 <- apply(namemat, 1, smyfun)
>   return(npt1)
> }
>
> myfun(dataframe)
>
> My question is how can I automate the process so that the above function can
> calculate different values for n levels (>20 in my real data) of factor ped.
>
>
> Thanks in advance for the help. R-community is always helpful.
>
> Umesh R
>
--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org