I have a complicated, multi-part question. My apologies if I do not make myself clear. I am also a fairly novice R user, so forgive me if this seems rudimentary.

I want to calculate an index of colocation for whale dive data and prey distribution data. This entails:

Calculating a frequency distribution of whale dive depths BY DIVE, using the depth bins from the prey (fish and zoop) data.
For each dive, calculating the center of gravity (CG) and inertia (I).
For each dive, calculating a global index of colocation (GIC) vs. each prey type.

I want to be able to write a function (or series of functions) so that I do not have to separate my data by dive and rerun the functions for each dive manually.

Example whale data, where number is the dive number (sometimes 40+ dives), dive is the depth, and classification is the type of dive:
[IMG]http://i41.tinypic.com/33vc5rs.jpg[/IMG]

Depth bins come from a separate data set containing prey information:
[IMG]http://i43.tinypic.com/rjjy4n.jpg[/IMG]

I have the following code that works for the dive data as a whole, but I need to write a loop or use an apply function so that I can run it for each dive. All dives for a whale are contained in a single file. So, for a whale with 40 dives, I need 40 whale frequencies, 40 whale CGs, 40 whale Is, etc. The prey distributions are the SAME for each dive! Ultimately, I'd like a table which contains a list of the delta GIC values.
# bin whale dive depths
whale.cut=cut(whale,c(0,depths), right=FALSE)
whale.freq=table(whale.cut)

# compute CG
fish.CG=sum(depths*fish)/sum(fish)
whale.CG=sum(depths*whale.freq)/sum(whale.freq)
zoop.CG=sum(depths*zoop)/sum(zoop)

# compute Inertia
fish.I=sum((depths-fish.CG)^2*fish)/sum(fish)
whale.I=sum((depths-whale.CG)^2*whale.freq)/sum(whale.freq)
zoop.I=sum((depths-zoop.CG)^2*zoop)/sum(zoop)

# compute GIC as per
# compute delta CG
deltaCG.fish_whale=fish.CG-whale.CG
GIC.fish_whale=1-((deltaCG.fish_whale)^2/((deltaCG.fish_whale)^2+fish.I+whale.I))
deltaCG.zoop_whale=zoop.CG-whale.CG
GIC.zoop_whale=1-((deltaCG.zoop_whale)^2/((deltaCG.zoop_whale)^2+zoop.I+whale.I))

--
View this message in context: http://r.789695.n4.nabble.com/Calculating-an-index-of-colocation-for-a-large-dataset-tp4670084.html
Sent from the R help mailing list archive at Nabble.com.
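For reference, the quantities in the posted code can be written out as formulas. This is only a restatement of the code above, not of whichever published definition the post had in mind. The symbols are notation introduced here: d_i is the depth assigned to bin i, and w_i is the weight in that bin (prey density for fish or zoop, dive frequency for the whale).

```latex
\begin{align}
CG &= \frac{\sum_i d_i \, w_i}{\sum_i w_i} \\
I &= \frac{\sum_i (d_i - CG)^2 \, w_i}{\sum_i w_i} \\
\Delta CG &= CG_{\text{prey}} - CG_{\text{whale}} \\
GIC &= 1 - \frac{\Delta CG^2}{\Delta CG^2 + I_{\text{prey}} + I_{\text{whale}}}
\end{align}
```

Read this way, GIC equals 1 when the two centers of gravity coincide (provided the inertias are not both zero) and falls toward 0 as the separation grows relative to the two inertias. Only the whale terms change from dive to dive; the prey CG and I can be computed once.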
Adams, Jean
2013-Jun-24 14:15 UTC
[R] Calculating an index of colocation for a large dataset
Are you saying that the dive data for a single whale with 8 dives is stored in 8 separate files? How are you reading these files into R? read.table()? Could you post an example of the line of code that reads in the data?

Something like this might work for you.

# vector of file names
file.names <- c("dive1.csv", "dive2.csv", ... "diven.csv")

# read in all the files and save them into a list of data frames
dives <- lapply(file.names, read.csv)

# define a single function that does all of your computations
compute <- function(whale, depths, fish, zoop) {

    ## insert all the code in your e-mail here ##

    # then list off all the variables you want to keep as output from the function here
    list(fish.CG, whale.CG, zoop.CG, fish.I, whale.I, zoop.I, GIC.fish_whale, GIC.zoop_whale)

}

# then you can run your function on every whale dive
# not sure how to fill in the arguments to the compute() function,
# because I don't know where whale, depths, fish, and zoop come from in your code
results <- lapply(dives, function(dat) compute( ... ))

Jean

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
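The read-into-a-list pattern suggested above can be tried end to end on toy data. The sketch below is only an illustration under stated assumptions: the two CSV files, the depth column, and summarise_dive() are all hypothetical stand-ins, and the real compute() would hold the CG, inertia, and GIC code instead of the simple summaries used here.

```r
# Toy stand-ins for the per-dive files (hypothetical names and values)
file.names <- file.path(tempdir(), c("dive1.csv", "dive2.csv"))
write.csv(data.frame(depth = c(12, 18, 25)), file.names[1], row.names = FALSE)
write.csv(data.frame(depth = c(40, 55)), file.names[2], row.names = FALSE)

# read every file into a list of data frames, one element per dive
dives <- lapply(file.names, read.csv)
names(dives) <- basename(file.names)

# hypothetical per-dive summary; returning a NAMED vector makes the
# results easy to collect into a table afterwards
summarise_dive <- function(dat) {
  c(n = nrow(dat), mean.depth = mean(dat$depth), max.depth = max(dat$depth))
}

# sapply() gives one column per dive; t() turns that into one row per dive
results <- t(sapply(dives, summarise_dive))
print(results)
```

Returning a named vector rather than an unnamed list is a design choice here: it lets sapply() build a matrix directly, which is close to the table of per-dive values asked for in the original post.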
Adams, Jean
2013-Jun-25 13:13 UTC
[R] Calculating an index of colocation for a large dataset
Bree,

You should cc r-help on all correspondence so that the thread of conversation is maintained for all readers. And you shouldn't attach any files. If you want to share some data, use dput(). For example,
dput(head(whale))
dput(head(prey))

Give this code a try ... Note that I don't know what you are using for depths. You'll have to fill that in. And I'm guessing that you used the whale$number for your dive.freq. If not, you'll have to change that.

Jean

# whale data
whale.data <- read.table("Whale_402_All.txt", header=TRUE) # open whale depth distribution
dive <- whale.data$Depth_BottomPhase # name dive depth column
number <- whale.data$Dive_Number # name dive_number column
class <- whale.data$Classification # name classification column
whale <- data.frame(number, dive, class) # create whale data frame

# split the data frame into a list with a different element for each dive
dives <- split(whale, whale$number)

# prey data
prey <- read.table("Prey_402_All.txt", header=TRUE) # open prey depth distributions
fish <- prey$Fish # name fish column
zoop <- prey$Krill # name krill column

# depth data
depths <- ????

# define a single function that does all of your computations
compute <- function(whale, depths, fish, zoop) {

    # you don't say what part of the whale data you are counting ... I'll assume it's the number
    dive.freq <- table(cut(whale$number, c(0, depths)))

    # compute Center of Gravity
    fish.CG <- sum(depths*fish)/sum(fish) # calculate CG for fish distribution ONCE for each whale
    zoop.CG <- sum(depths*zoop)/sum(zoop) # calculate CG for zoop distribution ONCE for each whale
    whale.CG <- sum(depths*dive.freq)/sum(dive.freq) # calculate for EACH dive

    # compute Inertia
    fish.I <- sum((depths-fish.CG)^2*fish)/sum(fish)
    zoop.I <- sum((depths-zoop.CG)^2*zoop)/sum(zoop)
    whale.I <- sum((depths-whale.CG)^2*dive.freq)/sum(dive.freq) # needs to be calculated for EACH dive

    # compute delta CG
    deltaCG.fish_whale <- fish.CG-whale.CG
    GIC.fish_whale <- 1-((deltaCG.fish_whale)^2/((deltaCG.fish_whale)^2+fish.I+whale.I))
    deltaCG.zoop_whale <- zoop.CG-whale.CG
    GIC.zoop_whale <- 1-((deltaCG.zoop_whale)^2/((deltaCG.zoop_whale)^2+zoop.I+whale.I))

    # then list off all the variables you want to keep as output from the function here
    list(fish.CG, whale.CG, zoop.CG, fish.I, whale.I, zoop.I, GIC.fish_whale, GIC.zoop_whale)

}

results <- lapply(dives, function(dat) compute(dat, depths, fish, zoop))

On Mon, Jun 24, 2013 at 7:10 PM, Bree Witteveen <bhwitteveen@alaska.edu> wrote:

> Hi Jean,
> Thank you for the reply. I will play around with the options you suggested.
> The dive data are contained in a single file; a single whale text file can
> contain upwards of 40 dives. I want to get frequency distributions for each
> dive and then calculate the GIC as described in the original post. I am
> able to get the frequency for the entire data set, but I am in over my head
> when I try to do it on a per dive basis, save the results and calculate the
> subsequent variables.
>
> Here is how I'm reading my data in:
> whale.data<-read.table("Whale_402_All.txt", header=TRUE) #open whale depth distribution
> prey<-read.table("Prey_402_All.txt", header=TRUE) # open prey depth distributions
> dive<-whale.data$Depth_BottomPhase #name dive depth column
> number<-whale.data$Dive_Number #name dive_number column
> class<-whale.data$Classification # name classification column
> whale<-cbind(number,dive,class) #create whale dataframe
> fish<-prey$Fish # name fish column
> zoop<-prey$Krill # name krill column
>
> freq<-function(x){
>   dive.freq=table(cut(x,c(0,depths)))
>   return(dive.freq)
> }
> #function to create dive.freq
> #but can only do this for the whole data set
>
> #compute Center of Gravity
> fish.CG=sum(depths*fish)/sum(fish) #calculate CG for fish distribution ONCE for each whale
> zoop.CG=sum(depths*zoop)/sum(zoop) #calculate CG for zoop distribution ONCE for each whale
> whale.CG=sum(depths*dive.freq)/sum(dive.freq) #calculate for EACH dive
>
> #compute Inertia
> fish.I=sum((depths-fish.CG)^2*fish)/sum(fish)
> zoop.I=sum((depths-zoop.CG)^2*zoop)/sum(zoop)
> whale.I=sum((depths-whale.CG)^2*dive.freq)/sum(dive.freq) #needs to be calculated for EACH dive
>
> # compute delta CG
> deltaCG.fish_whale=fish.CG-whale.CG
> GIC.fish_whale=1-((deltaCG.fish_whale)^2/((deltaCG.fish_whale)^2+fish.I+whale.I))
>
> deltaCG.zoop_whale=zoop.CG-whale.CG
> GIC.zoop_whale=1-((deltaCG.zoop_whale)^2/((deltaCG.zoop_whale)^2+zoop.I+whale.I))
>
> I'm an R novice to be sure, and I've done quite a bit of searching on the
> web for solutions on how to apply and store frequency data based on a
> column variable, and haven't had much luck.
>
> Thank you again,
> Bree
>
> --
> Briana H. Witteveen
> Research Assistant Professor
> University of Alaska Fairbanks
> School of Fisheries and Ocean Sciences
> Marine Advisory Program
> Kodiak Seafood and Marine Science Center
> 118 Trident Way
> Kodiak, AK 90615
> 907-486-1519
> http://seagrant.uaf.edu/map/gap/index.php
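Putting the pieces of this thread together, a minimal self-contained sketch of the per-dive calculation might look like the following. Everything about the data is an assumption made for illustration: the toy values, the use of depths as both the upper edge and the representative depth of each bin (mirroring c(0, depths) in the original post), and the choice to bin the depth column (dive) rather than the dive number, since the original question asks for a frequency distribution of dive depths. The helper names cg(), inertia(), and gic() are hypothetical.

```r
# Toy data (hypothetical values; column names follow the thread)
whale <- data.frame(
  number = c(1, 1, 1, 2, 2, 2, 2),       # dive number
  dive   = c(5, 15, 15, 25, 35, 35, 35)  # depth of each record
)
depths <- c(10, 20, 30, 40)  # assumed: upper edge of each prey depth bin
fish   <- c(1, 4, 3, 2)      # prey density per bin (same for every dive)
zoop   <- c(0, 1, 5, 4)

# weighted mean depth (center of gravity) and weighted variance (inertia)
cg      <- function(d, w) sum(d * w) / sum(w)
inertia <- function(d, w) sum((d - cg(d, w))^2 * w) / sum(w)

# global index of colocation between two distributions over the same bins
gic <- function(d, w1, w2) {
  delta <- cg(d, w1) - cg(d, w2)
  1 - delta^2 / (delta^2 + inertia(d, w1) + inertia(d, w2))
}

# bin the depths of ONE dive, then compare with each prey type
compute <- function(dat, depths, fish, zoop) {
  freq <- as.vector(table(cut(dat$dive, c(0, depths), right = FALSE)))
  c(whale.CG = cg(depths, freq),
    whale.I  = inertia(depths, freq),
    GIC.fish = gic(depths, fish, freq),
    GIC.zoop = gic(depths, zoop, freq))
}

# one list element per dive, then one table row per dive
dives   <- split(whale, whale$number)
results <- t(sapply(dives, compute, depths = depths, fish = fish, zoop = zoop))
print(round(results, 3))
```

The row names of results are the dive numbers, so a whale with 40 dives would give a 40-row table. Note that with right = FALSE a record at exactly the deepest break falls outside the last bin and is dropped by table(), so real data may need a deeper final break.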