Johannes Radinger
2014-Mar-26 16:09 UTC
dataframe calculations based on certain values of a column
Hi,

I have data in a dataframe with the following structure:

var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1,var2,var3)

Now I'd like to calculate relative values of var3. These values should be relative to the base value (the row where var1 is "c"), which is given for each group (var2).

To illustrate what my result column should look like: I divide the column var3 by the vector c(2,2,2,6,6,6,4,4,4), i.e. for each group of var2, the value of var3 in the "c" row.

Of course this can also be done like this:

df$div <- rep(df$var3[df$var1=="c"],each=length(unique(df$var1)))
df$result_calc <- df$var3/df$div

However, what if the dataframe is not as simple and not as well ordered as in this example? For example, there is always a "c" row for each group, but all the "c"s are clumped in the last rows of the dataframe or scattered in a random manner.

Is there a simple way to still calculate such relative values? Probably with an approach using apply, but maybe someone can give me a hint. Or do I need to sort my dataframe in order to do such calculations?

best,
/Johannes
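
One possible way to do this without sorting is sketched below. It assumes every group in var2 has exactly one row with var1 equal to "c", and it uses match() to look up each row's base value by group. The helper name `base` and the shuffling step are illustrative additions, not part of the original post.

```r
# Example data from the post
var1 <- c("a","b","c","a","b","c","a","b","c")
var2 <- c("X","X","X","Y","Y","Y","Z","Z","Z")
var3 <- c(1,2,2,5,2,6,7,4,4)
df <- data.frame(var1, var2, var3)

# Shuffle the rows to mimic an unordered dataframe (illustration only)
set.seed(1)
df <- df[sample(nrow(df)), ]

# Base rows: assumed to be exactly one row per group, where var1 == "c"
base <- df[df$var1 == "c", ]

# For each row, find its group in `base` and take that group's var3
df$div <- base$var3[match(df$var2, base$var2)]

# Relative value: var3 divided by the base value of the same group
df$result_calc <- df$var3 / df$div
```

Because match() looks up the base value by group label rather than by position, the row order does not matter. merge() on var2 would be another option, though by default it sorts the result by the key column.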