Hello R Community,

I have the following design question. I have a data set that looks like this (shortened for the sake of example).

Gender Age
M      70
F      65
M      70

Each row represents a person with an age/gender combination. We could put this data into a data frame.

Now, I would like to do some actuarial analysis on this data set. To do so, I need to create and store a mortality curve for each person in the table (a mortality curve is a matrix with 2 columns: date and survival probability). I can write a function that returns a mortality curve given gender and age.

The question is the following: In what data format should I store all these mortality curve objects? Should I add a column to the data frame and each entry in that column is a matrix (a mortality curve)? This way, the mortality curve would be stored next to age/gender data in the data frame. However, I read in several places that putting vectors/matrices as elements of a data frame is a bad idea. I do not know why. What is a good design choice in this instance please? How should I store the mortality curves?

Thank you for your help.
Use a list, or create a new class that is a list.

On Jun 16, 2012 8:52 AM, "Onur Uncu" <onuruncu@gmail.com> wrote:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
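The second suggestion, a class that is a list, could look roughly like the sketch below. It assumes a hypothetical S3 class called `mortCurves`; the constructor `new_mortCurves()`, the print method, and the survival numbers are all made up for illustration and do not come from any package.

```r
# Hypothetical constructor: wraps a named list of matrices in an S3 class.
new_mortCurves <- function(curves) {
  stopifnot(is.list(curves),
            all(vapply(curves, is.matrix, logical(1))))
  structure(curves, class = c("mortCurves", "list"))
}

# Print a short summary instead of dumping every matrix.
print.mortCurves <- function(x, ...) {
  cat("<mortCurves> with", length(x), "curve(s):",
      paste(names(x), collapse = ", "), "\n")
  invisible(x)
}

# Placeholder curves (made-up numbers, for illustration only).
curves <- new_mortCurves(list(
  M.70 = cbind(year = 2012:2014, surv = c(1.00, 0.97, 0.94)),
  F.65 = cbind(year = 2012:2014, surv = c(1.00, 0.99, 0.98))
))

curves            # dispatches to print.mortCurves
curves[["M.70"]]  # ordinary list indexing still works
```

Because the object is still a list underneath, `lapply()`, `names()` and `[[` behave as usual; the class only adds a place to hang methods such as `print` or `plot`.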
Hello,

Follow this example. It uses a list to hold the mortality curves. Since there are only two different gender/age combinations, it first gets all such unique combinations and then creates a list of the appropriate length. It then assigns a matrix to the first list element.

DF <- read.table(text="
Gender Age
M 70
F 65
M 70
", header=TRUE)

# Get the unique gender & age combinations.
# paste() on the columns avoids the padding that apply() can introduce
# when it converts the data frame to a character matrix.
nms <- unique(paste(DF$Gender, DF$Age, sep="."))
n <- length(nms)

# Create a list:
# lists are meant to hold any type of related objects
mort.curve <- vector("list", n)
names(mort.curve) <- nms

# Assign a (placeholder) value to its 1st element
mort.curve[[ 1 ]] <- matrix(1:12, nrow=4)

mort.curve$M.70          # see it
mort.curve[[ "M.70" ]]   # the same
mort.curve[[ nms[1] ]]   # the same

Alternatively, if you want each data.frame row to correspond to its own list element, the list would be vector("list", nrow(DF)).

Anyway, lists are very flexible, and the premier choice for that sort of problem.

Hope this helps,

Rui Barradas

On 16-06-2012 16:50, Onur Uncu wrote:
> [...]
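To fill the whole list rather than only its first element, one option is to map a curve-building function over the unique combinations and then look up each row's curve by name. The sketch below assumes a hypothetical `make_curve(gender, age)` standing in for the asker's own function; its hazard rates and horizon are invented purely so that the code runs.

```r
DF <- data.frame(Gender = c("M", "F", "M"), Age = c(70, 65, 70),
                 stringsAsFactors = FALSE)

# Hypothetical stand-in for the real mortality function:
# returns a 2-column matrix (year offset, survival probability).
make_curve <- function(gender, age, horizon = 5) {
  rate <- if (gender == "M") 0.03 else 0.02   # made-up yearly hazard
  t <- 0:horizon
  cbind(year = t, surv = exp(-rate * t))
}

# One curve per unique gender/age combination
combos <- unique(DF)
mort.curve <- Map(make_curve, combos$Gender, combos$Age)
names(mort.curve) <- paste(combos$Gender, combos$Age, sep = ".")

# Look up the curve belonging to each row of DF
key <- paste(DF$Gender, DF$Age, sep = ".")
row.curves <- mort.curve[key]   # list with one element per row of DF

# Optionally keep the curves beside the data as a list column
DF$curve <- row.curves
```

Keeping `mort.curve` keyed by combination stores each distinct curve once, while `row.curves` (or the list column) gives the one-curve-per-person view from the original question. A list column works for lookups like `DF$curve[[2]]`, though not every function that takes a data frame handles such columns gracefully, which may be the reason for the advice the asker had read.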