Dear all,

I'm just learning R, but unfortunately I urgently need to do a rather more complex task, so I need some help. I learnt the very basics only a few days ago and am not yet ready to deal with panels and kernel densities, so some gentle guidance would be much appreciated.

I have a (very) large panel data set (400,000 individuals x 50 time periods) and need to display the evolution of the kernel density of some of the variables over time. I can do slices in Stata - e.g. the distribution in 1950 and in 1990 - but they are difficult to compare because of the different scales used. So I would like a nice 3D graph that shows how the distribution evolved over time, and I am hoping that R can do it.

I'm very grateful for any suggestions,
Eugene
On Saturday 23 August 2003 10:36, Eugene Salinas wrote:
> [...] I have a (very) large panel data set (400,000 individuals x 50 time
> periods) and need to display the evolution of the kernel density of some
> of the variables over time. [...] So I would like a nice 3D graph that
> shows how the distribution evolved over time, and I am hoping that R can
> do it.

As a first step, you could create a matrix (with 50 rows, one for each time point) where each row holds the kernel density estimate for that time point, e.g. (with a grid of size 100 for each estimated density):

    foo <- matrix(0, 50, 100)
    for (i in 1:50)
        foo[i, ] <- density(rnorm(5000),        # put your variable here
                            from = -4, to = 4,  # put appropriate ranges here
                            n = 100)$y

Whether a 3D view of this will be very informative will depend on your data (maybe you could play with the density() parameters), but persp() should give you something:

    persp(foo, theta = 135, phi = 30, scale = FALSE, ltheta = -120,
          shade = 0.75, border = NA)

HTH,
Deepayan
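[A minimal sketch of how this could be applied to actual panel data rather than simulated draws: split the variable of interest by time period, estimate each density over a common range so the 50 slices are directly comparable, and hand the result to persp(). The data frame `panel` and its columns `year` and `income` are made-up names standing in for the real data.]

    ## simulated stand-in for the real panel data
    panel <- data.frame(year   = rep(1950:1999, each = 1000),
                        income = rnorm(50000, mean = rep(1:50, each = 1000)))

    byyear <- split(panel$income, panel$year)     # one vector per time period
    rng    <- range(panel$income, na.rm = TRUE)   # common range for all densities

    foo <- t(sapply(byyear, function(x)
        density(x, from = rng[1], to = rng[2], n = 100)$y))

    persp(x = as.numeric(names(byyear)),          # time on one axis
          y = seq(rng[1], rng[2], length.out = 100),
          z = foo, theta = 135, phi = 30,
          ltheta = -120, shade = 0.75, border = NA,
          xlab = "year", ylab = "income", zlab = "density")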
As a possible enhancement, I would think about using the same bandwidth at all the time points. Indeed, I would probably start by looking at a few time points, playing with the bandwidth, and then use e.g. persp() on density estimates at all 50 time points with that bandwidth.

> As a first step, you could create a matrix (with 50 rows, one for each
> time point) where each row holds the kernel density estimate for that
> time point, e.g. (with a grid of size 100 for each estimated density):
>
>     foo <- matrix(0, 50, 100)
>     for (i in 1:50)
>         foo[i, ] <- density(rnorm(5000),        # put your variable here
>                             from = -4, to = 4,  # put appropriate ranges here
>                             n = 100)$y

-- 
Brian D. Ripley                      ripley at stats.ox.ac.uk
Professor of Applied Statistics      http://www.stats.ox.ac.uk/~ripley/
University of Oxford                 Tel: +44 1865 272861 (self)
1 South Parks Road                        +44 1865 272866 (PA)
Oxford OX1 3TG, UK                   Fax: +44 1865 272595
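[One way to follow this suggestion, sketched against the hypothetical simulated `panel` data above: pick a bandwidth from a few representative years (here via the rule-of-thumb bw.nrd0(), though one could equally settle on a value by eye after plotting a few years) and pass it explicitly through the `bw` argument of density() for every time period.]

    ## reusing the simulated 'panel' data frame from the sketch above
    byyear <- split(panel$income, panel$year)
    rng    <- range(panel$income, na.rm = TRUE)

    ## choose a bandwidth by inspecting a few representative years
    bw.cand <- sapply(byyear[c(1, 25, 50)], bw.nrd0)
    bw.use  <- mean(bw.cand)                      # or pick one after plotting those years

    ## re-estimate all 50 densities with the common bandwidth and common range
    foo <- t(sapply(byyear, function(x)
        density(x, bw = bw.use, from = rng[1], to = rng[2], n = 100)$y))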