Hello all,

I would like to reshape the following dataset to wide format:

> rl <- read.dta("intermedi/rapporti_lavoro.dta")[c("id_rl","prog","sil_pi","sil_cf","sil_dat_avv")]
> dim(rl)
[1] 12964     5
> object.size(rl)
1194728 bytes
> head(rl)
  id_rl prog      sil_pi           sil_cf sil_dat_avv
1   638    1 04567XXXXXX NLMDRE64A5XXXXXX  2000-08-03
2  1033    1 54872XXXXXX FLGOIP66A3XXXXXX  2000-11-28
3  1043    2 56849XXXXXX QPWOER52E2XXXXXX  2000-07-07
4  1508    2 54982XXXXXX FJKLSD67P4XXXXXX  2000-12-12
5  1532    2 56849XXXXXX QWERTG50T0XXXXXX  2000-03-30
6  3283    1 12345XXXXXX POIQWE74H0XXXXXX  1999-12-31

sil_cf and sil_pi are the idvar (they are sensitive data, too); prog is the timevar (the dataset is currently not sorted).

> sapply(rl, class)
      id_rl        prog      sil_pi      sil_cf sil_dat_avv
  "integer"   "integer" "character" "character"      "Date"
> apply(rl, 2, function(x) sum(duplicated(x)))
      id_rl        prog      sil_pi      sil_cf sil_dat_avv
          0       12863        6957        9886       10539
> range(rl$prog)
[1]   1 101
> table(cut(rl$prog,5))

 (0.9,20.9]   (20.9,41]     (41,61]   (61,81.1]  (81.1,101]
      12784          75          42          40          23

So I wrote

rl.wide <- reshape(rl, idvar=c("sil_cf","sil_pi"), timevar="prog", direction="wide")

but after a very long time I got something like "Error: evaluation nested too deeply: infinite recursion".

Any suggestions on how to perform that reshape?

Many thanks

Luca

You might have a look at the reshape _package_, which I generally find easier to use than the reshape function.

Kevin Wright

On Mon, Sep 14, 2009 at 9:41 AM, Luca Braglia <braglia@poleis.eu> wrote:
> So I wrote
>
> rl.wide <- reshape(rl, idvar=c("sil_cf","sil_pi"), timevar="prog",
> direction="wide")
>
> but after a very long time I got something like "Error: evaluation nested too
> deeply: infinite recursion".
>
> Any suggestions on how to perform that reshape?
>
> ______________________________________________
> R-help@r-project.org mailing list
> stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Some hints:

The reshape works on the small sample of data provided. The result looks like this:

> reshape(rl, idvar=c("sil_cf","sil_pi"), timevar="prog", direction="wide")
       sil_pi           sil_cf id_rl.1 sil_dat_avv.1 id_rl.2 sil_dat_avv.2
1 04567XXXXXX NLMDRE64A5XXXXXX     638    2000-08-03      NA          <NA>
2 54872XXXXXX FLGOIP66A3XXXXXX    1033    2000-11-28      NA          <NA>
3 56849XXXXXX QPWOER52E2XXXXXX      NA          <NA>    1043    2000-07-07
4 54982XXXXXX FJKLSD67P4XXXXXX      NA          <NA>    1508    2000-12-12
5 56849XXXXXX QWERTG50T0XXXXXX      NA          <NA>    1532    2000-03-30
6 12345XXXXXX POIQWE74H0XXXXXX    3283    1999-12-31      NA          <NA>

How many distinct values do you have in prog? You will end up with about 2 + 2 * (number of distinct prog values) columns. Is this your intention?

Also, maybe you wanted to drop the id_rl column:

> reshape(rl, idvar=c("sil_cf","sil_pi"), timevar="prog", direction="wide", drop = "id_rl")
       sil_pi           sil_cf sil_dat_avv.1 sil_dat_avv.2
1 04567XXXXXX NLMDRE64A5XXXXXX    2000-08-03          <NA>
2 54872XXXXXX FLGOIP66A3XXXXXX    2000-11-28          <NA>
3 56849XXXXXX QPWOER52E2XXXXXX          <NA>    2000-07-07
4 54982XXXXXX FJKLSD67P4XXXXXX          <NA>    2000-12-12
5 56849XXXXXX QWERTG50T0XXXXXX          <NA>    2000-03-30
6 12345XXXXXX POIQWE74H0XXXXXX    1999-12-31          <NA>

HTH
Schalk Heunis
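
The checks hinted at above can be sketched roughly as follows. This uses only the six rows shown by head(rl), typed in by hand, so it says nothing about how the full 12964-row data will behave. The idea is to count the distinct prog values to predict how wide the result will be, confirm that each sil_cf/sil_pi/prog combination occurs only once, and then reshape with id_rl dropped. The helper names id_cols, n_prog, expected_cols and n_dup are made up for this illustration.

```r
# Toy data copied from the head(rl) output in the thread (NOT the full dataset)
rl <- data.frame(
  id_rl  = c(638L, 1033L, 1043L, 1508L, 1532L, 3283L),
  prog   = c(1L, 1L, 2L, 2L, 2L, 1L),
  sil_pi = c("04567XXXXXX", "54872XXXXXX", "56849XXXXXX",
             "54982XXXXXX", "56849XXXXXX", "12345XXXXXX"),
  sil_cf = c("NLMDRE64A5XXXXXX", "FLGOIP66A3XXXXXX", "QPWOER52E2XXXXXX",
             "FJKLSD67P4XXXXXX", "QWERTG50T0XXXXXX", "POIQWE74H0XXXXXX"),
  sil_dat_avv = as.Date(c("2000-08-03", "2000-11-28", "2000-07-07",
                          "2000-12-12", "2000-03-30", "1999-12-31")),
  stringsAsFactors = FALSE
)

id_cols <- c("sil_cf", "sil_pi")   # hypothetical helper name

# The number of distinct time points determines the width of the result.
n_prog <- length(unique(rl$prog))

# With id_rl dropped, only sil_dat_avv varies over time:
# id columns + one column per distinct prog value.
expected_cols <- length(id_cols) + n_prog

# Each id/time combination should occur exactly once; count any repeats.
n_dup <- sum(duplicated(rl[c(id_cols, "prog")]))

rl.wide <- reshape(rl, idvar = id_cols, timevar = "prog",
                   direction = "wide", drop = "id_rl")

dim(rl.wide)
```

On the real data, a large n_prog or a non-zero n_dup would be worth looking at before calling reshape().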