Paul Johnson
2002-Feb-12 06:37 UTC
[R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
Simple usage questions that I ought to be able to figure out on my own, but can't.

1. I'm able to produce a cross tabulation table showing counts with either table or xtabs. But I want column percentages for interpretation, and it seems stupid to sit there with a calculator figuring marginals and column percentages. How to make R do it after this:

> x <- c(1,3,1,3,1,3,1,3,4,4)
> y <- c(2,4,1,4,2,4,1,4,2,4)
> hmm <- table(x,y)
> hmm
   y
x   1 2 4
  1 2 2 0
  3 0 0 4
  4 0 1 1

# I can get the column sums:
> tots <- apply(hmm,2,sum)

and I can get the total N of counts, but don't understand how to make it calculate column percents, as in

   y
x     1     2   4
  1 100 66.67   0
  3   0     0  80
  4   0 33.33  20

Pointers appreciated.

2. A student said here's y, a vector representing a time series, and here's x, a vector representing a time series. I want to do a conventional regression of y on the lag of x. In SAS you do xlag=lag(x) and then use xlag in a regression. I just want something simple like lm(y~lag(x)). But in R base there's no lag.

So I can get it the old fashioned way:

> xx <- c(NA,x)
> modl <- lm(y~xx[1:length(y)])
> summary(modl)

One sidenote is that summary does not include any mention of the fact that 1 observation was lost due to missing value. That seems bad to me.

I see the lag function in ts, but when I use it, it doesn't change x, so obviously I don't see the point of that.

> z <- lag (x)
> z
 [1] 1 3 1 3 1 3 1 3 4 4

So long, thanks in advance, greetings, etc...

-- 
Paul E. Johnson                       email: pauljohn at ukans.edu
Dept. of Political Science            http://lark.cc.ku.edu/~pauljohn
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66045                FAX: (785) 864-5700
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Peter Dalgaard BSA
2002-Feb-12 07:14 UTC
[R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
Paul Johnson <pauljohn at ku.edu> writes:

>    y
> x     1     2   4
>   1 100 66.67   0
>   3   0     0  80
>   4   0 33.33  20
>
> Pointers appreciated.

prop.table(hmm,2) * 100

> 2. A student said here's y, a vector representing a time series, and
> here's x, a vector representing a time series. I want to do a
> conventional regression of y on the lag of x. In SAS you do
> xlag=lag(x) and then use xlag in a regression. I just want something
> simple like lm(y~lag(x)). But in R base there's no lag.
>
> So I can get it the old fashioned way:
> > xx <- c(NA,x)
> > modl <- lm(y~xx[1:length(y)])
> > summary(modl)
>
> One sidenote is that summary does not include any mention of the fact
> that 1 observation was lost due to missing value. That seems bad to me.
>
> I see the lag function in ts, but when I use it, it doesn't change x,
> so obviously I don't see the point of that.
>
> > z <- lag (x)
> > z
>  [1] 1 3 1 3 1 3 1 3 4 4

This stuff only works for time series, e.g. cbind(ts(x),lag(ts(x))). Notice that even then, lag() works in the opposite direction (for historical reasons).

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
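A minimal sketch of the "opposite direction" point, using the x from the original question. The names xt, lead1, lag1 and both are only illustrative, and stats::lag is spelled out in case another attached package masks lag. With the default k = 1 the series is moved one step earlier in time (a lead); a negative k gives the SAS-style lag.

```r
x  <- c(1, 3, 1, 3, 1, 3, 1, 3, 4, 4)
xt <- ts(x)                       # time index runs 1..10

lead1 <- stats::lag(xt)           # default k = 1: time index becomes 0..9
lag1  <- stats::lag(xt, -1)       # k = -1: time index becomes 2..11

# The data values are untouched; only the time base moves,
# which is why printing lag(x) looked like nothing had happened.
identical(as.vector(lead1), x)    # TRUE
tsp(lead1)                        # 0 9 1   (start, end, frequency)
tsp(lag1)                         # 2 11 1

# Binding the series on a common time axis makes the shift visible.
both <- cbind(xt, lead1, lag1)
both
```

In the printed result, the row for time 2 holds x[2] in the first column, x[3] in the lead1 column and x[1] in the lag1 column.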
stecalza@tiscali.it
2002-Feb-12 09:03 UTC
[R] RE: [R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
Try with

prop.table(hmm,#)

put # = 1 for row percentages, # = 2 for column ones.

Stefano
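A short sketch of the two margins, assuming the x and y from the original question. Note that prop.table() returns proportions between 0 and 1, so the multiplication by 100 is still needed to get percentages. The names row_pct and col_pct are only illustrative.

```r
x <- c(1, 3, 1, 3, 1, 3, 1, 3, 4, 4)
y <- c(2, 4, 1, 4, 2, 4, 1, 4, 2, 4)
hmm <- table(x, y)

row_pct <- prop.table(hmm, 1) * 100   # margin 1: each row sums to 100
col_pct <- prop.table(hmm, 2) * 100   # margin 2: each column sums to 100

# Rounded to two decimals, this is the table asked for in the question.
round(col_pct, 2)
```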
Joerg Maeder
2002-Feb-12 09:34 UTC
[R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
Hello Paul,

for the percentages use this code (the trick is to use transpose twice):

t(t(hmm)/apply(hmm,2,sum))*100

-- 
Joerg Maeder             .:|:||:..:.||.::  maeder at atmos.umnw.ethz.ch
Tel: +41 1 633 36 25     .:|:||:..:.||.::  http://www.iac.ethz.ch/staff/maeder
PhD student at INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE (IACETH)
ETH ZÜRICH Switzerland
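A sketch of why the double transpose is needed, assuming the x and y from the original question. R recycles a vector down the columns of a matrix, so dividing the table directly by the column totals would divide row i by total i. Transposing first turns the columns into rows, and transposing back restores the layout. sweep() is shown as one alternative that names the margin explicitly. The names naive, via_transpose, via_sweep and via_prop are only illustrative.

```r
x <- c(1, 3, 1, 3, 1, 3, 1, 3, 4, 4)
y <- c(2, 4, 1, 4, 2, 4, 1, 4, 2, 4)
hmm  <- table(x, y)
tots <- apply(hmm, 2, sum)                 # column totals

naive         <- hmm / tots * 100          # wrong here: row i is divided by tots[i]
via_transpose <- t(t(hmm) / tots) * 100    # the double-transpose trick
via_sweep     <- sweep(hmm, 2, tots, "/") * 100
via_prop      <- prop.table(hmm, 2) * 100

# The three column-wise versions agree; the naive one does not.
all.equal(as.vector(via_transpose), as.vector(via_prop))
all.equal(as.vector(via_sweep), as.vector(via_prop))
```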
Jason Turner
2002-Feb-12 09:38 UTC
[R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
Long one - sorry in advance.

On Tue, Feb 12, 2002 at 12:37:55AM -0600, Paul Johnson wrote:
> 1.I'm able to produce a cross tabulation table showing counts with
> either table or xtabs. But I want column percentages for
...

apply(hmm,2,function(zz){zz/sum(zz)}) * 100

> 2. A student said here's y, a vector representing a time series, and
> here's x, a vector representing a time series. I want to do a
> conventional regression of y on the lag of x. In sas you do xlag=lag(x)
...
> I see the lag function in ts, but when I use it, it doesn't change x, so
> obviously I don't see the point of that.

In R, lag(x) shifts the "start" attribute of the time series, but doesn't alter the data per se. This is why you get the same before and after picture, when you just look at it as a vector.

To see the effect of lagging data, it only makes sense to compare one lagged time series with another time series, like so... (using x and y above)

> library(ts)
> yt<-ts(y)
> xt<-ts(x)
> lagxt<-lag(xt)
> x1<-ts.union(yt,xt,lagxt)
> x1
Time Series:
Start = 0
End = 10
Frequency = 1
   yt xt lagxt
 0 NA NA     1
 1  2  1     3
 2  4  3     1
 3  1  1     3
 4  4  3     1
 5  2  1     3
 6  4  3     1
 7  1  1     3
 8  4  3     4
 9  2  4     4
10  4  4    NA

Note the "start" value of x1.

If you want to use lm and friends, you must make them "plain vanilla" vectors - the time series part confuses lm. One way...

> d1<-data.frame(x1)
> r1<-lm(yt ~ lagxt, data=d1)

...
> One sidenote is that summary does not include any mention of the fact
> that 1 observation was lost due to missing value. That seems bad to me.
...

It does

> summary(r1)
..[lots of stuff removed]...
Residual standard error: 1.171 on 7 degrees of freedom
Multiple R-Squared: 0.3143,     Adjusted R-squared: 0.2163
F-statistic: 3.208 on 1 and 7 DF,  p-value: 0.1164
                      ^^^^^^^^^^

though I'll be the first to admit it doesn't jump up and bite you ;)

Cheers

Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz
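A hedged sketch of the same workflow with the shift going the other way, assuming the goal is the SAS-style regression of y[t] on x[t-1]. As the ts.union() output above shows, lag(xt) with the default k = 1 lines x[t+1] up with y[t], so a negative k is used here. In current versions of R these functions live in the stats package, so no library(ts) call is needed. The names xlag, d, fit and manual are only illustrative.

```r
x <- c(1, 3, 1, 3, 1, 3, 1, 3, 4, 4)
y <- c(2, 4, 1, 4, 2, 4, 1, 4, 2, 4)

yt   <- ts(y)
xt   <- ts(x)
xlag <- stats::lag(xt, -1)                 # x[t-1] now sits at time t

# ts.union() pads with NA where a series has no value at a given time.
d   <- as.data.frame(ts.union(yt, xlag))
fit <- lm(yt ~ xlag, data = d)             # incomplete rows are dropped by default

nobs(fit)                                  # observations actually used
length(fit$na.action)                      # rows dropped because of NA

# Same coefficients as the "old fashioned" shift done by hand.
manual <- lm(y[-1] ~ x[-length(x)])
all.equal(unname(coef(fit)), unname(coef(manual)))
```

Two rows are dropped here rather than one, because ts.union() also keeps the final time point where the lagged series has a value but y does not.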
Marc R. Feldesman
2002-Feb-12 10:38 UTC
[R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
At 11:14 PM 2/11/02, Peter Dalgaard BSA wrote:
 >Paul Johnson <pauljohn at ku.edu> writes:
 >
 >>    y
 >> x     1     2   4
 >>   1 100 66.67   0
 >>   3   0     0  80
 >>   4   0 33.33  20
 >>
 >> Pointers appreciated.
 >
 >prop.table(hmm,2) * 100
 >

Would it be possible to put a cross reference to this function (and margin.table) into the help file for table()? They are handy functions whose existence isn't obvious.
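For reference, a small sketch of margin.table() on the table from the original question. It returns the totals that were computed there with apply(hmm, 2, sum). addmargins(), also shown, appends the totals to the table itself. The names row_tot, col_tot and n are only illustrative.

```r
x <- c(1, 3, 1, 3, 1, 3, 1, 3, 4, 4)
y <- c(2, 4, 1, 4, 2, 4, 1, 4, 2, 4)
hmm <- table(x, y)

row_tot <- margin.table(hmm, 1)   # totals for each level of x
col_tot <- margin.table(hmm, 2)   # totals for each level of y
n       <- margin.table(hmm)      # grand total of counts

col_tot
addmargins(hmm)                   # counts with a "Sum" row and column added
```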
Jason Turner
2002-Feb-12 11:58 UTC
[R] A couple of little R things I can't figure out (column percents, regression with lagged variables)
After the good observations on "prop.table" as an answer to Paul Johnson's first problem, and comparing it to my much less elegant solution....

----- Forwarded message from Jason Turner <jasont at indigoindustrial.co.nz> -----

Date: Tue, 12 Feb 2002 22:38:12 +1300
From: Jason Turner <jasont at indigoindustrial.co.nz>
To: Paul Johnson <pauljohn at ku.edu>
Subject: Re: [R] A couple of little R things I can't figure out (column percents, regression with lagged variables)

...
apply(hmm,2,function(zz){zz/sum(zz)}) * 100
...

----- End forwarded message -----

It's my new invention. I call it "The Wheel". Off to the patent office ;)

Jason
who can see that the existing wheel is much better than his, and thinks R is very, very good. Take a bow, R-core.
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz