Greetings. I'm teaching linear regression this semester, and that means I'm writing more functions for my regression support package "rockchalk". I'm at a point now where some fresh eyes would help, so if you are a student in a course on regression, please consider looking over my package overview document here: http://pj.freefaculty.org/R/rockchalk.pdf That document tells how you can grab the test version, which is now at 1.7. I'm leaving rockchalk on CRAN at 1.6.3, but as the document linked above explains, you can download the test version from our local repository called "kran". I have been making new package builds every 3 days or so. If you are a github user, you can clone a copy of the source if you like (http://

The functions that have gotten the biggest workover are predictOMatic, newdata, plotSlopes, plotCurves, and testSlopes. If you just install the package and run those examples, you will be able to tell right away whether you are interested in adapting them to your needs. Generally speaking, I watch the students each semester to see which R things frustrate them the most, and then I try to automate them. That's how the regression table function (outreg) started, and difficulties in plotting predicted values for various kinds of regression drive most of the rest. The rockchalk vignette explains all of that.

If you are interested in that flavor of R, or know other people who might be, spread the word: we are offering a one-week summer course at the University of Kansas. There is a discount for enrollment before the end of the month. The R course is part of our larger Summer Statistical Institute, which has been growing rapidly for the past decade. We've had very popular courses on structural equation modeling and hierarchical models.

Here's the more formal announcement. Stats Camp 2013 registration is now in full swing. Last year we had over 300 participants attend 11 different courses. This coming June we have 15 different courses on offer. Please visit our web pages at http://crmda.ku.edu/statscamp for information on the courses, a brief syllabus for each, and registration information.

pj

--
Paul E. Johnson
Professor, Political Science          Assoc. Director
1541 Lilac Lane, Room 504             Center for Research Methods
University of Kansas                  University of Kansas
http://pj.freefaculty.org             http://quant.ku.edu
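[For anyone who wants to try the functions mentioned above, here is a minimal sketch. The package and function names (rockchalk, predictOMatic, plotSlopes, outreg) are taken from the message; the toy data, model formula, and specific argument values are illustrative assumptions, so check ?plotSlopes, ?predictOMatic, and the vignette for the authoritative usage.]

## Minimal sketch of trying the rockchalk functions named above. The toy data
## and model are illustrative, not taken from the package documentation.
install.packages("rockchalk")   # CRAN release (1.6.3 at the time of the post)
library(rockchalk)

## Run the examples that ship with the reworked functions:
example(plotSlopes)
example(predictOMatic)

## Or fit a toy interaction model and look at predicted values directly:
set.seed(42)
dat <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
dat$y <- 1 + 0.5 * dat$x1 - 0.3 * dat$x2 + 0.2 * dat$x1 * dat$x2 + rnorm(100)
m1 <- lm(y ~ x1 * x2, data = dat)

predictOMatic(m1)                            # predictions at automatically chosen focal values
plotSlopes(m1, plotx = "x1", modx = "x2")    # predicted-value lines at several values of x2
outreg(m1)                                   # regression table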
Hi Paul,

I skimmed over the pdf. I have comments on the discussion about centering. I'm from a completely different field (chemometrics). Of course, I also have to explain centering. However, the argumentation I use is somewhat different from the one you give in your pdf.

One argument I have in favour of (mean) centering is numerical stability, depending on the algorithm of course. I generally recommend that if data is centered, there should be an argument for why the *chosen* center is *meaningful*, emphasizing that centering actually involves decisions, and that the center can have a meaning. While I agree that a centered model whose center was chosen without any thought about its meaning is "exactly the same in every important way" as the uncentered one, I disagree with the generality of your claim. A "natural" center of the data may exist, and in that case, using this appropriate center will ease the interpretation.

Examples:

- In analytical chemistry / chemometrics, we can often use blanks (samples without analyte) as the coordinate origin. Centering to the blank removes the influence of some parts of the instrumentation, like sample holders, cuvettes, etc.
- Many of our samples (sample in the sense of physical specimen) have a so-called matrix (a common composition/substance in which the various other substances/things are observed), or are measured in a solvent.
- I also work with biological specimens. There we often have controls (either control specimens/patients or, for example, normal tissue [vs. diseased tissue]), which are another type of "natural" coordinate origin.
- I can even imagine problems where mean centering is meaningful: if the problem involves modeling properties that are deviations from a mean (I'm thinking of process analytics). However, mean centering will always need careful attention to the sampling procedure.

Looking from the opposite point of view, some problems of *mean* centering become apparent. If the data come from different groups, the mean may not be meaningful (I once heard a biologist argue that the average human has one ovary and one testicle - this wakes up your audience and usually convinces immediately). And the mean may be influenced by the different proportions of the groups in your data, which is what you do *not* want: what you want is a stable center.

Best,

Claudia

--
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.beleites at ipht-jena.de
phone: +49 3641 206-133
fax: +49 2641 206-399
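[To see the point about choosing the center in code, here is a minimal R sketch with simulated data: any shift of the predictor leaves the slope and fitted values unchanged, but the intercept answers a different question depending on whether the center is the sample mean or a substantively chosen reference. The predictor name `conc`, the reference value `ref`, and the simulated numbers are illustrative assumptions, not taken from the thread.]

## Minimal sketch with simulated data: centering at the sample mean vs. at a
## chosen reference value (e.g. a blank or control). Names and numbers are
## illustrative only.
set.seed(1)
dat <- data.frame(conc = runif(50, 0, 10))
dat$signal <- 2 + 0.8 * dat$conc + rnorm(50, sd = 0.5)

ref <- 2   # hypothetical "natural" origin, e.g. a control or blank level

m_raw  <- lm(signal ~ conc, data = dat)
m_mean <- lm(signal ~ I(conc - mean(conc)), data = dat)  # intercept = fit at the sample mean
m_ref  <- lm(signal ~ I(conc - ref), data = dat)         # intercept = fit at the chosen reference

## Identical slopes and fitted values; only the intercept's meaning changes,
## and the mean-centered intercept depends on the composition of the sample.
coef(m_raw); coef(m_mean); coef(m_ref)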