similar to: Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing

2017 Nov 02
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler, Thank you for searching for, and finding, the basic description of the behavior of R in this matter. I think your example is in agreement with the book. But let me first note the following. You write: "F_j refers to a factor (variable) in a model and not a categorical factor". However: "a factor is a vector object used to specify a discrete classification"
2017 Nov 04
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler, I rephrase my previous mail, as follows: In your example, T_i = X1:X2:X3. Let F_j = X3. (The numerical variables X1 and X2 are not encoded at all.) Then T_{i(j)} = X1:X2, which in the example is dropped from the model. Hence the X3 in T_i must be encoded by dummy variables, as indeed it is. Arie On Thu, Nov 2, 2017 at 4:11 PM, Tyler <tylermw at gmail.com> wrote: > Hi
2017 Nov 06
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler, You write that you understand what I am saying. However, I am now at a loss about what exactly is the problem with the behavior of R. Here is a script which reproduces your experiments with three variables (excluding the full model): m=expand.grid(X1=c(1,-1),X2=c(1,-1),X3=c("A","B","C")) model.matrix(~(X1+X2+X3)^3-X1:X3,data=m)
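For readers following the thread, the encoding difference under discussion can be reproduced in base R. The sketch below reuses the data from the script above; the object names `mm_no_x1x3` and `mm_no_x1x2` are illustrative, and the second formula (dropping X1:X2 instead of X1:X3) is an assumption based on the case described elsewhere in the thread, not part of the quoted script.

```r
# Data from the quoted script: two numeric variables and one three-level factor.
m <- expand.grid(X1 = c(1, -1), X2 = c(1, -1), X3 = c("A", "B", "C"))

# Formula from the quoted script: all terms up to third order, minus X1:X3.
mm_no_x1x3 <- model.matrix(~ (X1 + X2 + X3)^3 - X1:X3, data = m)

# Assumed comparison case: drop the numeric:numeric interaction instead.
mm_no_x1x2 <- model.matrix(~ (X1 + X2 + X3)^3 - X1:X2, data = m)

# Compare how X3 is coded inside the X1:X2:X3 term in each model matrix.
print(grep("^X1:X2:X3", colnames(mm_no_x1x3), value = TRUE))
print(grep("^X1:X2:X3", colnames(mm_no_x1x2), value = TRUE))
```

Going by the messages in this thread, the three-way term should carry contrast columns for X3 when X1:X2 is present and a full set of dummy columns when X1:X2 is dropped, so comparing the two printed name vectors is a quick way to see which coding was applied.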
2017 Oct 12
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi, I recently ran into an inconsistency in the way model.matrix.default handles factor encoding for higher level interactions with categorical variables when the full hierarchy of effects is not present. Depending on which lower level interactions are specified, the factor encoding changes for a higher level interaction. Consider the following minimal reproducible example: -------------- >
2017 Oct 31
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie, Thank you for your further research into the issue. Regarding Stata: On the other hand, JMP gives model matrices that use the main effects contrasts in computing the higher order interactions, without the dummy variable encoding. I verified this both by analyzing the linear model given in my first example and noting that JMP has one more degree of freedom than R for the same model, as
2017 Nov 02
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie, The book on which this behavior is based does not use factor (in this section) to refer to categorical factor. I will again point to this sentence, from page 40, in the same section and referring to the behavior under question, that shows F_j is not limited to categorical factors: "Numeric variables appear in the computations as themselves, uncoded. Therefore, the rule does not
2017 Nov 04
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie, I understand what you're saying. The following excerpt out of the book shows that F_j does not refer exclusively to categorical factors: "...the rule does not do anything special for them, and it remains valid, in a trivial sense, whenever any of the F_j is numeric rather than categorical." Since F_j refers to both categorical and numeric variables, the behavior of
2017 Nov 06
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie, Given the heuristic, in all of my examples with a missing two-factor interaction the three-factor interaction should be coded with dummy variables. In reality, it is encoded by dummy variables only when the numeric:numeric interaction is missing, and by contrasts for the other two. The heuristic does not specify separate behavior for numeric vs categorical factors (When the author of
2017 Oct 15
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
I think it is not a bug. It is a general property of interactions. This property is best observed if all variables are factors (qualitative). For example, you have three variables (factors). You ask for as many interactions as possible, except an interaction term between two particular variables. When this interaction is not a constant, it is different for different values of the remaining
2020 Oct 03
1
Lahman Baseball Data Using R DBI Package
The double quotes are required by SQL if a name is not of the form letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL keyword like 'where' or 'select'. If you are doing this from a function, you may as well quote all the names. -Bill On Fri, Oct 2, 2020 at 6:18 PM Philip <herd_dog at cox.net> wrote: > The \"2B\" worked. Have no idea why. Can
2020 Oct 02
3
Lahman Baseball Data Using R DBI Package
I'm trying to pull data from one table (batting) in the Lahman Baseball database. Notice X2B for doubles and X3B for triples, fourth and fifth from the right. The dbGetQuery function runs fine when I leave these two out but I get error messages (in red) when I include 2B/3B or X2B/X3B. Can anyone give me some direction? Thanks, Philip Heinrich
2020 Oct 08
0
Lahman Baseball Data Using R DBI Package
Hi Philip, You've probably realized by now that R doesn't like column names that start with a number. If you try to access an R-dataframe column named 2B or 3B with the familiar "$" notation, you'll get an error: > library(DBI) > library(RSQLite) > con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite") > Hack12Batting <-
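The R side of this point can be sketched in base R without a database connection. The data frame below is made up for illustration (it is not the Lahman batting table); it shows how a name such as 2B is rewritten by default and how backticks or `[[` reach the column when the name is kept.

```r
# Hypothetical toy data, not the Lahman table.
# By default data.frame() repairs non-syntactic names via make.names().
fixed <- data.frame("2B" = c(30, 25), "3B" = c(4, 2))
print(names(fixed))

# check.names = FALSE keeps the original names.
raw <- data.frame("2B" = c(30, 25), "3B" = c(4, 2), check.names = FALSE)
print(names(raw))

# Non-syntactic names need backticks with `$`, or a string with `[[`.
doubles <- raw$`2B`
triples <- raw[["3B"]]
print(doubles + triples)
```

This only covers column access inside R; quoting the same names inside an SQL statement is a separate matter governed by SQL's own identifier rules.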
2020 Oct 08
1
Lahman Baseball Data Using R DBI Package
This is really a feature of SQL, not R. SQL requires that you double quote column names that start with numbers, include spaces, etc., or that are SQL key words. E.g., > d <- data.frame(Order=c("sit","stay","heel"), Where=c("here","there","there"), From=c("me","me","you")) >
2003 Jul 23
6
Condition indexes and variance inflation factors
Has anyone programmed condition indexes in R? I know that there is a function for variance inflation factors available in the car package; however, Belsley (1991) Conditioning Diagnostics (Wiley) notes that there are several weaknesses of VIFs: e.g. 1) High VIFs are sufficient but not necessary conditions for collinearity 2) VIFs don't diagnose the number of collinearities and 3) No one has
2016 Oct 03
2
suggested addition to model.matrix
Hello, All: What's the simplest way to convert a data.frame into a model.matrix? One way is given by the following example, modified from the examples in help(model.matrix): dd <- data.frame(a = gl(3,4), b = gl(4,1,12)) ab <- model.matrix(~ a + b, dd) ab0 <- model.matrix(~., dd) all.equal(ab, ab0) What do you think about replacing "model.matrix(~ a +
2010 Jun 23
4
Comparing distributions
I am trying to do something in R and would appreciate a push in the right direction. I hope some of you experts can help. I have two distributions obtained from about 10000 datapoints each (non-normal, with a multi-modal shape when eye-balling densities, but other than that I know little about the distributions). When plotting the two distributions together I can see that
2010 Jan 29
1
regression with categorial variables
Hi All, I am working on an example where the electric utility is investigating the effect of size of household and the type of air conditioning on electricity consumption. I fit a multiple linear regression: Electricity consumption = size of the household + air conditioning type. There are 3 air conditioning types so I modeled them as a dummy variable Type A Type B Type C Where type A is the
2004 Jun 07
1
Censboot Warning and Error Messages
Good day R help list!!! I've been trying to do Bootstrap in R on Censored data. I encountered WARNING/ERROR messages for which I could not find an explanation. I've been searching the literature for two days now and still can't find answers. I hope there's someone out there who can help me with these two questions: 1. If the "Loglik converged before variable..." message
2012 May 29
3
trouble automating formula edits when log or * are present; update trouble
Greetings I want to take a fitted regression and replace all uses of a variable in a formula. For example, I'd like to take m1 <- lm(y ~ x1, data=dat) and replace x1 with something else, say x1c, so the formula would become m1 <- lm(y ~ x1c, data=dat) I have working code to finish that part of the problem, but it fails when the formula is more complicated. If the formula has log(x1)
2010 Aug 11
2
help to polish plot in ggplot2
Hi, I wanted to generate a plot which is almost like the plot generated by the following code. category <- paste("Geographical Category", 1:10) grp1 <- rnorm(10, mean=10, sd=10) grp2 <- rnorm(10, mean=20, sd=10) grp3 <- rnorm(10, mean=15, sd=10) grp4 <- rnorm(10, mean=12, sd=10) mydat <- data.frame(category,grp1,grp2,grp3,grp4) dat.m <- melt(mydat) p <-