Displaying 20 results from an estimated 500 matches similar to: "Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing"
2017 Nov 02
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
Thank you for searching for, and finding, the basic description of the
behavior of R in this matter.
I think your example is in agreement with the book.
But let me first note the following. You write: "F_j refers to a
factor (variable) in a model and not a categorical factor". However:
"a factor is a vector object used to specify a discrete
classification"
2017 Nov 04
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
I rephrase my previous mail, as follows:
In your example, T_i = X1:X2:X3. Let F_j = X3. (The numerical
variables X1 and X2 are not encoded at all.) Then T_{i(j)} = X1:X2,
which in the example is dropped from the model. Hence the X3 in T_i
must be encoded by dummy variables, as indeed it is.
Arie
On Thu, Nov 2, 2017 at 4:11 PM, Tyler <tylermw at gmail.com> wrote:
> Hi
2017 Nov 06
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
You write that you understand what I am saying. However, I am now at
a loss about what exactly is the problem with the behavior of R. Here
is a script which reproduces your experiments with three variables
(excluding the full model):
m=expand.grid(X1=c(1,-1),X2=c(1,-1),X3=c("A","B","C"))
model.matrix(~(X1+X2+X3)^3-X1:X3,data=m)
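To see how the encoding of X3 inside X1:X2:X3 depends on which two-way term is removed, the same call can be repeated for each dropped interaction. This is only a sketch built on the script above; the names `dropped` and `mats` are introduced here for illustration and are not from the original mail.

```r
# Same 12-row design as in the script above.
m <- expand.grid(X1 = c(1, -1), X2 = c(1, -1), X3 = c("A", "B", "C"))

# Drop each two-way interaction in turn and keep the resulting matrices.
dropped <- c("X1:X2", "X1:X3", "X2:X3")
mats <- lapply(dropped, function(term) {
  f <- as.formula(paste("~ (X1 + X2 + X3)^3 -", term))
  model.matrix(f, data = m)
})
names(mats) <- dropped

# The column names show how X3 is encoded inside X1:X2:X3 in each case,
# and the column counts can be compared across the three models.
lapply(mats, colnames)
sapply(mats, ncol)
```

Comparing the column names side by side is probably the quickest way to check whether the three-way term is coded by dummy variables or by contrasts in a given model.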
2017 Oct 12
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi,
I recently ran into an inconsistency in the way model.matrix.default
handles factor encoding for higher level interactions with categorical
variables when the full hierarchy of effects is not present. Depending on
which lower level interactions are specified, the factor encoding changes
for a higher level interaction. Consider the following minimal reproducible
example:
--------------
>
2017 Oct 31
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
Thank you for your further research into the issue.
Regarding Stata: On the other hand, JMP gives model matrices that use the
main effects contrasts in computing the higher order interactions, without
the dummy variable encoding. I verified this both by analyzing the linear
model given in my first example and noting that JMP has one more degree of
freedom than R for the same model, as
2017 Nov 02
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
The book on which this behavior is based does not use factor (in this
section) to refer to categorical factor. I will again point to this
sentence, from page 40, in the same section and referring to the behavior
under question, that shows F_j is not limited to categorical factors:
"Numeric variables appear in the computations as themselves, uncoded.
Therefore, the rule does not
2017 Nov 04
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
I understand what you're saying. The following excerpt from the book
shows that F_j does not refer exclusively to categorical factors: "...the
rule does not do anything special for them, and it remains valid, in a
trivial sense, whenever any of the F_j is numeric rather than categorical."
Since F_j refers to both categorical and numeric variables, the behavior of
2017 Nov 06
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
Given the heuristic, in all of my examples with a missing two-factor
interaction the three-factor interaction should be coded with dummy
variables. In reality, it is encoded by dummy variables only when the
numeric:numeric interaction is missing, and by contrasts for the other two.
The heuristic does not specify separate behavior for numeric vs categorical
factors (When the author of
2017 Oct 15
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
I think it is not a bug. It is a general property of interactions.
This property is best observed if all variables are factors
(qualitative).
For example, you have three variables (factors). You ask for as many
interactions as possible, except an interaction term between two
particular variables. When this interaction is not a constant, it is
different for different values of the remaining
2020 Oct 03
1
Lahman Baseball Data Using R DBI Package
The double quotes are required by SQL if a name is not of the form
letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL
keyword like 'where' or 'select'. If you are doing this from a function,
you may as well quote all the names.
-Bill
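A minimal sketch of the quoting rule described above, assuming the DBI and RSQLite packages are installed. The in-memory database and the toy `batting` table with made-up values are stand-ins for the Lahman file, not the poster's actual data.

```r
library(DBI)

# Hypothetical in-memory database standing in for the Lahman SQLite file.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy batting table (made-up values); check.names = FALSE keeps the
# column names "2B" and "3B" instead of converting them to X2B and X3B.
batting <- data.frame(playerID = c("a01", "b02"), `2B` = c(30L, 12L),
                      `3B` = c(4L, 1L), check.names = FALSE)
dbWriteTable(con, "batting", batting)

# Double quotes inside the SQL string mark 2B and 3B as identifiers.
res <- dbGetQuery(con, 'SELECT playerID, "2B", "3B" FROM batting')
print(res)

dbDisconnect(con)
```

Using single quotes for the R string avoids having to escape the double quotes that SQL needs around the column names.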
On Fri, Oct 2, 2020 at 6:18 PM Philip <herd_dog at cox.net> wrote:
> The \"2B\" worked. Have no idea why. Can
2020 Oct 02
3
Lahman Baseball Data Using R DBI Package
I'm trying to pull data from one table (batting) in the Lahman Baseball database. Notice X2B for doubles and X3B for triples, fourth and fifth from the right.
The dbGetQuery function runs fine when I leave these two out, but I get error messages (in red) when I include 2B/3B or X2B/X3B.
Can anyone give me some direction?
Thanks,
Philip Heinrich
2020 Oct 08
0
Lahman Baseball Data Using R DBI Package
Hi Philip,
You've probably realized by now that R doesn't like column names that
start with a number. If you try to access an R-dataframe column named
2B or 3B with the familiar "$" notation, you'll get an error:
> library(DBI)
> library(RSQLite)
> con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> Hack12Batting <-
2020 Oct 08
1
Lahman Baseball Data Using R DBI Package
This is really a feature of SQL, not R. SQL requires that you double quote
column names that start with numbers, include spaces, etc., or that are SQL
key words. E.g.,
> d <- data.frame(Order=c("sit","stay","heel"),
Where=c("here","there","there"), From=c("me","me","you"))
>
2003 Jul 23
6
Condition indexes and variance inflation factors
Has anyone programmed condition indexes in R?
I know that there is a function for variance inflation factors
available in the car package; however, Belsley (1991) Conditioning
Diagnostics (Wiley) notes that there are several weaknesses of VIFs:
e.g. 1) High VIFs are sufficient but not necessary conditions for
collinearity 2) VIFs don't diagnose the number of collinearities and 3)
No one has
2016 Oct 03
2
suggested addition to model.matrix
Hello, All:
What's the simplest way to convert a data.frame into a model.matrix?
One way is given by the following example, modified from the
examples in help(model.matrix):
dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
ab <- model.matrix(~ a + b, dd)
ab0 <- model.matrix(~., dd)
all.equal(ab, ab0)
What do you think about replacing "model.matrix(~ a +
2010 Jun 23
4
Comparing distributions
I am trying to do something in R and would appreciate a push into the
right direction. I hope some of you experts can help.
I have two distributions obtained from about 10000 datapoints each
(non-normal, with a multimodal shape when eye-balling the densities,
but other than that I know little about the
distributions). When plotting the two distributions together I can see
that
2010 Jan 29
1
regression with categorial variables
Hi All,
I am working on an example where the electric utility is investigating the
effect of size of household and the type of air conditioning on electricity
consumption. I fit a multiple linear regression
Electricity consumption=size of the house hold + air conditioning type
There are 3 air conditioning types so I modeled them as a dummy variable
Type A
Type B
Type C
Where type A is the
2004 Jun 07
1
Censboot Warning and Error Messages
Good day R help list!!!
I've been trying to do a bootstrap in R on censored data. I encountered
WARNING/ERROR messages for which I could not find an explanation.
I've been searching the literature for two days now and still can't find
answers. I hope there's someone out there who can help me
with these two questions:
1. If the "Loglik converged before variable..." message
2012 May 29
3
trouble automating formula edits when log or * are present; update trouble
Greetings
I want to take a fitted regression and replace all uses of a variable
in a formula. For example, I'd like to take
m1 <- lm(y ~ x1, data=dat)
and replace x1 with something else, say x1c, so the formula would become
m1 <- lm(y ~ x1c, data=dat)
I have working code to finish that part of the problem, but it fails
when the formula is more complicated. If the formula has log(x1)
2010 Aug 11
2
help to polish plot in ggplot2
Hi,
I want to generate a plot which is almost like the plot generated by the
following code.
category <- paste("Geographical Category", 1:10)
grp1 <- rnorm(10, mean=10, sd=10)
grp2 <- rnorm(10, mean=20, sd=10)
grp3 <- rnorm(10, mean=15, sd=10)
grp4 <- rnorm(10, mean=12, sd=10)
mydat <- data.frame(category,grp1,grp2,grp3,grp4)
dat.m <- melt(mydat)
p <-