Displaying 20 results from an estimated 300 matches similar to: "Lahman Baseball Data Using R DBI Package"
2020 Oct 03
1
Lahman Baseball Data Using R DBI Package
The double quotes are required by SQL if a name is not of the form
letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL
keyword like 'where' or 'select'. If you are doing this from a function,
you may as well quote all the names.
-Bill
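A minimal sketch of the "quote all the names" advice, assuming a toy in-memory table rather than the real Lahman database. The helper quote_id() and the sample rows are made up for illustration; DBI also offers dbQuoteIdentifier(), which quotes in whatever style the backend prefers.

```r
library(DBI)
library(RSQLite)

# Toy stand-in for the Batting table (hypothetical data, not from Lahman).
con <- dbConnect(SQLite(), ":memory:")
batting <- data.frame(playerID = c("a01", "b02"),
                      `2B` = c(30L, 12L),
                      `3B` = c(4L, 1L),
                      check.names = FALSE)
dbWriteTable(con, "Batting", batting)

# Hypothetical helper: wrap each name in double quotes, doubling any
# embedded double quote as SQL requires.
quote_id <- function(x) paste0('"', gsub('"', '""', x, fixed = TRUE), '"')

cols <- c("playerID", "2B", "3B")
sql <- paste("SELECT", paste(quote_id(cols), collapse = ", "),
             "FROM", quote_id("Batting"))
res <- dbGetQuery(con, sql)
dbDisconnect(con)

# On the R side, names that start with a digit need backticks.
print(res$`2B`)
```

Quoting every name costs nothing for ordinary names like playerID, so a function that builds queries does not have to decide which names need it.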
On Fri, Oct 2, 2020 at 6:18 PM Philip <herd_dog at cox.net> wrote:
> The "2B" worked. Have no idea why. Can
2020 Oct 08
0
Lahman Baseball Data Using R DBI Package
Hi Philip,
You've probably realized by now that R doesn't like column names that
start with a number. If you try to access an R-dataframe column named
2B or 3B with the familiar "$" notation, you'll get an error:
> library(DBI)
> library(RSQLite)
> con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> Hack12Batting <-
2020 Oct 08
1
Lahman Baseball Data Using R DBI Package
This is really a feature of SQL, not R. SQL requires that you double quote
column names that start with numbers, include spaces, etc., or that are SQL
key words. E.g.,
> d <- data.frame(Order=c("sit","stay","heel"),
Where=c("here","there","there"), From=c("me","me","you"))
>
2006 Feb 23
2
Working with lists with numerical names
Greetings!
I'm having a hard time working with some data I imported from a baseball
database. Several of the database columns have numbers in them (2B,
3B), and when I try to use these vectors from the data frame, I get
syntax errors, probably because it's interpreting the name as a number:
> show(batting2005)
playerID yearID stint teamID lgID G AB R H 2B 3B HR RBI SB CS BB
2014 Aug 26
1
no visible binding for global variable for data sets in a package
I'm updating the Lahman package of baseball statistics to the 2013
release. In addition to
the main data sets, the package also contains several convenience
functions that make use
of these data sets. These now trigger the notes below from R CMD check
run with
Win builder, R-devel. How can I avoid these?
* using R Under development (unstable) (2014-08-25 r66471)
* using platform:
2009 Dec 02
2
Extracting vectors from a matrix (err, I think) in RMySQL
I have a query which returns a data set like so:
> salaries
yearID POS pct
1 2009 RF 203
2 2009 DH 200
3 2009 1B 198
4 2009 3B 180
5 2009 LF 169
6 2009 SS 156
7 2009 CF 148
8 2009 2B 97
9 2009 C 86
10 2008 DH 234
11 2008 1B 199
12 2008 RF 197
13 2008 3B 191
14 2008 SS 180
15 2008 CF 164
16 2008 LF 156
17 2008 2B 104
18 2008
2009 Dec 03
2
Formatting of numbers on y axis
Hello all. I have the following:
plot(salaries$yearID, salaries$salary, type='n', xaxt='n', xlab='',
yaxt='n', ylab='')
axis(1, at=unique(salaries$yearID), labels=unique(salaries$yearID), lwd=.25,
tck=-0.05)
axis(2, axTicks(2), format(axTicks(2), scientific = FALSE))
Which nicely creates the Y axis with the raw numbers, which are in the range
of .5 - 7
2011 Sep 16
2
R license for a derived data-only package
I'm looking for guidance or advice about the R license to use in
preparing a package containing the
Baseball Database from http://baseball1.com/statistics/
My main purpose is to make it available to students in a course, and to
develop it with others.
I'd like to put it on R-Forge, and then perhaps make it public on CRAN.
However, the page above bears a very restrictive copyright notice
2009 Dec 02
0
Stacked bar chart help.
Can't figure this out. I have the following list of salary averages per
year, per position. The dput output is:
> dput(salaries)
structure(list(yearID = c(2009, 2009, 2009, 2009, 2009, 2009,
2009, 2009, 2009, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
2008), AVG = c(8956855.61, 7886684.166126, 7534048.43102, 7406439.339471,
7219148.437934, 6697734.908336, 6400379.88398,
2006 Mar 07
0
Form fields and MySQL column name Association
I'm having trouble with a sports site I'm creating for a client of mine.
The part in question is where the team captains for each team in the league go into
the admin section to submit a match report for each match. The form
looks like this:
TEAM 1 BATTING: (these are all drop down menus that pull the player
names from the database)
There are 10 of these rows that look like this, 1 row for each player.
2006 Apr 05
7
Archive monthly count for blog
I am creating a blog to learn Ruby on Rails.
From the layout page I pass all the posts as a collection to
_archive.rhtml:
<%= render :partial => "archive", :collection => @archive %>
In _archive.rhtml I have access to the collection. I then want to
render another partial, _archivecount.rhtml, to display the number of
posts for each month.
Can anyone give a clue
2017 Oct 12
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi,
I recently ran into an inconsistency in the way model.matrix.default
handles factor encoding for higher level interactions with categorical
variables when the full hierarchy of effects is not present. Depending on
which lower level interactions are specified, the factor encoding changes
for a higher level interaction. Consider the following minimal reproducible
example:
--------------
>
2017 Nov 06
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
You write that you understand what I am saying. However, I am now at
a loss about what exactly is the problem with the behavior of R. Here
is a script which reproduces your experiments with three variables
(excluding the full model):
m=expand.grid(X1=c(1,-1),X2=c(1,-1),X3=c("A","B","C"))
model.matrix(~(X1+X2+X3)^3-X1:X3,data=m)
2017 Oct 27
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
I want to bring to your attention the following document: "What
happens if you omit the main effect in a regression model with an
interaction?" (https://stats.idre.ucla.edu/stata/faq/what-happens-if-you-omit-the-main-effect-in-a-regression-model-with-an-interaction).
This gives a useful review of the problem. Your example is Case 2: a
continuous and a categorical regressor.
2017 Oct 15
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
I think it is not a bug. It is a general property of interactions.
This property is best observed if all variables are factors
(qualitative).
For example, you have three variables (factors). You ask for as many
interactions as possible, except an interaction term between two
particular variables. When this interaction is not a constant, it is
different for different values of the remaining
2017 Nov 04
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
I rephrase my previous mail, as follows:
In your example, T_i = X1:X2:X3. Let F_j = X3. (The numerical
variables X1 and X2 are not encoded at all.) Then T_{i(j)} = X1:X2,
which in the example is dropped from the model. Hence the X3 in T_i
must be encoded by dummy variables, as indeed it is.
Arie
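The rule above can be checked with a short sketch, assuming the same design used elsewhere in this thread (two numeric variables X1 and X2, one three-level factor X3). The helper three_way() is made up for illustration; it only picks out the columns of the three-way term.

```r
# Same design as in the thread: two numeric variables, one 3-level factor.
m <- expand.grid(X1 = c(1, -1), X2 = c(1, -1), X3 = c("A", "B", "C"),
                 stringsAsFactors = TRUE)

# Full model: X1:X2 is present, so X3 inside X1:X2:X3 is coded by contrasts.
full <- model.matrix(~ (X1 + X2 + X3)^3, data = m)

# X1:X2 removed: X3 inside X1:X2:X3 is coded by dummy (indicator) variables.
reduced <- model.matrix(~ (X1 + X2 + X3)^3 - X1:X2, data = m)

# Hypothetical helper: names of the columns belonging to the three-way term.
three_way <- function(mm) grep("^X1:X2:X3", colnames(mm), value = TRUE)

print(three_way(full))
print(three_way(reduced))
```

Comparing the two printed vectors shows whether the three-way term gains a column for the baseline level of X3 once X1:X2 is dropped, which is the behavior under discussion.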
On Thu, Nov 2, 2017 at 4:11 PM, Tyler <tylermw at gmail.com> wrote:
> Hi
2017 Nov 06
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
Given the heuristic, in all of my examples with a missing two-factor
interaction the three-factor interaction should be coded with dummy
variables. In reality, it is encoded by dummy variables only when the
numeric:numeric interaction is missing, and by contrasts for the other two.
The heuristic does not specify separate behavior for numeric vs categorical
factors (When the author of
2017 Nov 02
2
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hello Tyler,
Thank you for searching for, and finding, the basic description of the
behavior of R in this matter.
I think your example is in agreement with the book.
But let me first note the following. You write: "F_j refers to a
factor (variable) in a model and not a categorical factor". However:
"a factor is a vector object used to specify a discrete
classification"
2017 Oct 31
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
Thank you for your further research into the issue.
Regarding Stata: On the other hand, JMP gives model matrices that use the
main effects contrasts in computing the higher order interactions, without
the dummy variable encoding. I verified this both by analyzing the linear
model given in my first example and noting that JMP has one more degree of
freedom than R for the same model, as
2017 Nov 02
0
Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Hi Arie,
The book on which this behavior is based does not use factor (in this
section) to refer to categorical factor. I will again point to this
sentence, from page 40, in the same section and referring to the behavior
under question, that shows F_j is not limited to categorical factors:
"Numeric variables appear in the computations as themselves, uncoded.
Therefore, the rule does not