similar to: Subsetting Data

Displaying 20 results from an estimated 9000 matches similar to: "Subsetting Data"

2012 Aug 13
3
Using the effects package to plot logit probabilities
I'm trying to run a logit model and plot the probability curve for a number of the important predictors. I'm trying to do this with the Effects package. df=data.frame(income=c(5,5,3,3,6,5), won=c(0,0,1,1,1,0), age=c(18,18,23,50,19,39), home=c(0,0,1,0,0,1)) str(df) md1 = glm(factor(won) ~ income + age + home, data=df,
2012 Jul 19
3
Removing values from a string
So I have the following data frame and I want to know how I can remove all "NA" values from each string, and also remove all "|" values from the START of the string. So they should look something like "auto|insurance" or "auto|insurance|quote" one = data.frame(keyword=c("|auto", "NA|auto|insurance|quote", "NA|auto|insurance",
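One possible approach to the cleanup described above, as a minimal sketch in base R. The `keyword` vector holds only the three values visible in the excerpt (the original data frame is truncated), and `parts` and `cleaned` are hypothetical names.

```r
# Sample values copied from the visible part of the excerpt; the rest is truncated.
keyword <- c("|auto", "NA|auto|insurance|quote", "NA|auto|insurance")

# Split each string on the literal "|" character (fixed = TRUE avoids regex alternation).
parts <- strsplit(keyword, "|", fixed = TRUE)

# Drop empty pieces (from a leading "|") and literal "NA" pieces, then rejoin.
cleaned <- vapply(
  parts,
  function(p) paste(p[p != "" & p != "NA"], collapse = "|"),
  character(1)
)

print(cleaned)
```

Splitting and rejoining removes "NA" wherever it appears as a whole piece, not only at the start of the string, which may or may not be what the poster wanted.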
2012 Jul 05
2
Plotting the probability curve from a logit model with 10 predictors
I have a logit model with about 10 predictors and I am trying to plot the probability curve for the model. P(Y=1) = 1 / (1 + e^-z), where z = B0 + B1X1 + ... + BnXn. If the model had only one predictor, I know to do something like below. mod1 = glm(factor(won) ~ as.numeric(bid), data=mydat, family=binomial(link="logit")) all.x <- expand.grid(won=unique(won), bid=unique(bid)) y.hat.new
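The logistic transform in the formula above can be checked numerically. This is a minimal sketch: `inv_logit` and the sample values of `z` are hypothetical, while `plogis` is base R's logistic distribution function.

```r
# Hand-written inverse logit: maps a linear predictor z to a probability.
inv_logit <- function(z) 1 / (1 + exp(-z))

# Hypothetical values of the linear predictor z = B0 + B1*X1 + ... + Bn*Xn.
z <- c(-2, 0, 2)
p <- inv_logit(z)

# Base R's plogis() computes the same quantity.
print(all.equal(p, plogis(z)))
```

With several predictors, a curve for one of them is usually drawn by varying that predictor over a grid while holding the others at fixed values, then applying this transform to the resulting linear predictor.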
2011 Dec 22
1
Error message with glm
I'm working on a logistic regression in R with the car package but keep getting the following error message. It's only a warning and not an error, but I'm just not sure how to resolve the issues. glm.fit: algorithm did not converge glm.fit: fitted probabilities numerically 0 or 1 occurred d1 = data.frame(mwin=c(mwin), mbid=c(mbid)) m1 = zelig(mwin ~ mbid, data=d1,
2012 Aug 02
1
Naive Bayes in R
I'm developing a naive bayes in R. I have the following data and am trying to predict on returned (class). dat = data.frame(home=c(0,1,1,0,0), gender=c("M","M","F","M","F"), returned=c(0,0,1,1,0)) str(dat) dat$home <- as.factor(dat$home) dat$returned <- as.factor(dat$returned) library(e1071) m <- naiveBayes(returned ~ ., dat) m
2011 Dec 21
1
Predicting a linear model for all combinations
Let's say I have a linear model and I want to find the average expected value of the dependent variable. So let's assume that I'm studying the price I pay for coffee. Price = B0 + B1(weather) + B2(gender) + ... What I'm trying to find is the predicted price for every possible combination of values in the independent variables. So Expected price when: weather=1, gender=male weather=1,
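A common way to get a prediction for every combination of predictor values is to build the grid with `expand.grid` and pass it to `predict`. The sketch below uses simulated data; the `coffee` data frame, its coefficients, and the names `fit` and `grid` are all hypothetical and stand in for the poster's model, which is not shown in the excerpt.

```r
set.seed(1)

# Simulated stand-in data: two categorical predictors and a numeric price.
coffee <- data.frame(
  weather = factor(sample(1:3, 60, replace = TRUE)),
  gender  = factor(sample(c("male", "female"), 60, replace = TRUE))
)
coffee$price <- 2 + 0.3 * as.integer(coffee$weather) +
  0.5 * (coffee$gender == "male") + rnorm(60, sd = 0.2)

fit <- lm(price ~ weather + gender, data = coffee)

# One row per combination of the observed factor levels.
grid <- expand.grid(
  weather = levels(coffee$weather),
  gender  = levels(coffee$gender)
)

# Predicted price for each combination.
grid$predicted <- predict(fit, newdata = grid)
print(grid)
```

Averaging `grid$predicted` weights every combination equally, which differs from averaging over the observed data when some combinations are rarer than others.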
2012 Feb 09
1
Grouping together a time variable
I have the following variable, time, which is a character variable and it's structured as follows. > head(as.character(dat$time), 30) [1] "00:00:01" "00:00:16" "00:00:24" "00:00:25" "00:00:25" "00:00:40" "00:01:50" "00:01:54" "00:02:33" "00:02:43" "00:03:22" [12]
2012 Aug 08
1
Calculating percentages across multiple columns
I have the following data and am trying to find the percentage of bid values purchased for that price. So let's say I have a bid of 5 and it's sold 2 times for $3 and $5. Since the original bid was $5, the percentage of times that that bid value results in a sold purchase AT that specific bid level was 1/3 because of the three times where the bid was three, it ended up being sold for $5
2011 Dec 15
1
Reordering a numeric variable
I'm running a linear model in R using the car package. I have a variable education, which I have recoded and regrouped to my wishes. However, R seems to place each element of that variable in alphabetical order. When I am running the model, don't I need the model order from lowest to highest to make an inference that a one unit change in one variable produced a one unit change in
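By default `factor()` sorts its levels, which is why they come out alphabetical. Passing `levels` explicitly sets the order. A minimal sketch; the education categories below are hypothetical, since the excerpt does not list the poster's actual values.

```r
# Hypothetical education categories.
education <- c("high school", "college", "graduate", "college", "high school")

# Default behaviour: levels are sorted alphabetically.
default_levels <- levels(factor(education))

# Explicit ordering from lowest to highest.
edu <- factor(education, levels = c("high school", "college", "graduate"))

print(default_levels)
print(levels(edu))
```

In a model formula the first level becomes the reference category, so the order chosen here also decides which group the other coefficients are compared against.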
2012 Sep 11
1
Plotting every probability curve
I have a logistic regression model and am trying to generate probability curves for all possible combinations of the variables. My logit model has 5+ variables, and I want to draw curves for every scenario. See code below. When home_owner is 0 and 1, I want curves. The same goes for all other variables categories, so that I have permutations for all possible combinations. I've
2012 Aug 07
2
Re-grouping data in R
I have a data frame with a column of values that I want to bucket (group) into specific levels. > str(dat) 'data.frame': 3678 obs. of 39 variables: $ id : int 23 76 129 156 166 180 200 214 296 344 ... $ final_purchase_amount : Factor w/ 32 levels "\\N","1082","1109",..: 1 1 1 1 1 1 1 1 1 1 ... So I ran the following to
2011 Nov 17
1
Error When Installing the RODBC Package
I'm running R in Ubuntu 10.10 and am trying to install the RODBC package. However, I get the following error message: ERROR: configuration failed for package ‘RODBC’ * removing ‘/home/amathew/R/i686-pc-linux-gnu-library/2.13/RODBC’ The downloaded packages are in ‘/tmp/RtmpekzPOQ/downloaded_packages’ Warning message: In install.packages() : installation of package 'RODBC' had
2011 Apr 28
1
Merging two columns of a data frame
Hi folks, I have a simple question that I just can't solve. I'm trying to merge two columns in my data frame. > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i686-pc-linux-gnu (32-bit) > head(dat) Year Month Number 2002 Jan 0 2002 Feb 0 2002 March 0 2002 April 1 2002 May 0 2002 June 0 I tried to do the following, but it
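If the goal is a single column that combines `Year` and `Month`, `paste` is one straightforward option. A minimal sketch using the first three rows shown in the excerpt; the new column name `YearMonth` is hypothetical, and the poster's own attempt is cut off in the excerpt.

```r
# First three rows from the head(dat) output in the excerpt.
dat <- data.frame(
  Year   = c(2002, 2002, 2002),
  Month  = c("Jan", "Feb", "March"),
  Number = c(0, 0, 0)
)

# Combine the two columns into one character column, separated by a space.
dat$YearMonth <- paste(dat$Year, dat$Month)
print(dat)
```

A different separator can be set with the `sep` argument, for example `paste(dat$Year, dat$Month, sep = "-")`.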
2012 Sep 26
1
Specifying a response variable in a Bayesian network
I'm trying to teach myself about Bayesian Networks and am working with the following data and the bnlearn package. I understand the conceptual aspects of BNs, but I'm not sure how to specify the response variables in R when constructing a dag plot. I've checked ?hc and done numerous Google searches without luck. Can anyone help? library("bnlearn")
2011 Dec 16
1
Zelig Error Message
I'm trying to calculate predicted probabilities in R with Zelig and keep getting the following error. Can anyone help? > x.low <- setx(mod, type=1) Error in dta[complete.cases(mf), names(dta) %in% vars, drop = FALSE] : incorrect number of dimensions When I ran the model, I ran everything but the explanatory variable as a numeric variable. Now, I'm trying everything and no
2013 Nov 05
2
Convert date column with two different structures
Let's say I have the following data frame and the date column has two different ways in which date is presented. How can I use as.Date or the lubridate package to have one date structure for the entire column df = data.frame(Date=c("5/1/13","8/1/13","9/1/13","Apr-10", "Apr-11","Apr-12","Apr-13")) It's
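One way to handle a column with two layouts is to detect each layout and parse it with its own format string. The sketch below uses base R only and rests on two assumptions the excerpt does not confirm: that "5/1/13" means month/day/two-digit year, and that "Apr-10" means April 2010 (month and two-digit year), which is pinned to the first day of the month. The names `raw`, `is_slash`, and `parsed` are hypothetical.

```r
raw <- c("5/1/13", "8/1/13", "9/1/13", "Apr-10", "Apr-11", "Apr-12", "Apr-13")

# Entries containing "/" are assumed to be month/day/year.
is_slash <- grepl("/", raw, fixed = TRUE)

# Start with a Date vector of NAs and fill each layout separately.
parsed <- rep(as.Date(NA), length(raw))
parsed[is_slash] <- as.Date(raw[is_slash], format = "%m/%d/%y")

# Assumption: "Apr-10" is month-year. A day is prepended because as.Date()
# needs a complete date. %b matches month abbreviations in the current locale,
# so this line expects an English locale.
parsed[!is_slash] <- as.Date(paste0("01-", raw[!is_slash]), format = "%d-%b-%y")

print(parsed)
```

If "Apr-10" instead means April 10 of some year, the second format string would need to change and the year would have to come from elsewhere.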
2012 Feb 09
1
Finding all the coefficients for a logit model
Let's say I have a variable, day, which is saved as a factor with 7 levels, and I use it in a logistic regression model. I ran the model using the car package in R and printed out the results. mod1 = glm(factor(status1) ~ factor(day), data=mydat, family=binomial(link="logit")) print(summary(mod1)) The result I get is: Coefficients: Estimate Std. Error z value
2011 Jun 09
1
Using a function inside a function
I'm trying to run a function inside a function but get an error message. lst <- list(roots = c("car insurance", "auto insurance"), roots2 = c("insurance"), prefix = c("cheap", "budget"), prefix2 = c("low cost"), suffix = c("quote", "quotes"), suffix2 = c("rate", "rates"), suffix3 =
2011 Jun 04
1
Partial Matching
Let's say that I have a string and I want to know if a single word is present in the string. I've written the following function to see if the word "Geico" is mentioned in the string "Cheap Geico car insurance". However, it doesn't work, and I assume it has something to do with the any() function. Do I need to use regular expressions? (I hope not) main <-
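For a plain substring check, `grepl` with `fixed = TRUE` treats the pattern as literal text, so no regular-expression syntax is involved. A minimal sketch; the helper name `contains_word` is hypothetical and is not the poster's function, which is cut off in the excerpt.

```r
# Returns TRUE when `word` occurs anywhere in `text` as literal text.
contains_word <- function(word, text) {
  grepl(word, text, fixed = TRUE)
}

found   <- contains_word("Geico", "Cheap Geico car insurance")
missing <- contains_word("Allstate", "Cheap Geico car insurance")

print(found)
print(missing)
```

This is a substring match, so "Geico" would also be found inside a longer token such as "Geicos". Matching whole words only would need a word-boundary regular expression.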
2012 Jul 09
1
Using the effects package
I've been looking into the effects package and it seems to be a great tool for plotting the probabilities of the response variable by the predictors. However, I'm wondering if I can use the effects package to plot the probabilities on the y axis and one predictor on the x axis, with the curve having the info for another predictor. So let's say our response variable is win, a binary