thr3ads.net - similar to: "turning data with start and end date into daily data"

Displaying 20 results from an estimated 10000 matches similar to: "turning data with start and end date into daily data"

summing values by week - based on daily dates - but with some dates missing

2011 Mar 30

Dear everybody, I have the following challenge. I have a data set with 2 subgroups, dates (days), and corresponding values (see example code below). Within each subgroup: I need to aggregate (sum) the values by week - for weeks that start on a Monday (for example, 2008-12-29 was a Monday). I find it difficult because I have missing dates in my data - so that sometimes I don't even have the

merging 2 frames while keeping all the entries from the "reference" frame

2011 Apr 04

Hello! I have my data frame "mydata" (below) and data frame "reference" - that contains all the dates I would like to be present in the final data frame. I am trying to merge them so that the resulting data frame contains all 8 dates in both subgroups (i.e., Group1 should have 8 rows and Group2 too). But when I merge them, it's not coming out this way. Any hint would be

cbind in aggregate formula - based on an existing object (vector)

2011 Jul 14

Hello! I am aggregating using a formula in aggregate - of the type: aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata) However, I actually have an object (vector of my variables to be aggregated): myvars<-c("var1","var2","var3") I'd like my aggregate formula (its "cbind" part) to be able to use my "myvars" object. Is it

using "aggregate" when variable names contain spaces

2011 Apr 18

Hello! My data set has many variables. Unfortunately, many of those variables contain spaces in their names. I need advice on how to refer to variable names in the formula for "aggregate". See example below: ### Generating example data set: mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4) value1=c(1,10,100,2,20,200,3,30,300,4,40,400)

Creating a "shifted" month (one that starts not on the first of each month but on another date)

2011 May 19

Hello! I have a data frame with dates. I need to create a new "month" that starts on the 20th of each month - because I'll need to aggregate my data later by that "shifted" month. I wrote the code below and it works. However, I was wondering if there is some ready-made function in some package - that makes it easier/more elegant? Thanks a lot! # Example data:

Summing values by weekday and weekend - based on daily dates

2011 Jul 22

Hi, all Here I created a data frame like mydates<- seq(as.Date("2010-05-29"), length = 43, by = "day") myvalues<-runif(43,0,1) myframe<-data.frame(dates=mydates, day=weekdays(dates), value=myvalues) dates day value 1 2010-05-29 Saturday 0.14576143 2 2010-05-30 Sunday 0.37669604 3 2010-05-31 Monday 0.74813943 4 2010-06-01 Tuesday

Summing daily values by weekday and weekend

2011 Jul 22

(Sorry for reposting. Please delete previous msgs. Thanks!) Hi, all Here I created a data frame like mydates<- seq(as.Date("2010-05-29"), length = 43, by = "day") myvalues<-runif(43,0,1) myframe<-data.frame(dates=mydates, day=weekdays(dates), value=myvalues) dates day value 1 2010-05-29 Saturday 0.14576143 2 2010-05-30 Sunday 0.37669604

transposing a data frame from horizontal to vertical (stacking)

2010 Jun 29

Hello, everyone! I have a very simple task - I have a data frame (see MyData below) and I need to stack the data (see result below). I wrote the syntax below - it's very basic and it does what I need. But I am sure what I am trying to do is a very typical task and there must be a much shorter/more elegant way of doing it. Any advice? Thank you very much!

lm looking for weights outside of the user-defined function

2010 Oct 22

Dear R'ers, I am fighting with a problem that is driving me crazy. I use "lm" in my user-defined function, but it seems to be looking for weights outside of my function's environment: ### Generating example data: x<-data.frame(y=rnorm(100,0,1),a=rnorm(100,1,1),b=rnorm(100,2,1)) myweights<-runif(100) data.for.regression<-x[1:3] ### Creating function

lme vs. lmer results

2010 Oct 26

Hello, and sorry for asking a question without the data - hope it can still be answered: I've run two things on the same data: # Using lme: mix.lme <- lme(DV ~a+b+c+d+e+f+h+i, random = ~ e+f+h+i| group, data = mydata) # Using lmer mix.lmer <- lmer(DV ~a+b+c+d+(1|group)+(e|group)+(f|group)+(h|group)+(i|group), data = mydata) lme provided an output (fixed effects and random

as.formula doesn't want to take a phrase

2011 Apr 18

Hello! I am trying to create a formula object using as.formula. But it's not working: examplephraze<-"for.my.example" myformula<-as.formula(paste(examplephraze,"~group, sum, data=mydata",sep="")) What's the problem? Thanks a lot! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com

data manipulation and summaries with few million rows

2011 Aug 24

I have a data set with about 6 million rows and 50 columns. It is a mixture of dates, factors, and numerics. What I am trying to accomplish can be seen with the following simplified data, which is given as dput output below. > head(myData) mydate gender mygroup id 1 2012-03-25 F A 1 2 2005-05-23 F B 2 3 2005-09-08 F B 2 4 2005-12-07 F B 2

problem with function

2012 Jul 17

Dear list, I have a problem with defining a function (see below) to read my testfile (see testfile). My function only returns mydata; I wish to work with attr(mydata, 'fc') as well (for labelling a plot). In principle it works if I do not insist on this function, but it would be much easier if it were possible to return mydata AND attr(mydata, 'fc') by using a function. 1) testfile:

Which column in randomForest importances (for regression) is MSE and which IncNodePurity

2010 May 05

I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric. rf<-randomForest(formula, data=mydata, importance=T, etc.) my results object "rf" contains predictor importances: rf$importance I am seeing two columns: %IncMSE IncNodePurity V1 -0.01683558 58.10910 V2 0.04000299 71.27579 V3 0.01974636

Probably simple function problem

2007 Mar 16

# I have a simple function problem. I thought that I could write a function to modify a couple of vectors but I am doing something wrong # I have a standard cost vector called "fuel" and some adjustments to the # costs called "adjusts". The changes are completely dependent on the length # of the dataframe newdata I then need to take the modified vectors and use # them later. I

stop on rows where !is.na(mydata$ti_all)

2012 Sep 24

Dear R experts, I got help to build a loop but there is a bug inside it that causes one part of the mechanism to fail. It should grow once, but it keeps growing on rows where $ti_all is not NA. Here is a wall of code that very crudely demonstrates the problem; there are a couple of dim() outputs at the end where you can see how the second time around it keeps adding (2) rows, but this does not

Predicting with a principal component regression model: "non-conformable arguments" error

2011 Apr 18

Hello all, I have generated a principal components regression model using the pcr() function from the pls package (R version 2.12.0). I am getting a "non-conformable arguments" error when I try to use the predict() function on new data, but only when I try to read in the new data from a separate file. More specifically, when my data looks like this #########training data

logistic regression - what is being predicted when using predict - probabilities or odds?

2010 Feb 18

Dear gurus, I've analyzed a (fake) data set ("data") using logistic regression (glm): logreg1 <- glm(z ~ x1 + x2 + y, data=data, family=binomial("logit"), na.action=na.pass) Then, I created a data frame with 2 fixed levels (0 and 1) for each predictor: attach(data) x1<-c(0,1) x2<-c(0,1) y<-c(0,1) newdata1<-data.frame(expand.grid(x1,x2,y))

predict.glmmPQL Problem

2006 Mar 24

Dear all, for a cross-validation I have to use predict.glmmPQL() , where the formula of the corresponding glmmPQL call is not given explicitly, but constructed using as.formula. However, this does not work as expected: x1<-rnorm(100); x2<-rbinom(100,3,0.5); y<-rpois(100,2) mydata<-data.frame(x1,x2,y) library(MASS) # works as expected model1<-glmmPQL(y~x1, ~1 | factor(x2),

RWeka prediction

2009 Apr 26

Dear All, I encountered a problem when I use RWeka for prediction. Specifically, I use the following: res=J48(X1~.,data=mydata); predict(res), #it worked fine but when I tried to use a different data set, i.e. predict(res,newdata=mynewdata); all the predictions I get are 0, which apparently is problematic. What is weird is, if I use the old data, but use the newdata option, i.e.
