Displaying 20 results from an estimated 10000 matches similar to: "turning data with start and end date into daily data"
2011 Mar 30
2
summing values by week - based on daily dates - but with some dates missing
Dear everybody,
I have the following challenge. I have a data set with 2 subgroups,
dates (days), and corresponding values (see example code below).
Within each subgroup: I need to aggregate (sum) the values by week -
for weeks that start on a Monday (for example, 2008-12-29 was a
Monday).
I find it difficult because I have missing dates in my data - so that
sometimes I don't even have the
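One way this kind of weekly sum is sometimes handled, sketched here with made-up data: map every date to the Monday that starts its week, then aggregate on that Monday. Missing dates do not matter for this step, because each row that exists is assigned to its own week. The data frame and the names week_start and weekly are hypothetical, not from the original post.

```r
# Hypothetical data with gaps in the dates (invented for illustration)
mydata <- data.frame(
  group  = c("g1", "g1", "g1", "g2"),
  mydate = as.Date(c("2008-12-29", "2009-01-02", "2009-01-06", "2008-12-31")),
  value  = c(1, 2, 4, 8)
)

# format(x, "%u") gives the weekday number with Monday = 1, ..., Sunday = 7,
# so subtracting (weekday - 1) days lands on the Monday of that week.
days_since_monday <- as.integer(format(mydata$mydate, "%u")) - 1
mydata$week_start <- mydata$mydate - days_since_monday

# Sum the values within each subgroup and week
weekly <- aggregate(value ~ group + week_start, data = mydata, FUN = sum)
print(weekly)
```

Weeks that have no rows at all will simply be absent from the result; if they are needed as zeros, the result would have to be merged against a full list of week starts.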
2011 Apr 04
2
merging 2 frames while keeping all the entries from the "reference" frame
Hello!
I have my data frame "mydata" (below) and data frame "reference" -
that contains all the dates I would like to be present in the final
data frame.
I am trying to merge them so that the resulting data frame contains
all 8 dates in both subgroups (i.e., Group1 should have 8 rows and
Group2 too). But when I merge it it's not coming out this way. Any
hint would be
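A possible approach, given as a sketch with invented data: make sure the reference frame holds every date for every group, then use merge() with all.x = TRUE so that all rows of the reference frame survive. The column names group, mydate and value are assumptions, since the original example is not shown.

```r
# Hypothetical data: two groups observed on only some of 8 dates
all_dates <- seq(as.Date("2008-12-29"), by = "day", length.out = 8)
mydata <- data.frame(
  group  = c("Group1", "Group1", "Group2"),
  mydate = all_dates[c(1, 3, 2)],
  value  = c(10, 20, 30)
)

# Reference frame: every date paired with every group
reference <- expand.grid(mydate = all_dates,
                         group = c("Group1", "Group2"),
                         stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of the first frame;
# rows without a match in mydata get NA in 'value'
merged <- merge(reference, mydata, by = c("group", "mydate"), all.x = TRUE)
print(merged)
```

If the reference frame lists each date only once, without the group column, the merge cannot produce 8 rows per group, which may be what went wrong in the original attempt.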
2011 Jul 14
2
cbind in aggregate formula - based on an existing object (vector)
Hello!
I am aggregating using a formula in aggregate - of the type:
aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)
However, I actually have an object (vector of my variables to be aggregated):
myvars<-c("var1","var2","var3")
I'd like my aggregate formula (its "cbind" part) to be able to use my
"myvars" object. Is it
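Two ways this could be done, sketched with made-up data: either build the formula text from myvars with paste() and convert it with as.formula(), or skip the formula interface and select the columns by name. The data frame below is hypothetical.

```r
# Hypothetical data, invented for illustration
mydata <- data.frame(
  factor1 = c("a", "a", "b", "b"),
  factor2 = c("x", "x", "y", "y"),
  var1 = 1:4, var2 = 5:8, var3 = 9:12
)
myvars <- c("var1", "var2", "var3")

# Option 1: assemble "cbind(var1, var2, var3) ~ factor1 + factor2" as text
lhs <- paste0("cbind(", paste(myvars, collapse = ", "), ")")
f <- as.formula(paste(lhs, "~ factor1 + factor2"))
by_formula <- aggregate(f, data = mydata, FUN = sum)

# Option 2: non-formula interface, picking the columns by name
by_columns <- aggregate(mydata[myvars],
                        by = mydata[c("factor1", "factor2")],
                        FUN = sum)
print(by_formula)
print(by_columns)
```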
2011 Apr 18
1
using "aggregate" when variable names contain spaces
Hello!
My data set has many variables. Unfortunately, many of those variables
contain spaces in their names.
I need advice on: how to refer to variable names in the formula for
"aggregate". See example below:
### Generating example data set:
mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4)
value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
2011 May 19
1
Creating a "shifted" month (one that starts not on the first of each month but on another date)
Hello!
I have a data frame with dates. I need to create a new "month" that
starts on the 20th of each month - because I'll need to aggregate my
data later by that "shifted" month.
I wrote the code below and it works. However, I was wondering if there
is some ready-made function in some package - that makes it
easier/more elegant?
Thanks a lot!
# Example data:
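A base-R sketch of one way to get such a "shifted" month without an extra package, assuming the shifted month should be labelled by the calendar month in which it starts: subtract 19 days, so that the 20th lands on the 1st, and then take the year and month of the shifted date. The dates below are made up for illustration.

```r
# Hypothetical dates around the month boundaries
mydates <- as.Date(c("2011-01-19", "2011-01-20", "2011-02-19", "2011-02-20"))

# The 20th minus 19 days is the 1st of the same month, while the 1st to 19th
# fall back into the previous calendar month.
shifted_month <- format(mydates - 19, "%Y-%m")

myframe <- data.frame(dates = mydates, shifted_month = shifted_month)
print(myframe)
```

The shifted_month column can then be used as a grouping variable in aggregate().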
2011 Jul 22
1
Summing values by weekday and weekend - based on daily dates
Hi, all
Here I created a data frame like
mydates<- seq(as.Date("2010-05-29"), length = 43, by = "day")
myvalues<-runif(43,0,1)
myframe<-data.frame(dates=mydates, day=weekdays(mydates), value=myvalues)
dates day value
1 2010-05-29 Saturday 0.14576143
2 2010-05-30 Sunday 0.37669604
3 2010-05-31 Monday 0.74813943
4 2010-06-01 Tuesday
2011 Jul 22
1
Summing daily values by weekday and weekend
(Sorry for reposting. Please delete previous msgs. Thanks!)
Hi, all
Here I created a data frame like
mydates<- seq(as.Date("2010-05-29"), length = 43, by = "day")
myvalues<-runif(43,0,1)
myframe<-data.frame(dates=mydates, day=weekdays(mydates), value=myvalues)
dates day value
1 2010-05-29 Saturday 0.14576143
2 2010-05-30 Sunday 0.37669604
2010 Jun 29
2
transposing a data frame from horizontal to vertical (stacking)
Hello, everyone!
I have a very simple task - I have a data frame (see MyData below) and
I need to stack the data (see result below).
I wrote the syntax below - it's very basic and it does what I need.
But I am sure what I am trying to do is a very typical task and there
must be a much shorter/more elegant way of doing it.
Any advice?
Thank you very much!
2010 Oct 22
1
lm looking for weights outside of the user-defined function
Dear R'ers,
I am fighting with a problem that is driving me crazy. I use "lm" in
my user-defined function, but it seems to be looking for weights
outside of my function's environment:
### Generating example data:
x<-data.frame(y=rnorm(100,0,1),a=rnorm(100,1,1),b=rnorm(100,2,1))
myweights<-runif(100)
data.for.regression<-x[1:3]
### Creating function
2010 Oct 26
1
lme vs. lmer results
Hello,
and sorry for asking a question without the data - hope it can still
be answered:
I've run two things on the same data:
# Using lme:
mix.lme <- lme(DV ~a+b+c+d+e+f+h+i, random = ~ e+f+h+i|
group, data = mydata)
# Using lmer
mix.lmer <- lmer(DV
~a+b+c+d+(1|group)+(e|group)+(f|group)+(h|group)+(i|group), data =
mydata)
lme provided an output (fixed effects and random
2011 Apr 18
2
as.formula doesn't want to take a phrase
Hello!
I am trying to create a formula object using as.formula. But it's not working:
examplephraze<-"for.my.example"
myformula<-as.formula(paste(examplephraze,"~group, sum, data=mydata",sep=""))
What's the problem?
Thanks a lot!
--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
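A likely reading of the problem, offered as a sketch: as.formula() parses only formula text, so the extra pieces ", sum, data=mydata" cannot be part of the pasted string. They are arguments of the function that uses the formula (aggregate() is assumed here). The data frame below is hypothetical.

```r
# Hypothetical example data (not from the original post)
mydata <- data.frame(
  for.my.example = c(1, 2, 3, 4),
  group          = c("a", "a", "b", "b")
)

examplephraze <- "for.my.example"

# Build only the formula from the string ...
myformula <- as.formula(paste(examplephraze, "~ group"))

# ... and pass the remaining arguments to aggregate() itself
result <- aggregate(myformula, data = mydata, FUN = sum)
print(result)
```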
2011 Aug 24
2
data manipulation and summaries with few million rows
I have a data set with about 6 million rows and 50 columns. It is a
mixture of dates, factors, and numerics.
What I am trying to accomplish can be seen with the following
simplified data, which is given as dput output below.
> head(myData)
mydate gender mygroup id
1 2012-03-25 F A 1
2 2005-05-23 F B 2
3 2005-09-08 F B 2
4 2005-12-07 F B 2
2012 Jul 17
1
problem with function
Dear list,
I have a problem with defining a function (see below) to read my testfile
(see testfile). My function only returns mydata, but I wish to work with
attr(mydata, 'fc') as well (for labelling a plot). In principle it works if
I do not insist on this function, but it would be much easier if it were
possible to return mydata AND attr(mydata, 'fc') by using a function.
1) testfile:
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables
(predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
my results object "rf" contains predictor importances:
rf$importance
I am seeing two columns:
%IncMSE IncNodePurity
V1 -0.01683558 58.10910
V2 0.04000299 71.27579
V3 0.01974636
2007 Mar 16
1
Probably simple function problem
# I have a simple function problem. I thought that I could write a function
# to modify a couple of vectors, but I am doing something wrong.
# I have a standard cost vector called "fuel" and some adjustments to the
# costs called "adjusts". The changes are completely dependent on the length
# of the dataframe newdata. I then need to take the modified vectors and use
# them later. I
2012 Sep 24
0
stop on rows where !is.na(mydata$ti_all)
Dear R experts,
I got help to build a loop but there is a bug inside it that causes
one part of the mechanism to fail.
It should grow once, but it keeps growing on rows where $ti_all is not NA.
Here is a wall of code that very crudely demonstrates the problem;
there are a couple of dim() outputs at the end where you can see how,
the second time around, it keeps adding (2) rows, but this does not
2011 Apr 18
2
Predicting with a principal component regression model: "non-conformable arguments" error
Hello all,
I have generated a principal components regression model using the pcr()
function from the pls package (R version 2.12.0). I am getting a
"non-conformable arguments" error when I try to use the predict() function
on new data, but only when I try to read in the new data from a separate
file.
More specifically, when my data looks like this
#########training data
2010 Feb 18
1
logistic regression - what is being predicted when using predict - probabilities or odds?
Dear gurus,
I've analyzed a (fake) data set ("data") using logistic regression (glm):
logreg1 <- glm(z ~ x1 + x2 + y, data=data, family=binomial("logit"),
na.action=na.pass)
Then, I created a data frame with 2 fixed levels (0 and 1) for each predictor:
attach(data)
x1<-c(0,1)
x2<-c(0,1)
y<-c(0,1)
newdata1<-data.frame(expand.grid(x1,x2,y))
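On the question in the title: for a glm fit, predict() returns values on the link scale by default (log-odds for a logit model), and type = "response" returns probabilities. A small sketch with simulated data; all object names below are made up for illustration and are not the poster's.

```r
set.seed(1)

# Simulated binary outcome and one binary predictor (hypothetical data)
d <- data.frame(x1 = rbinom(50, 1, 0.5))
d$z <- rbinom(50, 1, 0.5)

fit <- glm(z ~ x1, data = d, family = binomial("logit"))
newdata <- data.frame(x1 = c(0, 1))

# Default is type = "link": the linear predictor, i.e. log-odds
log_odds <- predict(fit, newdata = newdata)

# type = "response" gives predicted probabilities
probs <- predict(fit, newdata = newdata, type = "response")

# The inverse logit of the log-odds reproduces the probabilities
print(cbind(log_odds, plogis(log_odds), probs))
```

Neither call returns odds directly; those would be exp(log_odds).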
2006 Mar 24
1
predict.glmmPQL Problem
Dear all,
for a cross-validation I have to use predict.glmmPQL() , where the
formula of
the corresponding glmmPQL call is not given explicitly, but constructed
using as.formula.
However, this does not work as expected:
x1<-rnorm(100); x2<-rbinom(100,3,0.5); y<-rpois(100,2)
mydata<-data.frame(x1,x2,y)
library(MASS)
# works as expected
model1<-glmmPQL(y~x1, ~1 | factor(x2),
2009 Apr 26
2
RWeka prediction
Dear All, I encountered a problem when I use RWeka for prediction.
Specifically, I use the following:
res=J48(X1~.,data=mydata);
predict(res), #it worked fine
but when I tried to use a different data set,
i.e. predict(res,newdata=mynewdata);
all the predictions I get are 0, which is apparently problematic.
What is weird is, if I use the old data, but use the newdata option,
i.e.