Displaying 20 results from an estimated 10000 matches similar to: "Possible improvement in lm"
2006 Feb 17
3
(Newbie) Functions on vectors
Folks,
I want to make the following function more efficient, by vectorizing it:
getCriterionDecisionDate <- function (quarter , year)
{
if (length(quarter) != length(year)) stop ("Quarter and year vectors of unequal length!");
ret <- character(0);
for (i in 1:length(quarter)) {
currQuarter <- quarter[i];
currYear <- year[i];
if ((currQuarter < 1) |
2006 Mar 07
2
(newbie) Accessing the pieces of a 'by' object
Folks,
I know that I can do the following using a loop. That's been a lot
easier for me to write and understand. But I am trying to force myself
to use more vectorized / matrixed code so that eventually I will
become a better R programmer.
I have a dataframe that has some values by Year, Quarter and Ranking.
The variable of interest is the return (F3MRet), to be weighted
averaged within the
2006 Feb 24
2
Minor documentation improvement
Gentlemen,
In the documentation for reshape, in the function signature, the
argument "direction" is not listed. However, it is explained in the
explanation of parameters below.
I am using R 2.2.1.
Out of curiosity: Is the R core team still an all-male affair? I don't
think I have seen a single lady's name.
--
-- Vivek Satsangi
Student, Rochester, NY USA
2006 Feb 24
1
(Newbie) Aggregate for NA values
Folks,
Sorry if this question has been answered before or is obvious (or
worse, statistically "bad"). I don't understand what was said in one
of the search results that seems somewhat related.
I use aggregate to get a quick summary of the data. Part of what I am
looking for in the summary is, how much influence might the NA's have
had, if they were included, and is excluding
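A minimal sketch of one way to see how many NAs each group contains alongside the usual summary. The data frame `returns` and its columns `year` and `ret` are made up for illustration; they are not from the original post.

```r
# Hypothetical data: a return series with some missing values.
returns <- data.frame(
  year = c(2004, 2004, 2005, 2005, 2005),
  ret  = c(0.02, NA, 0.01, NA, NA)
)

# Count the NAs in each group.
na_summary <- aggregate(returns$ret,
                        by = list(year = returns$year),
                        FUN = function(v) sum(is.na(v)))
names(na_summary)[2] <- "n_missing"

# Group means with the NAs excluded; na.rm is passed through to mean().
mean_summary <- aggregate(returns$ret,
                          by = list(year = returns$year),
                          FUN = mean, na.rm = TRUE)
names(mean_summary)[2] <- "mean_ret"

merge(na_summary, mean_summary, by = "year")
```

Putting the two summaries side by side shows which group means rest on only a few non-missing observations.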
2013 Apr 30
3
Line similarity
Folks,
This is probably a "help me google this properly, please"-type of question.
In TIBCO Spotfire, there is a procedure called "line similarity". I use this to determine which observations show a growing, stable or declining pattern... sort of like a mini-regression on the time-line for each observation.
So if the input is
2005 Dec 08
2
Commented version of the home page graphics code
Folks,
I was drawn to R, like many others, partly for the opportunity
to draw nice, colorful graphs (occasionally ones with meaning, too :-)
). I am still quite a newbie to R.
As such, I have been trying to understand the code for the graphics on
the home page (the ones from the 2004 contest -- the dendrogram, the
cluster plot with different coloured circles, etc.) I was wondering
whether anyone
2009 Nov 18
2
Median on Aggregated data
Folks,
I have the following code, that works fine on smaller data sets. For
larger datasets, it runs out of memory and runs way too slow because we
are essentially creating large vectors with rep() and then calling
median() on it. (I learned this approach from a post on the web).
Below that, I have written the corresponding SAS code. The SAS code
works fast because I can just tell the proc
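A sketch of how a median could be taken from (value, count) pairs without expanding them with rep(), assuming positive integer counts. The helper name `weighted_median` is hypothetical and this is not the poster's code. It finds the middle positions from the cumulative counts.

```r
# Median of values that each occur `count` times, without building
# the expanded vector. Assumes counts are positive integers.
weighted_median <- function(value, count) {
  ord   <- order(value)
  value <- value[ord]
  cum   <- cumsum(count[ord])   # position of the last copy of each value
  n     <- sum(count)
  # The median averages the observations at these two positions
  # (they coincide when n is odd).
  lo <- value[which(cum >= floor((n + 1) / 2))[1]]
  hi <- value[which(cum >= ceiling((n + 1) / 2))[1]]
  (lo + hi) / 2
}

value <- c(30, 10, 20)
count <- c(3, 1, 2)
weighted_median(value, count)
```

Memory use grows with the number of distinct values rather than with the total count.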
2006 Jan 15
8
/ Operator not meaningful for factors
Folks,
I have a very basic question. The solution eludes me perhaps because
of my own lack of creativity. I am not attaching a fully reproducible
session because the issue may well be because of the way the data file
is, and the data file is large (and I don't know whether I can legally
distribute it). If people can suggest things that might be wrong in my
data or the way that I am reading it,
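One common cause of this message, which may or may not apply to the poster's file, is a numeric column that was read in as a factor because it contains a stray non-numeric entry. A minimal sketch with made-up values:

```r
# Hypothetical column that became a factor because of the "n/a" entry.
price <- factor(c("10", "2.5", "n/a"))

# price / 2 would warn that '/' is not meaningful for factors.

# Convert via character so the labels, not the internal integer codes,
# become numbers. The non-numeric entry turns into NA (with a warning,
# suppressed here).
fixed <- suppressWarnings(as.numeric(as.character(price)))
fixed / 2
```

Calling as.numeric() directly on the factor would return the level codes instead of the values, which is why the conversion goes through as.character() first.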
2005 Nov 24
1
Suggested add to the documentation for the identify() function
Folks,
1. Is there a more appropriate list (r-devel?) for posting such
suggestions? I am a newbie to R, and doubtless will have some
suggestions for the documentation -- some good, others not quite so. I
would actually like to help give back to the community (I was
motivated by Prof. Ripley's 2001 talk in which he had commented that
open source software users rarely give back anything.) --
2005 Nov 21
1
Cacheing in read.table/ attached data?
Disclaimer/Apology: I am an R newbie
I am seeing some behaviour that seems to me to be the result of some
cacheing going on at some level, and perhaps this is expected behaviour. I
would just like to understand the basic rules.
What I have is a file with some data. I read it in and then do a summary on
the resulting dataframe. I find that some values are completely outside the
expected range,
2007 Jul 25
3
aggregate.ts
Consider the following scrap of code:
> x<- ts(1:50,start=c(1,11),freq=12)
> y <- aggregate(x,nfreq=4)
> c(y)
[1] 6 15 24 33 42 51 60 69 78 87 96 105 114 123 132 141
> y
Error in rep.int("", start.pad) : invalid number of copies in rep.int()
> tsp(y)
[1] 1.833333 5.583333 4.000000
So we can aggregate into quarters, but we cannot print it using
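A possible workaround, sketched on the assumption that the trouble comes from the series starting in month 11, which is not a quarter boundary: trim the series with window() so it starts in the first month of a quarter before aggregating.

```r
x <- ts(1:50, start = c(1, 11), frequency = 12)

# Drop the two leading months so the series begins in January of year 2.
x_aligned <- window(x, start = c(2, 1))

# Sum each block of three months into a quarter.
y <- aggregate(x_aligned, nfreq = 4)
tsp(y)
```

The aggregated series then starts at a whole quarter, so its time attributes match a regular quarterly layout.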
2006 Mar 15
1
(newbie) Weighted qqplot?
Folks,
Normally, in a data frame, one observation counts as one observation
of the distribution. Thus one can easily produce a CDF and (in Splus
at least) use cdf.compare to compare the CDF (BTW: what is the R
equivalent of the SPlus cdf.compare() function, if any?)
However, if each point should not count equally, how can I weight the
points before comparing the distributions? I was thinking of
2008 May 31
1
Representing 'Date' as 'Year - Quarter'
I have financial data on a set of firms, with a quarterly period
(fundamental data). The data spans 10 years, and four quarters per
year. The present file (.csv) reads the Date columns as "200706" for
the second quarter of 2007; "199809" for the third quarter of 1998.
Is there a way I can convert it to something like "2007 Q2", "1998 Q3"?
I am aware of
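A minimal base-R sketch of the conversion asked about, assuming the dates arrive as six-character "yyyymm" strings whose month is the last month of the quarter. The variable names are illustrative only.

```r
yyyymm <- c("200706", "199809")

year    <- substr(yyyymm, 1, 4)
month   <- as.integer(substr(yyyymm, 5, 6))
quarter <- (month - 1) %/% 3 + 1   # months 1-3 -> 1, 4-6 -> 2, and so on

label <- paste0(year, " Q", quarter)
label
```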
2010 Nov 23
4
Tobit model on unbalanced panel
Appreciate any suggestions regarding how to fit an unbalanced panel data to
a Tobit model using R functions. I am trying to analyze how real estate
capital expenditures (CapEx) are affected by market conditions using a panel
Tobit model. The CapEx is either positive or 0, so it is censored. The data
are unbalanced panel, including the CapEx of about 5000 properties over
about 40 quarters, with the
2008 Jun 05
7
Improving data processing efficiency
Hi everyone!
I have a question about data processing efficiency.
My data are as follows: I have a data set on quarterly institutional
ownership of equities; some of them have had recent IPOs, some have not
(I have a binary flag set). The total dataset size is 700k+ rows.
My goal is this: For every quarter since issue for each IPO, I need to
find a "matched" firm in the same
2012 Nov 09
2
Creating yyyymm regexp strings on the fly for aggregation.
Folks,
This question is somewhat related to a previous posting of mine.
I just can't seem to create a generic solution.
Here is a function that I found searching around the internet:
splitIt <- function(x, n) {split(x, sort(rank(x) %% n))}
I use it like so:
> splitIt(1:12, 2)
$`0`
[1] 1 2 3 4 5 6
$`1`
[1] 7 8 9 10 11 12
Or
> splitIt(1:12, 4)
$`0`
[1] 1 2 3
$`1`
[1] 4 5 6
2006 Apr 23
2
models and views
Greetings.
I have an application which is used to track jobs and payments.
The billables table contains columns "job_date", "amount" and "receipt"
which indicate the date of the job, the amount due and whether or not
payment has been received, respectively. In addition to tracking and
manipulating individual jobs, the application is also used to generate
aggregate
2012 Apr 11
2
What is a better way to deal with lag/difference and loops in time series using R?
Hello,
I am writing code for time series computation but encountering some
problems
Given the quarterly data from 1983Q1 to 1984Q2
PI1<-ts(c(2.747365190,2.791594762, -0.009953715, -0.015059485,
-1.190061246, -0.553031799, 0.686874720, 0.953911035),
start=c(1983,1), frequency=4)
> PI1
Qtr1 Qtr2 Qtr3 Qtr4
1983 2.747365190 2.791594762
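For lags and differences on a ts object like the one above, the built-in diff() and lag() functions may remove the need for an explicit loop. This is a sketch only, since the computation the poster wanted is not shown in the excerpt.

```r
PI1 <- ts(c(2.747365190, 2.791594762, -0.009953715, -0.015059485,
            -1.190061246, -0.553031799, 0.686874720, 0.953911035),
          start = c(1983, 1), frequency = 4)

# Quarter-over-quarter change: PI1[t] - PI1[t - 1].
change <- diff(PI1)

# The series shifted one quarter forward in time, so that each date
# carries the previous quarter's value.
previous <- stats::lag(PI1, -1)

# cbind() on ts objects aligns the columns by date.
aligned <- cbind(current = PI1, previous = previous)
aligned
```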
2008 Aug 18
1
Converting monthly data to quarterly data
Dear R users,
I have a dataframe where column 1 has countries, column 2 is dates
(monthly) for each country, the next 10 columns are my factors where I have
measurements for each country and for each date. I have attached a sample
of the data in csv format with the data for 3 countries.
I would like to convert my monthly data into quarterly data, finding the
mean over 3 month periods for
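A sketch of one way to get quarterly means per country with aggregate(), using a small made-up data frame with a single factor column. The names `monthly`, `country`, `date` and `factor1` are assumptions, not the poster's actual column names.

```r
# Hypothetical monthly data for two countries.
monthly <- data.frame(
  country = rep(c("A", "B"), each = 6),
  date    = rep(seq(as.Date("2008-01-01"), by = "month", length.out = 6),
                times = 2),
  factor1 = c(1:6, 11:16)
)

# Label each row with its calendar quarter, e.g. "2008 Q1".
monthly$quarter <- paste(format(monthly$date, "%Y"), quarters(monthly$date))

# Mean of each three-month block, per country and quarter.
quarterly <- aggregate(factor1 ~ country + quarter, data = monthly, FUN = mean)
quarterly
```

With several factor columns, the left-hand side of the formula could be written as cbind(factor1, factor2) ~ country + quarter.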