similar to: Possible improvement in lm

Displaying 20 results from an estimated 10000 matches similar to: "Possible improvement in lm"

2006 Feb 17
3
(Newbie) Functions on vectors
Folks, I want to make the following function more efficient, by vectorizing it: getCriterionDecisionDate <- function (quarter , year) { if (length(quarter) != length(year)) stop ("Quarter and year vectors of unequal length!"); ret <- character(0); for (i in 1:length(quarter)) { currQuarter <- quarter[i]; currYear <- year[i]; if ((currQuarter < 1) |
2006 Mar 07
2
(newbie) Accessing the pieces of a 'by' object
Folks, I know that I can do the following using a loop. That's been a lot easier for me to write and understand. But I am trying to force myself to use more vectorized / matrixed code so that eventually I will become a better R programmer. I have a dataframe that has some values by Year, Quarter and Ranking. The variable of interest is the return (F3MRet), to be weighted averaged within the
2006 Feb 24
2
Minor documentation improvement
Gentlemen, In the documentation for reshape, in the function signature, the argument "direction" is not listed. However, it is explained in the explanation of parameters below. I am using R 2.2.1. Out of curiosity: Is the R core team still an all-male affair? I don't think I have seen a single lady's name. -- -- Vivek Satsangi Student, Rochester, NY USA
2006 Feb 24
1
(Newbie) Aggregate for NA values
Folks, Sorry if this question has been answered before or is obvious (or worse, statistically "bad"). I don't understand what was said in one of the search results that seems somewhat related. I use aggregate to get a quick summary of the data. Part of what I am looking for in the summary is, how much influence might the NA's have had, if they were included, and is excluding
2013 Apr 30
3
Line similarity
Folks, This is probably a "help me google this properly, please"-type of question. In TIBCO Spotfire, there is a procedure called "line similarity". I use this to determine which observations show a growing, stable or declining pattern... sort of like a mini-regression on the time-line for each observation. So of the input is
2005 Dec 08
2
Commented version of the home page graphics code
Folks, I was drawn to R, like many others, partly for the opportunity to draw nice, colorful graphs (occasionally ones with meaning, too :-) ). I am still quite a newbie to R. As such, I have been trying to understand the code for the graphics on the home page (the ones from the 2004 contest -- the dendrogram, the cluster plot with different coloured circles, etc.) I was wondering whether anyone
2009 Nov 18
2
Median on Aggregated data
Folks, I have the following code, that works fine on smaller data sets. For larger datasets, it runs out of memory and runs way too slow because we are essentially creating large vectors with rep() and then calling median() on it. (I learned this approach from a post on the web). Below that, I have written the corresponding SAS code. The SAS code works fast because I can just tell the proc
2006 Jan 15
8
/ Operator not meaningful for factors
Folks, I have a very basic question. The solution eludes me perhaps because of my own lack of creativity. I am not attaching a fully reproducible session because the issue may well be because of the way the data file is, and the data file is large (and I don't know whether I can legally distribute it). If people can suggest things that might be wrong in my data or the way that I am reading it,
2005 Nov 24
1
Suggested add to the documentation for the identify() function
Folks, 1. Is there a more appropriate list (r-devel?) for posting such suggestions? I am a newbie to R, and doubtless will have some suggestions for the documentation -- some good, others not quite so. I would actually like to help give back to the community (I was motivated by Prof. Ripley's 2001 talk in which he had commented that open source software users rarely give back anything.) --
2005 Nov 21
1
Cacheing in read.table/ attached data?
Disclaimer/Apology: I am an R newbie. I am seeing some behaviour that seems to me to be the result of some cacheing going on at some level, and perhaps this is expected behaviour. I would just like to understand the basic rules. What I have is a file with some data. I read it in and then do a summary on the resulting dataframe. I find that some values are completely outside the expected range,
2007 Jul 25
3
aggregate.ts
Consider the following scrap of code: > x<- ts(1:50,start=c(1,11),freq=12) > y <- aggregate(x,nfreq=4) > c(y) [1] 6 15 24 33 42 51 60 69 78 87 96 105 114 123 132 141 > y Error in rep.int("", start.pad) : invalid number of copies in rep.int() > tsp(y) [1] 1.833333 5.583333 4.000000 So we can aggregate into quarters, but we cannot print it using
2006 Mar 15
1
(newbie) Weighted qqplot?
Folks, Normally, in a data frame, one observation counts as one observation of the distribution. Thus one can easily produce a CDF and (in Splus at least) use cdf.compare to compare the CDF (BTW: what is the R equivalent of the Splus cdf.compare() function, if any?) However, if each point should not count equally, how can I weight the points before comparing the distributions? I was thinking of
2008 May 31
1
Representing 'Date' as 'Year - Quarter'
I have financial data on a set of firms, with a quarterly period (fundamental data). The data spans 10 years, and four quarters per year. The present file (.csv) reads the Date columns as "200706" for the second quarter of 2007; "199809" for the third quarter of 1998. Is there a way I can convert it to something like "2007 Q2", "1998 Q3"? I am aware of
2010 Nov 23
4
Tobit model on unbalanced panel
Appreciate any suggestions regarding how to fit an unbalanced panel data to a Tobit model using R functions. I am trying to analyze how real estate capital expenditures (CapEx) are affected by market conditions using a panel Tobit model. The CapEx is either positive or 0, so it is censored. The data are unbalanced panel, including the CapEx of about 5000 properties over about 40 quarters, with the
2008 Jun 05
7
Improving data processing efficiency
Hi everyone! I have a question about data processing efficiency. My data are as follows: I have a data set on quarterly institutional ownership of equities; some of them have had recent IPOs, some have not (I have a binary flag set). The total dataset size is 700k+ rows. My goal is this: For every quarter since issue for each IPO, I need to find a "matched" firm in the same
2012 Nov 09
2
Creating yyyymm regexp strings on the fly for aggregation.
Folks, This question is somewhat related to a previous posting of mine. I just can't seem to create a generic solution. Here is a function that I found searching around the internet: splitIt <- function(x, n) {split(x, sort(rank(x) %% n))} I use it like so: > splitIt(1:12, 2) $`0` [1] 1 2 3 4 5 6 $`1` [1] 7 8 9 10 11 12 Or > splitIt(1:12, 4) $`0` [1] 1 2 3 $`1` [1] 4 5 6
2006 Apr 23
2
models and views
Greetings. I have an application which is used to track jobs and payments. The billables table contains columns "job_date", "amount" and "receipt" which indicate the date of the job, the amount due and whether or not payment has been received, respectively. In addition to tracking and manipulating individual jobs, the application is also used to generate aggregate
2012 Apr 11
2
What is a better way to deal with lag/difference and loops in time series using R?
Hello, I am writing code for time series computation but encountering some problems. Given the quarterly data from 1983Q1 to 1984Q2 PI1<-ts(c(2.747365190,2.791594762, -0.009953715, -0.015059485, -1.190061246, -0.553031799, 0.686874720, 0.953911035), start=c(1983,1), frequency=4) > PI1 Qtr1 Qtr2 Qtr3 Qtr4 1983 2.747365190 2.791594762
2008 Aug 18
1
Converting monthly data to quarterly data
Dear R users, I have a dataframe where column 1 has countries, column 2 is dates (monthly) for each country, the next 10 columns are my factors where I have measurements for each country and for each date. I have attached a sample of the data in csv format with the data for 3 countries. I would like to convert my monthly data into quarterly data, finding the mean over 3 month periods for