thr3ads.net - similar to: "Possible improvement in lm"

Displaying 20 results from an estimated 10000 matches similar to: "Possible improvement in lm"

2006 Feb 17

(Newbie) Functions on vectors

Folks, I want to make the following function more efficient, by vectorizing it:
getCriterionDecisionDate <- function (quarter , year) {
  if (length(quarter) != length(year)) stop ("Quarter and year vectors of unequal length!");
  ret <- character(0);
  for (i in 1:length(quarter)) {
    currQuarter <- quarter[i];
    currYear <- year[i];
    if ((currQuarter < 1) |

2006 Mar 07

(newbie) Accessing the pieces of a 'by' object

Folks, I know that I can do the following using a loop. That's been a lot easier for me to write and understand. But I am trying to force myself to use more vectorized / matrixed code so that eventually I will become a better R programmer. I have a dataframe that has some values by Year, Quarter and Ranking. The variable of interest is the return (F3MRet), to be weighted averaged within the

2006 Feb 24

Minor documentation improvement

Gentlemen, In the documentation for reshape, in the function signature, the argument "direction" is not listed. However, it is explained in the explanation of parameters below. I am using R 2.2.1. Out of curiosity: Is the R core team still an all-male affair? I don't think I have seen a single lady's name. -- -- Vivek Satsangi Student, Rochester, NY USA

2006 Feb 24

(Newbie) Aggregate for NA values

Folks, Sorry if this question has been answered before or is obvious (or worse, statistically "bad"). I don't understand what was said in one of the search results that seems somewhat related. I use aggregate to get a quick summary of the data. Part of what I am looking for in the summary is, how much influence might the NA's have had, if they were included, and is excluding

2013 Apr 30

Line similarity

Folks, This is probably a "help me google this properly, please"-type of question. In TIBCO Spotfire, there is a procedure called "line similarity". I use this to determine which observations show a growing, stable or declining pattern... sort of like a mini-regression on the time-line for each observation. So of the input is

2005 Dec 08

Commented version of the home page graphics code

Folks, I was drawn to R, like many others, partly for the opportunity to draw nice, colorful graphs (occasionally ones with meaning, too :-) ). I am still quite a newbie to R. As such, I have been trying to understand the code for the graphics on the home page (the ones from the 2004 contest -- the dendrogram, the cluster plot with different coloured circles, etc.) I was wondering whether anyone

2009 Nov 18

Median on Aggregated data

Folks, I have the following code, that works fine on smaller data sets. For larger datasets, it runs out of memory and runs way too slow because we are essentially creating large vectors with rep() and then calling median() on it. (I learned this approach from a post on the web). Below that, I have written the corresponding SAS code. The SAS code works fast because I can just tell the proc
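One way to avoid building large vectors with rep() is to compute the median directly from value/count pairs. The sketch below is not taken from the original post: it assumes the aggregated data consists of a vector of values and a matching vector of non-negative integer counts, and the function name median_from_counts is hypothetical.

```r
# Median of data stored as (value, count) pairs, without expanding via rep().
# Assumption: 'count' holds non-negative integer frequencies for 'value'.
median_from_counts <- function(value, count) {
  stopifnot(length(value) == length(count), all(count >= 0), sum(count) > 0)
  ord <- order(value)
  value <- value[ord]
  cum <- cumsum(as.numeric(count[ord]))  # as.numeric avoids integer overflow
  n <- cum[length(cum)]                  # total number of observations
  # Positions of the middle observation(s); they coincide when n is odd.
  lower <- value[which(cum >= floor((n + 1) / 2))[1]]
  upper <- value[which(cum >= ceiling((n + 1) / 2))[1]]
  (lower + upper) / 2
}

value <- c(3, 1, 2)
count <- c(3, 2, 1)
m <- median_from_counts(value, count)
```

On small inputs the result can be checked against median(rep(value, count)); memory use grows with the number of distinct values rather than with the sum of the counts.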

2006 Jan 15

/ Operator not meaningful for factors

Folks, I have a very basic question. The solution eludes me perhaps because of my own lack of creativity. I am not attaching a fully reproducible session because the issue may well be because of the way the data file is, and the data file is large (and I don't know whether I can legally distribute it). If people can suggest things that might be wrong in my data or the way that I am reading it,

2005 Nov 24

Suggested add to the documentation for the identify() function

Folks, 1. Is there a more appropriate list (r-devel?) for posting such suggestions? I am a newbie to R, and doubtless will have some suggestions for the documentation -- some good, others not quite so. I would actually like to help give back to the community (I was motivated by Prof. Ripley's 2001 talk in which he had commented that open source software users rarely give back anything.) --

2005 Nov 21

Cacheing in read.table/ attached data?

Disclaimer/Apology: I am an R newbie. I am seeing some behaviour that seems to me to be the result of some cacheing going on at some level, and perhaps this is expected behaviour. I would just like to understand the basic rules. What I have is a file with some data. I read it in and then do a summary on the resulting dataframe. I find that some values are completely outside the expected range,

2007 Jul 25

aggregate.ts

Consider the following scrap of code:
> x<- ts(1:50,start=c(1,11),freq=12)
> y <- aggregate(x,nfreq=4)
> c(y)
[1] 6 15 24 33 42 51 60 69 78 87 96 105 114 123 132 141
> y
Error in rep.int("", start.pad) : invalid number of copies in rep.int()
> tsp(y)
[1] 1.833333 5.583333 4.000000
So we can aggregate into quarters, but we cannot print it using

2006 Mar 15

(newbie) Weighted qqplot?

Folks, Normally, in a data frame, one observation counts as one observation of the distribution. Thus one can easily produce a CDF and (in Splus at least) use cdf.compare to compare the CDF (BTW: what is the R equivalent of the Splus cdf.compare() function, if any?). However, if each point should not count equally, how can I weight the points before comparing the distributions? I was thinking of

2008 May 31

Representing 'Date' as 'Year - Quarter'

I have financial data on a set of firms, with a quarterly period (fundamental data). The data spans 10 years, and four quarters per year. The present file (.csv) reads the Date columns as "200706" for the second quarter of 2007; "199809" for the third quarter of 1998. Is there a way I can convert it to something like "2007 Q2", "1998 Q3"? I am aware of
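A minimal sketch of one possible conversion, using only base R string functions. It assumes every date is a six-character yyyymm value (as in "200706"); the function name to_year_quarter is hypothetical and not from the original post.

```r
# Convert yyyymm strings (or numbers) to "YYYY Qn" labels.
# Assumption: the last two digits are a calendar month from 01 to 12.
to_year_quarter <- function(yyyymm) {
  yyyymm <- as.character(yyyymm)
  year <- substr(yyyymm, 1, 4)
  month <- as.integer(substr(yyyymm, 5, 6))
  # Months 1-3 map to Q1, 4-6 to Q2, 7-9 to Q3, 10-12 to Q4.
  paste0(year, " Q", (month - 1) %/% 3 + 1)
}

labels <- to_year_quarter(c("200706", "199809"))
print(labels)  # "2007 Q2" "1998 Q3"
```

Because the result is a plain character vector, it sorts correctly as text but carries no date arithmetic.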

2010 Nov 23

Tobit model on unbalanced panel

Appreciate any suggestions regarding how to fit a Tobit model to unbalanced panel data using R functions. I am trying to analyze how real estate capital expenditures (CapEx) are affected by market conditions using a panel Tobit model. The CapEx is either positive or 0, so it is censored. The data are an unbalanced panel, including the CapEx of about 5000 properties over about 40 quarters, with the

2008 Jun 05

Improving data processing efficiency

Hi everyone! I have a question about data processing efficiency. My data are as follows: I have a data set on quarterly institutional ownership of equities; some of them have had recent IPOs, some have not (I have a binary flag set). The total dataset size is 700k+ rows. My goal is this: For every quarter since issue for each IPO, I need to find a "matched" firm in the same

2012 Nov 09

Creating yyyymm regexp strings on the fly for aggregation.

Folks, This question is somewhat related to a previous posting of mine. I just can't seem to create a generic solution. Here is a function that I found searching around the internet:
splitIt <- function(x, n) {split(x, sort(rank(x) %% n))}
I use it like so:
> splitIt(1:12, 2)
$`0`
[1] 1 2 3 4 5 6
$`1`
[1] 7 8 9 10 11 12
Or
> splitIt(1:12, 4)
$`0`
[1] 1 2 3
$`1`
[1] 4 5 6

2006 Apr 23

models and views

Greetings. I have an application which is used to track jobs and payments. The billables table contains columns "job_date", "amount" and "receipt" which indicate the date of the job, the amount due and whether or not payment has been received, respectively. In addition to tracking and manipulating individual jobs, the application is also used to generate aggregate

2012 Apr 11

What is a better way to deal with lag/difference and loops in time series using R?

Hello, I am writing code for time series computation but encountering some problems. Given the quarterly data from 1983Q1 to 1984Q2
PI1<-ts(c(2.747365190,2.791594762, -0.009953715, -0.015059485, -1.190061246, -0.553031799, 0.686874720, 0.953911035), start=c(1983,1), frequency=4)
> PI1
Qtr1 Qtr2 Qtr3 Qtr4
1983 2.747365190 2.791594762
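Since the post is cut off before the actual computation, the following is only a sketch of how differences and lags on the PI1 series above can be taken without an explicit loop, using the base functions diff() and stats::lag(). The variable names d1, d4 and prev are illustrative.

```r
PI1 <- ts(c(2.747365190, 2.791594762, -0.009953715, -0.015059485,
            -1.190061246, -0.553031799, 0.686874720, 0.953911035),
          start = c(1983, 1), frequency = 4)

# First difference: PI1[t] - PI1[t-1], one observation shorter than PI1.
d1 <- diff(PI1)

# Difference against the same quarter one year earlier (lag of 4 quarters).
d4 <- diff(PI1, lag = 4)

# Series shifted forward in time, so that at time t it holds PI1[t-1].
prev <- stats::lag(PI1, -1)
```

Both functions keep the time-series attributes, so the results stay aligned to quarters and can be combined with cbind() or ts.intersect() where needed.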

2008 Aug 18

Converting monthly data to quarterly data

Dear R users, I have a dataframe where column 1 has countries, column 2 is dates (monthly) for each country, the next 10 columns are my factors where I have measurements for each country and for each date. I have attached a sample of the data in csv format with the data for 3 countries. I would like to convert my monthly data into quarterly data, finding the mean over 3 month periods for
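A minimal sketch of the monthly-to-quarterly averaging with base R, under stated assumptions: the attached sample is not available, so the data frame below is made up, and the column names country, date and factor1 are hypothetical stand-ins for the real ones.

```r
# Made-up monthly data: two countries, six months, one factor column.
monthly <- data.frame(
  country = rep(c("A", "B"), each = 6),
  date = rep(seq(as.Date("2008-01-01"), by = "month", length.out = 6), times = 2),
  factor1 = c(1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60)
)

# Label each row with its calendar quarter, e.g. "2008 Q1".
monthly$quarter <- paste(format(monthly$date, "%Y"), quarters(monthly$date))

# Mean of the factor within each country and quarter.
quarterly <- aggregate(factor1 ~ country + quarter, data = monthly, FUN = mean)
print(quarterly)
```

With several factor columns, the left-hand side of the formula can be written as cbind(factor1, factor2) ~ country + quarter, where factor2 is likewise a hypothetical column name.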
