Hi.
I need to run a regression analysis on groups of data of fixed length: 100
As, 100 Bs, 100 Cs, etc.
e.g.
x
Key Value
A 1
A 21.2
A 4
A 6.5
...repeat 96 times with differing values of A
B 1
B 2.3
B NA
B 6.5
...repeat 96 times with differing values of B
etc
I run these against a linear model using
tapply(data$Value, data$Key, FUN=regr, 100), where

regr <- function(x, w)
{
  # run the model against the last w values of x
  lm(x[(length(x) - w + 1):length(x)] ~ myModel(w))
}
In the results, I want to return NA for any Key group where one or more of the
values is NA. If I run the above I get a regression structure ignoring the
missing values and returning values for data that contains NA. Using
na.action=na.fail or na.action=NULL causes the whole tapply function to fail and
I get nothing. Is there a way I can get lm to return NA if any of the values in
the data are NA but valid numbers for complete data?
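
One way to get this behaviour is to test each group for missing values before calling lm at all. The following is only a minimal sketch: regr_or_na is a hypothetical stand-in for regr, the linear trend over the window is an assumed placeholder for myModel(w), and the toy data are made up.

```r
# Hypothetical stand-in for regr(): returns NA for a group whose window
# contains any missing value, otherwise the fitted lm object.
regr_or_na <- function(x, w) {
  # Last w values of x (all of x if it has fewer than w values)
  y <- tail(x, w)
  if (anyNA(y)) return(NA)
  # Assumed placeholder model: a linear trend over the window
  idx <- seq_along(y)
  lm(y ~ idx)
}

# Toy data: group A is complete, group B has one NA
set.seed(1)
dat <- data.frame(
  Key   = rep(c("A", "B"), each = 100),
  Value = rnorm(200)
)
dat$Value[150] <- NA

# tapply returns a list-like array: an lm object for A, NA for B
fits <- tapply(dat$Value, dat$Key, FUN = regr_or_na, w = 100)
```

Because the check happens inside the function, tapply still runs over every group, and the groups with NAs do not need to be removed beforehand.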
I realise that I could remove the groups with NAs but I'm running the
regressions over multiple time periods and most of the data groups will have a
full complement of data for at least some of these periods. It becomes a pain to
manage NAs if I do that.
Sorry if the above is a little unclear.
Thanks
Neil
See ?na.exclude

On Fri, 23 Jan 2009, Neil Beddoe wrote:

> [...]
> In the results, I want to return NA for any Key group where one or more of
> the values is NA.
> [...]
> Is there a way I can get lm to return NA if any of the values in the data
> are NA but valid numbers for complete data?
> [...]

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
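
For readers following the pointer to ?na.exclude, here is a small sketch of what that option changes, using made-up data. As far as I can tell, the model is still fitted on the complete rows only. The difference is that residuals() and fitted() are padded with NA back to the original length, so on its own this does not turn the whole fit for a group into NA.

```r
# Toy data with one missing response value (assumed, not from the thread)
d <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1, 14.2, 15.9, 18.1, 20.0)
)

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

# Both fits drop the incomplete row, so the coefficients agree
same_coef <- isTRUE(all.equal(coef(fit_omit), coef(fit_exclude)))

# na.omit: residuals only for the rows actually used
n_omit <- length(residuals(fit_omit))

# na.exclude: residuals padded with NA at the position of the dropped row
res_exclude <- residuals(fit_exclude)
n_exclude   <- length(res_exclude)
```

Here res_exclude has one entry per original row, with NA in the third position, which is convenient when results need to line up with the input data.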