Hi all,
The problem I am trying to solve is the following: I have a 'trace' of
half-hourly data across an entire year - i.e. I have 17,520
observations (one for every half-hour of every day of a 365-day year). The
data follows a daily, weekly and seasonal pattern that is important to
preserve. What I want to do is scale this 'trace' of data in a way
that minimizes distortion to the 'shape' of the original observations
while achieving two things: (i) I want the new mean of this set of
observations to be a certain value (let's say 5% higher than my initial
set) and (ii) I want the new maximum point of this set to be a certain
value (let's say 10% higher than my initial set).
I have approached this problem as a constrained-optimization problem:
to minimize the sum of squared deviations subject to the mean and max
of the new set of observations achieving certain values. I have
created a simple proof of concept in Excel using Solver, and this
approach appears to work. My objective function to be minimized is:
sum((xi-1)^2), where each xi is a unique scaling factor for each
half-hourly observation and where these xi's represent the decision
variables of the constrained-optimization problem.
Put differently, the problem is to select each xi such that (i) you
minimize the sum of squared deviations of the xi from 1 (xi = 1 for all i
reproduces the original data set and hence gives no deviation) while (ii)
achieving the new mean and max.
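
To make the formulation concrete, here is a minimal base-R sketch of the objective and the two targets. The toy trace `y` (one 48-point "day") and the helper names `objective`, `mean_gap` and `max_gap` are hypothetical, chosen only for illustration. It also shows why a single uniform factor is not enough: scaling everything by 1.05 hits the mean target but leaves the maximum short.

```r
# Toy stand-in for the half-hourly trace (hypothetical values, 48 points)
set.seed(1)
y <- 10 + 3 * sin(seq(0, 2 * pi, length.out = 48)) + rnorm(48, sd = 0.2)

target_mean <- 1.05 * mean(y)  # new mean: 5% above the original
target_max  <- 1.10 * max(y)   # new maximum: 10% above the original

# Objective: sum of squared deviations of the scaling factors from 1
objective <- function(x) sum((x - 1)^2)

# Constraint gaps: both are zero when the scaled trace x * y hits its targets
mean_gap <- function(x) mean(x * y) - target_mean
max_gap  <- function(x) max(x * y) - target_max

x_identity <- rep(1, length(y))     # leaves the data unchanged
x_uniform  <- rep(1.05, length(y))  # one common factor for every point

objective(x_identity)  # zero: no distortion
mean_gap(x_uniform)    # zero up to rounding: mean target met
max_gap(x_uniform)     # negative: the maximum is only 5% higher, not 10%
```

Any solution therefore needs factors that differ from point to point, which is what the decision variables xi allow.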
Having researched constrained optimization in R and having read the
"CRAN Task View: Optimization and Mathematical Programming" document
(http://cran.r-project.org/web/views/Optimization.html), I have been
experimenting with lsei() from the limSolve package. I have worked
through the simple examples in the limSolve user manual, but am
struggling to correctly set up and solve my problem.
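
One way the problem might map onto lsei()'s arguments is sketched below, on a hypothetical 48-point toy trace rather than the real data. It assumes the limSolve package is installed and relies on lsei() minimising ||Ax - B||^2 subject to Ex = F and Gx >= H. The treatment of the maximum is an assumption made purely for illustration: the original peak is pinned to the target value by an equality, and every other scaled point is kept at or below that value by inequalities. The names `k` and `F_rhs` are illustrative only.

```r
library(limSolve)

# Toy stand-in for the half-hourly trace (hypothetical values)
set.seed(1)
n <- 48
y <- 10 + 3 * sin(seq(0, 2 * pi, length.out = n)) + rnorm(n, sd = 0.2)

target_mean <- 1.05 * mean(y)
target_max  <- 1.10 * max(y)
k <- which.max(y)  # assumption: the original peak remains the peak

# Objective ||A x - B||^2 = sum((x_i - 1)^2): identity matrix, vector of ones
A <- diag(n)
B <- rep(1, n)

# Equalities E x = F: row 1 fixes the mean, row 2 pins the peak
E <- rbind(y / n, replace(numeric(n), k, y[k]))
F_rhs <- c(target_mean, target_max)

# Inequalities G x >= H: x_i * y_i <= target_max for every non-peak point
G <- -diag(y)[-k, , drop = FALSE]
H <- rep(-target_max, n - 1)

fit <- lsei(A = A, B = B, E = E, F = F_rhs, G = G, H = H)
x <- fit$X          # one scaling factor per observation
scaled <- x * y     # the rescaled trace

c(mean = mean(scaled) / mean(y), max = max(scaled) / max(y))
```

With dense matrices this would not scale directly to a full year: a 17,520 x 17,520 matrix of doubles takes roughly 2.5 GB, so the full problem would likely need a different representation or solver.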
My questions to anyone experienced in solving such problems and/or
with using lsei are the following:
- Do you think lsei() is the best function to use to solve this
problem? I need to be able to set equality constraints and the problem
is non-linear, which appears to rule out most other options (optim(),
for example).
Assuming lsei() is appropriate:
- The lsei() documentation states that matrix 'A' is "a numeric
matrix containing the coefficients of the quadratic function to be
minimised, (Ax - b)^2..." and that vector 'B' is a "numeric vector
containing the right-hand side of the quadratic function to be
minimized".
- Assume that y = sum((Ax - b)^2) is the quadratic function to be minimized.
- Are the "coefficients of the quadratic function to be minimized"
'A' and 'b' respectively? If so, how are they entered in matrix 'A'?
I.e. is column 1 of 'A' = all the A's, column 2 of 'A' = all the b's,
and each row of 'A' a new argument of sum() (in my case, a new data
point)?
- Alternatively, are the "coefficients of the quadratic to be
minimized" the coefficients after expanding y = (Ax - b)^2? In which
case you would have: y = (A^2)*(x^2) - 2Ab*x + b^2. Defining 'A^2' as
'd', '2Ab' as 'e' and 'b^2' as 'f' would then give you
y = d(x^2) - ex + f, with the coefficients being 'd', 'e' and 'f'
respectively. If this is the case, again how are they entered into
matrix 'A' and in what order?
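
The two readings above can be compared numerically. The base-R sketch below assumes the first reading, in which each row of 'A' is one squared term of the sum and each column one unknown, with 'b' as a separate vector, and checks that the matrix form of the expansion, x'A'Ax - 2b'Ax + b'b, gives the same value. The small random matrices are hypothetical and exist only for the check.

```r
set.seed(42)
A <- matrix(rnorm(6), nrow = 3, ncol = 2)  # 3 squared terms, 2 unknowns
b <- rnorm(3)                              # one right-hand side per term
x <- rnorm(2)                              # an arbitrary trial solution

# Direct form: sum over rows of (A x - b)^2
direct <- sum((A %*% x - b)^2)

# Expanded form: x'A'Ax - 2 b'Ax + b'b (the matrix analogue of d, e and f)
expanded <- drop(t(x) %*% crossprod(A) %*% x - 2 * t(b) %*% A %*% x + sum(b^2))

all.equal(direct, expanded)
```

Since both forms describe the same function, the expanded coefficients would not need to be entered separately if the first reading is the right one.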
My apologies for the verbose posting - I have attempted to be as
concise as possible.
R version: 2.9.2
OS: Windows 7, 64-bit
limSolve version: 1.5.1
Regards,
Liam