# source:nanovis/trunk/newmat11/newmatnl.h@4794

Last change on this file since 4794 was 2096, checked in by ldelgass, 9 years ago

Normalize line endings, set eol-style to native on *.cpp, *.h files

• Property svn:eol-style set to `native`
File size: 10.9 KB
Line
1/// \ingroup newmat
2///@{
3
4/// \file newmatnl.h
5/// Header file for non-linear optimisation
6
7// Copyright (C) 1993,4,5: R B Davies
8
9#ifndef NEWMATNL_LIB
10#define NEWMATNL_LIB 0
11
12#include "newmat.h"
13
14#ifdef use_namespace
15namespace NEWMAT {
16#endif
17
18
19
20/*
21
22This is a beginning of a series of classes for non-linear optimisation.
23
24At present there are two classes. FindMaximum2 is the basic optimisation
25strategy when one is doing an optimisation where one has first
26derivatives and estimates of the second derivatives. Class
27NonLinearLeastSquares is derived from FindMaximum2. This provides the
28functions that calculate function values and derivatives.
29
30A third class is now added. This is for doing maximum-likelihood when
31you have first derviatives and something like the Fisher Information
32matrix (eg the variance covariance matrix of the first derivatives or
33minus the second derivatives - this matrix is assumed to be positive
34definite).
35
36
37
38   class FindMaximum2
39
40Suppose T is the ColumnVector of parameters, F(T) the function we want
41to maximise, D(T) the ColumnVector of derivatives of F with respect to
42T, and S(T) the matrix of second derivatives.
43
44Then the basic iteration is given a value of T, update it to
45
46           T - S.i() * D
47
48where .i() denotes inverse.
49
50If F was quadratic this would give exactly the right answer (except it
51might get a minimum rather than a maximum). Since F is not usually
52quadratic, the simple procedure would be to recalculate S and D with the
53new value of T and keep iterating until the process converges. This is
54known as the method of conjugate gradients.
55
56In practice, this method may not converge. FindMaximum2 considers an
57iteration of the form
58
59           T - x * S.i() * D
60
61where x is a number. It tries x = 1 and uses the values of F and its
62slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
63choses x to maximise the resulting function. This gives our new value of
64T. The program checks that the value of F is getting better and carries
65out a variety of strategies if it is not.
66
67The program also has a second strategy. If the successive values of T
68seem to be lying along a curve - eg we are following along a curved
69ridge, the program will try to fit this ridge and project along it. This
70does not work at present and is commented out.
71
72FindMaximum2 has three virtual functions which need to be over-ridden by
73a derived class.
74
75   void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
76
77T is the column vector of parameters. The function returns the value of
78the function to f, but may instead set oorg to true if the parameter
79values are not valid. If wg is true it may also calculate and store the
80second derivative information.
81
82   bool NextPoint(ColumnVector& H, Real& d);
83
84Using the value of T provided in the previous call of Value, find the
85conjugate gradients adjustment to T, that is - S.i() * D. Also return
86
87           d = D.t() * S.i() * D.
88
89NextPoint should return true if it considers that the process has
90converged (d very small) and false otherwise. The previous call of Value
91will have set wg to true, so that S will be available.
92
93   Real LastDerivative(const ColumnVector& H);
94
95Return the scalar product of H and the vector of derivatives at the last
96value of T.
97
98The function Fit is the function that calls the iteration.
99
100   void Fit(ColumnVector&, int);
101
102The arguments are the trial parameter values as a ColumnVector and the
103maximum number of iterations. The program calls a DataException if the
104initial parameters are not valid and a ConvergenceException if the
105process fails to converge.
106
107
108   class NonLinearLeastSquares
109
110This class is derived from FindMaximum2 and carries out a non-linear
111least squares fit. It uses a QR decomposition to carry out the
112operations required by FindMaximum2.
113
114A prototype class R1_Col_I_D is provided. The user needs to derive a
115class from this which includes functions the predicted value of each
116observation its derivatives. An object from this class has to be
117provided to class NonLinearLeastSquares.
118
119Suppose we observe n normal random variables with the same unknown
120variance and such the i-th one has expected value given by f(i,P)
121where P is a column vector of unknown parameters and f is a known
122function. We wish to estimate P.
123
124First derive a class from R1_Col_I_D and override Real operator()(int i)
125to give the value of the function f in terms of i and the ColumnVector
126para defined in class R1_CoL_I_D. Also override ReturnMatrix
127Derivatives() to give the derivates of f at para and the value of i
128used in the preceeding call to operator(). Return the result as a
129RowVector. Construct an object from this class. Suppose in what follows
130it is called pred.
131
132Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
133an iteration limit and an accuracy critierion.
134
135   NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
136
137The accuracy critierion should be somewhat less than one and 0.0001 is
139
140Define a ColumnVector P containing a guess at the value of the unknown
141parameter, and a ColumnVector Y containing the unknown data. Call
142
143   NLLS.Fit(Y,P);
144
145If the process converges, P will contain the estimates of the unknown
146parameters. If it does not converge an exception will be generated.
147
148The following member functions can be called after you have done a fit.
149
150Real ResidualVariance() const;
151
152The estimate of the variance of the observations.
153
154void GetResiduals(ColumnVector& Z) const;
155
156The residuals of the individual observations.
157
158void GetStandardErrors(ColumnVector&);
159
160The standard errors of the observations.
161
162void GetCorrelations(SymmetricMatrix&);
163
164The correlations of the observations.
165
166void GetHatDiagonal(DiagonalMatrix&) const;
167
168Forms a diagonal matrix of values between 0 and 1. If the i-th value is
169larger than, say 0.2, then the i-th data value could have an undue
171
172
173*/
174
175class FindMaximum2
176{
177   virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
178   virtual bool NextPoint(ColumnVector&, Real&) = 0;
179   virtual Real LastDerivative(const ColumnVector&) = 0;
180public:
181   void Fit(ColumnVector&, int);
182   virtual ~FindMaximum2() {}            // to keep gnu happy
183};
184
185class R1_Col_I_D
186{
187   // The prototype for a Real function of a ColumnVector and an
188   // integer.
189   // You need to derive your function from this one and put in your
190   // function for operator() and Derivatives() at least.
191   // You may also want to set up a constructor to enter in additional
192   // parameter values (that will not vary during the solve).
193
194protected:
195   ColumnVector para;                 // Current x value
196
197public:
198   virtual bool IsValid() { return true; }
199                                       // is the current x value OK
200   virtual Real operator()(int i) = 0; // i-th function value at current para
201   virtual void Set(const ColumnVector& X) { para = X; }
202                                       // set current para
203   bool IsValid(const ColumnVector& X)
204      { Set(X); return IsValid(); }
205                                       // set para, check OK
206   Real operator()(int i, const ColumnVector& X)
207      { Set(X); return operator()(i); }
208                                       // set para, return value
209   virtual ReturnMatrix Derivatives() = 0;
210                                       // return derivatives as RowVector
211   virtual ~R1_Col_I_D() {}            // to keep gnu happy
212};
213
214
215class NonLinearLeastSquares : public FindMaximum2
216{
217   // these replace the corresponding functions in FindMaximum2
218   void Value(const ColumnVector&, bool, Real&, bool&);
219   bool NextPoint(ColumnVector&, Real&);
220   Real LastDerivative(const ColumnVector&);
221
222   Matrix X;                         // the things we need to do the
223   ColumnVector Y;                   // QR triangularisation
224   UpperTriangularMatrix U;          // see the write-up in newmata.txt
225   ColumnVector M;
226   Real errorvar, criterion;
227   int n_obs, n_param;
228   const ColumnVector* DataPointer;
229   RowVector Derivs;
230   SymmetricMatrix Covariance;
231   DiagonalMatrix SE;
232   R1_Col_I_D& Pred;                 // Reference to predictor object
233   int Lim;                          // maximum number of iterations
234
235public:
236   NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
237      : criterion(crit), Pred(pred), Lim(lim) {}
238   void Fit(const ColumnVector&, ColumnVector&);
239   Real ResidualVariance() const { return errorvar; }
240   void GetResiduals(ColumnVector& Z) const { Z = Y; }
241   void GetStandardErrors(ColumnVector&);
242   void GetCorrelations(SymmetricMatrix&);
243   void GetHatDiagonal(DiagonalMatrix&) const;
244
245private:
246   void MakeCovariance();
247};
248
249
250// The next class is the prototype class for calculating the
251// log-likelihood.
252// I assume first derivatives are available and something like the
253// Fisher Information or variance/covariance matrix of the first
254// derivatives or minus the matrix of second derivatives is
255// available. This matrix must be positive definite.
256
257class LL_D_FI
258{
259protected:
260   ColumnVector para;                  // current parameter values
261   bool wg;                         // true if FI matrix wanted
262
263public:
264   virtual void Set(const ColumnVector& X) { para = X; }
265                                       // set parameter values
266   virtual void WG(bool wgx) { wg = wgx; }
267                                       // set wg
268
269   virtual bool IsValid() { return true; }
270                                       // return true is para is OK
271   bool IsValid(const ColumnVector& X, bool wgx=true)
272      { Set(X); WG(wgx); return IsValid(); }
273
274   virtual Real LogLikelihood() = 0;   // return the loglikelihhod
275   Real LogLikelihood(const ColumnVector& X, bool wgx=true)
276      { Set(X); WG(wgx); return LogLikelihood(); }
277
278   virtual ReturnMatrix Derivatives() = 0;
279                                       // column vector of derivatives
280   virtual ReturnMatrix FI() = 0;      // Fisher Information matrix
281   virtual ~LL_D_FI() {}               // to keep gnu happy
282};
283
284// This is the class for doing the maximum likelihood estimation
285
286class MLE_D_FI : public FindMaximum2
287{
288   // these replace the corresponding functions in FindMaximum2
289   void Value(const ColumnVector&, bool, Real&, bool&);
290   bool NextPoint(ColumnVector&, Real&);
291   Real LastDerivative(const ColumnVector&);
292
293   // the things we need for the analysis
294   LL_D_FI& LL;                        // reference to log-likelihood
295   int Lim;                            // maximum number of iterations
296   Real Criterion;                     // convergence criterion
297   ColumnVector Derivs;                // for the derivatives
298   LowerTriangularMatrix LT;           // Cholesky decomposition of FI
299   SymmetricMatrix Covariance;
300   DiagonalMatrix SE;
301
302public:
303   MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
304      : LL(ll), Lim(lim), Criterion(criterion) {}
305   void Fit(ColumnVector& Parameters);
306   void GetStandardErrors(ColumnVector&);
307   void GetCorrelations(SymmetricMatrix&);
308
309private:
310   void MakeCovariance();
311};
312
313
314#ifdef use_namespace
315}
316#endif
317
318
319
320#endif
321
322// body file: newmatnl.cpp
323
324
325
326///@}
327
Note: See TracBrowser for help on using the repository browser.