Context Navigation

source: nanovis/trunk/newmat11/newmatnl.h @ 4587

Last change on this file since 4587 was 2096, checked in by ldelgass, 13 years ago
Normalize line endings, set eol-style to native on .cpp, .h files
Property svn:eol-style set to `native`
File size: 10.9 KB

Line
1	/// \ingroup newmat
2	///@{
3
4	/// \file newmatnl.h
5	/// Header file for non-linear optimisation
6
7	// Copyright (C) 1993,4,5: R B Davies
8
9	#ifndef NEWMATNL_LIB
10	#define NEWMATNL_LIB 0
11
12	#include "newmat.h"
13
14	#ifdef use_namespace
15	namespace NEWMAT {
16	#endif
17
18
19
20	/*
21
22	This is a beginning of a series of classes for non-linear optimisation.
23
24	At present there are two classes. FindMaximum2 is the basic optimisation
25	strategy when one is doing an optimisation where one has first
26	derivatives and estimates of the second derivatives. Class
27	NonLinearLeastSquares is derived from FindMaximum2. This provides the
28	functions that calculate function values and derivatives.
29
30	A third class is now added. This is for doing maximum-likelihood when
31	you have first derviatives and something like the Fisher Information
32	matrix (eg the variance covariance matrix of the first derivatives or
33	minus the second derivatives - this matrix is assumed to be positive
34	definite).
35
36
37
38	class FindMaximum2
39
40	Suppose T is the ColumnVector of parameters, F(T) the function we want
41	to maximise, D(T) the ColumnVector of derivatives of F with respect to
42	T, and S(T) the matrix of second derivatives.
43
44	Then the basic iteration is given a value of T, update it to
45
46	T - S.i() * D
47
48	where .i() denotes inverse.
49
50	If F was quadratic this would give exactly the right answer (except it
51	might get a minimum rather than a maximum). Since F is not usually
52	quadratic, the simple procedure would be to recalculate S and D with the
53	new value of T and keep iterating until the process converges. This is
54	known as the method of conjugate gradients.
55
56	In practice, this method may not converge. FindMaximum2 considers an
57	iteration of the form
58
59	T - x * S.i() * D
60
61	where x is a number. It tries x = 1 and uses the values of F and its
62	slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
63	choses x to maximise the resulting function. This gives our new value of
64	T. The program checks that the value of F is getting better and carries
65	out a variety of strategies if it is not.
66
67	The program also has a second strategy. If the successive values of T
68	seem to be lying along a curve - eg we are following along a curved
69	ridge, the program will try to fit this ridge and project along it. This
70	does not work at present and is commented out.
71
72	FindMaximum2 has three virtual functions which need to be over-ridden by
73	a derived class.
74
75	void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
76
77	T is the column vector of parameters. The function returns the value of
78	the function to f, but may instead set oorg to true if the parameter
79	values are not valid. If wg is true it may also calculate and store the
80	second derivative information.
81
82	bool NextPoint(ColumnVector& H, Real& d);
83
84	Using the value of T provided in the previous call of Value, find the
85	conjugate gradients adjustment to T, that is - S.i() * D. Also return
86
87	d = D.t() * S.i() * D.
88
89	NextPoint should return true if it considers that the process has
90	converged (d very small) and false otherwise. The previous call of Value
91	will have set wg to true, so that S will be available.
92
93	Real LastDerivative(const ColumnVector& H);
94
95	Return the scalar product of H and the vector of derivatives at the last
96	value of T.
97
98	The function Fit is the function that calls the iteration.
99
100	void Fit(ColumnVector&, int);
101
102	The arguments are the trial parameter values as a ColumnVector and the
103	maximum number of iterations. The program calls a DataException if the
104	initial parameters are not valid and a ConvergenceException if the
105	process fails to converge.
106
107
108	class NonLinearLeastSquares
109
110	This class is derived from FindMaximum2 and carries out a non-linear
111	least squares fit. It uses a QR decomposition to carry out the
112	operations required by FindMaximum2.
113
114	A prototype class R1_Col_I_D is provided. The user needs to derive a
115	class from this which includes functions the predicted value of each
116	observation its derivatives. An object from this class has to be
117	provided to class NonLinearLeastSquares.
118
119	Suppose we observe n normal random variables with the same unknown
120	variance and such the i-th one has expected value given by f(i,P)
121	where P is a column vector of unknown parameters and f is a known
122	function. We wish to estimate P.
123
124	First derive a class from R1_Col_I_D and override Real operator()(int i)
125	to give the value of the function f in terms of i and the ColumnVector
126	para defined in class R1_CoL_I_D. Also override ReturnMatrix
127	Derivatives() to give the derivates of f at para and the value of i
128	used in the preceeding call to operator(). Return the result as a
129	RowVector. Construct an object from this class. Suppose in what follows
130	it is called pred.
131
132	Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
133	an iteration limit and an accuracy critierion.
134
135	NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
136
137	The accuracy critierion should be somewhat less than one and 0.0001 is
138	about the smallest sensible value.
139
140	Define a ColumnVector P containing a guess at the value of the unknown
141	parameter, and a ColumnVector Y containing the unknown data. Call
142
143	NLLS.Fit(Y,P);
144
145	If the process converges, P will contain the estimates of the unknown
146	parameters. If it does not converge an exception will be generated.
147
148	The following member functions can be called after you have done a fit.
149
150	Real ResidualVariance() const;
151
152	The estimate of the variance of the observations.
153
154	void GetResiduals(ColumnVector& Z) const;
155
156	The residuals of the individual observations.
157
158	void GetStandardErrors(ColumnVector&);
159
160	The standard errors of the observations.
161
162	void GetCorrelations(SymmetricMatrix&);
163
164	The correlations of the observations.
165
166	void GetHatDiagonal(DiagonalMatrix&) const;
167
168	Forms a diagonal matrix of values between 0 and 1. If the i-th value is
169	larger than, say 0.2, then the i-th data value could have an undue
170	influence on your estimates.
171
172
173	*/
174
175	class FindMaximum2
176	{
177	virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
178	virtual bool NextPoint(ColumnVector&, Real&) = 0;
179	virtual Real LastDerivative(const ColumnVector&) = 0;
180	public:
181	void Fit(ColumnVector&, int);
182	virtual ~FindMaximum2() {} // to keep gnu happy
183	};
184
185	class R1_Col_I_D
186	{
187	// The prototype for a Real function of a ColumnVector and an
188	// integer.
189	// You need to derive your function from this one and put in your
190	// function for operator() and Derivatives() at least.
191	// You may also want to set up a constructor to enter in additional
192	// parameter values (that will not vary during the solve).
193
194	protected:
195	ColumnVector para; // Current x value
196
197	public:
198	virtual bool IsValid() { return true; }
199	// is the current x value OK
200	virtual Real operator()(int i) = 0; // i-th function value at current para
201	virtual void Set(const ColumnVector& X) { para = X; }
202	// set current para
203	bool IsValid(const ColumnVector& X)
204	{ Set(X); return IsValid(); }
205	// set para, check OK
206	Real operator()(int i, const ColumnVector& X)
207	{ Set(X); return operator()(i); }
208	// set para, return value
209	virtual ReturnMatrix Derivatives() = 0;
210	// return derivatives as RowVector
211	virtual ~R1_Col_I_D() {} // to keep gnu happy
212	};
213
214
215	class NonLinearLeastSquares : public FindMaximum2
216	{
217	// these replace the corresponding functions in FindMaximum2
218	void Value(const ColumnVector&, bool, Real&, bool&);
219	bool NextPoint(ColumnVector&, Real&);
220	Real LastDerivative(const ColumnVector&);
221
222	Matrix X; // the things we need to do the
223	ColumnVector Y; // QR triangularisation
224	UpperTriangularMatrix U; // see the write-up in newmata.txt
225	ColumnVector M;
226	Real errorvar, criterion;
227	int n_obs, n_param;
228	const ColumnVector* DataPointer;
229	RowVector Derivs;
230	SymmetricMatrix Covariance;
231	DiagonalMatrix SE;
232	R1_Col_I_D& Pred; // Reference to predictor object
233	int Lim; // maximum number of iterations
234
235	public:
236	NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
237	: criterion(crit), Pred(pred), Lim(lim) {}
238	void Fit(const ColumnVector&, ColumnVector&);
239	Real ResidualVariance() const { return errorvar; }
240	void GetResiduals(ColumnVector& Z) const { Z = Y; }
241	void GetStandardErrors(ColumnVector&);
242	void GetCorrelations(SymmetricMatrix&);
243	void GetHatDiagonal(DiagonalMatrix&) const;
244
245	private:
246	void MakeCovariance();
247	};
248
249
250	// The next class is the prototype class for calculating the
251	// log-likelihood.
252	// I assume first derivatives are available and something like the
253	// Fisher Information or variance/covariance matrix of the first
254	// derivatives or minus the matrix of second derivatives is
255	// available. This matrix must be positive definite.
256
257	class LL_D_FI
258	{
259	protected:
260	ColumnVector para; // current parameter values
261	bool wg; // true if FI matrix wanted
262
263	public:
264	virtual void Set(const ColumnVector& X) { para = X; }
265	// set parameter values
266	virtual void WG(bool wgx) { wg = wgx; }
267	// set wg
268
269	virtual bool IsValid() { return true; }
270	// return true is para is OK
271	bool IsValid(const ColumnVector& X, bool wgx=true)
272	{ Set(X); WG(wgx); return IsValid(); }
273
274	virtual Real LogLikelihood() = 0; // return the loglikelihhod
275	Real LogLikelihood(const ColumnVector& X, bool wgx=true)
276	{ Set(X); WG(wgx); return LogLikelihood(); }
277
278	virtual ReturnMatrix Derivatives() = 0;
279	// column vector of derivatives
280	virtual ReturnMatrix FI() = 0; // Fisher Information matrix
281	virtual ~LL_D_FI() {} // to keep gnu happy
282	};
283
284	// This is the class for doing the maximum likelihood estimation
285
286	class MLE_D_FI : public FindMaximum2
287	{
288	// these replace the corresponding functions in FindMaximum2
289	void Value(const ColumnVector&, bool, Real&, bool&);
290	bool NextPoint(ColumnVector&, Real&);
291	Real LastDerivative(const ColumnVector&);
292
293	// the things we need for the analysis
294	LL_D_FI& LL; // reference to log-likelihood
295	int Lim; // maximum number of iterations
296	Real Criterion; // convergence criterion
297	ColumnVector Derivs; // for the derivatives
298	LowerTriangularMatrix LT; // Cholesky decomposition of FI
299	SymmetricMatrix Covariance;
300	DiagonalMatrix SE;
301
302	public:
303	MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
304	: LL(ll), Lim(lim), Criterion(criterion) {}
305	void Fit(ColumnVector& Parameters);
306	void GetStandardErrors(ColumnVector&);
307	void GetCorrelations(SymmetricMatrix&);
308
309	private:
310	void MakeCovariance();
311	};
312
313
314	#ifdef use_namespace
315	}
316	#endif
317
318
319
320	#endif
321
322	// body file: newmatnl.cpp
323
324
325
326	///@}
327

Note: See TracBrowser for help on using the repository browser.

Download in other formats: