
Constraints

  In classical mechanics, constraint and degree of freedom are complementary terms: adding constraints reduces the number of degrees of freedom.

In statistics, on the other hand, the two terms are used with identical meaning, i.e. the number of degrees of freedom is equal to the number of independent constraints. Note that constraint equations are not independent if they contain free parameters, as eliminating one unknown costs one equation.

Example 1: In classical mechanics let a particle be constrained to move on the surface of a sphere of radius r. There are three coordinates x, y and z, and one constraint

$$ c(x,y,z) = x^2 + y^2 + z^2 - r^2 = 0, $$

leaving 3-1=2 degrees of freedom (for the particle to move). In other words, the position of the particle is described by two independent coordinates, e.g. the polar angles $\theta$ and $\varphi$, where

$$ x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta. $$

Assume now that independent measurements of x,y,z are carried out. Then there is said to be one (statistical) degree of freedom, meaning that there is one constraint equation with no unknown.

The true values of x,y,z must satisfy the constraint equation c(x,y,z)=0, but the observed values will usually fail to do so because of measurement errors. Given the true values x,y,z, the observed values $x_{obs}, y_{obs}, z_{obs}$ are random variables such that

$$ f(u,v,w \mid x,y,z)\, du\, dv\, dw $$

is the probability that $u < x_{obs} < u+du$, $v < y_{obs} < v+dv$, $w < z_{obs} < w+dw$. In the maximum likelihood method, estimates for x,y,z are determined by the condition that $f(x_{obs},y_{obs},z_{obs} \mid x,y,z)$ should be maximal, while at the same time c(x,y,z)=0.

If the probability distribution f is Gaussian, with variances independent of x, y and z, then the maximum likelihood method reduces to the least squares method. If for example

$$ f = (2\pi\sigma^2)^{-3/2} \exp(-S^2/2), \qquad S^2 = \frac{(x_{obs}-x)^2 + (y_{obs}-y)^2 + (z_{obs}-z)^2}{\sigma^2}, $$

and $\sigma$ is independent of x,y,z, then the maximum of f is the minimum of $S^2$. The least squares method provides not only a best fit for x,y,z, but also a test of the hypothesis c(x,y,z)=0. Define $\chi^2$ as the minimum value of $S^2(x,y,z)$ under the constraint c(x,y,z)=0. Then in the above example $\chi^2$ follows approximately a chi-square distribution with one degree of freedom, provided the hypothesis is true. It is not an exact $\chi^2$ distribution because the equation c(x,y,z)=0 is non-linear; however, the non-linearity is unimportant as long as the residuals $x_{obs}-x$, etc. are small, which is true when $\sigma \ll r$.
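The test described above can be sketched numerically. The following minimal Python example assumes the equal-variance Gaussian case and the sphere constraint of Example 1, for which the point on the sphere nearest to the observed point is its radial projection. The function names and the sample numbers are illustrative assumptions, not part of the original text. The tail probability uses the identity that, for one degree of freedom, P(chi-square > c) = erfc(sqrt(c/2)).

```python
import math

def constrained_chi2(obs, r, sigma):
    """Minimum of S^2 on the sphere of radius r, for equal Gaussian errors sigma."""
    # Distance of the observed point from the origin
    rho = math.sqrt(sum(o * o for o in obs))
    if rho == 0.0:
        raise ValueError("observed point at the origin: best-fit direction is undefined")
    # Nearest point on the sphere: radial projection of the observed point
    best_fit = tuple(r * o / rho for o in obs)
    chi2 = sum((o - b) ** 2 for o, b in zip(obs, best_fit)) / sigma ** 2
    return best_fit, chi2

def chi2_pvalue_1dof(chi2):
    """Probability that a chi-square variable with one degree of freedom exceeds chi2."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical measurement of a point that should lie on the unit sphere
best_fit, chi2 = constrained_chi2((0.6, 0.5, 0.7), r=1.0, sigma=0.1)
print(best_fit, chi2, chi2_pvalue_1dof(chi2))
```

A small tail probability would speak against the hypothesis c(x,y,z)=0; as noted above, the chi-square interpretation is only approximate and relies on the errors being small compared with r.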

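A general method for solving constrained minimization problems is the Lagrange multiplier method. In this example it will result in four equations

$$ \frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = \frac{\partial L}{\partial \lambda} = 0 $$

for the four unknowns x,y,z and $\lambda$, where

$$ L(x,y,z,\lambda) = S^2(x,y,z) + \lambda\, c(x,y,z) $$

and $\lambda$ is a Lagrange multiplier.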
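As a worked sketch, the four equations can be solved in closed form under the following assumptions: equal Gaussian errors, so that $S^2 = [(x_{obs}-x)^2 + (y_{obs}-y)^2 + (z_{obs}-z)^2]/\sigma^2$; the sphere constraint $c = x^2+y^2+z^2-r^2$; and the function $L = S^2 + \lambda c$. The normalization of $\lambda$ and the symbol $\rho_{obs}$ are choices made here for illustration.

$$
\begin{aligned}
\frac{\partial L}{\partial x} &= -\frac{2(x_{obs}-x)}{\sigma^2} + 2\lambda x = 0
\;\Rightarrow\; x = \frac{x_{obs}}{1+\lambda\sigma^2} \quad \text{(and likewise for } y, z\text{)},\\
\frac{\partial L}{\partial \lambda} &= x^2+y^2+z^2-r^2 = 0
\;\Rightarrow\; (1+\lambda\sigma^2)^2 = \frac{\rho_{obs}^2}{r^2},
\qquad \rho_{obs}^2 = x_{obs}^2+y_{obs}^2+z_{obs}^2 .
\end{aligned}
$$

Taking the positive root, which corresponds to the minimum rather than the maximum of $S^2$ on the sphere, gives the radial projection of the observed point:

$$ (x,y,z) = \frac{r}{\rho_{obs}}\,(x_{obs},y_{obs},z_{obs}), \qquad S^2_{min} = \frac{(\rho_{obs}-r)^2}{\sigma^2}. $$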

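A more efficient method in the present case is to use the constraint c(x,y,z)=0 to eliminate one variable, writing for example

$$ x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta. $$

This elimination method gives 3-1=2 equations

$$ \frac{\partial S^2}{\partial \theta} = \frac{\partial S^2}{\partial \varphi} = 0 $$

for two unknowns $\theta$ and $\varphi$, instead of the 3+1=4 equations of the Lagrange multiplier method. The chain rule (see Jacobi Matrix) is useful in computing $\partial S^2/\partial\theta$ and $\partial S^2/\partial\varphi$. Counting constraints, one has three equations

$$ x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta, $$

with two free parameters $\theta$ and $\varphi$, so the number of degrees of freedom is 3-2=1, as before. Note that x,y,z here are measured quantities and therefore not free parameters.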
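The elimination method can be illustrated with a short Python sketch, again assuming equal Gaussian errors and the sphere of Example 1. It minimizes $S^2$ over the two polar angles by plain gradient descent, with the derivatives obtained from the chain rule. The function names, the fixed step size, the starting angles and the sample measurement are all assumptions made for this illustration. A production fit would normally use a library minimizer.

```python
import math

def sphere_point(r, theta, phi):
    """Cartesian point on the sphere of radius r for polar angles theta, phi."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def grad_s2(obs, r, sigma, theta, phi):
    """(dS2/dtheta, dS2/dphi), computed with the chain rule."""
    point = sphere_point(r, theta, phi)
    # dS2/dx, dS2/dy, dS2/dz at the current point on the sphere
    d_xyz = [-2.0 * (o - p) / sigma ** 2 for o, p in zip(obs, point)]
    # Columns of the Jacobi matrix d(x,y,z)/d(theta,phi)
    d_theta = (r * math.cos(theta) * math.cos(phi),
               r * math.cos(theta) * math.sin(phi),
               -r * math.sin(theta))
    d_phi = (-r * math.sin(theta) * math.sin(phi),
             r * math.sin(theta) * math.cos(phi),
             0.0)
    g_theta = sum(a * b for a, b in zip(d_xyz, d_theta))
    g_phi = sum(a * b for a, b in zip(d_xyz, d_phi))
    return g_theta, g_phi

def fit_by_elimination(obs, r, sigma, theta, phi, n_iter=500):
    """Gradient descent on S2(theta, phi); start values should be near the minimum."""
    rho = math.sqrt(sum(o * o for o in obs))
    step = sigma ** 2 / (4.0 * r * rho)  # conservative fixed step size
    for _ in range(n_iter):
        g_theta, g_phi = grad_s2(obs, r, sigma, theta, phi)
        theta -= step * g_theta
        phi -= step * g_phi
    return theta, phi

# Hypothetical measurement; the start angles are a rough guess
obs = (0.6, 0.5, 0.7)
theta, phi = fit_by_elimination(obs, r=1.0, sigma=0.1, theta=1.0, phi=0.5)
print(sphere_point(1.0, theta, phi))
```

Only two parameters are varied, and the constraint holds exactly at every iteration because each trial point is built from the angles. The polar angles are singular at the poles, so a different parametrization would be preferable for points close to the z axis.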

Another possible method is to add a penalty function $k c^2$ to $S^2$, with k a large constant, and to minimize the sum $S^2(x,y,z) + k[c(x,y,z)]^2$.
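How the penalty method approaches the constrained solution as k grows can be seen in a minimal Python sketch. It assumes equal Gaussian errors and the sphere constraint of Example 1. By symmetry the minimum of the penalized sum then lies on the ray from the origin through the observed point, so only the radius t along that ray has to be found. The function name, the bisection solver and the numbers are illustrative assumptions.

```python
import math

def penalty_radius(rho, r, sigma, k, n_iter=200):
    """Radius t minimizing (rho - t)^2 / sigma^2 + k (t^2 - r^2)^2 on the ray.

    rho is the distance of the observed point from the origin (rho > 0).
    """
    if rho == r:
        return r

    def slope(t):
        # Derivative of the penalized sum with respect to t
        return -2.0 * (rho - t) / sigma ** 2 + 4.0 * k * t * (t * t - r * r)

    # The derivative is negative at the smaller and positive at the larger of rho, r
    lo, hi = min(rho, r), max(rho, r)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical observed point at distance rho from the origin, unit sphere
rho = math.sqrt(0.6 ** 2 + 0.5 ** 2 + 0.7 ** 2)
for k in (1e2, 1e4, 1e6):
    t = penalty_radius(rho, r=1.0, sigma=0.1, k=k)
    print(k, t - 1.0)  # violation of the constraint shrinks as k grows
```

The constraint is only satisfied approximately for any finite k, which is the price paid for turning the problem into an unconstrained minimization.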

Example 2: Consider an event in a scattering experiment in which the energy and momentum of every particle are measured. Conservation of energy and momentum then imposes four constraints, so there are four degrees of freedom.

This example may also be treated differently. If N particle tracks are observed, meeting at the same vertex, then the 3N+3 physically interesting variables are the vertex position $\mathbf{x}_v$ and the N 3-momenta $\mathbf{p}_1, \ldots, \mathbf{p}_N$. However, these are not directly measured; instead one measures altogether M coordinates $m_j$ on the N tracks, which are functions of the physical variables, i.e.

$$ m_j = g_j(\mathbf{x}_v, \mathbf{p}_1, \ldots, \mathbf{p}_N), \qquad j = 1, \ldots, M. $$

These are M equations with 3N+3 unknowns, so in this treatment there are M-3N-3 degrees of freedom. Adding the four energy and momentum conservation equations gives M-3N+1 degrees of freedom.
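The counting can be written out as a small Python helper. The function name and the sample numbers (3 tracks with 30 measured coordinates) are hypothetical and serve only to illustrate the arithmetic.

```python
def vertex_fit_ndf(n_measured, n_tracks, use_conservation=False):
    """Degrees of freedom of a vertex fit with n_measured coordinates on n_tracks tracks."""
    # Unknowns: 3 vertex coordinates plus one 3-momentum per track
    n_unknowns = 3 * n_tracks + 3
    ndf = n_measured - n_unknowns
    if use_conservation:
        # Energy and momentum conservation add four equations without new unknowns
        ndf += 4
    return ndf

print(vertex_fit_ndf(30, 3))                          # M - 3N - 3
print(vertex_fit_ndf(30, 3, use_conservation=True))   # M - 3N + 1
```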

In the last example the number of degrees of freedom happens to be equal to the number of measurements minus the number of parameters. Note that this relation is only true in the special case when there is one equation for every measured quantity, a common situation when fitting curves in two or three dimensions.



Rudolf K. Bock, 7 April 1998