by cschwarz@stat.sfu.ca » Wed Jan 20, 2010 6:47 pm
As dhewitt noted, the term "confounded parameters" is very confusing, and you should really be thinking in terms of estimable parameters.
Here is my take on the problem. Assume 3 stocks each of which has 2 releases.
Model 1: phi(time:stock) p(time:stock).
Let us look at the final survival * final capture terms that would appear in the likelihoods. These take the form
phi(s1,k-1) p(s1, k)
phi(s2,k-1) p(s2, k)
phi(s3,k-1) p(s3, k)
where phi(s1,k-1) is the survival term for stock 1 between sampling occasions k-1 and k; p(s1, k) is the capture rate for stock s1 at time k.
The likelihood only has 3 expressions and therefore ALL 6 parameters (3 phi and 3 p) are "confounded", i.e. cannot be estimated separately. It doesn't make sense to say that the p's are confounded but the phi's are not or vice versa. To say that there are 3 confounded parameters really doesn't have an interpretation other than perhaps saying only 3 functions of parameters can be estimated? I think the only sensible thing to say here is that there are 6 "raw" or "fundamental" parameters that cannot all be separately estimated.
So there are 6 fundamental parameters, but only 3 estimable (functions of) parameters, namely the 3 products above. When you do the parameter counting, add 3 (not 6) for these end-of-experiment terms.
Model 2: Phi(time:stock) p(time:release). There are a total of 9 raw parameters at the end of the experiment that appear in the form:
phi(s1, k-1) p(s1-r1, k)
phi(s1, k-1) p(s1-r2, k)
phi(s2, k-1) p(s2-r1, k)
phi(s2, k-1) p(s2-r2, k)
phi(s3, k-1) p(s3-r1, k)
phi(s3, k-1) p(s3-r2, k)
where p(s1-r1,k) is the recapture rate of release 1 of stock 1 at time k, etc.
All 9 parameters are confounded and cannot be individually estimated, but 6 estimable functions of parameters can be found as follows:
(1) phi(s1, k-1) p(s1-r1, k) giving the first product above
(2) p(s1-r2, k)/p(s1-r1, k), because [phi(s1, k-1) p(s1-r1, k)] * [p(s1-r2, k)/p(s1-r1, k)] gives the second product above.
A similar pattern for stocks 2 and 3 gives estimable functions 3 to 6.
Hence there are 6 estimable (functions of) parameters among the 9 raw parameters, so the parameter count for the end-of-study terms is only 6; 3 are lost.
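As a quick numerical check of the Model 2 argument, here is a minimal sketch. The parameter values are made up purely for illustration and the variable names are my own; the point is only that the 6 likelihood terms can be rebuilt from 3 products and 3 ratios, without knowing the 9 raw parameters individually.

```python
# Hypothetical values for the 9 raw end-of-study parameters (illustration only).
phi = {"s1": 0.80, "s2": 0.70, "s3": 0.60}
p = {("s1", "r1"): 0.50, ("s1", "r2"): 0.40,
     ("s2", "r1"): 0.30, ("s2", "r2"): 0.45,
     ("s3", "r1"): 0.25, ("s3", "r2"): 0.35}

# The 6 product terms that actually appear in the likelihood.
terms = {(s, r): phi[s] * p[(s, r)] for (s, r) in p}

# 6 estimable functions: per stock, one product and one ratio of capture rates.
product = {s: phi[s] * p[(s, "r1")] for s in phi}
ratio = {s: p[(s, "r2")] / p[(s, "r1")] for s in phi}

# Rebuild all 6 likelihood terms from the 6 estimable functions alone.
rebuilt = {}
for s in phi:
    rebuilt[(s, "r1")] = product[s]
    rebuilt[(s, "r2")] = product[s] * ratio[s]

max_abs_diff = max(abs(terms[key] - rebuilt[key]) for key in terms)
print(max_abs_diff)  # zero up to floating-point rounding
```

Any other set of raw values that yields the same 3 products and 3 ratios gives exactly the same likelihood terms, which is why the 9 raw parameters cannot be separated.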
Model 3: phi(time:stock) p(time). There are a total of 4 raw parameters (3 phi's and 1 p) for the end of study terms:
phi(s1, k-1) p(*,k)
phi(s2, k-1) p(*,k)
phi(s3, k-1) p(*,k)
But you can get the same 3 terms above with 3 estimable parameters being
phi(s1, k-1) p(*,k) which gives the first term above
phi(s2, k-1)/phi(s1,k-1) and the second term above can be derived as : [phi(s1, k-1) p(*,k)] * [phi(s2, k-1)/phi(s1,k-1)]
phi(s3, k-1)/phi(s1,k-1) and the third term above can be derived as: [phi(s1, k-1) p(*,k)] * [phi(s3, k-1)/phi(s1,k-1)]
So rather than counting 4 raw parameters for the end of study terms, there are only 3 estimable parameters.
In each of the cases above, the estimable functions are NOT unique and there are other functions of parameters that would give the same results, but the number of estimable functions of parameters is the same.
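The counting for the three models can also be sketched mechanically. On the log scale each product term becomes a sum of log-parameters, so the end-of-study terms are a linear function of the log-parameters, and the rank of the 0/1 matrix linking terms to parameters is the number of estimable functions. This only works because the terms here are simple products; it is an illustration of the hand count above, not how Mark does its counting, and the labels and helper name are my own.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix by exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below position `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

stocks = ["s1", "s2", "s3"]
releases = ["r1", "r2"]

# Each term is the tuple of raw parameter labels multiplied together in it.
models = {
    "Model 1": [(f"phi_{s}", f"p_{s}") for s in stocks],
    "Model 2": [(f"phi_{s}", f"p_{s}_{r}") for s in stocks for r in releases],
    "Model 3": [(f"phi_{s}", "p") for s in stocks],
}

counts = {}
for name, terms in models.items():
    params = sorted({label for term in terms for label in term})
    # Row = likelihood term, column = raw parameter, 1 if the parameter is in the product.
    design = [[1 if label in term else 0 for label in params] for term in terms]
    counts[name] = (len(params), matrix_rank(design))
    print(name, "raw:", counts[name][0], "estimable:", counts[name][1])
```

The rank stays the same whichever set of estimable functions you choose to write down, which matches the point that the functions are not unique but their number is.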
So how does Mark determine the number of estimable parameters? It uses a singular value decomposition of the information matrix (the inverse of the variance-covariance matrix) and counts the singular values (related to eigenvalues) that are close to 0. If all parameters are estimable, then the information matrix will be full rank and there will be no zero singular values. If the information matrix is less than full rank, this indicates that some rows (corresponding to the second derivatives of the log-likelihood) are linearly dependent on each other and the matrix cannot be inverted (is singular). I believe this will only detect simple confounding as seen in the expressions above but won't detect "weird" confounding, but don't quote me on this. So if there is one zero singular value, there is one row that is a linear combination of other rows, i.e. a redundancy has been detected, but you can't associate the redundancy with a single parameter in an unambiguous way. For example, for the term phi(s1,k-1) p(s1, k), which of the parameters is "redundant"?
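To see a zero singular value appear, here is a minimal toy sketch, not Mark's actual computation. It assumes a single binomial term in which n animals are alive at occasion k-1 and x of them are seen at occasion k, with probability phi*p. The data and parameter values are hypothetical, and the 2x2 information matrix is worked out by hand with the chain rule.

```python
import math

# Hypothetical data: n animals alive at occasion k-1, x of them seen at occasion k.
n, x = 100, 40
phi, p = 0.8, 0.5          # any point with phi * p = x / n maximises the likelihood
theta = phi * p

# Log-likelihood: l = x*log(theta) + (n - x)*log(1 - theta), with theta = phi*p.
g = x / theta - (n - x) / (1 - theta)             # dl/dtheta
g2 = -x / theta**2 - (n - x) / (1 - theta)**2     # d2l/dtheta2

# Information matrix = minus the Hessian with respect to (phi, p), by the chain rule.
a = -(p * p * g2)              # -d2l/dphi2
d = -(phi * phi * g2)          # -d2l/dp2
b = -(g + phi * p * g2)        # -d2l/dphi dp

# Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, d]]; for a symmetric
# matrix the singular values are the absolute values of the eigenvalues.
mean = (a + d) / 2
spread = math.hypot((a - d) / 2, b)
singular_values = sorted([abs(mean + spread), abs(mean - spread)], reverse=True)
print(singular_values)
```

One singular value is large and the other is zero up to rounding: two raw parameters, one estimable function. Nothing in the decomposition says whether phi or p is the "redundant" one.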
The determination of singular values is done numerically. Typically the singular values are sorted and any singular value that is less than a very small multiple (e.g. 10**(-6)) of the largest singular value is declared to be zero. This can miss some small singular values and hence give rise to a "mis-count" of redundancies. For example, suppose the singular values were 1,000,000, 1.1 and 0.9, and the "ratio" used to flag zero singular values was 10**(-6), so that the cutoff is 1. Then 0.9 would be declared a "zero" singular value, but 1.1 would NOT be, even though it clearly is!
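The cutoff rule in this example can be sketched in a few lines. The variable names are mine, and the actual tolerance used by Mark may differ from the 10**(-6) used here for illustration.

```python
# Singular values and ratio taken from the example in the text.
singular_values = [1_000_000.0, 1.1, 0.9]
ratio = 1e-6

# Anything below ratio * (largest singular value) is treated as zero.
cutoff = ratio * max(singular_values)
flagged_as_zero = [s for s in singular_values if s < cutoff]
estimated_rank = len(singular_values) - len(flagged_as_zero)

print("cutoff:", cutoff)
print("flagged as zero:", flagged_as_zero)
print("estimated number of estimable parameters:", estimated_rank)
```

Only 0.9 falls below the cutoff, so the numerical count reports 2 estimable parameters, even though 1.1 is just as negligible next to 1,000,000.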
Now going back to the list of beta estimates that was also presented earlier, it is true that RMark only highlighted some rows, but if you examine the se for the terms, you will see that the se of the raw parameters that are involved in the confounding relationship are either 0 or enormous. Both are indications of problems regardless of what RMark flags.
These "weird" se will "identify" which parameters are involved in confounding relationships, but do NOT tell you how many estimable functions must be counted. There is no easy, automatic way to do this (but see the papers by Catchpole and Gimenez for symbolic and numerical ways to do this, respectively). You need to write out the final terms of the likelihood and use "experience" to figure out what can be estimated, e.g. with parameter sets as in Models 2 and 3, one estimable term is usually a ratio. The first 1000 times doing this is the hardest.
So... the moral of the story is
- don't rely on automatic counting of parameters but do a systematic count based on how the PIM or DESIGN matrix is constructed.
- write out the likelihood terms near the end of the study for CJS (and related models) and similar terms near the start of the study for JS (and related models)
- use "experience" to try and figure out how many estimable functions should exist. For most CJS (and related models) these will be product terms and ratios as noted above. With experience, you will be able to spot these fairly easily.
Carl Schwarz.