Failed variance calculation and missing parameters

Questions concerning analysis/theory using the program DENSITY and the R package secr. Focus on spatially explicit analysis.

Failed variance calculation and missing parameters

Postby cebert » Tue Oct 06, 2015 9:37 am

Hello,
I am rather new to both R and secr, so maybe the answer to my question is obvious; however, I searched the forum for problems with failed variance calculations and I am not sure whether my problem matches any of the previous posts.
I am working on a red deer faecal DNA data set and have tried fitting several models. However, the heterogeneity model g0 ~ h2 seems to yield quite strange results. Its estimates (particularly D, g0 and sigma) are exactly the same as in the g0 ~ 1 model (which is very unlikely given the obvious heterogeneity in the capture frequencies), and there are warnings and NaNs. Here is what I entered:
Code: Select all
secr.fit(capthist = Soonwald_ROW, model = g0 ~ h2, buffer = 4000,
    detectfn = 2, method = "BFGS", trace = FALSE)

And here are the warnings/ error messages:
1: In secr.fit(Soonwald_ROW, model = g0 ~ h2, buffer = 4000, trace = FALSE, :
using default starting values
2: In log(1 - default$g0) : NaNs wurden erzeugt [i.e., NaNs were produced]
3: In secr.fit(Soonwald_ROW, model = g0 ~ h2, buffer = 4000, trace = FALSE, :
at least one variance calculation failed

I don't know how to deal with this problem, or whether it stems from my input data (even though read.capthist reported "no errors found"). Below you will also find the detailed output of the g0 ~ h2 model.
I would be very happy about any help or advice!
Thanks in advance,
Cornelia

Beta parameters (coefficients)
beta SE.beta lcl ucl
D -3.0559400 0.04458198 -3.143319 -2.968561
g0 -0.1712308 1.45407916 -3.021174 2.678712
g0.h22 0.6826030 1.45246795 -2.164182 3.529388
sigma 6.0112541 0.02819196 5.955999 6.066509
pmix.h22 5.7484330 NaN NaN NaN

Variance-covariance matrix of beta parameters
D g0 g0.h22 sigma pmix.h22
D 1.987553e-03 -1.927760e-03 1.257344e-03 -9.825988e-05 5.999664e-03
g0 -1.927760e-03 2.114346e+00 -2.111312e+00 6.008587e-05 -2.495463e+00
g0.h22 1.257344e-03 -2.111312e+00 2.109663e+00 2.674135e-07 2.503180e+00
sigma -9.825988e-05 6.008587e-05 2.674135e-07 7.947868e-04 -7.035829e-05
pmix.h22 5.999664e-03 -2.495463e+00 2.503180e+00 -7.035829e-05 -5.741327e+00

Fitted (real) parameters evaluated at base levels of covariates

session = Soon_alle, h2 = 1
link estimate SE.estimate lcl ucl
D log 4.707844e-02 0.002099894 0.04313938 0.05137719
g0 log 8.426271e-01 2.274182970 0.04874398 14.56632033
sigma log 4.079947e+02 11.504456524 386.06233798 431.17296529
pmix logit 3.177642e-03 NaN NaN NaN

session = Soon_alle, h2 = 2
link estimate SE.estimate lcl ucl
D log 0.04707844 0.002099894 0.04313938 0.05137719
g0 log 1.66757803 0.062094328 1.55024951 1.79378640
sigma log 407.99466057 11.504456524 386.06233798 431.17296529
pmix logit 0.99682236 NaN NaN NaN
cebert
 
Posts: 13
Joined: Wed Jun 17, 2009 2:34 pm

Re: Failed variance calculation and missing parameters

Postby murray.efford » Tue Oct 06, 2015 3:00 pm

Hello Cornelia
The 2-class finite mixture (h2) model includes the parameter 'pmix' for the probability that an individual belongs to each latent class. Your results show a tiny 'pmix' for the first class and a large 'pmix' (0.997) for the second class, so it's not surprising that the estimates from g0 ~ h2 are essentially the same as those from the 1-class model g0 ~ 1, or that there were problems with variance estimation. The mystery is why the numerical maximization goes to that boundary value ('pmix' approximately 1.0 for class 2). Possibly you just do not have enough data, although the SE for density looks quite good.
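[Editor's note: the boundary value Murray describes can be checked directly from the output above. A minimal base-R sketch, back-transforming the pmix.h22 coefficient from the logit scale, reproduces the fitted mixing proportion of ~0.997 for class 2:]

```r
# Beta coefficient for pmix.h22, taken from the fitted-model output above
# (pmix uses a logit link)
beta_pmix <- 5.7484330

# Inverse-logit back-transformation: 1 / (1 + exp(-beta))
pmix_class2 <- plogis(beta_pmix)
pmix_class2          # ~0.9968: essentially all animals assigned to class 2
1 - pmix_class2      # ~0.003: the "tiny pmix" for class 1
```

With virtually all the mass in one class, the two-class mixture collapses to the one-class model, and the likelihood surface is flat in the 'pmix' direction at the boundary, which is consistent with the NaN standard errors shown above.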

You say there is "obvious heterogeneity in the capture frequencies", but I wonder how you are judging this. I have reservations about routine use of 'h2' because there can be fitting problems and perhaps heterogeneity is not such a big worry, but I cannot really offer a clear explanation in your case. I may be able to say more if you want to send me the data offline, but no promises!
Murray
murray.efford
 
Posts: 686
Joined: Mon Sep 29, 2008 7:11 pm
Location: Dunedin, New Zealand

Re: Failed variance calculation and missing parameters

Postby cebert » Tue Oct 06, 2015 4:09 pm

Hello Murray,
thanks a lot for your quick reply!!
I will send you the data set via email.
I think it is quite large (more than 1200 samples from 600 individuals).
Concerning the heterogeneity, I carried out the Bayesian heterogeneity test (after Puechmaille & Petit 2007); furthermore, the likelihood ratio test in CAPWIRE indicated the two-innate-rates model, which also points to heterogeneous data (of course, one could argue about the informative value or significance of these tests). Anyway, I am very excited to read your comments on the data set!
All the best,
Cornelia

Re: Failed variance calculation and missing parameters

Postby murray.efford » Tue Oct 06, 2015 4:21 pm

Yes, you seem to have plenty of data!

I would warn against using non-spatial tests for heterogeneity. These almost inevitably give a positive answer when the data are spatial. Remember that one big plus for spatial methods is that they model (and accommodate) heterogeneity due to differential access to detectors (spatially induced heterogeneity). There is also my other point that even when individual heterogeneity is present it may not be large enough to cause significant bias in the estimates - that point is admittedly not so strong or general.

Murray

Re: Failed variance calculation and missing parameters

Postby murray.efford » Sat Oct 10, 2015 7:44 pm

It seems this was a problem with the choice of numerical optimization algorithm. The model fits using the default method (rather than 'BFGS'), and I would expect method = 'Nelder-Mead' to be even more robust, if slightly slower.
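[Editor's note: in call form, the fix amounts to dropping the method argument or naming Nelder-Mead explicitly. A hedged sketch reusing the call from the first post; the object Soonwald_ROW is Cornelia's capthist data, so this is not runnable without it:]

```r
library(secr)

# Default optimiser (Newton-type, via nlm) -- fits this model successfully:
fit_default <- secr.fit(Soonwald_ROW, model = g0 ~ h2, buffer = 4000,
                        detectfn = 2, trace = FALSE)

# Derivative-free Nelder-Mead: typically more robust, slightly slower:
fit_nm <- secr.fit(Soonwald_ROW, model = g0 ~ h2, buffer = 4000,
                   detectfn = 2, method = "Nelder-Mead", trace = FALSE)
```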

The warning "In log(1 - default$g0) : NaNs wurden erzeugt" was due to a data-dependent bug in the ad hoc calculation of default starting values for the numerical maximization for 'polygon' detectors. It has been fixed for the next release, and can safely be ignored in the meantime. (An unrelated issue in the use of mixture models with 3-parameter detection functions has also been fixed; it seems this combination had not been working for a while.)

Murray

Re: Failed variance calculation and missing parameters

Postby emma.s » Wed Mar 29, 2017 6:39 pm

Hi Murray,

I am very new to this method of estimating densities. I had a similar problem, where the message appeared after the model ran. However, I followed your recommended solution and changed the method to "Nelder-Mead", and it worked.
My question is whether I would need to change the method for all the models so that they are comparable, and what the possible reasons are that it did not work.

I have a one-year dataset of jaguar photo captures divided into 12 sessions of 29-31 days (i.e., one month = one session). I had run three months using the default method without any problems.

Thanks
Emma
emma.s
 
Posts: 4
Joined: Thu Mar 16, 2017 5:14 pm

Re: Failed variance calculation and missing parameters

Postby murray.efford » Wed Mar 29, 2017 6:49 pm

There should be no problem comparing models using different numerical algorithms for likelihood maximisation.

Another likely solution is simply to recalculate the variance-covariance matrix and confidence intervals, using your fitted model for starting values:
Code: Select all
newfit <- secr.fit(CH, [..other arguments..], method = 'none', start = oldfit)
predict(newfit)

where oldfit is the name of the previously fitted model. This recomputes the variance-covariance matrix without re-maximising the likelihood.

Be aware that if the default numerical algorithm is struggling it probably means the data are weak and/or model fit is poor, so you shouldn't build too much on it!

Murray

Re: Failed variance calculation and missing parameters

Postby emma.s » Sat Apr 01, 2017 2:52 pm

Hi,

Thank you for the response. It is much appreciated! I am working on the analysis for my Master's thesis at the moment. For this reason, I feel that at some point I will need to justify the use of a different maximization method, or explain why the variance calculation failed.
I have been looking into possible reasons why it may have failed. I ran a few more months to see if there is a trend, and I think we may have found a common characteristic. The sessions that failed to calculate variances have one or two individuals with a greater number of recaptures compared to the others. I will point out that the models that failed are not consistently using the same detection function. I am using detection functions 0, 1, 2 and 6 with the null model. Sessions have between 29 and 31 occasions.
What are your thoughts about this? If you need more information to make a more informed assessment, let me know.

Thank you again for your assistance.

Best,
Emma

Re: Failed variance calculation and missing parameters

Postby murray.efford » Mon Apr 03, 2017 4:55 am

Hi

We are talking about a numerical issue. Numerical issues happen, especially with sparse data.

The default maximisation method was chosen for secr.fit because it was usually the fastest of those available, but it is not the most robust. The variance estimates are based on the Hessian (essentially, the curvature) of the likelihood surface at the maximum, which happens to pop out of the gradient calculations used in the default method (Newton-Raphson in package nlm). The alternative method Nelder-Mead does not use the gradient to find the maximum, so a separate estimate of the Hessian is made automatically once the maximum has been found. The algorithm for that add-on estimate happens to be more sophisticated and delivers results even when the default nlm method fails. It is also the algorithm used when you set method="none" as I suggested.
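[Editor's note: the Hessian-to-variance step Murray describes can be illustrated without secr. A minimal base-R sketch using a toy binomial MLE (not an SECR model): nlm() returns the numerically estimated Hessian of the negative log-likelihood at the minimum, and inverting it gives the variance estimate, which here matches the analytic SE sqrt(p(1-p)/n):]

```r
# Toy data: k detections in n binary trials
n <- 50
k <- 20

# Negative log-likelihood for a single probability parameter p,
# guarded so the optimiser cannot step outside (0, 1)
negll <- function(p) {
  if (p <= 0 || p >= 1) return(1e10)
  -dbinom(k, n, p, log = TRUE)
}

# Newton-type minimisation (the same nlm engine secr.fit uses by default);
# hessian = TRUE returns the curvature at the optimum
fit <- nlm(negll, p = 0.5, hessian = TRUE)

p_hat <- fit$estimate                       # MLE, ~0.4
se_hessian  <- sqrt(1 / fit$hessian[1, 1])  # SE from the inverted Hessian
se_analytic <- sqrt(p_hat * (1 - p_hat) / n)
c(se_hessian, se_analytic)                  # both ~0.069
```

When the likelihood surface is flat or the optimiser stops short of the true maximum, this inverted-Hessian step is exactly where NaNs and "variance calculation failed" warnings arise.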

If you are nervous about using different maximisers then the obvious solution is to use the most robust for all datasets.

I would be more concerned about your rationale for trying 4 different detection functions when the data are sparse! The 'hazard-rate' function is a dog and needs real care. Annular ranges sound nice, but do you have enough data to recognise them?

Murray

