Hi,
I am struggling with repeated saddle point warnings and with finding a global minimum: my lowest deviance is reached when the model converges on a saddle point, despite my having reduced the tolerance.
I have a multi-event model with 52 occasions, five age classes, four states (Alive - High resighting probability, Alive - Low resighting probability, Newly dead, and Dead) and three events (0 Not seen, 1 Observed alive, 2 Found dead), so the model incorporates heterogeneity in resightability by using two classes of individuals. The transition part in E-Surge is decomposed into two steps: first survival, then a class-transition step in which a transition from Alive-High to Alive-Low can occur. Because of the study 'design', the heterogeneity in resighting rates is modelled during the last 10 occasions only. The model has mainly additive effects between age and time, and 233 parameters in total; all parameters can be estimated.
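In case it helps to make the two-step transition structure concrete, here is a minimal sketch in plain NumPy (nothing E-Surge-specific; the values of `s` and `psi`, and the assumption that Newly dead moves deterministically to Dead, are illustrative, not my actual estimates):

```python
import numpy as np

# States, in order: AH (Alive-High), AL (Alive-Low), ND (Newly dead), D (Dead).
s = 0.9    # survival probability (assumed equal for both alive classes here)
psi = 0.2  # probability of moving from Alive-High to Alive-Low, given survival

# Step 1: survival. Alive individuals survive with probability s or become
# Newly dead; Newly dead individuals move to Dead; Dead is absorbing.
survival = np.array([
    [s, 0.0, 1 - s, 0.0],  # from AH
    [0.0, s, 1 - s, 0.0],  # from AL
    [0.0, 0.0, 0.0, 1.0],  # from ND
    [0.0, 0.0, 0.0, 1.0],  # from D
])

# Step 2: class transition among survivors (AH -> AL allowed, no return).
transition = np.array([
    [1 - psi, psi, 0.0, 0.0],  # from AH
    [0.0,     1.0, 0.0, 0.0],  # from AL
    [0.0,     0.0, 1.0, 0.0],  # from ND
    [0.0,     0.0, 0.0, 1.0],  # from D
])

# Combined one-occasion transition matrix: survival first, then class change.
full = survival @ transition
print(full)  # each row sums to 1
```

So, for example, the probability of going from Alive-High to Alive-Low in one occasion is `s * psi` in this toy parameterisation.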
In the first run, which converged on a deviance of 66513.4 accompanied by a saddle point warning, I had already reduced the tolerances tenfold compared with the default settings (tolerance on parameter change: 1*10^-6; tolerance on gradient: 1*10^-7). Three of five sets of random starting values converged at, or near, this point.
Since then I have rerun the model with the same, and even further reduced, tolerances, without it converging on a lower deviance. Once, the model converged on a deviance of 66514.1, ending with the message "Failling with line search", which I understand to mean that the parameter estimates cannot be improved further, even though the precision defined by my tolerance settings has not been reached. In total the model has been run with 9 sets of random starting values without, as far as I can tell, ever finding a global minimum. (I have experienced the same problem with other models in the past.)
1) Does the saddle point warning in E-Surge always mean that convergence has been reached at a saddle point? Or can the mathematical diagnosis behind the warning be triggered even when the point of convergence is not actually a saddle point? I picture a case with a flat valley in the likelihood surface, where a whole line of points has the same likelihood.
2) In a parameter space of this many dimensions, does a saddle point always imply that a minimum with lower deviance exists somewhere else?
3) If the answers to both questions above are "yes", and further reducing the tolerance and rerunning the model would not help me find a global minimum, are there any conditions under which I can base my continued model selection on the first run? (A run with reduced tolerances and 2 sets of random starting values takes about 30 hours, so doing many runs is not feasible.)
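To show how I picture the distinction in questions 1 and 2, here is a toy sketch (plain NumPy, nothing to do with how E-Surge actually performs its diagnosis; the threshold `tol` and the example Hessians are my own assumptions): a true saddle has a strictly negative Hessian eigenvalue, so a descent direction with lower deviance exists, whereas a flat valley has a (near-)zero eigenvalue, and a naive "Hessian not positive definite" check could flag both.

```python
import numpy as np

def classify(hessian, tol=1e-8):
    """Classify a stationary point of a deviance surface by the
    eigenvalues of its Hessian (assumed symmetric):
      - some eigenvalue < 0      -> true saddle (deviance can still decrease)
      - some eigenvalue ~ 0      -> flat direction: a line of points with
                                    (nearly) the same deviance
      - all eigenvalues > 0      -> local minimum
    """
    eig = np.linalg.eigvalsh(hessian)  # real eigenvalues for symmetric input
    if np.any(eig < -tol):
        return "saddle"
    if np.any(np.abs(eig) <= tol):
        return "flat direction"
    return "minimum"

# True saddle: D(x, y) = x**2 - y**2 at (0, 0); Hessian = diag(2, -2).
print(classify(np.diag([2.0, -2.0])))  # saddle

# Flat valley: D(x, y) = x**2 at any point (0, y); Hessian = diag(2, 0).
print(classify(np.diag([2.0, 0.0])))   # flat direction

# Proper minimum: D(x, y) = x**2 + y**2 at (0, 0); Hessian = diag(2, 2).
print(classify(np.diag([2.0, 2.0])))   # minimum
```

If this picture is right, a warning at a flat valley would not necessarily mean a lower deviance exists, whereas at a true saddle it would, at least locally.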
Any help on how to think about these matters would be very much appreciated.
Martina