I just want to start off by saying I really appreciate all the quick responses I've been getting in this forum. It's been extremely helpful. The following question is going to be rather long-winded, as I'm a bit stumped on a population model I'm trying to run. Here it goes...
The data collected for this project is slightly complex so I'll try to explain some of the intricacies first:
1. Fish were collected on 7 occasions (2014, 2015, 2016, 2018, 2019, 2020, 2021). I've made sure to include the 2-year time interval between 2016 and 2018.
2. For 3 of those occasions (2018, 2019, 2021) fish were collected in two "sub-occasions" - spring (June) and fall (September). The previous researcher who worked on this model grouped the spring and fall data together into one year, so that's what I've done. In the other years, data was only collected in the spring. The purpose of adding the fall sampling period in later study years was to target juvenile fish, whereas spawning adults are primarily caught in the spring. The thought was that combining the two would provide a more realistic abundance estimate for the total population of fish.
On to the model....
I have run all the logical iterations of the model that I can think of, and the one with the lowest AICc has constant survival, time-varying probability of capture, and time-varying pent. See the results below.
Real Estimates:
Standard Error and Confidence Intervals Corrected for c-hat = 4.9700000
Real Function Parameters of {phi(.)p(t)pent(t)}
95% Confidence Interval
Parameter Estimate Standard Error Lower Upper
------------------------- -------------- -------------- -------------- --------------
1:Phi 0.9466707 0.0515098 0.7061360 0.9924321
2:p 0.9416312 106.3553601 0.0000000 1.0000000
3:p 0.2473903 0.0577810 0.1517632 0.3765258
4:p 0.1008674 0.0263458 0.0596930 0.1654455
5:p 0.0623677 0.0151611 0.0384767 0.0995570
6:p 0.1268955 0.0328089 0.0752226 0.2061516
7:p 0.0431800 0.0121553 0.0247255 0.0743583
8:p 0.0910026 0.0268993 0.0502681 0.1592125
9:pent 0.3528534 6.8133958 0.0000000 1.0000000
10:pent 0.2186957 0.1780907 0.0350218 0.6834275
11:pent 0.1563846 0.1659115 0.0155159 0.6855708
12:pent 0.0000018 0.0008882 0.0000000 0.9988262
13:pent 0.2093793 0.1542012 0.0409173 0.6217733
14:pent 0.0002111 0.0461906 0.0000000 0.9999902
15:N 5286.6608000 2095.0512485 1180.3603530 9392.9612
It's pretty obvious that I'm dealing with some non-identifiability: the first "p" (2014) and at least 3 of the pents look to be non-identifiable. Also, the c-hat for this data is very high, at 4.97.
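One rough way to screen the real estimates for the problem described above is to flag any probability whose 95% interval covers nearly the whole [0, 1] range. The sketch below does that on the values copied from the output. The 0.99 width cutoff and the name `flag_wide_intervals` are assumptions made for illustration, not anything MARK reports, and N is left out because it is not a probability.

```python
# Real estimates copied from the output above: (label, estimate, lower, upper).
REAL_ESTIMATES = [
    ("1:Phi", 0.9466707, 0.7061360, 0.9924321),
    ("2:p", 0.9416312, 0.0000000, 1.0000000),
    ("3:p", 0.2473903, 0.1517632, 0.3765258),
    ("4:p", 0.1008674, 0.0596930, 0.1654455),
    ("5:p", 0.0623677, 0.0384767, 0.0995570),
    ("6:p", 0.1268955, 0.0752226, 0.2061516),
    ("7:p", 0.0431800, 0.0247255, 0.0743583),
    ("8:p", 0.0910026, 0.0502681, 0.1592125),
    ("9:pent", 0.3528534, 0.0000000, 1.0000000),
    ("10:pent", 0.2186957, 0.0350218, 0.6834275),
    ("11:pent", 0.1563846, 0.0155159, 0.6855708),
    ("12:pent", 0.0000018, 0.0000000, 0.9988262),
    ("13:pent", 0.2093793, 0.0409173, 0.6217733),
    ("14:pent", 0.0002111, 0.0000000, 0.9999902),
]


def flag_wide_intervals(estimates, min_width=0.99):
    """Return labels whose 95% CI covers at least min_width of [0, 1].

    min_width is an arbitrary screening cutoff (an assumption), not a MARK rule.
    """
    return [
        label
        for label, _estimate, lower, upper in estimates
        if upper - lower >= min_width
    ]


flagged = flag_wide_intervals(REAL_ESTIMATES)
print(flagged)
```

With this cutoff the flagged parameters are the first p and three of the pents, which lines up with the ones that look suspicious by eye.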
Derived Estimates:
Estimates of Derived Parameters
Gross Birth+Immigration Estimates of {phi(.)p(t)pent(t)}
95% Confidence Interval
Grp. Occ. B*-hat Standard Error Lower Upper
---- ---- -------------- -------------- -------------- --------------
1 1 1916.9993 16279.348 33.047162 111201.27
1 2 1188.1407 384.14401 640.42444 2204.2856
1 3 872.88915 384.88352 382.09626 1994.0930
1 4 0.0095290 2.1643154 0.1496431E-04 6.0678806
1 5 1137.5260 360.21724 620.61499 2084.9727
1 6 1.1468448 112.57871 0.0030295 434.14646
Net Birth+Immigration Estimates of {phi(.)p(t)pent(t)}
95% Confidence Interval
Grp. Occ. B-hat Standard Error Lower Upper
---- ---- -------------- -------------- -------------- --------------
1 1 1865.4163 15840.857 32.158761 108206.23
1 2 1156.1700 373.98274 623.01981 2145.5643
1 3 826.75229 363.91276 362.37244 1886.2344
1 4 0.0092726 2.1060772 0.1456165E-04 5.9046041
1 5 1106.9172 349.89026 604.54714 2026.7496
1 6 1.1159853 109.54927 0.0029480 422.46398
Population Estimates of {phi(.)p(t)pent(t)}
95% Confidence Interval
Grp. Occ. N-hat Standard Error Lower Upper
---- ---- -------------- -------------- -------------- --------------
1 1 330.27973 16733.264 1.3609120 80155.585
1 2 2178.0825 215.86356 1794.3975 2643.8084
1 3 3213.3635 337.40462 2617.1376 3945.4192
1 4 3705.6255 328.63245 3115.4397 4407.6155
1 5 3507.0697 377.32051 2842.0162 4327.7507
1 6 4423.1706 463.37755 3604.1446 5428.3166
1 7 4186.5086 516.89656 3289.6661 5327.8519
Gross Population Estimates of {phi(.)p(t)pent(t)}
95% Confidence Interval
Grp. Occ. N*-hat Standard Error Lower Upper
---- ---- -------------- -------------- -------------- --------------
1 0 5446.9913 540.73908 4486.0169 6613.8214
The only really wonky estimate here is the first N estimate (330.28, with a standard error of 16733). Otherwise these look reasonable for what we'd expect to see. Perhaps these numbers are meaningless given the non-identifiability in many of the real estimates? I'm unsure of how to interpret these.
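To see how the derived estimates depend on the real ones, here is a minimal sketch that rebuilds the net B-hat values and the first N-hat from the real parameters. It assumes the usual POPAN relationships: net entries at each occasion equal N times that occasion's pent, and the entry probability for the first occasion is whatever is left after summing the estimated pents. The function names are hypothetical.

```python
# Real estimates copied from the {phi(.)p(t)pent(t)} output above.
N_SUPER = 5286.6608  # 15:N, the super-population size
PENT = [0.3528534, 0.2186957, 0.1563846, 0.0000018, 0.2093793, 0.0002111]  # 9:pent to 14:pent


def net_births(n_super, pent):
    """Net entrants per interval, assuming B_i = N * pent_i."""
    return [n_super * p for p in pent]


def first_occasion_n(n_super, pent):
    """Abundance at occasion 1, assuming pent for occasion 1 is 1 - sum(pent)."""
    return n_super * (1.0 - sum(pent))


b_hat = net_births(N_SUPER, PENT)
n1_hat = first_occasion_n(N_SUPER, PENT)
print([round(b, 2) for b in b_hat])
print(round(n1_hat, 2))
```

These agree with the B-hat column and with the first N-hat to within the rounding of the printed estimates. If that reading is right, the first N-hat depends directly on the poorly estimated 9:pent, which would explain its huge standard error.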
Important Info from Release GOF test:
Observed Recaptures for Group 1
Group 1
  i   R(i)             m(i,j)               r(i)
          j=    2    3    4    5    6    7
  1    311     73   26    9   23   12    3   146
  2    534          46   22   32   13   16   129
  3    323               19   49   13   15    96
  4    230                    25    7   44    76
  5    441                          8   31    39
  6    189                               9     9
      m(j)     73   72   50  129   53  118
      z(j)     73  130  176  123  109    0
Sums for the above Groups
   m.    0   73   72   50  129   53  118
   z.    0   73  130  176  123  109    0
   R.  311  534  323  230  441  189
   r.  146  129   96   76   39    9
Summary of TEST 3 (Goodness of fit) Results
Group Component Chi-square df P-level Sufficient Data
----- --------- ---------- ---- ------- ---------------
1 3.SR2 3.1373 1 0.0765 Yes
1 3.SR3 9.2718 1 0.0023 Yes
1 3.SR4 0.0277 1 0.8677 Yes
1 3.SR5 5.7300 1 0.0167 Yes
1 3.SR6 3.3711 1 0.0664 Yes
Group 1 3.SR 21.5379 5 0.0006
1 3.Sm2 8.2774 1 0.0040 Yes
1 3.Sm3 0.0000 1 1.0000 Yes
1 3.Sm4 15.6052 1 0.0001 Yes
1 3.Sm5 0.0000 1 1.0000 Yes
Group 1 3.Sm 23.8826 4 0.0001
Group 1 TEST 3 45.4205 9 0.0000
Summary of TEST 2 (Goodness of fit) Results
Group Component Chi-square df P-level Sufficient Data
----- --------- ---------- ---- ------- ---------------
1 2.C2 6.3802 4 0.1725 Yes
1 2.C3 2.4261 3 0.4888 Yes
1 2.C4 37.1915 2 0.0000 Yes
1 2.C5 3.1041 1 0.0781 Yes
Group 1 TEST 2 49.1018 10 0.0000
Goodness of Fit Results (TEST 2 + TEST 3) by Group
Group Chi-square df P-level
----- ---------- ---- -------
1 94.5224 19 0.0000
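As a sanity check on the output above, the sketch below recomputes the m-array summaries r(i), m(j) and z(j) from the m(i,j) entries, and divides the overall TEST 2 + TEST 3 chi-square by its degrees of freedom. That ratio is one common way to estimate c-hat and appears to be where the 4.97 comes from. The function names are hypothetical, and the z recursion is the standard bookkeeping identity rather than anything taken from the RELEASE source.

```python
# m(i,j) copied from the m-array above. Row i holds the first recaptures of
# animals released at occasion i, for recapture occasions j = i+1 .. 7.
M_ARRAY = [
    [73, 26, 9, 23, 12, 3],
    [46, 22, 32, 13, 16],
    [19, 49, 13, 15],
    [25, 7, 44],
    [8, 31],
    [9],
]
GOF_CHI_SQUARE = 94.5224  # TEST 2 + TEST 3 total from the output above
GOF_DF = 19


def row_totals(m):
    """r(i): animals released at occasion i that were ever recaptured."""
    return [sum(row) for row in m]


def column_totals(m):
    """m(j): marked animals caught at occasion j, for j = 2 .. 7."""
    # Row i (0-based) starts one occasion later than row i-1, so the entry
    # for recapture column j sits at m[i][j - i].
    return [sum(m[i][j - i] for i in range(j + 1)) for j in range(len(m))]


def z_counts(r, m_col):
    """z(j): caught before j, missed at j, caught again later."""
    z, carried = [], 0
    for r_i, m_j in zip(r, m_col):
        carried = carried + r_i - m_j  # z(j+1) = z(j) + r(j) - m(j+1)
        z.append(carried)
    return z


r = row_totals(M_ARRAY)
m_col = column_totals(M_ARRAY)
z = z_counts(r, m_col)
c_hat = GOF_CHI_SQUARE / GOF_DF
print(r, m_col, z, round(c_hat, 2))
```

The recomputed totals match the r(i), m(j) and z(j) rows in the output, so the m-array itself is internally consistent.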
I'm still learning how to interpret these tests, so I'm curious if anything jumps off the page in certain years. I understand that the high chi-square values and low p-values are a red flag, but I'm having a hard time figuring out why that's the case (see the m-array above). From what I can tell, these results indicate that the model simply doesn't fit the data. If that's the case, I'm curious about what I could try next. A few options I've looked into include:
1. Filtering data by "adults only". This eliminates the potential differences in some of the parameters between adults and juveniles (thinking survival mostly). I still get identifiability issues, but the c-hat is much lower (~1.4). An option I haven't yet tried is adding a grouping variable for adults/juveniles for the whole dataset. One potential issue that I can see arising is that some individuals were juveniles when the study began, but are adults now. Also, juveniles were only really targeted in the latter half of the study, which isn't ideal.
2. I tried simplifying capture probability to "high" and "low" water years. Basically, when the water is high, it is more difficult to catch the fish and researchers know this. This model has a higher AIC than the one above (by >10) and still has issues with non-identifiability. Interestingly, the probability of capture estimates meet expectations and are about half as high in high water years vs. low water years.
3. I've tried simplifying the pent structure (looked at high/low water years), but I get crazy high AIC values for these models (>100000). I'm still using the Mlogit(1) link when simplifying, so maybe that's incorrect?
Things I think I could try, but haven't:
1. A separate capture event for those fall sampling periods (i.e. separate fall and spring data, and adjust time intervals accordingly). I'm unsure if this is worthwhile, since the fall sampling mostly targets juveniles.
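For reference, this is a small sketch of what the time intervals would look like under the pooled and the split designs. It assumes every spring occasion is dated to June and every fall occasion to September of the same year, which is a simplification of the real sampling dates; `intervals_in_years` is a hypothetical helper.

```python
# Sampling occasions as (year, month). June = spring, September = fall.
# Dating each occasion to a single month is an assumption for illustration.
POOLED = [(year, 6) for year in (2014, 2015, 2016, 2018, 2019, 2020, 2021)]
SPLIT = sorted(POOLED + [(year, 9) for year in (2018, 2019, 2021)])


def intervals_in_years(occasions):
    """Gaps between consecutive occasions, expressed in years."""
    months = [12 * year + month for year, month in occasions]
    return [(later - earlier) / 12 for earlier, later in zip(months, months[1:])]


pooled_intervals = intervals_in_years(POOLED)
split_intervals = intervals_in_years(SPLIT)
print(pooled_intervals)
print(split_intervals)
```

Splitting would take the design from 7 occasions to 10, with 0.25-year gaps between spring and fall and 0.75-year gaps from fall to the next spring.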
If anyone could provide some insight/advice on how I could proceed it would be much appreciated!
Cody