POPAN Analysis Help

Post by cjackson1373 » Tue Apr 26, 2022 2:39 pm

Hi,

I just want to start off by saying I really appreciate all the quick responses I've been getting in this forum. It's been extremely helpful. The following question is going to be rather long-winded as I'm a bit stumped on a population model I'm trying to run. Here goes...

The data collected for this project is slightly complex so I'll try to explain some of the intricacies first:

1. Fish were collected on 7 occasions (2014, 2015, 2016, 2018, 2019, 2020, 2021). I've made sure to include the 2-year time interval between 2016 and 2018.

2. For 3 of those occasions (2018, 2019, 2021) fish were collected in two "sub-occasions" - spring (June) and fall (September). The previous researcher who worked on this model grouped the spring and fall data together into one year, so that's what I've done. In the other years, data was only collected in the spring. The purpose of adding the fall sampling period in later study years was to target juvenile fish, whereas spawning adults are primarily caught in the spring. The thought was that combining the two would provide a more realistic abundance estimate for the total population of fish.
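
For reference, the unequal spacing described in item 1 can be derived directly from the sampling years. This is only a minimal Python sketch; the variable names are illustrative and nothing here comes from MARK itself.

```python
# Sampling years as listed in item 1; 2017 was not sampled.
years = [2014, 2015, 2016, 2018, 2019, 2020, 2021]

# Length of each interval between consecutive occasions, in years.
intervals = [later - earlier for earlier, later in zip(years, years[1:])]

print(intervals)  # [1, 1, 2, 1, 1, 1]
```

Seven occasions give six intervals, and only the third one (2016 to 2018) spans 2 years.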

On to the model....

I have run all the logical iterations of the model that I can think of, and the model with the lowest AICc has constant survivorship, time-varying probability of capture and time-varying pent. See the results below.

Real Estimates:
Code:
Standard Error and Confidence Intervals Corrected for c-hat = 4.9700000

                          Real Function Parameters of {phi(.)p(t)pent(t)}           
                                                               95% Confidence Interval   
 Parameter                   Estimate       Standard Error      Lower           Upper     
 -------------------------  --------------  --------------  --------------  --------------
    1:Phi                   0.9466707       0.0515098       0.7061360       0.9924321     
    2:p                     0.9416312       106.3553601     0.0000000       1.0000000     
    3:p                     0.2473903       0.0577810       0.1517632       0.3765258     
    4:p                     0.1008674       0.0263458       0.0596930       0.1654455     
    5:p                     0.0623677       0.0151611       0.0384767       0.0995570     
    6:p                     0.1268955       0.0328089       0.0752226       0.2061516     
    7:p                     0.0431800       0.0121553       0.0247255       0.0743583     
    8:p                     0.0910026       0.0268993       0.0502681       0.1592125     
    9:pent                  0.3528534       6.8133958       0.0000000       1.0000000     
   10:pent                  0.2186957       0.1780907       0.0350218       0.6834275     
   11:pent                  0.1563846       0.1659115       0.0155159       0.6855708     
   12:pent                  0.0000018       0.0008882       0.0000000       0.9988262     
   13:pent                  0.2093793       0.1542012       0.0409173       0.6217733     
   14:pent                  0.0002111       0.0461906       0.0000000       0.9999902     
   15:N                     5286.6608000    2095.0512485    1180.3603530    9392.9612 


It's pretty obvious that I'm dealing with some non-identifiability, specifically with that first "p" (2014) and at least 3 of the pents look to be non-identifiable. Also, the c-hat for this data is very high, at 4.97.
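
The c-hat of 4.97 is consistent with the combined RELEASE chi-square (TEST 2 + TEST 3, reported further down in this post) divided by its degrees of freedom. A small Python sketch of that arithmetic, using only numbers copied from the RELEASE summary; the variable names are illustrative:

```python
# Chi-square and df copied from the RELEASE summary tables in this post.
release = {
    "TEST 3": (45.4205, 9),
    "TEST 2": (49.1018, 10),
    "TEST 2 + TEST 3": (94.5224, 19),
}

# Variance inflation factor: chi-square divided by degrees of freedom.
c_hat = {name: chi2 / df for name, (chi2, df) in release.items()}

for name, value in c_hat.items():
    print(f"{name}: {value:.2f}")
# TEST 3: 5.05
# TEST 2: 4.91
# TEST 2 + TEST 3: 4.97
```

Both tests contribute about equally to the overall value, so the lack of fit is not confined to one of them.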

Derived Estimates:
Code:
                      Estimates of Derived Parameters
         Gross Birth+Immigration Estimates of {phi(.)p(t)pent(t)}
                                                95% Confidence Interval
 Grp. Occ.   B*-hat          Standard Error      Lower           Upper
 ---- ----   --------------  --------------  --------------  --------------
   1     1    1916.9993       16279.348       33.047162       111201.27   
   1     2    1188.1407       384.14401       640.42444       2204.2856   
   1     3    872.88915       384.88352       382.09626       1994.0930   
   1     4    0.0095290       2.1643154       0.1496431E-04   6.0678806   
   1     5    1137.5260       360.21724       620.61499       2084.9727   
   1     6    1.1468448       112.57871       0.0030295       434.14646   
          Net Birth+Immigration Estimates of {phi(.)p(t)pent(t)}
                                                95% Confidence Interval
 Grp. Occ.   B-hat           Standard Error      Lower           Upper
 ---- ----   --------------  --------------  --------------  --------------
   1     1    1865.4163       15840.857       32.158761       108206.23   
   1     2    1156.1700       373.98274       623.01981       2145.5643   
   1     3    826.75229       363.91276       362.37244       1886.2344   
   1     4    0.0092726       2.1060772       0.1456165E-04   5.9046041   
   1     5    1106.9172       349.89026       604.54714       2026.7496   
   1     6    1.1159853       109.54927       0.0029480       422.46398   
               Population Estimates of {phi(.)p(t)pent(t)}
                                                95% Confidence Interval
 Grp. Occ.   N-hat           Standard Error      Lower           Upper
 ---- ----   --------------  --------------  --------------  --------------
   1     1    330.27973       16733.264       1.3609120       80155.585   
   1     2    2178.0825       215.86356       1794.3975       2643.8084   
   1     3    3213.3635       337.40462       2617.1376       3945.4192   
   1     4    3705.6255       328.63245       3115.4397       4407.6155   
   1     5    3507.0697       377.32051       2842.0162       4327.7507   
   1     6    4423.1706       463.37755       3604.1446       5428.3166   
   1     7    4186.5086       516.89656       3289.6661       5327.8519   
            Gross Population Estimates of {phi(.)p(t)pent(t)}
                                                95% Confidence Interval
 Grp. Occ.   N*-hat          Standard Error      Lower           Upper
 ---- ----   --------------  --------------  --------------  --------------
   1     0    5446.9913       540.73908       4486.0169       6613.8214   


The only really wonky estimate here is the first N-hat (330.28). Otherwise these look reasonable for what we'd expect to see. Perhaps these numbers are meaningless given the non-identifiability in many of the real estimates? I'm unsure of how to interpret these.
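
It may help to see where the first N-hat comes from. In the POPAN parameterization, the share of the super-population present at the first occasion is whatever is left after the estimated pents, so N-hat for occasion 1 depends on all the pents at once. The Python sketch below reproduces the reported values from the real estimates above; the variable names are illustrative.

```python
# Real estimates copied from the {phi(.)p(t)pent(t)} output above.
N_super = 5286.6608
pents = [0.3528534, 0.2186957, 0.1563846, 0.0000018, 0.2093793, 0.0002111]

# Proportion of the super-population already present at occasion 1.
pent_0 = 1.0 - sum(pents)

# N-hat at occasion 1 and net entries after each occasion.
n_hat_1 = pent_0 * N_super
net_births = [pent * N_super for pent in pents]

print(round(pent_0, 4))         # 0.0625
print(round(n_hat_1, 2))        # 330.28
print(round(net_births[0], 1))  # 1865.4
```

Because the first pent has a standard error of 6.8, the remainder `pent_0` is essentially undetermined, and so is the N-hat of 330.28 that depends on it.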

Important Info from Release GOF test:
Code:
                        Observed Recaptures for Group 1
                                    Group 1

             i   R(i)                     m(i,j)                    r(i)
                        j=  2      3      4      5      6      7
             1    311      73     26      9     23     12      3     146
             2    534             46     22     32     13     16     129
             3    323                    19     49     13     15      96
             4    230                           25      7     44      76
             5    441                                   8     31      39
             6    189                                          9       9

            m(j)           73     72     50    129     53    118
            z(j)           73    130    176    123    109      0

                           Sums for the above Groups

            m.       0     73     72     50    129     53    118
            z.       0     73    130    176    123    109      0
            R.     311    534    323    230    441    189
            r.     146    129     96     76     39      9

                 Summary of TEST 3 (Goodness of fit) Results
 
          Group  Component  Chi-square   df   P-level  Sufficient Data
          -----  ---------  ----------  ----  -------  ---------------
            1    3.SR2         3.1373     1    0.0765       Yes
            1    3.SR3         9.2718     1    0.0023       Yes
            1    3.SR4         0.0277     1    0.8677       Yes
            1    3.SR5         5.7300     1    0.0167       Yes
            1    3.SR6         3.3711     1    0.0664       Yes
          Group 1 3.SR        21.5379     5    0.0006
            1    3.Sm2         8.2774     1    0.0040       Yes
            1    3.Sm3         0.0000     1    1.0000       Yes
            1    3.Sm4        15.6052     1    0.0001       Yes
            1    3.Sm5         0.0000     1    1.0000       Yes
          Group 1 3.Sm        23.8826     4    0.0001
          Group 1 TEST 3      45.4205     9    0.0000
 
 
                   Summary of TEST 2 (Goodness of fit) Results
 
          Group  Component  Chi-square   df   P-level  Sufficient Data
          -----  ---------  ----------  ----  -------  ---------------
            1    2.C2          6.3802     4    0.1725       Yes
            1    2.C3          2.4261     3    0.4888       Yes
            1    2.C4         37.1915     2    0.0000       Yes
            1    2.C5          3.1041     1    0.0781       Yes
          Group 1 TEST 2      49.1018    10    0.0000
 
 
               Goodness of Fit Results (TEST 2 + TEST 3) by Group
 
                        Group  Chi-square   df   P-level
                        -----  ----------  ----  -------
                          1      94.5224    19    0.0000


I'm still learning how to interpret these tests, so I'm curious if anything jumps off the page in certain years. I understand that the high chi-squared values and low p-values are a red flag, but I'm having a hard time figuring out why that's the case (see the m-array above). From what I can tell, these results indicate that the data simply doesn't fit the model. If that's the case, I'm curious about what I could try next. A few options I've looked into include:

1. Filtering data by "adults only". This eliminates the potential differences in some of the parameters between adults and juveniles (thinking survival mostly). I still get identifiability issues, but a much lower c-hat (~1.4). An option I haven't yet tried is adding a grouping variable for adults/juveniles for the whole dataset. One potential issue that I can see arising is that some individuals were juveniles when the study began, but are adults now. Also, juveniles were only really targeted in the latter half of the study, which isn't ideal.

2. I tried simplifying capture probability to "high" and "low" water years. Basically, when the water is high, it is more difficult to catch the fish and researchers know this. This model has a higher AIC than the one above (by >10) and still has issues with non-identifiability. Interestingly, the probability of capture estimates meet expectations and are about half as high in high water years vs. low water years.

3. I've tried simplifying the pent structure (looked at high/low water years), but I get crazy high AIC values for these models (>100000). I'm still using the Mlogit(1) link when simplifying, so maybe that's incorrect?
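
On the Mlogit question in item 3: the multinomial logit link keeps a set of probabilities jointly below 1, which is why it is used for the pents. The Python sketch below shows the link and its inverse, applied to the pent estimates from the model above. It illustrates only the link function, not how MARK builds the design matrix for a simplified pent structure; the function names are hypothetical.

```python
import math

def mlogit(betas):
    """Map unconstrained betas to probabilities whose sum is below 1."""
    denom = 1.0 + sum(math.exp(b) for b in betas)
    return [math.exp(b) / denom for b in betas]

def inverse_mlogit(probs):
    """Recover betas from probabilities; the remainder is the reference class."""
    remainder = 1.0 - sum(probs)
    return [math.log(p / remainder) for p in probs]

# pent estimates copied from the {phi(.)p(t)pent(t)} output.
pents = [0.3528534, 0.2186957, 0.1563846, 0.0000018, 0.2093793, 0.0002111]

betas = inverse_mlogit(pents)
round_trip = mlogit(betas)

print([round(b, 2) for b in betas])
print(round(sum(round_trip), 4))  # 0.9375
```

The pents near zero map to large negative betas, which is one place where numerical trouble can show up when the structure is changed.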

Things I think I could try, but haven't:

1. A separate capture event for those fall sampling periods (i.e. separate fall and spring data, and adjust time intervals accordingly). I'm unsure if this is worthwhile, since the fall sampling mostly targets juveniles.

If anyone could provide some insight/advice on how I could proceed it would be much appreciated!

Cody

Re: POPAN Analysis Help

Post by stshroye » Fri Apr 29, 2022 3:55 pm

I have a couple of suggestions, but not a complete answer. I definitely would not use the model in your example. You sampled two different populations (adult and juvenile) at different times of year, which violated assumptions of your models and resulted in the lack of fit detected by RELEASE. The c-hat for adults was OK, so I would limit the analysis to adults only and recognize that the statistical population you actually sampled was not the entire fish population. You might be able to model juveniles separately if your data are adequate.

Review the section on POPAN models in the MARK book. In your example, the first p and first pent were confounded, so the N-hat for 2014 was meaningless. Since phi was constant, I think all the other pents should have been identifiable, but recruitment actually may have been virtually zero in a couple of years. Data cloning might help you understand this.

Re: POPAN Analysis Help

Post by cjackson1373 » Mon May 02, 2022 4:00 pm

Thanks so much for the response!

I agree, the adults-only model is the best way to go. After paring down the data, the model with the lowest AICc has a time-varying Phi, two p's (two companies ran the netting program, so one parameter for each) and time-varying pent.

Code:
  Standard Error and Confidence Intervals Corrected for c-hat = 2.1350000

                        Real Function Parameters of {phi(t)p(study)pent(t)}       
                                                               95% Confidence Interval   
 Parameter                   Estimate       Standard Error      Lower           Upper     
 -------------------------  --------------  --------------  --------------  --------------
    1:Phi                   0.8888915       0.0140101       0.8583232       0.9135288     
    2:Phi                   1.0000000       0.0000002       0.9999997       1.0000003     
    3:Phi                   0.8198095       0.1222901       0.4731487       0.9584183     
    4:Phi                   0.8421802       0.1018547       0.5430279       0.9599419     
    5:Phi                   1.0000000       0.0000030       0.9999941       1.0000059     
    6:Phi                   0.3625162       0.0439553       0.2814551       0.4522302     
    7:Phi                   1.0000000       0.0000089       0.9999825       1.0000175     
    8:p                     0.1943581       0.0266892       0.1472903       0.2520215     
    9:p                     0.0939167       0.0179050       0.0642166       0.1353661     
   10:pent                  0.0647776       0.0253445       0.0296039       0.1358903     
   11:pent                  0.0418411       0.0231584       0.0138799       0.1193152     
   12:pent                  0.0446228       0.0272227       0.0131846       0.1403613     
   13:pent                  0.1205948       0.0409690       0.0604275       0.2262446     
   14:pent                  0.2705980       0.0480220       0.1871759       0.3740888     
   15:pent                  0.0000000       0.0000001       0.0000000       0.1382659     
   16:pent                  0.0000000       0.0000578       0.0000000       0.9111818     
   17:N                     5869.0990000    542.7089075     4805.3895412    6932.8084588 


Most of the parameters are estimable, but several are near the 0 or 1 boundaries, and the 95% confidence intervals suggest that extrinsic non-identifiability could be an issue. To assess this, I followed your advice and cloned the data 100 times. For the identifiable parameters, all SEs were reduced by a factor of 10 (the square root of 100), so that's good. Here are the cloning results for the parameters in question (2, 5, 7, 15 and 16):

Code:
Label   Estimate X 100   SE X 100       LCB X 100       UCB X 100       SE Ratio
2       1                8.32679E-08    0.999999837     1.000000163     1.20094237
5       1                2.81417E-07    0.999999448     1.000000552     7.462229026
7       1                9.37773E-07    0.999998162     1.000001838     6.504771079
15      4.28857E-09      3.09463E-07    -6.02259E-07    6.10836E-07     0
16      1.26459E-08      0              1.26459E-08     1.26459E-08     #DIV/0!


From reading the cloning section in the MARK book, I think the Phi parameters are good (and I think they were fine in the first place) since the confidence interval got smaller around the estimate of 1. The results for parameters 15 and 16 are a bit more puzzling. For parameter 15, I get a negative value for the LCB, which I don't quite understand, and for parameter 16 the SE is estimated to be zero. Since the LCB X 100 and UCB X 100 for these two parameters are essentially zero, does that mean that the model is correctly estimating these parameters? A probability of entry of zero would make biological sense in this study.
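
Two pieces of arithmetic behind the cloning table can be checked directly. With 100 clones, an identifiable parameter's SE should shrink by about sqrt(100) = 10, and the bounds in the table are reproduced by a symmetric estimate ± 1.96 × SE interval on the real scale, which is why the LCB for parameter 15 can go negative. The Python sketch below uses only values from the table; the 20% tolerance and the helper name are arbitrary choices for illustration, not MARK settings.

```python
import math

n_clones = 100
expected_ratio = math.sqrt(n_clones)  # 10.0 for an identifiable parameter

# SE ratios copied from the cloning table; None marks the #DIV/0! entry.
se_ratio = {2: 1.20094237, 5: 7.462229026, 7: 6.504771079, 15: 0.0, 16: None}

def shrinks_as_expected(ratio, expected, rel_tol=0.2):
    """True if the SE ratio is within rel_tol of the expected shrinkage."""
    return ratio is not None and abs(ratio - expected) <= rel_tol * expected

flags = {label: shrinks_as_expected(r, expected_ratio)
         for label, r in se_ratio.items()}
print(flags)  # every label maps to False

# Symmetric interval for parameter 15, using its cloned estimate and SE.
est, se = 4.28857e-09, 3.09463e-07
lcb = est - 1.96 * se
ucb = est + 1.96 * se
print(f"{lcb:.5e} {ucb:.5e}")  # -6.02259e-07 6.10836e-07
```

None of the five parameters shows the factor-of-10 shrinkage. That is what sets them apart from the well-behaved parameters, but the ratio alone does not say whether the cause is a boundary estimate or a parameter the data cannot identify.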

Note: Standard errors for the model were corrected with c-hat. When cloning, I set the value back to 1.
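
For anyone comparing the two sets of standard errors: a c-hat correction multiplies each variance by c-hat, so each SE is inflated by sqrt(c-hat). The Python sketch below applies that to the first Phi from the adults-only output, assuming the reported SE is the corrected one, as the output header states.

```python
import math

c_hat = 2.135
se_adjusted = 0.0140101  # SE of 1:Phi as reported with the c-hat correction

# Variances scale by c-hat, so standard errors scale by its square root.
inflation = math.sqrt(c_hat)
se_unadjusted = se_adjusted / inflation

print(round(inflation, 3))      # 1.461
print(round(se_unadjusted, 4))  # 0.0096
```

The SE ratio of about 10 only applies when both the original and the cloned SEs are taken at the same c-hat.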

