POPAN N lower than total animals included in dataset

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

POPAN N lower than total animals included in dataset

Postby j.harv3y » Tue Jun 02, 2020 7:24 am

Hi,
I'm running a POPAN model with ~100 individuals over 6 years. N is estimated to be only around 57.
I'm a little confused as to why the N is estimated as lower than the number of animals included in the dataset.
The standard error is reasonable (1.85x10-04) and everything else with the model seems to check out ok.

Could someone explain why this may be the case?

Many thanks!

Jess
j.harv3y
 
Posts: 45
Joined: Mon Oct 08, 2018 4:45 am

Re: POPAN N lower than total animals included in dataset

Postby Bill Kendall » Tue Jun 02, 2020 12:25 pm

That is strange. So it's the estimate of the superpopulation (N*) that's 57, and not N for an individual year? I didn't know it could do that. One thing that stands out is the standard error you list. Whichever way I read that (1.85 times 10 to the 4th or 1.85 times 10 to the -4th), it's not reasonable. With 100 individuals over 6 years, unless detection is very high I suspect the sparseness of the data is causing major problems.
Bill Kendall
 
Posts: 96
Joined: Wed Jun 04, 2003 8:58 am

Re: POPAN N lower than total animals included in dataset

Postby j.harv3y » Tue Jun 02, 2020 12:38 pm

Hi Bill,
Thanks so much for the reply. So this is N.sex ("real parameter N") in the immediate model outputs, not N* from the derived parameters. Sorry I just double checked and had made a mistake- the SE is 1.51 so actually pretty big? Would this support sparseness of data? If data sparseness was the issue would there not be a problem with other parameters?
P value's were calculated relatively high at 0.91 (SE =0.03) for males and 0.55 (SE =0.18)for females if that helps?
Thanks again for your thoughts!
Jess
j.harv3y
 
Posts: 45
Joined: Mon Oct 08, 2018 4:45 am

Re: POPAN N lower than total animals included in dataset

Postby jlaake » Tue Jun 02, 2020 12:53 pm

You didn't describe where you got the N from but I'll guess that the problem is that the N you are getting is actually f0 - the number not caught. Initially Gary had everything labelled as N and then switched it to f0 for most abundance models except for POPAN. But in POPAN the parameter being estimated is f0 but is labelled N in beta which uses a log link. The super population size N is total number caught + f0. Below is an example with the dipper data. What you want is the derived estimates which are in model$results$derived and model$results$derived.vcv.

Code: Select all
> library(RMark)
> data(dipper)
> model=mark(dipper,model="POPAN")
  The file is mark005.inp on unit 4.
  INPUT --- proc title ;

     CPU Time in seconds for last procedure was 0.00
  INPUT --- proc chmatrix occasions= 7 groups= 1 etype= POPAN Nodes=
  INPUT --- 101 ICMeans NoHist hist= 32  ;
  INPUT ---    time interval 1 1 1 1 1 1 ;
  INPUT ---    glabel(1)=Group 1;
   Number of unique encounter histories read was 32.
   Number of individual covariates read was 0.
   Time interval lengths are all equal to 1.

     CPU Time in seconds for last procedure was 0.00
  INPUT --- proc estimate link=Parm-Specific NOLOOP varest=2ndPart    ;
   Model is { Phi(~1)p(~1)pent(~1)N(~1) }
  INPUT --- model={ Phi(~1)p(~1)pent(~1)N(~1) };
  INPUT ---    group=1 Phi    rows=1 cols=6 Square ;
  INPUT ---        1 1 1 1 1 1 ;
  INPUT ---    group=1 p    rows=1 cols=7 Square ;
  INPUT ---        2 2 2 2 2 2 2 ;
  INPUT ---    group=1 pent    rows=1 cols=6 Square ;
  INPUT ---        4 5 6 7 8 9 ;
  INPUT ---    group=1 N    rows=1 cols=1 Square ;
  INPUT ---        3 ;
  INPUT ---    design matrix constraints=9 covariates=4;
  INPUT ---        1 0 0 0;
  INPUT ---        0 1 0 0;
  INPUT ---        0 0 0 1;
  INPUT ---        0 0 1 0;
  INPUT ---        0 0 1 0;
  INPUT ---        0 0 1 0;
  INPUT ---        0 0 1 0;
  INPUT ---        0 0 1 0;
  INPUT ---        0 0 1 0;
  INPUT ---    links=9;
  INPUT ---    Logit;
  INPUT ---    Logit;
  INPUT ---    Log;
  INPUT ---    mlogit(1);
  INPUT ---    mlogit(1);
  INPUT ---    mlogit(1);
  INPUT ---    mlogit(1);
  INPUT ---    mlogit(1);
  INPUT ---    mlogit(1);
  INPUT ---       blabel(1)=Phi:(Intercept);
  INPUT ---       blabel(2)=p:(Intercept);
  INPUT ---       blabel(3)=pent:(Intercept);
  INPUT ---       blabel(4)=N:(Intercept);
  INPUT ---       rlabel(1)=Phi g1 a0 t1;
  INPUT ---       rlabel(2)=p g1 a0 t1;
  INPUT ---       rlabel(3)=N g1 a0 t1;
  INPUT ---       rlabel(4)=pent g1 a1 t2;
  INPUT ---       rlabel(5)=pent g1 a2 t3;
  INPUT ---       rlabel(6)=pent g1 a3 t4;
  INPUT ---       rlabel(7)=pent g1 a4 t5;
  INPUT ---       rlabel(8)=pent g1 a5 t6;
  INPUT ---       rlabel(9)=pent g1 a6 t7;
   Model is { Phi(~1)p(~1)pent(~1)N(~1) }
   Link Function Used is PARM-SPECIFIC
   Variance Estimation Procedure Used is 2ndPart

   M(t+1):
       294


   -2logL(saturated) = 1737.1857     
 Effective Sample Size = 519
{ Phi(~1)p(~1)pent(~1)N(~1) } Iteration 22    Time for numerical optimization was 0.01 seconds.
   -2logL { Phi(~1)p(~1)pent(~1)N(~1) } = 705.56563     
   Penalty { Phi(~1)p(~1)pent(~1)N(~1) } = -0.0000000   
   Gradient { Phi(~1)p(~1)pent(~1)N(~1) }:
   -0.7344951E-04-0.2763189E-04 0.2725769E-04-0.1222586E-04
{ Phi(~1)p(~1)pent(~1)N(~1) } VC Matrix: 100% done.    Time to compute number of parameters was 0.02 seconds.
   DEVIANCE { Phi(~1)p(~1)pent(~1)N(~1) } = -1031.6201                   
 DEVIANCE Degrees of Freedom { Phi(~1)p(~1)pent(~1)N(~1) } = 28           
   c-hat { Phi(~1)p(~1)pent(~1)N(~1) } = -36.843575                   
   AIC { Phi(~1)p(~1)pent(~1)N(~1) } = 713.56563     
   AICc { Phi(~1)p(~1)pent(~1)N(~1) } = 713.64346     
   BIC { Phi(~1)p(~1)pent(~1)N(~1) } = 730.57325     
   Pearson Chisquare { Phi(~1)p(~1)pent(~1)N(~1) } = 1436.7942     
 Possible Encounter Histories { Phi(~1)p(~1)pent(~1)N(~1) } = 127             
 Pearson Chisquare df { Phi(~1)p(~1)pent(~1)N(~1) } = 122       
   Pearson chat { Phi(~1)p(~1)pent(~1)N(~1) } = 11.777002     
 Sum(Observed/Expected) { Phi(~1)p(~1)pent(~1)N(~1) } = 13.273155                 
   Fletcher chat { Phi(~1)p(~1)pent(~1)N(~1) } = 112.68453     

     CPU Time in seconds for last procedure was 0.02
  INPUT --- proc stop;

     CPU Time in minutes for this job was 0.00

     Time Start = 09:48:19.547   Time End = 09:48:19.620

     Wall Clock Time in minutes for this job was 0.00
STOP NORMAL EXIT

Output summary for POPAN model
Name : Phi(~1)p(~1)pent(~1)N(~1)

Npar :  4
-2lnL:  705.5656
AICc :  713.6435

Beta
                  estimate        se       lcl       ucl
Phi:(Intercept)  0.2382584 0.1016105 0.0391018 0.4374149
p:(Intercept)    2.2914677 0.3323920 1.6399794 2.9429560
pent:(Intercept) 0.6683269 0.2233365 0.2305873 1.1060665
N:(Intercept)    2.7195523 0.4296118 1.8775132 3.5615913


Real Parameter Phi
         1         2         3         4         5         6
 0.5592844 0.5592844 0.5592844 0.5592844 0.5592844 0.5592844


Real Parameter p
         1         2         3         4         5         6         7
 0.9081679 0.9081679 0.9081679 0.9081679 0.9081679 0.9081679 0.9081679


Real Parameter pent
         2         3         4         5         6         7
 0.1535493 0.1535493 0.1535493 0.1535493 0.1535493 0.1535493


Real Parameter N
        1
 309.1735
> dim(dipper)[1] + exp(2.7195523)
[1] 309.1735
>
[1] 309.1735
> model$results$derived
$`B* Gross Births`
  estimate      se      lcl      ucl
1 62.59513 2.03682 58.72859 66.71623
2 62.59513 2.03682 58.72859 66.71623
3 62.59513 2.03682 58.72859 66.71623
4 62.59513 2.03682 58.72859 66.71623
5 62.59513 2.03682 58.72859 66.71623
6 62.59513 2.03682 58.72859 66.71623

$`B Net Births`
  estimate       se     lcl      ucl
1 47.47339 1.267816 45.0528 50.02402
2 47.47339 1.267816 45.0528 50.02402
3 47.47339 1.267816 45.0528 50.02402
4 47.47339 1.267816 45.0528 50.02402
5 47.47339 1.267816 45.0528 50.02402
6 47.47339 1.267816 45.0528 50.02402

$`N Population Size`
   estimate       se      lcl       ucl
1  24.33322 5.060683 16.25695  36.42169
2  61.08257 2.529976 56.32172  66.24586
3  81.63592 2.694210 76.52378  87.08957
4  93.13108 3.849889 85.88587 100.98749
5  99.56015 4.849043 90.50068 109.52650
6 103.15582 5.568719 92.80587 114.66003
7 105.16683 6.056556 93.95021 117.72258

$`Gross N* Population Size`
  estimate       se    lcl      ucl
1  399.904 10.85389 379.19 421.7495

jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: POPAN N lower than total animals included in dataset

Postby j.harv3y » Tue Jun 02, 2020 1:05 pm

Ah this is very helpful, thanks so much! I was under the impression that the N was the superpopulation size :)

I still seem to be having the same problem in the derived outputs;

Code: Select all
$`N Population Size`
       estimate           se          lcl          ucl
1  3.000000e+00 0.0000000000 3.000000e+00  3.000000000
2  2.444217e+00 0.1836385986 2.109969e+00  2.831413302
3  2.311075e+00 0.2058761092 1.941491e+00  2.751013771
4  1.544443e+00 0.2179497048 1.172846e+00  2.033774647
5  1.394994e+00 0.2178685528 1.029036e+00  1.891099378
6  1.019225e+00 0.2064839124 6.879451e-01  1.510032397
7  2.826043e+01 3.1592794717 2.271513e+01 35.159482066
8  2.877279e+01 2.6496214762 2.403038e+01 34.451108879
9  3.295339e+01 2.4387065776 2.850966e+01 38.089757235
10 2.776997e+01 2.7985309381 2.280396e+01 33.817430618
11 3.083071e+01 3.0021800682 2.548535e+01 37.297213033
12 2.827376e+01 3.6890420053 2.191738e+01 36.473585216
13 1.000000e+00 0.0000000000 1.000000e+00  1.000000000
14 1.615316e-01 0.0789939579 6.517649e-02  0.400335514
15 6.977277e-02 0.0478237560 2.067434e-02  0.235472515
16 5.658199e-03 0.0062376663 9.851937e-04  0.032496366
17 1.642157e-03 0.0022776140 2.156457e-04  0.012505136
18 1.743975e-04 0.0003177812 1.628700e-05  0.001867409
19 1.586551e+01 1.7736305989 1.275235e+01 19.738656795
20 5.789680e+00 1.2546900260 3.804507e+00  8.810704766
21 5.727722e+00 1.2738208273 3.723530e+00  8.810669100
22 3.691386e+00 0.5037530088 2.828536e+00  4.817450303
23 4.298235e+00 0.7348153978 3.081875e+00  5.994670965
24 3.683372e+00 0.5071915493 2.815691e+00  4.818437622

$`Gross N* Population Size`
  estimate        se      lcl      ucl
1  3.00000 0.0000000  3.00000  3.00000
2 60.25390 0.7421778 58.81670 61.72621
3  1.00000 0.0000000  1.00000  1.00000
4 50.19226 4.9926516 41.32131 60.96766


I'm assuming the 1-4 (gross N*) are the group covariates I added (sex and injury presence). If I add two of the groups together this gives me ~ 110 males and 4 females- is this the correct interpretation, sounds more reasonable? This way the population number exceeds the number of individuals in my input data for males. There are only 4 females in the original dataset.
j.harv3y
 
Posts: 45
Joined: Mon Oct 08, 2018 4:45 am

Re: POPAN N lower than total animals included in dataset

Postby jlaake » Tue Jun 02, 2020 3:35 pm

Yes those are estimates by group. But Gross N* is the super-population size plus an added term of individuals that could have entered and left the population in between occasions. See original article by Schwarz for the definition. You can get the super population sizes from the MARK output by typing model name or from get.real function.
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: POPAN N lower than total animals included in dataset

Postby j.harv3y » Wed Jun 03, 2020 3:16 pm

Thanks so much for the advice. The get.real function seems to present the same problem as the original numbers- still smaller than the number of individuals.

Code: Select all
> get.real(short.popan$Phi.ti.p.is.pent.sex.N.sex,"N")
                       2014
Group:sexF.injurymajor    3
Group:sexM.injurymajor   57
Group:sexF.injurynone     3
Group:sexM.injurynone    57


Am I missing something, or does this still suggest somethings wrong,as Bill suggested about the previous superpopulation N?

Thanks!
j.harv3y
 
Posts: 45
Joined: Mon Oct 08, 2018 4:45 am

Re: POPAN N lower than total animals included in dataset

Postby jlaake » Fri Jun 05, 2020 9:54 am

I don't now what version of RMark you are using but this is the NEWS from the most recent version on CRAN where I addressed this issue. If you still can't work this out what you have contact me offlist.

RMark 2.2.7 (4 November 2019)

...

* get.real was inconsistent about what it returned. If vcv=FALSE it returns N because it takes the results from the MARK output. If vcv=TRUE, it returns f0 because the estimated parameter is actually f0 even though MARK refers to it as N and RMark works with the actual parameters and not derived values. I have chosen to let this problem remain because it would take some specific coding just for this parameter in get.real. To convert to N, simply add the number of animals caught to f0 when vcv=TRUE. I have added a warning message. Thanks to Carl Schwarz for bringing the problem to my attention.
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: POPAN N lower than total animals included in dataset

Postby jlaake » Mon Jun 08, 2020 2:24 pm

I've been working with Jessica offlist and want to point out a few things. Notice that she posted

Code: Select all
Group:sexF.injurymajor    3
Group:sexM.injurymajor   57
Group:sexF.injurynone     3
Group:sexM.injurynone    57


where it shows N was the same across injury groups within a sex. If you use groups you should use N=list(formula=~group) whereas she used ~sex. What you are modelling is f0, the number not caught and while you might think N might be the same across categories, that does not mean that f0 will be the same because it will vary with p.

Also, in the MARK output it provides M_t+1 which is the total number caught in each group. The M_t+1 values where lower or the same as the N values RMark and MARK were giving so it was correct. The beta for N is on the log link scale. If you were to use N=list(formula=~-1+group) then each beta per group are separate (no intercept) so exp(beta) is the estimate of f0 for that group and if you add it to the M_t+1 for that group it will equal N for that group. That N is often referred to as the super-population size which is the number of animals that were in the population at some time and available to be captured. Typically it is not the total number at any one occasion (unless they all entered prior to the study). The Gross N* is subtly different and includes an estimate of animals that entered and left between occasions and thus where never available to be caught.

Note that additional values are provided in the RMark function popan.derived but I just discovered that the code is not correct if you have losses on capture (freq=-1). I'll correct that in a future version. If you have losses on capture, the results you get from model$results$derived are correct.

--jeff
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA


Return to RMark

Who is online

Users browsing this forum: No registered users and 9 guests