Trouble comparing N for two groups using Huggins models

questions concerning analysis/theory using program MARK

Trouble comparing N for two groups using Huggins models

Postby gpugesek » Sun Jan 06, 2019 1:25 pm

Hello,

I estimated population sizes for two groups of ground nesting organisms (I "captured" individuals every time I found a nest). I fit closed population models to my data set because nests were sessile, few individuals died, and no new nests entered the population while I was surveying plots. I wanted to account for "trap happiness" (i.e. remembering where nests were located and thus recapturing them), so I used huggins closed population models.

Now that I have estimated N for each group, I am not sure how to tell if these two population sizes differ from each other. Because N is a derived parameter, I can only determine if including group label as a covariate for p improves the fit of the model using AIC.

I suppose I could compare the confidence intervals of the population estimate, but my confidence intervals overlap a bit, and I know that this does not mean the population estimates do not differ.

Is there a strategy you would suggest???

Jenny
gpugesek
 
Posts: 8
Joined: Thu Oct 29, 2015 10:32 am

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Sun Jan 06, 2019 6:32 pm

gpugesek wrote:Is there a strategy you would suggest???


First, congratulations on really doing a bit of background homework.

Second, yes, there is a robust, fairly easy way forward, that not only will answer your question, but will let you say 'MCMC' in your paper/report, letting you run with the cool kids.

1\ Take your basic model with the 2 groups. Run it using MCMC (Appendix E in the book). The MCMC.BIN file that will get created will have the posterior estimates for the two derived parameters (being, estimated abundance), for each group.

2\ such MCMC.BIN into R. There is a script in the MARK helpfile (largely thanks to Jeff Laake) that will pull in the binary file, and massage it to the point where it creates an MNCMC object called 'mcmcdata'.

3\ post-process mcmcdata, creating a variable you might call 'diff', which is the differenced between derived1 and derived2 (i.e., the two abundance estimates).

4\ then, look at the HPD for this 'diff' variable, and if it doesn't bound 0, you might safely conclude there is a real difference in estimated abundance between the two groups.

The basic idea (albeit in a different context) is described at the end of the Delta method appendix (Appendix B), and is really quite straightforward.

Here is a worked example (which I should probably add to the book).

Here are some simulated closed abundance data, 2 groups - true abundance is 100, and 125, respectively. The underlying encounter probabilities were high, so there is good precision (0.75 and 0.65, respecitviely).

Code: Select all
11011 6 8;
11110 9 6;
10101 3 7;
11101 11 8;
11111 19 8;
00011 2 1;
10111 5 5;
00111 3 10;
01001 1 1;
00101 2 5;
01111 12 10;
11001 4 7;
11000 2 3;
01011 1 4;
00110 2 2;
01101 3 4;
10100 2 2;
10110 1 4;
01110 3 3;
10011 2 6;
11010 1 3;
01100 1 6;
10000 1 3;
11100 2 2;
01010 1 0;
00001 1 0;
00010 0 1;
10001 0 1;
01000 0 2;
00100 0 1;
10010 0 1;


Fit Huggins model {p=c,grp} (i.e., no time var in p or c within group, p=c within group, but separate estimates of encounter parameters between groups). Then, re-run this model using MCMC (if you've never used MCMC in MARK before, Appendix E is your starting point. But, for this example, you really don't need to use anything more than the defaults). Once run, MCMC.BIN will be created in the directory where the .inp file was located.

Then, you simply post-process MCMC.BIN using the R script from the helpfile as a starting point. For this example, the macro vars you need to put at the top of the R script would be:

Code: Select all
      ncovs <- 2; # Number of beta estimates
      nmeans <- 0; # Number of mean estimates
      ndesigns <- 0; # Number of design matrix estimates
      nsigmas <- 0; # Number of sigma estimates
      nrhos <- 0; # Number of rho estimates
      nlogit <- 2; # Number of real estimates
      nderived <- 2; # Number of derived estimates
      filename <- "C:\\USERS\\EGC\\DESKTOP\\MCMC.BIN"; # Name and path to the MCMC.BIN file


Once the script is done, you have an MCMC object called mcmcdata. From there, the following will suffice:

Code: Select all
# now set things up for derived parameter calculation - difference of derived1 and derived2
chaindata <- as.data.frame(mcmcdata);
chaindata$diff <- chaindata$derived1-chaindata$derived2;

cat("\n")
cat(" basic summary stats of diff in derived abundance estimates \n")
print(summary(chaindata$diff));
cat("variance of difference...\n")
print(var(chaindata$diff));

hold <- as.mcmc(chaindata$diff)
hpd <- HPDinterval(hold,prob=0.95)
print(hpd)

hist(chaindata$diff,xlab = expression(paste(diff((derived[1] - derived[2])))),
   main="Posterior density of difference in abundance estimates")
abline(v=hpd[1],col="blue")
abline(v=hpd[2],col="blue")


If you run this additional bit of code, the summary stats for the difference between the 2 derived estimates are:

basic summary stats of diff in derived abundance estimates
Min. 1st Qu. Median Mean 3rd Qu. Max.
24.13 24.64 24.80 24.84 24.99 26.15
variance of difference...
[1] 0.07644926


with an HPD (highest posterior density) of

lower upper
var1 24.3471 25.40487


So, for this example, its very obvious that the HPD doesn't include 0, so, you can safely conclude there is a difference (and, not surprisingly, given these were 'good simulated data'), the mean of the posterior (24.84) is very close to the true difference (25).

So there you go...the MCMC approach in MARK has a lot of value added beyond estimating mu and sigma for hyperparameters. Its fairly easy to use it to generate mean/variances of all sorts of functions of derived parameters. In fact, the only difference between what MARK does, and what (say) JAGS or OpenBUGS does it that the latter calculate the derived values at each step of the chain, whereas with MARK, you post-process the chain. The former (i.e., what JAGS does, for instance) is computationally more convenient, but the latter approach (that MARK uses) yields identical results.
cooch
 
Posts: 1360
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Mon Jan 07, 2019 10:06 pm

I should add, though, that if you have >2 groups, there is no simple way to 'do elegant hypothesis testing' among the derived parameters. I can imagine using a series of paired contrasts, and summing 'something' in a fashion to generate some overall F statistic, but beyond that...I have an idea involving the var-covar matrix which can be output for some derived parameters (abundances being one of them), but its just an idea.

For the moment, you 'luck out' in having only 2 groups, which makes the hypothesis of 'different or not' quite straightforward (as described).
cooch
 
Posts: 1360
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University


Return to analysis help

Who is online

Users browsing this forum: No registered users and 1 guest

cron