Trouble comparing N for two groups using Huggins models


Postby gpugesek » Sun Jan 06, 2019 1:25 pm

Hello,

I estimated population sizes for two groups of ground-nesting organisms (I "captured" individuals every time I found a nest). I fit closed-population models to my data set because nests were sessile, few individuals died, and no new nests entered the population while I was surveying plots. I wanted to account for "trap happiness" (i.e., remembering where nests were located and thus recapturing them), so I used Huggins closed-population models.

Now that I have estimated N for each group, I am not sure how to tell whether the two population sizes differ from each other. Because N is a derived parameter, all I can do is determine whether including group as a covariate for p improves model fit, using AIC.

I suppose I could compare the confidence intervals of the two population estimates, but my confidence intervals overlap somewhat, and I know that overlapping intervals do not necessarily mean the estimates are not different.

Is there a strategy you would suggest???

Jenny
gpugesek
 
Posts: 9
Joined: Thu Oct 29, 2015 10:32 am

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Sun Jan 06, 2019 6:32 pm

gpugesek wrote:Is there a strategy you would suggest???


First, congratulations on really doing a bit of background homework.

Second, yes, there is a robust, fairly easy way forward, that not only will answer your question, but will let you say 'MCMC' in your paper/report, letting you run with the cool kids.

1\ Take your basic model with the 2 groups. Run it using MCMC (Appendix E in the book). The MCMC.BIN file that gets created will contain the posterior samples for the two derived parameters (i.e., estimated abundance), one for each group.

2\ Suck MCMC.BIN into R. There is a script in the MARK helpfile (largely thanks to Jeff Laake) that will pull in the binary file and massage it into an MCMC object called 'mcmcdata'.

3\ Post-process mcmcdata, creating a variable you might call 'diff', which is the difference between derived1 and derived2 (i.e., the two abundance estimates).

4\ Then look at the HPD interval for this 'diff' variable; if it doesn't bound 0, you can safely conclude there is a real difference in estimated abundance between the two groups.

The basic idea (albeit in a different context) is described at the end of the Delta method appendix (Appendix B), and is really quite straightforward.
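(As an aside - if you want a quick back-of-the-envelope check before firing up MCMC, the same Appendix B logic gives you a Wald-type test directly: the two groups are independent, so the variance of the difference is just the sum of the variances of the two abundance estimates. A minimal R sketch, with made-up estimates and SEs purely for illustration:

```r
# Wald-type test for a difference in abundance between two independent groups.
# N1.hat/N2.hat and their SEs are made-up numbers, NOT from any real MARK output.
N1.hat <- 102.3; se1 <- 4.1
N2.hat <- 126.9; se2 <- 5.0

diff.hat <- N1.hat - N2.hat
se.diff  <- sqrt(se1^2 + se2^2)     # independent groups: variances add

ci <- diff.hat + c(-1.96, 1.96) * se.diff   # approximate 95% CI for the difference
z  <- diff.hat / se.diff                    # z-statistic
p.value <- 2 * pnorm(-abs(z))               # two-sided p-value

print(ci); print(p.value)
```

The caveat is that the sampling distribution of \hat{N} can be noticeably skewed, so the MCMC/HPD route described above is better behaved; this is just a quick screen.)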

Here is a worked example (which I should probably add to the book).

Here are some simulated closed-abundance data for 2 groups - true abundances are 100 and 125, respectively. The underlying encounter probabilities were high (0.75 and 0.65, respectively), so there is good precision.

Code: Select all
11011 6 8;
11110 9 6;
10101 3 7;
11101 11 8;
11111 19 8;
00011 2 1;
10111 5 5;
00111 3 10;
01001 1 1;
00101 2 5;
01111 12 10;
11001 4 7;
11000 2 3;
01011 1 4;
00110 2 2;
01101 3 4;
10100 2 2;
10110 1 4;
01110 3 3;
10011 2 6;
11010 1 3;
01100 1 6;
10000 1 3;
11100 2 2;
01010 1 0;
00001 1 0;
00010 0 1;
10001 0 1;
01000 0 2;
00100 0 1;
10010 0 1;


Fit Huggins model {p=c,grp} (i.e., no time var in p or c within group, p=c within group, but separate estimates of encounter parameters between groups). Then, re-run this model using MCMC (if you've never used MCMC in MARK before, Appendix E is your starting point. But, for this example, you really don't need to use anything more than the defaults). Once run, MCMC.BIN will be created in the directory where the .inp file was located.

Then, you simply post-process MCMC.BIN using the R script from the helpfile as a starting point. For this example, the macro vars you need to put at the top of the R script would be:

Code: Select all
      ncovs <- 2; # Number of beta estimates
      nmeans <- 0; # Number of mean estimates
      ndesigns <- 0; # Number of design matrix estimates
      nsigmas <- 0; # Number of sigma estimates
      nrhos <- 0; # Number of rho estimates
      nlogit <- 2; # Number of real estimates
      nderived <- 2; # Number of derived estimates
      filename <- "C:\\USERS\\EGC\\DESKTOP\\MCMC.BIN"; # Name and path to the MCMC.BIN file


Once the script is done, you have an MCMC object called mcmcdata. From there, the following will suffice:

Code: Select all
# now set things up for derived parameter calculation - difference of derived1 and derived2
library(coda);  # for as.mcmc() and HPDinterval() - skip if the helpfile script already loads it

chaindata <- as.data.frame(mcmcdata);
chaindata$diff <- chaindata$derived1 - chaindata$derived2;

cat("\n")
cat(" basic summary stats of diff in derived abundance estimates \n")
print(summary(chaindata$diff));
cat("variance of difference...\n")
print(var(chaindata$diff));

hold <- as.mcmc(chaindata$diff)
hpd <- HPDinterval(hold, prob=0.95)
print(hpd)

hist(chaindata$diff, xlab = expression(derived[1] - derived[2]),
   main = "Posterior density of difference in abundance estimates")
abline(v = hpd[1], col = "blue")
abline(v = hpd[2], col = "blue")


If you run this additional bit of code, the summary stats for the difference between the 2 derived estimates are:

basic summary stats of diff in derived abundance estimates
Min. 1st Qu. Median Mean 3rd Qu. Max.
24.13 24.64 24.80 24.84 24.99 26.15
variance of difference...
[1] 0.07644926


with an HPD (highest posterior density) of

lower upper
var1 24.3471 25.40487


So, for this example, it's very obvious that the HPD doesn't include 0, so you can safely conclude there is a difference. And, not surprisingly (given these were 'good' simulated data), the mean of the posterior (24.84) is very close to the true difference (25).

So there you go...the MCMC approach in MARK has a lot of value added beyond estimating mu and sigma for hyperparameters. It's fairly easy to use it to generate means/variances of all sorts of functions of derived parameters. In fact, the only difference between what MARK does and what (say) JAGS or OpenBUGS does is that the latter calculate the derived values at each step of the chain, whereas with MARK you post-process the chain. The former (i.e., what JAGS does, for instance) is computationally more convenient, but the latter approach (that MARK uses) yields identical results.
cooch
 
Posts: 1416
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Mon Jan 07, 2019 10:06 pm

I should add, though, that if you have >2 groups, there is no simple way to 'do elegant hypothesis testing' among the derived parameters. I can imagine using a series of paired contrasts, and summing 'something' in a fashion that generates some overall F statistic, but beyond that...I have an idea involving the var-covar matrix that can be output for some derived parameters (abundances being one of them), but it's just an idea.

For the moment, you 'luck out' in having only 2 groups, which makes the hypothesis of 'different or not' quite straightforward (as described).

Re: Trouble comparing N for two groups using Huggins models

Postby gpugesek » Wed Jan 23, 2019 4:05 pm

Thank you so much Cooch (your code worked like a charm, and the population sizes were different from each other).

Since we are on the topic of >2 groups, may I ask you another question?

I previously mentioned that I estimated population sizes for two groups of ground-nesting organisms using Huggins closed capture models. However, I collected capture history data for three groups. For the third group, I never encountered a single nest (N=0). However, I would like to estimate confidence intervals for this group, assuming my detection probability does not differ across groups. I was wondering if you had any suggestions?

My PI put together a bit of code to estimate the confidence intervals post hoc using likelihood profiling (R code below), but we were wondering if there is a more elegant solution, or if anyone has run into this issue before?

Thanks!
Jenny

Code: Select all
# confidence intervals for nests in group 3, assuming the same detection as groups 1 and 2

# MLE detection from model = 0.412, SE = 0.084, 95% CI = 0.261, 0.581
# logit-scale detection parameters: -0.173, SE = 0.310
detect = seq(-2, 2, 0.001)  # grid of logit-scale detection values
probs = dnorm(detect, mean = -0.173, sd = 0.310)/sum(dnorm(detect, mean = -0.173, sd = 0.310))

# likelihood profiling: for each detection value, find the upper limit for N
CIs = array()
for(i in 1:length(detect)){
  detect.i = plogis(detect[i]) # detection probability per nest, per occasion
  nums = seq(0, 20, 1) # hypothesized # of nests
  LLs = log(dbinom(0, nums, detect.i)^4) # log-likelihood of detecting 0 of `nums` nests on each of 4 occasions
  # LR test: twice the difference in LLs must exceed 3.84 (chi-square, 1 df),
  # so the profile limit is where the LL drops below -3.84/2 = -1.92.
  # Linear interpolation handles the case where that point falls between two integers.
  pos2 = min(which(LLs < -1.92))
  pos1 = pos2 - 1
  dLL = LLs[pos2] - LLs[pos1]
  dcrit = -1.92 - LLs[pos1]
  CIs[i] = nums[pos1] + (dcrit/dLL)
}
sum(probs*CIs, na.rm = T) # upper confidence limit for group 3; 0.814 in our run

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Wed Jan 23, 2019 7:41 pm

gpugesek wrote:Thank you so much Cooch (your code worked like a charm, and the population sizes were different from each other).


Glad it worked.

Since we are on the topic of >2 groups, may I ask you another question?


You just did, but I'm guessing you now want yet another. ;-)

I previously mentioned that I estimated population sizes for two groups of ground-nesting organisms using Huggins closed capture models. However, I collected capture history data for three groups. For the third group, I never encountered a single nest (N=0). However, I would like to estimate confidence intervals for this group, assuming my detection probability does not differ across groups. I was wondering if you had any suggestions?


Stop now. The canonical estimate of abundance is \hat{N}=C/\hat{p}. You can tell stories to your heart's content about the denominator on the RHS, but you have no information in the numerator (C, the count), so you're out of luck. Pretending you know p would allow you to estimate the probability of missing all N nests, and let you construct a CI of a sort, if you knew what N was. Which you don't. You're basically asking 'if I don't see anything at all, how many were really there?'. Can't be done. If you go all Bayesian and do a data augmentation thing, I still don't think it works (although Bayesian inference can occasionally let you get something from nothing, which IMO is not necessarily a good thing).
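(For what it's worth, the 'probability of missing all N nests' calculation is trivial if you pretend p is known: with per-occasion detection p over k occasions, a given nest goes unseen with probability (1-p)^k, and all N of them go unseen with probability that value raised to the N. The p and k below are assumed values, purely for illustration:

```r
# probability of a count of 0, pretending detection is known
p <- 0.4   # per-occasion detection probability (assumed, for illustration)
k <- 5     # number of survey occasions (assumed)

p.star <- 1 - (1 - p)^k    # probability a given nest is seen at least once
for (N in c(1, 5, 10)) {
  cat("N =", N, "  P(see nothing at all) =", (1 - p.star)^N, "\n")
}
```

Note that this runs in the wrong direction for the original question - it gives you P(C=0 | N), not N given C=0, which is precisely the point.)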

Re: Trouble comparing N for two groups using Huggins models

Postby murray.efford » Thu Jan 24, 2019 8:21 pm

...you have no information in the numerator (C, the count), so, you're out of luck

But, but, but, C = 0. No argument that the best estimate is N-hat = 0. The OP is asking for a CI for N-hat, which still seems a reasonable request.
murray.efford
 
Posts: 564
Joined: Mon Sep 29, 2008 7:11 pm
Location: Dunedin, New Zealand

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Thu Jan 24, 2019 10:17 pm

murray.efford wrote:
...you have no information in the numerator (C, the count), so, you're out of luck

But, but, but, C = 0. No argument that the best estimate is N-hat = 0. The OP is asking for a CI for N-hat, which still seems a reasonable request.


Said CI being interpretable...how? The lower bound for an estimate of 0 would be...0. The upper bound would be...something bigger than 0. While it is technically feasible to construct a CI in this case, my argument is more philosophical, in that I see little value in constructing an artificial CI when there is no information. Just because you *can*, doesn't mean you *should*. At least, IMO.

To resurrect my favourite John Tukey quote: 'The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.'

[Point count people, I'm also looking at you...] ;-)

Re: Trouble comparing N for two groups using Huggins models

Postby darryl » Thu Jan 24, 2019 10:36 pm

My 2 cents worth....
Haven't thought about this in an abundance estimation situation, but I have been recently for some simple binomial sampling situations where you get 0 observed successes and can still construct a binomial CI. Yes, the estimate and lower bound are 0, but the upper bound can be something else. With small sample sizes it can be quite a lot bigger than 0, which can be useful, and illustrates that your sample size may have been too small to give reliable results.

I guess the same could apply here, you may have caught 0 animals, but if p was low, maybe you just didn't deploy enough effort. A 0 isn't no information, if you're willing to use an estimate of p from elsewhere (which is of course an untestable assumption).
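To make that concrete with a crude sketch: if you are willing to borrow a p from the other groups (the untestable assumption noted above), the largest N still consistent with a count of 0 at the 5% level is the largest N with (1 - p*)^N >= 0.05, where p* is the probability a nest is seen at least once across all occasions. The p and k values here are assumed, purely for illustration:

```r
# crude upper bound on N given C = 0, borrowing p from elsewhere
p <- 0.4   # per-occasion detection probability (assumed, borrowed from other groups)
k <- 5     # number of survey occasions (assumed)

p.star  <- 1 - (1 - p)^k                      # P(a nest is seen at least once)
N.upper <- floor(log(0.05) / log(1 - p.star)) # largest N with P(C = 0 | N) >= 0.05
cat("largest N consistent with seeing nothing:", N.upper, "\n")
```

With cumulative detection that high the bound is tight (a single nest at most), but with low p* it balloons quickly - which is exactly the 'maybe you just didn't deploy enough effort' point.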

Cheers
Darryl
darryl
 
Posts: 457
Joined: Thu Jun 12, 2003 3:04 pm
Location: Dunedin, New Zealand

Re: Trouble comparing N for two groups using Huggins models

Postby cooch » Fri Jan 25, 2019 8:36 am

darryl wrote:My 2 cents worth....
Haven't thought about this in an abundance estimation situation, but have been recently for some simple binomial sampling situations where you get 0 observed successes, and can construct a binomial CI still. Yes, estimate and lower bound are 0, but upper bound can be something else.


With said upper bound being determined (largely) by encounter probability, p, which...

A 0 isn't no information, if you're willing to use an estimate of p from elsewhere (which is of course an untestable assumption).


...exactly. It's not testable. I suppose I'm simply pushing back at an increasing tendency to build stories from model-driven approaches that often rely on largely or completely untestable assumptions.

[Point count people, I'm now definitely looking at you...] :?

Re: Trouble comparing N for two groups using Huggins models

Postby darryl » Fri Jan 25, 2019 6:54 pm

What!! Models aren't real?!?! :lol: :lol:

Fair enough, some push-back is needed

