www.phidot.org

by **jCeradini** » Fri Mar 20, 2015 3:04 pm

Hi folks,

I thought it would be most helpful to ask this as a new question based on the end of the discussion from this recent thread: viewtopic.php?f=21&t=2407

Cooch said:
You need to be somewhat careful in constructing a valid CI for abundance, especially f you're model averaging (which you almost inevitably will be). In particular, evidence suggests that f0 follows a log-normal distribution, and this approaches to constructing the CI based on any normal asymptotics are probably wrong.

You are referred to section 14.1`0.1 in chapter 14 of the MARK book. While MARK (and thus, presumably, RMark) correctly estimates the unconditional variance, and SE, the reported UCI and LCI (which are based on normal asymptotic) are incorrect, and the correct CI for model averaged estimate of N require you do them by hand. I'm guessing that RMark doesn't do this correction, but Jeff will need to advise.

Note that this issue applies regardless of whether N is included in the likelihood, or conditioned out (i.e., Huggins models).

In section C.8 (C-36) of the MARK book, Laake discusses estimating CI for model averaged estimates using the Delta Method. There's no mention of extracting values to estimate CI by hand, such as is discussed in section 14.10.1.

So, are the CI for model averaged estimates reported in RMark valid, or is calculation by hand still necessary?

Also, Evan and Jeff discussed this in a thread from 2008, but I thought it was worth asking again in case there have been any changes. (viewtopic.php?f=21&t=770&p=1909&hilit=confidence+interval+model+average#p1909)

Thanks!
Joe

by **cooch** » Fri Mar 20, 2015 4:00 pm

In section C.8 (C-36) of the MARK book, Laake discusses estimating CI for model averaged estimates using the Delta Method. There's no mention of extracting values to estimate CI by hand, such as is discussed in section 14.10.1.

So, are the CI for model averaged estimates reported in RMark valid, or is calculation by hand still necessary?

Rather than ask, why not simply try it for yourself? Run the analysis in MARK, generate model averaged estimates of N, and then generate the correct CI by hand.

Then, see what RMark returns for the CI. If they're the same, then RMark is doing the 'correction' internally. If not, then see if what is being reported by RMark is the same as the incorrect CI reported by MARK. If it is, then RMark might need a tweak. If it is different than either the manually derived CI, or the erroneous one MARK reports, then Jeff is doing something altogether different. You can use the Delta method for generating the CI on the back-transformed (real) scale, and this is what MARK does for most things (and presumably, so does RMark). Covered in some detail in Appendix B. But, abundance is a different beast, and requires a different approach.

by **jCeradini** » Sat Mar 21, 2015 1:57 pm

Rather than ask, why not simply try it for yourself? Run the analysis in MARK, generate model averaged estimates of N, and then generate the correct CI by hand.

Good point...shoulda spent more time on this one before posting. I recreated Jeff's example from this 2008 post: viewtopic.php?f=21&t=770&p=1909&hilit=confidence+interval+model+average#p1909. It appears the CI for abundance is still estimated just as Jeff describes in C.8 of the MARK book.
I then ran the models and model averaged abundance in MARK as well ("full" closed model type) (was my 1st time model averaging with the MARK interface, but it seemed pretty straightforward...)

Back in 2008, Jeff said:
# confidence interval in RMark
# computes se of log(f0), normal ci on log(f0), back-transformation of endpoints
> exp(log(19.07818)-1.96*(6.613443/19.07818))
[1] 9.67081
> exp(log(19.07818)+1.96*(6.613443/19.07818))
[1] 37.63665

# log-normal confidence interval <-- section 14.10.1
> exp(1.96*sqrt(log(1+(6.61344/19.078)^2))) # computation of C
[1] 1.935291
# lower limit - divide estimate by C
> 19.078/1.935
[1] 9.859432
# upper limit - multiply estimate by C
> 19.078*1.935
[1] 36.91593

Code for creating the model set (changed "N" to "f0" in model average call).
Ran in RMark 2.1.12.

Code: Select all: data(edwards.eberhardt) run.edwards.eberhardt <- function() { # Define parameter models pdotshared = list(formula= ~ 1, share=TRUE) ptimeshared = list(formula= ~ time, share=TRUE) # Closed capture models # constant p=c ee.closed.m0 = mark(edwards.eberhardt,model="Closed",model.parameters=list(p=pdotshared)) # constant p and constant c but different ee.closed.m0c = mark(edwards.eberhardt,model="Closed") # time varying p=c ee.closed.mt = mark(edwards.eberhardt,model="Closed",model.parameters=list(p=ptimeshared)) # Return model table and list of models return(collect.models() ) } # Run function and save output ee.results = run.edwards.eberhardt() # model average mAve <- model.average(ee.results, "f0", vcv = TRUE) # CI matches the above hand calculations by Jeff mAve$estimates # estimate se lcl ucl # 19.0781 6.613454 9.670848 37.63619

RMark model average
Mt+1 = 76
76 + 9.670848 = 85.67085
76 + 37.63619 = 113.6362

MARK model average
95% CI for Weighted Average Estimate is 82.1157185 to 108.0404484

Log-normal model average
76 + 9.859432 = 85.85943
76 + 36.91593 = 112.9159

Obviously not a lot of difference in these estimates but I believe that depends on the underlying data:

jlaake wrote:
On another point in regard to my previous posting, the reason for the difference in the CI for N between RMark and the log-normal is in the approximation of the variance with the delta method. The delta method uses cv^2 for the variance and the log-normal variance is log(1+cv^2). The delta method variance is slightly larger but as long as the cv<0.5 the difference doesn't exceed 5% for the se

Cooch wrote:
Which, in effect, is due to the non-linearity of the log transform. The standard application of the 'Delta method' involves a simple first order Taylor expansion, and thus is a bit 'twitchy' when the transform is non-linear, and if the variation in the data is 'too big'. This is discussed in a bit of detail in Appendix B (see section B.3.1). In such cases, a second-order Taylor expansion can help, but it doesn't always solve the problem.

- Joe

by **cooch** » Sun Mar 22, 2015 11:07 am

I suppose I'm not surprised that your results using RMark, and those from MARK (doing it 'by hand') are very close. To some degree, this is a function of the data.

It might be worth trying the same comparison using a somewhat more 'pathological' data set (bbsample.inp, as discussed on pp. 36-37 of chapter 14). Here is the .inp file (to save you having to find it).

Code: Select all: 00000100 3; 00010100 1; 11000000 1; 00000010 4; 00000011 1; 10000010 1; 01000100 1; 00110000 1; 01001000 2; 00100001 1; 10100100 1; 00010000 3; 00100010 1; 00000001 2; 00100000 3; 10001100 1; 00101000 1; 01001110 1; 01000010 2; 01100100 1; 01000000 3; 10000000 1; 00001010 1; 01110000 1; 00110110 1; 10100000 1; 11000100 1; 01000011 1; 00010001 1;

M(t+1)=43. Fit models {p(t)=c(t),N},{p(.),c(.),N} and {p(.)=c(.),N} to the data. (MARK results are on noted pages in Chapter 14). The uncorrected 95% CI for model-averaged N is [41.78,63.91]. Clearly, problematic, since this implies the lower CI is <M(t+1), which clearly makes no sense. The corrected CI (as described) is [46.46,71.00]. which is at least logically consistent with M(t+1).

Be worth comparing with what RMark returns.

by **jCeradini** » Sun Mar 22, 2015 4:53 pm

Cooch wrote:
It might be worth trying the same comparison using a somewhat more 'pathological' data set (bbsample.inp, as discussed on pp. 36-37 of chapter 14)....
M(t+1)=43. Fit models {p(t)=c(t),N},{p(.),c(.),N} and {p(.)=c(.),N} to the data. (MARK results are on noted pages in Chapter 14). The uncorrected 95% CI for model-averaged N is [41.78,63.91]. Clearly, problematic, since this implies the lower CI is <M(t+1), which clearly makes no sense. The corrected CI (as described) is [46.46,71.00]. which is at least logically consistent with M(t+1).

M(t+1) = 43

RMark model ave results
f0: estimate - se - lcl - ucl
9.839294 - 5.645021 - 3.196029 - 30.29125

43 + 3.196029 = 46.19603
43 + 30.29125 = 73.29125

Corrected CI model ave results ("log-normal")
[46.46, 71.00]

Uncorrected CI model ave results (MARK)
[41.78, 63.91]

RMark model average results are fairly similar to the corrected CI and, most importantly, CI do not go below M(t+1). Awesome.

- Joe

Code for RMark models

Code: Select all: bb <- import.chdata("bbsample.txt", header=TRUE, field.types = "n") str(bb) unique(nchar(bb$ch)) # Make function #--------------- run.bb <- function() { # Define parameter models #------------------------ pdotshared = list(formula= ~ 1, share=TRUE) ptimeshared = list(formula= ~ time, share=TRUE) # Closed capture models #-------------------- # constant p = c bb.closed.m0 = mark(bb, model="Closed", model.parameters=list(p=pdotshared)) # constant p and constant c but different bb.closed.m0c = mark(bb, model="Closed") # time varying p = c bb.closed.mt = mark(bb, model="Closed", model.parameters=list(p=ptimeshared)) # Return model table and list of models return(collect.models() ) } # Run function and save output bb.results = run.bb() # model average mAve <- model.average(bb.results, "f0", vcv = TRUE) # f0 #----- mAve$estimates # estimate se lcl ucl # 9.839294 5.645021 3.196029 30.29125

www.phidot.org

Confidence intervals for model averaged estimates in RMark

Confidence intervals for model averaged estimates in RMark

Re: Confidence intervals for model averaged estimates in RMa

Re: Confidence intervals for model averaged estimates in RMa

Re: Confidence intervals for model averaged estimates in RMa

Re: Confidence intervals for model averaged estimates in RMa

Who is online