Hello,
I am trying to estimate yearly population sizes from 25 years' worth of mark-recapture data. I was given this data set and found that there was little to no sampling design across the years, and only some of the sampling was standardized. Because of this, I have a very limited number of individuals for most years of the study (e.g., the smallest year has 15 captures and 6 captures). I am assuming that my population is open because fish were captured during only 5 months of the year at one small location, which is why I chose the POPAN model to get yearly superpopulation estimates.
I initially ran the data with daily capture histories but got questionable population estimates with extremely wide confidence intervals for years with little data. I then re-ran all years with the capture histories pooled by month, for a total of 5 occasions. Population estimates and CIs improved for some years but got worse for others. Some years were almost identical between the daily and pooled capture histories, while others were drastically different.
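To make the pooling step concrete, here is a minimal base-R sketch of how a daily capture history collapses into monthly occasions. It is an illustration, not my exact script: the function name `pool_ch`, the example histories, and the day-to-month mapping in `groups` are all made up.

```r
# Collapse daily capture histories into pooled occasions.
# `ch`     : character vector of 0/1 capture histories (one string per fish)
# `groups` : vector assigning each daily occasion to a pooled occasion
#            (hypothetical mapping; here 2 sampling days per month)
pool_ch <- function(ch, groups) {
  stopifnot(is.character(ch), all(nchar(ch) == length(groups)))
  # One row per individual, one column per daily occasion.
  digits <- do.call(rbind, strsplit(ch, ""))
  # A fish is "caught" in a pooled occasion if it was caught on any day in it.
  pooled <- sapply(unique(groups), function(g) {
    as.integer(rowSums(digits[, groups == g, drop = FALSE] == "1") > 0)
  })
  # Force a matrix so a single history is handled the same way as many.
  pooled <- matrix(pooled, nrow = length(ch))
  apply(pooled, 1, paste0, collapse = "")
}

daily  <- c("1001000100", "0000011000", "0100000001")  # made-up histories
groups <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)              # made-up day-to-month map
pool_ch(daily, groups)
```

Note that any recapture within the same month disappears after pooling, so a year with many within-month recaptures carries less information in the pooled version than in the daily one.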
My main question is: which should I consider the more accurate fit (pooled or daily) when there are "good" and "bad" population estimates in both versions? I know that release.gof does not work for the POPAN model, but I did look at AIC and deviance for the most likely model each year, and I was getting negative deviance values even in years with a decent number of captures (e.g., 605 captures with 155 recaptures). Are negative deviance values normal? Or do I just not have enough data, and it is too underdispersed?
I also noticed while looking at my AIC table that the AIC values and parameter counts do not match between the table and the output I get when I run GOF for the individual best-fit model. For example, the table will say 40 parameters but the model GOF will say 38. Is something wrong in my code, or is that also normal?
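To show why the parameter count matters for the AIC values, here is a small sketch of the small-sample AIC (AICc) arithmetic, AICc = -2lnL + 2K + 2K(K + 1)/(n - K - 1), evaluated with the two counts. The `aicc` helper is my own, and the -2lnL value and the use of 605 as the effective sample size are placeholders, not numbers from my output.

```r
# AICc for a model with k parameters, effective sample size n,
# and -2 * log-likelihood neg2lnl.
aicc <- function(neg2lnl, k, n) {
  stopifnot(n - k - 1 > 0)
  neg2lnl + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

neg2lnl <- 1200  # placeholder value, not from my results
n       <- 605   # placeholder effective sample size

aicc_40 <- aicc(neg2lnl, k = 40, n = n)  # count shown in the model table
aicc_38 <- aicc(neg2lnl, k = 38, n = n)  # count shown for the single model
aicc_40 - aicc_38                        # shift caused by the 2-parameter gap
```

With the same likelihood, a difference of two counted parameters moves AICc by a bit more than 4 units, which is enough to reorder closely ranked models.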
I know that there are a lot of questions embedded in this post, but I am a new graduate student and I am teaching myself the art of R/experimental design! Thank you for your time.
Code:
library(RMark)
##### 2003 Data #####
pool_2003 <- convert.inp("C:/Users/mkt03/OneDrive/Documents/Edisto Data/DBF files/2003 pooled")
data.2003.ch <- process.data(data = pool_2003, model = 'POPAN', nocc = 5)
data.2003.dm <- make.design.data(data.2003.ch)
##### Set up model #####
run.mod <- function(ch, dm) {
  # List of formulas
  Phi.dot  <- list(formula = ~ 1, link = 'logit')
  Phi.t    <- list(formula = ~ -1 + time, link = 'logit')
  p.dot    <- list(formula = ~ 1, link = 'logit')
  p.t      <- list(formula = ~ -1 + time, link = 'logit')
  pent.dot <- list(formula = ~ 1, link = 'mlogit')
  pent.t   <- list(formula = ~ -1 + time, link = 'mlogit')
  N.dot    <- list(formula = ~ 1, link = 'log')

  ## Fitting the models: every combination of constant/time-varying Phi, p, and pent ##
  Phi.dot_p.dot_pent.dot <- mark(ch, dm, model.parameters = list(Phi = Phi.dot, p = p.dot, pent = pent.dot, N = N.dot), invisible = TRUE)
  Phi.dot_p.t_pent.dot   <- mark(ch, dm, model.parameters = list(Phi = Phi.dot, p = p.t, pent = pent.dot, N = N.dot), invisible = TRUE)
  Phi.dot_p.dot_pent.t   <- mark(ch, dm, model.parameters = list(Phi = Phi.dot, p = p.dot, pent = pent.t, N = N.dot), invisible = TRUE)
  Phi.dot_p.t_pent.t     <- mark(ch, dm, model.parameters = list(Phi = Phi.dot, p = p.t, pent = pent.t, N = N.dot), invisible = TRUE)
  Phi.t_p.dot_pent.dot   <- mark(ch, dm, model.parameters = list(Phi = Phi.t, p = p.dot, pent = pent.dot, N = N.dot), invisible = TRUE)
  Phi.t_p.t_pent.dot     <- mark(ch, dm, model.parameters = list(Phi = Phi.t, p = p.t, pent = pent.dot, N = N.dot), invisible = TRUE)
  Phi.t_p.dot_pent.t     <- mark(ch, dm, model.parameters = list(Phi = Phi.t, p = p.dot, pent = pent.t, N = N.dot), invisible = TRUE)
  Phi.t_p.t_pent.t       <- mark(ch, dm, model.parameters = list(Phi = Phi.t, p = p.t, pent = pent.t, N = N.dot), invisible = TRUE)

  # Collect every fitted model created inside this function into one marklist
  return(collect.models(external = FALSE))
}
##### running the model #####
mod_2003 <- run.mod(data.2003.ch, data.2003.dm)
mod_2003
mod_2003$model.table
mod_2003$Phi.dot_p.t_pent.dot$results$real