large dataset and problems with GOF and c-hat


Postby mariekedelange » Fri Nov 05, 2010 7:15 am

hi everybody,

I'm a novice user of MARK, and I'm encountering some difficulties in the analysis of my data. I have read the manual, and posted topics on this forum, but there are remaining questions.

I have a dataset on Greylag geese in the Netherlands, banded as goslings and resighted afterwards. It's a large Dutch project that started in 1993 and has continued ever since. In some years only a few goslings were banded; in the last couple of years larger numbers (>100 per year) have been banded. Banding took place in different areas of the Netherlands, which I've grouped into 4 regions. The number of birds banded differed between the regions. There are a lot of resightings.
I've collated all the resighting data into three-month intervals, and also into 1-year intervals (so a 0 is not seen, and a 1 is seen).
Since we marked the birds as goslings, I can differentiate between gosling survival and adult survival, which I know from the literature can be different. I could also differentiate further between subadult and adult survival.
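
For readers following along, here is a minimal Python sketch of the collation step described above, for the 1-year intervals. It turns raw records into 0/1 encounter histories with one character per year. The record layout (band ID, year), the function name and the sample band IDs are all hypothetical; they are not taken from the actual Dutch database.

```python
from collections import defaultdict


def encounter_histories(sightings, first_year, last_year):
    """Collapse (band_id, year) records into one 0/1 string per bird.

    Assumption: each record is a (band_id, year) pair and the banding
    event itself is included as a record. Several sightings of the same
    bird in the same year count once.
    """
    n_occasions = last_year - first_year + 1
    seen = defaultdict(set)
    for band_id, year in sightings:
        if first_year <= year <= last_year:
            seen[band_id].add(year - first_year)
    return {
        band_id: "".join("1" if i in occasions else "0"
                         for i in range(n_occasions))
        for band_id, occasions in seen.items()
    }


# Hypothetical records, for illustration only.
records = [
    ("A01", 1993), ("A01", 1995), ("A01", 1995),
    ("B17", 1994), ("B17", 1996),
]
histories = encounter_histories(records, 1993, 1996)
for band_id, history in sorted(histories.items()):
    print(band_id, history)
```

The same idea would apply to three-month intervals by mapping each sighting date to a quarter index instead of a year index.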

I started with what I thought was a simple analysis: all years and regions together, running the CJS model. With program RELEASE I then get highly significant values for the GOF test, and a c-hat of around 6, which leads me to the conclusion that the data are not in agreement with the model.
If I run the analysis for subsets, e.g. shorter time periods, or per region, then with RELEASE the GOF test is OK, and c-hat is smaller (around 1.75). So then I proceed with my further model comparisons.
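
As a point of reference for the c-hat values mentioned above: the usual RELEASE-based estimate is the overall GOF chi-square (TEST 2 + TEST 3) divided by its degrees of freedom, and that c-hat is then used to adjust AICc to QAICc. The sketch below shows the arithmetic only. The chi-square, degrees of freedom, deviance, parameter count and sample size are hypothetical numbers chosen to give a c-hat of 6; they are not results from this dataset, and whether to add one parameter for estimating c-hat is left out.

```python
def c_hat(chi_square, df):
    """Variance inflation factor: GOF chi-square divided by its df."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi_square / df


def qaicc(neg2_log_lik, n_params, n_eff, c):
    """Quasi-likelihood AICc: -2lnL / c-hat + 2K + 2K(K+1)/(n-K-1)."""
    k = n_params
    return neg2_log_lik / c + 2 * k + 2 * k * (k + 1) / (n_eff - k - 1)


# Hypothetical TEST 2 + TEST 3 totals, for illustration only.
c = c_hat(chi_square=180.0, df=30)
print("c-hat:", c)

# Hypothetical model: -2lnL = 1200, 10 parameters, effective n = 500.
print("QAICc:", qaicc(1200.0, 10, 500, c))
```

A c-hat near 1 suggests the general model fits; values well above about 3 are commonly read as a structural problem with the model rather than simple overdispersion.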

My question is, what is the problem with the large dataset? Is it too unbalanced?
From the subsets, in some cases I get a constant p value (p.) as best model, and sometimes a time-dependent p value (p t). Same for the Phi, in some subsets Phi is constant, in others Phi is time-dependent.
Could that be why the large dataset does not agree with the CJS model?

My apologies for being ignorant in this matter. I'm doing trial and error here, but I would like to understand why taking the large set or taking subsets differs so much.

I would really appreciate any input in this matter.

best wishes, Marieke
mariekedelange
 
Posts: 2
Joined: Fri Nov 05, 2010 5:42 am

Re: large dataset and problems with GOF and c-hat

Postby brouwern » Mon Nov 22, 2010 1:52 pm

Sounds like a complex data set. I'm a beginner myself, but here are some suggestions.

1. Check out
Grosbois, V., M. P. Harris, T. Anker-Nilssen, R. H. McCleery, D. N. Shaw, B. J. T. Morgan, and O. Gimenez. 2009. Modeling survival at multi-population scales using mark–recapture data. Ecology 90:2922–2932. [doi:10.1890/08-1657.1]
They used an advanced Bayesian approach, but maybe the article can give you a perspective on how to approach your data.

2. The CJS model assumes that all individuals have the same survival/recapture probabilities, regardless of where they were captured. Perhaps there are strong differences in mortality or dispersal at different locations of your survey. If there is dispersal between sites, you could try a multi-state model, which has a different GOF test.

3. You may want to look at an Age/Cohort model. Perhaps there are strong inter-annual effects on survival. I believe these models have a modified GOF test.

I've found GOF tests to be very hard to understand. I think one problem is that they are only recently being emphasized, and not all of the theory, let alone good manuals for the rest of us, has been worked out. The chapter in "A Gentle Introduction" is great but still hard to grasp. Good luck!
brouwern
 
Posts: 9
Joined: Thu Jan 21, 2010 9:51 am
Location: USA

Re: large dataset and problems with GOF and c-hat

Postby mariekedelange » Tue Nov 30, 2010 8:19 am

Thanks for the answer. I've looked up the reference; I need to study it to see whether it's applicable to our dataset. (I'm not familiar with Bayesian statistics yet.)
The data are indeed suitable for an age/cohort analysis, and I've applied the modified GOF test.
I have been checking the data, and found some errors in it. (It's a large Dutch database that depends on sightings of marked geese made by volunteers, and sometimes they make reading errors.) So I still need to do more analyses, and your answers are of help.
best Marieke
mariekedelange
 
Posts: 2
Joined: Fri Nov 05, 2010 5:42 am

