Censoring individuals with missing data in multi-state model


Postby jbauder » Thu Sep 05, 2019 11:04 am

I am analyzing data on translocated bears with a multi-state survival model to estimate cause-specific mortality rates. We have records for about 1,000 bears; many were harvested by hunters, about 20 were killed by vehicles, and six were found dead of unknown causes. Our states are alive, dead from harvest, dead from vehicle (if sample sizes allow), and dead from all other mortality causes, with transition probabilities between dead states fixed to zero. This model works well with our data, but we are ultimately interested in modeling survival (i.e., the transition from alive to alive) as a function of translocation. Because bears were only translocated during years (i.e., capture events) in which they were captured, we were thinking of estimating survival separately for intervals following translocation events vs. intervals not following translocation events (I think this would just be a binary time-varying individual covariate). So the question is: is survival lower in intervals when a bear was translocated than in intervals when it was not? Or, for intervals following translocation events, we could model survival as a continuous function of translocation distance.

One issue is that the translocation distance, or even whether a bear was translocated, is sometimes unknown. For example, a bear was captured at time t, translocated Y km, recaptured at time t+X, but we do not know if it was translocated or how far it was translocated at time t+X. We could remove such individuals from our analyses but we would prefer not to if possible.
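A minimal sketch of the binary time-varying covariate described above, with the unknown cases kept explicit. The capture records, bear names, and the function `interval_covariate` are all hypothetical, and the sketch assumes that a bear not captured on an occasion was not translocated on that occasion.

```python
# Hypothetical records: bear -> {capture occasion: translocation distance in km}.
# 0.0 means captured and released on site; None means captured, translocation unknown.
captures = {
    "bear_A": {1: 35.0, 2: 120.0, 3: None},
    "bear_B": {1: 0.0, 3: 60.0},
}
N_OCCASIONS = 4


def interval_covariate(record, n_occasions):
    """One value per interval t -> t+1: 1 = translocated at t, 0 = not, None = unknown."""
    values = []
    for t in range(1, n_occasions):
        if t not in record:
            # Assumption: bears are only translocated when captured.
            values.append(0)
        elif record[t] is None:
            values.append(None)
        else:
            values.append(1 if record[t] > 0 else 0)
    return values


covariates = {bear: interval_covariate(rec, N_OCCASIONS) for bear, rec in captures.items()}
for bear, values in covariates.items():
    print(bear, values)
```

The `None` entries mark exactly the intervals for which a translocation effect on survival cannot be evaluated directly.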

Would an appropriate option be to censor such individuals after their last confirmed capture? It seems this would allow us to use all information prior to the last capture to inform survival without needing information about the translocation status at that last capture. I think this approach would be standard in known-fate models but I'm not sure about multi-state models, especially since censoring would not denote "loss at capture" but rather loss of information at capture. Does anyone have an idea about if/how censoring individuals might affect parameter estimates of a multi-state model?

Thanks for any input!
Javan
jbauder
 
Posts: 52
Joined: Wed May 25, 2011 12:01 pm

Re: Censoring individuals with missing data in multi-state m

Postby simone77 » Tue Sep 10, 2019 5:38 am

I am not sure I understand what your data look like.
1. Do you have the classical temporal structure required by (most) capture-recapture models, i.e. discrete capture sessions separated by intervals?
2. When you say "Because bears were only translocated during years (i.e., capture events) in which they were captured", do you mean that they were translocated at the beginning of their (individual) encounter histories?
3. When you say "One issue is that the translocation distance, or even whether a bear was translocated, is sometimes unknown. For example, a bear was captured at time t, translocated Y km, recaptured at time t+X, but we do not know if it was translocated or how far it was translocated at time t+X. We could remove such individuals from our analyses but we would prefer not to if possible.", do you mean that some of the bears you captured were already marked and you know nothing about either the location they came from or whether they had been translocated?

My answers assume that the answers to the three questions are 1 – yes, 2 – yes, and 3 – yes. However, I am not at all sure the answer to 3 is yes; what puzzles me is that you go on to ask "Would an appropriate option be to censor such individuals after their last confirmed capture?"
Why that? I am afraid I have not understood something here.

Just a few thoughts. You might handle the information on translocation using a grouping factor, defining one group of translocated bears and one of non-translocated bears. You would test the hypothesis that the translocation event affects apparent survival (in the translocated group) using an age-effect model with a specific survival probability for the first interval (after capture). I know that MARK has a multi-strata live and dead encounters format that I guess would suit your case, but I am not sure it would let you handle the incomplete information about translocation. In multievent modelling (E-SURGE) you would probably be able to do that by defining translocation as a state (translocated – not translocated) and building a parameter for the probability that the translocation state of some individuals is unknown (three events associated with this state: translocated, not translocated, translocation unknown). I could probably be more helpful if you gave more details about your study.
simone77
 
Posts: 157
Joined: Mon Aug 10, 2009 2:52 pm

Re: Censoring individuals with missing data in multi-state m

Postby jbauder » Tue Sep 10, 2019 4:04 pm

Thank you for replying to my post and I am very sorry for the confusion. I will re-post my question in the analysis & design questions forum, but I wanted to give you a quick reply here and try to answer your questions. The temporal structure of my data does have capture sessions (approximately spring-fall) separated by intervals (approximately fall-spring) and bears are translocated (or not) immediately following their capture. But you are right, for some bears we know that they were translocated but not where (so we cannot measure translocation distance) and for others we only know that they were recaptured (but not if they were translocated). I think this stems from the fact that the study has been going on since 1979 and was overseen by different agencies with different research/management questions over time. I do like your idea of using a grouping factor with the age-effect model and will definitely give that a try, although the E-SURGE approach you mention also seems like it would help with this problem.

My question about censoring individuals was motivated by the following scenario in my data: Bear A is captured and translocated at Year 1. The bear is recaptured and translocated again at Year 2. The bear is then recaptured at Year 3 but we do not know if it was translocated (or we do not know how far it was translocated). So with Bear A, we have information about its survival from Years 1-3 but the information regarding its translocation status is incomplete. My thought with censoring was that if I censored Bear A after Year 3, only the information up to Year 3 would inform survival, including survival as a function of translocation. The censoring should remove Bear A after its Year 3 capture so the bear’s unknown translocation status in Year 3 would not affect the survival estimation. At least that is how I understand censoring to work!

Thanks again for your reply!

Javan
jbauder
 
Posts: 52
Joined: Wed May 25, 2011 12:01 pm

Re: Censoring individuals with missing data in multi-state m

Postby simone77 » Wed Sep 11, 2019 6:04 am

Thanks for the clarifications. So you have discrete sessions, and I assume you are pooling observations within each session (e.g. from spring to fall). It is important that your pooling periods are not too long compared to the intervals; two important papers on this topic are Hargrove et al. 1994 and O'Brien et al. 2005. If you have unequal time intervals between sessions, you have to specify them in MARK, E-SURGE, or whatever software you use.

I think you first need to figure out how best to handle the information about translocation. I had understood that each individual could be translocated at most once, which justified the idea of defining groups of non-translocated vs. translocated individuals. If you use a grouping factor, as I suggested, you would be pooling in the same "translocated" group individuals that have been translocated different numbers of times. That would let you test whether translocated individuals, no matter how many times they were translocated, have different probabilities of apparent survival (and potentially recapture) than non-translocated individuals. I guess this may make little biological sense. You would do better to handle translocation as a state. The main difference between a group and a state in CMR analyses is that a grouping factor is a fixed individual trait (e.g. sex) whereas a state is a dynamic one (e.g. breeder/non-breeder, infected/non-infected, etc.). In your case "translocated" seems more like a state, am I right?

By the way, in CMR when you deal with states you need multistate models. However, I see you have a further level of complexity in your data, which has to do with the translocation distance. With other animals you might have data that come from, say, three locations (A, B, C), so that the movements between sites (A-B, A-C, B-C and vice versa) already "include" the information about distance. You do not have this framework, so you need to figure out something else. One possibility would be to treat translocation as a different state depending on some range of distances. For instance:
Non-translocated -> state A
< 100 km -> state B
> 100 km & < 300 km -> state C
> 300 km -> state D

Then you may have encounter histories like this:
A00ABD00C0
00AA0A0B00
0C0D0B0000
000AC0C000
...
where the first individual was captured in the first session of the study period, not captured in the 2nd and 3rd, recaptured in the 4th, recaptured in the 5th and translocated <100 km away, recaptured in the 6th and translocated >300 km away, not recaptured in the 7th and 8th, recaptured in the 9th and translocated between 100 km and 300 km away, and not recaptured in the 10th. In E-SURGE you would need to replace the letters with numbers. This would give you more flexibility than the grouping factor approach, and you would be able to test all the intermediate hypotheses you can think of (to name just one: only translocation itself has an effect, regardless of distance).
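The binning and coding described above can be sketched in a few lines of Python. This is only an illustration: the distances are made up, the function names are hypothetical, the treatment of exactly 100 km and 300 km (assigned to C and D here) is an assumption the post does not settle, and the A-D to 1-4 mapping is one possible numeric coding rather than a documented E-SURGE requirement.

```python
def distance_to_state(distance_km):
    """Map a translocation distance to a state letter (0 km = not translocated)."""
    if distance_km == 0:
        return "A"
    if distance_km < 100:
        return "B"
    if distance_km < 300:  # assumption: exactly 100 km falls in C
        return "C"
    return "D"             # assumption: exactly 300 km falls in D


def encounter_history(record, n_sessions):
    """Build a history string; '0' marks sessions without a capture."""
    return "".join(
        distance_to_state(record[s]) if s in record else "0"
        for s in range(1, n_sessions + 1)
    )


def to_numeric(history):
    """Replace state letters with digits (hypothetical coding A=1 ... D=4)."""
    codes = {"0": "0", "A": "1", "B": "2", "C": "3", "D": "4"}
    return "".join(codes[ch] for ch in history)


# Made-up distances (km) chosen to reproduce the first example history.
first_bear = {1: 0, 4: 0, 5: 40, 6: 350, 9: 150}
history = encounter_history(first_bear, 10)
numeric_history = to_numeric(history)
print(history, numeric_history)
```

Keeping the cut points in one function makes it easy to rerun the analysis with different distance classes.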

Now, regarding the censoring. For me, in general, censoring data is not a good idea. It is certainly problematic when the censored individuals have some characteristic(s) that make them different from the non-censored individuals. Sometimes this is obvious, but sometimes it is not, and it may bias estimates in a hard-to-predict way. So far, the best way of handling incomplete or partial information on states is probably a multievent approach*, something you can do in E-SURGE (software that was created specifically for this) and that at some point will probably be possible in MARK too (see this answer by E. Cooch and Kendall et al. (2012)). Multievent modelling can be seen as a further generalization of multistate models, which in turn are a generalization of the classic Cormack-Jolly-Seber models. Conceptually, a multievent approach maps the events, which are the information you collect in the field (and which may be incomplete or uncertain), to the (hidden) states, which are the real object of interest.

In the example I made before you would then have:
Five states (dead; captured non-translocated; captured B-like translocated; captured C-like translocated; captured D-like translocated) and six events (0= not captured; 1= captured and non-translocated; 2= captured and B-like translocated; 3= captured and C-like translocated; 4= captured and D-like translocated; 5= captured and unknown if translocated).
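The link between these events and states can be written as a matrix of event probabilities conditional on state. Below is a minimal sketch in Python, assuming a single capture probability `p` and a single probability `delta` that translocation information is recorded at capture; both names and their values are hypothetical, and this is not E-SURGE syntax.

```python
STATES = ["dead", "A", "B", "C", "D"]  # A = non-translocated, B-D = distance classes
N_EVENTS = 6                           # events 0..5 as listed above


def event_matrix(p, delta):
    """Rows = states, columns = events; entry = P(event | state)."""
    matrix = []
    for i, state in enumerate(STATES):
        row = [0.0] * N_EVENTS
        if state == "dead":
            row[0] = 1.0                # dead animals are never captured
        else:
            row[0] = 1.0 - p            # alive but not captured
            row[i] = p * delta          # captured, translocation class recorded
            row[5] = p * (1.0 - delta)  # captured, translocation unknown
        matrix.append(row)
    return matrix


obs = event_matrix(p=0.6, delta=0.9)
for state, row in zip(STATES, obs):
    print(state, [round(x, 3) for x in row])
```

Each row sums to one, and event 5 is the only event that can arise from more than one live state, which is how the "unknown" records still contribute to the likelihood.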

Then you would need to set up a parameter for the probability that an individual captured in one of the possible states was captured without information on translocation. If you decide to give it a try, a good starting point is Pradel (2005) and the E-SURGE manual. You may also find many papers using E-SURGE with step-by-step explanations in the supplementary materials, and the phidot forum may be of great help too. Feel free to contact me in private if you need more specific help.
simone77
 
Posts: 157
Joined: Mon Aug 10, 2009 2:52 pm

Re: Censoring individuals with missing data in multi-state m

Postby jbauder » Wed Sep 11, 2019 2:51 pm

Thank you very much for your thorough reply! I had not thought about my data in the context of states and events (except for using states to estimate cause-specific survival) but it does make a lot of sense, particularly for individuals with multiple captures where translocation status is unknown for some of those captures. Binning my translocation distances into groups also makes a lot of sense. So to estimate cause-specific mortality, would I change the "dead" state to "dead-from harvest," "dead-from vehicle," and "dead-from other" and keep the events the same?

I have very little experience in E-SURGE so I will look at the resources you mentioned but I would also really appreciate the chance to follow up privately with some specific questions.

Thanks again
Javan
jbauder
 
Posts: 52
Joined: Wed May 25, 2011 12:01 pm

