When to use model-selection

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

Postby aswea » Mon Jun 08, 2020 3:56 pm

Hi Phidot,

We got an interesting comment in a review of a recent manuscript. Could someone check my response? The Reviewer is clearly knowledgeable, but the approach has thrown me somewhat and I’d appreciate advice.

Background: We’ve been using spatial variants of the CJS live-recaptures model to estimate survival and detection efficiency for migratory salmon. In this particular study, we released 151 fish to swim past 4 detection sites. The fish were divided roughly evenly into two groups by the type of acoustic tag they were implanted with (V7 = larger tag; V4 = smaller tag), and each tag-type group was evenly divided into two groups depending on whether a small gill biopsy was collected at tagging (biopsy = no/yes). We want to know if tag type (i.e. tag size/tag burden) or gill biopsy affected fish survival.

For the manuscript, we came up with a series of 20 candidate models to address this question, in which survival was modelled as a function of time (i.e. migration segment) and/or the type of tag and the presence/absence of gill biopsy. Detection probability was modelled the same way in all models, as time*tag_type, because gill biopsy was unlikely to affect detection probability. I used AIC to rank the models and then model-averaged.

Reviewer’s comment: (The Reviewer refers to “Model 16” which is the fully-varying model Phi(time*tag_type*biopsy)p(time*tag_type).)

“The value of the study is in the comparison of treatment effects at the three points in the migration. As currently presented, with 20 hypotheses with combinations of treatments in different segments, it detracts from the most relevant information which is the comparison of treatments at different points in the migration. Model selection for parsimony and model averaging would be appropriate if the purpose was to estimate survival through the migratory stage controlling for experimental factors (tag type, biopsies) that can affect survival of individual fish. But that is not the purpose of this experiment, to obtain the most parsimonious model. Rather, the purpose is to determine if the treatments result in statistically significant differences in inferred survivals at three points in time. The analysis is already done, you simply present the parameter estimates from model 16.

Model 16 from table 1 is much clearer to me if you write out the equation for the two-factor design of the experiment. I assume MARK or the R package uses the logit transformation for both detection and survival.

In each segment j: tag type V4 no biopsy (mu), tag type V4 with biopsy (gamma), tag type V7 without biopsy (alpha), and tag type V7 with biopsy (delta).

logit(phi_i,j) = mu_j + alpha_j * T1_i + gamma_j * T2_i + delta_j * T3_i     (model 16)
where i is the fish-id subscript,
j is the subscript for the segment of river for which survival is inferred from detections,
mu is the survival rate of fish with V4 tags and no biopsy (in this structure, the reference condition),
alpha is the incremental change in survival of fish with a V7 tag and no gill biopsy relative to the reference state,
gamma is the incremental change in survival of fish with a V4 tag and a gill biopsy relative to the reference state,
delta is the incremental change in survival of fish with a V7 tag and a gill biopsy relative to the reference state, and
T1_i to T3_i are fish-specific binary variables identifying the treatments: T1 = V7 no biopsy, T2 = V4 with biopsy, and T3 = V7 with biopsy.

So treatment effects, relative to V4 as the reference, are directly inferred from the CJS model estimates of the parameters.”
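To make sure I understood the reviewer's coding, here is a minimal numerical sketch (in Python, with made-up beta values; RMark itself is not needed for this) of how the four group survivals fall out of that equation:

```python
import math

def inv_logit(x):
    """Back-transform from the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical beta values for one migration segment j (illustration only):
mu = 1.5      # logit survival of the reference group (V4, no biopsy)
alpha = -0.4  # increment for V7, no biopsy (T1)
gamma = -0.2  # increment for V4, with biopsy (T2)
delta = -0.7  # increment for V7, with biopsy (T3)

# Treatment indicators (T1_i, T2_i, T3_i) for the four groups:
groups = {
    "V4_no_biopsy": (0, 0, 0),  # reference condition
    "V7_no_biopsy": (1, 0, 0),
    "V4_biopsy":    (0, 1, 0),
    "V7_biopsy":    (0, 0, 1),
}

# logit(phi) = mu + alpha*T1 + gamma*T2 + delta*T3, then back-transform:
survival = {g: inv_logit(mu + alpha * t1 + gamma * t2 + delta * t3)
            for g, (t1, t2, t3) in groups.items()}
```

Each beta other than mu is read directly as a treatment effect relative to the V4/no-biopsy reference, which is the reviewer's point.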

Suggested Response: The Reviewer is correct that our main goal for this paper is the comparison of treatments at three points in the migration rather than estimating survival, and has suggested an interesting approach. However, the multi-model comparison approach we used is appropriate for our study for several reasons. First, we are dealing with a limited sample size. The Reviewer suggests that we base our inferences on the most highly parameterized model, which means that the errors on the beta parameter estimates are larger than they would be for less-parameterized candidate models, reducing our ability to detect effects (e.g. for just tag type, just biopsy, or for additive effects only). Later in the review, the Reviewer acknowledges that we may have sample-size issues. Second, the Reviewer is advocating that we judge statistical significance by whether the 95% confidence intervals on the beta parameter estimates overlap zero, which is a traditional significance test. We would prefer to continue with the evidence-based approach, which does not rely on a fixed significance level. Finally, although the survival estimates are of reduced importance in our study, we do wish to present the best possible estimates given the data.

The Reviewer’s suggested parameterization for our fully-varying model (Model 16) is interesting, and we were able to reproduce it in Program MARK; however, we’d like to clarify that we did not have the model ‘already done’ using this method. The RMark interface to Program MARK (which we used for this analysis) takes a different approach. While Program MARK has a flexible design matrix that allows the beta parameters to be specified in different ways, the method proposed by the Reviewer is an advanced use of it.

We would also like to clarify that the Reviewer is proposing comparison of the beta parameters to identify statistical significance, rather than the real survival estimates. Multiple beta parameters can be combined to estimate each real parameter, and each beta has an associated error. In our case, the real parameters carry error from the estimation of survival in each segment (time) in addition to the variables we are interested in testing (tag_type and biopsy).

Further questions: Is the reviewer’s approach preferable in situations where there is a large sample size? And what does the reviewer mean when saying that our goal is not to obtain the most parsimonious model? I know what a parsimonious model is, but not why one would prefer a more highly parameterized one.

Thank you!!
Aswea
aswea
 
Posts: 27
Joined: Sat Oct 17, 2009 3:32 pm
Location: Gander NL

Re: When to use model-selection

Postby simone77 » Tue Jun 09, 2020 6:37 am

We use models to estimate the parameters of interest (e.g. survival, dispersal) and to understand the dynamics of the process (e.g. causal factors, differences between groups, states, etc.). Sometimes we are particularly interested in the estimate of a parameter that we need, for example, to feed another model (e.g. population growth), to perform a sensitivity analysis, or simply because a manager has asked us to calculate it for a report. In that case, we need a credible estimate. Model-averaging is by now a standard way of obtaining one, as it accounts for the ever-present uncertainty in the set of candidate models. Put differently, it accounts for the uncertainty about which model best describes the process (among the descriptions/models you've put together). I am not sure whether you model-averaged values of parameters (e.g. survival of V4 individuals) or estimates of effect sizes (e.g. tag-type V4 vs tag-type V7). The reviewer appears to be more interested in effect sizes than, for instance, in the individuals' probability of survival according to some covariate. Model-averaging effect sizes is not trivial; you may find some discussion about this point in this forum, for example here.

Then, the reviewer argues you don't need model averaging because your study's goal is to test hypotheses, not to estimate parameters. Just as a side observation on this point: while hypothesis testing may be the study's primary goal, nothing prevents one from also pursuing a credible estimate of some of the model's parameters. Anyway, you may use model selection (AIC comparison of the a priori defined set of candidate models) or p-values to test hypotheses. Model selection is of little use if you do not pay attention to the estimated effect sizes. This is an important point: we sometimes forget that if the candidate set of models is poorly defined, meaning that we are not accounting for the relevant drivers of the process, model selection will still pick a best model (lowest AIC). Such a best model may even perform considerably better than the others (e.g. deltaAIC > 2), but if you look at the effect size you may realize it is biologically irrelevant. The reviewer (I am not him/her) is suggesting you pay attention to the effect sizes, and I guess is giving you some hints on how to find them. I see nothing wrong with looking at the effect sizes AND the 95% CIs of those estimates, because they give an idea of the uncertainty of the estimated effect size (according to the model and the data available). Confidence intervals and p-values are not the same thing: reporting the 95% CI (I do not find where the Reviewer asks you to do that) does not amount to a significance test. You may want to take a look here about this.
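To make the mechanics concrete, this is all that AIC ranking and model-averaging of a real parameter amount to (the AIC values, model names, and survival estimates below are invented for illustration):

```python
import math

# Hypothetical AIC values and survival estimates from three candidate models
# (numbers are made up for illustration only):
models = {
    "Phi(tag_type)p(segment)": {"aic": 410.2, "phi": 0.78},
    "Phi(.)p(segment)":        {"aic": 411.5, "phi": 0.81},
    "Phi(biopsy)p(segment)":   {"aic": 414.0, "phi": 0.80},
}

# Delta AIC and relative likelihood of each model:
aic_min = min(m["aic"] for m in models.values())
for m in models.values():
    m["delta"] = m["aic"] - aic_min
    m["rel_lik"] = math.exp(-0.5 * m["delta"])

# Akaike weights normalize the relative likelihoods to sum to 1:
total = sum(m["rel_lik"] for m in models.values())
for m in models.values():
    m["weight"] = m["rel_lik"] / total

# Model-averaged survival: each model's estimate weighted by its Akaike weight
phi_avg = sum(m["weight"] * m["phi"] for m in models.values())
```

Note how the averaged estimate is pulled toward the better-supported models; averaging effect sizes (rather than reals) is where this gets less trivial, because the betas mean different things in different models.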

Finally, I don't know whether your set of candidate models includes models that allow you to isolate the effect of tag-type or biopsy by comparing the AIC of a model with that effect against the AIC of a nested model without it. I think it would be advisable, but how much sense it makes also depends on whether their effects, if any, interact with some other variable (like the river segment). I don't have all the elements to judge whether you really need to estimate the effect of tag-type and biopsy from the most parameterized model. I think you would not need to do so if you have some simpler (less parameterized) model with good AIC support that contains, for example, the tag-type effect, and you can compare it with an otherwise identical model without that effect (e.g. {survival(tag-type) recapture(segment)} vs {survival(constant) recapture(segment)}).
simone77
 
Posts: 197
Joined: Mon Aug 10, 2009 2:52 pm

Re: When to use model-selection

Postby aswea » Wed Jun 10, 2020 10:04 am

Thank you for your advice and time, Simone! Having this list as a resource has certainly been helpful, and I’m grateful that people make the effort to assist.

From your answer, I’m reassured that there is nothing wrong with our approach for this manuscript. However, I’ll modify my response to remove the reference to the 95% CIs of the betas as the ‘statistical test’ recommended by the reviewer. I did a search for ‘z-test’ on the list, and Jeff states that beta/se can be treated as an approximate z-test of the null hypothesis that beta = 0. For this particular paper, however, I used the model-averaged real parameter estimates to present ‘effect size’ rather than the beta estimates, because of the complexity of model-averaging the betas (I haven’t looked into this much) and because I think the reals are easier to understand. But the reals can also be confusing: people tend to compare the confidence intervals on the real estimates and, if they overlap, conclude there is no difference between treatments, when instead they should be looking at the CIs for the betas and the AIC ranking. I will try to clarify this in the paper.
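For my own notes, the beta/se z-test Jeff describes is just a Wald test; a sketch with made-up numbers:

```python
import math

def wald_z_test(beta, se):
    """Approximate z-test of H0: beta = 0 (beta assumed approx. normal)."""
    z = beta / se
    # Two-sided p-value from the standard normal CDF, computed via erf:
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Hypothetical biopsy-effect beta on the logit scale (illustration only):
beta, se = -0.45, 0.30
z, p = wald_z_test(beta, se)

# The matching 95% CI on the logit scale:
lo, hi = beta - 1.96 * se, beta + 1.96 * se
```

Here the CI straddles zero and p > 0.05; the CI conveys the same Wald approximation plus the range of plausible effect magnitudes.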

Re: When to use model-selection

Postby mcmelnychuk » Wed Jun 10, 2020 1:28 pm

hi Aswea,

I think you can get the reviewer's suggested structure from a simple tweak in your RMark formula.

Instead of specifying like this:
Phi(time*tag_type*biopsy)

Try this:
Phi(time:tag_type:biopsy)

For Phi, instead of returning main effects for time, tag_type, and biopsy, plus the interactions required to describe departures from those main effects, that formula will return beta parameters for each "tag_type:biopsy" combination at each time. The real parameters resulting from the two formulations should be the same, but the beta parameters will differ. The second formulation will assign one of the "tag_type:biopsy" groups as the reference group (the reviewer's mu parameters), and if you have three estimable times, then you'll have three beta parameters corresponding to alpha, three corresponding to gamma, and three corresponding to delta. You'll be able to tell which betas go with which group from the subscripts returned on the beta parameters (e.g. ...tag_typeV7biopsyyes). They represent differences from the beta parameters called mu.

Since it's the full model anyway, it might be easier to have the beta parameters directly represent estimates of (logit) survival rather than the differences from mu suggested by the reviewer. You could get that by specifying:
Phi(-1 + time:tag_type:biopsy)

That should also give the same number of betas and return the same real parameters; it just changes what the beta parameters represent.
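A quick way to convince yourself the two codings agree on the reals (a numerical sketch in Python as a stand-in for the RMark formulas, with two levels per factor to keep it small):

```python
from itertools import product
import numpy as np

# One row per factor combination: two times, two tag types, biopsy yes/no.
# (The real model has three times; two keeps the example small.)
cells = list(product([0, 1], repeat=3))  # (time, tag, biopsy)

# "time*tag_type*biopsy": intercept, main effects, and all interactions
X_full = np.array([[1, t, g, b, t * g, t * b, g * b, t * g * b]
                   for t, g, b in cells])

# "-1 + time:tag_type:biopsy": one indicator column per cell (cell means)
X_cell = np.eye(8, dtype=int)

# Both matrices are full rank, so with 8 cells each is just a
# reparameterization of the other: same fitted (real-scale) values,
# different betas.
rank_full = int(np.linalg.matrix_rank(X_full))
rank_cell = int(np.linalg.matrix_rank(X_cell))

# Same column space: projecting the cell-means columns onto the span of
# X_full reproduces them exactly.
P_full = X_full @ np.linalg.inv(X_full.T @ X_full) @ X_full.T
same_span = bool(np.allclose(P_full @ X_cell, X_cell))
```

Fitted values depend only on the column space of the design matrix, which is why the reals match while the betas answer different questions.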

Hope that helps (hope all's well!),

Mike
mcmelnychuk
 
Posts: 24
Joined: Thu Apr 17, 2008 4:11 pm
Location: University of British Columbia

Re: When to use model-selection

Postby aswea » Wed Jun 10, 2020 3:23 pm

Hi Mike!

It's good to hear from you! Thanks for taking the time to think about this with me especially since I don't think you've been doing this type of work lately..?

I was able to create the Reviewer's suggested model, but I did it in Mark where I could better control the design matrix. I'd like to better understand your alternatives.

Phi(-1 + time:tag_type:biopsy) is the fully-varying model I actually built for the paper, but I don't think it is useful in this situation because the beta parameters include the effects of estimating segment-specific survival (time). To get at the effect of just tag_type and/or just biopsy, it's necessary to subtract the right betas and combine their standard errors accordingly.
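The kind of beta arithmetic I mean looks like this (a sketch with made-up numbers; on the logit scale, the SE of a difference needs the covariance of the betas too):

```python
import math

# Hypothetical cell-means betas (logit scale) for two groups in the same
# segment, with their SEs and covariance (all numbers invented):
b_v4, se_v4 = 1.50, 0.28   # V4, no biopsy
b_v7, se_v7 = 1.10, 0.31   # V7, no biopsy
cov = 0.02                  # covariance between the two beta estimates

# Tag-type effect in that segment, on the logit scale:
diff = b_v7 - b_v4

# Var(b1 - b2) = Var(b1) + Var(b2) - 2*Cov(b1, b2)
se_diff = math.sqrt(se_v7**2 + se_v4**2 - 2.0 * cov)

# Approximate 95% CI for the difference:
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
```

With the reviewer's treatment coding, this difference is a single beta and MARK reports its SE directly, which is the appeal of that parameterization.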

I'm not following the approach with the other model, Phi(time:tag_type:biopsy). As you say, it returns beta parameters for each combination of "tag_type:biopsy" at each time, but then the intercept doesn't mean anything. Yet in the second part of the paragraph you describe how one of the groups is the intercept, which doesn't jibe with there being betas for every combination at every time.

Can I contact you off-list? I would also really appreciate your advice on how best to present effect-size when using model-averaged survival estimates.

Aswea

Re: When to use model-selection

Postby mcmelnychuk » Wed Jun 10, 2020 8:52 pm

Thanks for taking the time to think about this with me especially since I don't think you've been doing this type of work lately..?


Correct - it seems I just can't step away from phidot.

I was following what you'd written (or what the reviewer interpreted) about your goal being to estimate how the tag_type:biopsy effects change over time, hence my suggestion of
Phi(-1 + time:tag_type:biopsy). But if you're instead interested in the overall effects of tag_type, biopsy, and tag_type:biopsy, then you're right: it's ideal to have estimated beta parameters directly associated with those terms. You might be able to get this by specifying:
Phi(tag_type*biopsy + time:tag_type:biopsy)
or you might also need a "... + time" term added to that, either with or without "-1 + ...".

Sure, feel free to contact me off-list. cheers!

