brid0030 wrote: Both reviewers seemed to home in on the fact that there is no p-value associated with the analyses. Do any of you have any sage advice for making MARK results more palatable to those who crave p-values? I consulted section 4.6 of the MARK book in preparing my manuscript, and as I reread it, I didn't see a succinct way (i.e., something that fits into a couple of sentences) to relate the notion of "significance" in AIC. I am considering likelihood ratio tests, but then not all of the models are nested. Thanks for any wisdom you can impart.
Whether you get pulled over by the brown-shirts of the P-value army, or by the AIC police (I could list journals where, if you don't use AIC as the basis for inference, you're likely to get rejected), is largely a realization of the annoying stochastic bits of the 'review' process (i.e., it all depends on the journal, the 'ass editor', and his/her selection of reviewers - at least for journals that actually allow the 'ass editor' to select reviewers. I digress...).
More to the point, your question touches on an important, and fairly fluid, issue. As noted at the end of section 4.6 (current online version):
"While this approach (summing cumulative AIC weights) seems to have some merit, there is by no means consensus that this is the ‘best approach’ - see for example Murray & Conner (2009: ‘Methods to quantify variable importance: implications for the analysis of noisy ecological data. Ecology 90:348-355). Stay tuned - assessment of ‘relative importance’ of variables in complex models is a ‘work in progress’."
There are several quick (very quick - I'm already late) points to make:
1. first, I'd be somewhat uncomfortable thinking people were using 'the MARK book' as a canonical reference on anything more than 'how to use MARK'. The issue you're addressing goes far beyond what 'the book' is 'qualified' (or intended) to address. That being said, the 'book' does at least mention in broad terms some of the approaches, ideas, and concepts that relate to multi-model inference, since such inference is very much at the conceptual heart of MARK, and it would make little sense not to address 'big issues' to at least some degree. But, 'the book' is *not* a substitute for the primary literature (which is why we ask people not to cite it as such).
2. model averaging in some fashion yields parsimonious estimates of the things of interest - be they parameter values, or estimates of effect size. There is general agreement that, in this context, multimodel inference is a superior paradigm. There are high-level debates about just how to do the averaging, but in broad principle people agree (generally) that it is a robust framework. (A minimal numerical sketch of the averaging arithmetic follows this list.)
3. there is far less agreement, however, on how to handle 'factor importance'. The Murray & Conner paper (noted above) addresses this issue, but it is clearly not the final answer to the problem. There isn't even general agreement on how to construct very large candidate model sets so as to avoid model redundancy over various factors. This is all (as noted) a work in progress.
4. probably the best thing you can do that will not offend the 'purists' on either side (which is clearly of practical interest when you're trying to get something published) is to consider effect size - this is the basic admonition of B&A (Burnham & Anderson) in the first place. Specifying a priori how big an effect of some set of factors has to be before it is *biologically* meaningful is arguably the way you/we should proceed - when the interest is in the 'effect', and not simply in parsimonious estimation of a parameter (although clearly estimation of effect size is related to the latter). (Again, a short sketch of this follows the list.)
5. you mention 'experiments'. B&A acknowledge the role of experiments, and a properly conducted experiment is always encouraged. Whether you address the data collected during the experiment using classical approaches (like a LRT - many planned experiments yield a classical nested structure), or using an AIC approach, is not particularly important (or shouldn't be). Using the AIC approach, if you have a carefully constructed set of candidate models, with a balance between those that include and exclude the factor(s) of interest, then you can, in fact, make inferences that are relatively robust. Or, you could model average the effects of those factors using the same candidate model set and work from there. (A sketch of the LRT arithmetic also follows this list.)
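Following up on point 2: a minimal sketch of model-averaging a single parameter with AIC weights, including the Burnham & Anderson 'unconditional' SE. All numbers are made up, purely to show the arithmetic:

    import math

    # (AIC weight, estimate, conditional SE) for each model -- hypothetical
    rows = [
        (0.45, 0.81, 0.031),
        (0.30, 0.78, 0.040),
        (0.25, 0.84, 0.036),
    ]

    # model-averaged point estimate: theta_bar = sum_i w_i * theta_i
    theta_bar = sum(w * est for w, est, _ in rows)

    # 'unconditional' SE (Burnham & Anderson 2002):
    #   SE(theta_bar) = sum_i w_i * sqrt(se_i^2 + (theta_i - theta_bar)^2)
    se_uncond = sum(w * math.sqrt(se**2 + (est - theta_bar)**2)
                    for w, est, se in rows)

    print(f"model-averaged estimate: {theta_bar:.3f} (unconditional SE {se_uncond:.3f})")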
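And on point 4: once you've declared, a priori, how big an effect has to be to matter biologically, judging the estimate is simple arithmetic - no P-value required. A toy sketch (the estimate, SE, and threshold are all hypothetical):

    # estimated effect (e.g., a beta on the logit scale) and its SE -- made up
    beta_hat, se = 0.35, 0.12
    lo, hi = beta_hat - 1.96 * se, beta_hat + 1.96 * se

    threshold = 0.25   # effect size declared biologically meaningful a priori

    print(f"effect = {beta_hat:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
    print("CI excludes zero:", not (lo <= 0 <= hi))
    print("estimate exceeds a priori threshold:", beta_hat > threshold)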
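Finally, on point 5: if your experiment does yield nested models and you (or the reviewers) want a classical test, the LRT arithmetic is straightforward given the deviance (-2logL) and parameter count of each model. The numbers here are hypothetical, and the sketch assumes scipy is available for the chi-square tail probability:

    from scipy.stats import chi2

    # deviance (-2logL) and number of parameters -- made-up values
    dev_reduced, k_reduced = 412.6, 4   # reduced (constrained) model
    dev_general, k_general = 405.1, 6   # general model (reduced nested within it)

    lrt = dev_reduced - dev_general     # chi-square test statistic
    df = k_general - k_reduced          # difference in number of parameters
    p = chi2.sf(lrt, df)                # upper-tail probability

    print(f"LRT = {lrt:.2f} on {df} df, P = {p:.4f}")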
In short, the use of the word 'significant' by the reviewers is (hopefully) intended to get you to provide a robust summary of the importance of a factor. They may want it in terms of P-values, but you could make the argument that this is inappropriate, for many of the reasons discussed in B&A. Read the literature, and go from there.
Hope this helps.