www.phidot.org

by **arpat** » Sat Oct 28, 2006 1:27 am

I have been working with multistate models in RMark, and I have learned some nice R tricks from Jeff Laake through email. Upon his suggestion, I will try to summarize some of these, so others can have access to the information until Appendix C is updated.

1. How to constrain some of the transition rates to be the same (e.g., B to C and C to C)?

Jeff's suggestion:
This is accomplished in RMark by creating a dummy variable 0/1 in the design data (e.g., "bc"):

ddl$Psi$bc=0
ddl$Psi$bc[ddl$Psi$stratum%in%c("B","C")&ddl$Psi$tostratum=="C"]=1

Then one can use "bc" in the model formula. E.g., if you were to use the formula ~bc+time then the intercept would be for the other strata and bc would be for C and D and the time effect would be additive. (Note that in this example other strata are constrained to be the same. One can also create separate design data columns for other parameters.)

2. How to fix some of the model parameters to zero (or another value)?

R help file for make.mark.model indicates that there are a number of options for fixing parameters in RMark: one can fix certain indices, times, ages, cohorts, or groups. Fixing indices is the most flexible one, and if you want to fix, for example, some of the transition rates to zero, it is the only option. However, note that these index numbers are the design data indices (row numbers), and not the PIM indices (it was not very clear to me reading the help file). PIM indices are stacked on top of each other, but design data indices are not, which makes index identification much easier. To see this more clearly, run the following code:

data(mstrata)
mproc=process.data(mstrata,model="Multistrata")
m.ddl=make.design.data(mproc)
test=mark(mproc,m.ddl,model.parameters=list(S=list(fixed=list(index=c(1,4,6),value=0)),p=list(fixed=list(index=c(3,8),value=1))))
m.ddl$S
m.ddl$p

Here's Jeff's suggestion on how to identify index numbers in a sleek way and fix the corresponding parameters in a model:

> One of the ways you can extract them is to construct a selection on the design data. What if you could only start in stratum C but couldn't transition there. Then all the Psi for tostratum="C" from A or B would be 0. Now also consider a strange situation in which it stays in B for only one time step and automatically transitions to A so Psi from B to A =1. Below in a series of steps I show how to construct the two sets of indices and values and then create a list for the fixed Psi parameters which is in turn used in the call to mark. Note that this is all built on the previous example which created mproc and m.ddl for the mstrata data. The way I've done it below is the best way to do things because it will extract the correct indices even if you were to change the data to create more capture occasions or even strata. If you were to specify the indices specifically by number then you'ld have to regenerate the set of indices. The code below is invariant to that kind of change. Also by creating a separate myPsifixed object it can be used with any of the formula and models for that data. This type of code can be used with any of the parameters and fields in the design data.

toC.zeroes=as.numeric(row.names(m.ddl$Psi[m.ddl$Psi$tostratum=="C",]))
toC.zeroes.values=rep(0,length(toC.zeroes))
fromBtoA=as.numeric(row.names(m.ddl$Psi[m.ddl$Psi$stratum=="B"&m.ddl$Psi$tostratum=="A",]))
fromBtoA.values=rep(1,length(fromBtoA))
myPsifixed=list(index=c(toC.zeroes,fromBtoA),value=c(toC.zeroes.values, fromBtoA.values))
test=mark(mproc,m.ddl,model.parameters=list(Psi=list(fixed=myPsifixed)) )

Two useful R functions used in this code: (1) "row.names" extracts the design data indices, and (2) "as.numeric" classifies them as numeric value.

Thanks to Jeff Laake for being very kind and generous in replying to my questions and providing the example code.

Arpat

by **cooch** » Sat Oct 28, 2006 1:17 pm

arpat wrote:
toC.zeroes=as.numeric(row.names(m.ddl$Psi[m.ddl$Psi$tostratum=="C",]))
toC.zeroes.values=rep(0,length(toC.zeroes))
fromBtoA=as.numeric(row.names(m.ddl$Psi[m.ddl$Psi$stratum=="B"&m.ddl$Psi$tostratum=="A",]))
fromBtoA.values=rep(1,length(fromBtoA))
myPsifixed=list(index=c(toC.zeroes,fromBtoA),value=c(toC.zeroes.values, fromBtoA.values))
test=mark(mproc,m.ddl,model.parameters=list(Psi=list(fixed=myPsifixed)) )

It is worth noting that one of the larger motivations for RMark is to shield the user from having to work directly with the design matrix - in theory, minimizing the potential for errors in the process.

However, it is reasonable to argue the potential for making errors in complex R code (see above) is no less - and perhaps even more - likely. One of the points of the GUI interface to the DM in MARK (including little 'crutches' like colored boxes to indicate non-zero elements) is that users are less likely to make mistakes working with things graphical, rather than a complex parsed set of R code. There is *nothing* intuitive about the preceding code (except to an R user).

Sure, if you 'speak R', that is perhaps a non-issue, but, most people don't (a fact which frequently shocks participants on the R listserv ;-)

by **arpat** » Sat Oct 28, 2006 2:12 pm

cooch wrote:There is *nothing* intuitive about the preceding code (except to an R user).

I agree that it looks pretty complicated to an R beginner. However, I am not a fluent R speaker either. It took me quite a while to understand it. I had to run the code line by line, and see what it is doing at each step.

I guess learning process includes copy&pasting others' codes, then learning to manipulate the code for your purposes, and then eventually writing your own.

I think, if you have time to invest in it, it eventually pays off. You increase your flexibility in working with Mark, and you become fluent in R, which is useful for many other analysis tasks.

by **egc** » Mon Oct 30, 2006 3:16 pm

I think, if you have time to invest in it, it eventually pays off. You increase your flexibility in working with Mark,

Not really - RMark doesn't give you new capabilities, simply new ways to run analyses using the same MARK numerical routines.

and you become fluent in R, which is useful for many other analysis tasks.

If you extend the logic of that argument, then getting everyone to (say) derive the information matrix themselves (rather than having MARK do it for you) would be useful, since it would force a deeper level of understanding. But, it takes a long time to do such things.

All software is a trade-off between ease/speed of use, and 'first principles' (e.g., using TeX to format documents, versus a WYSIWYG word processor). RMark's big strength - once you master it - is the ability to do a lot of things quickly, without the need to fiddle (from the Latin) with the DM. M-SURGE adopts pretty much the same approach. But, in many cases, substituting one level of complexity (e.g., obtuse R code) for another (big ugly DM's) may not actually yield large gains for the average use who isn't motivated to invest in learning a more general tool.

There is also a heuristic pedagogical argument to be made that forcing the user to interact with the DM is useful in that it forces users to think more deeply about linear models, which, arguably, is of more general utility than learning R (since learning R is only useful if you plan on using R, whereas learning something about linear models is useful regardless of which statistical software you use).

www.phidot.org

RMark tips to constrain and fix parameters in Mstate models

RMark tips to constrain and fix parameters in Mstate models

Re: RMark tips to constrain and fix parameters in Mstate mod

Who is online