Real parameter estimates using indiv cov's: MARK vs RMark

Postby jlaufenb » Fri Oct 31, 2008 10:31 am

I am building equivalent Huggins models in RMark and MARK to make sure my R code is correct. When I build a model where p=c and p varies as a function of a transformed distance covariate, I get different real and derived estimates, AICc, etc. from MARK vs RMark. I'm using an inverse-squared transformation (i.e., dist^-2). The design matrix in MARK consists of 2 columns: an intercept (all 1's) and the covariate (the power function power(dist,-2) in every row). In RMark, I use the parameter model specification pInvDistSq=list(formula=~I((dist)^-2),share=TRUE). When I build a similar model where p varies with the untransformed distance, I get matching estimates, AICc, etc. from both. Is there an issue with how these 2 programs perform the transformation, or am I missing something obvious?
jlaufenb
 
Posts: 49
Joined: Tue Aug 05, 2008 2:12 pm
Location: Anchorage, AK

Postby jlaake » Fri Oct 31, 2008 1:55 pm

Jared-

Did you look at the design matrix in the input file to see whether it created the proper values when using I()? I don't believe I've ever used I() in a formula, and I think in one of the help files I suggest creating a separate variable. Did you try creating a separate inverse-squared variable and using it to see if you got the same result?

Whenever you get differences between MARK and RMark it must be either the data or the model. They use the same mark.exe so what matters is what is in the data and what is in the design matrix. The first place to look is the DM in this case because you've already gotten the same results with a different model.

--jeff
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Postby jlaake » Fri Oct 31, 2008 4:55 pm

Jared-

I had one more thought. Is dist an individual covariate in the data for RMark, or is it in the design data? If it is in the design data, then it should be in both the p and c design data, because you are using a common model for capture (p) and recapture (c). If it is not, that may be the problem, because it would be using dist=0 for c. This should be apparent if you look at the design matrix it created.

Let me know---jeff

Postby jlaufenb » Mon Nov 03, 2008 2:58 pm

Jeff

To answer your afterthought, "dist" is an individual covariate and not in the design data. I checked the design matrix and it appears that, despite specifying a model for p using the I() function, the untransformed distance values were being used:

p:(Intercept) p:I((dist)^-2)
p g1 t1 "1" "dist"


Reals, AICc, etc matched a model I built directly in MARK using untransformed values. This tells me that I() does not work for individual covariates as it does for design data variables (e.g., Time, Age, etc).
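
For comparison, here is a minimal sketch of what I() does when base R builds a design matrix itself through model.matrix. The data frame `bears` and its three distance values are illustrative only (borrowed from the data snippet posted later in this thread). The sketch does not call RMark; it shows the numeric column one would expect from I(), in contrast to the "dist" string placeholder shown above.

```r
# Illustrative data: three Dist values from the snippet later in the thread
bears <- data.frame(dist = c(1.635, 2.278, 3.232))

# Base R evaluates I(dist^-2) and stores the transformed numbers
mm <- model.matrix(~ I(dist^-2), data = bears)

# Second column holds 1/dist^2, not the raw distances
print(mm)
print(bears$dist^-2)
```

If the second column of a design matrix instead shows the covariate name as a string, the transformation has not been applied at that stage, which is consistent with the matching results for the untransformed model.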

I have previously built models in MARK using the design matrix functions to create new individual covariates (i.e., power(dist,-2) to transform distance) and subsequently model p as a function of those covariates. This allows plotting parameter estimates against the original untransformed values, which can be more interpretable than plotting against transformed values. Is there a way to utilize these functions through RMark? If so, I think I could use the covariate.predictions function in RMark to produce plots similar to those I have produced in MARK. If not, is there a way in R to achieve the same goal?

Postby jlaake » Mon Nov 03, 2008 5:13 pm

I'll investigate whether I can incorporate I() for individual covariates, and if I cannot, I'll make it clear in the documentation that it isn't supported. Most likely I will not be able to support it because of the way individual covariates are handled in the design matrix. As anyone who uses MARK knows, they are entered as a string with the variable name. It was a trick to get them into RMark in the first place, because that aspect is not handled by the model.matrix function in R. If you are interested, the details are spelled out in Appendix C of Cooch and White.

I don't entirely understand the last paragraph of your message. If you define a new variable in your dataframe, say invdist2=1/dist^2, then you can use it in the formula and use it to get predictions. If you want to plot against the untransformed values, you can simply save or solve for the untransformed values to plot against. If this is still not clear, send me an email off-list on this subject.
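
A minimal sketch of that suggestion in plain R, under stated assumptions: the data frame `bears`, the grid `dist_grid`, and the column name `invdist2` are illustrative, and the two beta values are the ones quoted in the final post of this thread. The parameter specification is written in the same list form used in the first post but is only stored here, not passed to RMark.

```r
# Illustrative data frame with the raw individual covariate
bears <- data.frame(dist = c(1.635, 2.278, 3.232, 2.350, 2.468))

# Create the transformed covariate as its own column
bears$invdist2 <- 1 / bears$dist^2

# Parameter specification that refers to the new column instead of I()
pInvDistSq <- list(formula = ~invdist2, share = TRUE)

# Betas quoted at the end of the thread (intercept, slope on invdist2)
b0 <- -2.0243193
b1 <- 0.8301871

# Predict on a grid of raw distances, transforming each grid value first
dist_grid <- seq(min(bears$dist), max(bears$dist), length.out = 50)
pred <- data.frame(dist = dist_grid, invdist2 = 1 / dist_grid^2)
pred$p <- plogis(b0 + b1 * pred$invdist2)

# Plot p against the untransformed distance
if (interactive()) {
  plot(pred$dist, pred$p, type = "l", xlab = "dist", ylab = "p")
}
```

Keeping both `dist` and `invdist2` in the prediction data frame is what makes it possible to plot against the original scale without solving back from the transformed values.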

--jeff

Postby jlaufenb » Wed Nov 05, 2008 11:30 am

Jeff

Thanks for the reply. Disregard the last paragraph; I figured out what I needed. I could use some insight into some results I'm getting out of MARK. The situation is similar to the example in chapter 11.5 of Cooch and White, where they created mass^2 in the design matrix instead of bringing it in hard coded. I built the model mentioned in my first post 2 ways: one using an individual covariate hard coded into the input file, and one using the same covariate values but creating them from a separate covariate with a design matrix function. Both models produce the same AICc, and the derived estimates and betas are essentially the same (differences of ~1E-6 to 1E-8; rounding error, I'm guessing). However, the real parameter estimates reported in 'view estimates of real parameters' are different (hard-coded model: p=0.1467463 vs DM function model: p=0.1411468). Does anyone know why this is?

Postby jlaake » Wed Nov 05, 2008 1:07 pm

Jared-

I'm having a hard time figuring out what you are comparing in your message. But with individual covariates, the real parameter value depends on the value you use for the covariate. Thus, I expect the difference lies there. If you still have further questions about this you can contact me off-list.

--jeff

Postby jlaufenb » Mon Nov 10, 2008 4:49 pm

Here's a summary of the discussion Jeff and I had off the listserv about using design matrix functions to create individual covariates vs hard coding individual covariates into the input data. I've included a snippet of my data to illustrate hard coding.

/*Columns are "BearID", capture history, Count, "Sex", "InvDistSq", "Dist"*/
/*Sex coded: 1=Female, 0=Male*/

/*1*/ 1000000000 1 1 0.374079997 1.635;
/*2*/ 1000000000 1 1 0.1927048125 2.278;
/*3*/ 1011100010 1 1 0.0957320361 3.232;
/*4*/ 1000000000 1 1 0.1810774106 2.35;
/*5*/ 1111000000 1 1 0.374079997 1.635;
/*6*/ 1000000000 1 1 0.1641760072 2.468;
/*7*/ 1000010000 1 0 0.1084916878 3.036;
/*8*/ 1000100100 1 0 0.4795850247 1.444;
/*9*/ 1011000001 1 0 0.1587276392 2.51;
/*10*/ 1000110000 1 0 0.374079997 1.635;



"InvDistSq" is the inverse of distance squared (i.e., 1/(Dist^2)) and, of course, "Dist" is a distance measure used as an individual covariate. The inverse distance squared covariate can also be created in the design matrix by way of the power function (i.e., power(Dist,-2)). As I mentioned in a previous message, I built a model where p=c and p is modeled solely as a function of the inverse distance squared, using 2 different methods. The first method used the hard-coded "InvDistSq" and the second used the design matrix function. Both methods produced the same AICc, deviance, betas, etc. However, the real parameter estimates were different. In my case, I specified that the mean individual covariate values be used to calculate the reals. The mean "Dist" was 1.9491295 and the mean "InvDistSq" was 0.3179616. As Jeff pointed out, the 2 methods were:

using different covariate values because 1/(mean dist)^2 is not equal
to mean(1/dist^2).

INVDISTSQ 0.3179616
DIST 1.9491295

It is using 0.3179616 for the one estimate and
1/(1.9491295)^2=0.2632198 for the other. Thus

> plogis(-2.0243193+0.8301871*.3179616)
[1] 0.1467463
> plogis(-2.0243193+0.8301871*.2632198)
[1] 0.1411468
However, neither of these may be what you are looking for, which would be
the mean p for the population. That is more difficult to compute
because you need to know the distribution of individuals at each value
of dist.
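
The arithmetic above can be reproduced with a short R sketch. The betas and the two covariate means are the values quoted in this post; the variable names are illustrative. The ten `dist` values are only the rows shown in the data snippet, not the full data set, so their means differ from the ones reported above. The final average over those ten animals is an assumption-laden illustration, not the population mean p that Jeff describes.

```r
# Beta estimates and covariate means as reported in this thread
b0 <- -2.0243193             # intercept
b1 <- 0.8301871              # slope on inverse distance squared
mean_dist <- 1.9491295       # mean of Dist (full data set)
mean_invdistsq <- 0.3179616  # mean of InvDistSq (full data set)

# Hard-coded covariate: the mean of the transformed values is plugged in
p_hard <- plogis(b0 + b1 * mean_invdistsq)

# Design matrix function: the mean of the raw values is transformed
p_dm <- plogis(b0 + b1 / mean_dist^2)

# Same inequality on the ten Dist values from the snippet (illustration only)
dist <- c(1.635, 2.278, 3.232, 2.35, 1.635, 2.468, 3.036, 1.444, 2.51, 1.635)
mean_of_inverse <- mean(1 / dist^2)
inverse_of_mean <- 1 / mean(dist)^2

# Assumption: average p over these ten captured animals only; this is not
# the population mean p, since captured animals are not a random sample
p_avg_snippet <- mean(plogis(b0 + b1 / dist^2))

print(c(p_hard = p_hard, p_dm = p_dm))
print(c(mean_of_inverse = mean_of_inverse, inverse_of_mean = inverse_of_mean))
```

Because 1/x^2 is convex for positive x, the mean of the transformed values is at least as large as the transform of the mean, which is why the hard-coded model reports the larger p when the slope is positive.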


Thanks Jeff!

