Test Significance Between Estimates - survival between areas


Test Significance Between Estimates - survival between areas

Postby ctlamb » Sat May 30, 2015 5:43 pm

** Solution to this issue posted by ctlamb on the second page **

I have computed measures of apparent survival, recruitment and population growth for three areas, using three different data sets. I would like to test whether the estimates are significantly different from one another.

Initially I took the mean and SE (which equals the SD for a parameter estimate; see viewtopic.php?f=1&t=2209&p=6830&hilit=standard+deviation), drew samples from a normal distribution with those parameters, and compared the samples using ANOVA and a post hoc Tukey HSD.

However, this approach seems sensitive to how many samples I draw (i.e., at 1,000 samples the HWY vs. FH comparison is not significant, but at 10,000 it is).

Any help?

Code:
### SURVIVAL
# Draw samples from a normal distribution using each area's estimate (mean) and SE (sd)
set.seed(1)

FH  <- rnorm(1000, mean = 0.78,  sd = 0.14)
HWY <- rnorm(1000, mean = 0.765, sd = 0.19)
SR  <- rnorm(1000, mean = 0.78,  sd = 0.17)

# Pairwise two-sided t-tests
t.test(FH, HWY, alternative = "two.sided")
t.test(SR, HWY, alternative = "two.sided")
t.test(SR, FH, alternative = "two.sided")

# Stack the samples and compare all three areas with ANOVA and a Tukey HSD
Surv <- rbind(data.frame(x = FH,  Strata = "FH"),
              data.frame(x = HWY, Strata = "HWY"),
              data.frame(x = SR,  Strata = "SR"))

ANOVA_Surv <- aov(x ~ Strata, data = Surv)
summary(ANOVA_Surv)
TukeyHSD(ANOVA_Surv)



Re: Test Significance Between Estimates - survival between areas

Postby jlaake » Sat May 30, 2015 5:52 pm

Sure, it is a function of sample size the way you are doing it, and I'm not certain why you are taking that approach. If they are independent analyses then you can simply construct a z-score with (est1-est2)/sqrt(se1^2+se2^2). If the estimates are probabilities then you may want to do the test on the logit scale. But rather than doing that, why didn't you analyze the data sets together, which would give you a direct measure of any differences?
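
For illustration, here is a minimal sketch of that z-test in R. The estimate and SE values are placeholders rather than results from the actual analyses, and the logit-scale standard errors come from the delta method, which is one way to carry the test onto the logit scale.

Code:
# z-test for two independent estimates (placeholder values, not real results)
est1 <- 0.78;  se1 <- 0.14    # e.g. apparent survival, area 1
est2 <- 0.765; se2 <- 0.19    # e.g. apparent survival, area 2

# z-score and two-sided p-value on the probability scale
z <- (est1 - est2) / sqrt(se1^2 + se2^2)
p_value <- 2 * pnorm(-abs(z))

# Same comparison on the logit scale; delta method: se(logit(x)) ~= se(x) / (x * (1 - x))
logit <- function(x) log(x / (1 - x))
se1_logit <- se1 / (est1 * (1 - est1))
se2_logit <- se2 / (est2 * (1 - est2))
z_logit <- (logit(est1) - logit(est2)) / sqrt(se1_logit^2 + se2_logit^2)
p_value_logit <- 2 * pnorm(-abs(z_logit))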

--jeff

Re: Test Significance Between Estimates - survival between areas

Postby ctlamb » Sat May 30, 2015 6:05 pm

These are probabilities, if I understand correctly. So I should transform them using the logit transformation and then compare them pairwise using z-scores?

I would love to analyze them together, but these data were collected over a number of years and each area does not have the same number of primary sessions (one area was 2006-2013, another 2007-2013, and the other 2007 and 2010-2013). The number of secondary sessions within a year was not constant either. I wasn't sure how to deal with that when specifying time intervals.

Re: Test Significance Between Estimates - survival between areas

Postby jlaake » Sat May 30, 2015 6:25 pm

This probably shouldn't have been posted under RMark, but I can bring it back to that with my suggestion. What you could do is create a combined capture history by appending the capture histories. That gives you one long capture history in which you can set time intervals and secondary sessions. You'll have to pad with 0's over the occasions a particular data set doesn't cover. Then make each data set a different group and assign a begin.time for the first primary occasion (which could be a dummy one) such that you get the correct times for each data set. You'll have to fix parameters for the padded 0's - setting p=0, S=1, etc. I'm assuming you are using the robust design, but I'm not sure which one. Either way it should work. Before you go down this route with your data, try a CJS example with the dipper data to make sure you follow what I'm saying (a rough sketch of building such a combined data set is at the end of this post).

For example, if I had 2 sets of CJS data with 7 occasions and the first started in 1990 and the second in 1993, I could do the following

ch group
10010100000000 1
00000001010111 2

process.data(data, groups = "group", begin.time = c(1990, 1986))

I use 1986 for group 2 because that will make the first real occasion for group 2 (column 8) be 1993. Then you would want to set:

Code:
# group 1 has no sampling over its padded occasions, so fix p=0 there and Phi=1 over those intervals
ddl$p$fix=NA
ddl$p$fix[ddl$p$group==1 & ddl$p$time%in%1997:2003]=0
ddl$Phi$fix=NA
ddl$Phi$fix[ddl$Phi$group==1 & ddl$Phi$time%in%1996:2002]=1

You don't need to fix p and Phi for group 2 because for the CJS model the initial 0's don't matter. That will not be the case for the robust design model, and it will be a little more complicated, but if you work at it you should be able to figure it out.
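
Here is a rough sketch of building that combined, padded data set, assuming two CJS data frames with 7-occasion ch strings (the names cjs1 and cjs2 are made up for illustration):

Code:
library(RMark)

# Pad each 7-occasion capture history out to 14 occasions:
# group 1 occupies columns 1-7 (1990-1996), group 2 columns 8-14 (1993-1999)
pad <- paste(rep("0", 7), collapse = "")   # "0000000"

cjs1$ch <- paste0(cjs1$ch, pad)   # group 1: real data first, padded 0's after
cjs2$ch <- paste0(pad, cjs2$ch)   # group 2: padded 0's first, real data after
cjs1$group <- 1
cjs2$group <- 2

combined <- rbind(cjs1, cjs2)
combined$group <- factor(combined$group)

# one begin.time per group; 1986 for group 2 puts its first real occasion at 1993
combined.proc <- process.data(combined, model = "CJS", groups = "group",
                              begin.time = c(1990, 1986))
combined.ddl  <- make.design.data(combined.proc)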

--jeff

Re: Test Significance Between Estimates - survival between areas

Postby ctlamb » Sat May 30, 2015 7:06 pm

Ah, neat workaround, Jeff. Thanks for this. This will be a good test of my understanding of the model construction.

Sorry about posting in the wrong forum. One last question about the significance tests between the independent data sets (I intend to try your suggestion for a single model above, but I need a rough idea in the short term, and I think getting the data ready for the single model will take me a bit).


Can I still use my normal-distribution approach, but use the effective sample size from the MARK output as the number of samples I draw? Or is this inappropriate? I tried the z-score as you recommended; I'm just trying to better understand what is best to do in this case.

Re: Test Significance Between Estimates - survival between areas

Postby ctlamb » Sat May 30, 2015 9:26 pm

jlaake wrote:
For example, if I had 2 sets of CJS data with 7 occasions and the first started in 1990 and the second in 1993, I could do the following

ch group
10010100000000 1
00000001010111 2



I recognize that in your example above you have two individuals from two areas (areas denoted as groups 1 and 2).

My areas share a few individuals (i.e., some individuals moved between areas and are part of both data sets). If I have, for example, two areas and they share an individual, does my capture history look like this?

ch group
/*Animal X*/ 10010100000000 1
/*Animal X*/ 00000001010111 2

^^ That would actually be a nice example, where it looks like the animal was initially solely in area 1 and then moved to area 2.

But in reality a few of my individuals just move directly between areas (hence why I chose to use the Pradel open models), and may look more like this:

ch group
/*Animal X*/ 10010100101100 1
/*Animal X*/ 01010011010111 2

Re: Test Significance Between Estimates - survival between areas

Postby jlaake » Sun May 31, 2015 10:23 am

Yes, you can use the test based on normality, but you don't need the effective sample size for anything.

I'm not sure what you should do with animals moving between areas. I didn't realize the areas were so close together. In that case you really have a multi-state robust design with transitions between areas, but it doesn't sound like you would have enough data to support such a complex model.

--jeff

Re: Test Significance Between Estimates - survival between areas

Postby ctlamb » Sat Jul 04, 2015 1:33 am

Hi Jeff,

One question: if I have two study areas that do not share animals, so I can use the groups suggestion above, do I always have to append them, or could I do:

1001010000
0001010111

And then have both the begin times in 1990?

Curious why the non-overlap? Is it because a capture probability can then be fit separately for each group and session?

Also, for the Pradel robust design, as you mention, the initial zeros matter. In the case above, would I just set the parameters over the initial padded zeros for group 2 to p=0, Phi=0 and f=0, since we don't know any of these, seeing as the sampling hadn't started yet?

Thanks for all your help

Re: Test Significance Between Estimates - survival between areas

Postby jlaake » Mon Jul 06, 2015 11:19 am

ctlamb wrote:
One question: if I have two study areas that do not share animals, so I can use the groups suggestion above, do I always have to append them, or could I do:

1001010000
0001010111

And then have both the begin times in 1990?


Sure, if they have the same number of occasions and the same time intervals, they can be constructed as above with a group covariate to model any regional effects. They can have different begin times in that case as well, as long as you include a covariate for groups and specify multiple begin.times (one for each group). That would be the standard thing to do. The non-overlapping (appended) structure is for when the number of occasions (and the secondary/primary structure) or the time intervals differ across regions. I believe that was the reason for your question and my suggestion.
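
A minimal sketch of that standard setup, assuming a stacked data frame with identical occasions for both regions and a region factor (the object names and the formulas here are purely illustrative):

Code:
library(RMark)

# Same occasions and intervals for both regions, so no padding is needed
both$region <- factor(both$region)
both.proc <- process.data(both, model = "CJS", groups = "region",
                          begin.time = c(1990, 1990))   # one begin.time per group
both.ddl  <- make.design.data(both.proc)

# A region effect on survival gives a direct estimate (and SE) of the difference
mod <- mark(both.proc, both.ddl,
            model.parameters = list(Phi = list(formula = ~region),
                                    p   = list(formula = ~time)))
summary(mod)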

ctlamb wrote:
Also, for the Pradel robust design, as you mention, the initial zeros matter. In the case above, would I just set the parameters over the initial padded zeros for group 2 to p=0, Phi=0 and f=0, since we don't know any of these, seeing as the sampling hadn't started yet?


I honestly don't know, but it may be more complicated than that. I suggest that you simulate data and try what you are suggesting, so you know whether it works or not.

Re: Test Significance Between Estimates - survival between areas

Postby ctlamb » Sun Jul 12, 2015 3:24 pm

jlaake wrote:
Sure, if they have the same number of occasions and the same time intervals, they can be constructed as above with a group covariate to model any regional effects. They can have different begin times in that case as well, as long as you include a covariate for groups and specify multiple begin.times (one for each group). That would be the standard thing to do. The non-overlapping (appended) structure is for when the number of occasions (and the secondary/primary structure) or the time intervals differ across regions. I believe that was the reason for your question and my suggestion.



Thanks, Jeff. And yes, correct: I was initially looking for a solution for when the occasions didn't line up either. I suggested the idea above to make sure I understood how the structure worked.

I have simulated two sets of data with the following specifications:

GROUP1:
Initial Captures=25
Phi=0.85
f=0.15
p=0.5

GROUP2:
Initial Captures=25
Phi=0.65
f=0.15
p=0.5

The structure of these data is 5 primary sessions, each with 4 secondary sessions. Group 1 was sampled in 2007 and 2010-2013, and Group 2 in 2006 and 2010-2013.

I prefixed the ch for Group 1 with four 0's ("0000") to pad the ch where Group 2 was sampled in 2006, and I added four 0's ("0000") between the 2006 and 2010 captures for Group 2 to pad where Group 1 was sampled in 2007.

Overall the structure looks like this (1's show where I left the data as is, 0's show where I padded; the true ch's are not pure 1's):

Group 1: 000011111111111111111111
Group 2: 111100001111111111111111

Processing code looked as follows:
Code:
NEWDF.proc <- process.data(NEWDF, model = "RDPdfHuggins", groups = "Group",
                           begin.time = 2006,
                           time.intervals = c(0,0,0,1, 0,0,0,3, 0,0,0,1, 0,0,0,1, 0,0,0,1, 0,0,0))
NEWDF.ddl <- make.design.data(NEWDF.proc)   # design data used for the fixes below



I fixed the parameters and ran the mark model as follows:

Code:
### FIX PARAMETERS

## p and c: no detection possible in each group's padded (unsampled) primary session
NEWDF.ddl$p$fix=NA
NEWDF.ddl$p$fix[NEWDF.ddl$p$group==1 & NEWDF.ddl$p$time==2006]=0
NEWDF.ddl$p$fix[NEWDF.ddl$p$group==2 & NEWDF.ddl$p$time==2007]=0

NEWDF.ddl$c$fix=NA
NEWDF.ddl$c$fix[NEWDF.ddl$c$group==1 & NEWDF.ddl$c$time==2006]=0
NEWDF.ddl$c$fix[NEWDF.ddl$c$group==2 & NEWDF.ddl$c$time==2007]=0

## Phi: fixed over the padded intervals
NEWDF.ddl$Phi$fix=NA
NEWDF.ddl$Phi$fix[NEWDF.ddl$Phi$group==1 & NEWDF.ddl$Phi$time==2006]=1
NEWDF.ddl$Phi$fix[NEWDF.ddl$Phi$group==2 & NEWDF.ddl$Phi$time==2007]=1

## f: fixed over the padded intervals (see the question below about which values are appropriate)
NEWDF.ddl$f$fix=NA
NEWDF.ddl$f$fix[NEWDF.ddl$f$group==1 & NEWDF.ddl$f$time==2006]=1
NEWDF.ddl$f$fix[NEWDF.ddl$f$group==2 & NEWDF.ddl$f$time==2007]=0

## Model
NEWM1 <- mark(data=NEWDF.proc, ddl=NEWDF.ddl, model="RDPdfHuggins",
              model.parameters = list(p=list(formula=~1, share=TRUE),
                                      Phi=list(formula=~Group),
                                      f=list(formula=~Group)),
              output=FALSE, delete=TRUE)


Results are as follows:
GROUP1:
Phi=0.92
f=0.086
p=0.49

GROUP2:
Phi=0.71
f=0.086
p=0.49

So the estimates are pretty close to the expected values, and the model did quite well given the small samples (25 initial captures per group).


My main concern is whether I fixed Phi and f properly. Yes, I got the expected results, but if I do:
Code:
NEWDF.ddl$f$fix[NEWDF.ddl$f$group==1 & NEWDF.ddl$f$time==2006]=0


I get nearly identical results.

Or, if I do:

Code:
NEWDF.ddl$Phi$fix=NA
NEWDF.ddl$Phi$fix[NEWDF.ddl$Phi$group==1 & NEWDF.ddl$Phi$time==2006]=1
NEWDF.ddl$Phi$fix[NEWDF.ddl$Phi$group==2 & NEWDF.ddl$Phi$time==2007]=1

NEWDF.ddl$f$fix=NA
NEWDF.ddl$f$fix[NEWDF.ddl$f$group==1 & NEWDF.ddl$f$time==2006]=0
NEWDF.ddl$f$fix[NEWDF.ddl$f$group==2 & NEWDF.ddl$f$time==2007]=1


I get:

GROUP1:
Phi=0.93
f=0.088
p=0.45

GROUP2:
Phi=0.64
f=0.028
p=0.45

So survival is closer to the true value for Group 2, but f is off.


#######

I'm looking to understand how exactly the fixing of parameters is interpreted by the model. Phi must need to be fixed at 1 for Group 2 in 2007, since surviving from 2006 to 2010 means an animal was also alive in 2007. But the others are less clear to me. For Group 1's Phi in 2006, should it be 1 or 0? We first sampled individuals in 2007 in this area, so we really have no idea whether bears survived this period (2006-2007) or were new recruits.



Very sorry for the long message. I am hopeful this info will be used by others in the future though!
CL
