Reference covariate for a set of categorical covariates

Posted: Mon Jul 01, 2024 11:00 am
by daniel.fisher91
I need help specifying which categorical covariate is used as a reference when estimating beta coefficients. I copied the output for [model]$beta$p for one of my models below. The categorical covariate used here is "season," which specifies whether a survey was conducted in summer, winter, or spring.

                 est        se
p1_B1.Int  -1.246784  0.319133
p1_B2       1.802345  0.420427
p3_B3       0.442573  0.423648

p1_B2 represents summer, p3_B3 represents winter, so I understand that "spring" (presumably p5_B4?) is being used as the reference to which summer and winter are compared. I can't figure out how to set either winter or summer as the reference covariate so I can get the beta estimates for the remainder of the pairwise comparisons. Does anyone know how to do this? Or maybe there's a way to apply a Tukey test to the data, or something similar?

I apologize if any of this is worded incorrectly--this is my first time using RPresence on my own dataset and trying to figure it out as I go! Happy to supply any other information needed to get an answer. Thank you!

Re: Reference covariate for a set of categorical covariates

Posted: Mon Jul 01, 2024 2:48 pm
by jhines
If you check the design matrix, [model]$desmat, it should show which beta parameter corresponds to which real parameter. If you used "spring", "summer" and "winter" as the categorical covariate values for "season", then R will order them alphabetically as factor levels (internally coded 1, 2 and 3). The first level, "spring", will be the intercept or reference value, and the other betas will be the differences between each of the other levels and "spring" on the logit scale.
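
As a rough illustration of that alphabetical ordering, here is a minimal base R sketch. The `season` vector is made-up toy data, and this shows only what base R's `factor()` and `model.matrix()` do; how RPresence builds [model]$desmat internally may differ.

```r
# Toy data: one season label per survey (illustrative values only)
season <- factor(c("summer", "winter", "spring", "summer", "spring", "winter"))

# Levels are sorted alphabetically; the first level acts as the reference
print(levels(season))

# Default treatment coding: an intercept column plus one indicator
# column for each non-reference level
X <- model.matrix(~ season)
print(colnames(X))
```

The level with no indicator column of its own is the one absorbed into the intercept.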

So, the real estimate of detection for spring would be the inverse-logit of -1.246784, or 0.2233. The estimate for summer would be the inverse-logit of -1.246784+1.802345, or 0.6354. The estimate for winter would be the inverse-logit of -1.246784+0.442573, or 0.3091.
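
The same back-transformation can be sketched in base R with `plogis()`, which is the inverse-logit. The beta values are copied from the output table in the first post (intercept -1.246784); the season labels follow the interpretation given in this reply and are an assumption about which beta belongs to which season.

```r
# Beta estimates copied from the [model]$beta$p output in the first post
b_int    <- -1.246784  # p1_B1.Int (intercept / reference level)
b_summer <-  1.802345  # p1_B2
b_winter <-  0.442573  # p3_B3

# plogis(x) is base R's inverse-logit: 1 / (1 + exp(-x))
p_hat <- c(
  spring = plogis(b_int),             # reference level: intercept only
  summer = plogis(b_int + b_summer),  # intercept + summer difference
  winter = plogis(b_int + b_winter)   # intercept + winter difference
)
print(round(p_hat, 4))
```

Note that the betas are added on the logit scale first and only the sum is back-transformed.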

To test if summer detection is significantly different from spring detection, simply check whether the confidence interval (estimate +/- 1.96*se) for summer includes zero, since p1_B2 is the estimated difference between spring and summer detection. (It doesn't.) To test if winter detection is significantly different from spring, check the confidence interval for winter. (It does include zero.) To test if summer is significantly different from winter, you could re-run the model, making summer the intercept, or use program CONTRAST (run online at https://www.mbr-pwrc.usgs.gov/software/contrast.shtml) with the beta estimates and variances as input.
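
A minimal sketch of the interval check in base R, using the estimates and standard errors from the posted output. The helper `wald_diff()` is hypothetical and is not a reimplementation of CONTRAST; it is a plain Wald-style interval for the difference between two betas. It needs the covariance between them, which is not shown in this thread, so `cov12` is left as an argument rather than filled in.

```r
# Beta estimates and standard errors copied from the posted output
est <- c(summer = 1.802345, winter = 0.442573)
se  <- c(summer = 0.420427, winter = 0.423648)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

# Wald intervals for each difference from the reference level
ci <- cbind(lower = est - z * se, upper = est + z * se)
excludes_zero <- ci[, "lower"] > 0 | ci[, "upper"] < 0
print(cbind(ci, excludes_zero))

# Hypothetical helper: Wald interval for b1 - b2.
# cov12 is the covariance between the two estimates and must come
# from the fitted model's variance-covariance matrix.
wald_diff <- function(b1, b2, se1, se2, cov12, level = 0.95) {
  zq   <- qnorm(1 - (1 - level) / 2)
  d    <- b1 - b2
  se_d <- sqrt(se1^2 + se2^2 - 2 * cov12)
  c(diff = d, se = se_d, lower = d - zq * se_d, upper = d + zq * se_d)
}
```

With the covariance in hand, a call would look like `wald_diff(est[["summer"]], est[["winter"]], se[["summer"]], se[["winter"]], cov12 = ...)`, where the `...` stands for the value taken from your own model.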

To make summer the intercept, you would need to change the "season" covariate ordering. This can be done with...

survcov$season2 = factor(as.character(survcov$season), levels=c('summer','spring','winter'))

Then, run the model with detection as a function of "season2". The resulting betas will have summer as the intercept, and you would just look at the confidence interval for winter, since that beta would be the difference between winter and summer (the intercept).
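
To check that the re-leveling took effect before fitting anything, a small base R sketch like the following may help. The `survcov` data frame here is toy data standing in for the real one, and `season3` is a hypothetical extra column showing `relevel()` as an alternative. This only confirms what base R does with the factor; whether RPresence honors the level order when it builds its design matrix is a separate question.

```r
# Toy stand-in for the real survey covariate table
survcov <- data.frame(season = c("spring", "summer", "winter", "summer"))

# Explicit level order: the first level listed becomes the reference
survcov$season2 <- factor(as.character(survcov$season),
                          levels = c("summer", "spring", "winter"))
print(levels(survcov$season2))

# Base R design matrix now has indicator columns for spring and winter only
print(colnames(model.matrix(~ season2, data = survcov)))

# Alternative: relevel() moves one level to the front, keeping the rest in order
survcov$season3 <- relevel(factor(survcov$season), ref = "summer")
print(levels(survcov$season3))
```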

Re: Reference covariate for a set of categorical covariates

Posted: Fri Jul 05, 2024 12:26 pm
by daniel.fisher91
Thank you very much, this was helpful. I am still trying to get it to work--the beta and intercept values remain the same, no matter the order I put the factors "summer," "winter," and "spring" in. It seems like it's still using spring as the intercept/reference, even though everything else seems to be set up right.

I'll keep at it, thank you for the advice!

Re: Reference covariate for a set of categorical covariates

Posted: Sat Jul 06, 2024 3:53 pm
by jhines
Hi Daniel,

I've found that my advice on re-ordering the variables does not work in this case since RPresence simplifies the design matrix. However, since you have a single-season model, the order of the surveys doesn't matter.
You can re-order the columns in the detection data so that Spring (or Winter) is first and that becomes the intercept. According to your survey covariate data, Summer is first, followed by Winter, then Spring. This means Summer is the intercept, and the other beta estimates are the differences between Summer and Winter or Spring.

To make Winter the intercept, simply re-order the columns such that Winter is first. For example, this will create the pao object with the columns re-ordered:

bridgedata <- createPao(data = bridgepa[,c(3,4,5,6,1,2)], unitcov = unitcov, survcov = survcov)

You could also re-order the season covariate (SEA) with:

survcov$SEA <- factor(rep(c("Winter","Spring","Summer"),each=2*42))
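
As a sanity check that the column re-ordering and the new SEA covariate line up, here is a small base R sketch. It assumes 42 sites and two surveys per season (inferred from `each=2*42`), the original column order Summer, Winter, Spring, and a survey covariate stacked survey by survey (all sites for survey 1, then all sites for survey 2, and so on). None of these names come from RPresence itself.

```r
n_sites <- 42  # assumed from each = 2*42 in the line above

# Season of each detection-history column in the original order (assumed)
season_by_col <- rep(c("Summer", "Winter", "Spring"), each = 2)

# Same column permutation as in the createPao() call above
new_order <- c(3, 4, 5, 6, 1, 2)
season_reordered <- season_by_col[new_order]
print(season_reordered)

# Survey covariate implied by the re-ordered columns, stacked by survey
sea_from_order <- rep(season_reordered, each = n_sites)

# Survey covariate as written in the post
sea_from_post <- rep(c("Winter", "Spring", "Summer"), each = 2 * n_sites)

# TRUE when the two constructions agree element by element
print(identical(sea_from_order, sea_from_post))
```

If the two vectors did not match, the detection columns and the season covariate would be out of step and the betas would be mislabeled.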