Detection probabilities in remotely sensed tagging data

I have run into some challenges modelling demographic parameters with an MSORD-SU model for a population that is tagged at irregular secondary intervals and re-detected continuously. Could someone recommend a way to model this type of detection heterogeneity? I am aware that models exist that separate the first detection probability from all subsequent detection probabilities (e.g. Huggins-type); however, those models assume a closed population, and the population I am dealing with is open. I am also interested in the structure of the population, which is composed of three states that cannot be identified at the initial tagging event, hence the MSORD-SU model.
The probability of first tagging depends on an effort covariate, the time spent tagging within each secondary period (most periods have an effort of 0), while re-detection probabilities are quite high because the data are remotely sensed.
I have considered using the probability of arrival (pent/Beta) parameter to try to separate these two processes (i.e. model p as constant and model pent as a function of tagging effort, with pent fixed to 0 on most occasions). My concern with this approach is that I would be fitting a biological parameter to study-design data, so the parameter would have no biological interpretation, and this would invalidate any of the derived parameters that the model estimates, including population size and residence time.
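To make the pent idea concrete, here is a minimal sketch of the structure I have in mind. It is plain Python, not MARK or RMark syntax. The function name, the slope `b1`, and the effort values are all made up for illustration, and I am assuming a multinomial-logit-style normalisation so that the entry probabilities sum to 1 across the secondary occasions.

```python
import math

def entry_probs(effort, b1):
    """Hypothetical pent values across secondary occasions.

    Occasions with zero tagging effort are fixed to pent = 0; the rest get a
    weight exp(b1 * effort), normalised so that all values sum to 1.
    """
    weights = [math.exp(b1 * e) if e > 0 else 0.0 for e in effort]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one occasion needs non-zero effort")
    return [w / total for w in weights]

# Made-up tagging effort (e.g. hours) per secondary occasion; mostly zero.
effort = [2.0, 0.0, 0.0, 1.0, 0.0, 3.0]
pent = entry_probs(effort, b1=0.5)
print([round(x, 3) for x in pent])
```

In this sketch "arrival" can only happen on occasions when tagging took place, which is exactly why I worry that pent would be describing the study design rather than the animals.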
Is this a valid concern and are there better ways of accounting for the effect of initial tagging on detection probability in this type of model?