Input file problem with large datasets

PostPosted: Mon Oct 27, 2014 5:29 am
by tessjunec
Hi,

I've been doing multiple-season analyses (a simple psi, gamma(), eps(), p() model) with numerous datasets successfully. I've now tried to use the same model for larger datasets with more than 200 columns. I cannot copy-paste directly into the program, as I get the error "clipboard can only support xxx characters", so I've tried two methods: breaking the data up and copy-pasting the pieces into the input sheet separately, and importing the file as a csv. With both, I get the same error when I try to run analyses: that there is an input file problem. I see this question has been asked before, but none of the answers have helped me. I've checked and re-checked my input files; there are no problems, no missing values, no extra spaces, no overly large numbers, etc. Any help would be greatly appreciated!

Tessa

Re: Input file problem with large datasets

PostPosted: Mon Oct 27, 2014 8:57 am
by jhines
Hi Tessa,

Either of the two methods you tried should work. Since they didn't, there must be something different about your input than what I expect. Would you mind sending me a copy of your csv file, and pao file?

Thanks,

Jim

Re: Input file problem with large datasets

PostPosted: Thu Oct 30, 2014 8:28 am
by tessjunec
Hi Jim,

Thanks so much for your response. Where should I send the files?

Regards,
Tessa

Re: Input file problem with large datasets

PostPosted: Thu Oct 30, 2014 8:36 am
by jhines
Hi Tessa,

You can send them to jhines@usgs.gov

Cheers,

Jim

Re: Input file problem with large datasets

PostPosted: Fri Oct 31, 2014 9:22 am
by jhines
Hi Tessa,

Your input pao file is OK. The problem you are having is due to the large number of secondary
surveys in the data. When I wrote the multi-season model, I assumed that > 200 secondary
surveys per season must be an input error. PRESENCE prints the message in the output file and
stops.

Thinking about it now, I don't see a need for this restriction, so I removed the limitation.
However, the reason that it seemed unreasonable for there to be so many surveys per season is
that it seems unlikely that occupancy doesn't change over such a long period. What do the 233
surveys per season represent in your data? Days? Are you sure occupancy doesn't change over
this period? You might want to define your 'seasons' differently if that's not the case.

For many people, detection probabilities are nuisance parameters, and they are only interested
in occupancy, colonization and extinction. So, you could combine days into weeks, reducing the
number of surveys per season and get the same occupancy, colonization and extinction estimates, but
with weekly detection probabilities instead of daily probabilities. This would also reduce the
number of zeros in the data and make building models simpler. What you would lose by this is
the ability to estimate daily detection probabilities, or to model daily detection probabilities
as a function of some daily covariate.
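In case it helps, the day-to-week collapse could be scripted before building the pao file. This is just a minimal sketch, not part of PRESENCE: the `collapse_to_weeks` helper and the use of `None` for a missing survey are my own conventions for illustration.

```python
def collapse_to_weeks(daily, width=7):
    """Collapse a daily detection history into weekly occasions.

    daily: list of 0/1 detections, with None for a missing survey.
    A week is scored 1 if any surveyed day had a detection, 0 if at
    least one day was surveyed with no detection, and missing (None)
    if no day in the week was surveyed.
    """
    weekly = []
    for start in range(0, len(daily), width):
        observed = [d for d in daily[start:start + width] if d is not None]
        if not observed:
            weekly.append(None)          # no surveys that week -> missing
        else:
            weekly.append(max(observed)) # 1 if any detection, else 0
    return weekly

# Example: 14 daily surveys become 2 weekly occasions
history = [0, 0, 1, None, 0, 0, 0,
           None, None, None, None, None, None, None]
print(collapse_to_weeks(history))  # [1, None]
```

Run per site, this shrinks 233 daily columns per season to 34 weekly ones while preserving the occupancy, colonization and extinction estimates, at the cost of the daily-covariate modelling described above.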

If you download the latest version of PRESENCE, your pao file should work. Unfortunately, the
Patuxent webserver is having technical difficulties at the moment. I put a copy of PRESENCE at
the following link until the webserver is back online:

ftp://ftpext.usgs.gov/pub/er/md/laurel/ ... esence.exe

One other thing: if you are going to read a csv file into the input data form, the csv file
must contain a header line (e.g., site,1-1,1-2,1-3,...1-233,2-1,2-2,...2-233).
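With 200+ columns that header is tedious to type, so it could be generated in a couple of lines. A sketch, assuming the season-survey naming pattern above and Tessa's counts of 2 seasons and 233 surveys per season:

```python
# Build the header line: site,1-1,...,1-233,2-1,...,2-233
seasons, surveys = 2, 233
header = "site," + ",".join(
    f"{season}-{survey}"
    for season in range(1, seasons + 1)
    for survey in range(1, surveys + 1)
)
print(header[:30])  # header begins "site,1-1,1-2,1-3,..."
```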

Cheers,

Jim

Re: Input file problem with large datasets

PostPosted: Mon Nov 03, 2014 7:37 am
by tessjunec
Hi Jim,

Thank you so much, this is very helpful! I do understand what you mean about large numbers of surveys becoming redundant. I am using citizen-science bird survey data from two timeframes, and am modelling different species, which is why there are so many surveys for some of them and why I need a standard modelling method. The species I am working with are highly cryptic, so detection probabilities are important for explaining low or missing counts.

Again, thank you very much. I'll try again with this version and change my csv files.

Regards,
Tessa