Linux version of MARK: deviance issues and slow run


Post by C. Le Coeur » Tue Apr 08, 2014 9:23 am

Dear all,

I want to raise 2 modelling problems I have encountered with LinMark (old and new versions).

1) Deviance issues between Windows and Linux platforms
I am running Huggins' robust design models to estimate survival probabilities in a squirrel population. Using a previous version of LinMark (“Old LinMark”, Sept 2012), I ran both random and Markovian temporary emigration robust design models (see details and deviance scores below):

S(~time*sex*birth_seas*age) Gamma(~random season) p(.) c(session)
S(~time*sex*birth_seas*age) Gamma(~markovian season) p(.) c(session)

Since the Markovian model is the more heavily parameterized of the two, it should fit at least as well, so a lower deviance is expected. Surprisingly, the Markovian model’s deviance is higher (deviance=6409.324) than that of the random model (deviance=6355.731).

On Windows (same versions of R and the RMark package), the deviance of the Markovian model changed to 6340.959.
I then ran the same models with the latest version of the Linux-compiled mark.exe (“New LinMark”, Dec 25). This solved the deviance problem: i) the deviances were identical on Windows and Linux (deviance of the Markovian model=6340.959) and ii) the Markovian emigration models had lower deviances than the random ones.

Model: S(~time*sex*birth_seas*age) Gamma''(~randseason) p(.) c(session)
Platform      AICc       deviance   npar
Windows       8392.73    6355.731   266
Old LinMark   8392.73    6355.731   266
New LinMark   8392.73    6355.731   266

Model: S(~time*sex*birth_seas*age) Gamma''(~markovian season) p(.) c(session)
Platform      AICc       deviance   npar
Windows       8386.007   6340.959   269
Old LinMark   8454.372   6409.324   269
New LinMark   8386.007   6340.959   269
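
The comparison in the table can be checked mechanically. The sketch below is in Python, with the numbers copied from the table. It assumes, as the post does, that the random emigration model is a constrained version of the Markovian one, so the Markovian deviance should not be larger on any platform. The function and variable names are illustrative only and are not part of MARK or RMark.

```python
# Values copied from the table in the post: (AICc, deviance, npar) per platform.
RANDOM_TE = {
    "Windows": (8392.73, 6355.731, 266),
    "Old LinMark": (8392.73, 6355.731, 266),
    "New LinMark": (8392.73, 6355.731, 266),
}
MARKOVIAN_TE = {
    "Windows": (8386.007, 6340.959, 269),
    "Old LinMark": (8454.372, 6409.324, 269),
    "New LinMark": (8386.007, 6340.959, 269),
}


def compare_nested(constrained, general):
    """Compare a constrained model with a more general one, platform by platform."""
    report = {}
    for platform, (aicc_c, dev_c, k_c) in constrained.items():
        aicc_g, dev_g, k_g = general[platform]
        report[platform] = {
            "deviance_drop": dev_c - dev_g,    # should be >= 0 if the models are nested
            "extra_parameters": k_g - k_c,
            "delta_aicc": aicc_c - aicc_g,     # > 0 favours the general model
            "plausible": dev_g <= dev_c,
        }
    return report


report = compare_nested(RANDOM_TE, MARKOVIAN_TE)
for platform, row in report.items():
    flag = "ok" if row["plausible"] else "SUSPECT"
    print(f"{platform:12s} deviance drop {row['deviance_drop']:8.3f} "
          f"for {row['extra_parameters']} extra parameters -> {flag}")
```

Only the Old LinMark row is flagged: there the more general model has the higher deviance, which should not happen if the optimizer reached the maximum of the likelihood for both models.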


Why did the older version of LinMark report a wrong deviance?


2) The latest LinMark version on a cluster: slow run
Because of this error, I restarted the same model selection from my starting model, which is heavily parameterized. But the run time was huge (a week or more, and some models never finished)!

The problem occurred with the latest version of LinMark (version of Dec 25). However, a recent post by Evan Cooch indicates that an updated version posted on December 28 could solve the problem. I can’t find this version (Linux version) on the website.

To save time, I’m running LinMark on a computer cluster. My jobs were killed because the new mark.exe takes cpus-1 threads to run. Setting “threads=1” solves that problem (see also viewtopic.php?f=2&t=2692 and viewtopic.php?f=2&t=2743).

But the run time then exceeds 46 hours!

Here is an example: same model and dataset, different Linux versions of MARK
Old version:
-2logL(saturated) = 1421.1968
Effective Sample Size = 1962
Number of function evaluations was 1391 for 793 parameters.
Time for numerical optimization was 6315.33 seconds.

New version:
-2logL(saturated) = 1421.1968
Effective Sample Size = 1962
Number of function evaluations was 291 for 793 parameters.
Time for numerical optimization was 148482.50 seconds
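
To put the two log excerpts side by side, the cost of a single function evaluation can be derived from the reported totals. The following is a small Python sketch using only the figures quoted above; the dictionary layout and names are illustrative.

```python
# Figures copied from the two MARK output excerpts above.
runs = {
    "old": {"evaluations": 1391, "seconds": 6315.33},
    "new": {"evaluations": 291, "seconds": 148482.50},
}


def seconds_per_evaluation(run):
    """Average wall time spent on one likelihood function evaluation."""
    return run["seconds"] / run["evaluations"]


per_eval = {name: seconds_per_evaluation(run) for name, run in runs.items()}
hours = {name: run["seconds"] / 3600.0 for name, run in runs.items()}
slowdown = per_eval["new"] / per_eval["old"]

for name in runs:
    print(f"{name}: {hours[name]:.2f} h total, {per_eval[name]:.2f} s per evaluation")
print(f"per-evaluation slowdown of the new version: about {slowdown:.0f}x")
```

The new version needs fewer evaluations, so the extra time comes from each evaluation being far slower (roughly 510 s against 4.5 s), not from the optimizer taking more steps.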

How can I manage this problem?

Finally, I am well aware that my model selection starts from a heavily parameterized model. From a conceptual point of view, what is your opinion on starting from a heavily parameterized model (which makes more biological sense) versus a less parameterized one (which makes more mathematical sense?), in terms of maximum likelihood computation and the efficiency of model selection?

Thank you for your help!
C. Le Coeur
C. Le Coeur
 
Posts: 13
Joined: Mon Feb 13, 2012 9:36 am

Re: Linux version of MARK: deviance issues and slow run

Post by C. Le Coeur » Wed May 21, 2014 4:28 am

Dear all,
Sorry to post again, but I’m still stuck with my models. They are taking a huge amount of time. How can I manage this?
Has a new version been released?
Thank you for your help,
C. Le Coeur

Re: Linux version of MARK: deviance issues and slow run

Post by jlaake » Wed May 21, 2014 8:43 am

I don't know why the Linux version is slow, but you might try fitting simpler models first, then using those as initial values for the more complex models, and build up from there. You use initial=mod, where mod is the name of a previously fitted model.

--jeff
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: Linux version of MARK: deviance issues and slow run

Post by cooch » Wed May 21, 2014 8:47 am

C. Le Coeur wrote: Dear all

I want to bring up 2 modelling problems using LinMark (old and new versions)

1) Deviance issues between Windows and Linux platforms
I am running Huggins' robust design models to estimate survival probabilities in a squirrel population. Using a previous version of LinMark (“Old LinMark”, Sept 2012), I ran both random and Markovian temporary emigration robust design models (see details and deviance scores below):

S(~time*sex*birth_seas*age) Gamma(~random season) p(.) c(session)
S(~time*sex*birth_seas*age) Gamma(~markovian season) p(.) c(session)

As the Markovian model (the more parameterized model) should fit better, a lower deviance score is expected. Surprisingly, the Markovian model’s deviance value is higher (deviance=6409.324) than the random one (deviance=6355.731).


This assumes you built the models correctly in the first place.

C. Le Coeur wrote: The problem occurred with the latest version of LinMark (version of Dec 25). However, a recent post by Evan Cooch indicates that an updated version posted on December 28 could solve the problem. But I can’t find this version on the website (Linux version).


It's there now.

C. Le Coeur wrote: To save time, I’m using LinMark on a computer cluster. My running jobs were killed since the new mark.exe takes cpus-1 to do so. By setting “threads=1”, the problem is solved (see also viewtopic.php?f=2&t=2692 and viewtopic.php?f=2&t=2743).



Throwing things on a cluster server does not guarantee that MARK (or anything else) will run faster. Many people assume that throughput is a linear function of the number of cores/threads. This is not true: it is a function of how the threaded code is written, the architecture of the scheduler, and so forth. There is no guarantee that anything written for 'parallel execution' will actually run faster.

MARK is multi-threaded for some of what it does. But there are circumstances in which you can have too many threads. In my experience, MARK runs fastest for most jobs that make use of the multi-threaded bits when you use 6-8 cores/threads. Performance gains (relative to a single thread) seem to plateau at that point, and I think I've convinced myself that going beyond 8 cores can actually degrade performance. So, if your cluster allows an individual user access to (say) 32 cores, then setting MARK to use all of them is potentially a bad idea. Try having MARK use 4, then 6, then 8 for some benchmark job, and see how it does.
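
The "try 4, then 6, then 8" advice amounts to a small timing loop. Here is a minimal sketch in Python, under the assumption that you can wrap one benchmark job in a function taking the thread count. `run_job` and `fake_job` are hypothetical placeholders: the stand-in just sleeps, and a real `run_job` would launch your MARK job with that many threads in whatever way your setup allows.

```python
import time


def benchmark(run_job, thread_counts, repeats=3):
    """Time run_job(threads) for each thread count; keep the best of `repeats`."""
    results = {}
    for threads in thread_counts:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_job(threads)
            timings.append(time.perf_counter() - start)
        results[threads] = min(timings)
    return results


def fastest(results):
    """Return the thread count with the shortest measured time."""
    return min(results, key=results.get)


def fake_job(threads):
    # Hypothetical stand-in workload: a fixed serial part, a part that shrinks
    # with more threads, and a small per-thread overhead. Replace with a real job.
    time.sleep(0.01 + 0.04 / threads + 0.002 * threads)


demo = benchmark(fake_job, thread_counts=(1, 4, 6, 8), repeats=1)
for threads, seconds in demo.items():
    print(f"{threads} thread(s): {seconds:.3f} s")
print("fastest setting in this run:", fastest(demo))
```

Taking the minimum over a few repeats reduces noise from other users on a shared cluster node. Use a benchmark job that is short enough to repeat but long enough that start-up time does not dominate.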

You can find out more about the architecture of your system (assuming it's Linux; it might even work on Macs) by running the command lscpu. Here is some sample output:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Stepping:              3
CPU MHz:               800.000
BogoMIPS:              5799.89
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-3


So this machine has 4 cores, each with a single thread, meaning that MARK could use at most 4 threads.
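
The same reading of the lscpu output can be done programmatically. The sketch below, in Python, parses a few lines of the sample output above and multiplies sockets, cores per socket and threads per core. The function names are illustrative, and the parsing assumes the simple "Key: value" layout shown in the sample.

```python
# A few lines copied from the sample lscpu output above.
SAMPLE = """\
Architecture:          x86_64
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
"""


def parse_lscpu(text):
    """Turn 'Key:   value' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def logical_cpus(fields):
    """Sockets x cores per socket x threads per core."""
    return (int(fields["Socket(s)"])
            * int(fields["Core(s) per socket"])
            * int(fields["Thread(s) per core"]))


fields = parse_lscpu(SAMPLE)
print("hardware threads available:", logical_cpus(fields))
```

On a live Linux system the text could come from running lscpu through `subprocess.run(["lscpu"], capture_output=True, text=True).stdout` instead of the embedded sample.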
cooch
 
Posts: 1628
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Re: Linux version of MARK: deviance issues and slow run

Post by cooch » Sat May 24, 2014 11:45 am

Just to make one final comment on the 'threads' and 'execution time' argument I made before, here is a simple plot of run time versus number of threads (1 -> 4) for a big, ugly open robust design analysis, using 64-bit MARK under GNU/Linux. Note that the vertical axis is in minutes, so even using all threads, the job still takes ~24 minutes to complete.

[Image: plot of run time (minutes) versus number of threads (1 to 4)]

The obvious point here: the relationship between the number of threads and execution time is not linear.

But what you don't see here is an asymptote, probably because with only 4 threads on this machine, I'm not running into one.
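
One textbook way to see why run time is not a linear function of thread count is Amdahl's law, where only a fraction of the work can be spread over threads. The Python sketch below is a generic illustration and not a measurement of MARK: the parallel fraction of 0.8 is an arbitrary assumed value, and the model ignores threading overhead, so it shows the plateau but not any slowdown from using too many threads.

```python
def amdahl_speedup(threads, parallel_fraction):
    """Speedup over one thread when only `parallel_fraction` of the work scales."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / threads)


ASSUMED_PARALLEL_FRACTION = 0.8  # hypothetical value, for illustration only

for threads in (1, 2, 4, 8, 16, 32):
    speedup = amdahl_speedup(threads, ASSUMED_PARALLEL_FRACTION)
    print(f"{threads:2d} threads: speedup {speedup:.2f}x")
```

With this assumed fraction the speedup can never exceed 5x however many threads are added, which is the kind of asymptote that a 4-thread machine is too small to reveal.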

Re: Linux version of MARK: deviance issues and slow run

Post by C. Le Coeur » Wed May 28, 2014 8:58 am

Thank you both for your detailed replies and the relevant information about threads.
My models run well and the jobs were completed faster using the latest version of LinMark!

cooch wrote: This assumes you built the models correctly in the first place.
The code and dataset were the same on the 2 platforms (Windows and Linux).

Thanks
C. Le Coeur