Mark in parallel under windows on multiple cores


Re: Mark in parallel under windows on multiple cores

Postby Eldar » Wed Apr 13, 2011 12:58 pm

It took more time than I thought, but the function finally seems to be working. It works in the same way as mark.wrapper(), but does it through a socket cluster. It can easily be extended to work with several computers on an intranet through the functionality of the snow/snowfall packages; the available protocols are Socket, MPI, PVM and NetWorkSpaces.

require(snowfall)

mark.wrapper.par()
Usage:
mark.wrapper.par(wd=getwd(), model.list, data, ddl, options=NULL, invisible=T, cpus=2, parallel=T, windows7patch=FALSE, MarkPath = "c:/Program Files/Mark/")
Arguments:
wd - working directory; if empty, the current R working directory is used
model.list, ddl, data, invisible - see ?mark and ?mark.wrapper
options - I used it for "SIMANNEAL"
cpus - number of threads to run in parallel
parallel - set to FALSE to run the model set in sequential mode (for testing)
windows7patch - if you have win7 and the R console freezes immediately after the start, press Esc and switch this option to TRUE
MarkPath - directory containing mark.exe, e.g. "c:/Program Files/Mark/"
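A minimal usage sketch with the dipper example data that ships with RMark (the model specifications and cpus value here are only for illustration; adjust MarkPath for your machine):

require(RMark)
data(dipper)
dipper.proc <- process.data(dipper, model = "CJS")
dipper.ddl <- make.design.data(dipper.proc)
Phi.dot <- list(formula = ~1)      # constant survival
Phi.time <- list(formula = ~time)  # time-dependent survival
p.dot <- list(formula = ~1)        # constant recapture
cml <- create.model.list("CJS")    # collects the Phi. and p. objects above
dipper.results <- mark.wrapper.par(model.list = cml, data = dipper.proc,
                                   ddl = dipper.ddl, cpus = 2)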

Details:
1. windows7patch=FALSE - I tried the function with Windows XP and win7 64-bit. On one of the win7 machines the snow package was not working. It took me some time to identify and patch the problem. The solution I found is to replace the original newSOCKnode() function in the snow package with my updated version (newSOCKnode1(), below). If you change windows7patch to TRUE, the function will do this for you.
2. On i5/i7 machines under win7, hyperthreading is switched ON by default, so you'll see 8 processors, but each of them will run at only half the speed of a physical core. It is easy to turn it OFF and switch to 4 processors at full speed. I don't know which is faster; some experiments with different models are needed.
3. The function starts the parallel processes consecutively, with a waiting period of 30 seconds between them. I needed to add this because otherwise mark.exe will mix data between runs. So if your modelling is fast and one model run takes less than 30 seconds, it will wait anyway. It is possible to change this in the future, but some changes (new locks) in the original run.mark.model() function will be needed.
4. Sometimes the original mark.wrapper() function, when run under mark.wrapper.par(), does not rename the markxxx.vcv file. I don't know why this happens (with a manual run it works), but you can rename the file by hand if needed, as sketched below.
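If you hit item 4, the rename can also be done from R; a sketch (the file names here are hypothetical - check the model's output directory for the actual ones MARK wrote):

list.files(pattern = "\\.vcv$")         # see which .vcv files are there
file.rename("mark.vcv", "mark001.vcv")  # hypothetical old/new names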

You'll need to copy ALL of the lines below and paste them into R.
I would like to attach an R script, but it seems that is not possible here.

##########################  the function  ##########################
mark.wrapper.par <- function(wd = getwd(), model.list, data, ddl, options = NULL,
                             invisible = T, cpus = 2, parallel = T,
                             windows7patch = FALSE, MarkPath = "c:/Program Files/Mark/") {
    cat(" mark.wrapper.par, ver.0.2 from Apr 12, 2011\n mailto: eldar.rakhimberdiev at cornell.edu\n")
    ## The problem I met was that on some machines under win7 the system() function does not work.
    ## You'll see the following behaviour: a) the R console freezes, or b) an error: "can not connect to...".
    ## In that case switch windows7patch to TRUE.
    ## Future development:
    ## 1. add more functionality to the lock - we don't need to wait the full 30 seconds if model execution finished earlier.
    ## 2. we may want to save intermediate results.
    ## 3. add additional locks into mark.wrapper - to keep several processes from extracting MARK output at the same moment.
    ## 4. it is possible to write a forward model-selection program that adds new models to cml using the results of previous runs.
    cml <- model.list
    require(snowfall)
    require(RMark)
    if (windows7patch) {
        ## replace the original newSOCKnode() in snow's environment with the
        ## patched newSOCKnode1() defined below
        envir <- environment(newSOCKnode)
        cat("loading patch\n")
        eval(parse(text = ("newSOCKnode=newSOCKnode1")), envir)
    }
    ######################################################################
    ## the sister function: runs on a worker and fits one row of cml
    Mark.wrapper.Par.sister.sf <- function(datastep) {
        ## stagger the first batch of workers by 2 seconds each
        if ((datastep / cpus) <= 1) Sys.sleep(2 * (datastep - 1))
        setwd(wd)
        require(RMark)
        cml2run <- cml[datastep, ]
        z <- 1
        ## simple file lock: only one worker at a time may launch mark.exe;
        ## a lock.tmp older than 30 seconds is considered stale and taken over
        repeat {
            if (z > 100) stop("for model ", datastep, " the lock waited too long...\n")
            lock <- list.files(path = wd, pattern = "lock.tmp")
            if (length(lock) == 0) {
                write(cpus, file = paste(wd, "/lock.tmp", sep = ""))
                break
            } else {
                if (difftime(Sys.time(), file.info(paste(wd, "/lock.tmp", sep = ""))$mtime, units = "secs") > 30) {
                    write(paste(datastep, cpus), file = paste(wd, "/lock.tmp", sep = ""))
                    break
                } else {
                    cat("waiting in the lock queue for ",
                        30 - difftime(Sys.time(), file.info(paste(wd, "/lock.tmp", sep = ""))$mtime, units = "secs"),
                        " seconds \n")
                    Sys.sleep(30)
                    z <- z + 1
                }
            }
        }
        ## each model runs in its own time-stamped subdirectory so that
        ## output files from parallel mark.exe runs cannot collide
        t <- as.POSIXlt(Sys.time())
        working.dir <- paste("model", datastep, "attempt", 1900 + t$year, t$mon, t$mday, t$hour, t$min, sep = "-")
        dir.create(working.dir)
        setwd(working.dir)
        res <- mark.wrapper(cml2run, data = data, ddl = ddl, options = options, invisible = invisible)
        return(res)
    }
    ## here the main function starts
    sfInit(parallel = parallel, cpus = cpus)
    on.exit(sfStop())
    cat("sfInit passed\n")
    ## collect the model specification objects named in cml from the calling
    ## frame so they can be exported to the workers
    lx <- ls(envir = parent.frame())
    models <- lx[lx %in% unlist(cml)]
    sfExport(list = c(models, "cml", "data", "ddl", "options", "invisible", "wd",
                      "Mark.wrapper.Par.sister.sf", "cpus", "MarkPath"), local = T)
    ## load-balanced dispatch: a free worker picks up the next model
    res <- sfClusterApplyLB(1:nrow(cml), Mark.wrapper.Par.sister.sf)
    ## remove a leftover lock file, if any
    lock <- list.files(path = wd, pattern = "lock.tmp")
    if (length(lock) != 0) unlink(lock)
    ## merge the per-model results into a single marklist
    modellist <- res[[1]]
    if (length(res) > 1) {
        for (i in 2:length(res)) {
            modellist <- merge.mark(modellist, res[[i]])
        }
    }
    return(modellist)
}

## newSOCKnode1() is NOT the original version of the function;
## I used the snow 0.3-3 prototype as a base.
## See below for the place where the change was made!
newSOCKnode1 <- function(machine = "localhost", ..., options = defaultClusterOptions)
{
options <- addClusterOptions(options, list(...))
if (is.list(machine)) {
options <- addClusterOptions(options, machine)
machine <- machine$host
}
outfile <- getClusterOption("outfile", options)
if (machine == "localhost")
master <- "localhost"
else master <- getClusterOption("master", options)
port <- getClusterOption("port", options)
manual <- getClusterOption("manual", options)
homogeneous <- getClusterOption("homogeneous", options)
if (getClusterOption("useRscript", options)) {
if (homogeneous) {
rscript <- getClusterOption("rscript", options)
snowlib <- getClusterOption("snowlib", options)
script <- file.path(snowlib, "snow", "RSOCKnode.R")
env <- paste("MASTER=", master, " PORT=", port, " OUT=",
outfile, " SNOWLIB=", snowlib, sep = "")
cmd <- paste(rscript, script, env)
}
else {
script <- "RunSnowWorker RSOCKnode.R"
env <- paste("MASTER=", master, " PORT=", port, " OUT=",
outfile, sep = "")
cmd <- paste(script, env)
}
}
else {
if (homogeneous) {
scriptdir <- getClusterOption("scriptdir", options)
script <- file.path(scriptdir, "RSOCKnode.sh")
rlibs <- paste(getClusterOption("rlibs", options),
collapse = ":")
rprog <- getClusterOption("rprog", options)
env <- paste("MASTER=", master, " PORT=", port, " OUT=",
outfile, " RPROG=", rprog, " R_LIBS=", rlibs,
sep = "")
}
else {
script <- "RunSnowNode RSOCKnode.sh"
env <- paste("MASTER=", master, " PORT=", port, " OUT=",
outfile, sep = "")
}
cmd <- paste("env", env, script)
}
if (manual) {
cat("Manually start worker on", machine, "with\n ",
cmd, "\n")
flush.console()
}
else {
if (machine != "localhost") {
rshcmd <- getClusterOption("rshcmd", options)
user <- getClusterOption("user", options)
cmd <- paste(rshcmd, "-l", user, machine, cmd)
}
if (.Platform$OS.type == "windows") {
            ######################################################################
            ## here is the place where I made the change! - April 8 2010, eldar.rakhimberdiev at cornell.edu
            ## on windows, launch the worker with shell("start /B ...") instead of
            ## system(..., wait = FALSE), which freezes on some win7 machines
            cmd <- paste("start /B", cmd, sep = " ")
            shell(cmd)
            # system(cmd, wait = FALSE, input = "")
            ######################################################################
}
else system(cmd, wait = FALSE)
}
timeout <- getClusterOption("timeout")
old <- options(timeout = timeout)
on.exit(options(old))
con <- socketConnection(port = port, server = TRUE, blocking = TRUE,
open = "a+b")
structure(list(con = con, host = machine), class = "SOCKnode")
}

Re: Mark in parallel under windows on multiple cores

Postby pmm » Fri Apr 15, 2011 11:47 am

My understanding is that snow (simple network of workstations) is designed for multiple computers, while foreach and mclapply (multicore lapply) are for multicore. I found a great presentation you might enjoy: http://www.slideshare.net/bytemining/ta ... -group-727
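For illustration, the two idioms contrasted here look roughly like this (a sketch using the snow and multicore packages of that era; the squaring function is just placeholder work):

library(snow)
cl <- makeSOCKcluster(rep("localhost", 4))   # 4 local socket workers
res <- parLapply(cl, 1:8, function(i) i^2)   # placeholder work
stopCluster(cl)
# multicore's equivalent, which only works on Linux/Mac (the point above):
# library(multicore)
# res <- mclapply(1:8, function(i) i^2)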

Re: Mark in parallel under windows on multiple cores

Postby Eldar » Fri Apr 15, 2011 2:49 pm

The presentation is great, thanks a lot!
You are right that snow is more or less developed for clusters (though using it for local parallelization is a common task too). The multicore package does not work under win7 64-bit, so it is not an option. foreach tries to use multicore, snow, or some packages that are not freely distributed. So I decided to use snowfall, as it is simple, free, and it works - I am really happy with the function's speed now. It IS 8 times faster than before.
I am also thinking about working with local clusters - socket or MPI - parallelization across two i7 machines would give x16 (2x8) speed, so a big set of models could be run in a week or so.
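A sketch of what such a two-machine socket cluster could look like with snowfall (the host names are made up; both machines would need R, snow/snowfall and MARK installed, plus remote-shell access set up):

require(snowfall)
sfInit(parallel = TRUE, cpus = 16, type = "SOCK",
       socketHosts = rep(c("i7-box-1", "i7-box-2"), each = 8))
sfCpus()   # should report 16 workers
sfStop()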
PS As Evan's forum strips the spaces out of code, the function has become really hard to read. I can send it by email as an R script if needed.

~ Eldar

Re: Mark in parallel under windows on multiple cores

Postby cooch » Thu May 26, 2011 7:41 pm

Quick update/correction to the following (wrt running a series of saved models):

cooch wrote: Actually, there is a relatively simple way to do this. Imagine a model set with multiple models. You build each model and save (but don't run) the model structure. For each model this generates a .tmp file, which is a simple ASCII file containing the control language for that model - which, ultimately, is what MARK interprets when you submit/run the model. So, I tried a little experiment:

1\ generated a candidate set of 8 approximating models for the dipper data (what else?)

2\ saved each model, then renamed each of the .tmp files to model1.txt, model2.txt, etc. So, 8 in total.

<snip>



When I posted this, I forgot that you can in fact save the ASCII file containing the control language directly from the browser (avoiding the need to play with .tmp files). All you need to do is (i) create the model structure, (ii) save it without running it, then (the step I forgot) (iii) right-click and 'open in notepad'. This will bring up the command file in Notepad (or whatever you've set as the default text editor), which you can then save under whatever name you want. It's that simple.
