
RMark 32 and 64 bit

PostPosted: Tue Aug 03, 2010 1:38 pm
by jlaake
I rebuilt RMark with R v2.11.1, for both 32- and 64-bit. I tested it with version 6.0 of MARK and it worked fine. Look on the left side of the screen on the RMark webpage on phidot. Note that only the R portion is 64-bit; both versions run a 32-bit mark.exe. Using 64-bit is only useful if you are using 64-bit R exclusively, or if you are running into memory problems in R while creating the model for mark.exe. You need more than 4GB of memory to see any impact from 64-bit R.
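If you are not sure which build your session is running, a quick check from within base R (nothing RMark-specific; these are standard base facilities) is:

```r
# Which build is this R session? (base R, no packages needed)
.Machine$sizeof.pointer  # 4 bytes under 32-bit R, 8 bytes under 64-bit R
R.version$arch           # e.g. "i386" for 32-bit, "x86_64" for 64-bit

# On Windows, memory.limit() reports the memory ceiling (in MB) available
# to R; under 32-bit R it tops out around 2-4GB regardless of installed RAM.
if (.Platform$OS.type == "windows") memory.limit()
```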

--jeff

Re: RMark 32 and 64 bit

PostPosted: Tue Aug 03, 2010 2:01 pm
by cooch
jlaake wrote:I rebuilt RMark with R v2.11.1, for both 32- and 64-bit. I tested it with version 6.0 of MARK and it worked fine. Look on the left side of the screen on the RMark webpage on phidot. Note that only the R portion is 64-bit; both versions run a 32-bit mark.exe. Using 64-bit is only useful if you are using 64-bit R exclusively, or if you are running into memory problems in R while creating the model for mark.exe. You need more than 4GB of memory to see any impact from 64-bit R.

--jeff


In general, there is no performance gain in going from 32-bit to 64-bit R, and in some cases there is a performance penalty. As Jeff notes, 64-bit anything (including R) can address more RAM, and for highly RAM-intensive applications this can be of real benefit, since it minimizes swapping to disk. Beyond that, however, 64-bit is not inherently faster than 32-bit. The following quote (from CRAN) provides some of the background; pay particular attention to the point about pointer size:

64-bit builds have both advantages and disadvantages:

* The total virtual memory space made available to a 32-bit process is limited by the pointer size to 4GB, and on most OSes to 3GB (or even 2GB). The limits for 64-bit processes are much larger (e.g. 8–128TB).

R allocates memory for large objects as needed, and removes any unused ones at garbage collection. When the sizes of objects become an appreciable fraction of the address limit, fragmentation of the address space becomes an issue and there may be no hole available that is the size requested. This can cause more frequent garbage collection or the inability to allocate large objects. As a guide, this will become an issue with objects more than 10% of the size of the address space (around 300Mb) or when the total size of objects in use is around one third (around 1Gb).

* 32-bit OSes by default limit file sizes to 2GB (and this may also apply to 32-bit builds on 64-bit OSes). This can often be worked around: and configure selects suitable defines if this is possible. (We have also largely worked around that limit on 32-bit Windows.) 64-bit builds have much larger limits.

* Because the pointers are larger, R's basic structures are larger. This means that R objects take more space and (usually) more time to manipulate. So 64-bit builds of R will, all other things being equal, run slower than 32-bit builds. (On Sparc Solaris the difference was 15-20%.)

* However, `other things' may not be equal. In the specific case of ‘x86_64’ vs ‘ix86’, the 64-bit CPU has features (such as SSE2 instructions) which are guaranteed to be present but are optional on the 32-bit CPU, and also has more general-purpose registers. This means that on chips like Intel Core 2 Duo the vanilla 64-bit version of R is around 10% faster on both Linux and Mac OS X — this can be reduced by tuning the compilation to the chip.

So, for speed you may want to use a 32-bit build, but to handle large datasets (and perhaps large files) a 64-bit build. You can often build both and install them in the same place.


So, as per the final statement, for speed 32-bit will probably win. In my own experiments comparing 64- and 32-bit R, the 32-bit versions are often 4-8% faster. Please don't fall for the sales pitch that '64-bit is faster!'.
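If you'd rather measure this on your own machine than take anyone's word for it, a minimal sketch is to run the same script under each build and compare the elapsed times (base R only; the matrix size here is arbitrary — pick whatever fits comfortably in RAM on both builds):

```r
# Time an identical numeric task under 32-bit R and then under 64-bit R,
# and compare the 'elapsed' column of the two runs.
set.seed(1)                      # identical inputs under both builds
n <- 1000                        # arbitrary size; keep it well inside RAM
m <- matrix(rnorm(n * n), n, n)

print(system.time(solve(m)))     # matrix inversion: mostly raw numerics

print(object.size(m), units = "Mb")  # the numeric data itself is the same
                                     # size in both builds; pointer overhead
                                     # shows up mainly in list-heavy objects
```

Note that a dense numeric matrix is a best case for 64-bit (the data are doubles either way); the size penalty the CRAN quote describes bites hardest on structures made of many small R objects.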