SPRINT Beta 1
The changes in SPRINT beta 0.1 target the scalability of SPRINT, which can now process larger data sets. The restriction on the size of the output data has been removed through the use of R ff objects and binary files, so SPRINT can successfully return output data that is larger than the memory of the computer.
Further improvements to the HPC harness using MPI/IO mean that SPRINT now scales almost perfectly to 512 cores, providing a major improvement in performance and execution time.
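The general idea behind parallel file output, where each process writes its own region of one shared binary file, can be sketched as follows. This is only an illustration in Python using the standard library, not SPRINT's harness and not real MPI: the ranks are simulated by a loop, and the names row_range, write_block and compute_row are hypothetical, as is the assumption of a balanced split by rows.

```python
import os
import struct
import tempfile

ITEM = struct.calcsize("d")  # bytes per double-precision value


def row_range(rank, n_ranks, n_rows):
    """Half-open range of rows owned by `rank` under a balanced block split."""
    base, extra = divmod(n_rows, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def write_block(path, rank, n_ranks, n_rows, n_cols, compute_row):
    """Write the rows owned by `rank` at their byte offset in a shared file.

    compute_row(i) is a hypothetical callback returning row i as n_cols floats.
    """
    start, stop = row_range(rank, n_ranks, n_rows)
    with open(path, "r+b") as f:
        f.seek(start * n_cols * ITEM)  # each rank starts at its own offset
        for i in range(start, stop):
            f.write(struct.pack(f"{n_cols}d", *compute_row(i)))


if __name__ == "__main__":
    n_rows, n_cols, n_ranks = 10, 4, 3
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "result.bin")
        # Pre-size the output file so every region already exists.
        with open(path, "wb") as f:
            f.truncate(n_rows * n_cols * ITEM)
        # Simulate the ranks one after another; the regions do not overlap,
        # so the order in which they write does not matter.
        for rank in reversed(range(n_ranks)):
            write_block(path, rank, n_ranks, n_rows, n_cols,
                        lambda i: [float(i)] * n_cols)
        print(os.path.getsize(path), "bytes written")
```

Because the regions are disjoint, no process needs to hold the full result in memory or wait for another process before writing, which is one reason this style of output can scale with the number of cores.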
No additional functions have been added to the library of parallelised statistical R functions.
The SPRINT function library contains two functions:
- pcor(): a Pearson correlation function
- ptest(): a simple 'Hello World!' function
The parallel Pearson correlation function pcor() is based on the serial correlation function cor(). However, the interface has been widened to allow the use of ff objects. The ff package offers memory-mapped file support for the R environment and effectively allows the manipulation of objects on disk as if they were in memory. This enables pcor() to process data larger than the computer memory.
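The principle of writing a correlation matrix into a memory-mapped file, so that the result lives on disk yet can be indexed like an in-memory array, can be sketched as follows. This is a minimal serial illustration in Python using the standard library only; it is not the implementation of pcor() or of the ff package, and the function names pearson, correlate_rows_to_file and read_entry are hypothetical.

```python
import math
import mmap
import os
import struct
import tempfile

ITEM = struct.calcsize("d")  # bytes per double-precision value


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlate_rows_to_file(rows, path):
    """Store the row-by-row correlation matrix in a memory-mapped binary file."""
    n = len(rows)
    # Pre-size the file so that it can be mapped.
    with open(path, "wb") as f:
        f.truncate(n * n * ITEM)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as out:
        for i in range(n):
            for j in range(n):
                # Entry (i, j) is written straight to its position on disk.
                struct.pack_into("d", out, (i * n + j) * ITEM,
                                 pearson(rows[i], rows[j]))
        out.flush()


def read_entry(path, n, i, j):
    """Read entry (i, j) of an n x n matrix without loading the whole file."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return struct.unpack_from("d", m, (i * n + j) * ITEM)[0]


if __name__ == "__main__":
    rows = [[1.0, 2.0, 3.0, 4.0],
            [2.0, 4.0, 6.0, 8.0],
            [4.0, 3.0, 2.0, 1.0]]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "corr.bin")
        correlate_rows_to_file(rows, path)
        print(read_entry(path, len(rows), 0, 1))
        print(read_entry(path, len(rows), 0, 2))
```

Only the rows being compared and a single output value are held in memory at any time; the operating system pages the mapped file in and out as needed, which is what lets the output exceed the available memory.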
The graph below was obtained on the UK National Supercomputing Service, HECToR, using up to 256 cores for calculating the Pearson correlation on a 11,00 x 320 input matrix.