Welcome to KMR#
This is KMR, a high-performance map-reduce library. KMR-1.0 has been available since 2013-04-26. KMR works on ordinary clusters as well as large-scale supercomputers. The KMR source code is available under the BSD license.
The latest release is KMR-1.10 (2018-11-16).
KMR is a set of high-performance map-reduce operations for the MPI (Message Passing Interface) environment. It makes data-processing programs much easier to write by hiding the low-level details of message passing. Its main targets are large-scale supercomputers with thousands of compute nodes. KMR also provides utilities beyond map-reduce operations, addressing issues such as access to very large file systems on the K computer and Fujitsu FX10 platforms.
KMR is designed to work in memory and to exploit the large amount of memory available on supercomputers, whereas most map-reduce implementations are designed around external (disk-based) operations. Data exchanges in KMR therefore occur as message passing rather than as remote file operations. The KMR routines work in a bulk-synchronous manner, and most of the code is sequential, but the code inside the mappers and reducers is multi-threaded.
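As a rough illustration of the in-memory, bulk-synchronous model described above, the following Python sketch simulates a map, shuffle, and reduce sequence over a handful of "ranks" held as plain lists. It uses neither KMR nor MPI: the function names (`owner`, `map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example, and the hash-based key placement is an assumption, not a description of how KMR assigns keys to ranks.

```python
import zlib
from collections import defaultdict


def owner(key, nranks):
    """Pick the rank that owns a key, using a deterministic hash (assumed policy)."""
    return zlib.crc32(key.encode("utf-8")) % nranks


def map_phase(local_inputs, mapper):
    """Each simulated rank applies the mapper to its own records; no communication."""
    return [[kv for record in records for kv in mapper(record)]
            for records in local_inputs]


def shuffle_phase(local_kvs):
    """All-to-all exchange: every pair moves to the rank that owns its key.

    In an MPI program this step would be message passing between processes;
    here it is only list appends within one process.
    """
    nranks = len(local_kvs)
    inboxes = [[] for _ in range(nranks)]
    for kvs in local_kvs:
        for key, value in kvs:
            inboxes[owner(key, nranks)].append((key, value))
    return inboxes


def reduce_phase(local_kvs, reducer):
    """Each simulated rank groups its pairs by key and reduces every group locally."""
    results = []
    for kvs in local_kvs:
        groups = defaultdict(list)
        for key, value in kvs:
            groups[key].append(value)
        results.append({key: reducer(key, values)
                        for key, values in groups.items()})
    return results


def word_count(local_inputs):
    """Run the three phases in order; each phase finishes before the next starts."""
    mapped = map_phase(local_inputs, lambda line: [(w, 1) for w in line.split()])
    shuffled = shuffle_phase(mapped)
    return reduce_phase(shuffled, lambda key, values: sum(values))


if __name__ == "__main__":
    # One inner list per simulated rank.
    inputs = [["a b a"], ["b c"], ["a c c"]]
    for rank, counts in enumerate(word_count(inputs)):
        print(rank, sorted(counts.items()))
```

Every phase runs to completion on all simulated ranks before the next one begins, which is the bulk-synchronous pattern, and all intermediate key-value pairs stay in memory rather than being written to files.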
Documents#
- Overview and API Document
- It is a Doxygen-generated document, included in the installation.
Downloading#
Tutorials#
- Tutorial (in Japanese)
Project Site#
- KMR on GitHub https://github.com/riken-rccs
- Issue reporting https://github.com/riken-rccs/kmr/issues
- Other software from RIKEN R-CCS https://riken-rccs.github.io
Publications#
- K MapReduce: A Scalable Tool for Data-Processing and Search/Ensemble Applications on Large-Scale Supercomputers. Motohiko Matsuda, Naoya Maruyama, and Shinichiro Takizawa. IEEE Cluster Computing (CLUSTER) 2013. (C) Copyright IEEE. ieeexplore.ieee.org
It gives an overview of KMR and describes the optimizations used in it.
- Supporting Workflow Management of Scientific Applications by MapReduce Programming Model. Shinichiro Takizawa, Motohiko Matsuda, and Naoya Maruyama. IPSJ HPCS 2014. (in Japanese). http://id.nii.ac.jp/1001/00096874
It describes scientific application workflows implemented in MapReduce using KMR.
- Evaluation of Asynchronous MPI Communication in Map-Reduce System on the K Computer. Motohiko Matsuda, Naoya Maruyama, and Shinichiro Takizawa. EuroMPI Workshop 2014. (C) Copyright ACM. dl.acm.org
It compares all-to-all collective communication with asynchronous communication in the shuffling step, to assess the presumed benefit of overlapping communication and computation.
Acknowledgment#
KMR is a product of RIKEN R-CCS. Part of the results was obtained using the K computer at RIKEN R-CCS.
DISCLAIMER#
KMR comes with ABSOLUTELY NO WARRANTY. This wiki also comes with ABSOLUTELY NO WARRANTY. Contents are liable to change.