Journal of Beijing University of Posts and Telecommunications

  • EI核心期刊

JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM ›› 2019, Vol. 42 ›› Issue (3): 106-113.doi: 10.13190/j.jbupt.2018-252

• Reports • Previous Articles     Next Articles

High Performance Row-Based Hashing GPU SpGEMM

TANG Yang1, ZHAO Da-fei2,3, HUANG Zhi-bin2,3, DAI Zhi-tao2,3   

  1. 1. School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China;
    2. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China;
    3. School of Computer Science, Beijing University of Posts and Telecommunication, Beijing 100876, China
  • Received:2018-10-09 Online:2019-06-28 Published:2019-06-20

Abstract: Aiming at the performance problem of general sparse matrix-matrix multiplication (SpGEMM), a graphics processing unit (GPU)-accelerate SpGEMM algorithm based on task classification and low-latency Hashing table, RBSPARSE, was presented in the paper. RBSPARSE consists of a low-cost pre-analysis method to identify the complexity of sub-tasks, and a Hashing table-based algorithm which could utilize low-latency shared memory to achieve max efficiency. By taking the load balancing issue and the memory latency issue into consideration, RBSPARSE could significantly reduce the overall time in computation. RBSparse and BHSparse are compared. BHSparse is the previous state-of-the-art algorithm for SpGEMM. The result shows that our algorithm is 3.1 times faster than BHSparse on average, and could achieve a maximum 14.49 times faster speed in the best scenario.

Key words: general sparse matrix-matrix multiplication, graphics processing unit, performance optimization, Hash table, shared memory

CLC Number: