Design of Distributed Shared Memory Structure for Array Processor

JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM, 2017, Vol. 40, Issue (4): 9-15. doi: 10.13190/j.jbupt.2017.04.002

SHAN Rui¹, SHEN Xu-bang¹, JIANG Lin², ZHU Yun², SONG Hui²

1. School of Microelectronics, Xidian University, Xi'an 710071, China;
2. School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China

Received: 2016-10-18 Online: 2017-08-28 Published: 2017-07-10

Abstract

As the number of processors increases, the memory wall problem becomes more severe. To alleviate it, a two-level hybrid interconnection network is proposed: a fast crossbar for local data transfer and a network on chip for long-distance data communication. A data transfer mechanism is also designed to support unified addressing. Memory architectures of two sizes were implemented on a field-programmable gate array, and their area, frequency, and power consumption were evaluated. A mixed simulation testbench based on SystemC was developed. The simulation results show that the designed architecture has higher memory access bandwidth and lower local access latency.
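The two-level idea can be illustrated with a minimal sketch of unified address decoding and path selection. This is not the authors' implementation: the cluster size, bank depth, 2D-mesh layout, XY hop count, and the names `decode` and `route` are all assumptions chosen for illustration, since the abstract does not give these details.

```python
from dataclasses import dataclass
from typing import Tuple

# All sizes below are illustrative assumptions, not values from the paper.
PES_PER_CLUSTER = 16   # processing elements (one bank each) behind one crossbar
BANK_WORDS = 512       # words per memory bank
MESH_WIDTH = 4         # clusters per row in an assumed 2D-mesh NoC


@dataclass(frozen=True)
class Target:
    cluster: int   # which crossbar-connected group owns the word
    bank: int      # bank index inside that group
    offset: int    # word offset inside the bank


def decode(addr: int) -> Target:
    """Split a flat (unified) word address into cluster, bank, and offset."""
    offset = addr % BANK_WORDS
    bank = (addr // BANK_WORDS) % PES_PER_CLUSTER
    cluster = addr // (BANK_WORDS * PES_PER_CLUSTER)
    return Target(cluster, bank, offset)


def route(src_cluster: int, addr: int) -> Tuple[str, int]:
    """Pick the path for an access and count its NoC hops.

    Same-cluster accesses stay on the crossbar (0 hops). Other accesses
    cross the NoC; hops are the Manhattan distance between clusters,
    which assumes dimension-ordered (XY) routing on a mesh.
    """
    target = decode(addr)
    if target.cluster == src_cluster:
        return ("crossbar", 0)
    sx, sy = src_cluster % MESH_WIDTH, src_cluster // MESH_WIDTH
    dx, dy = target.cluster % MESH_WIDTH, target.cluster // MESH_WIDTH
    return ("noc", abs(sx - dx) + abs(sy - dy))
```

Under these assumptions, a processing element issues the same flat address whether the data is local or remote; only the decoded cluster field decides whether the request takes the crossbar or is sent over the NoC.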

Key words: array processor, memory structure, network on chip, distributed memory, unified addressing

CLC Number: TN492

SHAN Rui, SHEN Xu-bang, JIANG Lin, ZHU Yun, SONG Hui. Design of Distributed Shared Memory Structure for Array Processor[J]. JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM, 2017, 40(4): 9-15.
