Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2013, Vol. 36 ›› Issue (4): 76-80. doi: 10.13190/jbupt.201304.76.songj

• Papers •

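Load-Balanced Data Layout Approach in Data-Intensive Computing

SONG Jie, LI Tian-tian, YAN Zhen-xing, ZHU Zhi-liang

  1. Software College, Northeastern University, Shenyang 110819, China
  • Received: 2012-10-11 Online: 2013-08-31 Published: 2013-05-22
  • About the author: SONG Jie (b. 1980), male, associate professor, Ph.D., E-mail: songjie@mail.neu.edu.cn.
  • Supported by:

    National Natural Science Foundation of China (61202088); Natural Science Foundation of Liaoning Province (200102059); Fundamental Research Funds for the Central Universities (N110417002)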

Abstract:

Widely used in data-intensive computing, the MapReduce model moves computation to the data and executes it in parallel. Data layout therefore affects not only storage itself but also computing efficiency: the characteristics of the data stored on a node determine how efficiently tasks run on that node. Research on load balancing accordingly shifts from traditional server management or task scheduling to data layout aimed at improving parallelism. The characteristics of data layout in data-intensive computing and MapReduce environments are analyzed, a load-balancing goal for data layout is proposed, and a data layout approach that achieves load balancing in a specific environment is presented. Experiments confirm the effectiveness of the proposed goal and approach. Theoretical and experimental results show that the proposed data layout approach effectively improves the parallelism of MapReduce applications and thereby optimizes their execution efficiency.

Key words: data-intensive computing, data layout, load balancing, MapReduce, cloud computing

CLC number: