Journal of Beijing University of Posts and Telecommunications

  • EI core journal

JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM ›› 2015, Vol. 38 ›› Issue (6): 34-38. doi: 10.13190/j.jbupt.2015.06.008

• Papers

Smallfiles on HDFS Merging based on the Energy Efficiency

YU Jun-yang1,2, HU Zhi-gang1, LIU Xiu-lei3   

  1. Software School, Central South University, Changsha 410075, China;
    2. Software School, Henan University, Kaifeng, Henan 475000, China;
    3. Computer School, Beijing Information Science and Technology University, Beijing 100101, China
  • Received: 2015-01-10 Online: 2015-12-28 Published: 2015-12-01

Abstract:

MapReduce programs running on the Hadoop distributed file system (HDFS) incur high energy costs when the input consists of many small files. To address this problem, the article establishes a new energy model of a Hadoop node cluster and uses it to show that there is an optimal file size on Hadoop that minimizes the energy cost of running a program. Based on these results and marginal analysis theory, a decision strategy is proposed that finds the optimal file size from the standpoint of both energy cost and access cost. The strategy merges small files on HDFS up to the optimal file size according to cost efficiency, so as to obtain the greatest benefit. Experiments confirm that an optimal data block size exists, and that determining the block size by combining cost and efficiency under marginal analysis theory is reasonable and valid.
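The idea of choosing a merge size by marginal analysis can be illustrated with a minimal sketch. It is not the article's energy model: the cost curves `energy_cost` and `access_cost`, the candidate sizes, and the helper names `optimal_size_by_margin` and `merge_small_files` are all hypothetical, chosen only to show the shape of the reasoning. The sketch assumes that energy cost falls as merged files grow, while the cost of accessing an individual small file inside a merged file rises.

```python
def optimal_size_by_margin(energy_cost, access_cost, sizes):
    """Grow the merge size while the marginal energy saving
    exceeds the marginal increase in access cost."""
    best = sizes[0]
    for smaller, larger in zip(sizes, sizes[1:]):
        saving = energy_cost(smaller) - energy_cost(larger)  # marginal benefit
        extra = access_cost(larger) - access_cost(smaller)   # marginal cost
        if saving <= extra:
            break  # a further increase no longer pays off
        best = larger
    return best


def merge_small_files(file_sizes, target):
    """Pack file sizes (largest first) into groups whose totals stay
    within target; a file larger than target gets a group of its own."""
    groups, current, total = [], [], 0
    for size in sorted(file_sizes, reverse=True):
        if current and total + size > target:
            groups.append(current)
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        groups.append(current)
    return groups


# Placeholder cost curves (assumptions, not taken from the article).
def energy_cost(size_mb):
    return 4096.0 / size_mb  # fewer, larger files: less per-file overhead


def access_cost(size_mb):
    return float(size_mb)  # larger merged files: costlier individual access


candidates = [16, 32, 64, 128, 256]  # candidate merge sizes in MB
target = optimal_size_by_margin(energy_cost, access_cost, candidates)
groups = merge_small_files([10, 20, 30, 40, 5, 60], target)
print(target, groups)
```

With these placeholder curves the search stops at the size where one more doubling would add more access cost than it saves in energy; with measured curves from a real cluster the stopping point would differ.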

Key words: cloud computing, Hadoop distributed file system, Hadoop, energy efficiency, marginal analysis
