Journal of Beijing University of Posts and Telecommunications

  • EI Core Journal

JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM ›› 2009, Vol. 32 ›› Issue (6): 125-129. doi: 10.13190/jbupt.200906.125.tangl

• Reports

A Novel Dynamic Spectrum Allocation Algorithm Based on POMDP Reinforcement Learning

TANG Lun;CHEN Qian-bin;ZENG Xiao-ping   

  1. College of Communication Engineering, Chongqing University, Chongqing 400044, China
  2. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2008-10-05 Revised: 2009-08-31 Online: 2009-12-28 Published: 2009-12-28
  • Contact: TANG Lun

Abstract:

A game model based on the Vickrey-Clarke-Groves (VCG) mechanism is presented for dynamic spectrum allocation, in order to reduce both the complexity of the allocation and the information exchanged during it. A reinforcement learning algorithm based on a partially observable Markov decision process (POMDP) is then presented: by observing and accumulating statistics on historical information, the secondary users continuously learn to increase the reward of their bidding strategy, and thereby obtain the optimal bidding strategy. Finally, the POMDP reinforcement learning algorithm is transformed into an optimal strategy learning algorithm for a belief Markov decision process (MDP), which is solved by value iteration. The simulation results show that the POMDP reinforcement learning algorithm can significantly improve the performance of dynamic spectrum allocation.
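
To illustrate the belief-MDP step mentioned in the abstract, the following is a minimal sketch of the standard Bayesian belief update for a POMDP. It is not the authors' implementation: the two hidden states, the single "bid" action, the win/lose observations, the tables `T` and `O`, and the function name `belief_update` are all assumptions made for this example.

```python
# Hypothetical two-state model (not from the paper):
#   hidden state 0 = low competition, 1 = high competition
#   action 0 = submit a bid
#   observation 0 = bid won, 1 = bid lost

# T[a][s][s_next]: probability of moving from s to s_next under action a (assumed values)
T = [[[0.9, 0.1],
      [0.2, 0.8]]]

# O[a][s_next][o]: probability of observing o after landing in s_next (assumed values)
O = [[[0.8, 0.2],
      [0.3, 0.7]]]


def belief_update(belief, action, observation, T, O):
    """Return b'(s') proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s)."""
    n = len(belief)
    unnormalized = []
    for s_next in range(n):
        # Prediction step: push the current belief through the transition model.
        predicted = sum(T[action][s][s_next] * belief[s] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(O[action][s_next][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# A secondary user starts uncertain, bids, and observes a win.
new_belief = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
print(new_belief)
```

Because the updated belief depends only on the previous belief, the action, and the observation, the belief itself can serve as the state of a fully observable MDP, which is what makes value iteration applicable.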

Key words: dynamic spectrum allocation, Vickrey-Clarke-Groves mechanism, cognitive network