Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2013, Vol. 36 ›› Issue (4): 19-22. doi: 10.13190/jbupt.201304.16.fuy

• Paper

A Closed Sequential Pattern Mining Algorithm in Time Order

FU Yu1, YU Yan-hua1, SONG Mei-na1, ZHAN Xiao-su2

  1. PCN&CAD Center, Beijing University of Posts and Telecommunications, Beijing 100876, China;
  2. Institute of Military Operations Research and Analysis, Academy of Military Science, Beijing 100876, China
  • Received: 2012-05-16 Online: 2013-08-31 Published: 2013-05-22
  • About the authors: FU Yu (b. 1984), male, Ph.D. candidate, E-mail: ddskyfuyu@gmail.com; ZHAN Xiao-su (b. 1964), male, professor, doctoral supervisor.
  • Supported by:

    National Key Technology R&D Program of China (2012BAH01F02, 2013BAH10F01, 2013BAH07F02); National Natural Science Foundation of China (61072060); National High Technology Research and Development Program of China (2011AA100706); Specialized Research Fund for the Doctoral Program of Higher Education (20110005120007); Fundamental Research Funds for the Central Universities; and the project of the Engineering Research Center of Information Networks, Ministry of Education

Abstract:

Sequential pattern mining algorithms generate redundant patterns, which makes their running time excessively long. To address this drawback, a new closed sequential pattern mining algorithm, called the closed sequential pattern mining algorithm in time order (CloTSP), is proposed. Based on the properties of closed sequential patterns, CloTSP infers whether a frequent sequential pattern can be extended by comparing the time order between that pattern and each frequent 1-item sequence. Experiments on synthetic data produced by the International Business Machines Corporation (IBM) sequential pattern generator compare CloTSP with the closed sequential pattern mining algorithm CloSpan. The results show that CloTSP shortens running time significantly and that its running time is not easily affected by the number of attribute values.

Key words: time order, closed sequential patterns, data mining
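For readers unfamiliar with the term, the sketch below illustrates what "closed" means for sequential patterns: a frequent pattern is closed when no proper super-sequence has the same support. This is a brute-force illustration under simplifying assumptions (each sequence element is a single item, and all function names are hypothetical); it is not the CloTSP algorithm, whose details the abstract does not give, and it does not reproduce CloSpan either.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so the order of items is respected.
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Count the sequences in the database that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database: List[Sequence[str]], min_support: int) -> Dict[Pattern, int]:
    """Enumerate all frequent sequences by naive level-wise extension.

    Assumption: every sequence element is a single item (no itemsets).
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    items = sorted({item for seq in database for item in seq})
    result: Dict[Pattern, int] = {}
    frontier: List[Pattern] = [()]
    while frontier:
        next_frontier: List[Pattern] = []
        for prefix in frontier:
            for item in items:
                candidate = prefix + (item,)
                count = support(candidate, database)
                # Only frequent candidates are extended further: every
                # prefix of a frequent sequence is itself frequent.
                if count >= min_support:
                    result[candidate] = count
                    next_frontier.append(candidate)
        frontier = next_frontier
    return result


def closed_patterns(frequent: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Keep only patterns with no proper super-sequence of equal support."""
    closed: Dict[Pattern, int] = {}
    for pattern, sup in frequent.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_sup == sup
            and is_subsequence(pattern, other)
            for other, other_sup in frequent.items()
        )
        if not absorbed:
            closed[pattern] = sup
    return closed


# Toy database (made up for illustration only).
example_db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c"), ("b", "c")]
example_closed = closed_patterns(frequent_patterns(example_db, min_support=2))
```

On this toy database with a minimum support of 2, seven sequences are frequent but only four are closed; for example, ("a",) is dropped because ("a", "c") has the same support of 3. The redundancy removed by this filter is the kind the abstract refers to, although a practical algorithm avoids generating the redundant patterns in the first place rather than filtering them afterwards.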

CLC Number: