Journal of Beijing University of Posts and Telecommunications

Current Issue

- 2024, Vol. 47 No. 4. Published: 28 August 2024
- Holistic Artificial Intelligence
- FENG Junlan
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 1-10.
- Abstract ( 419 ) 　　
- Artificial intelligence (AI) technology represented by foundation models has achieved remarkable results, raising the overall level of machine intelligence to an unprecedented height. Foundation models, computing power, networks, and data are gradually becoming fundamental infrastructure in the field of artificial intelligence. To rely on this infrastructure to provide ubiquitous and secure society-level intelligent services, and to make them as widely available as water, electricity, and 5G services at almost zero marginal cost, the holistic artificial intelligence (HAI) technology framework is proposed and elaborated. In this framework, intelligence needs can be flexibly expressed through natural language, graphics, images, component arrangement, and other methods. Based on a foundation model, HAI understands the user’s needs and forms an execution plan that includes the models, capabilities, data, and computing network resources needed to complete the tasks. HAI then deploys models and capabilities to the corresponding computing network resources, and schedules and jointly optimizes them flexibly to meet business requirements. The proposed framework utilizes core technologies including big loop AI for artificial intelligence services, atomized AI capabilities, network native AI, and secure and trustworthy AI services.
- Application of Holistic Artificial Intelligence and Large Language Models for Comprehensive Information Collection
- HAN Xu, SUN Yawei, ZHAO Lu
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 11-19,28.
- Abstract ( 338 ) 　　
- The application of holistic artificial intelligence and large language models in scenarios of comprehensive information collection, analysis, and decision-making is studied. First, the current state of development of holistic artificial intelligence and large language models is reviewed, the advantages of the related technologies in the application scenarios of intelligent intelligence are clarified, and a theoretical framework diagram integrating them into intelligent intelligence research is proposed. Second, each functional module of the framework is interpreted in detail, and the corresponding technical points are analyzed in depth to explore the concrete deployment scenarios of this framework. Finally, the way holistic artificial intelligence improves the efficiency and accuracy of intelligent intelligence work is analyzed, and the risks and challenges that may arise in actual applications, as well as directions for future exploration and development, are discussed.
- Artificial Intelligence in the Era of Large Language Models: Technical Significance, Industry Applications, and Challenges
- CHEN Guang, GUO Jun
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 20-28.
- Abstract ( 1240 ) 　　
- The emergence of ChatGPT marks the advent of the era of artificial intelligence powered by large language models (LLMs). Pre-trained on large-scale datasets, LLMs demonstrate exceptional adaptability and creativity, becoming a critical driving force in advancing society and playing a significant role in systematic artificial intelligence. Given the limitations of existing reviews in analyzing the challenges faced by LLMs, their key attributes, and engineering implementation aspects, the framework is re-examined and reconstructed along three dimensions: technical connotations, industry applications, and major challenges. The focus is on elucidating the technical connotations of LLMs, including system architecture, training strategies, model scale, compression, multimodal fusion, prompting, and planning. Application prospects in fields such as education, scientific research, healthcare, finance, and justice are also explored. Additionally, the discussion covers the current state of research on the reliability, controllability, and security of LLMs, as well as the dual challenges LLMs face at the technical and societal levels. Finally, the role of LLMs in systematic artificial intelligence is envisioned and alignment points in research directions are identified, aiming to provide new perspectives and ideas for the research and application of LLMs.
- Inpainting Computational Fluid Dynamics with Physics-Informed Variational Autoencoder
- WANG Jiamin, YAN Zhexi, WANG Xiaokun, ZHANG Yalan, GUO Yu
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 29-35,43.
- Abstract ( 135 ) 　　
- To repair noisy or partially missing fluid flow data and achieve accurate fluid dynamics analysis, a physics-informed variational autoencoder model is proposed. First, the variational autoencoder is employed to learn the latent representation of fluid flow. Second, spatiotemporal coordinate information is combined with this latent representation, and the partial derivatives of the decoded flow field information with respect to the spatiotemporal coordinates are obtained by automatic differentiation technology. Finally, physical prior information of fluid dynamics is introduced to construct a physical constraint loss term. This ensures that the generated data conforms to both the key features of the flow and the underlying physical laws, thereby enhancing the physical consistency and reconstruction accuracy while providing a certain level of interpretability. Experimental results demonstrate that the proposed model achieves higher accuracy than existing methods in dealing with flow field noise and data missing issues. Its effectiveness is also proven in both two-dimensional and three-dimensional complex vortex flow fields.
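The idea of differentiating a flow field with respect to its coordinates and penalizing violations of a physical law can be sketched in a few lines. The abstract does not say which physical prior is used, so this sketch assumes incompressibility (zero divergence) as the constraint, uses a tiny hand-written forward-mode automatic differentiation class instead of a deep learning framework, and replaces the decoder with two analytic 2D velocity fields. All names (`Dual`, `divergence`, `physics_loss`, `vortex`, `expanding`) are illustrative and not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one chosen input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        return self + (-lift(other))

    def __mul__(self, other):
        other = lift(other)
        # Product rule: (fg)' = f'g + fg'
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def lift(x):
    """Wrap a plain number as a constant Dual (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    x = lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def cos(x):
    x = lift(x)
    return Dual(math.cos(x.val), -math.sin(x.val) * x.der)


def divergence(field, x, y):
    """du/dx + dv/dy at (x, y), each obtained by seeding one coordinate."""
    du_dx = field(Dual(x, 1.0), Dual(y, 0.0))[0].der
    dv_dy = field(Dual(x, 0.0), Dual(y, 1.0))[1].der
    return du_dx + dv_dy


def physics_loss(field, points):
    """Mean squared divergence over sample points (assumed constraint)."""
    return sum(divergence(field, x, y) ** 2 for x, y in points) / len(points)


def vortex(x, y):
    # Stand-in for a decoded field; this one is divergence-free.
    return sin(x) * cos(y), -(cos(x) * sin(y))


def expanding(x, y):
    # Stand-in for a physically inconsistent field; its divergence is 2.
    return x, y


points = [(0.1, 0.2), (1.0, -0.5), (2.5, 3.0)]
loss_vortex = physics_loss(vortex, points)
loss_expanding = physics_loss(expanding, points)
```

In a training loop, a term of this kind would be added to the reconstruction loss so that fields violating the constraint are penalized; here the divergence-free field yields a loss of essentially zero while the expanding field does not.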
- The Instruction Tuning of Large Language Models with Multi-Modal Recommendation Instruction
- HAO Bowen, LIU Yifei, LI Liyao, WANG Jie, PENG Yan
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 36-43.
- Abstract ( 528 ) 　　
- Tuning large language models on multimodal instructions has been proven effective in endowing them with the capability to address relevant multimodal tasks. To further empower large language models in handling multimodal zero-shot or few-shot recommendation tasks, a multimodal recommendation large language model is proposed, which is built on ChatGLM2-6B and trained on a multimodal recommendation dataset that includes both textual and image information. Multimodal user profiles and item attributes are constructed by using ChatGPT and GPT-4 to generate instructions. Additionally, instructions for zero-shot and few-shot recommendations are formulated. The model is fine-tuned in a parameter-efficient manner using the P-tuning v2 method, requiring only a single A100 40GB graphics processing unit for the fine-tuning process. Experimental results demonstrate that the proposed model significantly outperforms existing baseline models.
- Fluid Dynamics Prediction Based on Spatial Polar Coordinate Convolution
- DU Feilong, BAN Xiaojuan, ZHANG Yalan, DONG Zirui, WANG Xiaokun
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 44-49.
- Abstract ( 112 ) 　　
- Fluid is one of the most fundamental substances in nature, and its simulation often requires a trade-off between accuracy and efficiency. Thus, a novel end-to-end systematic convolutional network fluid simulator called PolarNet is proposed. First, particle data is converted to a 3D polar coordinate representation, and the PolarConv spatial convolution structure is designed. Second, combined with a physical fluid simulator, a four-layer network structure is constructed to design the network fluid simulator PolarNet, achieving end-to-end fluid prediction. Additionally, physically based constraints are carefully designed to enhance fluid incompressibility. Experimental results show that, compared with traditional modeling-based simulators, PolarNet significantly improves the accuracy of fluid boundaries and maintains incompressibility while ensuring efficiency. Compared with other learning-based fluid simulators, PolarNet maintains the highest prediction stability with fewer training parameters, benefiting from the compact representation of polar coordinates. PolarNet provides new methods and perspectives for multimodal information processing.
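As a rough illustration of the first step, the sketch below converts the Cartesian offset between a particle and one of its neighbors into a 3D polar representation. It assumes the standard spherical convention (radius, polar angle measured from the z-axis, azimuth in the x-y plane); the abstract does not state which convention PolarNet uses, and the function name `to_spherical` is hypothetical.

```python
import math
from typing import Tuple


def to_spherical(dx: float, dy: float, dz: float) -> Tuple[float, float, float]:
    """Convert a Cartesian offset to (radius, polar angle, azimuth) in radians."""
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        # The angles are undefined at the origin; return zeros by convention.
        return 0.0, 0.0, 0.0
    # Clamp to guard against rounding pushing the ratio slightly outside [-1, 1].
    polar = math.acos(max(-1.0, min(1.0, dz / r)))
    azimuth = math.atan2(dy, dx)
    return r, polar, azimuth


# Hypothetical neighbor offsets relative to a center particle.
offsets = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
spherical = [to_spherical(*o) for o in offsets]
```

A convolution defined over such coordinates can bin neighbors by distance and direction, which is one plausible reading of the "compact representation" mentioned in the abstract.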
- Domain-Specific Question Answering System Construction Approach Integrated with Large Language Model
- QI Siyang, HU Huiyun, LI Hongbing, LI Qi, XIAO Bo
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 50-56.
- Abstract ( 439 ) 　　
- The construction of domain-specific question answering systems frequently encounters challenges, including substantial data costs, intricate knowledge construction, and significant differences among datasets from various domains. To address these challenges, an approach that integrates large language models and domain-specific knowledge for question answering system construction is proposed. Most existing methods directly store and match the local knowledge corpus in segments. When performing retrieval-augmented generation, the semantic matching between the query and the corpus is insufficient, which reduces the quality of text generation. Therefore, the prompt aligned retrieval generation approach is proposed to unify the semantic space of user queries and the corpus by generating pseudo question and answer pairs, thereby improving the retrieval efficiency of domain knowledge and the accuracy of answers. Experiments show that the proposed approach overcomes challenges related to high model training costs, enabling rapid deployment across various vertical domains and outperforming other methods.
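The core retrieval idea, matching a user query against pseudo questions rather than against raw corpus segments, can be sketched as follows. This is a toy illustration under several assumptions: the pseudo questions are hand-written stand-ins for LLM-generated ones, bag-of-words cosine similarity stands in for a real embedding model, and the corpus, identifiers, and function names are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Tuple


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical corpus segments keyed by id.
CHUNKS = {
    "c1": "Routers must be restarted after a firmware upgrade.",
    "c2": "Refunds are issued within seven days of a cancellation request.",
}

# Hypothetical pseudo questions (in practice generated by an LLM per segment).
PSEUDO_QUESTIONS = [
    ("Do I need to restart the router after upgrading firmware?", "c1"),
    ("How long does a refund take after I cancel?", "c2"),
]


def retrieve(query: str) -> Tuple[str, float]:
    """Return the segment whose pseudo question best matches the query."""
    q = bag_of_words(query)
    best_question, chunk_id = max(
        PSEUDO_QUESTIONS, key=lambda pair: cosine(q, bag_of_words(pair[0]))
    )
    return CHUNKS[chunk_id], cosine(q, bag_of_words(best_question))


context, score = retrieve("how long until I get my refund")
```

Because the query and the pseudo questions are both phrased as questions, they share more vocabulary than the query shares with the declarative corpus text, which is the alignment effect the abstract describes.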
- Cross-Domain Object Detection Algorithm for Complex End-to-End Scene Understanding
- CHEN Aoran, HUANG Hai, ZHU Yueyan, XUE Junsheng
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 57-62.
- Abstract ( 161 ) 　　
- Conventional deep learning training approaches often assume a similarity between the deployment scenario and the visual domain features present in the training data. However, this assumption might not hold in complex end-to-end scenarios, making it difficult to meet the demands of intelligent detection services in open environments. In response, an object detection algorithm with cross-domain capabilities, based on artificial intelligence closed-loop ensemble theory, is proposed. Within the detection framework, a backbone network and a bottleneck layer network with multi-scale convolutional layers are constructed. A visual domain discriminator featuring long-range dependency attention works as a secondary detection head to refine the results. Moreover, a background focusing module, based on spatial reconstruction attention units, enhances learning focused on pseudo-background representations, thereby improving the accuracy of cross-domain object detection. Experimental results show that, in complex end-to-end scenarios, the proposed algorithm yields an average precision increase of 6.9% over two-stage algorithms and surpasses single-stage algorithms by 9.0%.
- Image Denoising Network Algorithm Based on Jacobian Dynamic Approximation
- LIU Meiqin, JI Houguo, BAI Yu, YAO Chao, ZHAO Yao
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 63-70.
- Abstract ( 174 ) 　　
- Limited by the environment and the acquisition device, captured images are susceptible to noise, which degrades the visual quality perceived by users. To systematically remove noise from images, an end-to-end Jacobian approximation denoising network algorithm is proposed. Specifically, an ordinary differential equation is utilized to construct a forward differential structure for dynamically simulating the noise distribution in the image. A Jacobian matrix-based solution module is designed to implement the forward derivative and reduce the complexity of the denoising network. To enhance the feature representation capability for complex noise, a multi-scale feature extraction module is designed to capture the features of non-uniform noise. In addition, a dual attention structure is used to enhance the reconstructed features and improve the quality of the reconstructed images. Extensive experimental results demonstrate the effectiveness of the proposed algorithm in eliminating synthetic and real noise from images, and the reconstructed images achieve better results in both subjective visual effect and objective evaluation metrics.
- Few-Shot Knowledge Graph Completion Based on Subgraph Structure Semantic Enhancement
- YANG Rongtai, SHAO Yubin, DU Qingzhi, LONG Hua, MA Dinan
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 71-76,89.
- Abstract ( 213 ) 　　
- A model referred to as subgraph structure semantic enhancement for few-shot knowledge graph completion is proposed to address the insufficient semantic representation of entities in few-shot learning contexts. First, an attention mechanism is employed to extract text semantic features of relation interaction, and to extract subgraph structure semantic features of clustering coefficients. Subsequently, entity semantic aggregation is performed with a feedforward neural network, and a Transformer network is applied to encode triples. Finally, the score for link prediction is computed using the prototype matching network. Experimental results show the proposed model’s superiority over metric-learning-based baseline models, and it outperforms the latest meta-learning-based baseline model in the Hits@1 index on the NELL-One dataset. Moreover, across all indices on the Wiki-One dataset, the proposed model delivers the best results. This demonstrates the proposed model’s effectiveness in enhancing entity representation and improving prediction accuracy.
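For readers unfamiliar with the Hits@1 index mentioned above: Hits@k is commonly defined as the fraction of test queries for which the correct entity is ranked within the top k candidates. A minimal sketch under that usual definition follows; the ranks are made-up illustrative values, not results from the paper.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int = 1) -> float:
    """Fraction of queries whose correct entity has a 1-based rank <= k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the true tail entity for five test triples.
example_ranks = [1, 3, 1, 2, 7]
hits1 = hits_at_k(example_ranks, k=1)  # 2 of 5 queries ranked first
hits5 = hits_at_k(example_ranks, k=5)  # 4 of 5 queries ranked in the top 5
```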
- Zero-Shot Rumor Detection via Meta Multi-Task Prompt Learning
- SHI Yu, YU Ning, SUN Yawei, LIU Jianyi
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 77-82.
- Abstract ( 200 ) 　　
- To address the high memory usage of fine-tuning large language models in existing rumor detection methods, and to tackle the sensitivity of prompt learning to its initial point, a meta multi-task prompt learning method for zero-shot rumor detection is proposed. First, the objective of the zero-shot rumor detection task is modified based on prompt learning, and the prompt template is designed so that the task objective aligns with the training objective of large language models, fully leveraging the prior knowledge they have accumulated. Second, a parameter update strategy based on meta-learning is employed to rapidly identify suitable initial points of the prompt template for zero-shot rumor detection, and meta-knowledge is learned from different meta-tasks to achieve parameter optimization. Finally, sentiment analysis is introduced as an auxiliary meta-task to further optimize the model parameters. Extensive experiments conducted on real-world datasets demonstrate that the proposed model outperforms baseline methods in zero-shot rumor detection tasks, achieving the best performance across various metrics.
- Industrial Paste Concentration Measurement Through Holistic Smart Visual Information Fusion Model
- WANG Hezheng, MA Boyuan, LI Xiaorui, GUO Lijie, LIU Guangsheng
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 83-89.
- Abstract ( 107 ) 　　
- To counter existing limitations in paste concentration monitoring for the paste backfilling technique, such as insufficient accuracy, short device lifetime, prolonged detection time, and restrictions due to safety issues, a two-stream visual feature fusion model for automatic soft measurement of paste concentration is proposed, reducing the need for manual participation, increasing automation, and furthering the application of holistic artificial intelligence in the field of smart mining. The model, based on a convolutional neural network, adopts a two-stream architecture that analyzes the paste video and the corresponding optical flow information, extracts spatial and temporal features from the input, and produces a two-stream feature representation. The feature fusion module further enhances the representation of effective features, enabling the model to accurately measure paste concentration through a non-contact method. In addition, video data of paste in a production environment is collected to construct a dataset, on which the proposed model is evaluated. The experimental results show that the proposed model reaches an accuracy of 94.16%, surpassing other deep learning methods by 3.47% under the same conditions and fulfilling the need for accurate real-time paste concentration detection in a production environment.
- Physics-Informed Neural Differential Equation Model
- CHEN Haowei, GUO Yu, YUAN Zhaolin, WANG Baojie, BAN Xiaojuan
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 90-97.
- Abstract ( 147 ) 　　
- In process industries, the coupling of multiple complex devices makes it challenging for independent device models to effectively guide actual production. Purely data-driven models often face out-of-distribution generalization issues, resulting in poor data efficiency and generalization capabilities. In response, a physics-informed neural differential equation model is proposed for flotation, a typical process industry system. The model considers the coupling relationships between devices and global characteristics, utilizing physical priors to reconstruct neural differential equations to model an environment-aware single intelligent agent. The proposed model consists of a sequence encoder, an interpolation module, a neural differential equation inference module, and a state decoder. The gradient network computational graph structure of the neural differential equations is designed based on physical priors. By establishing different systems according to the actual process topology, the multi-agent model can achieve long-term liquid level prediction for the entire flotation process and assist in multi-agent collaborative control as an online simulation environment. The model was validated using an industrial dataset collected from a flotation plant. The results show that the proposed model demonstrates superior data efficiency and generalization capability compared with discrete-time models and with baseline models that do not leverage physical information to reconstruct the gradient network.
- Construction and Application of Holistic Artificial Intelligence System for Medical Large Language Models
- LUO Yan, LIU Yuyang, LI Xiaoying, LIU Hui
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 98-104.
- Abstract ( 232 ) 　　
- Large language models (LLMs) are recognized for their powerful self-learning and understanding skills, as well as their huge development potential and application value in the medical domain. However, current LLMs in the medical field are marked by an urgent demand for large amounts of pre-training data, high computing power costs, and a lack of unified standards and indicator systems, which greatly limits their expansion and application. To address these issues, a systematic large-scale modeling framework is proposed for the whole medical process service scenario. Knowledge factorization and dynamic resource management methods are utilized in this framework to achieve model simplification and native network construction, enabling elastic deployment and flexible configuration of the models. This approach reduces the reliance on computational resources to some extent. Furthermore, blockchain technology is incorporated into the framework to ensure the security and trustworthiness of medical data. By introducing the concept of holistic artificial intelligence, a holistic artificial intelligence system framework for the medical field is constructed, aiming to promote the rapid implementation and sustained, healthy development of medical LLMs.
- Improved Byzantine Fault Tolerant Consensus Algorithm Based on Sharded DAG Blockchain
- LI Xiaohui, LIU Xiaowei, LYU Siting
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 105-110.
- Abstract ( 180 ) 　　
- In the context of the Internet of Things, traditional blockchain faces challenges such as insufficient scalability, high costs, and low block generation efficiency. Introducing a directed acyclic graph (DAG) structure can effectively enhance the concurrency of the blockchain system, but it also brings problems such as heavy network load and difficulty in achieving consistency. To address these issues, a DAG blockchain model combined with a network sharding scheme is designed. Based on this, an improved Byzantine fault tolerance consensus algorithm is proposed. The proposed algorithm divides the nodes in the network into several groups through a community discovery mechanism. In each group, candidate nodes are selected through a trust scoring mechanism. Subsequently, a verifiable random function is employed to select primary nodes, and the consensus process is then enhanced based on an aggregated signature scheme. Simulation results demonstrate that the proposed algorithm can reduce transaction latency and effectively increase system throughput.
- The Power Model of Data Center Server Based on Temporal Convolutional Network
- ZHOU Zhou, ZHU Dan, LI Chuang, NAN Suqin, WEN Yanhua
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 111-116.
- Abstract ( 167 ) 　　
- To solve the problem of real-time server energy consumption estimation, a data center server energy consumption prediction method based on a temporal convolutional network is proposed. First, the workloads handled by the server are divided into four categories, namely central processing unit-intensive workload, memory-intensive workload, input/output-intensive workload, and hybrid workload. Then, the method calculates the importance of characteristic parameters under the four workloads through a random forest algorithm and selects representative parameters whose importance exceeds the threshold as the input of the model. Finally, the temporal convolutional network is used to build the energy consumption prediction model of the data center server. The experimental results show that, compared with other models, the average relative error of the proposed model is reduced by 2.18% to 5.29%, giving it certain advantages in the accuracy of energy consumption prediction.
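Two pieces of this pipeline are simple enough to sketch: keeping only the parameters whose importance exceeds a threshold, and computing the average relative error of a prediction. The importance scores, threshold, and power readings below are invented for illustration (in the paper the scores come from a random forest), and the error is assumed to follow the usual definition, the mean of |predicted − actual| / |actual|.

```python
from typing import Dict, List, Sequence


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Names of features whose importance exceeds the threshold, most important first."""
    kept = [name for name, score in importances.items() if score > threshold]
    return sorted(kept, key=lambda name: -importances[name])


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual| / |actual| over all samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical importance scores for one workload category.
importances = {"cpu_util": 0.45, "mem_util": 0.30, "disk_io": 0.15, "net_io": 0.10}
selected = select_features(importances, threshold=0.12)

# Hypothetical measured and predicted server power, in watts.
error = mean_relative_error([100.0, 200.0], [110.0, 190.0])
```

With these made-up numbers, `net_io` is dropped and the average relative error is 7.5%.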
- Anime Image Style Transfer Algorithm Based on Improved Generative Adversarial Networks
- LI Yunhong, ZHU Jingkun, LIU Xingrui, CHEN Jinni, SU Xueping
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 117-123.
- Abstract ( 221 ) 　　
- An anime style transfer algorithm based on improved generative adversarial networks is proposed to address the issues of missing detail structure, color shifting, and semantic content artifacts. First, a feature transformation module is constructed by combining channel shuffle operations with improved inverted residual blocks to enhance the local feature attributes of the image, and an efficient attention mechanism is incorporated to further improve the style feature representation capability. Second, the style loss function is modified to suppress the influence of brightness and color variations on high-frequency texture learning. Finally, content images containing random noise are fed into the generator, and a spectral normalization constraint is applied to the discriminator network to address the issue of mode collapse. The experimental results demonstrate that the images generated by the proposed method are richer in detail than those of other algorithms and effectively avoid artifacts and color shifting, giving the generated images a greater sense of realism and achieving style Fréchet inception distances of 154.61 and 115.64, respectively.
- Dynamic Network Slicing Resource Deployment Algorithm in Vehicular Networks
- LI Xiaohui, ZHOU Yuanyuan, LYU Siting, SU Jianan
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 124-129.
- Abstract ( 254 ) 　　
- To address the complex, rapidly changing topology caused by fast-moving vehicles in vehicular networks, a dynamic network slicing resource deployment algorithm based on deep reinforcement learning is proposed. In the vehicle-to-infrastructure communication scenario, given the changing vehicle topology and business requests, the slicing resource deployment problem is modeled as an observed Markov decision model, and the joint controller is used to monitor the network status in real time. The parameters are updated in real time according to the value of the actions in the distribution ratio of slicing resources, and a prioritized experience replay strategy is introduced to accelerate convergence, providing sufficient communication resources for each service request to interact with vehicle speed and location information. Simulation results indicate that, compared with other algorithms, the proposed algorithm demonstrates better performance in end-to-end throughput, end-to-end latency, slice packet loss rate, and vehicle service request acceptance rate.
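The prioritized experience replay strategy mentioned above can be illustrated with the commonly used proportional variant, in which a stored transition is sampled with probability proportional to its priority raised to a power α, and importance-sampling weights with exponent β correct the resulting bias. The abstract does not specify which variant or hyperparameters the authors use, so the function names, α, β, and the priorities below are assumptions made for this sketch.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sampling_probabilities(priorities: Sequence[float], alpha: float = 0.6) -> List[float]:
    """P(i) = p_i^alpha / sum_k p_k^alpha; priorities must be positive."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_batch(
    priorities: Sequence[float],
    batch_size: int,
    alpha: float = 0.6,
    beta: float = 0.4,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[float]]:
    """Sample transition indices and their normalized importance weights."""
    rng = rng or random.Random()
    probs = sampling_probabilities(priorities, alpha)
    n = len(probs)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # w_i = (N * P(i))^(-beta), scaled by the largest weight in the batch.
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_weight = max(weights)
    return indices, [w / max_weight for w in weights]


# Hypothetical priorities (e.g. absolute TD errors) for three stored transitions.
priorities = [1.0, 1.0, 2.0]
probs = sampling_probabilities(priorities, alpha=1.0)
indices, weights = sample_batch(priorities, batch_size=4, rng=random.Random(0))
```

With α = 1 the third transition is sampled twice as often as each of the others; smaller α values move the sampling toward uniform.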
- Fine-Grained Image Classification Based on Multi-Modal Features and Enhanced Alignment
- HAN Jing, ZHANG Tianpeng, LYU Xueqiang
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 130-135.
- Abstract ( 486 ) 　　
- Addressing the limitations of existing models in multimodal information processing, such as inadequate feature extraction and insufficient information interaction, a fine-grained image classification model is proposed, incorporating multi-modal features and enhanced alignment. A hierarchical feature adaptive fusion module is proposed to achieve multi-level adaptive fusion of multi-modal features, fully utilizing feature information of the convolutional intermediate layer and enhancing the model's ability to perceive local details of the image. Additionally, an enhanced aligned feature fusion module is proposed to improve the interaction dimension between multimodal features and make full use of the mapping relationship between different modalities. Experimental results show that the proposed model achieves excellent recognition performance on several public datasets, outperforming previous multimodal feature fusion models. Furthermore, through comparative analysis in ablation experiments, the results of individual modules are better than the original model, highlighting the effectiveness of the proposed model.
- A Symbol Detection Algorithm for Cooperative MIMO-NOMA Systems
- XIE Wenwu, LI Pan, XIAO Jian, WANG Ji, YANG Liang
- Journal of Beijing University of Posts and Telecommunications. 2024, 47(4): 136-142.
- Abstract ( 198 ) 　　
- Because of the power allocation and superposition coding at the transmitter side, a power-domain non-orthogonal multiple access (NOMA) symbol detection algorithm based on a single-task neural network is not compatible with the symbol detection tasks of different users. A symbol detection algorithm based on a multi-task neural network is therefore designed for the user-assisted cooperative multiple-input multiple-output (MIMO)-NOMA communication system, which can learn the deep shared features of the data and detect the symbols of different users simultaneously. In cooperative communication, the signal data received by different users follow different distributions, and there is a data island problem. However, machine learning models require the training data and the test data to be independent and identically distributed. Therefore, a multi-task federated learning framework is proposed to address this problem. The experimental results show that, as the signal-to-noise ratio (SNR) increases, the proposed symbol detection algorithm performs better than the traditional symbol detection algorithm.