Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184

Frontiers of Information Technology & Electronic Engineering, 2021, Vol. 22, Issue 9: 1153-1168   https://doi.org/10.1631/FITEE.2000286
A review on cyber security named entity recognition
Chen GAO1, Xuan ZHANG1,2,3, Mengting HAN1, Hui LIU1
1. School of Software, Yunnan University, Kunming 650091, China
2. Key Laboratory of Software Engineering of Yunnan Province, Kunming 650091, China
3. Engineering Research Center of Cyberspace, Kunming 650091, China
Abstract

With the rapid development of Internet technology and the advent of the big data era, a growing volume of cyber security text is available on the Internet. These texts cover not only security concepts, incidents, tools, guidelines, and policies, but also risk management approaches, best practices, assurances, technologies, and more. Integrating large-scale, heterogeneous, unstructured cyber security information and identifying and classifying the cyber security entities it contains can help address cyber security issues. Because texts in this domain are complex and diverse, traditional named entity recognition (NER) methods struggle to identify its security entities. This paper describes various approaches and techniques for NER in this domain, including rule-based, dictionary-based, and machine-learning-based approaches, and discusses the problems facing NER research in this domain, such as conjunction and disjunction of entity phrases, non-standardized naming conventions, abbreviations, and massive nesting. Finally, three future directions for NER in cyber security are proposed: (1) application of unsupervised or semi-supervised techniques; (2) development of a more comprehensive cyber security ontology; (3) development of a more comprehensive deep learning model.

Key words: Named entity recognition (NER); Information extraction; Cyber security; Machine learning; Deep learning
Received: 2020-06-13      Published: 2021-11-15
Corresponding author: Xuan ZHANG     E-mail: zhxuan@ynu.edu.cn
Cite this article:
Chen GAO, Xuan ZHANG, Mengting HAN, Hui LIU. A review on cyber security named entity recognition. Front. Inform. Technol. Electron. Eng, 2021, 22(9): 1153-1168.
Link to this article:
https://academic.hep.com.cn/fitee/CN/10.1631/FITEE.2000286
https://academic.hep.com.cn/fitee/CN/Y2021/V22/I9/1153
[1] FITEE-1153-20001-CG_suppl_1 Download
[2] FITEE-1153-20001-CG_suppl_2 Download