New Framework MultiKG Enhances Cyber Attack Technique Representation
Quick take - A recent study introduces MultiKG, a framework that represents cyber attack techniques by combining audit logs, static code analysis, and Cyber Threat Intelligence (CTI) reports. It targets the incomplete knowledge graphs that result from relying on CTI text alone, and its evaluation reports high accuracy in extracting and aggregating attack knowledge graphs for cybersecurity applications.
Fast Facts
- The MultiKG framework, developed by researchers from Zhejiang University, Zhejiang University of Technology, China Unicom, and Northwestern University, enhances the representation of cyber attack techniques by integrating audit logs and static code analysis with Cyber Threat Intelligence (CTI) reports.
- It addresses a limitation of previous methodologies, which relied solely on coarse-grained, unstructured CTI reports and therefore produced incomplete knowledge graphs.
- Utilizing a Large Language Model (LLM), MultiKG constructs and merges attack graphs from diverse sources, aiming to create a fine-grained, unified attack technique knowledge graph for improved cybersecurity applications.
- Evaluation of MultiKG demonstrated high accuracy in extracting and aggregating attack knowledge from 1,015 real attack techniques and 9,006 CTI entries, contributing to tasks like attack reconstruction and detection.
- The study highlights ongoing challenges in integrating multi-source threat knowledge and suggests future research should focus on detailed threat intelligence mappings for better attack representation.
Introduction to MultiKG
A recent study by Jian Wang, Tiantian Zhu, Chunlin Xiong, and Yan Chen introduces MultiKG, a framework for representing cyber attack techniques. The researchers are affiliated with Zhejiang University, Zhejiang University of Technology, China Unicom, and Northwestern University.
The study addresses the limitations of previous methodologies that primarily relied on textual data from Cyber Threat Intelligence (CTI) reports. These reports are often coarse-grained and unstructured, leading to incomplete and inaccurate knowledge graphs that do not fully capture the complexities of cyber threats.
Integration of Diverse Data Sources
To overcome these limitations, MultiKG integrates audit logs and static code analysis alongside CTI reports, which makes the resulting attack knowledge more fine-grained. The system uses a Large Language Model (LLM) to analyze, construct, and merge attack graphs derived from three sources: CTI reports, dynamic logs, and static code. The goal is a fine-grained, unified attack technique knowledge graph that can support multiple cybersecurity tasks.
The evaluation of MultiKG involved the analysis of 1,015 real attack techniques and 9,006 attack intelligence entries from actual CTI reports. The results demonstrated that MultiKG can accurately extract and aggregate attack knowledge graphs from diverse sources, contributing to downstream security tasks such as attack reconstruction and detection.
Despite its advancements, the study identifies ongoing challenges in collecting and summarizing multi-source threat knowledge to accurately represent complex attack variants. Integrating data from different sources and formats remains a critical concern for efficient representation of attack knowledge.
Contributions and Future Directions
MultiKG makes several contributions. It combines threat intelligence, static code analysis, and dynamic log analysis in a single framework for gathering threat knowledge. It implements algorithms for extracting attack technique graphs from audit logs and static code, and it uses LLMs to analyze entities and relationships in threat reports. It also provides algorithms for single-resource fusion and multi-resource merging, which together yield a cohesive knowledge representation.
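The two-stage idea of fusing within one source and then merging across sources can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes an attack graph is a set of (subject, relation, object) triples, uses a deliberately crude name normalization in place of LLM-based entity analysis, and every function name, source label, and sample entity below is hypothetical.

```python
from collections import defaultdict

# Assumption: an attack graph is a collection of (subject, relation, object) triples.
Triple = tuple[str, str, str]


def normalize(entity: str) -> str:
    """Crude stand-in for entity resolution: trim whitespace and lowercase."""
    return entity.strip().lower()


def fuse_single_source(triples: list[Triple]) -> set[Triple]:
    """Stage 1: collapse duplicate edges within one source after normalization."""
    return {(normalize(s), normalize(r), normalize(o)) for s, r, o in triples}


def merge_sources(graphs: dict[str, set[Triple]]) -> dict[Triple, set[str]]:
    """Stage 2: union edges across sources, recording which sources support each edge."""
    merged: dict[Triple, set[str]] = defaultdict(set)
    for source, triples in graphs.items():
        for triple in triples:
            merged[triple].add(source)
    return dict(merged)


# Hypothetical sample data for two of the three source types.
cti_triples = [
    ("PowerShell.exe", "spawns", "cmd.exe"),
    ("powershell.exe ", "spawns", "cmd.exe"),  # duplicate up to case and whitespace
]
log_triples = [
    ("powershell.exe", "spawns", "cmd.exe"),
    ("cmd.exe", "writes", "C:\\Temp\\out.txt"),
]

graphs = {
    "cti": fuse_single_source(cti_triples),
    "audit_log": fuse_single_source(log_triples),
}
unified = merge_sources(graphs)

for edge, sources in sorted(unified.items()):
    print(edge, sorted(sources))
```

Keeping the supporting sources on each edge is one possible design choice: an edge seen in both a CTI report and an audit log can be treated as better corroborated than an edge seen in only one.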
The paper critiques existing tools in the cybersecurity landscape, noting that many do not consider the correlation between security threat intelligence and real attack logs, which reduces their effectiveness in detecting attacks. Some tools focus narrowly on single-entity information and consequently lack the necessary structural and semantic context.
Looking ahead, the authors suggest that future research should continue to integrate multiple information sources to build detailed and comprehensive threat intelligence mappings.
For data collection, the study used Atomic Red Team for security testing and an automated data collector based on ETW (Event Tracing for Windows) to capture audit logs. PowerShell tools executed the attack scripts in a controlled environment and collected the corresponding logs.
MultiKG’s architecture is structured around two main subsystems: single-source threat knowledge analysis and multi-source attack knowledge graph analysis. The construction of knowledge graphs entails extracting attack data from various sources, analyzing the information, and creating a structured representation of attack techniques.
The evaluation compared MultiKG’s output against manually annotated ground truth data to assess the framework’s accuracy. The results indicate that MultiKG extracts attack technique information with high accuracy and aggregates technique-level knowledge effectively. Real-world case studies demonstrate its utility in attack reconstruction and detection.
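A comparison against annotated ground truth can be sketched as follows. The article does not say which metrics the study reports, so this example assumes a common choice: precision, recall, and F1 computed over sets of graph edges. The function name and the sample triples are hypothetical and are not data from the study.

```python
def precision_recall_f1(predicted: set, truth: set) -> tuple[float, float, float]:
    """Score a predicted edge set against a manually annotated edge set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotated edges for one attack technique.
truth = {
    ("powershell.exe", "spawns", "cmd.exe"),
    ("cmd.exe", "writes", "out.txt"),
    ("cmd.exe", "connects", "10.0.0.5"),
}
# Hypothetical extracted edges: two correct, one spurious, one missed.
predicted = {
    ("powershell.exe", "spawns", "cmd.exe"),
    ("cmd.exe", "writes", "out.txt"),
    ("cmd.exe", "reads", "hosts"),
}

precision, recall, f1 = precision_recall_f1(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Scoring whole edges rather than individual entities penalizes an extraction that finds the right processes and files but links them with the wrong relationship.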
The paper also discusses related work, emphasizing the role of cyber threat intelligence and the limitations of existing methods in accurately representing attack techniques.