Hadeer Ahmed
- BSc (University of Ahram Canadian, 2012)
- MSc (University of Victoria, 2018)
Topic
Enhancing Cybersecurity Text Classification via AMR based Augmentation and Drift Simulation with Reinforcement Learning
Department of Electrical and Computer Engineering
Date & location
- Tuesday, September 9, 2025
- 9:00 A.M.
- Virtual Defence
Examining Committee
Supervisory Committee
- Dr. Issa Traore, Department of Electrical and Computer Engineering, UVic (Supervisor)
- Dr. Lin Cai, Department of Electrical and Computer Engineering, UVic (Member)
- Dr. Sean Chester, Department of Computer Science, UVic (Outside Member)
External Examiner
- Dr. Steven Honghui Ding, School of Information Studies, McGill University
Chair of Oral Examination
- Dr. Marc Klimstra, School of Exercise Science, Physical and Health Education, UVic
Abstract
Natural language processing shows promise for cybersecurity applications but faces two persistent challenges. First, obtaining high-quality labeled cybersecurity text data is difficult since organizations rarely share sensitive incident reports or vulnerability descriptions due to confidentiality concerns. Second, the rapidly evolving nature of cyber threats creates continuous drift in linguistic patterns, causing model performance to degrade as new attack vectors emerge. Traditional text augmentation approaches provide limited relief from data scarcity, typically producing shallow modifications that compromise semantic integrity or fail to preserve domain-specific terminology critical in cybersecurity contexts. Existing drift-handling methods rely on generic approaches that inadequately capture the structured evolution of datasets or depend on numeric representations of the text data. Most augmentation techniques also lack the transparency and interpretability required for security applications.
This dissertation addresses these limitations through two complementary frameworks. The first, AMR-CLONALG, combines Abstract Meaning Representation graphs with a clonal selection algorithm to generate semantically faithful text variations while maintaining precise control over domain-specific modifications. This approach leverages AMR’s structural properties to preserve meaning while introducing controlled lexical and syntactic diversity, enabling effective augmentation from limited data. The second framework, DriftRL, employs reinforcement learning to simulate realistic patterns of textual drift over time, supporting sudden, gradual, incremental-step, and recurring drift scenarios.
Together, these frameworks provide a comprehensive approach to enhancing cybersecurity text classification. AMR-CLONALG enables organizations to develop robust models from limited proprietary data while maintaining semantic fidelity, and DriftRL allows researchers to proactively test model resilience against evolving data. Both frameworks prioritize transparency and interpretability, offering the detailed traceability essential for security-critical applications.