Project Details

VulnSneak is an AI-powered cybersecurity project built to detect security vulnerabilities in source code using transformer-based deep learning. The system is based on microsoft/codebert-base, a pretrained model designed for understanding programming languages and code structure.
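As a rough illustration of building on this checkpoint, the sketch below loads microsoft/codebert-base with a sequence-classification head through Hugging Face Transformers. The helper names `load_classifier` and `predict_label`, the 512-token truncation, and the label counts passed in are assumptions made for illustration, not details taken from the project's notebooks. A freshly loaded head is randomly initialized, so its predictions mean nothing until the model has been fine-tuned.

```python
CHECKPOINT = "microsoft/codebert-base"
NUM_BINARY_LABELS = 2   # Safe / Vulnerable
NUM_FAMILY_LABELS = 8   # the eight vulnerability families


def load_classifier(num_labels: int, checkpoint: str = CHECKPOINT):
    """Load the tokenizer and a classification model for the given checkpoint."""
    # Imported inside the function so this file can be read and imported
    # even where the heavy dependencies are not installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model


def predict_label(code: str, tokenizer, model) -> int:
    """Return the index of the highest-scoring class for one code snippet."""
    import torch

    # Assumption: inputs longer than 512 tokens are simply truncated.
    inputs = tokenizer(code, truncation=True, max_length=512, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())
```

Under these assumptions, `load_classifier(NUM_BINARY_LABELS)` would give the starting point for the binary model and `load_classifier(NUM_FAMILY_LABELS)` the starting point for the family model.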

The project follows a two-stage detection pipeline. In the first stage, a binary classification model analyzes the input source code and predicts whether it is Safe or Vulnerable. In the second stage, which runs only when the code is predicted Vulnerable, a family classification model assigns it to one of eight vulnerability families:

CSRF

Insecure Cryptography

Insecure Deserialization

OS Command Injection

Path Traversal

SQL Injection

XML Injection

XSS
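The control flow of the two stages can be sketched as follows. This is a minimal sketch, not the project's code: the function name `analyze`, the two callables standing in for the fine-tuned models, and the mapping from class index to family name (taken here in the order listed above) are all assumptions.

```python
from typing import Callable, Optional, Tuple

# Assumption: class indices follow the order in which the families are listed.
FAMILIES = [
    "CSRF",
    "Insecure Cryptography",
    "Insecure Deserialization",
    "OS Command Injection",
    "Path Traversal",
    "SQL Injection",
    "XML Injection",
    "XSS",
]


def analyze(
    code: str,
    is_vulnerable: Callable[[str], bool],
    classify_family: Callable[[str], int],
) -> Tuple[str, Optional[str]]:
    """Run stage 1, and run stage 2 only when stage 1 flags the code."""
    if not is_vulnerable(code):
        return "Safe", None
    family_index = classify_family(code)
    return "Vulnerable", FAMILIES[family_index]


# Hypothetical stand-ins for the two fine-tuned models, for demonstration only.
def stub_is_vulnerable(code: str) -> bool:
    return "os.system(" in code


def stub_classify_family(code: str) -> int:
    return FAMILIES.index("OS Command Injection")


if __name__ == "__main__":
    print(analyze("print('hello')", stub_is_vulnerable, stub_classify_family))
    print(analyze("os.system(user_input)", stub_is_vulnerable, stub_classify_family))
```

Passing the models in as callables keeps the second stage from ever running on code that the first stage considers safe, which matches the description above.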

The binary model was trained on a balanced dataset of Safe and Vulnerable code samples. The family classification model was trained on vulnerable samples only, so that it learns to assign each one to its vulnerability family. Both models were fine-tuned from CodeBERT, with a toolchain of Hugging Face Transformers, PyTorch, and scikit-learn.
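One way to derive the two training sets from a single labeled corpus is sketched below. The record format (a code string paired with a family name, or `None` for safe code) and the function names are hypothetical; the project's actual dataset layout is not described here.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Record = Tuple[str, Optional[str]]  # (source code, family name or None if safe)


def build_training_sets(records: List[Record]):
    """Split one corpus into a binary set and a vulnerable-only family set."""
    # Stage 1 data: every sample, labeled 0 (Safe) or 1 (Vulnerable).
    binary = [(code, 0 if family is None else 1) for code, family in records]

    # Stage 2 data: vulnerable samples only, labeled with a family index.
    family_names = sorted({family for _, family in records if family is not None})
    family_to_id: Dict[str, int] = {name: i for i, name in enumerate(family_names)}
    family = [
        (code, family_to_id[name]) for code, name in records if name is not None
    ]
    return binary, family, family_to_id


def is_balanced(binary: List[Tuple[str, int]]) -> bool:
    """Check that Safe and Vulnerable samples appear in equal numbers."""
    counts = Counter(label for _, label in binary)
    return counts[0] == counts[1]
```

For example, a corpus of two safe snippets, one SQL Injection snippet, and one XSS snippet would yield four binary samples (balanced) and two family samples.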

The project includes complete training notebooks, dataset loading, tokenization, model fine-tuning, evaluation metrics, confusion matrix analysis, per-family performance analysis, and saved model outputs.

The binary model achieved the following results:

Accuracy: 98.33%

Vulnerable Recall: 97.75%

Vulnerable F1 Score: 98.32%
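For readers unfamiliar with these metrics, the sketch below shows how accuracy, recall, and F1 for the Vulnerable class follow from confusion-matrix counts. The function name and the counts in the usage note are illustrative assumptions, not the project's evaluation data.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute metrics for the positive (Vulnerable) class from raw counts.

    tp: vulnerable samples predicted Vulnerable
    fp: safe samples predicted Vulnerable
    tn: safe samples predicted Safe
    fn: vulnerable samples predicted Safe
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }
```

With made-up counts such as `binary_metrics(tp=90, fp=5, tn=95, fn=10)`, recall is 0.90 and accuracy is 0.925. Recall is the metric to watch in this setting, because a false negative is vulnerable code that slips through as Safe.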

The family classification model achieved the following multi-class results:

Accuracy: 99.81%

Macro F1 Score: 99.80%

Weighted F1 Score: 99.81%
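The difference between the macro and weighted F1 scores can be sketched in plain Python. Macro F1 averages the per-family F1 scores equally, while weighted F1 weights each family by its number of true samples. The function names and the toy labels in the usage note are assumptions for illustration; this mirrors the usual definitions rather than the project's evaluation code.

```python
from collections import Counter
from typing import Dict, List, Tuple


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Compute the F1 score of each class seen in the labels or predictions."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_and_weighted_f1(y_true: List[str], y_pred: List[str]) -> Tuple[float, float]:
    """Return (macro F1, weighted F1) for a multi-class prediction."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)  # number of true samples per class
    macro = sum(scores.values()) / len(scores)
    weighted = sum(scores[label] * support[label] for label in scores) / len(y_true)
    return macro, weighted
```

For a toy case with four true "SQL Injection" samples and two true "XSS" samples, where one SQL Injection sample is mislabeled as XSS, the two averages differ because the larger family counts for more in the weighted score. When the two scores are nearly identical, as in the figures above, performance is similar across families regardless of their size.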

This project demonstrates the ability to apply AI and deep learning techniques to real cybersecurity problems, especially automated secure code review, vulnerability detection, and vulnerability family diagnosis.
