Adaptive Security Research

Adaptive
Cyber-Physical
Security

Combining supervised classification and anomaly detection to defend against known and zero-day threats using the CIC-IDS-2018 benchmark.

Intrusion Detection · Zero-Day Detection · CIC-IDS-2018
Pugazhendhi J & Dally R
Department of CS and AI · Rishihood University
01 — The Problem

Cyber-physical systems can't
detect what they've never seen.

Traditional intrusion detection matches known attack "fingerprints." When a new, never-before-seen threat appears — a zero-day attack — these systems stay silent. And the datasets used to train them are decades old.

🔒
Signature-Based IDS
Matches known fingerprints. Fast, but blind to novel attacks.
📊
Anomaly-Based IDS
Learns "normal" behavior. Can catch zero-days, but too many false positives.
Outdated Benchmarks
NSL-KDD is from 1999. No DDoS, no botnets, only 125K records.
📄
"NSL-KDD is an old data set that does not represent modern network traffic." — Guo, "ML-based Zero-Day Attack Detection: Challenges and Future Directions," Computer Communications, 2023, p. 3
02 — Dataset Choice

From KDD to CIC-IDS-2018 —
a modern solution.

Researchers agree: legacy datasets fail modern IDS research. CIC-IDS-2018 was designed with real traffic profiles and 14+ modern attack families — the gold standard.

| Criteria | NSL-KDD (1999) ✗ | CIC-IDS-2018 ✓ |
|---|---|---|
| Attack Types | 4 generic categories | 14+ realistic attack families |
| Traffic Realism | Simulated lab traffic | Real-world profiled flows |
| Scale | ~125K records | 8.2M+ flow records |
| Modern Attacks | No DDoS, no botnets | DDoS, Botnets, Brute-Force, Web Attacks |
8.2M
Flow Records
14
Attack Types
81
Features
2018
Year Released
📄
"CSE-CIC-IDS2018 contains more recent network traffic than NSL-KDD…covering 14 different intrusion types from six major categories." — Guo, "ML-based Zero-Day Attack Detection," Computer Communications, 2023, pp. 2–4
03 — Literature Review

What the research says —
and where the gap is.

🧠
ML in Cybersecurity
Supervised models like Random Forest achieve near-perfect accuracy on known attacks. But "known" is the key word — they memorize, not generalize.
📄
"Decision tree model was best; TPR up to 96%, FPR at 5%… varying accuracy depending on attack type." — Guo, 2023, p. 9, Table 1
🔮
Anomaly Detection
One-Class SVMs frame detection as boundary estimation — no attack labels needed. But recall varies wildly across attack types.
📄
"One-class SVM: Recall varies from 27% to 99%… inconsistency against different attacks." — Guo, 2023, p. 9, Table 1
Model Comparison from Literature
Adapted from Guo (2023), Table 1, p. 9
| Method | ML Model | Training Data | Results | Challenge |
|---|---|---|---|---|
| Outlier Detector | One-Class SVM | CIC-IDS2017, NSL-KDD | Recall: 27%–99% | Inconsistent across attacks |
| 6-Detector Comparison | RF, MLP, KNN | CSE-CIC-IDS2018 | TPR up to 96% | Varying by attack type |
| Hybrid Method | RF & SVM | Private dataset | 88.54% zero-day detected | Private data, hard to compare |
| Transfer Learning | SVM, KNN, DT | NSL-KDD | 70% acc, 0.75 F1 | Low accuracy, limited tests |
⚠️
The Gap
Most papers evaluate on random train/test splits — leaking future attack knowledge into training. When models face true zero-day threats, recall collapses. There's no structured framework that tests both known-attack classification and true novelty detection together.
04 — EDA & Preprocessing

Understanding the data
before modeling.

8.2 million network flows — 74% benign, 26% attacks across 14 categories. Raw data contains infinite values, high correlations, and identifier columns that leak topology.

Label Distribution
Binary Label Distribution — Benign vs Attack
Correlation Heatmap
Correlation Heatmap — Dense redundancy blocks
Feature Histograms
Feature Histograms — Heavy-tailed, skewed, full of outliers
Preprocessing Steps
1. Dropped identifiers (Flow ID, IPs, Timestamp)  →  2. Replaced inf/-inf with NaN  →  3. Dropped columns with >50% missing values  →  4. Filled remaining NaN with 0  →  5. Normalized labels to lowercase
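The five steps above can be sketched in pandas. The column names (Flow ID, Label, etc.) follow the CIC-IDS-2018 CSV headers, and the tiny synthetic frame stands in for the real 8.2M-flow data:

```python
import numpy as np
import pandas as pd

# Identifier columns that leak topology; names follow the CIC-IDS-2018 CSVs.
ID_COLS = ["Flow ID", "Src IP", "Dst IP", "Timestamp"]

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the five cleaning steps from the EDA stage."""
    # 1. Drop identifier columns.
    df = df.drop(columns=[c for c in ID_COLS if c in df.columns])
    # 2. Replace +/-inf with NaN so they can be handled uniformly.
    df = df.replace([np.inf, -np.inf], np.nan)
    # 3. Drop columns with more than 50% missing values.
    df = df.loc[:, df.isna().mean() <= 0.5]
    # 4. Fill remaining NaN with 0.
    df = df.fillna(0)
    # 5. Normalize labels to lowercase.
    df["Label"] = df["Label"].str.lower()
    return df

# Toy stand-in for the full dataset.
raw = pd.DataFrame({
    "Flow ID": ["a", "b", "c", "d"],
    "Flow Duration": [1.0, np.inf, 3.0, 4.0],
    "Mostly Missing": [np.nan, np.nan, np.nan, 7.0],
    "Label": ["Benign", "DoS-Hulk", "Benign", "Bot"],
})
clean = preprocess(raw)
```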
05 — Feature Engineering

From noisy data to clean signal,
ready for modeling.

The Problem
81 raw features full of infinite values, zero-variance columns, and highly correlated pairs (ρ > 0.98) that inflate noise and distort models.
The Solution
Drop identifiers → remove constant features → prune correlated pairs (threshold = 0.98) → clean, 52-feature dataset ready for classification.
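The constant-feature removal and correlation pruning can be sketched as follows; the helper name `engineer_features` and the toy frame are illustrative, with the 0.98 threshold taken from the pipeline above:

```python
import numpy as np
import pandas as pd

def engineer_features(X: pd.DataFrame, corr_threshold: float = 0.98) -> pd.DataFrame:
    """Drop zero-variance columns, then one column of each highly correlated pair."""
    # Constant features carry no signal.
    X = X.loc[:, X.nunique() > 1]
    # Upper triangle of |corr| avoids counting each pair twice.
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > corr_threshold).any()]
    return X.drop(columns=to_drop)

# Toy frame: "b" is perfectly correlated with "a"; "const" never varies.
X = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 1.0, 3.0, 2.0],
    "const": [5.0, 5.0, 5.0, 5.0],
})
X_clean = engineer_features(X)  # keeps only "a" and "c"
```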
Clean Correlation
Post-Engineering — Redundancy blocks removed
Clean Histograms
Cleaned Distributions — Ready for modeling
81→52
Features Reduced
0.98
Correlation Threshold
0
Inf / NaN Remaining
06 — Model Selection

Two models, two complementary roles.

One classifies known attacks. The other detects anything abnormal. Together, they cover known threats and zero-day unknowns.

🌲
Random Forest
Role: Supervised classification of known attacks.
How: 100 decision trees, each trained on a bootstrap sample. Majority vote decides the class. No scaling needed.
Supervised 100 Trees Known Attacks
📄
Training: Benign + DoS-Hulk (80/20 split)
Testing: Held-out Benign + DoS-Hulk + Bot, SlowHTTPTest (zero-day)
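Assuming scikit-learn (the library is not named above), the Random Forest setup can be sketched like this; the synthetic Gaussian flows stand in for the 52 engineered CIC-IDS-2018 features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for benign vs DoS-Hulk flows.
benign = rng.normal(0.0, 1.0, size=(1000, 8))
dos_hulk = rng.normal(3.0, 1.0, size=(1000, 8))
X = np.vstack([benign, dos_hulk])
y = np.array([0] * 1000 + [1] * 1000)  # 0 = benign, 1 = dos-hulk

# 80/20 split, as in the training setup above.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 100 trees, each fit on a bootstrap sample; majority vote decides the class.
# No feature scaling needed for tree ensembles.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_tr, y_tr)
acc = rf.score(X_te, y_te)
```

On real traffic the held-out set would also contain Bot and SlowHTTPTest flows, which this model has no class for — the zero-day blind spot measured in Section 07.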
🔮
One-Class SVM
Role: Anomaly detection — trained on benign traffic only.
How: Learns the boundary of "normal" using an RBF kernel. Anything outside is flagged. No attack labels needed.
Semi-Supervised RBF Kernel Zero-Day
📄
Training: Benign only (20K samples, StandardScaler)
Testing: Held-out Benign + ALL attacks (everything is zero-day)
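A comparable scikit-learn sketch for the OCSVM stage; `nu=0.05` is an assumed hyperparameter, and the Gaussian data stands in for scaled benign flows:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(7)

# Train on benign-like traffic only — no attack labels needed.
benign_train = rng.normal(0.0, 1.0, size=(2000, 8))

# StandardScaler matters: the RBF kernel is distance-based.
ocsvm = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", nu=0.05, gamma="scale"),
)
ocsvm.fit(benign_train)

# At test time every attack is effectively zero-day.
benign_test = rng.normal(0.0, 1.0, size=(500, 8))
attack_test = rng.normal(4.0, 1.0, size=(500, 8))

# predict() returns +1 for inliers (normal) and -1 for flagged outliers.
attack_flag_rate = np.mean(ocsvm.predict(attack_test) == -1)
benign_pass_rate = np.mean(ocsvm.predict(benign_test) == 1)
```

`nu` roughly caps the fraction of training points treated as outliers, which is why some benign traffic is always flagged — the false-alarm trade-off noted in Section 07.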
📄
"Hybrid method using Random Forest & SVM detected 88.54% of zero-day attacks… combining supervised classifiers with anomaly detectors provides both precision on known threats and resilience to novel attacks." — Guo, 2023, p. 9, Table 1
07 — Results & Summary

What the models revealed.

RF dominates known attacks. OCSVM catches what it can't. Our Hybrid approach combines both for adaptive defense.
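The exact combination rule is not spelled out here; one plausible reading is an OR-rule, where either detector can raise an alert:

```python
import numpy as np

def hybrid_predict(rf_pred: np.ndarray, ocsvm_pred: np.ndarray) -> np.ndarray:
    """OR-combination (an assumed rule): alert if either detector fires.

    rf_pred:    1 = known attack class, 0 = benign (Random Forest)
    ocsvm_pred: -1 = anomaly, +1 = inlier (One-Class SVM convention)
    Returns 1 (attack) / 0 (benign).
    """
    return ((rf_pred == 1) | (ocsvm_pred == -1)).astype(int)

rf_pred = np.array([1, 0, 0, 0])       # RF only recognises the known attack
ocsvm_pred = np.array([1, -1, 1, -1])  # OCSVM flags two unfamiliar flows
alerts = hybrid_predict(rf_pred, ocsvm_pred)
# alerts -> [1, 1, 0, 1]
```

An OR-rule inherits RF's hits on known attacks and OCSVM's hits on novelties, at the cost of also inheriting OCSVM's false alarms — consistent with the precision/recall trade-off in the tables below.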

Random Forest
RF Confusion Matrix
Precise but blind to zero-day
One-Class SVM
OCSVM Confusion Matrix
Detects novelty, high FP
Hybrid Model
Hybrid Confusion Matrix
Balanced adaptive defense
Random Forest
Perfect precision on DoS-Hulk, but zero zero-day recall.
1.0000
Precision
0.1782
Recall
0.0000
Z-Recall
0.3025
F1
One-Class SVM
Catches novel behavior, but trades off precision (more false alarms).
0.8557
Precision
0.1667
Recall
0.1667
Z-Recall
0.2791
F1
Hybrid Model
Balances known precision with unseen detection. Best F1-Score.
0.7609
Precision
0.3973
Recall
0.2663
Z-Recall
0.5220
F1

🛡️ Key Insights from the Data

  • Supervised Models fail on Novelty: Random Forest memorizes known attacks flawlessly (Precision: 1.0000) but is completely blind to novel threats (Zero-Day Recall: 0.0000).
  • Anomaly Detectors trade Precision for Discovery: One-Class SVM successfully catches brand-new zero-day attacks, but it produces more false alarms in the process.
  • The Hybrid Advantage: Routing traffic through both detectors lifts F1 from 0.3025 (RF alone) to 0.5220 and restores visibility into novel threats (Zero-Day Recall: 0.2663 vs RF's 0.0000).
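Zero-Day Recall, as used in the tables above, is recall restricted to attack families withheld from training (e.g. Bot and SlowHTTPTest in the RF experiment). A minimal sketch — the helper name `zero_day_recall` is ours:

```python
import numpy as np

def zero_day_recall(y_true, y_pred, is_zero_day):
    """Recall over attacks whose family was absent from training.

    y_true: 1 = attack, 0 = benign; is_zero_day: True for held-out families.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    mask = np.asarray(is_zero_day) & (y_true == 1)
    return float(np.mean(y_pred[mask] == 1)) if mask.any() else 0.0

# Four attacks, two from a never-seen family; one of those two is caught.
y_true = [1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
zd = [False, False, True, True, False]
# zero_day_recall(y_true, y_pred, zd) -> 0.5
```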
References

Guo, "ML-based Zero-Day Attack Detection: Challenges and Future Directions," Computer Communications, 2023.