Table 23 Summary of ML for payload-based anomaly detection

From: A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

| Ref. | ML Technique | Dataset | Features | Evaluation: Settings | Evaluation: Results |
|---|---|---|---|---|---|
| Zanero et al. [493] | Unsupervised; a two-tier SOM-based architecture (Offline) | Normal: KDD Cup [257]; Attack: Scans from Nessus [44] | Packet headers and payload | 2,000 training packets; 2,000 testing packets; 10x10 SOM trained for 10,000 epochs; Platform used: SOM toolbox [12] | Improves DR by 75% over 1-tiered SOM |
| Wang et al. [459] | Unsupervised; Centroid model (Offline) | KDD Cup [257] & CUCS | Payload of TCP traffic | 2 weeks training data; 3 weeks testing data; Inside network TCP data only; Incremental learning | DR w/ payload of a packet: 58.8%; DR w/ first 100 bytes of a packet: 56.7%; DR w/ last 100 bytes of a packet: 47.4%; DR w/ all payloads of a connection: 56.7%; DR w/ first 1000 bytes of a connection: 52.6%; Training time: 4.6-26.2 sec; Testing time: 1.6-16.1 sec |
| Perdisci et al. [356] | Supervised; Ensemble of single-class SVM (Offline) | Normal: KDD Cup [257]; Normal: GATECH; Attack: CLET [117]; Attack: PBA [149]; Generic [204] | Payload | 50% of dataset for training; 50% of dataset for testing; 11 OCSVM trained with 2_v-grams, v = 1...10; 5-fold cross validation on KDD Cup; 7-fold cross validation on GATECH; 2 GHz Dual Core AMD Opteron Processor and 8 GB RAM | Generic DR w/ FP 10^-5: 60%; Shell-code DR w/ FP 10^-5: 90%; CLET DR w/ FP 10^-5: 90%; Detection time KDD Cup: 10.92 ms; Detection time GATECH: 17.11 ms |
| Gornitz et al. [171] | Supervised; SVDD (Online) | Normal: from Fraunhofer Inst.; Attack: Metasploit | Payload | 2,500 training network events; 1,250 testing network events; Active Learning; Fraction of labeled data: 1.5% | DR: 96%; FP: 0.0015% |
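
To make the "centroid model" and "incremental learning" entries for Wang et al. [459] more concrete, the following is a minimal sketch of how a payload centroid could be learned and used for scoring. It is not the authors' implementation. It assumes 1-gram byte frequencies as the payload feature and a variance-normalized absolute distance with a small smoothing term; the class name `CentroidModel`, the helper `byte_histogram`, and the `smoothing` value are hypothetical choices made for illustration only.

```python
from collections import Counter
import math


def byte_histogram(payload: bytes) -> list[float]:
    """Relative frequency of each of the 256 byte values in one payload."""
    counts = Counter(payload)          # iterating bytes yields ints 0..255
    n = max(len(payload), 1)           # avoid division by zero on empty payloads
    return [counts.get(b, 0) / n for b in range(256)]


class CentroidModel:
    """Per-byte running mean and variance, updated one payload at a time.

    Assumption: the centroid is the mean byte-frequency vector, and the
    anomaly score is the sum of absolute deviations from that mean, each
    divided by (standard deviation + smoothing).
    """

    def __init__(self, smoothing: float = 0.001):
        self.n = 0
        self.mean = [0.0] * 256
        self.m2 = [0.0] * 256          # running sum of squared deviations
        self.smoothing = smoothing     # hypothetical value, keeps division finite

    def update(self, payload: bytes) -> None:
        """Incrementally fold one normal payload into the centroid (Welford update)."""
        hist = byte_histogram(payload)
        self.n += 1
        for i, x in enumerate(hist):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def score(self, payload: bytes) -> float:
        """Distance of a payload from the centroid; larger means more anomalous."""
        hist = byte_histogram(payload)
        total = 0.0
        for i, x in enumerate(hist):
            std = math.sqrt(self.m2[i] / self.n) if self.n else 0.0
            total += abs(x - self.mean[i]) / (std + self.smoothing)
        return total
```

In use, `update` would be called on payloads assumed to be normal, and a payload whose `score` exceeds a threshold chosen on held-out data would be flagged. The threshold, and any splitting of models by port or payload length, are left out of this sketch.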