Ref. | ML Technique | Dataset | Features | Classes | Evaluation: Settings | Evaluation: Results |
---|---|---|---|---|---|---|
Bernaille et al. [55] ∗ | Unsupervised k-Means | Proprietary: univ. network | Packet size and direction of first P packets in a flow | eDonkey, FTP, HTTP, Kazaa, NNTP, POP3, SMTP, SSH, HTTPS, POP3S | P=5, k=50 | Accuracy > 80% |
| | Supervised J48 DT, k-NN, Random Tree, RIPPER, MLP, NB | Proprietary: Univ. Napoli campus network | Payload size stats and inter-packet time stats of first N packets, bidirectional flow duration and size, transport protocol | BitTorrent, SMTP, Skype2Skype, POP, HTTP, SOULSEEK, NBNS, QQ, DNS, SSL, RTP, EDONKEY | N=1...10 | Overall accuracy = 98.4% with BKS (J48, Random Tree, RIPPER, PL) combiner, N=10 |
Nguyen et al. [337] † | Supervised NB, C4.5 DT | Proprietary: home network, univ. network, game server | Inter-packet arrival time statistics, inter-packet length variation statistics, IP packet length statistics of N consecutive packets | Enemy Territory (online game), VoIP, Other | N=25 | C4.5 DT: Enemy Territory: recall = 99.3%, precision = 97%; VoIP: recall = 95.7%, precision = 99.2%. NB: Enemy Territory: recall = 98.9%, precision = 87%; VoIP: recall = 99.6%, precision = 95.4% (all values are medians) |
Erman et al. [137] ⋆ | Semi-supervised k-Means | Proprietary: Univ. Calgary | Number of packets, average packet size, total bytes, total header bytes, total payload bytes (caller to callee and vice versa) | P2P, HTTP, CHAT, EMAIL, FTP, STREAMING, OTHER | k = 400, 13 layers, packet milestones (number of packets) in layers are separated exponentially (8, 16, 32, …) | Flow accuracy > 94%, byte accuracy 70-90% |
Li et al. [270] ⋆ | Supervised C4.5 DT, C4.5 DT with AdaBoost, NBKE | Proprietary | A subset of 12 from 248 features [321] of first N packets | WEB, MAIL, BULK, Attack, P2P, DB, Service, Interactive | N=5 | C4.5 DT: Accuracy >99%; Attack is an exception with moderate-high recall |
Jin et al. [222] ⋆ | Supervised AdaBoost | Proprietary: ISP network, labeled as in [176] | Lowsrcport, highsrcport, duration, mean packet size, mean packet rate, toscount, tcpflags, dstinnet, lowdstport, highdstport, packet, byte, tos, numtosbytes, srcinnet | Business, chat, DNS, FileSharing, FTP, Games, Mail, Multimedia, NetNews, SecurityThreat, VoIP, Web | Number of binary classifiers (k): TCP = 12, UDP = 8 | Error rate: TCP = 3%, UDP = 0.4% |
Bonfiglio et al. [69] ‡ | Supervised NB, Pearson’s χ² test | Proprietary: univ. network, ISP network | Message size, average inter-packet gap | Skype | NB decision threshold B_min = −5, χ²(Thr) = 150 | NB ∧ χ²: UDP: E2E: FP = 0.01%, FN = 29.98%; E2O: FP = 0.0%, FN = 9.82% (univ. dataset); E2E: FP = 0.01%, FN = 24.62%; E2O: FP = 0.11%, FN = 2.40% (ISP dataset). TCP: negligible FP |
Alshammari et al. [17] ‡ | Supervised AdaBoost, SVM, NB, RIPPER, C4.5 DT | AMP [457], MAWI [474], DARPA99 [278], proprietary from Univ. Dalhousie | Packet size, packet inter-arrival time, number of packets, number of bytes, flow duration, protocol (forward and backward direction) | SSH, Skype | N/A | C4.5 DT: SSH: DR = 95.9%, FPR = 2.8% (Dalhousie); DR = 97.2%, FPR = 0.8% (AMP); DR = 82.9%, FPR = 0.5% (MAWI). Skype: DR = 98.4%, FPR = 7.8% (Dalhousie) |
Shbair et al. [409] ‡ | Supervised C4.5 DT, RF | Synthetic trace | Statistical features from encrypted payload and [253] (client to server and vice versa) | Service Provider (number of services): Uni-lorraine.fr (15), Google.com (29), akamaihd.net (6), Googlevideo.com (1), Twitter.com (3), Youtube.com (1), Facebook.com (4), Yahoo.com (19), Cloudfront.com (1) | N/A | RF (service provider): precision = 92.6%, recall = 92.8%, F-measure = 92.6%. RF (service): accuracy of 95-100% for the majority of service providers with > 100 connections per HTTPS service |