Table 5 Summary of supervised flow feature-based traffic classification

From: A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

| Ref. | ML Technique | Dataset | Features | Classes | Evaluation: Settings | Evaluation: Results |
|---|---|---|---|---|---|---|
| Roughan et al. [390] | Supervised k-NN | Proprietary: univ. networks, streaming service | Packet-level and flow-level features | Telnet, FTP-data, Kazaa, RealMedia Streaming, DNS, HTTPS | k = 3, number of QoS classes = 3, 4, 7 | Error rate: 5.1% (4), 2.5% (3), 9.4% (7); (#): number of QoS classes |
| Moore and Zuev [321] | Supervised NBKE | Proprietary: campus network | Baseline and derivative packet-level features | BULK, WWW, MAIL, SERVICES, DB, P2P, ATTACK, MULTIMEDIA | N/A | Accuracy up to 95%, TPR up to 99% |
| Jiang et al. [218] | Supervised NBKE | Proprietary: campus network | Baseline and derivative flow-level features | WWW, email, bulk, attack, P2P, multimedia, service, database, interaction, games | N/A | Average accuracy ≈ 91% |
| Park et al. [347] | Supervised REPTree, REPTree-Bagging | NLANR [457] | Packet-level, flow-level and connection-level features | WWW, Telnet, Messenger, FTP, P2P, Multimedia, SMTP, POP, IMAP, DNS, Services | Burst packet threshold = 0.007 s | Accuracy ≥ 90% (features ≥ 7) |
| Zhang et al. [496] | Supervised BoF-NB | WIDE [474], proprietary: ISP network | Packet-level and flow-level features from unidirectional flows | BT, DNS, FTP, HTTP, IMAP, MSN, POP3, SMTP, SSH, SSL, XMPP | Aggregation rule = sum, BoF size | Accuracy 87-94%, F-measure = 80% |
| Zhang et al. [497] | Supervised RF, Unsupervised k-Means (BoF-based, RTC) | KEIO [474], WIDE [474], proprietary: ISP network | Packet-level and flow-level features from unidirectional flows | FTP, HTTP, IMAP, POP3, RAZOR, SSH, SSL, UNKNOWN / ZERO-DAY (BT, DNS, SMTP) | N/A | RTC up to 15% and 10% better in flow and byte accuracy, respectively, than second best; F-measure = 0.91 (before update), 0.94 (after update) |
| Auld et al. [26] | Supervised BNN | Proprietary | Packet-level and flow-level features | ATTACK, BULK, DB, MAIL, P2P, SERVICE, WWW | Number of features = 246, hidden layers = 0-1, 0-30 nodes in the hidden layer, output = 10 | Accuracy > 99%, 95% with temporally distant training and testing datasets |
| Sun et al. [431] | Supervised PNN | Proprietary: campus networks | Packet-level and flow-level features | P2P, WEB, OTHERS | Number of features = 22 | Accuracy = 87.99%; P2P: TPR = 91.25%, FPR = 1.36%; WEB: TPR = 98.74%, FPR = 27.7% |
| Este et al. [140] | Supervised SVM | LBNL [262], CAIDA [451], proprietary: campus network | Packet payload size | HTTP, SMTP, POP3, HTTPS, IMAPS, BitTorrent, FTP, MSN, eDonkey, SSL, SMB, Kazaa, Gnutella, NNTP, DNS, LDAP, SSH | Number of support vectors: cf. [140] | TP > 90% for most classes |
| Jing et al. [223] | Supervised FT-SVM | Proprietary [270, 321] | A subset of 12 from 248 features [321] | BULK, INTERACTIVE, WWW, MAIL, SERVICES, P2P, ATTACK, GAME, MULTIMEDIA, OTHER | SVM parameters automatically chosen | Accuracy up to 96%, error ratio 2.35 times, avg. computation cost 7.65 times |
| Wang et al. [464] | Supervised multi-class SVM, unbalanced binary SVM | Proprietary: univ. network | Flow-level and connection-level features | BitTorrent, eDonkey, Kazaa, pplive | N/A | Accuracy 75-99% |
  1. N/A: Not available
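
The Results column reports accuracy, TPR, FPR and F-measure without restating how they are computed. As a reading aid, the sketch below derives the four metrics from the confusion counts of a single traffic class treated one-vs-rest. It uses the standard textbook definitions, which the individual studies may apply with variations (for example, byte accuracy instead of flow accuracy). The names `BinaryCounts` and `flow_metrics`, and the counts in the usage line, are hypothetical and are not taken from any of the cited papers.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """One-vs-rest confusion counts for a single traffic class (hypothetical helper)."""
    tp: int  # flows of the class that were labelled as the class
    fp: int  # flows of other classes that were labelled as the class
    tn: int  # flows of other classes that were labelled as something else
    fn: int  # flows of the class that were labelled as something else


def _ratio(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def flow_metrics(c: BinaryCounts) -> dict:
    """Compute accuracy, TPR (recall), FPR and F-measure from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = _ratio(c.tp, c.tp + c.fp)
    tpr = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "tpr": tpr,
        "fpr": _ratio(c.fp, c.fp + c.tn),
        # F-measure as the harmonic mean of precision and recall (F1).
        "f_measure": _ratio(2 * precision * tpr, precision + tpr),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(flow_metrics(BinaryCounts(tp=90, fp=5, tn=95, fn=10)))
```

A high TPR can coexist with a high FPR for the same class, as in the WEB row reported by Sun et al. [431], which is why the table lists both where the source papers provide them.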