Table 6 Summary of unsupervised flow feature-based traffic classification

From: A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

| Ref. | ML technique | Dataset | Features | Classes | Evaluation: settings | Evaluation: results |
|---|---|---|---|---|---|---|
| Liu et al. [283] | Unsupervised k-Means | Proprietary: campus network | Packet-level and flow-level features | WWW, MAIL, P2P, FTP (CONTROL, PASV, DATA), ATTACK, DATABASE, SERVICES, INTERACTIVE, MULTIMEDIA, GAMES | k = 80 | Average accuracy ≈ 90%, minimum recall = 70% |
| Zander et al. [492] | Unsupervised AutoClass | NLANR [457] | Packet-level and flow-level features | AOL Messenger, Napster, Half-Life, FTP, Telnet, SMTP, DNS, HTTP | Intra-class homogeneity (H) | Mean accuracy = 86.5% |
| Erman et al. [136] | Unsupervised AutoClass | Univ. Auckland [457] | Packet-level and flow-level features | HTTP, SMTP, DNS, SOCKS, IRC, FTP (control, data), POP3, LIMEWIRE, FTP | N/A | Accuracy = 91.2% |
| Erman et al. [135] | Unsupervised DBSCAN | Univ. Auckland [457], proprietary: Univ. Calgary | Packet-level and flow-level features | HTTP, P2P, SMTP, IMAP, POP3, MSSQL, OTHER | eps = 0.03, minPts = 3, number of clusters = 190 | Overall accuracy = 75.6%, average precision > 95% (7/9 classes) |
| Erman et al. [138] | Unsupervised k-Means | Proprietary: univ. network | Packet-level and flow-level features from unidirectional flows | Web, EMAIL, DB, P2P, OTHER, CHAT, FTP, STREAMING | k = 400 | Server-to-client: avg. flow accuracy = 95%, avg. byte accuracy = 79%; Web: precision = 97%, recall = 97%; P2P: precision = 82%, recall = 77% |
  1. N/A: Not available
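
The k-Means rows above cluster flows first and attach class names to clusters afterwards. The following is a minimal sketch of that general idea, not the procedure of any cited study. It runs plain Lloyd's k-means on flow features and labels each cluster by the majority class of its members. The two features, the synthetic flows, k = 4, and the evenly spaced centroid initialization are all assumptions chosen to keep the example small and deterministic.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means over a list of equal-length float tuples."""
    n = len(points)
    # Assumption: evenly spaced initial centroids, used only for determinism.
    centroids = [points[i * n // k] for i in range(k)]
    assignment = [0] * n
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


def label_clusters(assignment, labels):
    """Map each cluster id to the most common class among its flows."""
    votes = {}
    for cluster, label in zip(assignment, labels):
        votes.setdefault(cluster, Counter())[label] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def overall_accuracy(assignment, labels, cluster_to_class):
    """Fraction of flows whose cluster label matches their true class."""
    hits = sum(cluster_to_class[a] == y for a, y in zip(assignment, labels))
    return hits / len(labels)


# Hypothetical flows: (mean packet size in bytes, flow duration in seconds).
web = [(1200.0 + 5 * i, 2.0 + 0.1 * i) for i in range(20)]
p2p = [(300.0 + 5 * i, 60.0 + 0.1 * i) for i in range(20)]
flows = web + p2p
labels = ["Web"] * 20 + ["P2P"] * 20

assignment = kmeans(flows, k=4)
cluster_to_class = label_clusters(assignment, labels)
accuracy = overall_accuracy(assignment, labels, cluster_to_class)
print(cluster_to_class, accuracy)
```

Using more clusters than classes lets several clusters map to the same class, which is consistent with the large k values in the settings column (k = 80, k = 400). In this sketch the accuracy is computed on the same flows that labeled the clusters, whereas the surveyed studies follow their own evaluation protocols.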