Table 5 Summary of supervised flow feature-based traffic classification

From: A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

| Ref. | ML Technique | Dataset | Features | Classes | Evaluation: Settings | Evaluation: Results |
| --- | --- | --- | --- | --- | --- | --- |
| Roughan et al. [390] | Supervised k-NN | Proprietary: univ. networks, streaming service | Packet-level and flow-level features | Telnet, FTP-data, Kazaa, RealMedia Streaming, DNS, HTTPS | k=3, number of QoS classes = 3, 4, 7 | Error rate: 5.1% (4), 2.5% (3), 9.4% (7); (#): number of QoS classes |
| Moore and Zuev [321] | Supervised NBKE | Proprietary: campus network | Baseline and derivative packet-level features | BULK, WWW, MAIL, SERVICES, DB, P2P, ATTACK, MULTIMEDIA | N/A | Accuracy up to 95%, TPR up to 99% |
| Jiang et al. [218] | Supervised NBKE | Proprietary: campus network | Baseline and derivative flow-level features | WWW, email, bulk, attack, P2P, multimedia, service, database, interaction, games | N/A | Average accuracy ≈ 91% |
| Park et al. [347] | Supervised REPTree, REPTree-Bagging | NLANR [457] | Packet-level, flow-level and connection-level features | WWW, Telnet, Messenger, FTP, P2P, Multimedia, SMTP, POP, IMAP, DNS, Services | Burst packet threshold = 0.007 s | Accuracy ≥ 90% (features ≥ 7) |
| Zhang et al. [496] | Supervised BoF-NB | WIDE [474], proprietary: ISP network | Packet-level and flow-level features from unidirectional flows | BT, DNS, FTP, HTTP, IMAP, MSN, POP3, SMTP, SSH, SSL, XMPP | Aggregation rule = sum, BoF size | Accuracy 87-94%, F-measure = 80% |
| Zhang et al. [497] | Supervised RF, Unsupervised k-Means (BoF-based, RTC) | KEIO [474], WIDE [474], proprietary: ISP network | Packet-level and flow-level features from unidirectional flows | FTP, HTTP, IMAP, POP3, RAZOR, SSH, SSL, UNKNOWN / ZERO-DAY (BT, DNS, SMTP) | N/A | RTC up to 15% and 10% better in flow and byte accuracy, respectively, than the second best; F-measure = 0.91 (before update), 0.94 (after update) |
| Auld et al. [26] | Supervised BNN | Proprietary | Packet-level and flow-level features | ATTACK, BULK, DB, MAIL, P2P, SERVICE, WWW | Number of features = 246, hidden layers = 0-1, 0-30 nodes in the hidden layer, output = 10 | Accuracy > 99%, 95% with temporally distant training and testing datasets |
| Sun et al. [431] | Supervised PNN | Proprietary: campus networks | Packet-level and flow-level features | P2P, WEB, OTHERS | Number of features = 22 | Accuracy = 87.99%; P2P: TPR = 91.25%, FPR = 1.36%; WEB: TPR = 98.74%, FPR = 27.7% |
| Este et al. [140] | Supervised SVM | LBNL [262], CAIDA [451], proprietary: campus network | Packet payload size | HTTP, SMTP, POP3, HTTPS, IMAPS, BitTorrent, FTP, MSN, eDonkey, SSL, SMB, Kazaa, Gnutella, NNTP, DNS, LDAP, SSH | Number of support vectors, cf. [140] | TP > 90% for most classes |
| Jing et al. [223] | Supervised FT-SVM | Proprietary [270, 321] | A subset of 12 from 248 features [321] | BULK, INTERACTIVE, WWW, MAIL, SERVICES, P2P, ATTACK, GAME, MULTIMEDIA, OTHER | SVM parameters automatically chosen | Accuracy up to 96%, error ratio ↓ 2.35 times, avg. computation cost ↓ 7.65 times |
| Wang et al. [464] | Supervised multi-class SVM, unbalanced binary SVM | Proprietary: univ. network | Flow-level and connection-level features | BitTorrent, eDonkey, Kazaa, pplive | N/A | Accuracy 75-99% |

N/A: Not available