Table 12 Summary of decision making for the CWND update increment using online training at network end-systems

From: A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

| Ref. | RL technique | Network | Synthetic dataset | Features | Action-set (action selection) | Evaluation settings | Evaluation results^a |
|---|---|---|---|---|---|---|---|
| TCP-FALA [380] | FALA | WANET | GloMoSim simulation · Topology: random, dumbbell | States and reward: IAT of ACKs (distinguishing ACKs and DUPACKs) | Finite: 5 actions (stochastic) | 1 input feature · 5 states · 5 actions | To TCP-NewReno^b: packet loss = 66% · goodput = 29% · fairness = 20%. To TCP-FeW^b: packet loss = −5% · goodput = −10% · fairness = 12% |
| Learning-TCP [29, 379] | CALA | WANET | Simulation: ns-2 and GloMoSim · Topology: chain, random node, grid. Experimental: Linux-based, chain topology | States and reward: IAT of ACKs | Continuous: normal action probability distribution (stochastic) | 1 input feature · 2 states · ∞ actions | To TCP-FeW: packet loss = 37% · goodput = 13% · fairness = 23%. To TCP-FALA: packet loss = 28% · goodput = 36% · fairness = 14% |
| TCP-GVegas [219] | Q-learning | WANET | ns-2 simulation · Topology: chain, random | States: CWND, RTT, throughput. Reward: throughput | Continuous: range based on RTT, throughput, and a span factor (ε-greedy) | 3 input features · 3 states · N/A actions | To TCP-Vegas: throughput = 60% · delay = 54% |
| FK-TCPLearning [271] | FKQL | IoT | ns-3 simulation · Dumbbell topology: single source/sink, double source/sink | States: IAT of ACKs, IAT of packets sent, RTT, SSThresh. Reward: throughput, RTT | Finite: 5 actions (ε-greedy) | 5 input features · 10k states · 5 actions · FK approximation: 100 prototypes | To TCP-NewReno: throughput = 34% · delay = 12%. To TCPLearning based on pure Q-learning: throughput = −1.5% · delay = −10% |
| UL-TCP [30] | CALA | Wireless · Single-hop: satellite, cellular, WLAN · Multi-hop: WANET | ns-2 simulation · Single-hop dumbbell · Multi-hop topology: chain, random, grid | States and reward: RTT, throughput, RTO, CWND | Continuous: normal action probability distribution (stochastic) | 3 input features · 2 states · ∞ actions | For single-hop, to ATL: packet loss = 51% · goodput = −14% · fairness = 53%. For multi-hop, similar to Learning-TCP |
| Remy [477] | Own (offline training) | Wired · Cellular | ns-2 simulation · Wired topology: dumbbell, datacenter · Cellular topology | States: IAT of ACKs, IAT of packets sent, RTT. Reward: throughput, delay | Continuous, 3-dimensional: CWND multiple, CWND increment, time between successive sends (ε-greedy) | 4 input features · (16k)^3 states · 100^3 actions · 16 network configurations | To TCP-Cubic: throughput = 21% · delay = 60%. To TCP-Cubic/SFQ-CD: throughput = 10% · delay = 38% |
| PCC [122] | Own | Wired · Satellite | Experimental: GENI, Emulab, PlanetLab | States: sending rate. Reward: throughput, delay, loss rate | Finite: 2 actions for the increment updating the sending rate (not CWND) (gradient ascent) | 3 input features · 4 states · 2 actions | To TCP-Cubic: throughput = 21% · delay = 60% |

  a. Average value of the improvement ratio. Results vary according to the configured network parameters (e.g., topology, mobility, traffic).
  b. Based on the results from the simulated and experimental evaluations in [29].
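Several of the schemes above (e.g., TCP-GVegas and FK-TCPLearning) cast the CWND update as Q-learning with ε-greedy action selection over a finite set of increments. A minimal sketch of that general pattern follows; the state discretization, action values, and hyperparameters are illustrative assumptions, not the formulation of any cited paper.

```python
import random
from collections import defaultdict

class CwndAgent:
    """Tabular Q-learning over a finite set of CWND increments.

    Illustrative sketch only: the features, discretization, and reward
    below are placeholders, not taken from any of the cited schemes.
    """
    ACTIONS = (-10, -3, 0, 3, 10)  # candidate CWND increments, in segments

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def state(self, ack_iat_ms, rtt_ms):
        # Coarse discretization of ACK inter-arrival time and RTT
        # into a small number of bins (5 x 5 states here).
        return (min(int(ack_iat_ms // 10), 4), min(int(rtt_ms // 50), 4))

    def choose(self, s):
        # ε-greedy: explore with probability ε, otherwise exploit.
        if random.random() < self.epsilon:
            return random.choice(self.ACTIONS)
        return max(self.ACTIONS, key=lambda a: self.q[(s, a)])

    def update(self, s, a, reward, s_next):
        # One-step Q-learning backup toward reward + γ * max_a' Q(s', a').
        best_next = max(self.q[(s_next, a2)] for a2 in self.ACTIONS)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[(s, a)])
```

In use, such an agent would be driven by the ACK clock: on each ACK the sender computes the new state, applies the chosen increment to CWND, and feeds back a reward such as measured throughput (as in TCP-GVegas) or a throughput/RTT combination (as in FK-TCPLearning).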