From: DOD-ETL: distributed on-demand ETL for near real-time business intelligence
Module | Software | Hardware | Instances |
---|---|---|---|
Database | MySQL database | 8-core 10 GB memory | 1 |
Sampler | Python script | 20-core 18 GB memory | 1 |
Change Tracker | Python script | 20-core 18 GB memory | 1 |
Message Queue | Kafka | one core and 2 GB memory | 3 |
Message Queue | Zookeeper | one core and 2 GB memory | 3 |
Stream Processor | Spark Streaming DOD-ETL job | one core and 2 GB memory | From 1 to 20 |