This folder contains eight MARL algorithm implementations for multi-agent coordination, along with a baseline that selects random actions.
marl_models/
├── base_model.py # Base class for all models
├── buffer_and_helpers.py # Experience replay buffer
├── utils.py # Model factory & utilities
├── README.md # README for the folder
│
├── maddpg/ # MADDPG implementation
│ ├── agents.py # Agent class
│ └── maddpg.py # Algorithm update logic
│
├── matd3/ # MATD3 implementation
├── mappo/ # MAPPO implementation
├── masac/ # MASAC implementation
│
├── attention.py # Attention base classes
│
├── attention_maddpg/ # MADDPG + Attention
│ ├── agents.py
│ └── attention_maddpg.py
│
├── attention_matd3/ # MATD3 + Attention
├── attention_mappo/ # MAPPO + Attention
├── attention_masac/ # MASAC + Attention
│
└── random_baseline/ # Random baseline
  └── random_model.py
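
The tree above lists utils.py as the model factory. How that factory is written is not shown here; the following is a minimal sketch of one way a model name could be mapped to a class. `MODEL_REGISTRY`, `register_model`, `create_model`, the placeholder `RandomModel` class, and its constructor arguments are all hypothetical names chosen for illustration, not the repository's actual API.

```python
import random
from typing import Callable, Dict, List

# Hypothetical registry; the real utils.py may be organized differently.
MODEL_REGISTRY: Dict[str, Callable[..., object]] = {}


def register_model(name: str):
    """Decorator that records a model class under its config name."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("random")
class RandomModel:
    """Placeholder baseline: one uniform action vector in [-1, 1] per agent."""

    def __init__(self, num_agents: int, action_dim: int):
        self.num_agents = num_agents
        self.action_dim = action_dim

    def select_actions(self) -> List[List[float]]:
        return [
            [random.uniform(-1.0, 1.0) for _ in range(self.action_dim)]
            for _ in range(self.num_agents)
        ]


def create_model(name: str, **kwargs):
    """Instantiate the class registered under `name`; fail loudly on typos."""
    try:
        cls = MODEL_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown MODEL {name!r}; registered: {known}") from None
    return cls(**kwargs)


model = create_model("random", num_agents=3, action_dim=2)
actions = model.select_actions()
```

A registry like this keeps the name lookup in one place, so a misspelled model name raises an error that lists the valid choices instead of failing later during training.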
Simply change MODEL in config.py:
# Choose one:
MODEL = "maddpg" # Off-policy baseline
MODEL = "matd3" # Twin delayed DDPG
MODEL = "mappo" # On-policy PPO
MODEL = "masac" # Soft Actor-Critic
MODEL = "attention_maddpg" # MADDPG + attention
MODEL = "attention_matd3" # MATD3 + attention
MODEL = "attention_mappo" # MAPPO + attention
MODEL = "attention_masac" # MASAC + attention
MODEL = "random" # Random baseline

Stage 1 tunes only the reward weights, with no algorithm-specific tuning:
python tune.py --stage 1 --episodes 500 --trials 50

Stage 2 tunes learning rates, network size, batch size, discount factor, etc.:
python tune.py --stage 2 --episodes 1000 --trials 50

Stage 3 optimizes the attention dimension and number of heads:
python tune.py --stage 3 --episodes 500 --trials 30

# Train using multiple algorithms by changing MODEL in configs.py
# Compare results
python utils/comparative_plots.py \
--logs train_logs/maddpg train_logs/masac/ \
--names MADDPG MASAC \
--smoothing 10

This generates comparison plots showing:
- Reward curves (learning progress)
- Latency (task completion time)
- Energy consumption
- Fairness (equal service)
- Offline rate (battery health)
- Loss curves (training stability)
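
The `--smoothing 10` option in the comparison command suggests that curves are smoothed before plotting. How comparative_plots.py implements this is not stated here; the sketch below assumes a trailing moving average over the most recent episodes, and `moving_average` is a hypothetical helper rather than a function from the repository.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int = 10) -> List[float]:
    """Trailing mean: point i averages up to the last `window` values ending at i."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


rewards = [0.0, 10.0, 20.0, 30.0]
print(moving_average(rewards, window=2))  # [0.0, 5.0, 15.0, 25.0]
```

A trailing window keeps the output the same length as the input, so smoothed curves from different runs still line up episode by episode.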
Refer to the Plotting Module for the detailed plotting plan.
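
The fairness metric in the list above is described only as "equal service". One common way to quantify that idea is Jain's fairness index, which is 1.0 when every user receives the same service and approaches 1/n when a single user receives everything. Whether this project uses Jain's index is an assumption; `jain_fairness` and the sample service counts are illustrative only.

```python
from typing import Sequence


def jain_fairness(service: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2), in the range [1/n, 1]."""
    total = sum(service)
    sum_sq = sum(x * x for x in service)
    if sum_sq == 0:
        # Nobody was served: treat the allocation as trivially equal.
        return 1.0
    return total * total / (len(service) * sum_sq)


print(jain_fairness([5, 5, 5, 5]))   # 1.0
print(jain_fairness([20, 0, 0, 0]))  # 0.25
```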