POWQMIX: Weighted Value Factorization with Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning
We propose POWQMIX, an algorithm that recognizes potentially optimal joint actions and assigns them higher weights during training, overcoming the limitations of QMIX's monotonicity constraint and outperforming existing methods in experiments.
May 1, 2024
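The weighting idea in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_td_loss`, the parameter `low_weight`, and the boolean `potentially_optimal` flags are all hypothetical, and the step that recognizes which joint actions are potentially optimal is assumed to have happened elsewhere. The sketch only shows how samples flagged as potentially optimal could contribute more to a squared TD-error loss than the rest.

```python
from typing import Sequence


def weighted_td_loss(
    q_tot: Sequence[float],
    targets: Sequence[float],
    potentially_optimal: Sequence[bool],
    low_weight: float = 0.1,
) -> float:
    """Mean of weighted squared TD errors over a batch.

    Assumption (not from the source): flagged samples get weight 1.0,
    all other samples get the smaller weight `low_weight`.
    """
    if not (len(q_tot) == len(targets) == len(potentially_optimal)):
        raise ValueError("all inputs must have the same length")
    if len(q_tot) == 0:
        raise ValueError("batch must not be empty")

    total = 0.0
    for q, y, is_candidate in zip(q_tot, targets, potentially_optimal):
        # Higher weight for joint actions flagged as potentially optimal.
        weight = 1.0 if is_candidate else low_weight
        total += weight * (q - y) ** 2
    return total / len(q_tot)


# Hypothetical batch: joint Q-values, TD targets, and recognition flags.
loss = weighted_td_loss([1.0, 2.0], [2.0, 4.0], [True, False], low_weight=0.5)
```

With every flag set to `True`, the function reduces to an ordinary mean squared TD error, so the flags only ever down-weight samples that are not considered potentially optimal.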