Abstract
There has been substantial progress with formal models for sequential decision making by individual agents using the Markov decision process (MDP). However, similar treatment of multi-agent systems is lacking. A recent complexity result, showing that solving decentralized MDPs is NEXP-hard, provides a partial explanation. To overcome this complexity barrier, we identify a general class of transition-independent decentralized MDPs that is widely applicable. The class consists of independent collaborating agents that are tied together through a global reward function that depends upon both of their histories. We present a novel algorithm for solving this class of problems and examine its properties. The result is the first effective technique to solve optimally a class of decentralized MDPs. This lays the foundation for further work in this area on both exact and approximate solutions.
| Original language | English |
|---|---|
| Pages | 41-48 |
| Number of pages | 8 |
| State | Published - 2003 |
| Externally published | Yes |
| Event | Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS 03 - Melbourne, Vic., Australia; Duration: 14 Jul 2003 → 18 Jul 2003 |
Conference
| Conference | Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS 03 |
|---|---|
| Country/Territory | Australia |
| City | Melbourne, Vic. |
| Period | 14/07/03 → 18/07/03 |
Keywords
- Decentralized MDP
- Decision-theoretic planning