Abstract
In this paper we suggest a new successive approximation method to compute the optimal discounted reward for finite state and action, discrete time, discounted Markov decision chains. The method is based on a block partitioning of the (stochastic) matrices corresponding to the stationary policies. The method is particularly attractive when the transition matrices are jointly nearly decomposable or nearly completely decomposable.
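The abstract does not spell out the iteration itself, so the following is only a minimal sketch of one generic block-partitioned successive approximation scheme (a block Gauss-Seidel style value iteration), not necessarily the paper's algorithm. The function name `block_value_iteration`, its parameters, and the small two-block example with weak coupling between blocks are all assumptions made for illustration.

```python
def block_value_iteration(P, r, beta, blocks, tol=1e-10,
                          inner_sweeps=50, max_outer=10_000):
    """Approximate the optimal discounted reward vector.

    P[a][i][j] : transition probability from state i to j under action a
    r[a][i]    : one-step reward in state i under action a
    beta       : discount factor, 0 < beta < 1
    blocks     : lists of state indices that partition the state space
    (All names here are hypothetical; this is an illustrative sketch.)
    """
    n = len(r[0])
    actions = range(len(P))
    v = [0.0] * n

    def backup(i):
        # One-step optimality backup for state i using the current v.
        return max(
            r[a][i] + beta * sum(P[a][i][j] * v[j] for j in range(n))
            for a in actions
        )

    for _outer in range(max_outer):
        delta = 0.0
        for block in blocks:
            old = [v[i] for i in block]
            # Iterate on this block only; values outside the block stay fixed.
            for _inner in range(inner_sweeps):
                change = 0.0
                for i in block:
                    new = backup(i)
                    change = max(change, abs(new - v[i]))
                    v[i] = new
                if change < tol:
                    break
            delta = max(delta, max(abs(v[i] - o) for i, o in zip(block, old)))
        if delta < tol:
            break
    return v


# Hypothetical nearly decomposable example: states {0, 1} and {2, 3}
# interact strongly within each pair and only weakly across pairs.
P = [
    [[0.70, 0.29, 0.01, 0.00],
     [0.40, 0.59, 0.00, 0.01],
     [0.01, 0.00, 0.50, 0.49],
     [0.00, 0.01, 0.30, 0.69]],
    [[0.20, 0.79, 0.00, 0.01],
     [0.90, 0.09, 0.01, 0.00],
     [0.00, 0.01, 0.80, 0.19],
     [0.01, 0.00, 0.60, 0.39]],
]
r = [[1.0, 0.0, 2.0, 0.5],
     [0.5, 1.5, 0.0, 3.0]]
beta = 0.9
blocks = [[0, 1], [2, 3]]

v = block_value_iteration(P, r, beta, blocks)
print(v)
```

In this sketch the inner loop resolves the strong within-block coupling before the weak cross-block terms are refreshed, which is the intuition behind preferring a block scheme when the transition matrices are nearly decomposable.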
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 151-160 |
| Number of pages | 10 |
| Journal | Stochastic Processes and their Applications |
| Volume | 19 |
| Issue number | 1 |
| State | Published - Feb 1985 |
Keywords
- Markov decision model
- optimal reward
- partitioning transition matrices
- stationary policies
- successive approximation
Title
Block-successive approximation for a discounted Markov decision model