Abstract
A multichain Markov decision process with constraints on the expected state-action frequencies may lead to a unique optimal policy which does not satisfy Bellman's principle of optimality. The model with sample-path constraints does not suffer from this drawback.
Original language | English |
---|---|
Pages (from-to) | 25-28 |
Number of pages | 4 |
Journal | Operations Research Letters |
Volume | 19 |
Issue number | 1 |
State | Published - Jul 1996 |
Keywords
- Constrained optimization
- Markov processes
- Sample path