Estimating the value of a discounted reward process

Moshe Haviv, Martin L. Puterman*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

5 Scopus citations

Abstract

This paper provides a differential equation which relates the expected total discounted reward of a reward process to the expected total undiscounted reward of a process which terminates at a negative binomial stopping time. The solution of this equation provides the basis for unbiased estimators of the expected total discounted reward and its derivative with respect to the discount rate. We compare this estimator to other estimators and discuss when it might be more efficient. When rewards are positive we show that the estimator is monotone in the sampled variate.
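The link between discounting and random stopping can be illustrated in its simplest, well-known form: if a process continues after each period with probability equal to the discount factor, the reward at period t is collected with probability discount^t, so the expected undiscounted total up to that geometric stopping time equals the expected discounted total. The geometric distribution is the negative binomial with a single success. The sketch below shows only this special case and is not the estimator derived in the paper; the function names, the reward stream, and the parameter values are hypothetical choices made for illustration.

```python
import random


def sample_stopped_total(reward_at, discount, rng):
    """Sum undiscounted rewards until a geometric stopping time.

    After each period the process continues with probability `discount`,
    so the reward at period t is collected with probability discount**t.
    """
    total = 0.0
    t = 0
    while True:
        total += reward_at(t)
        t += 1
        if rng.random() >= discount:  # stop with probability 1 - discount
            return total


def estimate_discounted_value(reward_at, discount, n_samples, rng):
    """Average independent stopped totals (plain Monte Carlo estimate)."""
    draws = [sample_stopped_total(reward_at, discount, rng) for _ in range(n_samples)]
    return sum(draws) / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical reward stream: one unit per period, so the exact
    # discounted value is 1 / (1 - discount) = 10 for discount = 0.9.
    print(estimate_discounted_value(lambda t: 1.0, 0.9, 20000, rng))
```

With nonnegative rewards, each stopped total in this sketch can only grow as the sampled stopping time grows, which is the flavor of monotonicity the abstract mentions for positive rewards.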

Original language: English
Pages (from-to): 267-272
Number of pages: 6
Journal: Operations Research Letters
Volume: 11
Issue number: 5
State: Published - Jun 1992
Externally published: Yes

Keywords

  • reward processes
  • simulation
  • unbiased estimators
