629e Quantifying Risk in Multistage Stochastic Problems Using Approximate Dynamic Programming and Coherent Risk Measures

Nikolaos Pratikakis, Jay H. Lee, and Matthew J. Realff. Chemical and Biomolecular Engineering, Georgia Institute of Technology, Ford Environmental Science & Technology Building 311 Ferst Drive, N.W., Atlanta, GA 30332

To analyze and account for risk in decision-making, we need a measure that reflects the decision maker's attitude (risk averse, risk neutral, or risk taking) and does not lead to counter-intuitive outcomes. Artzner et al. [1] defined the class of coherent risk measures as those that satisfy four properties: sub-additivity, monotonicity, positive homogeneity, and translation invariance. Commonly used risk measures such as the standard deviation or Value at Risk (α-VaR) [2] violate at least one of these properties and can therefore lead to counter-intuitive outcomes in certain situations.

As an alternative to α-VaR, a coherent risk measure called Conditional Value at Risk (α-CVaR) has been proposed in the risk literature [2]. For an arbitrary loss distribution, α-CVaR is formally defined as α-CVaR = E[Loss | Loss ≥ α-VaR], i.e., the mean of the tail formed by the worst (1−α)·100% of cases. The most attractive characteristics of the α-CVaR measure are: a) consistency with the mean-variance approach; b) convexity, which allows easy optimization via linear programming even for non-normal distributions; and c) the capacity to handle fat tails (e.g., Student-t distributions). More details on optimizing α-CVaR can be found in [2].
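As an illustration of the definition above (not part of the original abstract), the following sketch estimates α-CVaR empirically from a sample of losses: compute the empirical α-VaR as the α-quantile, then average the losses at or beyond it. The function name and the use of NumPy are our own assumptions for illustration.

```python
import numpy as np

def empirical_cvar(losses, alpha=0.95):
    """Empirical alpha-CVaR: the mean of the tail of losses at or
    beyond the empirical alpha-VaR (the alpha-quantile).
    Illustrative sketch, not the authors' implementation."""
    losses = np.sort(np.asarray(losses, dtype=float))
    var = np.quantile(losses, alpha)   # empirical alpha-VaR
    tail = losses[losses >= var]       # worst (1 - alpha) fraction of cases
    return tail.mean()
```

For example, for losses 1 through 100 with α = 0.95, the empirical VaR is just above 95, so the tail is {96, ..., 100} and the CVaR is 98. Rockafellar and Uryasev [2] show that the same quantity can be obtained by minimizing a convex auxiliary function, which is what makes LP-based optimization possible.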

The main idea of the proposed presentation is to combine computationally tractable Approximate Dynamic Programming (ADP) algorithms [3] with coherent risk measures in order to generate `risk-sensitive' policies for multi-stage decision problems under uncertainty. In other words, the approach generates policies that weigh the trade-off between expected performance and risk. The proposed ADP methodology adaptively samples portions of the state space as it simulates the system behavior under a greedy or an ε-greedy exploration strategy [4][5]. However, handling multi-stage risk measures is problematic, owing to the nested operations involved and the lack of the additivity property. An important condition needed to derive the recursive optimality equations with coherent risk measures is the between-stages independence condition [6]. In particular, minimizing α-CVaR at each time period minimizes a coherent risk measure, but not the α-CVaR of the actual multi-stage loss distribution itself.
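The ε-greedy exploration strategy mentioned above can be sketched as follows (an illustrative fragment, not the authors' code; the cost-to-go values are assumed to be available, e.g., from an ADP value-function approximation):

```python
import random

def epsilon_greedy(costs_to_go, epsilon=0.1):
    """Select an action given its estimated cost-to-go values.
    With probability epsilon, explore by picking a random action;
    otherwise exploit by picking the minimum-cost action.
    Illustrative sketch of the exploration strategy in [4][5]."""
    if random.random() < epsilon:
        return random.randrange(len(costs_to_go))           # explore
    return min(range(len(costs_to_go)), key=costs_to_go.__getitem__)  # exploit
```

Setting ε = 0 recovers the pure greedy strategy; a small positive ε forces occasional visits to states the greedy policy would never sample, which is what lets the ADP algorithm adaptively cover relevant portions of the state space.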

The approach has been evaluated via Monte Carlo simulations of small-scale problems, in which we clearly observed the trade-off between expected loss and risk for the derived policies. In the presentation, we will show results for a more complex product portfolio management problem, in which the arrival of new projects is dictated by a first-order Markov chain and the main decision is whether to accept or reject each proposed project. For simplicity, we characterize each arriving project in terms of: a) the distribution of its profit (i.e., the realized profit if the project is successfully launched); b) the resources required at each time period while it is in the pipeline; c) the probability of project failure at each time period; and d) the minimum number of time periods the project needs to remain in the pipeline to be completed. We added a further twist by modeling the progress of each project with a Markov chain. Also, in order to capture and hedge the risk in this application, the probability of failure is set to be commensurate with the expected profit of each project. Available resources are constrained and projects can be cancelled at any time (without any profit, of course), which makes this a complex decision problem.
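The four project attributes listed above can be collected into a simple data structure; the sketch below is a hypothetical representation for illustration only (all field names are our assumptions, not the authors' model):

```python
from dataclasses import dataclass, field

@dataclass
class Project:
    """Hypothetical representation of an arriving project, following the
    four attributes in the problem description (names are illustrative)."""
    expected_profit: float          # realized profit if successfully launched
    resource_demand: list = field(default_factory=list)  # resources per period in pipeline
    failure_prob: float = 0.0       # per-period probability of failure
    min_duration: int = 1           # minimum periods needed to complete
```

In this representation, the stated risk structure of the application corresponds to choosing `failure_prob` to grow with `expected_profit`, so that high-reward projects are also the risky ones.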

References:

1. Artzner, P., F. Delbaen, J. M. Eber, and D. Heath. “Coherent measures of risk”. Mathematical Finance, 9 (November): 203-228, 1999.

2. Rockafellar R.T. and S. Uryasev. “Conditional Value-at-Risk for General Loss Distributions”. Journal of Banking and Finance, 26/7, 1443-1471, 2002.

3. J. Si, A. Barto, W. B. Powell, and D. Wunsch (Eds.). “Learning and Approximate Dynamic Programming: Scaling up to the Real World”. John Wiley and Sons, New York, 2004.

4. Georgios Chalkiadakis. “Multiagent Reinforcement Learning: Stochastic Games With Multiple Learning Players”. PhD Depth Report, Department of Computer Science, University of Toronto, 2003. (http://www.cs.toronto.edu/~gehalk/)

5. N. E. Pratikakis, J. H. Lee, and M. J. Realff. “A Real Time Approximate Dynamic Programming Approach For Stochastic Multistage Planning And Scheduling Applications”. (In preparation)

6. A. Ruszczynski and A. Shapiro. “Optimization of Risk Measures”. In: G. Calafiore and F. Dabbene (Eds.), Probabilistic and Randomized Methods for Design under Uncertainty, Springer, London, 2005.