Artículo

Estamos trabajando para incorporar este artículo al repositorio
Consulte el artículo en la página del editor
Consulte la política de Acceso Abierto del editor

Abstract:

We consider in this paper constrained Markov decision processes. This type of control model has many applications in telecommunications and other fields. We address the issue of the convergence of the value and optimal policies of the problem with discounted costs, to the ones for the problem with expected average cost. We consider the general multichain ergodic structure. We present two stability results in this paper. We establish the continuity of optimal values and solutions of as well as some type of robustness of some suboptimal solutions in the discount factor. Our proof relies on same general theory on continuity of values and solutions in convex optimization that relies on well-known notions of Γ-convergence.

Registro:

Documento: Artículo
Título:Continuity of optimal values and solutions for control of Markov chains with constraints
Autor:Tidball, M.M.; Lombardi, A.; Pourtallier, O.; Altman, E.
Ciudad:Philadelphia
Filiación:Inst. Natl. Rech. Agronom., Econ./S., 2 Place Viala 34060, Montpellier Cedex 1, France
Fac. de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universtaria, 1428 Capital Federal, Argentina
INRIA, Centre Sophia-Antipolis, 2004 Route des Lucioles, 06902 Sophia-Antipolis Cedex, France
Palabras clave:Control system analysis; Convergence of numerical methods; Decision theory; Markov processes; Optimization; Robustness (control systems); Sensitivity analysis; Constrained Markov decision processes (CMDPs); Ergodic structures; Optimal control systems
Año:2000
Volumen:38
Número:4
Página de inicio:1204
Página de fin:1222
DOI: http://dx.doi.org/10.1137/S0363012997280294
Título revista:SIAM Journal on Control and Optimization
Título revista abreviado:SIAM J Control Optim
ISSN:03630129
CODEN:SJCOD
Registro:https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_03630129_v38_n4_p1204_Tidball

Referencias:

  • Altman, E., Asymptotic properties of constrained Markov decision processes (1993) Z. Oper. Res., 37, pp. 151-170
  • Altman, E., Denumerable constrained Markov decision problems and finite approximations (1994) Math. Oper. Res., 19, pp. 169-191
  • Altman, E., (1999) Constrained Markov Decision Processes, , Stochastic Modeling, Chapman and Hall/CRC, Boca Raton, FL
  • Altman, E., Gaitsgory, V.A., Stability and singular perturbations in constrained Markov decision problems (1993) IEEE Trans. Automat. Control, 38, pp. 971-975
  • Altman, E., Shwartz, A., Optimal priority assignment: A time sharing approach (1989) IEEE Trans. Automat. Control, AC-34, pp. 1089-1102
  • Altman, E., Shwartz, A., Adaptive control of constrained Markov chains (1991) IEEE Trans. Automat. Control, 36, pp. 454-462
  • Altman, E., Shwartz, A., Markov decision problems and state-action frequencies (1991) SIAM J. Control Optim., 29, pp. 786-809
  • Altman, E., Shwartz, A., Sensitivity of constrained Markov decision problems (1991) Ann. Oper. Res., 32, pp. 1-22
  • Altman, E., Shwartz, A., Time-sharing policies for controlled Markov chains (1993) Oper. Res., 41, pp. 1116-1124
  • Altman, E., Spieksma, F., The linear program approach in Markov decision problems revisited (1995) Z. Oper. Res., 42, pp. 169-188
  • Beutler, F.J., Ross, K.W., Optimal policies for controlled Markov chains with a constraint (1985) J. Math. Anal. Appl., 112, pp. 236-252
  • Beutler, F.J., Ross, K.W., Time-average optimal constrained semi-Markov decision processes (1986) Adv. Appl. Probab., 18, pp. 341-359
  • Birge, J.R., Wets, R.J., Designing approximatin schemes for stochastic optimization problems (1986) Math. Programming Stud., 27, pp. 54-102
  • Borkar, V.S., A convex analytic approach to Markov decision processes (1988) Probab. Theory Related Fields, 78, pp. 583-602
  • Borkar, V.S., Ergodic control of Markov chains with constraints - The general case (1994) SIAM J. Control Optim., 32, pp. 176-186
  • Dal Maso, G., An introduction to Γ-convergence (1993) Progr. Nonlinear Differential Equations Appl., 8. , Birkhäuser, Basel, Boston
  • Dantzig, G.B., Folkman, J., Shapiro, N., On the continuity of the minimum set of a continuous function (1967) J. Math. Anal. Appl., 17, pp. 519-548
  • Derman, C., Strauch, R.E., A note on memoryless rules for controlling sequential control processes (1966) Ann. Math. Statist., 37, pp. 276-278
  • Dynkin, E., Yushkevich, A., (1979) Controlled Markov Processes, , Springer-Verlag, Berlin
  • Feinberg, E.A., Sufficient classes of strategies in discrete dynamic programming I: Decomposition of randomized strategies and embedded models (1986) SIAM Theory Probab. Appl., 31, pp. 658-668
  • Feinberg, E.A., Constrained semi-Markov decision processes with average rewards (1995) Z. Oper. Res., 39, pp. 257-288
  • Feinberg, E.A., Reiman, M.I., Optimality of randomized trunk reservation (1994) Probab. Engrg. Inform. Sci., 8, pp. 463-489
  • Fiacco, A.V., Convergence properties of local solutions of convex optimization problems (1974) J. Optim. Theory Appl., 13, pp. 1-12
  • Gaitsgory, V.A., Pervozvanskii, A.A., Perturbation theory for mathematical programming problems (1986) J. Optim. Theory Appl., 49, pp. 389-410
  • Hordijk, A., Kallenberg, L.C.M., Constrained undiscounted stochastic dynamic programming (1984) Math. Oper. Res., 9, pp. 276-289
  • Hordijk, A., Spieksma, F., Constrained admission control to a queuing system (1989) Adv. in Appl. Probab., 21, pp. 409-431
  • Kallenberg, L.C.M., Linear programming and finite markovian control problems (1983) Mathematical Centre Tracts, 148. , Amsterdam
  • Kanniappan, P., Sastry, S.M.A., Uniform convergence of convex optimization problems (1974) J. Math. Anal. Appl., 96, pp. 1-12
  • Krylov, N.V., The construction of an optimal strategy for a finite controlled chain (1965) Theory Probab. Appl., 10, pp. 45-54
  • Lazar, A., Optimal flow control of a class of queuing networks in equilibrium (1983) IEEE Trans. Automat. Control, 28, pp. 1001-1007
  • Lignota, M.B., Morgan, J., Topological existence and stability of min sup problems (1990) J. Math. Anal. Appl., 151, pp. 164-180
  • Lignota, M.B., Morgan, J., Convergences of marginal functions with dependent constraints (1992) Optimization, 23, pp. 189-213
  • Lucchetti, R., Wets, R.J.B., Convergence of minima of integral functionals, with applications to optimal control and stochastic optimization (1993) Statist. Decisions, 11, pp. 69-84
  • Morgan, J., Raucci, R., Continuity properties of ∈-solutions for generalized parametric saddle point problems and application to hierarchical games (1997) J. Math. Anal. Appl., 211, pp. 30-48
  • Nain, P., Ross, K.W., Optimal priority assignment with hard constraint (1986) IEEE Trans. Automat. Control, 31, pp. 883-888
  • Pervozvanskii, A.A., Gaitsgory, V.A., (1988) Theory of Suboptimal Decision: Decomposition and Aggregation, , Kluwer Academic Publishers, Dordrecht, The Netherlands
  • Rockafellar, R.T., Wets, R.J.B., Variational analysis (1998) Grundlehren Math. Wiss., 317. , Springer-Verlag, Berlin
  • Ross, K.W., Randomized and past-dependent policies for Markov decision processes with multiple constraints (1989) Oper. Res., 37, pp. 474-477
  • Ross, K.W., Chen, B., Optimal scheduling of interactive and non interactive traffic in telecommunication systems (1988) IEEE Trans. Automat. Control, 33, pp. 261-267
  • Ross, K., Varadarajan, R., Markov decision processes with sample path constraints: The communicating case (1989) Oper. Res., 37, pp. 780-790
  • Ross, K., Varadarajan, R., Multichain Markov decision processes with a sample path constraint: A decomposition approach (1991) Math. Oper. Res., 16, pp. 195-207
  • Schal, M., A selection theorem for optimization problems (1974) Arch. Math., 25, pp. 219-224
  • Schochetman, I.E., Pointwise versions of the maximum theorem with applications to optimization (1990) Appl. Math. Lett., 3, pp. 89-92
  • Schochetman, I.E., Smith, R.L., Convergence of selections with applications in optimization (1991) J. Math. Anal. Appl., 155, pp. 278-1242
  • Sennott, L.I., Constrained discounted Markov decision chains (1991) Probab. Engrg. Inform. Sci., 5, pp. 463-475
  • Sennott, L.I., Constrained average cost Markov decision chains (1993) Probab. Engrg. Inform. Sci., 7, pp. 69-83

Citas:

---------- APA ----------
Tidball, M.M., Lombardi, A., Pourtallier, O. & Altman, E. (2000) . Continuity of optimal values and solutions for control of Markov chains with constraints. SIAM Journal on Control and Optimization, 38(4), 1204-1222.
http://dx.doi.org/10.1137/S0363012997280294
---------- CHICAGO ----------
Tidball, M.M., Lombardi, A., Pourtallier, O., Altman, E. "Continuity of optimal values and solutions for control of Markov chains with constraints" . SIAM Journal on Control and Optimization 38, no. 4 (2000) : 1204-1222.
http://dx.doi.org/10.1137/S0363012997280294
---------- MLA ----------
Tidball, M.M., Lombardi, A., Pourtallier, O., Altman, E. "Continuity of optimal values and solutions for control of Markov chains with constraints" . SIAM Journal on Control and Optimization, vol. 38, no. 4, 2000, pp. 1204-1222.
http://dx.doi.org/10.1137/S0363012997280294
---------- VANCOUVER ----------
Tidball, M.M., Lombardi, A., Pourtallier, O., Altman, E. Continuity of optimal values and solutions for control of Markov chains with constraints. SIAM J Control Optim. 2000;38(4):1204-1222.
http://dx.doi.org/10.1137/S0363012997280294