Continuity of optimal values and solutions for control of Markov chains with constraints

Tidball, M.M.; Lombardi, A.; Pourtallier, O.; Altman, E.

doi:10.1137/S0363012997280294

Navegar

Documento Últimos Documentos Autor FCEN - Año Autor FCEN - Revista Año - Revista Revista - Año SubjectPcEn Colores Type

Colección

Artículo

Tidball, M.M.; Lombardi, A.; Pourtallier, O.; Altman, E. "Continuity of optimal values and solutions for control of Markov chains with constraints" (2000) SIAM Journal on Control and Optimization. 38(4):1204-1222

https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_03630129_v38_n4_p1204_Tidball

Estamos trabajando para incorporar este artículo al repositorio

Consulte el artículo en la página del editor

Consulte la política de Acceso Abierto del editor

Abstract:

We consider in this paper constrained Markov decision processes. This type of control model has many applications in telecommunications and other fields. We address the issue of the convergence of the value and optimal policies of the problem with discounted costs, to the ones for the problem with expected average cost. We consider the general multichain ergodic structure. We present two stability results in this paper. We establish the continuity of optimal values and solutions of as well as some type of robustness of some suboptimal solutions in the discount factor. Our proof relies on same general theory on continuity of values and solutions in convex optimization that relies on well-known notions of Γ-convergence.

Registro:

Documento:	Artículo
Título:	Continuity of optimal values and solutions for control of Markov chains with constraints
Autor:	Tidball, M.M.; Lombardi, A.; Pourtallier, O.; Altman, E.
Ciudad:	Philadelphia
Filiación:	Inst. Natl. Rech. Agronom., Econ./S., 2 Place Viala 34060, Montpellier Cedex 1, France Fac. de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universtaria, 1428 Capital Federal, Argentina INRIA, Centre Sophia-Antipolis, 2004 Route des Lucioles, 06902 Sophia-Antipolis Cedex, France
Palabras clave:	Control system analysis; Convergence of numerical methods; Decision theory; Markov processes; Optimization; Robustness (control systems); Sensitivity analysis; Constrained Markov decision processes (CMDPs); Ergodic structures; Optimal control systems
Año:	2000
Volumen:	38
Número:	4
Página de inicio:	1204
Página de fin:	1222
DOI:	http://dx.doi.org/10.1137/S0363012997280294
Título revista:	SIAM Journal on Control and Optimization
Título revista abreviado:	SIAM J Control Optim
ISSN:	03630129
CODEN:	SJCOD
Registro:	https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_03630129_v38_n4_p1204_Tidball

Referencias:

Altman, E., Asymptotic properties of constrained Markov decision processes (1993) Z. Oper. Res., 37, pp. 151-170
Altman, E., Denumerable constrained Markov decision problems and finite approximations (1994) Math. Oper. Res., 19, pp. 169-191
Altman, E., (1999) Constrained Markov Decision Processes, , Stochastic Modeling, Chapman and Hall/CRC, Boca Raton, FL
Altman, E., Gaitsgory, V.A., Stability and singular perturbations in constrained Markov decision problems (1993) IEEE Trans. Automat. Control, 38, pp. 971-975
Altman, E., Shwartz, A., Optimal priority assignment: A time sharing approach (1989) IEEE Trans. Automat. Control, AC-34, pp. 1089-1102
Altman, E., Shwartz, A., Adaptive control of constrained Markov chains (1991) IEEE Trans. Automat. Control, 36, pp. 454-462
Altman, E., Shwartz, A., Markov decision problems and state-action frequencies (1991) SIAM J. Control Optim., 29, pp. 786-809
Altman, E., Shwartz, A., Sensitivity of constrained Markov decision problems (1991) Ann. Oper. Res., 32, pp. 1-22
Altman, E., Shwartz, A., Time-sharing policies for controlled Markov chains (1993) Oper. Res., 41, pp. 1116-1124
Altman, E., Spieksma, F., The linear program approach in Markov decision problems revisited (1995) Z. Oper. Res., 42, pp. 169-188
Beutler, F.J., Ross, K.W., Optimal policies for controlled Markov chains with a constraint (1985) J. Math. Anal. Appl., 112, pp. 236-252
Beutler, F.J., Ross, K.W., Time-average optimal constrained semi-Markov decision processes (1986) Adv. Appl. Probab., 18, pp. 341-359
Birge, J.R., Wets, R.J., Designing approximatin schemes for stochastic optimization problems (1986) Math. Programming Stud., 27, pp. 54-102
Borkar, V.S., A convex analytic approach to Markov decision processes (1988) Probab. Theory Related Fields, 78, pp. 583-602
Borkar, V.S., Ergodic control of Markov chains with constraints - The general case (1994) SIAM J. Control Optim., 32, pp. 176-186
Dal Maso, G., An introduction to Γ-convergence (1993) Progr. Nonlinear Differential Equations Appl., 8. , Birkhäuser, Basel, Boston
Dantzig, G.B., Folkman, J., Shapiro, N., On the continuity of the minimum set of a continuous function (1967) J. Math. Anal. Appl., 17, pp. 519-548
Derman, C., Strauch, R.E., A note on memoryless rules for controlling sequential control processes (1966) Ann. Math. Statist., 37, pp. 276-278
Dynkin, E., Yushkevich, A., (1979) Controlled Markov Processes, , Springer-Verlag, Berlin
Feinberg, E.A., Sufficient classes of strategies in discrete dynamic programming I: Decomposition of randomized strategies and embedded models (1986) SIAM Theory Probab. Appl., 31, pp. 658-668
Feinberg, E.A., Constrained semi-Markov decision processes with average rewards (1995) Z. Oper. Res., 39, pp. 257-288
Feinberg, E.A., Reiman, M.I., Optimality of randomized trunk reservation (1994) Probab. Engrg. Inform. Sci., 8, pp. 463-489
Fiacco, A.V., Convergence properties of local solutions of convex optimization problems (1974) J. Optim. Theory Appl., 13, pp. 1-12
Gaitsgory, V.A., Pervozvanskii, A.A., Perturbation theory for mathematical programming problems (1986) J. Optim. Theory Appl., 49, pp. 389-410
Hordijk, A., Kallenberg, L.C.M., Constrained undiscounted stochastic dynamic programming (1984) Math. Oper. Res., 9, pp. 276-289
Hordijk, A., Spieksma, F., Constrained admission control to a queuing system (1989) Adv. in Appl. Probab., 21, pp. 409-431
Kallenberg, L.C.M., Linear programming and finite markovian control problems (1983) Mathematical Centre Tracts, 148. , Amsterdam
Kanniappan, P., Sastry, S.M.A., Uniform convergence of convex optimization problems (1974) J. Math. Anal. Appl., 96, pp. 1-12
Krylov, N.V., The construction of an optimal strategy for a finite controlled chain (1965) Theory Probab. Appl., 10, pp. 45-54
Lazar, A., Optimal flow control of a class of queuing networks in equilibrium (1983) IEEE Trans. Automat. Control, 28, pp. 1001-1007
Lignota, M.B., Morgan, J., Topological existence and stability of min sup problems (1990) J. Math. Anal. Appl., 151, pp. 164-180
Lignota, M.B., Morgan, J., Convergences of marginal functions with dependent constraints (1992) Optimization, 23, pp. 189-213
Lucchetti, R., Wets, R.J.B., Convergence of minima of integral functionals, with applications to optimal control and stochastic optimization (1993) Statist. Decisions, 11, pp. 69-84
Morgan, J., Raucci, R., Continuity properties of ∈-solutions for generalized parametric saddle point problems and application to hierarchical games (1997) J. Math. Anal. Appl., 211, pp. 30-48
Nain, P., Ross, K.W., Optimal priority assignment with hard constraint (1986) IEEE Trans. Automat. Control, 31, pp. 883-888
Pervozvanskii, A.A., Gaitsgory, V.A., (1988) Theory of Suboptimal Decision: Decomposition and Aggregation, , Kluwer Academic Publishers, Dordrecht, The Netherlands
Rockafellar, R.T., Wets, R.J.B., Variational analysis (1998) Grundlehren Math. Wiss., 317. , Springer-Verlag, Berlin
Ross, K.W., Randomized and past-dependent policies for Markov decision processes with multiple constraints (1989) Oper. Res., 37, pp. 474-477
Ross, K.W., Chen, B., Optimal scheduling of interactive and non interactive traffic in telecommunication systems (1988) IEEE Trans. Automat. Control, 33, pp. 261-267
Ross, K., Varadarajan, R., Markov decision processes with sample path constraints: The communicating case (1989) Oper. Res., 37, pp. 780-790
Ross, K., Varadarajan, R., Multichain Markov decision processes with a sample path constraint: A decomposition approach (1991) Math. Oper. Res., 16, pp. 195-207
Schal, M., A selection theorem for optimization problems (1974) Arch. Math., 25, pp. 219-224
Schochetman, I.E., Pointwise versions of the maximum theorem with applications to optimization (1990) Appl. Math. Lett., 3, pp. 89-92
Schochetman, I.E., Smith, R.L., Convergence of selections with applications in optimization (1991) J. Math. Anal. Appl., 155, pp. 278-1242
Sennott, L.I., Constrained discounted Markov decision chains (1991) Probab. Engrg. Inform. Sci., 5, pp. 463-475
Sennott, L.I., Constrained average cost Markov decision chains (1993) Probab. Engrg. Inform. Sci., 7, pp. 69-83

Citas:

---------- APA ----------

Tidball, M.M., Lombardi, A., Pourtallier, O. & Altman, E. (2000) . Continuity of optimal values and solutions for control of Markov chains with constraints. SIAM Journal on Control and Optimization, 38(4), 1204-1222.
http://dx.doi.org/10.1137/S0363012997280294

---------- CHICAGO ----------

Tidball, M.M., Lombardi, A., Pourtallier, O., Altman, E. "Continuity of optimal values and solutions for control of Markov chains with constraints" . SIAM Journal on Control and Optimization 38, no. 4 (2000) : 1204-1222.
http://dx.doi.org/10.1137/S0363012997280294

---------- MLA ----------

Tidball, M.M., Lombardi, A., Pourtallier, O., Altman, E. "Continuity of optimal values and solutions for control of Markov chains with constraints" . SIAM Journal on Control and Optimization, vol. 38, no. 4, 2000, pp. 1204-1222.
http://dx.doi.org/10.1137/S0363012997280294

---------- VANCOUVER ----------

Tidball, M.M., Lombardi, A., Pourtallier, O., Altman, E. Continuity of optimal values and solutions for control of Markov chains with constraints. SIAM J Control Optim. 2000;38(4):1204-1222.
http://dx.doi.org/10.1137/S0363012997280294