Contribución al estudio y el diseño de funciones de refuerzo

Santos, Juan Miguel

Doctoral Thesis

Santos, Juan Miguel. "Contribución al estudio y el diseño de funciones de refuerzo". (1999). Doctoral thesis, Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales.

Record:

Document:	Doctoral Thesis
Title:	Contribución al estudio y el diseño de funciones de refuerzo
Alternative title:	Contribution to the study and the design of reinforcement functions
Author:	Santos, Juan Miguel
Publisher:	Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales
Workplace:	Universite d'Aix-Marseille III
Published on the web:	2017-03-01
Defense date:	1999
Date on cover:	1999
Degree level obtained:	Doctorate
Degree title obtained:	Doctor in Computer Science
Academic department:	Departamento de Computación
Advisors:	Scolnik, Hugo Daniel; Giambiasi, Norbert
Study counselor:	Touzet, Claude
Language:	Spanish
Format:	PDF
Handle:	https://hdl.handle.net/20.500.12110/tesis_n3129_Santos
PDF:	https://bibliotecadigital.exactas.uba.ar/download/tesis/tesis_n3129_Santos.pdf
Record:	https://bibliotecadigital.exactas.uba.ar/collection/tesis/document/tesis_n3129_Santos
Location:	003129
Access rights:	This work may be read, saved, and used for study, research, and teaching purposes. Authorship must be acknowledged through the corresponding citation. Santos, Juan Miguel. (1999). Contribución al estudio y el diseño de funciones de refuerzo. (Doctoral thesis. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales). Retrieved from https://hdl.handle.net/20.500.12110/tesis_n3129_Santos

Abstract:

We propose a Reinforcement Function Design Process in two steps. The first one translates a natural language description into an instance of the Reinforcement Function General Expression. The second tunes the parameters of the constraints in this expression so as to obtain the optimal definition of the function (relative to exploration). We separate the constraints according to the type of state variable estimator on which they act, in particular position and velocity. Using a particular but representative Reinforcement Function (RF) expression, we study the relation between the sum of each reinforcement type and the RF parameters during the exploration phase of learning. For linear relations, we propose an analytic method to obtain the RF parameter values (no experimentation required). For non-linear but monotonic relations, we propose the Update Parameter Algorithm (UPA) and show that UPA can efficiently adjust the proportion of negative and positive reinforcements received during the exploratory phase of learning. Additionally, we study the feasibility and consequences of adapting the RF during the learning process so as to improve the learning convergence of the system. Dynamic-UPA allows the whole learning process to maintain a desired ratio of positive and negative rewards. Thus, we introduce an approach to solving the exploration-exploitation dilemma, a necessary step for efficient Reinforcement Learning. We illustrate the performance of the proposed design methods with several experiments involving robots (mobile and arm). Finally, we emphasize the main conclusions and present some future directions of research.

Citation:

---------- APA ----------

Santos, Juan Miguel. (1999). Contribución al estudio y el diseño de funciones de refuerzo. (Doctoral thesis. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales). Retrieved from https://hdl.handle.net/20.500.12110/tesis_n3129_Santos

---------- CHICAGO ----------

Santos, Juan Miguel. "Contribución al estudio y el diseño de funciones de refuerzo". Doctoral thesis, Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales, 1999. https://hdl.handle.net/20.500.12110/tesis_n3129_Santos
