Lecture 12 2024; Off-line training with neural nets for approximate VI and PI. Aggregation
Author: Dimitri Bertsekas
Uploaded: 2024-04-06
Views: 398
Slides, class notes, and related textbook material at http://web.mit.edu/dimitrib/www/RLboo... A review of neural nets, approximation architectures, and off-line training. Approximate (fitted) value iteration, advantages of Q-learning, use of baselines, differential training, advantage updating. Implementation issues in approximate policy iteration: exploration, policy oscillations, robustness in the face of changing system parameters and on-line replanning. Aggregation architectures. A simple form of aggregation: representative states. Aggregation with representative features.