Part 1. Tabular Solution Methods
Part 2. Approximate Solution Methods

Chapter 1. Introduction
1.1 Reinforcement Learning
1.2 Examples
1.3 Elements of RL
1.4 Applications of RL
1.5 Summary
1.6 Questions

Chapter 2. Multi-armed Bandits
2.1 An n-armed bandit problem
2.2 Action-value methods
2.3 Incremental implementation
2.4 Tracking a nonstationary problem
2.5 Optimistic Initial Values
2.6 Upper-Confidence-Bound Action Selection
2.7 Gradient Bandit Algorithms
2.8 Associative Search (Contextual Bandits)
2.9 Summary
2.10 Questions

Chapter 3. Solving Problems with Dynamic Programming
3.1 MDP
3.2 Categorizing RL algorithms
3.3 Dynamic Programming
3.4 Summary
3.5 Questions

Chapter 4. Monte Carlo Methods
4.1 Monte Carlo prediction
4.2 Monte Carlo estimation of action values
4.3 Monte Carlo Control
4.4 Monte Carlo Control without Exploring Starts
4.5 Off-policy Prediction via Importance Sampling
4.6 Incremental Implementation
4.7 Off-policy MC Control
4.8 Discounting-aware Importance Sampling
4.9 Per-decision Importance Sampling
4.10 Summary
4.11 Questions

Chapter 5. Temporal-Difference Learning
5.1 TD Prediction
5.2 Advantages of TD Prediction methods
5.3 Optimality of TD(0)
5.4 SARSA: On-policy TD Control
5.5 Q-learning: Off-policy TD Control
5.6 Expected SARSA
5.7 Maximization Bias and Double Learning
5.8 Summary
5.9 Questions

Chapter 6. n-step Bootstrapping
6.1 n-step TD prediction
6.2 n-step SARSA
6.3 n-step off-policy learning
6.4 Per-decision methods
6.5 n-step Tree Backup algorithm
6.6 Unifying algorithm
6.7 Summary
6.8 Questions

Chapter 7. On-policy Prediction with Approximation
7.1 Value Approximation & Function Approximation
7.2 Prediction Objective (VE)
7.3 Stochastic-gradient and semi-gradient methods
7.4 Linear methods
7.5 Selecting step-size parameters manually
7.6 Approximation of non-linear functions: Artificial Neural Networks
7.7 Least-Squares TD
7.8 Memory-based Function Approximation
7.9 Kernel-based Function Approximation
7.10 Summary
7.11 Questions

Chapter 8. On-policy Control with Approximation
8.1 Episodic semi-gradient control
8.2 Semi-gradient n-step SARSA
8.3 Average reward
8.4 Deprecating the discounted setting
8.5 Differential semi-gradient n-step SARSA
8.6 Summary
8.7 Questions

Chapter 9. Off-policy Methods with Approximation
9.1 Semi-gradient methods
9.2 Examples of off-policy divergence
9.3 The Deadly Triad
9.4 Linear value function geometry
9.5 Gradient descent in the Bellman error
9.6 The Bellman error is not learnable
9.7 Gradient TD methods
9.8 Emphatic TD methods
9.9 Reducing Variance
9.10 Summary
9.11 Questions

Chapter 10. Eligibility Traces
10.1 The λ-return
10.2 TD(λ)
10.3 n-step Truncated λ-return methods
10.4 Redoing Updates: the Online λ-return algorithm
10.5 True Online TD(λ)
10.6 Dutch Traces in MC learning
10.7 SARSA(λ)
10.8 Variable λ and γ
10.9 Off-policy Traces with Control Variates
10.10 Watkins's Q(λ) to Tree-Backup(λ)
10.11 Stable off-policy methods with traces
10.12 Implementation Issues
10.13 Summary
10.14 Questions

Chapter 11. Policy Gradient Methods
11.1 Policy Approximation and its advantages
11.2 The Policy Gradient Theorem
11.3 REINFORCE: MC Policy Gradient
11.4 REINFORCE with Baseline
11.5 Actor-Critic methods
11.6 Policy Gradient for continuing problems
11.7 Policy Parameterization for Continuous Actions
11.8 Summary
11.9 Questions

Chapter 12. Planning and Learning with Tabular Methods
12.1 Models and Planning
12.2 Dyna: Integrated Planning, Acting, and Learning
12.3 When the Model Is Wrong
12.4 Prioritized Sweeping
12.5 Expected vs. Sample Updates
12.6 Trajectory Sampling
12.7 Real-time Dynamic Programming
12.8 Planning at Decision Time
12.9 Heuristic Search
12.10 Rollout Algorithms
12.11 Monte Carlo Tree Search (MCTS)
12.12 Summary
12.13 Questions

Appendix A. Summary of Notation
Glossary