Optimizing Large-Scale Systems with Reinforcement Learning



Name: Optimizing Large-Scale Systems with Reinforcement Learning
Price: 639 SEK
Availability: In stock
Author: Sayak Ray Chowdhury
ISBN: 9798224721306



Estimated delivery time: 7-11 working days

Free shipping for members on purchases of at least 249 SEK


Reinforcement learning (RL) is concerned with learning, by trial and error, to take actions
that maximize rewards in environments that can evolve in response to those actions. A
Markov decision process (MDP) [6] is a popular framework for modeling decision making in
RL environments. In an MDP, starting from an initial observed state, an agent repeatedly
(a) takes an action, (b) receives a reward, and (c) observes the next state of the MDP.
The traditional objective in RL is a search goal: find a policy (a rule that selects an action
for each state) with high total reward using as few interactions with the environment as
possible; the number of interactions required is known as the sample complexity of the RL
problem [7]. This is, however, quite different from the corresponding optimization goal,
where the learner seeks to maximize the total reward earned from all its decisions or,
equivalently, to minimize the regret, that is, the shortfall in total reward compared to that
of an optimal policy [8]. This objective is relevant in many practical sequential
decision-making settings in which every decision carries utility or value: recommendation
systems (clicks by consumers translate into revenue), sequential investment and portfolio
allocation (financial holdings make profits or losses), and dynamic resource allocation in
communication systems (scheduling decisions affect data throughput), to name a few.
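
The interaction loop and the notion of regret described above can be illustrated with a minimal Python sketch. It is not taken from the book: the two-state MDP, its reward table, and all names (`NEXT_STATE`, `REWARD`, `run_policy`, `optimal_total_reward`, `regret`) are hypothetical, and the example assumes deterministic transitions and a fixed, finite horizon so that the optimal total reward can be computed exactly by backward induction. It measures the regret of a fixed policy rather than of a learning algorithm.

```python
from typing import Callable, Dict, Tuple

State = int
Action = int
Policy = Callable[[State], Action]

# Hypothetical toy MDP (illustration only): two states, two actions,
# deterministic transitions and rewards.
STATES = (0, 1)
ACTIONS = (0, 1)
NEXT_STATE: Dict[Tuple[State, Action], State] = {
    (0, 0): 0, (0, 1): 1,
    (1, 0): 0, (1, 1): 1,
}
REWARD: Dict[Tuple[State, Action], float] = {
    (0, 0): 0.0, (0, 1): 1.0,
    (1, 0): 0.0, (1, 1): 2.0,
}


def run_policy(policy: Policy, start: State, horizon: int) -> float:
    """Total reward collected by following `policy` for `horizon` steps."""
    state = start
    total = 0.0
    for _ in range(horizon):
        action = policy(state)                # (a) take an action
        total += REWARD[(state, action)]      # (b) receive a reward
        state = NEXT_STATE[(state, action)]   # (c) observe the next state
    return total


def optimal_total_reward(start: State, horizon: int) -> float:
    """Best achievable total reward, computed by backward induction."""
    value = {s: 0.0 for s in STATES}  # value with zero steps remaining
    for _ in range(horizon):
        value = {
            s: max(REWARD[(s, a)] + value[NEXT_STATE[(s, a)]] for a in ACTIONS)
            for s in STATES
        }
    return value[start]


def regret(policy: Policy, start: State, horizon: int) -> float:
    """Shortfall in total reward relative to an optimal policy."""
    return optimal_total_reward(start, horizon) - run_policy(policy, start, horizon)


def always_zero(state: State) -> Action:
    """A poor policy: always choose action 0."""
    return 0


def always_one(state: State) -> Action:
    """In this toy MDP, always choosing action 1 is optimal."""
    return 1
```

In this toy setting, `regret(always_zero, start=0, horizon=5)` evaluates to 9.0, while `regret(always_one, start=0, horizon=5)` evaluates to 0.0, since always choosing action 1 attains the optimal total reward.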

Author: Sayak Ray Chowdhury
Format: Paperback
ISBN: 9798224721306
Language: English
Number of pages: 190
Publication date: 2024-03-29
Publisher: Classichouse