Optimal Policies for MDPs

From Algorithm Wiki

Revision as of 13:05, 15 February 2023

Description

In a Markov decision process (MDP), a policy specifies which action to take in each state. An optimal policy is one that, in every state, chooses the action that maximizes the expected "return" (or "utility") obtainable from that state. The problem is to find such an optimal policy for a given MDP.
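
The description above can be made precise under one common formalization. The notation here is an assumption for illustration and is not fixed by this page: $S$ is the set of states, $A$ the set of actions, $P(s' \mid s, a)$ the transition probability, $R(s, a, s')$ the reward, and $\gamma \in [0, 1)$ a discount factor. In that infinite-horizon discounted setting, the optimal value function $V^*$ satisfies the Bellman optimality equation, and an optimal policy $\pi^*$ acts greedily with respect to it:

$$
V^*(s) = \max_{a \in A} \sum_{s' \in S} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma V^*(s') \bigr]
$$

$$
\pi^*(s) \in \arg\max_{a \in A} \sum_{s' \in S} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma V^*(s') \bigr]
$$

Other settings (finite horizon, average reward) use different but analogous equations.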

Parameters

No parameters found.

Table of Algorithms

Name | Year | Time | Space | Approximation Factor | Model | Reference
Bellman Value Iteration (VI) | 1957 | $O(2^n)$ | $O(n)$ | Exact | Deterministic | Time
Howard Policy Iteration (PI) | 1960 | $O(n^3)$ | $O(n)$ | Exact | Deterministic | Time
Puterman Modified Policy Iteration (MPI) | 1974 | $O(n^3)$ | $O(n)$ | Exact | Deterministic |
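
To make the first two table entries concrete, here is a minimal Python sketch of value iteration and policy iteration for a small discounted MDP. Everything about the encoding is an assumption of this sketch rather than something specified on this page: the `transitions[s][a]` list of `(probability, next_state, reward)` triples, the discount factor `gamma`, the tolerance `tol`, and the two-state `example_mdp`. Policy evaluation is done by repeated sweeps instead of an exact linear solve, so this is an illustration of the ideas, not a reference implementation of the cited algorithms.

```python
from typing import List, Tuple

# Hypothetical encoding: transitions[s][a] = [(probability, next_state, reward), ...]
Transitions = List[List[List[Tuple[float, int, float]]]]


def q_value(transitions: Transitions, values: List[float],
            s: int, a: int, gamma: float) -> float:
    """Expected return of taking action a in state s, then earning values[next]."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[s][a])


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-10) -> Tuple[List[int], List[float]]:
    """Apply the Bellman optimality backup until the values stop changing."""
    n = len(transitions)
    values = [0.0] * n
    while True:
        new_values = [
            max(q_value(transitions, values, s, a, gamma)
                for a in range(len(transitions[s])))
            for s in range(n)
        ]
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Read off the greedy policy from the converged values.
    policy = [
        max(range(len(transitions[s])),
            key=lambda a: q_value(transitions, values, s, a, gamma))
        for s in range(n)
    ]
    return policy, values


def policy_iteration(transitions: Transitions, gamma: float = 0.9,
                     tol: float = 1e-10) -> Tuple[List[int], List[float]]:
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    n = len(transitions)
    policy = [0] * n
    values = [0.0] * n
    while True:
        # Policy evaluation by in-place sweeps (approximate, up to tol).
        while True:
            delta = 0.0
            for s in range(n):
                v = q_value(transitions, values, s, policy[s], gamma)
                delta = max(delta, abs(v - values[s]))
                values[s] = v
            if delta < tol:
                break
        # Policy improvement: switch only on a clear gain, to avoid cycling on ties.
        stable = True
        for s in range(n):
            current = q_value(transitions, values, s, policy[s], gamma)
            for a in range(len(transitions[s])):
                if q_value(transitions, values, s, a, gamma) > current + 1e-8:
                    policy[s] = a
                    current = q_value(transitions, values, s, a, gamma)
                    stable = False
        if stable:
            return policy, values


# Hypothetical two-state example: in state 0, action 1 moves to state 1 (reward 1);
# in state 1, action 0 stays put and earns reward 2 on every step.
example_mdp: Transitions = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],  # state 0: stay, go
    [[(1.0, 1, 2.0)], [(1.0, 0, 0.0)]],  # state 1: stay, back
]

if __name__ == "__main__":
    print(value_iteration(example_mdp))
    print(policy_iteration(example_mdp))
```

On this example both functions return the policy `[1, 0]` (go from state 0, then stay in state 1), with values close to 19 and 20. Modified policy iteration, the third table entry, can be viewed as the same loop with the evaluation step cut off after a fixed number of sweeps.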

Time Complexity Graph

[Figure: Optimal Policies for MDPs - Time.png]

Space Complexity Graph

[Figure: Optimal Policies for MDPs - Space.png]

Pareto Frontier Improvements Graph

[Figure: Optimal Policies for MDPs - Pareto Frontier.png]