What are the different methods in Reinforcement Learning?


Reinforcement Learning refers to a type of machine learning in which an agent learns by interacting with an environment, balancing exploration (of uncharted territory) against exploitation (of current knowledge) to maximize its cumulative reward.
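To make the exploration/exploitation balance and the notion of cumulative reward concrete, here is a minimal sketch. It assumes an epsilon-greedy rule (one common way to strike the balance, not the only one) and a discounted sum of rewards; the function names, `epsilon`, and `gamma` are illustrative choices, not taken from any particular library.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick an action index: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: try a uniformly random action (uncharted territory).
        return rng.randrange(len(q_values))
    # Exploit: take the action with the highest current value estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])

def discounted_return(rewards, gamma):
    """Cumulative reward r_0 + gamma*r_1 + gamma^2*r_2 + ... for one episode."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

rng = random.Random(0)                      # seeded for reproducibility
estimates = [0.1, 0.9, 0.3]                 # hypothetical value estimates for 3 actions
greedy_action = epsilon_greedy(estimates, 0.0, rng)   # epsilon = 0: pure exploitation
random_action = epsilon_greedy(estimates, 1.0, rng)   # epsilon = 1: pure exploration
episode_return = discounted_return([1.0, 1.0, 1.0], 0.5)
```

Setting `epsilon` between 0 and 1 trades off the two behaviours; many practical setups start with a high value and decay it as the estimates improve.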

Illustration of concepts in reinforcement learning

At the highest level, Reinforcement Learning methods can be categorized into the following two types:

  • Model-free RL

    Model-free RL is when the agent(s) do not use a model of the environment to plan ahead. It operates in one of three ways: policy optimization, where the parameters specifying the policy are tuned directly to arrive at an optimal policy (a policy is the rule that determines the agent's next action given the current state of the environment); Q-learning, where the optimal action-value function is learned (the action-value function is the expected return from taking a certain action in a certain state of the environment); or a mixture of the two.

  • Model-based RL

    Model-based RL is when the agent(s) use a model of the environment to plan ahead. The model can either be given in advance or learned by the agent from experience.
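
The contrast between the two types can be sketched on a toy problem. The example below is an illustration under stated assumptions, not a reference implementation: the four-state chain environment, the `step` function, and all hyperparameters are hypothetical. The model-free learner is tabular Q-learning, which only sees sampled transitions; the model-based planner is value iteration, which stands in for the "given model" case by querying `step` for every state and action.

```python
import random

N_STATES, GOAL, GAMMA = 4, 3, 0.9   # states 0..3; state 3 is terminal
ACTIONS = (0, 1)                     # 0 = left, 1 = right

def step(state, action):
    """Toy deterministic environment: returns (next_state, reward, done)."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, epsilon=0.5, max_steps=200, seed=0):
    """Model-free: learn action values only from transitions the agent experiences."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                        # explore
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])    # exploit
            s2, r, done = step(s, a)
            # Q-learning target: reward plus discounted best value of the next state.
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            if done:
                break
            s = s2
    return q

def value_iteration(sweeps=50):
    """Model-based (given model): plan by querying step() for every state-action pair."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(GOAL):                # terminal state needs no backup
            for a in ACTIONS:
                s2, r, done = step(s, a)
                q[s][a] = r if done else r + GAMMA * max(q[s2])
    return q

def greedy_policy(q):
    """Best action in each non-terminal state according to the table q."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]

learned_policy = greedy_policy(q_learning())
planned_policy = greedy_policy(value_iteration())
```

Both routes should arrive at the same greedy policy here (always move right), but the planner never acts in the environment, while the Q-learner never inspects how `step` works. A policy-optimization method, or a model-based method that learns its model, would need a different sketch.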