
What is the difference between Q-learning and SARSA learning?

Best Answers

SARSA takes into account the control policy the agent is actually following and incorporates it into its update of action values, whereas Q-learning simply assumes that an optimal (greedy) policy is followed from the next state onward.

The difference between Q-learning and SARSA is that Q-learning makes an assumption about the control policy being used (that it acts greedily), while SARSA takes the actual behaviour of the control policy into account when updating Q-values.
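Written out, the distinction shows up in a single term of the update rule. The following uses the standard textbook notation, which the answers above do not spell out, so treat the symbols as assumptions: $\alpha$ is the learning rate, $\gamma$ the discount factor, $r_{t+1}$ the reward, and $a_{t+1}$ the action the agent's policy actually selects in state $s_{t+1}$.

```latex
% SARSA: bootstraps from the action a_{t+1} that the policy actually takes
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]

% Q-learning: bootstraps from the best action in s_{t+1}, whatever is taken next
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```

The two rules coincide whenever the policy happens to pick the greedy action in $s_{t+1}$; they differ only on exploratory steps.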

Well, not exactly. A key difference between SARSA and Q-learning is that SARSA is an on-policy algorithm (it learns about the policy it is following) and Q-learning is an off-policy algorithm (it can follow any behaviour policy that fulfills some convergence requirements).

When both SARSA and Q-learning use an ε-greedy policy to strike a balance between exploration and exploitation, they still arrive at different estimates of the action values Q(s, a). Q-learning usually has more aggressive estimates, while SARSA usually has more conservative ones.
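A minimal Python sketch of why the estimates diverge under the same ε-greedy behaviour. The function names (`epsilon_greedy`, `sarsa_target`, `q_learning_target`, `td_update`) and the example numbers are hypothetical and chosen only for illustration; a Q-table row is assumed to be a plain list of action values for one state.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def sarsa_target(reward, q_next_row, next_action, gamma):
    """On-policy target: uses the action actually chosen in the next state."""
    return reward + gamma * q_next_row[next_action]


def q_learning_target(reward, q_next_row, gamma):
    """Off-policy target: uses the best action in the next state."""
    return reward + gamma * max(q_next_row)


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha toward the target."""
    return q_value + alpha * (target - q_value)


if __name__ == "__main__":
    rng = random.Random(0)
    q_next_row = [1.0, 5.0, -10.0]  # made-up values; action 2 is a "cliff"
    reward, gamma, alpha = 0.0, 0.9, 0.5

    next_action = epsilon_greedy(q_next_row, epsilon=0.1, rng=rng)
    print("SARSA target:     ", sarsa_target(reward, q_next_row, next_action, gamma))
    print("Q-learning target:", q_learning_target(reward, q_next_row, gamma))
```

Because the Q-learning target takes the maximum over next actions, it is never lower than the SARSA target for the same transition. Whenever exploration picks a poor action (action 2 in this made-up row), SARSA's target absorbs that cost, which is one way to read the "conservative" versus "aggressive" description above.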
