Reinforcement Learning with an Ensemble: Decomposition or Combination?

Document Type


Publication Date



Computer Sciences


In machine learning, reinforcement learning hardly requires an introduction, and when combined with an ensemble approach it has been shown to perform better than a single learner. In an ensemble approach, a collection of individual learners is trained to solve the same problem, and their outputs are strategically combined to obtain a better prediction. It is an effective tool for improving the performance of a model, and it has even been shown to be capable of decomposing the input space in a supervised learning task. However, existing works that use an ensemble of reinforcement learners do not address or successfully explain the ensemble's ability or potential to decompose a task. In deep reinforcement learning, if a single network, that is, a single learner, is responsible for mastering the various components of a complex task, undesirable performance issues can arise. Training a single network to behave differently on different cases, which may be contrasting in nature, can cause interference effects, slow learning, and poor generalization. With an ensemble of learners instead of a single learner, each individual learner can be made responsible for handling a separate segment of the task. We want to examine whether this concept can be realized in reinforcement learning by using an ensemble of Q-learners. With our research, we intend to investigate whether the ensemble simply combines the learners to solve the task as a whole, or whether it is capable of decomposing the state space of the task and solving each component with an individual learner. Our ensemble implementation will be compared with a naïve ensemble as well as the current best Q-learning model, that is, a Q-learner with a single deep neural network.
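
The contrast between combination and decomposition can be illustrated with a minimal sketch. It is not the implementation studied in this work: it assumes small tabular Q-learners in place of deep networks, uses Q-value averaging as one possible combination rule, and relies on a hypothetical `assign` function that maps each state to the learner responsible for it. All class, function, and parameter names are illustrative.

```python
class TabularQLearner:
    """Minimal tabular Q-learner (an assumed stand-in for a deep Q-network)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9):
        # One row of action values per state, initialised to zero.
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update.
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])


def greedy(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda a: values[a])


def combined_action(learners, s):
    """Combination: average all learners' Q-values for state s, then act greedily."""
    n_actions = len(learners[0].q[s])
    avg = [sum(l.q[s][a] for l in learners) / len(learners)
           for a in range(n_actions)]
    return greedy(avg)


def decomposed_action(learners, s, assign):
    """Decomposition: only the learner that `assign` (hypothetical) maps
    state s to decides the action."""
    owner = learners[assign(s)]
    return greedy(owner.q[s])
```

Under combination every learner influences every state, whereas under decomposition each state is handled by a single learner; in this sketch the assignment is supplied by hand, while whether such an assignment can emerge during training is the question the abstract raises.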

First Advisor

Daniel Elliott

Second Advisor

Santosh KC

Research Area

Computer Science

This document is currently not available here.