Over the past few years, machine learning algorithms, especially neural networks (NNs), have seen a surge of popularity owing to their potential to solve a wide variety of complex problems, from image classification to speech recognition. Unfortunately, to be effective, an NN needs the appropriate topology (the connections between neurons) for the task at hand and the right weights on those connections. These are typically obtained through supervised learning, which requires training the NN on terabytes to petabytes of data. This form of machine learning is infeasible for the emerging domain of autonomous systems (robots, drones, cars), which will often operate in environments where the right topology for the task is unknown or keeps changing and robust training data is unavailable. Autonomous systems need the ability to mirror human-like learning, in which we learn continuously, often from experience rather than explicit training. This is known as reinforcement learning (RL). The goal of this project is to enable RL on energy-constrained autonomous devices. If successful, this research will enable the mass proliferation of autonomous robots and drones that assist human society. The findings will also be used to develop new courses on cross-layer support for machine learning.
The focus of the research will be on neuroevolution (NE), a class of RL algorithms that evolves NN topologies and weights using evolutionary algorithms. The idea is to run multiple "parent" NNs in parallel, have the environment provide a reward (score) for the actions of each NN, and create a population of new "child" NNs that preserve the nodes and connections from those parents whose actions produced the highest reward. Run over many iterations, NE algorithms have been shown to evolve complex behaviors in NNs. Unfortunately, NE is computationally very expensive and has required large-scale compute clusters running for hours before converging. A characterization of the computation and memory behavior of NE algorithms will be performed, and opportunities to massively parallelize these algorithms across genes (i.e., the nodes and connections in the NN) will be explored. The evolutionary learning steps of crossover and mutation will be performed on specialized hardware engines, and a low-power architectural platform running NE algorithms at the edge will be demonstrated. Furthermore, the proposed research will serve as a foundation for further work on fast and energy-efficient RL algorithms to help realize general-purpose artificial intelligence.
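To make the evolutionary loop concrete, the sketch below shows a minimal NE generation cycle in Python. It is illustrative only, not the proposed hardware-accelerated design: the population size, mutation rate, and synthetic fitness function standing in for the environment's reward are assumptions, and topology evolution (adding or removing nodes and connections, as in NEAT-style algorithms) is omitted for brevity, so only the weight genes of a fixed-topology network are evolved.

```python
# Minimal neuroevolution sketch (illustrative assumptions throughout).
# Evolves a fixed-length genome of NN weights via selection, crossover, and mutation.
import numpy as np

rng = np.random.default_rng(0)

POP_SIZE = 8       # number of parent networks evaluated in parallel (assumed)
GENOME_LEN = 16    # number of weight "genes" in a small fixed-topology NN (assumed)
MUT_RATE = 0.1     # per-gene mutation probability (assumed)
GENERATIONS = 50

def fitness(genome):
    # Placeholder for the environment's reward: here, how close the genome's
    # weights are to an arbitrary target vector. A real system would instead
    # run the NN in the environment and return its accumulated reward.
    target = np.linspace(-1.0, 1.0, GENOME_LEN)
    return -np.sum((genome - target) ** 2)

def crossover(parent_a, parent_b):
    # Uniform crossover: each child gene is copied from one of the two parents.
    mask = rng.random(GENOME_LEN) < 0.5
    return np.where(mask, parent_a, parent_b)

def mutate(genome):
    # Perturb a random subset of genes with Gaussian noise.
    mask = rng.random(GENOME_LEN) < MUT_RATE
    return genome + mask * rng.normal(0.0, 0.2, GENOME_LEN)

# Initialize a random parent population.
population = rng.normal(0.0, 1.0, (POP_SIZE, GENOME_LEN))

for gen in range(GENERATIONS):
    # Score every parent; conceptually, each NN acts and the environment rewards it.
    scores = np.array([fitness(g) for g in population])
    # Keep the top half (highest reward) as parents for the next generation.
    parents = population[np.argsort(scores)[-POP_SIZE // 2:]]
    # Breed children via crossover and mutation until the population is refilled.
    children = [
        mutate(crossover(*parents[rng.choice(len(parents), 2, replace=False)]))
        for _ in range(POP_SIZE - len(parents))
    ]
    population = np.vstack([parents, np.array(children)])

final_scores = np.array([fitness(g) for g in population])
print("best reward after evolution:", final_scores.max())
```

In this sketch, the per-generation evaluation of every parent and the per-gene crossover and mutation operations are independent across networks and across genes, which is the parallelism the proposed characterization and specialized hardware engines would aim to exploit.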