Food for Thought | Boost Productivity Through Multi Agent Reinforcement Learning

2021-11-12

Great Things Come from Fighting Continuous Small Battles

Reinforcement learning (RL) has well-known limitations, which is partly why other advanced techniques such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have gained greater popularity. The issue lies mostly in practical application: either RL applies to only a limited class of problems, or the problem cannot be molded into a form that RL accepts.

To apply RL to a problem, we must first answer certain questions that determine whether it is an appropriate candidate. Most importantly, it should be possible to formulate the problem as a Markov decision process (MDP), i.e. the user should be able to define a state space, an action space, and a reward system, and the overall framework should be able to improve by receiving feedback from the environment.[1]
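As a minimal sketch of what such a formulation can look like, the Python snippet below defines a toy MDP and lets tabular Q-learning improve from environment feedback. Everything in it is an assumption made for illustration and does not come from the cited reference: the five-state corridor, the reward of 1.0 at the goal, the names `step`, `train` and `greedy_policy`, and the values of `GAMMA` and `ALPHA`.

```python
import random

N_STATES = 5                   # states 0..4; state 4 is the goal (terminal)
ACTIONS = ("left", "right")    # the action space
GAMMA = 0.9                    # discount factor (assumed value)
ALPHA = 0.5                    # learning rate (assumed value)


def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)              # explore at random
            nxt, reward, done = step(state, action)   # feedback from the environment
            target = reward if done else reward + GAMMA * max(q[nxt].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Best known action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)  # ['right', 'right', 'right', 'right']
```

The three ingredients named above map directly onto the code: the states and actions are the keys of `q`, the reward system lives in `step`, and the update inside `train` is where feedback changes future behavior. A problem for which these pieces cannot be written down is probably a poor candidate for RL.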

If the problem meets these conditions, RL can simulate some striking what-if scenarios. For example, the user could create a parallel scenario and compare how the results would differ if the actions had been performed in a different manner or along a different route than the original.
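A what-if comparison of this kind can be sketched by replaying two action sequences through the same environment model and comparing the total reward. The machine-maintenance model below is entirely hypothetical: the wear levels, the profit and cost figures, and the names `step` and `total_reward` are assumptions chosen only to make the idea concrete.

```python
MAX_WEAR = 3  # hypothetical wear level at which running the machine breaks it


def step(wear, action):
    """Hypothetical deterministic model: returns (next_wear, reward)."""
    if action == "repair":
        return 0, -4                  # repair costs 4 and resets wear
    if wear == MAX_WEAR:
        return 0, -20                 # running a worn-out machine causes a breakdown
    return wear + 1, 10 - 3 * wear    # profit shrinks as wear grows


def total_reward(actions, wear=0):
    """Replay a sequence of actions through the model and sum the rewards."""
    total = 0
    for action in actions:
        wear, reward = step(wear, action)
        total += reward
    return total


logged = ["run"] * 8                                    # what actually happened
what_if = ["run", "run", "repair"] * 2 + ["run", "run"]  # the parallel scenario

print(total_reward(logged))   # 2
print(total_reward(what_if))  # 43
```

In this toy model the alternative schedule earns 43 instead of 2. The comparison is only as trustworthy as the model behind `step`, which in a real application would have to be learned or validated against data.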

Supercharging RL: Combining RL with Deep Learning

Some of the limitations of RL can be offset by combining it with deep learning architectures such as CNNs or RNNs. Combined with these networks, agents gain new abilities. With a CNN, the agent can see the environment and learn to interact within it.[1] Similarly, with an RNN (which has a memory component), the agent can remember what it has observed.

Multi Agent Reinforcement Learning

Many historic innovations have come from studying the behavior or movement of animals and birds. For example, the Wright brothers built their airplane after taking cues from how birds fly.[2] The same holds for multi-agent RL: the behavior of ants was studied to formulate multi-agent robots or agents. The most fascinating behavior of ants is their ability to closely coordinate on a task (e.g. gathering food) and achieve a common goal. If something similar can be taught to robots or agents, the opportunities created will be vast.

The prime objective of RL is to build an agent whose goal is to maximize the reward it collects within an environment. There are instances where robots have outperformed humans in various fields. Building on this, multi-agent reinforcement learning (MARL) takes RL a level up. This field studies how multiple agents can collectively learn, communicate, and co-exist in an environment to achieve a common goal.[3] If this becomes more of a reality, it will open a plethora of opportunities in fields like farming, manufacturing (e.g. building mega-factories) and healthcare (e.g. building critical hospitals in a pandemic situation). More broadly, it could be applied in many ways to boost productivity.
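The idea of several agents learning toward a common goal can be sketched with two independent learners that share one team reward. The example below is a minimal illustration under stated assumptions, not a method from the cited reference: the agent names, the two food sources, the payoff values, and the choice of uniform exploration with sample-average value estimates are all hypothetical.

```python
import random

AGENTS = ("ant_a", "ant_b")     # hypothetical agent names
ACTIONS = ("north", "south")    # hypothetical food sources


def team_reward(joint_action):
    """Shared reward: food is carried home only if both agents pick the same source."""
    first, second = joint_action
    if first != second:
        return 0.0
    return 1.0 if first == "north" else 0.5   # the north source is richer


def train(steps=2000, seed=0):
    """Each agent keeps its own value table but learns from the common reward."""
    rng = random.Random(seed)
    q = {agent: {a: 0.0 for a in ACTIONS} for agent in AGENTS}
    counts = {agent: {a: 0 for a in ACTIONS} for agent in AGENTS}
    for _ in range(steps):
        joint = tuple(rng.choice(ACTIONS) for _ in AGENTS)   # uniform exploration
        reward = team_reward(joint)
        for agent, action in zip(AGENTS, joint):
            counts[agent][action] += 1
            # Incremental sample average of the team reward seen after this action.
            q[agent][action] += (reward - q[agent][action]) / counts[agent][action]
    return q


q_values = train()
joint_greedy = tuple(max(ACTIONS, key=lambda a: q_values[agent][a]) for agent in AGENTS)
print(joint_greedy, team_reward(joint_greedy))  # ('north', 'north') 1.0
```

Neither agent sees the other's choice, yet both settle on the richer source because the shared reward favors it on average. This works here only because one joint action clearly dominates; in harder coordination problems independent learners can pull in different directions, which is one reason communication becomes important.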

Food For Thought

There is still much to be explored in the field of RL, and specifically in MARL, where the challenges include the efficient collaboration and co-existence of agents in the environment. For efficient collaboration, agents need to communicate; hence, they require a common language. This part can be tricky. For example, one agent could develop a way of communicating that another agent doesn’t understand because the two have followed different learning trajectories. In this case, communication won’t be effective.[4]
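The mismatch described above can be sketched with a toy signaling game: a listener learns from reward feedback to decode one speaker's signals, then meets a speaker that settled on a different code. The meanings, signals, and the two speaker codes below are hypothetical, as are the function names; this is an illustration under those assumptions rather than a description of any particular MARL system.

```python
import random

MEANINGS = ("food", "danger")
SIGNALS = ("x", "y")

# Hypothetical codes that two speakers settled on in separate training runs.
SPEAKER_1 = {"food": "x", "danger": "y"}
SPEAKER_2 = {"food": "y", "danger": "x"}


def train_listener(speaker, trials=200, seed=0):
    """The listener learns signal -> meaning values from reward feedback alone."""
    rng = random.Random(seed)
    q = {s: {m: 0.0 for m in MEANINGS} for s in SIGNALS}
    for _ in range(trials):
        meaning = rng.choice(MEANINGS)     # what the speaker wants to convey
        signal = speaker[meaning]          # what the listener actually hears
        guess = rng.choice(MEANINGS)       # the listener explores at random
        reward = 1.0 if guess == meaning else 0.0
        q[signal][guess] += 0.5 * (reward - q[signal][guess])
    return q


def decode(q, signal):
    """The listener's best guess for a signal."""
    return max(MEANINGS, key=lambda m: q[signal][m])


def accuracy(speaker, q):
    """Fraction of meanings the listener recovers from this speaker."""
    hits = sum(decode(q, speaker[m]) == m for m in MEANINGS)
    return hits / len(MEANINGS)


listener = train_listener(SPEAKER_1)
print(accuracy(SPEAKER_1, listener))  # 1.0
print(accuracy(SPEAKER_2, listener))  # 0.0
```

Both speakers use a perfectly consistent code, and the listener has learned its partner's code perfectly, yet pairing it with the other speaker yields no successful communication at all. A shared language is a property of the agents' joint training history, not of any single agent.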

Hence, building an effective communication method through which agents can understand each other is the most critical aspect of the future development of any MARL system.

References

[1] Applications of Reinforcement Learning in the Real World

[2] What Influenced the Wright Brothers About How Things Fly

[3] The Gist of Multi Agent Reinforcement Learning

[4] MARL and the Future of AI

Authored by Abhijeet Agarwal, Data Scientist at Absolutdata