About Me

Praveen Palanisamy

Praveen Palanisamy is the Principal AI Engineering lead for Project AirSim in the Autonomous Systems Business Incubation group at Microsoft, building an end-to-end (Perception + Scene-Understanding + Prediction + Planning + Control) autonomy platform for autonomous aerial robots and systems using simulation, planet-scale synthetics, deep learning and AI. Prior to that, he was an Autonomous Driving AI Researcher at General Motors R&D in Michigan, where he developed planning and decision-making algorithms and architectures using deep reinforcement learning. He is the lead inventor on 20+ patents in the area of learning-based autonomous mobile systems. He has authored two practical books, HOIAWOG and the TensorFlow 2.x RL Cookbook, for ML engineers, researchers, students and enthusiasts. He has worked at a few early-stage startups as a tech lead. He obtained his graduate degree from the Robotics Institute at Carnegie Mellon University, where he worked on autonomous navigation, perception and artificial intelligence as a research and teaching assistant.

My Projects


Multi-Agent Connected Autonomous Driving (MACAD) Gym environments for Deep RL.



This book contains easy-to-follow recipes for leveraging TensorFlow 2.x to develop artificial intelligence applications. Starting with an introduction to the fundamentals of deep reinforcement learning and TensorFlow 2.x, the book covers OpenAI Gym, model-based RL, model-free RL, and how to develop basic agents. You'll discover how to implement advanced deep reinforcement learning algorithms such as actor-critic, deep deterministic policy gradients, deep-Q networks, proximal policy optimization, and deep recurrent Q-networks for training your RL agents. As you advance, you’ll explore the applications of reinforcement learning by building cryptocurrency trading agents, stock/share trading agents, and intelligent agents for automating task completion. Finally, you'll find out how to deploy deep reinforcement learning agents to the cloud and build cross-platform apps using TensorFlow 2.x.
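The agent-environment training loop that the book's recipes build on can be sketched with a tiny tabular Q-learning example. This is illustrative only: the environment, hyperparameters and reward below are invented for this toy, and the book's actual recipes use TensorFlow 2.x and OpenAI Gym, neither of which this sketch needs.

```python
import random

class ChainEnv:
    """Toy 5-state chain: action 1 moves right, action 0 moves left;
    reaching the rightmost state yields +1 reward and ends the episode."""
    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action == 1:
            self.state = min(self.state + 1, self.n_states - 1)
        else:
            self.state = max(self.state - 1, 0)
        done = self.state == self.n_states - 1
        return self.state, (1.0 if done else 0.0), done

def q_learning(env, episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2):
    """Learn a Q-table for the two-action chain environment."""
    q = {(s, a): 0.0 for s in range(env.n_states) for a in (0, 1)}
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration over the two actions.
            if random.random() < epsilon:
                a = random.choice((0, 1))
            else:
                a = max((0, 1), key=lambda act: q[(s, act)])
            s2, r, done = env.step(a)
            # Q-learning update: bootstrap from the best next-state value.
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, 0)], q[(s2, 1)]) - q[(s, a)])
            s = s2
    return q
```

After training, the greedy policy (the higher-valued action in each state) moves right toward the reward; the deep RL algorithms in the book replace the Q-table with a neural network.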


Multiple Object Tracking using LiDARs

Multiple objects detection, tracking and classification from LIDAR scans/point-clouds
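One core step in tracking multiple objects across LiDAR scans is data association: matching each new detection to an existing track. The greedy nearest-neighbor matcher below is a hedged sketch of that step only (real pipelines typically add Kalman-filter prediction and Hungarian assignment); the gating distance and the 2-D point representation are assumptions for illustration.

```python
import math

def associate(tracks, detections, gate=2.0):
    """Greedy nearest-neighbor data association.
    tracks / detections: dicts of id -> (x, y) position.
    Returns (matches: {detection_id: track_id}, unmatched detection ids)."""
    matches, unmatched = {}, []
    free = dict(tracks)  # tracks not yet claimed by a detection
    for det_id, d in detections.items():
        best, best_dist = None, gate  # only accept matches inside the gate
        for trk_id, t in free.items():
            dist = math.hypot(d[0] - t[0], d[1] - t[1])
            if dist < best_dist:
                best, best_dist = trk_id, dist
        if best is None:
            unmatched.append(det_id)  # candidate for starting a new track
        else:
            matches[det_id] = best
            del free[best]  # each track is matched at most once
    return matches, unmatched
```

Unmatched detections would seed new tracks, and unclaimed tracks would age out after a few misses.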


Hands-on Intelligent Agent development using OpenAI Gym

HOIAWOG! Your guide to developing AI agents using deep reinforcement learning. Implement intelligent agents using PyTorch to solve classic AI problems, play console games like Atari, and perform tasks such as autonomous driving using the CARLA driving simulator.
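A building block that the book's deep Q-learning agents for console games rely on is the experience replay buffer, which stores past transitions and samples random minibatches to decorrelate training data. The minimal sketch below is illustrative of the idea, not the book's exact implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done)
    transitions; old transitions are evicted automatically."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch, which breaks temporal correlation.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

The agent pushes one transition per environment step and, once the buffer is warm, samples a minibatch per gradient update.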


Timeline of Events

November 6, 2020

[Talk] FIU Seminar on Multi-Agent Deep Reinforcement Learning for Connected Autonomous Driving

Delivered a seminar at the Florida International University School of Computing and Information Sciences (FIU SCIS). Abstract: The ability of vehicles, robots or agents to autonomously navigate in 2D, 3D and unconstrained spaces is desirable for several real-world applications. Autonomous driving on roads, a subset of the autonomous navigation space, has in recent times become one of the major focus areas in the automotive industry, alongside electrification. It involves autonomous vehicles navigating safely and socially from their start location to their desired goal location in usually complex environments. The autonomous driving field has advanced to the point of feasible real-world deployments, but these are limited in several ways, including their domain of operation. The capability to learn and adapt to changes in the driving environment and in the intents of other road actors is crucial for autonomous driving systems to scale beyond the current, limited operational design domains. With the increasingly ubiquitous availability of 5G communication infrastructure, connectivity among vehicles provides a whole new avenue for connected autonomous driving. This talk is about using multi-agent deep reinforcement learning as a framework for formulating autonomous driving problems and developing solutions for them using simulation. It proposes the use of Partially Observable Markov Games for formulating connected autonomous driving problems with realistic assumptions. A taxonomy of multi-agent learning environments, based on the nature of tasks, the nature of agents and the nature of the environment, will be discussed to help categorize the various autonomous driving problems that can be addressed under the proposed formulation.
In addition, MACAD-Gym, a multi-agent learning platform with an extensible set of Connected Autonomous Driving (CAD) simulation environments that enable the research and development of deep-RL-based integrated sensing, perception, planning and control algorithms for CAD systems with unlimited operational design domain under realistic, multi-agent settings, will also be discussed. The talk concludes with remarks on autonomous navigation in 3D space, AirSim, Bonsai and an overview of Microsoft Autonomous Systems.


December 8, 2019

[NeurIPS19] Multi-Agent Connected Autonomous Driving using Deep Reinforcement Learning

The capability to learn and adapt to changes in the driving environment is crucial for developing autonomous driving systems that are scalable beyond geo-fenced operational design domains. Deep Reinforcement Learning (RL) provides a promising and scalable framework for developing adaptive learning-based solutions. Deep RL methods usually model the problem as a (Partially Observable) Markov Decision Process in which an agent acts in a stationary environment to learn an optimal behavior policy. However, driving involves complex interaction between multiple, intelligent (artificial or human) agents in a highly non-stationary environment. In this paper, we propose the use of Partially Observable Markov Games (POSG) for formulating the connected autonomous driving problems with realistic assumptions. We provide a taxonomy of multi-agent learning environments based on the nature of tasks, nature of agents and the nature of the environment to help in categorizing various autonomous driving problems that can be addressed under the proposed formulation. As our main contributions, we provide MACAD-Gym, a Multi-Agent Connected Autonomous Driving agent learning platform for furthering research in this direction. Our MACAD-Gym platform provides an extensible set of Connected Autonomous Driving (CAD) simulation environments that enable the research and development of Deep RL-based integrated sensing, perception, planning and control algorithms for CAD systems with unlimited operational design domain under realistic, multi-agent settings. We also share the MACAD-Agents that were trained successfully using the MACAD-Gym platform to learn control policies for multiple vehicle agents in a partially observable, stop-sign controlled, 3-way urban intersection environment with raw (camera) sensor observations. Paper: https://arxiv.org/abs/1911.04175 Code: https://github.com/praveen-palanisamy/macad-gym
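The dict-keyed step interface below is a toy sketch of the POSG-style multi-agent API that platforms like MACAD-Gym expose: every agent submits its own action each step and receives only its own partial observation and reward. The 1-D dynamics, agent names and reward here are invented for illustration; real MACAD-Gym environments provide camera observations and CARLA-based vehicle dynamics.

```python
class MultiAgentEnv:
    """N agents on a 1-D line; each observes only its own position
    (partial observability) and is rewarded for reaching the origin."""

    def __init__(self, starts):
        self.start = dict(starts)
        self.pos = dict(starts)

    def reset(self):
        self.pos = dict(self.start)
        return dict(self.pos)

    def step(self, actions):
        """actions: {agent_id: -1 | 0 | +1}. Returns per-agent dicts
        of observations, rewards and done flags, keyed by agent id."""
        obs, rewards, dones = {}, {}, {}
        for agent_id, a in actions.items():
            self.pos[agent_id] += a
            obs[agent_id] = self.pos[agent_id]  # each agent sees only itself
            rewards[agent_id] = 1.0 if self.pos[agent_id] == 0 else 0.0
            dones[agent_id] = self.pos[agent_id] == 0
        return obs, rewards, dones
```

The per-agent dict structure is what lets independent learners (or a centralized trainer) be attached to each vehicle in the multi-agent setting.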


September 7, 2019

Received “Best Reinforcement Learning eBooks of All Time” award!

I’m happy to announce that my book HOIAWOG, “Hands-On Intelligent Agents with OpenAI Gym: Your guide to developing AI agents using deep reinforcement learning”, made it onto the Best Reinforcement Learning eBooks of All Time list compiled by BookAuthority. BookAuthority collects and ranks the best books in the world, and it is a great honor to get this kind of recognition. Thank you for all the readers’ support! You can learn more about the HOIAWOG book here. The source code for all the agents, algorithms and implementation details is available on GitHub. You can get a copy of the book from Amazon.


November 4, 2018

[ITSC18] POMDP and Hierarchical Options MDP with Continuous Actions for Autonomous Driving at Intersections

When applying autonomous driving technology to real-world scenarios, environmental uncertainties make the development of decision-making algorithms difficult. Modeling the problem as a Partially Observable Markov Decision Process (POMDP) [1] allows the algorithm to consider these uncertainties in the decision process, which makes it more robust to real sensor characteristics. However, solving the POMDP with reinforcement learning (RL) [2] often requires storing a large number of observations. Furthermore, for continuous action spaces, the system is computationally inefficient. This paper addresses these problems by modeling the problem as an MDP and learning a policy with RL using hierarchical options (HOMDP). The suggested algorithm can store the state-action pairs and uses only current observations to solve a POMDP problem. We compare the results to the time-to-collision method [3] and the proposed POMDP-with-LSTM method. Our results show that the HOMDP approach is able to improve the performance of the agent for a four-way intersection task with two-way stop signs. The HOMDP method can generate both higher-level discrete options and lower-level continuous actions with only the observations of the current step. Citation: Z. Qiao, K. Muelling, J. M. Dolan, P. Palanisamy and P. Mudalige, “POMDP and Hierarchical Options MDP with Continuous Actions for Autonomous Driving at Intersections,” 2018 IEEE 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, 2018.
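The two-level structure described above can be sketched in a few lines: a high-level policy picks a discrete option from the current observation only, and a low-level policy turns that option into a continuous acceleration command. The thresholds, field names and control law below are invented for illustration and are not the paper's learned policies.

```python
def high_level_option(obs):
    """Pick a discrete option using only the current observation."""
    return "go" if obs["gap_to_cross_traffic"] > 10.0 else "wait"

def low_level_action(option, obs):
    """Map the chosen option to a continuous acceleration (m/s^2)."""
    if option == "wait":
        return max(-3.0, -obs["speed"])  # brake toward a stop
    # Speed up toward the gap, capped at a comfortable acceleration.
    return min(2.0, 0.5 * obs["gap_to_cross_traffic"] - obs["speed"])
```

In the paper both levels are learned with RL; the point of the hierarchy is that discrete option selection and continuous control are trained as separate, simpler problems.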


June 26, 2018

[IV18]Automatic Curriculum Generation for RL in Autonomous Vehicles in Urban Environment

We address the problem of learning autonomous driving behaviors in urban intersections using deep reinforcement learning (DRL). DRL has become a popular choice for creating autonomous agents due to its success in various tasks. However, as the problems tackled become more complex, the number of necessary training iterations increases drastically. Curriculum learning has been shown to reduce the required training time and improve the performance of the agent, but creating an optimal curriculum often requires human handcrafting. In this work, we learn a policy for urban intersection crossing using DRL and introduce a method to automatically generate the curriculum for the training process from a candidate set of tasks. We compare the performance of the automatically generated curriculum (AGC) training to that of randomly generated sequences and show that AGC can significantly reduce the training time while achieving similar or better performance. Citation: Z. Qiao, K. Muelling, J. M. Dolan, P. Palanisamy and P. Mudalige, “Automatically Generated Curriculum based Reinforcement Learning for Autonomous Vehicles in Urban Environment,” 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, 2018, pp. 1233-1238. doi: 10.1109/IVS.2018.8500603
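One plausible selection rule for automatic curriculum generation can be sketched as follows (a sketch only, not necessarily the paper's exact criterion): from the candidate set of tasks, train next on the task whose recent success rate is closest to an intermediate target, i.e. a task that is neither already mastered nor currently hopeless.

```python
def next_task(success_rates, target=0.5):
    """Pick the next training task from a candidate set.
    success_rates: {task_name: recent success rate in [0, 1]}.
    Returns the task whose success rate is closest to the target,
    on the intuition that tasks of intermediate difficulty yield
    the most learning progress."""
    return min(success_rates, key=lambda t: abs(success_rates[t] - target))
```

Re-evaluating the success rates as the agent improves makes the curriculum adapt automatically: mastered tasks fall away and harder ones become eligible.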


June 26, 2018

[IV18]Learning Vehicle Surrounding-aware Lane-changing Behavior from Observed Trajectories

Predicting lane-changing intentions has long been a very active area of research in the autonomous driving community. However, most of the literature has focused on individual vehicles and has not considered both the neighbor information and the accumulated effects of vehicle history trajectories when making predictions. We propose to apply a surrounding-aware LSTM algorithm for predicting the intention of a vehicle to perform a lane change, taking advantage of both vehicles’ past trajectories and their neighbors’ current states. We trained the model on real-world lane-changing data and were able to show in simulation that these two components lead not only to higher accuracy, but also to earlier lane-changing prediction, which plays an important role in potentially improving the autonomous vehicle’s overall performance. Citation: S. Su, K. Muelling, J. Dolan, P. Palanisamy and P. Mudalige, “Learning Vehicle Surrounding-aware Lane-changing Behavior from Observed Trajectories,” 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, 2018, pp. 1412-1417. doi: 10.1109/IVS.2018.8500445
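A sketch of how a surrounding-aware input sequence could be assembled before being fed to an LSTM: the ego vehicle's past trajectory is combined, step by step, with the current states of its neighbors. The feature layout below is invented for illustration; the paper's exact encoding may differ.

```python
def build_features(ego_history, neighbor_states):
    """ego_history: list of (x, y) past ego positions, oldest first.
    neighbor_states: list of (dx, dy, v) current relative neighbor states.
    Returns one flat feature vector per history step, with the neighbor
    context appended to every step so the sequence model sees both the
    trajectory and the surroundings at once."""
    context = [value for state in neighbor_states for value in state]
    return [[x, y] + context for (x, y) in ego_history]
```

Each row of the result is one timestep of the LSTM input sequence; the recurrent state is what accumulates the effect of the trajectory history that the abstract highlights.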


My Career


Microsoft
Principal AI Engineer | Jun. 2019 - Present
Building blocks for Autonomous Systems

General Motors R&D
Jan 2016
Deep RL for Autonomous Driving

Carnegie Mellon University
Aug 2014
Autonomous Navigation, Perception & Deep Learning