Praveen Palanisamy

Praveen Palanisamy is a Lead Principal AI Engineer and Manager at Microsoft, where he builds AI/ML-powered platform products for Autonomous Systems. His current focus is on enterprise-grade Generative AI platform capabilities, including a suite of custom applications that leverage the Microsoft Copilot stack, Azure OpenAI, and open-source LLMs. Recently, he has been building Copilot/Agent-based platform components, automating LLMOps pipelines, developing memory-augmented Agents, pretraining/finetuning custom LMMs/LLMs, and prototyping full-stack AI apps. He leads a cross-functional team in the Autonomous Systems, Business AI Incubation group that researches and engineers platform and system components to empower customers with AI capabilities, such as Project AirSim: an end-to-end (Perception + Scene Understanding + Prediction + Planning + Control) autonomy platform for autonomous aerial robots and systems built using Simulation, planet-scale Synthetics, and Deep Reinforcement Learning with Project Bonsai. Prior to that, he was an Autonomous Driving AI Researcher at General Motors R&D in Michigan, where he developed planning and decision-making algorithms and architectures using Deep Reinforcement Learning. He is the lead inventor on 70+ patents in the area of autonomous systems. He has authored two practical books, HOIAWOG and the TensorFlow 2.x RL Cookbook, for ML engineers, researchers, students, and enthusiasts. He has also worked at a few early-stage startups as a tech lead. He obtained his graduate degree from the Robotics Institute, Carnegie Mellon University, where he worked on Autonomous Navigation, Perception, and Artificial Intelligence as a Research and Teaching Assistant.

Key Technical Impacts & Metrics

83 Patents

22 Granted

50+ pending

500+ Patent Citations

50+ companies, including Google DeepMind, Tesla, Nvidia, and Apple, cite my patents in their own. See more impact highlights here

17+ Journal/Conference Papers

2 Books on hands-on Deep RL & AI: TensorFlow 2.x RL Cookbook, HOIAWOG

670+ citations


20+ reviews as TPC/Reviewer



7+ full-stack AI Agents & Apps

  • 8+ finetuned LLMs + LMMs

  • GPT4 for mission planning

  • Llama2 for code PRs

  • Semantic-SAM + Grounding DINO for AutoGen

  • LERF-based volumetric rendering for 3D Sims

  • LangChain/SemanticKernel + PromptFlow for Copilot Orchestration

  • Next.js (React) App for Conversational LLMs w/ RAG + Memory (Vector DB)

  • LLM serving using vLLM, Triton, Ray, SkyPilot

Team leader: 10+ direct reports & interns

Sr. AI Engineers, SWEs, UI/UX Devs, PhD interns

5+ direct Customer & Partner engagements

Open Source Projects


Cross-platform Memory- and Retrieval-Augmented Finder (File Explorer) app to chat with your data and find answers, powered by OpenAI-GPT/Llama2/Transformers.js models. You can load multiple DOCX, PDF, CSV, Markdown, HTML, or other files and ask questions about their content; the app uses embeddings and LLMs to generate answers from the most relevant files and sections within your files. It leverages a Vector DB for persistent Memory & Chats and supports asynchronous LLM streaming ops and Conversation Orchestration. See the project page for more details and how you can run or deploy it on your own machine, cloud, or hybrid environments.
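The retrieval step at the heart of the app can be sketched as follows. This is a minimal, framework-free illustration, not the app's actual code: the bag-of-words "embedding" is a toy stand-in for the real OpenAI-GPT/Llama2/Transformers.js embedding models, and all function names are made up for this sketch.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words term counts.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(v * b[t] for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=2):
    """Rank document chunks by similarity to the question -- the retrieval
    step that selects which file sections get fed to the LLM."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Invoices are due within 30 days of receipt.",
    "The hiking trail is closed during winter.",
    "Late invoice payments incur a 2 percent fee.",
]
context = retrieve("When are invoices due?", chunks)
# In the real app, `context` is packed into the LLM prompt to generate the answer.
```

The real pipeline replaces the toy embedding with model-generated dense vectors stored in a Vector DB, so retrieval survives across sessions and scales past what fits in memory.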

...   ...


Multi-Agent Connected Autonomous Driving (MACAD) Gym environments for Deep RL.

...   ...


This book contains easy-to-follow recipes for leveraging TensorFlow 2.x to develop artificial intelligence applications. Starting with an introduction to the fundamentals of deep reinforcement learning and TensorFlow 2.x, the book covers OpenAI Gym, model-based RL, model-free RL, and how to develop basic agents. You’ll discover how to implement advanced deep reinforcement learning algorithms such as actor-critic, deep deterministic policy gradients, deep-Q networks, proximal policy optimization, and deep recurrent Q-networks for training your RL agents. As you advance, you’ll explore the applications of reinforcement learning by building cryptocurrency trading agents, stock/share trading agents, and intelligent agents for automating task completion. Finally, you’ll find out how to deploy deep reinforcement learning agents to the cloud and build cross-platform apps using TensorFlow 2.x.
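To give a flavor of the "basic agents" the book starts from, here is a framework-agnostic tabular Q-learning sketch on a tiny corridor MDP. It is written in pure Python rather than TensorFlow 2.x so it runs anywhere; the environment and hyperparameters are made up for this sketch and are not one of the book's recipes.

```python
import random

random.seed(0)

N_STATES, ACTIONS = 4, (0, 1)        # corridor cells 0..3; 0 = left, 1 = right
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1    # discount, learning rate, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Toy deterministic dynamics: reward 1 for reaching the rightmost cell."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

for _ in range(200):                 # episodes
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection over the tabular Q-values.
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = step(s, a)
        # One-step Q-learning (off-policy TD) update.
        target = r + (0.0 if done else GAMMA * max(Q[(s2, x)] for x in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# Greedy policy for the non-terminal cells: always move right.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
```

The deep RL algorithms covered in the book (DQN, actor-critic, DDPG, PPO, DRQN) replace the Q-table with a neural network but keep this same interaction-and-update loop.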

...   ...

Multiple Object Tracking using LiDARs

Multiple objects detection, tracking and classification from LIDAR scans/point-clouds
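At the core of frame-to-frame tracking is data association between existing tracks and fresh detections (e.g., centroids of clustered LiDAR points). A minimal greedy nearest-neighbor sketch, purely illustrative; the project's actual pipeline, gating threshold, and association algorithm differ:

```python
import math

def associate(tracks, detections, max_dist=2.0):
    """Greedily match each track's last position to the closest unused
    detection within a gating distance; leftover detections spawn new tracks.

    tracks:     {track_id: (x, y)} last known positions
    detections: [(x, y), ...] centroids from the current LiDAR scan
    """
    assignments, used = {}, set()
    for tid, (tx, ty) in tracks.items():
        best, best_d = None, max_dist
        for i, (dx, dy) in enumerate(detections):
            d = math.hypot(dx - tx, dy - ty)
            if i not in used and d < best_d:
                best, best_d = i, d
        if best is not None:
            assignments[tid] = best
            used.add(best)
    new_tracks = [i for i in range(len(detections)) if i not in used]
    return assignments, new_tracks

tracks = {1: (0.0, 0.0), 2: (10.0, 0.0)}
detections = [(9.5, 0.2), (0.3, -0.1), (25.0, 5.0)]
assignments, new_tracks = associate(tracks, detections)
```

Real systems typically swap the greedy loop for globally optimal assignment (e.g., Hungarian algorithm) and predict each track forward with a motion model before associating.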

...   ...

Hands-on Intelligent Agent development using OpenAI Gym

HOIAWOG! Your guide to developing AI agents using deep reinforcement learning. Implement intelligent agents using PyTorch to solve classic AI problems, play console games like Atari, and perform tasks such as autonomous driving using the CARLA driving simulator.

...   ...

Timeline of Events

December 14, 2022

[Webinar] Transforming Infrastructure Inspection with Simulation and Autonomy

Spoke at the Commercial Unmanned Aerial Vehicle (CUAV) News webinar on Transforming Infrastructure Inspection with Simulation and Autonomy, joined by John McKenna, Co-Founder & CEO of sees.ai, and Timothy Reuter from Microsoft. We discussed how running high-fidelity simulations at scale can accelerate mission planning, software & AI/ML model development, and iteration cycles. Leveraging AI and the Autonomy Building blocks, including pre-trained models that can be fine-tuned to build custom autonomy modules, is a key enabler for accelerating the journey towards aerial autonomy. I also covered some of the key features and focus areas of the Microsoft Project AirSim platform, which enables the entire end-to-end pipeline for aerial autonomy, and went over two specific application scenarios: 1. Cell tower inspection and 2. Bridge inspection. Link to the Webinar page. Link to the recording of the webinar. A snapshot summary is available in the webinar handout slides. Post on LinkedIn.


November 6, 2020

[Talk] FIU Seminar on Multi-Agent Deep Reinforcement Learning for Connected Autonomous Driving

Delivered a seminar at the Florida International University School of Computing and Information Sciences (FIU SCIS). Abstract: The ability of vehicles, robots or agents to autonomously navigate in 2D, 3D and unconstrained spaces is desirable for several real-world applications. Autonomous driving on roads, a subset of the autonomous navigation space, has become one of the major focus areas in the automotive industry in recent times, in addition to electrification. It involves autonomous vehicles navigating safely and socially from their start location to their desired goal location in usually complex environments. The autonomous driving field has advanced to the point of feasible deployments in the real world, but these deployments are limited in several ways, including their domain of operation. The capability to learn and adapt to changes in the driving environment and in the intents of other road actors is crucial for autonomous driving systems to scale beyond the current, limited operational design domains. With the increasingly ubiquitous availability of 5G communication infrastructure, connectivity among vehicles provides a whole new avenue for connected autonomous driving. This talk is on using multi-agent deep reinforcement learning as a framework for formulating autonomous driving problems and developing solutions for these problems using simulation. The talk proposes the use of Partially Observable Markov Games for formulating connected autonomous driving problems with realistic assumptions. A taxonomy of multi-agent learning environments based on the nature of tasks, the nature of agents and the nature of the environment, which helps in categorizing the various autonomous driving problems that can be addressed under the proposed formulation, will be discussed.
In addition, MACAD-Gym, a multi-agent learning platform with an extensible set of Connected Autonomous Driving (CAD) simulation environments that enable the research and development of Deep RL based integrated sensing, perception, planning and control algorithms for CAD systems with unlimited operational design domain under realistic, multi-agent settings, will also be discussed. The talk concludes with remarks on autonomous navigation in 3D space, AirSim, Bonsai and an overview of Microsoft Autonomous Systems.
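For reference, a Partially Observable Markov Game (also called a partially observable stochastic game) is typically defined as the tuple below; the notation is the standard one and the talk's exact symbols may differ:

```latex
\mathcal{G} = \left\langle \mathcal{I},\; \mathcal{S},\; \{\mathcal{A}^i\}_{i \in \mathcal{I}},\; \{\mathcal{O}^i\}_{i \in \mathcal{I}},\; \mathcal{T},\; \{R^i\}_{i \in \mathcal{I}} \right\rangle
```

where \mathcal{I} is the set of agents, \mathcal{S} the state space, \mathcal{A}^i and \mathcal{O}^i agent i's action and observation spaces, \mathcal{T}(s' \mid s, a^1, \ldots, a^N) the joint transition function, and R^i(s, a^1, \ldots, a^N) agent i's reward. Crucially, each agent receives only its own observation o^i rather than the full state s, which is what makes driving with other (artificial or human) road actors partially observable and non-stationary from any single agent's perspective.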


December 8, 2019

[NeurIPS19] Multi-Agent Connected Autonomous Driving using Deep Reinforcement Learning

The capability to learn and adapt to changes in the driving environment is crucial for developing autonomous driving systems that are scalable beyond geo-fenced operational design domains. Deep Reinforcement Learning (RL) provides a promising and scalable framework for developing adaptive learning based solutions. Deep RL methods usually model the problem as a (Partially Observable) Markov Decision Process in which an agent acts in a stationary environment to learn an optimal behavior policy. However, driving involves complex interaction between multiple, intelligent (artificial or human) agents in a highly non-stationary environment. In this paper, we propose the use of Partially Observable Markov Games (POSG) for formulating the connected autonomous driving problems with realistic assumptions. We provide a taxonomy of multi-agent learning environments based on the nature of tasks, nature of agents and the nature of the environment to help in categorizing various autonomous driving problems that can be addressed under the proposed formulation. As our main contributions, we provide MACAD-Gym, a Multi-Agent Connected Autonomous Driving agent learning platform for furthering research in this direction. Our MACAD-Gym platform provides an extensible set of Connected Autonomous Driving (CAD) simulation environments that enable the research and development of Deep RL-based integrated sensing, perception, planning and control algorithms for CAD systems with unlimited operational design domain under realistic, multi-agent settings. We also share the MACAD-Agents that were trained successfully using the MACAD-Gym platform to learn control policies for multiple vehicle agents in a partially observable, stop-sign controlled, 3-way urban intersection environment with raw (camera) sensor observations. Paper: https://arxiv.org/abs/1911.04175 Code: https://github.com/praveen-palanisamy/macad-gym
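In multi-agent platforms of this kind, the single-agent Gym step loop generalizes to per-agent dictionaries of actions, observations, rewards, and done flags. The stub below sketches that interaction pattern; the stub environment, agent IDs, and the "__all__" done-key convention are illustrative assumptions for this sketch, not MACAD-Gym's exact API (see the repo for the real environment names and interfaces).

```python
import random

class StubMultiAgentEnv:
    """Minimal stand-in for a multi-agent driving env: dict-in, dict-out."""
    def __init__(self, agent_ids=("car1", "car2"), horizon=5):
        self.agent_ids, self.horizon, self.t = agent_ids, horizon, 0

    def reset(self):
        self.t = 0
        return {aid: [0.0] for aid in self.agent_ids}          # per-agent obs

    def step(self, actions):                                   # actions: dict
        self.t += 1
        obs = {aid: [float(self.t)] for aid in self.agent_ids}
        rewards = {aid: 1.0 if actions[aid] == 1 else 0.0 for aid in self.agent_ids}
        done = {aid: self.t >= self.horizon for aid in self.agent_ids}
        done["__all__"] = self.t >= self.horizon               # episode-level flag
        return obs, rewards, done, {}

env = StubMultiAgentEnv()
obs = env.reset()
totals = {aid: 0.0 for aid in env.agent_ids}
done = {"__all__": False}
while not done["__all__"]:
    # Each agent (here: a random policy) picks its own action every step.
    actions = {aid: random.choice([0, 1]) for aid in env.agent_ids}
    obs, rewards, done, info = env.step(actions)
    for aid, r in rewards.items():
        totals[aid] += r
```

This dict-keyed layout is what lets independently learning agents be trained side by side in the same simulated world.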


September 7, 2019

Received “Best Reinforcement Learning ebooks of all time” award!

I’m happy to announce that my book, HOIAWOG! “Hands-On Intelligent Agents with OpenAI Gym: Your guide to developing AI agents using deep reinforcement learning”, made it to the Best Reinforcement Learning eBooks of All Time list compiled by BookAuthority. BookAuthority collects and ranks the best books in the world, and it is a great honor to get this kind of recognition. Thank you to all the readers for your support! You can learn more about the HOIAWOG book here. The source code for all the agents, algorithms, and implementation details is available on GitHub. You can get a copy of the book from Amazon.


November 4, 2018

[ITSC18] POMDP and Hierarchical Options MDP with Continuous Actions for Autonomous Driving at Intersections

When applying autonomous driving technology to real-world scenarios, environmental uncertainties make the development of decision-making algorithms difficult. Modeling the problem as a Partially Observable Markov Decision Process (POMDP) [1] allows the algorithm to consider these uncertainties in the decision process, which makes it more robust to real sensor characteristics. However, solving the POMDP with reinforcement learning (RL) [2] often requires storing a large number of observations. Furthermore, for continuous action spaces, the system is computationally inefficient. This paper addresses these problems by modeling the problem as an MDP and learning a policy with RL using hierarchical options (HOMDP). The suggested algorithm stores state-action pairs and uses only current observations to solve a POMDP problem. We compare the results of the proposed approach to those of the time-to-collision method [3] and a POMDP-with-LSTM method. Our results show that the HOMDP approach is able to improve the performance of the agent for a four-way intersection task with two-way stop signs. The HOMDP method can generate both higher-level discrete options and lower-level continuous actions with only the observations of the current step. Citing: Z. Qiao, K. Muelling, J. M. Dolan, P. Palanisamy and P. Mudalige, “POMDP and Hierarchical Options MDP with Continuous Actions for Autonomous Driving at Intersections,” 2018 IEEE 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, 2018


June 26, 2018

[IV18] Automatic Curriculum Generation for RL in Autonomous Vehicles in Urban Environment

We address the problem of learning autonomous driving behaviors in urban intersections using deep reinforcement learning (DRL). DRL has become a popular choice for creating autonomous agents due to its success in various tasks. However, as the problems tackled become more complex, the number of training iterations necessary increases drastically. Curriculum learning has been shown to reduce the required training time and improve the performance of the agent, but creating an optimal curriculum often requires human handcrafting. In this work, we learn a policy for urban intersection crossing using DRL and introduce a method to automatically generate the curriculum for the training process from a candidate set of tasks. We compare the performance of the automatically generated curriculum (AGC) training to that of randomly generated sequences and show that AGC can significantly reduce the training time while achieving similar or better performance. Citing: Z. Qiao, K. Muelling, J. M. Dolan, P. Palanisamy and P. Mudalige, “Automatically Generated Curriculum based Reinforcement Learning for Autonomous Vehicles in Urban Environment,” 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, 2018, pp. 1233-1238. doi: 10.1109/IVS.2018.8500603
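One generic way to automate curriculum generation, sketched here purely for illustration and not the paper's algorithm, is to pick the next training task by estimated learning progress, i.e., the recent change in per-task reward; the task names and numbers below are made up.

```python
def next_task(reward_history, window=3):
    """Pick the task whose recent average reward improved the most
    (a crude learning-progress signal).

    reward_history: {task_name: [episode rewards, oldest first]}
    """
    def progress(rewards):
        if len(rewards) < 2 * window:
            return float("inf")          # prioritize under-sampled tasks
        recent = sum(rewards[-window:]) / window
        older = sum(rewards[-2 * window:-window]) / window
        return recent - older

    return max(reward_history, key=lambda t: progress(reward_history[t]))

history = {
    "empty_intersection": [0.1, 0.2, 0.5, 0.7, 0.8, 0.9],   # improving fast
    "dense_traffic":      [0.0, 0.0, 0.0, 0.0, 0.1, 0.0],   # barely moving
}
task = next_task(history)
```

The intuition matches curriculum learning in general: spend training time where the agent is currently learning fastest, and graduate to harder tasks as progress on easier ones plateaus.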


June 26, 2018

[IV18] Learning Vehicle Surrounding-aware Lane-changing Behavior from Observed Trajectories

Predicting lane-changing intentions has long been a very active area of research in the autonomous driving community. However, most of the literature has focused on individual vehicles and did not consider both the neighbor information and the accumulated effects of vehicle history trajectories when making the predictions. We propose to apply a surrounding-aware LSTM algorithm for predicting the intention of a vehicle to perform a lane change that takes advantage of both vehicles’ past trajectories and their neighbors’ current states. We trained the model on real-world lane-changing data and were able to show in simulation that these two components lead not only to higher accuracy, but also to earlier lane-changing prediction time, which plays an important role in potentially improving the autonomous vehicle’s overall performance. Citing: S. Su, K. Muelling, J. Dolan, P. Palanisamy and P. Mudalige, “Learning Vehicle Surrounding-aware Lane-changing Behavior from Observed Trajectories,” 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, 2018, pp. 1412-1417. doi: 10.1109/IVS.2018.8500445


My Career


Microsoft
Jun 2019 - Present
Principal AI Engineer
Building blocks for Autonomous Systems

General Motors R&D
Jan 2016
Deep RL for Autonomous Driving

Carnegie Mellon University
Aug 2014
Autonomous Navigation, Perception & Deep Learning