
Implementing an Intelligent & Autonomous Car-Driving Agent using Deep n-step Actor-Critic Algorithm

This is a condensed, quick-start version of Chapter 8, Implementing an Intelligent & Autonomous Car-Driving Agent using the Deep n-step Actor-Critic Algorithm, from the Hands-On Intelligent Agents with OpenAI Gym book. The chapter teaches you the fundamentals of Policy Gradient based reinforcement learning algorithms and helps you intuitively understand the deep n-step advantage actor-critic algorithm. It then walks you through implementing an intelligent agent that can drive a car autonomously in the Carla driving simulator, using both the synchronous and the asynchronous implementation of the deep n-step advantage actor-critic algorithm. Being a quick-start guide, this post first lists the concepts covered and then dives straight into the code structure, explaining how you can train deep n-step advantage actor-critic agents in the Carla driving environment. The implementation is in PyTorch, and all the necessary code, and even trained agent brains, are available in the book’s code repository. This post explains the code structure file by file with references to the Python scripts in the code repository so that it is easy to follow. The outline of this post is as follows:

  1. Brief chapter summary and outline of topics covered
  2. Code Structure
  3. Running the code

[Animated screen capture: HOIAWOG A3C Carla, 9 agents training asynchronously]

A sample screen capture showing 9 agents training asynchronously, launched using the async_a2c_agent.py script with the num_agents parameter in async_a2c_parameters.json set to 9. (Refer to the Async A2C Training section for the command used to launch the training.)

1. Brief chapter summary and outline of topics covered

This chapter teaches you the fundamentals of Policy Gradient based reinforcement learning algorithms and helps you intuitively understand the deep n-step advantage actor-critic algorithm. You will then learn to implement an intelligent agent that can drive a car autonomously in the Carla simulator, using both the synchronous and the asynchronous implementation of the deep n-step advantage actor-critic algorithm.

The following is a high-level outline of the topics covered in this chapter:

  • Deep n-step Advantage Actor-Critic algorithm
    • Policy Gradients
      • The likelihood ratio trick
      • The policy gradient theorem
    • Actor-Critic algorithms
    • Advantage Actor-Critic algorithm
    • n-step Advantage Actor-Critic algorithm
      • n-step returns
      • Implementing the n-step return calculation (a minimal sketch follows this outline)
  • Implementing deep n-step Advantage Actor-Critic algorithm
  • Training an intelligent and autonomous driving agent
    • Training the agent to drive a car in the CARLA driving simulator
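
As a concrete taste of the core computation, the following is a minimal sketch of the n-step return calculation in plain Python. The function and argument names are illustrative rather than the book's exact API; the chapter builds up the actual implementation step by step.

    def calculate_n_step_return(n_step_rewards, final_state_value, gamma):
        """Compute the n-step return for every step in a short trajectory:
        G_t = r_t + gamma*r_{t+1} + ... + gamma^(n-1)*r_{t+n-1} + gamma^n * V(s_{t+n}).
        Names are illustrative, not the book's exact code."""
        g = final_state_value  # bootstrap from the critic's estimate V(s_{t+n})
        returns = []
        for reward in reversed(n_step_rewards):
            g = reward + gamma * g
            returns.insert(0, g)  # prepend so each return lines up with its reward
        return returns

    # Example: a 3-step trajectory with gamma = 0.99 and a critic estimate of 1.5
    # for the state reached after the third step.
    print(calculate_n_step_return([0.1, 0.0, 1.0], final_state_value=1.5, gamma=0.99))

The n-step advantage that weights the policy gradient is then simply the difference between this return and the critic's value estimate for the corresponding state, A(s_t) = G_t - V(s_t).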

2. Code structure

  • a2c_agent.py –> Main script to launch the deep n-step Advantage Actor-Critic (A2C) agent
  • a2c_parameters.json –> Configuration parameters for the a2c_agent and the environment
  • async_a2c_agent.py –> Main script to launch the deep n-step Asynchronous Advantage Actor-Critic (A3C) agent
  • async_a2c_parameters.json –> Configuration parameters for the async_a2c_agent.py and the environment
  • batched_a2c_agent.py –> Example script showing how agents can be run in parallel with batches of environments
  • environment –> Module containing environment implementations, wrapper and interfaces
    • atari.py –> Wrappers and env pre-processing functions for the Atari Gym environment
    • carla_gym –> OpenAI Gym compatible Carla driving environment module (see Chapter 7 for implementation details)
      • envs –> the Carla Gym environment
        • carla –> Refer to Chapter 7 for implementation details
        • carla_env.py –> Carla driving environment implementation
        • __init__.py
        • scenarios.json –> Carla environment configuration parameters to change the driving scenarios: map/city, weather conditions, route etc.
      • __init__.py
    • __init__.py
    • utils.py –> Utilities to vectorize and run environment instances in parallel as separate processes
  • function_approximator –> Module with neural network implementations
    • deep.py –> Deep neural network implementations in PyTorch for policy and value function approximation (a minimal sketch of such a network follows this listing)
    • __init__.py
    • shallow.py –> Shallow neural network implementations in PyTorch for policy and value function approximations
  • logs –> Folder to contain the Tensorboard log files for each run (or experiment)
    • ENVIRONMENT_NAME_RUN_TIMESTAMP* –> Folder created for each run based on environment name and the run timestamp
      • agent_params.json –> The parameters used by the agent corresponding to this run/experiment
      • env_params.json –> The environment configuration parameters used in this run/experiment
      • events.out.tfevents.* –> Tensorboard event log files
  • README.md
  • trained_models –> Folder containing trained-models/”brains” for the agents
    • README.md –> Description of the trained agent “brains”/models with the naming conventions
  • utils –> Module containing utility functions to train/test the agent
    • params_manager.py –> A simple class to manage the agent’s and environment’s parameters
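
To give a rough idea of what lives in function_approximator/deep.py, here is a minimal PyTorch sketch of a network with a shared convolutional trunk feeding separate policy (actor) and value (critic) heads. The class name, layer sizes and input resolution are illustrative assumptions, not the book's exact implementation.

    import torch
    import torch.nn as nn

    class DeepActorCritic(nn.Module):
        """Illustrative actor-critic network: a shared CNN trunk, a policy head that
        outputs action logits and a value head that outputs a scalar V(s)."""
        def __init__(self, in_channels, num_actions):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            feature_dim = 64 * 9 * 9  # assumes 84x84 input images
            self.policy_head = nn.Linear(feature_dim, num_actions)  # actor
            self.value_head = nn.Linear(feature_dim, 1)             # critic

        def forward(self, obs):
            features = self.trunk(obs)
            return self.policy_head(features), self.value_head(features)

    # Usage sketch: a batch containing one 84x84 RGB observation
    net = DeepActorCritic(in_channels=3, num_actions=9)
    logits, value = net(torch.zeros(1, 3, 84, 84))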

3. Running the code

  • Deep n-step Advantage Actor-Critic Agent:

    The a2c_agent.py script is the main script that takes care of training and testing the deep n-step advantage actor-critic agent.
    The table below summarizes the arguments that the script supports and what they mean. Note that most of the agent- and
    environment-related configuration parameters are in the a2c_parameters.json file; only the few parameters that are most useful
    when launching the training/testing scripts are exposed through the command-line interface.

    Argument          Description
    --env             Name of the OpenAI Gym interface compatible environment. Use Carla-v0 if you want to train/test in the Carla driving environment; other Gym environments are supported as well.
    --params-file     Path to the JSON parameters file. Default: ./a2c_parameters.json
    --model-dir       Directory to save/load the trained agent brain/model. Default: ./trained_models
    --render          Whether to render the environment to the display (True/False).
    --test            Run the agent in test mode using a saved trained brain/model; learning is disabled.
    --gpu-id          GPU device ID to use. Default: 0
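
    The arguments in the table map naturally onto Python's argparse. The snippet below is a rough sketch of how such a command-line interface might be set up, with defaults taken from the table above; the actual parsing code in a2c_agent.py may differ.

      import argparse

      parser = argparse.ArgumentParser(description="Deep n-step Advantage Actor-Critic agent")
      parser.add_argument("--env", help="Name of the Gym-compatible environment to train/test in")
      parser.add_argument("--params-file", default="./a2c_parameters.json",
                          help="Path to the JSON parameters file")
      parser.add_argument("--model-dir", default="./trained_models",
                          help="Directory to save/load the trained agent brain/model")
      parser.add_argument("--render", action="store_true",
                          help="Render the environment to the display")
      parser.add_argument("--test", action="store_true",
                          help="Run in test mode with a saved brain/model; learning is disabled")
      parser.add_argument("--gpu-id", type=int, default=0, help="GPU device ID to use")
      args = parser.parse_args()
      # e.g. `python a2c_agent.py --env Carla-v0 --gpu-id 0` populates args.env and args.gpu_id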

    A2C Training

    Make sure the rl_gym_book conda environment with the necessary packages installed is activated. Assuming that you cloned
    the code as per the instructions to ~/HOIAWOG/, you can launch the Agent training script from the ~/HOIAWOG/ch8 directory using the following command:

    python a2c_agent.py --env Carla-v0 --gpu-id 0

    If a saved agent “brain” (trained model) is available for the chosen environment, the training script will load
    that brain into the agent and continue training so that the agent improves further.

    The log files are written to the directory specified by the summary_file_path_prefix parameter (the default is logs/A2C_). While the training script is running, you can visually monitor the learning progress of the agent using Tensorboard. From the ~/HOIAWOG/ch8 directory, you can launch Tensorboard with the following command: tensorboard --logdir=./logs/.
    You can then visit the web URL printed on the console (the default is http://localhost:6006) to monitor the progress.
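
    If you want to write additional scalars into the same Tensorboard log directory, a minimal sketch using PyTorch's SummaryWriter looks like the following. The log-directory name and the scalar tag are illustrative, and the book's scripts may use a different logging helper (for example tensorboardX).

      from torch.utils.tensorboard import SummaryWriter

      # Illustrative run directory following the logs/A2C_ prefix mentioned above
      writer = SummaryWriter(log_dir="logs/A2C_example_run")
      for episode in range(10):
          episode_reward = float(episode)  # placeholder; log your agent's episode reward here
          writer.add_scalar("episode_reward", episode_reward, global_step=episode)
      writer.close()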

    You can train the agent in any Gym-compatible environment by providing the Gym environment ID for the --env argument.
    Listed below is a short list of environments that you can train the agent in:

    Environment type         Example command to train
    Gym classic control      python a2c_agent.py --env Acrobot-v1 --gpu-id 0
    Gym Box2D                python a2c_agent.py --env LunarLander-v2 --gpu-id 0
    Gym Atari environment    python a2c_agent.py --env AlienNoFrameskip-v4 --gpu-id 0
    Roboschool               python a2c_agent.py --env RoboschoolHopper-v1 --gpu-id 1

    A2C Testing

    Make sure the rl_gym_book conda environment with the necessary packages installed is activated. Assuming that you cloned
    the code as per the instructions to ~/HOIAWOG/, you can launch the Agent testing script from the ~/HOIAWOG/ch8 directory using the following command:

    python a2c_agent.py --env Carla-v0 --test --render

    The above command launches the agent in testing mode by loading the saved brain state (if available) for this environment
    into the agent. The --test argument disables learning and simply evaluates the agent’s performance in the chosen environment.

    You can test the agent in any OpenAI Gym interface compatible learning environment, just as with the training procedure.
    Listed below are some example environments for which trained brains/models are made available in this repository:

    Environment type         Example command to test
    Gym classic control      python a2c_agent.py --env Pendulum-v0 --test --render
    Gym Box2D                python a2c_agent.py --env BipedalWalker-v2 --test --render
    Gym Atari environment    python a2c_agent.py --env RiverraidNoFrameskip-v4 --test --render --gpu-id 0
    Roboschool               python a2c_agent.py --env RoboschoolHopper-v1 --test --render --gpu-id 1
  • Asynchronous Deep n-step Advantage Actor-Critic Agent:

    The async_a2c_agent.py script is the main script that takes care of training and testing the asynchronous deep n-step advantage actor-critic agent.
    The table below summarizes the arguments that the script supports and what they mean. Note that most of the agent- and
    environment-related configuration parameters are in the async_a2c_parameters.json file; only the few parameters that are most useful
    when launching the training/testing scripts are exposed through the command-line interface.

    Argument          Description
    --env             Name of the Gym-compatible environment. Use Carla-v0 if you want to train/test in the Carla driving environment; other Gym environments are supported as well.
    --params-file     Path to the JSON parameters file. Default: ./async_a2c_parameters.json
    --model-dir       Directory to save/load the trained agent brain/model. Default: ./trained_models
    --render          Whether to render the environment to the display (True/False).
    --test            Run the agent in test mode using a saved trained brain/model; learning is disabled.
    --gpu-id          GPU device ID to use. Default: 0

    Async A2C Training

    NOTE: Because this training script spawns multiple agent and environment instances, make sure
    you set the num_agents parameter in the async_a2c_parameters.json file to a sensible value based
    on the hardware of the machine on which you are running this script. If you are using the Carla-v0 environment to
    train the agent in the Carla driving environment, be aware that the Carla server instance itself needs some GPU resources to run,
    on top of the agent’s resource needs.
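
    Under the hood, the asynchronous variant spawns num_agents learner processes, each interacting with its own environment instance. The following is a minimal sketch of that pattern using torch.multiprocessing; the worker function and its contents are illustrative stand-ins, not the actual code from async_a2c_agent.py.

      import torch.multiprocessing as mp

      def worker(rank, num_steps):
          """Stand-in for one asynchronous learner: in the real script each worker
          creates its own environment and updates a shared global model."""
          for step in range(num_steps):
              pass  # interact with the environment and apply gradient updates here
          print(f"worker {rank} finished")

      if __name__ == "__main__":
          num_agents = 9  # mirrors the num_agents parameter in async_a2c_parameters.json
          processes = [mp.Process(target=worker, args=(rank, 100)) for rank in range(num_agents)]
          for p in processes:
              p.start()
          for p in processes:
              p.join()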

    Make sure the rl_gym_book conda environment with the necessary packages installed is activated. Assuming that you cloned
    the code as per the instructions to ~/HOIAWOG/, you can launch the Agent training script from the ~/HOIAWOG/ch8 directory using the following command:

    python async_a2c_agent.py --env Carla-v0 --gpu-id 0

    The screen-capture animation (GIF) at the top of this page was captured by launching the above command with num_agents in async_a2c_parameters.json set to 9.

    If a saved agent “brain” (trained model) is available for the chosen environment, the training script will load
    that brain into the agent and continue training so that the agent improves further.

    The log files are written to the directory specified by the summary_file_path_prefix parameter (the default is logs/A2C_). While the training script is running, you can visually monitor the learning progress of the agent using Tensorboard. From the ~/HOIAWOG/ch8 directory, you can launch Tensorboard with the following command: tensorboard --logdir=./logs/.
    You can then visit the web URL printed on the console (the default is http://localhost:6006) to monitor the progress.

    You can train the agent in any Gym-compatible environment by providing the Gym environment ID for the --env argument.
    Listed below is a short list of environments that you can train the agent in:

    Environment type         Example command to train
    Gym classic control      python async_a2c_agent.py --env Acrobot-v1 --gpu-id 0
    Gym Box2D                python async_a2c_agent.py --env LunarLander-v2 --gpu-id 0
    Gym Atari environment    python async_a2c_agent.py --env SeaquestNoFrameskip-v4 --gpu-id 0
    Roboschool               python async_a2c_agent.py --env RoboschoolHopper-v1 --gpu-id 1

    Async A2C Testing

    Make sure the rl_gym_book conda environment with the necessary packages installed is activated. Assuming that you cloned
    the code as per the instructions to ~/HOIAWOG/, you can launch the Agent testing script from the ~/HOIAWOG/ch8 directory using the following command:

    python async_a2c_agent.py --env Carla-v0 --test

    The above command launches the agent in testing mode by loading the saved brain state (if available) for this environment
    into the agent. The --test argument disables learning and simply evaluates the agent’s performance in the chosen environment.

    You can test the agent in any OpenAI Gym interface compatible learning environment, just as with the training procedure.
    Listed below are some example environments for which trained brains/models are made available in this repository:

    Environment type         Example command to test
    Gym classic control      python async_a2c_agent.py --env Pendulum-v0 --test --render
    Gym Box2D                python async_a2c_agent.py --env LunarLander-v2 --test --render
    Gym Atari environment    python async_a2c_agent.py --env RiverraidNoFrameskip-v4 --test --render --gpu-id 0
    Roboschool               python async_a2c_agent.py --env RoboschoolHopper-v1 --test --render --gpu-id 1

    That concludes the post! If you have any questions or need clarification or help with a step, feel free to reach out using the comments section below.
