airsim-rl's Introduction

AirSim-RL

This project provides Deep Reinforcement Learning (DRL) experiments on autonomous vehicles (car and UAV drone) in AirSim. The following libraries are used to facilitate training models:

  • OpenAI Gym
  • Keras-RL
  • Stable Baselines

DRL algorithms:

  • DQN
  • A2C
  • PPO2

Demo

Demo Video

  • This demo uses a Deep Q-Network (DQN). The reward is calculated from the distance between the car and the center of the track. The episode stops when the car is close to going off track.
  • The training input consists of RGB image frames captured from the front camera of the car.
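The reward and stopping rule described above can be sketched as follows. This is only an illustration, not the project's actual reward function: the linear shape, the function name `center_distance_reward`, and the `off_track_margin` parameter are assumptions. The default `track_width` of 12 is taken from Config.ini.

```python
def center_distance_reward(distance_to_center, track_width=12.0, off_track_margin=0.9):
    """Return (reward, done) for a car at `distance_to_center` from the track center.

    Hypothetical sketch: `distance_to_center` uses the same units as `track_width`.
    """
    half_width = track_width / 2.0
    # Normalized offset: 0.0 on the center line, 1.0 at the edge of the track.
    offset = abs(distance_to_center) / half_width
    # End the episode shortly before the car would actually leave the track.
    done = offset >= off_track_margin
    # Linear reward: 1.0 on the center line, falling to 0.0 at the edge.
    reward = max(0.0, 1.0 - offset)
    return reward, done
```

With these assumed defaults, a car on the center line gets the full reward, and the episode ends once the car is within 10% of the half-width from the edge.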

How do I set up my environment

  • Build AirSim and Unreal Engine 4 (UE4) on Windows (https://github.com/microsoft/AirSim/blob/master/docs/build_windows.md)
  • Create a new UE4 project with the RaceCourse map
  • Create a conda environment with the main packages: python 3.7, tensorflow 1.14, keras 2.2.5, keras-rl 0.4.2, gym 0.17.2 (the versions are given for reference only)
  • Place some waypoints along the center of the road and name them "WayPoints[number]" (see the following picture). This list of waypoints is used to locate the road approximately in the map. The car's position and the waypoints are used to calculate the reward during training. As illustrated below, UE4 'Empty Actors' are used as waypoints. These waypoints are invisible when the game is running. (picture: waypoints)
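One way to turn the car's position and the ordered waypoints into a distance from the track center is to treat consecutive waypoints as line segments and take the shortest distance to any of them. The sketch below assumes 2D `(x, y)` positions and an open (non-looping) waypoint chain; the function name `distance_to_track_center` is hypothetical and the project may compute this differently.

```python
import math


def distance_to_track_center(car_xy, waypoints_xy):
    """Shortest distance from the car to the polyline through the ordered waypoints."""
    cx, cy = car_xy
    best = math.inf  # stays inf if fewer than two waypoints are given
    for (x1, y1), (x2, y2) in zip(waypoints_xy, waypoints_xy[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0.0:
            t = 0.0  # duplicate waypoints: fall back to the point itself
        else:
            # Project the car onto the segment and clamp to its end points.
            t = ((cx - x1) * dx + (cy - y1) * dy) / seg_len_sq
            t = max(0.0, min(1.0, t))
        px, py = x1 + t * dx, y1 + t * dy
        best = min(best, math.hypot(cx - px, cy - py))
    return best
```

For a closed race course, the first waypoint could be appended to the end of the list so that the last segment closes the loop.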

How to train

  • First, look at the parameters in the Config.ini file to understand the settings, such as the input image resolution and the action space of the agent.
# settings related to UE4/airsim 
[airsim_settings] 
image_height = 144
image_width = 256
image_channels = 3
waypoint_regex = WayPoint.*
track_width = 12 

# settings related to training car agent
[car_agent]
# which controls are adjusted when driving the car
# 0: change steering only, 1 (not supported yet): change throttle and steering,
# 2 (not supported yet): change throttle, steering, and brake
action_mode = 0 
# steering value from left to right in range [-1, 1] 
# e.g: 0.3 means steering is from -0.3 (left) to 0.3 (right)
steering_max = 0.3
# the granularity of steering range as we use discrete values for steering
# e.g: 7 will produce discrete steering(with max=0.3) actions as: -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3
steering_granularity = 7 
# car's throttle; fixed for now, a range of values will be added later
fixed_throttle = 0.75 
# total actions of car agent, update this accordingly when changing the above settings
actions = 7 
  • Then run the Jupyter notebook (or train.py).
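To see how `steering_max`, `steering_granularity`, and `actions` relate, the sketch below parses a copy of the `[car_agent]` values with Python's standard `configparser` and builds the discrete steering values. The helper `steering_actions` and the inline `CONFIG_TEXT` are illustrative assumptions; the project's own loading code may differ.

```python
import configparser

# Copy of the [car_agent] values from Config.ini, inlined to keep the sketch self-contained.
CONFIG_TEXT = """
[car_agent]
action_mode = 0
steering_max = 0.3
steering_granularity = 7
fixed_throttle = 0.75
actions = 7
"""


def steering_actions(steering_max, granularity):
    """Evenly spaced steering values from -steering_max to +steering_max."""
    if granularity < 2:
        return [0.0]
    return [
        round((2.0 * i / (granularity - 1) - 1.0) * steering_max, 6)
        for i in range(granularity)
    ]


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
agent_cfg = parser["car_agent"]

actions = steering_actions(
    agent_cfg.getfloat("steering_max"),
    agent_cfg.getint("steering_granularity"),
)

# In steering-only mode, `actions` should match the number of steering values.
assert len(actions) == agent_cfg.getint("actions")
print(actions)
```

A check like the final assertion can catch the case where `steering_granularity` is changed but `actions` is not updated.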

TODO

  • Expand action spaces, try more reward functions
  • Add OpenAI Gym env for UAV drone

airsim-rl's People

Contributors

hoangtranngoc

airsim-rl's Issues

Getting RPC error, possible client and simulator version mismatch?

The image the agent gets from the AirSim simulator seems to have size 0, so the simGetImages function gets called. But even though only 2 arguments are provided (self and the request object), the error says 3 arguments were provided. Which exact version of Unreal was used for training? The line I am referencing:

response = super().simGetImages([ImageRequest(0, ImageType.Scene, False, False)])[0]

`WARNING:tensorflow:From C:\Users\VPBen\anaconda3\envs\AirSim-RL\lib\site-packages\rl\util.py:79: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From C:\Users\VPBen\anaconda3\envs\AirSim-RL\lib\site-packages\rl\util.py:79: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Training for 2000000 steps ...
Traceback (most recent call last):
File "C:/Users/VPBen/Desktop/RL_experiments/AirSim-RL/train.py", line 97, in
callbacks=callbacks)
File "C:\Users\VPBen\anaconda3\envs\AirSim-RL\lib\site-packages\rl\core.py", line 132, in fit
observation = deepcopy(env.reset())
File "C:\Users\VPBen\Desktop\RL_experiments\AirSim-RL\gym_airsim\airsim_car_env.py", line 63, in reset
observation = self.car_agent.observe()
File "C:\Users\VPBen\Desktop\RL_experiments\AirSim-RL\gym_airsim\car_agent.py", line 43, in observe
response = super().simGetImages([ImageRequest(0, ImageType.Scene, False, False)])[0]
File "C:\Users\VPBen\anaconda3\envs\AirSim-RL\lib\site-packages\airsim\client.py", line 103, in simGetImages
responses_raw = self.client.call('simGetImages', requests, vehicle_name)
File "C:\Users\VPBen\anaconda3\envs\AirSim-RL\lib\site-packages\msgpackrpc\session.py", line 41, in call
return self.send_request(method, args).get()
File "C:\Users\VPBen\anaconda3\envs\AirSim-RL\lib\site-packages\msgpackrpc\future.py", line 45, in get
raise error.RPCError(self._error)
msgpackrpc.error.RPCError: rpclib: client error C0002: Function 'simGetImages' was called with an invalid number of arguments. Expected: 2, got: 3

Process finished with exit code 1
`

WayPoint sorting is wrong

You are sorting the waypoints as strings.
For example, if I want 15 waypoints and name them WayPoint0, ..., WayPoint14, they will be sorted as
WayPoint0, WayPoint1, WayPoint10, WayPoint11, ..., WayPoint9

This causes large errors in the reward computation, because the distance to the destination is calculated incorrectly.

I solved it by importing natsort and changing the line to

    natsort.natsorted(wp_names, reverse=False)

instead of

    wp_names.sort()
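The difference between the two orderings can be reproduced with the standard library alone. The sketch below uses a hand-written key function, `natural_key` (a hypothetical helper, not part of the project), in place of natsort. Note that `natsort.natsorted` returns a new list rather than sorting in place, so its result has to be assigned back to `wp_names`.

```python
import re


def natural_key(name):
    """Split a name into text and digit runs so digit runs compare as integers."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]


wp_names = [f"WayPoint{i}" for i in range(15)]

# Plain string sort puts WayPoint10..WayPoint14 before WayPoint2.
lexicographic = sorted(wp_names)

# Natural sort keeps the numeric order WayPoint0, WayPoint1, ..., WayPoint14.
natural = sorted(wp_names, key=natural_key)
```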
