
Technology: Python, OpenCV, Keras, Tensorflow
Tags: CNN, Computer Vision, Autonomous Cars, Deep Learning
Image Credits: Raphaël Biscaldi on Unsplash

Teaching A Simulator How To Drive

One centerpiece in the quest for autonomous cars is behavioral cloning. By turning human driving into machine-readable information, it is possible to train a computer to steer a car in the correct direction. While many different approaches have been tried to solve this task, a very promising avenue developed over the last decade is deep learning. Deep neural networks have shown great capability when working with image data. In this project, a convolutional neural network (CNN) is created that is capable of driving a vehicle in a computer simulation without human intervention. The final model scores a Mean Squared Error of 0.0176.

A video of the final model running on auto-pilot can be seen below.


This project leverages image and steering data collected from a car simulator to train a convolutional neural network that learns to predict the correct steering angle and is ultimately able to drive the car itself. The image below shows a sample from the dataset. Code, weights, and further explanation can be found on GitHub.


The training set was collected using a software simulator. Before recording, a few laps were driven to get a feeling for the track and the controls. 'Normal' or good driving behavior was first recorded using both mouse and keyboard input. After the first set of recordings, an initial network structure was implemented to get a baseline for how well the imagery works with a simple network. While analyzing the first recordings, it became obvious that most images showed uncritical situations where the car was going straight. The recordings also seemed to introduce a one-sided bias, since most of the curves on the training track lean in one direction. To mitigate those effects, the following steps were taken:

  1. Recording driving behavior in curves rather than on straight patches of the track.
  2. Capturing critical moments, such as leaving the road and recovering from both sides, so that the model learns more difficult situations.
  3. Adding reverse laps to the data set.
  4. Leveraging all camera angles with a steering correction value of 0.2 for the measurements.
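
The last step can be illustrated with a minimal sketch. Only the correction value of 0.2 comes from this write-up; the function name, the camera labels, and the sign convention (left-camera images get a larger angle so the car steers back toward the center, right-camera images a smaller one) are assumptions made for illustration and may differ from the code on GitHub.

```python
# Correction value stated in the text.
STEERING_CORRECTION = 0.2


def expand_with_side_cameras(center_angle, correction=STEERING_CORRECTION):
    """Return one (camera, steering angle) pair per camera for a single frame.

    Hypothetical helper: the sign convention is an assumption. The left camera
    sees the road as if the car were too far left, so its label is shifted
    toward the right (+correction), and vice versa for the right camera.
    """
    return [
        ("center", center_angle),
        ("left", center_angle + correction),
        ("right", center_angle - correction),
    ]


# One recorded frame with a straight-ahead measurement yields three samples.
samples = expand_with_side_cameras(0.0)
```

With this approach every recorded frame contributes three training samples instead of one, and two of them carry a non-zero steering angle even when the car was driving straight.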

Model Architecture

The final model draws inspiration from the NVIDIA paper End-to-End Deep Learning for Self-Driving Cars. It contains four convolutional layers with filter sizes of 5x5 and 3x3 and depths increasing from 24 to 64. The information is then funneled through four dense layers of decreasing size. While the original architecture works with fewer and smaller dense layers, the additional layers seem to have a smoothing effect on the predicted values: after adding them, the car steers less 'choppily'. A cropping layer removes some of the unnecessary information by reducing the image size to 90x320, and normalization is then applied to help the model converge more quickly. An overview of the architecture can be seen below.
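
To get a feeling for how the four convolutional layers shrink the 90x320 input, the shape arithmetic can be sketched as follows. The kernel sizes and the first and last depths (24 and 64) come from the description above. The strides, the intermediate depths of 36 and 48, the split of two 5x5 layers followed by two 3x3 layers, and the use of unpadded ('valid') convolutions applied directly to the cropped image are all assumptions borrowed from the NVIDIA design, not confirmed details of this model.

```python
def conv_output_size(size, kernel, stride):
    """Output length along one axis for an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


# (kernel, stride, depth) per layer. Only the kernel sizes and the depths
# 24 and 64 are from the text; everything else is an assumption.
ASSUMED_LAYERS = [(5, 2, 24), (5, 2, 36), (3, 1, 48), (3, 1, 64)]


def trace_shapes(height=90, width=320, layers=ASSUMED_LAYERS):
    """Return the (height, width, depth) of the feature map after each layer."""
    shapes = []
    for kernel, stride, depth in layers:
        height = conv_output_size(height, kernel, stride)
        width = conv_output_size(width, kernel, stride)
        shapes.append((height, width, depth))
    return shapes


shapes = trace_shapes()
final_height, final_width, final_depth = shapes[-1]
# Number of values handed to the first dense layer after flattening.
flattened = final_height * final_width * final_depth
```

Under these assumptions the feature map ends up at 16x73x64, which flattens to 74,752 values before the dense layers. Different strides or padding would change these numbers considerably.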


To evaluate the final model, two measures were used. First, a test set containing a new lap was recorded, on which the final model scored a Mean Squared Error of 0.0176. Second, the simulator was connected and steering angles were predicted from new image data, turning the simulator into an autonomous vehicle. After a few epochs of training, the model is able to drive the car by itself. It learned to stay centered in the lane, steer away from hazardous terrain, and master challenging curves at a simulated pace of 18 mph. In five consecutive laps the car never left the lane (which actually outperforms my own driving with the mouse).
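
For reference, the Mean Squared Error reported above is the average of the squared differences between recorded and predicted steering angles. The following is a minimal pure-Python sketch of that metric; the function name and the sample values are made up for illustration and are not taken from the project's test set.

```python
def mean_squared_error(recorded, predicted):
    """Average squared difference between recorded and predicted angles."""
    if len(recorded) != len(predicted) or not recorded:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(recorded, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only: one perfect prediction and one that is off by 0.5.
example_mse = mean_squared_error([0.0, 0.5], [0.0, 0.0])
```

Taking the square root puts the score back into steering-angle units: an MSE of 0.0176 corresponds to a root-mean-square error of roughly 0.13.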