Artificial Intelligence (AI) and Machine Learning (ML) are big news and certain to have a disruptive effect on organisations. Over the past few years, the likes of recommendation engines and voice assistants have become part of my everyday life, and it's a topic I want to get closer to. The challenge was that it traditionally seemed to require a PhD and deep expertise in algorithms. However, AWS have started bringing AI and ML into the hands of developers, with the introduction of hardware devices like
AWS DeepLens and
AWS DeepComposer. The one that stood out to me was
AWS DeepRacer. The opportunity recently came up to run a DeepRacer day at work. We were delighted to link up with Lyndon Leggate from deep to run the event. The main goal was to get a group of engineers interested in machine learning. So how did it go?
Introduction to Machine Learning
The day started out with some scene setting. We went through the three basic machine learning paradigms:
Supervised Learning - Supervised learning is often associated with tasks such as determining the type of animal in a given image. It involves labelling large data sets, which are used to train the algorithm, and over time the accuracy of the algorithm improves. A familiar example is the spam folder in Gmail: users can report a message as spam (or as not spam), and that feedback is fed back into the algorithm, improving its ability to determine whether future messages are spam.
Unsupervised Learning - Unsupervised learning is the opposite, where no labelling of data takes place. Instead, an algorithm is fed large sets of data and looks for natural groupings and clusters. A classic example is the recommendation engines used by the likes of Netflix and Spotify. They take data sets around customer viewing or listening habits and start to identify patterns and groupings, which is why they can state that customers who liked one artist often like another.
Reinforcement Learning - Reinforcement learning is focused on autonomous decision making by an agent to achieve a specified goal with no labelled input. This is carried out by specifying rewards, which are given based on a result of an action. In many respects, this is how we grow up as humans, as we discover the types of actions that result in pain, and those that end up with a reward. AWS DeepRacer fits into this category.
In reinforcement learning, an agent is set an objective to achieve a goal, and it interacts with an environment to maximise its total reward. The agent takes an action and reaches a new state, which has a reward (whether positive or negative) associated with it.
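The loop described above can be sketched in a few lines of Python. Note that the environment and policy here are illustrative stand-ins I've made up for the sketch, not the actual DeepRacer simulator or training service:

```python
class ToyTrack:
    """Illustrative stand-in environment: the agent's goal is to
    reach position 5 on a one-dimensional 'track'."""

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # The agent's action (+1 forward, -1 back) moves it to a new
        # state, which has a reward (positive or negative) attached.
        self.pos += action
        done = self.pos == 5
        reward = 1.0 if done else -0.1
        return self.pos, reward, done


def run_episode(env, policy, max_steps=100):
    """One episode of the agent-environment interaction loop:
    the agent acts, reaches a new state, and collects rewards,
    aiming to maximise its total reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                    # agent chooses an action
        state, reward, done = env.step(action)    # environment responds
        total_reward += reward
        if done:
            break
    return total_reward
```

A policy that always drives forward (`lambda s: 1`) collects a few small penalties and then the goal reward; training is the process of discovering such a policy from trial and error rather than hard-coding it.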
In the case of DeepRacer, the agent is the physical or virtual AWS DeepRacer vehicle. The objective is to complete a lap of the track, with the goal of doing so in the fastest time possible. The agent interacts with the environment, which is the track itself, including any other traffic or obstacles. At any point, the vehicle can take an action: moving in a particular direction (steering angle) at a specific speed. This action results in a reward, which is configured in a reward function.
The reward function describes the immediate feedback (as a reward or penalty score) when the DeepRacer vehicle moves from one position on the track to a new position. As an example, you can give higher rewards if the vehicle is close to the centre line, if it has all four wheels on the track, and if it is going above a certain speed. Learning is a process of trial and error, with the vehicle initially taking random actions; over time, the vehicle learns which actions lead to the highest rewards, though it is possible to overtrain a model, or to introduce unintended consequences by incentivising certain behaviours.
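To make that concrete, here is a sketch of a reward function combining the three incentives just mentioned. It uses the `reward_function(params)` convention and input-parameter names from the DeepRacer Developer Guide (`track_width`, `distance_from_center`, `all_wheels_on_track`, `speed`); the specific thresholds and reward values are arbitrary choices for illustration, not recommended settings:

```python
def reward_function(params):
    """Sketch of a DeepRacer reward function: favour staying near the
    centre line, keeping all four wheels on track, and maintaining
    speed. Thresholds below are illustrative, not tuned values."""
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    all_wheels_on_track = params['all_wheels_on_track']
    speed = params['speed']

    # Near-zero reward if any wheel leaves the track.
    if not all_wheels_on_track:
        return 1e-3

    # Tiered reward based on distance from the centre line.
    if distance_from_center <= 0.1 * track_width:
        reward = 1.0
    elif distance_from_center <= 0.25 * track_width:
        reward = 0.5
    else:
        reward = 0.1

    # Small bonus for going above a threshold speed (in m/s).
    if speed > 2.0:
        reward += 0.5

    return float(reward)
```

This is also where unintended consequences can creep in: for instance, rewarding raw speed too heavily can teach the model to fly off track at corners.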
You can find far more information in the Developer Guide.
After racing in a private community race, it was then time to take to the physical track. It was a really simple process of downloading the chosen model and then uploading it to the AWS DeepRacer vehicle.
When racing, you get to control the throttle percentage, and watch your vehicle power around the track (or in many cases off track and into the barrier).
One of the things I noted was that the quickest model we raced on the physical track was not the quickest in the virtual race. This is a common observation, known as the simulated-to-real (sim2real) performance gap.
Without doubt, this was one of the most fun work experiences I’ve had. The concept of AWS DeepRacer is ingenious. AWS provide a set of default templates, which means that with a little bit of explanation, anyone can get up and running. It then gives lots of opportunity to optimise performance, either through writing custom Python code in a reward function, or by tuning the hyperparameters. Where it excels is in the gamification. The competitive spirit kicked in: the private virtual league brought out the banter and the bragging, and the physical track race saw everyone cheering on the laps. It broke down any barriers that may have existed, and even though the main goal was to learn some concepts of machine learning, it excelled as a team building event. There is now a huge appetite to take this further and learn even more advanced topics.
For anyone interested, there is an active Slack Channel, so sign up, and hopefully see you on a race track soon!