What AV Safety Isn't - Part 2
This is Part 2 in a series discussing what AV safety isn’t. To see the other parts in the series and our other posts, check out our blog page.
At Retrospect, we have worked to advance AV safety research, inform industry practices, build the technology, and evangelize, all in the name of safe autonomous vehicles.
There are certain ways to present AV safety so that the typical person can form a logical position on whether or not AVs are safe enough. The average person’s understanding is critical because soon we will all have AVs, either in development or in deployment, near enough to us that a safety opinion will be formed one way or another. Those opinions will be based on the interactions and perceptions we have. We all know the importance of making a good first impression, and establishing the right context for AV safety first is going to be key to the public forming a good first impression.
But rather than talk about what AV safety *is,* we will start by very simply explaining what AV safety *isn’t.* Understanding what AV safety isn’t is important so that those of us who are not AV developers can distinguish true evidence of safety from merely clever advertising.
There’s so much to unpack in AV safety that we are going to split the topic into multiple posts - each addressing a handful of key topics. Be sure to follow us on LinkedIn for our latest post.
Under-developed L4 Autonomy Labeled “L2” Autonomy
For the uninitiated, the SAE J3016 standard defines different "levels of driving automation" to help identify the differences between automated feature sets in vehicles. It has been the cornerstone and most widely used standard for categorizing automation. More importantly, it says that at Levels 0, 1, and 2 the human driver "must constantly supervise these support features," and that at these levels "You are driving."
For Levels 3, 4, and 5, it states: "You are not driving when these automated driving features are engaged."
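For a quick mental model, here is a minimal sketch of those categories as a lookup table. The level names follow J3016, but the other fields are simplified paraphrases for illustration, not the standard's exact wording.

```python
# Simplified sketch of the SAE J3016 levels of driving automation.
# Level names match J3016; the "you_are_driving" field is a paraphrase
# of the consumer-facing summary, not the standard's full definition.
SAE_LEVELS = {
    0: {"name": "No Driving Automation",          "you_are_driving": True},
    1: {"name": "Driver Assistance",              "you_are_driving": True},
    2: {"name": "Partial Driving Automation",     "you_are_driving": True},
    3: {"name": "Conditional Driving Automation", "you_are_driving": False},  # but must take over on request
    4: {"name": "High Driving Automation",        "you_are_driving": False},
    5: {"name": "Full Driving Automation",        "you_are_driving": False},
}

def must_constantly_supervise(level: int) -> bool:
    """At Levels 0-2 the human driver must constantly supervise the feature."""
    return SAE_LEVELS[level]["you_are_driving"]

print(must_constantly_supervise(2))  # True  -- "You are driving."
print(must_constantly_supervise(4))  # False -- "You are not driving when engaged."
```

The whole argument in this section hinges on which side of that Level 2 / Level 3 boundary a feature is marketed on.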
Over the last several years, the trend in cars you or I can buy has been more and more capability in their automated driving features. What was once just cruise control with distance control (keeping you far enough behind the car ahead) was augmented with lane keeping assistance (nudges on your steering wheel to keep you in your lane), which then became lane centering with distance control, and is now highway driving automation with automatic lane changing. But all of these features have been labeled Level 2 or below. Yes - even Tesla's Autopilot and Full Self-Driving features are described as requiring the driver to constantly supervise them -- essentially openly stating they are not Level 3, 4, or 5 automation.
Level 4 and 5 automation are what non-retail automated vehicles (ones you or I cannot buy) are targeting: the Waymo/Cruise/Argo/Aurora/TuSimple/etc. robotaxis and autonomous semi trucks. Those have yet to achieve their targets, but that's another conversation. The point is: no retail vehicle exists that describes itself *officially* as Level 4 or Level 5 automation.
However, the capabilities of retail automated vehicles have increased tremendously and show no signs of stopping. Yet their manufacturers will continue to label these features as Level 1 and 2 despite seriously encroaching, when working properly, on the functionality of Level 4 automation.
I’ll put it as clearly and bluntly as possible:
Companies have an incentive to tout their automation features because it makes them look more advanced than competitors. But they have no incentive to label these products as Level 4 or 5, because that would put the liability for any failures squarely on themselves.
So companies have to ride the fence: touting ever-increasing capabilities and features (always advertised as boosting your safety, of course) while keeping you, the driver, on the hook as liable for the vehicle's actions if anything goes wrong.
This isn't safety. Studies have documented how quickly human drivers become over-confident in vehicle automation features. Humans quickly fade from their roles as supervisors for driving automation features. Shoving more and more automation features into vehicles and calling them Level 2 is dangerous.
The best way to stay ahead of this is to have an equally advanced Driver Monitoring System that is good enough to make sure the driver is not distracted from their job as supervisor for these features.
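To make that concrete, here is a toy sketch of the kind of escalation logic a driver monitoring system might implement. Everything here - the thresholds, the single "eyes on road" signal, and the response names - is invented for illustration; real systems use far richer signals (gaze, head pose, hands on wheel) and carefully tuned thresholds.

```python
# Toy driver-monitoring escalation logic; all values are illustrative.
WARN_AFTER_S = 2.0
ALARM_AFTER_S = 5.0
DISENGAGE_AFTER_S = 10.0

class DriverMonitor:
    def __init__(self):
        self.eyes_off_road_since = None  # timestamp, or None if attentive

    def update(self, eyes_on_road: bool, now: float) -> str:
        """Return the response the vehicle should take at this instant."""
        if eyes_on_road:
            self.eyes_off_road_since = None
            return "ok"
        if self.eyes_off_road_since is None:
            self.eyes_off_road_since = now
        elapsed = now - self.eyes_off_road_since
        if elapsed >= DISENGAGE_AFTER_S:
            return "hand_back_control_safely"  # e.g., slow down, pull over, disengage
        if elapsed >= ALARM_AFTER_S:
            return "audible_alarm"
        if elapsed >= WARN_AFTER_S:
            return "visual_warning"
        return "ok"

monitor = DriverMonitor()
print(monitor.update(eyes_on_road=False, now=0.0))  # ok (just glanced away)
print(monitor.update(eyes_on_road=False, now=3.0))  # visual_warning
print(monitor.update(eyes_on_road=False, now=7.0))  # audible_alarm
```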
The worst thing a company can do is anything that embraces the misconception that these features don't require constant human monitoring (yes, Tesla's Full Self-Driving takes the cake for now, but expect other manufacturers to push the envelope in this macabre jockeying for position). Perhaps as bad is a company releasing incomplete features to customers (labeled beta or not) with the plan of learning from these deployments to improve their systems until they are one day good enough to be labeled Level 4.
This is the worst of the worst. The result is the unholy trinity of: 1) advanced automation that quickly gains human trust, 2) that hides behind the Level 2 expectation that the driver "constantly supervise" the feature, and yet 3) is essentially unfinished/half-baked because the plan is to improve it based on field data. In this situation, the car owner (you) has become the test driver / test engineer for this company without knowing it. Owners become development drivers driving a development vehicle - something that even within the industry is reserved for people with special training and experience.
Scenario Testing
Scenario testing means selecting a set of road maneuvers to put the autonomous vehicle through and judging how it handles each one. A two-lane highway where another vehicle cuts in front of the autonomous vehicle while passing is a simple, if under-specified, example of a scenario.
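To make "under-specified" concrete, here is what pinning that cut-in down into a testable scenario might look like. The fields and values are invented for illustration and are not drawn from any particular scenario format (OpenSCENARIO and similar standards define this far more rigorously).

```python
from dataclasses import dataclass

@dataclass
class CutInScenario:
    """Illustrative (and still under-specified) description of a highway cut-in."""
    ego_speed_mph: float             # speed of the AV under test
    cut_in_vehicle_speed_mph: float  # speed of the car that cuts in
    gap_at_cut_in_m: float           # longitudinal gap when it crosses the lane line
    road_surface: str                # "dry", "wet", "icy", ...
    lighting: str                    # "day", "dusk", "night"

# One concrete instance; the test's result is how the AV handles it.
example = CutInScenario(
    ego_speed_mph=65.0,
    cut_in_vehicle_speed_mph=70.0,
    gap_at_cut_in_m=8.0,
    road_surface="dry",
    lighting="day",
)
```

Even this toy version hints at the first problem below: every field can take many values, and real scenario descriptions have far more fields than this.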
There was a time when scenario testing was in the running to be the way to show evidence that an autonomous vehicle is safe. The idea went that if you just picked the right set of scenarios, you could justly claim the autonomous vehicle was safe. The industry no longer sees this as true.
Now let me explain before anyone gets upset. I’m not saying there is no value in scenario testing - there certainly is. But there is no set of scenarios that one can say “if the autonomous vehicle makes it through these - it is safe.”
So here’s why Scenario Testing alone isn’t safety:
1) The sheer number of combinations of variable conditions and maneuver nuances that make up real-world driving is far too large to capture as a reasonably-sized set of scenarios.
2) “Difficult” scenarios for autonomous vehicles are poorly understood and likely to be very different from scenarios that are difficult for humans -- yet humans would be choosing the scenarios.
3) Safety is not simply the superficial, observable output.
Number one is pretty self-explanatory. Number two speaks to the nature of how any large body of software works, amplified by the pitfalls of artificial intelligence. Remember that AVs will have upwards of 1 billion lines of code; compare that to just 50 million lines of code in Microsoft Windows. Worse yet, when artificial intelligence fails, it is often for reasons no person would have imagined -- this has been shown in the AV world with something as simple as stickers on a stop sign. If stickers on a stop sign wasn’t one of the scenarios on your proposed list, that AV could have passed your test and been deemed ‘safe’ when it wasn’t.
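Circling back to number one for a moment, a back-of-the-envelope calculation gives a feel for how fast the combinations blow up. The parameter counts below are invented purely for illustration; real scenario spaces have many more dimensions than this.

```python
# Deliberately modest, invented counts for a handful of scenario dimensions.
maneuver_types      = 50   # cut-in, cut-out, merge, unprotected left turn, ...
other_road_users    = 20   # cars, trucks, cyclists, pedestrians, strollers, ...
speed_bands         = 10
road_geometries     = 30   # lane counts, curvature, intersections, ...
weather_conditions  = 8
lighting_conditions = 4

combinations = (maneuver_types * other_road_users * speed_bands *
                road_geometries * weather_conditions * lighting_conditions)
print(f"{combinations:,}")  # 9,600,000 -- before varying timing, positions, or behavior
```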
But number 3 really drives the point home. Safety is not simply the superficial, observable output. Safety is the rationale for why one action is chosen over another, and what happens when the vehicle encounters a situation it cannot handle and starts making bad decisions. So even if you picked a great set of scenarios that really did put the specific AV system being tested under situations that are genuinely challenging for it, the fact that it made it through the scenarios successfully doesn’t tell you anything about its rationale or how it would fare when it encounters a situation it can’t handle well.
A quick shout-out to my colleagues working on scenarios -- there is value in scenario testing. Having a representative set of scenarios informed by real driving data could well serve some preliminary internal company test phase, or perhaps even an early testing permit authorization at some government level. Although developers are eager to talk about (removing) hurdles to deployment, reasonable hurdles to allow public testing are just as underdeveloped and in need of justification and research. University of Michigan’s Mcity researched a set of such tests and appropriately called out that passing such a test does not constitute proof of safety but has other value, especially in the realm of demonstrating “Roadmanship.”
Saying “our autonomous vehicles are always learning”
“Our system is always learning…” itself implies there is learning left to be done. An 8-year-old is “always learning”; that doesn’t mean they’re a safe driver. A 4-year-old is doing even more learning -- because they know that much less than the 8-year-old. They’re definitely not safe. The fact that learning is occurring is in no way a substitute for safety. If anything, this is one layer of a multi-tier logical-fallacy cake being pushed. If they can convince you that their AVs are “always learning,” then they can better justify public road deployments (so they can learn!), and they can argue that the many miles they’ve driven autonomously are evidence of how safe they are because they’ve been learning that whole time (does this mean they were under-educated through those miles? Were they unsafe?). The icing on the cake is that they may tell you, on a different advertising thread, how safe they already are.
Let’s get into the “our system is always learning” statement. It makes the system seem almost alive. It makes you think that if an AV didn’t know the kid on crutches was a person and flew right by them - hey, it has now learned! Next time it’ll get it right! This is not how it works. For the system to have any chance of getting it right next time, it (or rather an engineer) needs to do all of the following (a rough sketch of this loop in code follows the list):
- Know the system got something wrong
- Update the training set in some way
  - At the least, by adding this new photo, labeled correctly
  - Better yet, through engineers retrospectively looking into why it got this wrong and seeing if there was some crazy bias -- like maybe it doesn’t think humans would ever have crutches -- and then removing that bias
- Rerun some of the training or otherwise produce an update to the trained model
- Test / gain confidence that this updated training didn’t break something else
- Deploy the updated model to the AV
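Here is a rough, pseudocode-style sketch of that loop. Every function name below is hypothetical - each one stands in for days or weeks of engineering and review work, not a single library call.

```python
def handle_misclassification(logged_event, model, training_set):
    """Hypothetical outline of how one 'learned' example actually reaches the fleet.
    Every function called here is a placeholder, not a real API."""
    # 1. Someone has to notice and confirm that the system got it wrong.
    failure = triage_and_confirm_failure(logged_event)

    # 2. Update the training set: at minimum, add the correctly labeled example...
    training_set.add(failure.sensor_data, label="pedestrian_on_crutches")

    # ...better yet, dig into whether a systematic bias caused the miss
    # (e.g., the model never saw people using crutches) and close that data gap.
    bias_report = investigate_root_cause(failure, model)
    training_set.extend(collect_examples_covering(bias_report))

    # 3. Rerun training or otherwise produce an updated model.
    new_model = retrain(model, training_set)

    # 4. Test / gain confidence the update didn't break something else.
    #    This regression step is usually the longest part of the whole process.
    if not passes_regression_suite(new_model):
        return model  # keep the old model; back to the drawing board

    # 5. Only now does the update get deployed to vehicles in the field.
    deploy_to_fleet(new_model)
    return new_model
```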
This process could take weeks, months, or even years to get the new learning into the vehicle. Especially the testing step. So let’s get something straight -- “our system is always learning” doesn’t mean what you might think.
If you want to learn more about how AI in autonomous vehicles works check out this blog post.
More to come
What constitutes AV safety is a complicated subject. But what *is not* AV safety - especially among the information being put out nowadays - is easier to explain. I want to emphasize that I’m not ‘against’ autonomous vehicles; I’m excited to see them develop. I've spent the better part of the last ten years bringing them to market through work in the business, product, validation, system design, and software engineering of autonomous vehicles. It is not only possible to safely deploy autonomous vehicles - it is also possible to inform the user base and the public that autonomous vehicles truly are safe, and that is the most critical part.
In the upcoming parts of this series, we’ll talk about things like AV driver’s licenses, simulations, and more. Have a concept you’d like us to discuss in the series? Speak up in the comments below or reach out directly.