What We Need to Talk About When We Talk About AI

Ever since Alan Turing’s “imitation game,” we’ve been acutely aware of the importance of measuring the capabilities of computers against our own miraculous brains. The British pioneer’s method, outlined in 1950, seems primitive today, but it sought to answer a persistent question: How will we tell when a machine has become as intelligent as, or more intelligent than, a human being?

Defining such progress is imperative for productive conversations about artificial intelligence. Specifically, the question of what counts as artificial general intelligence, a “mind” as adaptable as our own, needs to be considered using a set of shared parameters. Currently, the term lacks a precise definition, leaving predictions of AGI’s arrival and impact either unnecessarily alarmist or insufficiently concerned.

Consider the hopeless spread of predictions on AGI. Earlier this year, the preeminent AI researcher Geoffrey Hinton predicted “without much confidence” that AGI could be present within five to 20 years. One attempt to collate the views of approximately 1,700 experts yielded timing estimates ranging from next year to never. One reason for the chasm is that we haven’t decided collectively what we’re even talking about. “If you were to ask 100 AI experts to define what they mean by ‘AGI,’ you would likely get 100 related but different definitions,” notes a recent paper from a team at DeepMind, the AI unit within Google.

One of the paper’s co-authors, Shane Legg, is credited with popularizing the AGI term. Now he and his team are seeking to set up a sensible framework with which to measure and define the technology — a taxonomy that can be used to help assuage or heighten fears and offer straightforward context to non-experts and legislators.

The effort is modeled on the system for describing the capabilities of self-driving cars. In 2014, SAE International (formerly the Society of Automotive Engineers) defined six distinct levels of autonomous capability, from Level 0 — human driver in full control of vehicle’s operation — to Level 5 — full automation of all the vehicle’s functions in all conditions. The scale has proved useful for lawmakers to set rules of the road and for the public to understand their cars’ capabilities. A car with Level 2 automation — steering, lane changes, acceleration and deceleration, in some settings, mostly on highways — can be legally driven on the road today on the condition that a human is sitting alert to take over immediately. But Level 4 or 5 cars, such as Alphabet’s Waymo cars on trial in San Francisco, need special permission to be used in public and are subject to additional oversight on their performance.

Classifying AGI will be much more complex than classifying autonomous vehicles, because self-driving is merely one subset of what AI can do. But the leveling system is useful for AI, too. In assessing capabilities, the DeepMind team split AI into two groups: narrow and general. A narrow AI, for instance, could have superhuman capability for one application, such as protein folding, but be incapable of writing a simple short story. To be considered AGI, according to DeepMind, a system must be able to perform a “wide range of non-physical tasks, including metacognitive abilities like learning new skills.”