OpenAI Five is an interesting milestone in an Artificial Intelligence (AI) project that sketches out a clear path to general military applications for AI. OpenAI Five is essentially a team of five AIs that play a particularly complex video game (Defense of the Ancients, 'Dota 2') together. There are many interesting and important things about it.
It’s interesting because it is a very ambitious project which has so far been very successful. A year ago it was successfully tested against very good human players in a very limited version of the game: a 1v1 format where players had almost no freedom in shaping how the game would be played. Last week it was successfully tested in a far less limited version of the game: 5v5, but with a lot of the game’s possible complexity stripped back. In this test it was able to consistently and very convincingly beat teams of five human players, even those drawn from the 99.95th percentile (though it did not beat professional players when the rules were changed at a week’s notice). The AI showed all kinds of interesting emergent behaviours, including the apparent ability to deceive and shape adversaries. It ran on a tremendously powerful and expensive computer, but not a supercomputer in the traditional sense; it was a collection of commercial products (such as 256 graphics cards at $7000 USD each). And it taught itself to play.
A year ago, when OpenAI first demonstrated the very limited 1v1 version, Commander Forces Command challenged Army about it on Twitter, which really caught my attention, but I was intensely sceptical about OpenAI. Even though the tournament at which it was demonstrated had a prize pool of $25 million USD, it had never occurred to me that Dota was anything other than an incredibly niche game that didn’t garner notice outside its own community, so it was very strange to see a Major General posting about it. The OpenAI demonstration didn’t impress me much. It was in such a limited version of the game that I assumed scaling it to team play, with lots of gameplay options over greater lengths of time, would immediately hit enormous roadblocks. I wasn’t alone in this assessment; its success is interesting because even the designers didn’t think that current machine learning techniques would be able to deal convincingly with some of the challenges that the format throws up. It turns out that both the designers and I were excessively sceptical about the state of current AI algorithms.
Potential role of AI in planning military operations
If you’re in the military, say you’re a general service officer, then the technical specifications should be extremely interesting, because they suggest that AI is not too far off relieving you of ever having to plan a military operation. OpenAI Five deals with a very different problem space from traditional gaming AIs. While there are astronomically many possible games of chess, each turn only offers a player about 35 valid moves on average, games only go to about 40 moves on average, players make their moves sequentially, both players can see the whole board and there is only one agent with unknown intentions. By comparison, each player in Dota has between 1000 and 170,000 valid moves per turn, there are thirty turns per second, games go for about 80,000 turns, all ten players take their turns at the same time, no one has access to even a tiny fraction of the total information about the board state, and there are nine agents who have unknown intentions. These numbers are not impressive simply because they are higher; they are impressive because they mimic the ways that humans have to make decisions against other humans, in ways that turn-based strategy games can’t test.
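To give a rough sense of scale for the figures above, here is a minimal Python sketch. It makes a crude assumption that is mine, not OpenAI’s: that a player has the same number of options at every turn, so the number of possible move sequences for one player is simply options raised to the power of turns. The function name and the choice to work in logarithms are illustrative only, and the sketch ignores the other players, simultaneous moves and hidden information, all of which make the real problem harder still.

```python
import math


def log10_sequences(options_per_turn, turns):
    """Return log10 of the number of distinct move sequences for one player.

    Crude assumption: the number of options is constant at every turn,
    so the count is options_per_turn ** turns. We work in log10 because
    the raw numbers are far too large to print usefully.
    """
    return turns * math.log10(options_per_turn)


# Figures quoted in the paragraph above.
chess = log10_sequences(35, 40)                  # ~35 moves per turn, ~40 turns
dota_low = log10_sequences(1_000, 80_000)        # lower bound on moves per turn
dota_high = log10_sequences(170_000, 80_000)     # upper bound on moves per turn

print(f"chess: roughly 10^{chess:.0f} sequences")
print(f"Dota (low):  roughly 10^{dota_low:.0f} sequences")
print(f"Dota (high): roughly 10^{dota_high:.0f} sequences")
```

Even on the lower bound, the exponent for Dota is thousands of times larger than the exponent for chess, which is why exhaustively searching ahead, the traditional approach for board games, is not an option here.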
The technology behind OpenAI Five is a general tool. It was not built solely to play Dota, and that’s hardly the only thing it has achieved. It has done a variety of interesting things, from sumo wrestling to learning to use a hand, though none as taxing as OpenAI Five. Given an appropriate learning space, planning contemporary military operations hardly seems a great stretch for the general architecture.
As a military planner, you operate with less freedom than this in most situations, and with more information. You are also almost certainly not as good at military planning as 99.95th percentile Dota players are at their craft. Your planning experience comes from a few dozen hours of TEWTs (Tactical Exercises Without Troops) and a few hundred hours of (simulated) battle command and planning at some point in the past, and you are drawn from a field of a few thousand applicants. In comparison, top-level Dota players practise for upwards of 14 hours a day, every day, and are drawn from a field of 10-30 million players who play at least once a month. We are very fond of saying that war is the most complex of human endeavours, which may be true, but none of us plans whole total wars, because we break complex things down into less complex components. Some variant of OpenAI Five could probably already plan formation military operations better than you but for one critical fact: the information on which we plan military operations is not machine-comprehensible right now. As I write this (and acknowledging that, for all I know, the Defence Science and Technology Group (DSTG) or the Defense Advanced Research Projects Agency (DARPA) has some AI we don’t know about), a machine can’t assimilate the various reports we use to ascertain information about friendly and enemy dispositions, it can’t get much militarily useful information out of available maps, it can’t go and stand on a hill to decide which bits the map got right or wrong, and it certainly can’t eyeball a nervous officer commanding who’s fudged something in their reporting. That’s hugely important, because the whole reason OpenAI Five could teach itself is that it was operating in an entirely digital space and could (I’m not making this up) squeeze roughly 900 years of training, counting each of the five AIs separately, out of each real 24-hour day.
This leads to the inevitable question: when and how will these tools replace us? Machines are getting better at interpreting the world around them, and the military uptake of autonomous or semi-autonomous platforms with sensor suites seems certain to accelerate. We’re also starting to make real (albeit mostly manual) inroads into creating a digital representation of battle with tools like BMS (Battle Management System). There are a variety of factors that could speed up or slow down the process. If military AI becomes the subject of additional international treaties, the process will slow down. If Western tech companies continue their trend of limiting support to the militarisation of AI, progress will be difficult. If there is a major war, the process will almost certainly speed up, irrespective of treaties or current tech attitudes towards military AI. This is the true significance of OpenAI Five: it has demonstrated that AI does not appear to require any fundamental advances to be better than humans at complex, adversarial, real-time decision making. The advances are required in our other systems, to make the world and our operations sufficiently machine-comprehensible and machine-trainable. Any AI advances that we achieve will just make the process easier and the ultimate result more inhumanly competent.
I think the major risk for early adoption of military AI good enough to replace human planners is what’s called ‘overfitting’. Broadly, this is when an AI learns to solve its training environment too well, overspecialising on the training set at the cost of its fitness for the real problem set. To visualise the human equivalent of overfitting, imagine (or remember) someone who’s very good at incorporating the preferences of whoever is marking them into their plan, who never fights the green and always comes up with plans that brief well, but who never properly learns the underlying concepts they’re supposed to be applying because they’ve never needed to. Normally you detect overfitting by keeping separate training and test data, and deal with it by giving the AI more varied training examples. However, this is a serious problem for a prospective military AI, because its training and testing space would necessarily be simulations, and it would learn and ruthlessly exploit sim-isms. Our simulations themselves are hazy best guesses full of sim-isms because, thankfully, we fight wars too infrequently to have anything approaching perfect information about how they really work (or will really work in the next war). This lack of an adequate training space will be a serious challenge that military AI development will need to overcome.
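The train-versus-test check described above can be illustrated with a toy example. This is a minimal sketch under assumptions of my own, not anything drawn from OpenAI Five: the 'true' rule is y = 2x with random noise, one model fits a straight line, and the other (a hypothetical 'memoriser') simply repeats the answer of the nearest training example. All function names are illustrative.

```python
import random


def make_data(n, rng):
    """Noisy samples of the assumed true rule y = 2x (noise std dev 1)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_memoriser(xs, ys):
    """Overfit model: answer with the label of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a predictor on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)                   # fixed seed so the run is repeatable
train_x, train_y = make_data(500, rng)   # data the models learn from
test_x, test_y = make_data(500, rng)     # held-out data they never see

line = fit_line(train_x, train_y)
memoriser = fit_memoriser(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
memo_train, memo_test = mse(memoriser, train_x, train_y), mse(memoriser, test_x, test_y)

print(f"line:      train MSE = {line_train:.2f}, test MSE = {line_test:.2f}")
print(f"memoriser: train MSE = {memo_train:.2f}, test MSE = {memo_test:.2f}")
```

The memoriser scores perfectly on the data it trained on, because it has learned the noise as well as the rule, and then does worse than the plain straight line on data it has not seen. That gap between training and test performance is the signature of overfitting. The catch for a military AI is that both data sets here come from the same generator; if the 'test' simulation shares the same sim-isms as the training simulation, this check will not reveal them.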
Then there’s AI risk. Some of us scoff at fictional accounts of machines existentially threatening humanity, like Skynet (in Terminator), the thinking machines overthrown in the Butlerian Jihad (in Dune), the Machines (in The Matrix) and so on, as Hollywoodisms. We might be unwise to do so; many very clever people who are experts in the area don’t laugh it off lightly, and they find the idea of a sufficiently clever AI that wants to make paperclips frightening, let alone ones that we might design and let loose to fight total war. Great solutions to potential runaway military AIs are not apparent to me. I’ve seen a lot of different people (for example Sean Welsh and Michael Shoebridge) arguing the ethical necessity of a “human-in-the-loop” for lethal systems, but humans are slow and fallible. Maybe we will take the ethical high ground and keep humans in lots of small and time-sensitive loops, but the side that confines the fewest humans to the fewest, largest and longest decision loops will field a force that makes many more and better decisions, much faster. With that in mind, it’s hard to see how there’ll be much in the way of real choice in the long run. Massed conventional forces under escalating AI control may well be our generation’s doomsday weapon of mutually assured destruction, equivalent to strategic nuclear weapons. It’s possible that smaller forces under AI control will carry a taboo equivalent to that of tactical nuclear weapons.
AI is an emerging field that already influences our lives. The results search engines show you, as well as the news and comments on Facebook, are all products of machine learning. In some jurisdictions police use trend analysis carried out by AI, judges refer to AI recommendations in deciding on parole, and politicians use AI to direct their campaigns. When you order goods from an online store, elements of the supply chain that delivers them are probably AI-optimised, and large portions of that supply chain are automated. Share markets are largely run by machines: arbitrage and most trades occur far too quickly for humans to keep up, and decisions on how and when to make them are largely made by AI. Even HR departments in large businesses are often AI-augmented; AIs have decided to fire people and have carried out the firing, and they are often the first gatekeeper for a job application. The military is no different. We will ultimately not have a real choice in whether or not we adopt AI decision making because, before long, fielding a force not directed by military AI will be the equivalent of fielding a force without radios or firearms.