In August 2017, the research company OpenAI (co-founded by billionaire entrepreneur Elon Musk) unveiled an Artificial Intelligence (AI) agent that taught itself the online strategy game Defence of the Ancients 2 (DotA 2).[1] DotA 2 is regarded as one of the most complex online strategy games.[2] The agent was given only very simple reward signals: dying is bad, taking damage is bad, killing enemies is good and winning the game is good. This enabled the system to explore its environment and develop its abilities. From there it built on these lessons, attempting more complex manoeuvres and functions with each game run until it reached its goal of winning the game.
Initially the system would stand still, until it discovered movement. However, because it could run many games in parallel, around the clock and without rest, within a few weeks the agent was executing complex commands at a level of strategic competence that champion human players had taken years to master. When exhibition matches were held between top DotA 2 players and the agent, it won every game. The professional players reported how difficult it was to compete against a system executing strategies that had never been conceived of before. They subsequently adopted those strategies themselves to great effect, until the system (inevitably) devised effective counter-strategies of its own.
OpenAI's agent has not undermined the competition of DotA 2. Instead, it has helped revolutionise and refresh the perspectives of elite online players and their approach to the game, making them more effective competitors. Mashable reported on the capability of the agent and its impact on the DotA 2 gaming world.[3]
Similar breakthroughs in cognitive games have been achieved in the past, the first being Deep Blue, developed by IBM. The chess computer first played World Chess Champion Garry Kasparov in 1996 and lost the match 4-2. However, in a 1997 rematch, Deep Blue defeated Kasparov 3½-2½.
Another example is the AI system AlphaGo, which was programmed to learn the far more complex Chinese game of Go. In March 2016, AlphaGo defeated 9-dan professional player Lee Sedol (then ranked second among international Go players) 4-1.
Reinforcement learning with neural networks is becoming an increasingly potent approach, allowing AI programs to learn and excel through trial and error. The generality of this approach means an AI system can achieve elite-level competency in a wide range of games it is set to play, through simple repetition and adjustment.
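To make the trial-and-error idea concrete, the sketch below is a deliberately tiny example of tabular Q-learning, one of the simplest reinforcement learning methods. It is not OpenAI's actual system (which used large neural networks): the environment here is a hypothetical five-cell corridor where the agent is rewarded only for reaching the right-hand end, the equivalent of the "winning the game is good" signal described above.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells (0..4).
# The agent starts in the middle and "wins" by reaching cell 4.
N_STATES = 5
ACTIONS = [-1, +1]          # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

# Q-table: the learned estimate of future reward for each (state, action)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def choose(state):
    """Epsilon-greedy: mostly exploit the best known action, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

random.seed(0)
for episode in range(500):              # each episode is one "game run"
    state = 2                           # start mid-corridor
    while state != 4:
        action = choose(state)
        nxt = max(0, state + action)    # can't walk off the left end
        reward = 1.0 if nxt == 4 else 0.0   # reward only for winning
        best_next = 0.0 if nxt == 4 else max(Q[(nxt, a)] for a in ACTIONS)
        # Trial-and-error update: nudge the estimate toward observed outcome
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = nxt

# The learned policy: the best action from each non-terminal cell
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(4)}
print(policy)   # after training, the agent steps right from every cell
```

Nothing here is DotA-specific: the same loop of act, observe reward, adjust estimate is what, scaled up enormously, lets an agent progress from standing still to elite play.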
How does this apply to the armed forces?
The current application for AI in the armed forces is its role as an autonomous system with access to weaponry, able to engage targets faster than a human can. There is no doubt that faster and more accurate target acquisition enhances survivability on the battlefield. However, I propose that AI can offer more than guidance systems: it can also serve as a robust wargaming instructor.
When undertaking enemy appreciation during a TEWT (tactical exercise without troops), there is a risk that cognitive biases in favour of the Blue Force will water down the enemy's Most Dangerous Course of Action (MDCOA) so that it artificially complements a planned decisive event. An OpenAI-style system could comprehend and learn force capabilities through the Army's electronic gaming platforms already in service. Such a system could then do two things:
Firstly, the system's ability to dispassionately calculate strategies to outmanoeuvre Australian forces in-game would provide a realistic enemy MDCOA and challenge the OODA loop process, thus providing a higher calibre of strategic and tactical training to officers attending courses such as COAC (Combat Officer Advanced Course).
Secondly, the constant evolution of machine thinking could inform officers of various ranks and corps of alternative approaches to tactical and strategic thinking. Examples include showing a support commander how to most efficiently connect logistic nodes from an SPOD (sea point of disembarkation) to front-line forces, or giving a manoeuvre commander insight into how to utilise different deception strategies to best engage the enemy. Such insights may at least provide additional courses of action for commanders to consider, and at best may inspire doctrine to be revised.
This proposal is framed within the scope of conventional warfare. It is therefore reiterated that the AI system would serve as an instructive tool to inform commanders, not as a replacement for them. The SMAP (staff military appreciation process) therefore still serves a vitally important function for the commander.
The proposed function of AI in this perspective is to utilise AI learning systems to provide a stronger, more challenging OPFOR (opposing force) and to inspire officers' creativity and lateral intellect. In short, an AI learning system that can challenge commanders to become better thinkers at all levels would be an advantageous proposition for Army to pursue.