(Image Credit: iStockPhoto/nicescene)
The growth of AR hit Pokémon Go was staggering and far exceeded the humble expectations of developer Niantic Labs. Within just 15 minutes of launching in Australia and New Zealand, player traffic had surged well past the company's projections, causing a range of stability issues.
While it was a troublesome launch period, it could have been much worse had Google's Customer Reliability Engineering (CRE) team not stepped in. CRE is a new team of technical staff at Google, and Niantic became its first customer after the developer called for reinforcements ahead of the US launch planned for the next day.
Many of Pokémon Go's services use Google Cloud, which is no surprise considering Niantic was owned by Google until October last year. Cloud Datastore became a direct proxy for the game’s overall popularity given its role as the game’s primary database for capturing the Pokémon game world.
The developer set a worst-case scenario of 5x expected traffic and prepared for this number accordingly. When the game launched, player traffic surged to 50x the initial expectation. You can imagine the mixture of panic and excitement in the crisis room at traffic ten times higher than even the worst-case estimate. In response, Google CRE provisioned extra capacity on behalf of Niantic to stay well ahead of its record-setting growth.
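To put those multiples side by side, here is a minimal Python sketch of the arithmetic. The article gives no absolute traffic figures, so the baseline is normalized to 1.0, and the class and field names are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaunchEstimate:
    """Traffic planning figures, expressed relative to a baseline target."""

    expected: float           # baseline traffic target (normalized; assumed unit)
    worst_case_factor: float  # planned worst case, as a multiple of expected
    actual_factor: float      # observed traffic, as a multiple of expected

    @property
    def provisioned(self) -> float:
        # Capacity prepared for the worst-case scenario.
        return self.expected * self.worst_case_factor

    @property
    def actual(self) -> float:
        # Traffic that actually arrived.
        return self.expected * self.actual_factor

    @property
    def overshoot(self) -> float:
        # How many times the actual traffic exceeded the worst-case plan.
        return self.actual / self.provisioned


# Figures from the article: 5x worst case, 50x actual.
estimate = LaunchEstimate(expected=1.0, worst_case_factor=5.0, actual_factor=50.0)
print(f"Actual traffic was {estimate.overshoot:.0f}x the worst-case plan")
```

Keeping the baseline separate from the two multipliers makes it easy to substitute a real target if one were known.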
CRE stuck with Niantic and continued to create and deploy solutions as problems arose. Working together, the two teams conducted a complete review of Niantic's architecture under the supervision of Google Cloud engineers, to ensure the millions of new players entering the game weren't left disappointed, even though the servers had not been reliable for everyone from the start.
The application logic for Pokémon Go runs on Google Container Engine (GKE), which is powered by the open-source Kubernetes project, making it a great example of container-based development. By choosing this approach, Niantic was able to focus on deploying live changes for its players and treat the game like a service which continues to improve.
In a feat which Google compares to 'swapping out the plane's engine in-flight', CRE and Niantic upgraded to a newer version of GKE so that more than a thousand additional nodes could be added to Niantic's container cluster in preparation for the game's Japanese launch, all while taking measures not to disrupt existing players.
"On top of this upgrade, Niantic and Google engineers worked in concert to replace the Network Load Balancer, deploying the newer and more sophisticated HTTP/S Load Balancer in its place. The HTTP/S Load Balancer is a global system tailored for HTTPS traffic, offering far more control, faster connections to users and higher throughput overall — a better fit for the amount and types of traffic Pokémon GO was seeing," Google explains.
By going through each problem and making the required upgrades in partnership with the CRE team, Niantic launched Pokémon Go in Japan without incident, even though three times as many users signed up as during the US launch two weeks earlier. If you ask me, that's impressive.
Are you impressed with how Google and Niantic handled the Pokémon Go launch? Let us know in the comments.