
Flaky Tests and Monkeys: What I Learned at GTAC 2014

The Google Test Automation Conference (GTAC) was held at Google’s Kirkland, WA campus over two days last month. It is the first conference I’ve attended that is dedicated to what I do for a living: developing automated tests and testing infrastructure.

Testing is key to releasing high-quality software, and automation ensures that testing is repeatable, reliable and fast. Yet testing and test tooling have struggled to gain respectability. Testing is often seen as a stepping-stone to the engineering department, not a career. To be in a room packed full of smart, capable people dedicated to testing was amazing. Even though we test different products, we were all discussing the same themes and challenges.

I heard a lot about flaky tests and monkeys. Among test automators, flaky tests are controversial - described as ‘poison’ and ‘worse than no test’. Yet a test that flips between pass and fail could also indicate that it touches so many aspects of a given system, both internal and external, that it serves as an excellent final proof of operability. Almost every presentation ended up mentioning these tests and how to deal with them in your testing system.

And then there are the monkeys. We love to call our bots by monkey names. Netflix has a whole simian army - Chaos Monkey, Doctor Monkey, Chaos Gorilla, and more! Probably not all that surprising given our job is to monkey with systems.


Automate everything, including your automation.

Never Send a Human to do a Machine’s Job: How Facebook uses bots to manage tests - Roy Williams (Facebook)

The continuous integration pipeline for Puppet/Puppet Enterprise ends with a Jenkins log. If tests go red, we depend upon knowledgeable team members to read the log and recognize known flakes (which we track nowhere but in our own memories). They then decide whether the red result is trustworthy (by re-running, re-testing or re-configuring) and track down the appropriate person to take responsibility: the test writer or the developer associated with the feature under test. We throw these logs away after a few months.

In comparison, Facebook collects and stores test results. A system of bots monitors collected test outcomes. A bot reruns failed tests to ensure the results are trustworthy. A bot discovers flaky tests by comparing current results to historic reports. A bot monitors for new tests in the system and doesn’t include them in reported results until they are reliably green. A bot notifies test owners of tests that need examination. All with no human intervention.
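
To make the bot idea concrete, here is a minimal Python sketch of how stored results could be sorted into buckets. It is not Facebook’s implementation: the `RunRecord` type, the `classify_tests` function, the four labels and the five-run threshold are all assumptions invented for illustration. The flakiness rule it uses is deliberately simple: a test that has both passed and failed against the same revision is treated as flaky.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """One stored test outcome (hypothetical schema)."""
    test_name: str
    revision: str  # identifier of the code under test
    passed: bool


def classify_tests(records, min_runs=5):
    """Label each test 'flaky', 'new', 'failing', or 'stable'.

    `records` must be in chronological order, oldest first.
    `min_runs` is an assumed threshold for trusting a test's history.
    """
    outcomes_by_revision = defaultdict(lambda: defaultdict(set))
    run_counts = defaultdict(int)
    latest_passed = {}

    for record in records:
        outcomes_by_revision[record.test_name][record.revision].add(record.passed)
        run_counts[record.test_name] += 1
        latest_passed[record.test_name] = record.passed

    labels = {}
    for name, by_revision in outcomes_by_revision.items():
        # Same revision, different outcomes: the test itself is unreliable.
        if any(len(outcomes) > 1 for outcomes in by_revision.values()):
            labels[name] = "flaky"
        # Too little history to report on yet.
        elif run_counts[name] < min_runs:
            labels[name] = "new"
        # Consistent per revision, but the most recent run is red.
        elif not latest_passed[name]:
            labels[name] = "failing"
        else:
            labels[name] = "stable"
    return labels
```

Calling `classify_tests(records)` on the stored history returns a dict from test name to label. A notifier could use it to leave ‘new’ tests out of reports, re-run ‘flaky’ ones, and contact the owner of anything ‘failing’.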

This talk has given me six months of work (at least). We’re now designing our own centralized test result center and auto-monitoring system. No more scrolling through a giant log on Jenkins!

Mobile testing is hard, but improving.

Way back in my early days at Mozilla, I did some work testing Firefox on mobile devices. It was terrible. The devices were painful to configure and once configured they would crash or simply turn themselves off. Maintaining any sort of consistent, repeatable test environment was just barely possible.

That was years ago and things are still terrible but getting better. A lot of smart people are working on a lot of clever systems with both emulators and racked devices. While I don’t test on mobile these days, it is likely that I will again in the future, and I’ll be referring to these talks when I do.

You can automate it. No really, you can.

Test Automation on an Infrared Set-top Box - Olivier Etienne (Orange)

At the end of day one, Olivier Etienne described his work at Orange automating the testing of set-top boxes. The boxes communicate by infrared remote control. He and his team customized Raspberry Pis with infrared transmitters. The Raspberry Pis enable communication between a Selenium test system and set-top boxes connected to TVs. The test system sends action requests, which are turned into the appropriate infrared messages. The set-top box responds to those messages, which updates the TV display. A screenshot of the TV is then sent back up the pipe to Selenium, where it can be checked for correctness. Let’s face it, this is all pretty crazy. And cool. Think that you can’t automate something? Yes, you can!
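
The translation step in the middle of that pipeline is easy to picture in code. The Python sketch below is purely my own illustration, not the Orange team’s code: the action names, the key names, the `FakeTransmitter` class and the `perform` function are all hypothetical, and the transmitter only records keys instead of driving real infrared hardware.

```python
# Hypothetical mapping from test actions to remote-control key names.
ACTION_TO_IR_KEY = {
    "open_menu": "KEY_MENU",
    "select": "KEY_OK",
    "channel_up": "KEY_CHANNELUP",
}


class FakeTransmitter:
    """Stand-in for the infrared hardware; records keys instead of sending."""

    def __init__(self):
        self.sent = []

    def send(self, key):
        self.sent.append(key)


def perform(action, transmitter):
    """Translate a test action into an infrared key and transmit it."""
    try:
        key = ACTION_TO_IR_KEY[action]
    except KeyError:
        raise ValueError(f"no infrared key mapped for action {action!r}") from None
    transmitter.send(key)
    return key
```

A test would call `perform("open_menu", transmitter)` and then check the returned screenshot. Swapping `FakeTransmitter` for an object that drives a real transmitter would leave the test code unchanged.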

Manual Testing is dead. Long live Exploratory Testing!

Automated testing, automated pipelines, automated monitoring - everything is becoming automated. It may be a little presumptuous for a room full of automation developers to consider manual testing dead, but we are all dedicated to working towards a world where manual testing is no longer necessary. No person should have to sit for eight hours a day following a script of test steps and clicking a series of buttons; that’s what computers are for.

This is not to say that actual human interaction with the product under test is no longer important. Automation should take over any test that is clearly defined and scoped, tests that are expected to pass and continue to pass, anything understood and repeatable. This frees up QA to do the interesting stuff that machines can’t do: experiment. Exploratory testing is where new tests are discovered. Creative people find creative ways to break things.

Let’s all go to GTAC!

If you want to learn about test automation, GTAC is where you want to be - and it’s free. Unfortunately, they only accepted 300 people this year so you do need to get your registration in and cross your fingers that you will be selected. Even if you don’t make it in the door, Google makes great efforts to simulcast the entire conference - including providing closed captioning. You can watch all of the talks without being confined to a stiff-backed plastic chair for 8 hours. My one complaint: Google, your chairs hurt!

I’m very interested in attending GTAC next year and hey, with all the neat stuff I’m planning on building, maybe I’ll even have something to present.