Continuous Integration on Android
Coltin Caverhill, Software Engineer
- Apr 20, 2017
Continuous Integration (CI) is a powerful tool for the Android team at Yelp. It gives us a platform to ensure quality on all eight of our Android apps, allowing us to emphasize testing and ensure our newest features aren’t breaking our oldest masterpieces. For example, we don’t want Yelfies to crash the app while you’re writing your next great review.
To achieve this, we have a dedicated cluster of AWS machines running our CI server. These machines run JUnit tests and static analysis, build APKs, merge code, and send us notifications. They also work with Firebase Test Lab to start up around 10,000 emulators each day to run our UI tests.
This post will walk through how you can replicate our setup.
You Miss 100% of the CI You Never Build
We use Jenkins at Yelp as our CI server. Thanks to the hard work of our Release Engineering and Operations teams, we can use Terraform to describe the machines we want in our cluster by specifying the AWS instance type and number of machines. Terraform handles starting up new machines and shutting down old ones. We also use Puppet on top of this to define how each machine is configured, making it easy to have a consistent kernel version or describe how SSH will work.
Snippet from our Terraform file:
While you can quickly get set up with Jenkins on AWS without Puppet or Terraform, we find they cut down significantly on our maintenance and allow us to easily scale out machines as our needs change.
Regardless of how you configure your machines, you’ll need to create jobs that do things. For us, these jobs build Android apps into APKs, run static analysis and tests, and aggregate statistics and other information.
We originally configured these jobs through the Jenkins web UI. It worked okay just to get us started, but it was very error-prone and led to many issues. It was common for someone to make a change and not realize something unrelated broke. Usually the culprit was this person testing their job and seeing that it worked, but not realizing another job that relies on it would break the next time it ran. Whoever ran that parent job would now think they were at fault, and spend time debugging and trying to track down how they could have possibly broken it. This problem compounds as your team grows and more changes are in flight.
The solution for us was the Jenkins Job Builder (JJB). This allowed us to define jobs as YAML or JSON files. More importantly, we could put our jobs under source control, and thus code review them. This meant our changes had a lot more quality behind them, and code reviews encouraged knowledge sharing. At least one other person was now aware of our changes.
Now that we can define jobs, let’s do something useful with them.
Some of the easiest jobs to set up are ones that run static analysis on your code. You can configure tools like PMD, FindBugs, Checkstyle, and Android Lint, as well as JUnit tests, to run with Gradle. Once you can execute these on a local development machine, it’s very easy to get them to run on CI with Jenkins without any special setup. Using the JJB it might look like this:
You could of course do this through the web UI if you’re not inclined to set up the JJB just yet. This job will run each of the Gradle tasks and publish artifacts (like the Lint report), making it easy to figure out what you need to fix when something goes wrong. You can now have this job run before people merge code, so problems are caught before new changes land.
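As a rough illustration of the job described above, here is a minimal sketch that writes a JJB job definition as JSON (one of the two formats JJB reads). This is not our actual configuration: the job name, the Gradle task names, and the report path are all assumptions, and you should substitute whatever tasks your own build defines.

```python
import json

# Hypothetical Gradle task names; use the tasks your build actually defines.
GRADLE_TASKS = ["lint", "pmd", "findbugs", "checkstyle", "test"]


def static_analysis_job(name, tasks):
    """Return a Jenkins Job Builder job definition as plain Python data."""
    return [
        {
            "job": {
                "name": name,
                # One shell build step that runs every Gradle task in order.
                "builders": [{"shell": "./gradlew " + " ".join(tasks)}],
                # Archive generated reports so failures are easy to inspect.
                # The glob below is an assumed report location.
                "publishers": [
                    {"archive": {"artifacts": "**/build/reports/**"}}
                ],
            }
        }
    ]


def to_jjb_json(definition):
    """Serialize the definition to JSON for JJB to consume."""
    return json.dumps(definition, indent=2)


if __name__ == "__main__":
    job = static_analysis_job("android-static-analysis", GRADLE_TASKS)
    print(to_jjb_json(job))
```

Because the output is just a file, it can be checked into the same repository as the rest of your job definitions and go through code review like any other change.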
One important thing CI does for us is building Android apps (APKs), and we’ve found a really great way of doing that with Docker. Docker allows us to configure an environment that can build Android apps and then drop that environment on arbitrary machines, like our Jenkins AWS slaves. A good starting point is this Dockerfile, which we used as inspiration for our own.
Building APKs on CI allows us to verify that the changes we make can still compile and build working apps. It also allows us to run automated UI tests. UI tests on Android require emulators or physical devices. Neither running emulators on AWS machines nor putting together a device farm is an easy task. Thankfully, “Emulators as a Service” is a growing trend, and we’ve had great success with Firebase Test Lab.
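Once you have an image that can build Android apps, running a build inside it might look roughly like the sketch below. It only assembles the `docker run` command line. The image name, workspace path, and Gradle task are hypothetical placeholders, not the ones we use.

```python
import shlex


def docker_build_command(image, workspace, gradle_task="assembleDebug"):
    """Build the argv for running a Gradle build inside a container."""
    return [
        "docker", "run",
        "--rm",                           # remove the container when the build ends
        "-v", f"{workspace}:/workspace",  # mount the checked-out source
        "-w", "/workspace",               # run Gradle from the project root
        image,
        "./gradlew", gradle_task,
    ]


if __name__ == "__main__":
    # Both arguments are placeholders for illustration only.
    cmd = docker_build_command(
        "example/android-build:latest",
        "/var/lib/jenkins/workspace/app",
    )
    print(shlex.join(cmd))
```

Since the build environment lives entirely in the image, the same command should behave the same way on any machine in the cluster that can run Docker.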
To use Test Lab you need to install the Google Cloud SDK and sign up for Firebase Test Lab. Once you’ve done those two things it’s very easy to run tests on these emulators. You can find details here but your invocation will look something like this:
This will start up a Nexus 4 emulator running Android Lollipop 5.1 (API 22) and run all of your UI tests against your app. When the tests are done running you can get video, logcat, and XML test results.
There is a lot of flexibility in how you run this command, allowing you to run only specific tests or test classes. For Yelp, it’s important that we get results back quickly, so we spin up hundreds of emulators every time we want to run tests. Each emulator is responsible for a single test class, allowing us to get fast results even if we add a lot more tests.
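The invocation described above can be sketched as follows, assuming the current `gcloud firebase test android run` command. The APK paths are hypothetical, and the exact command name and flags may differ depending on your Cloud SDK version.

```python
import shlex


def build_test_lab_command(app_apk, test_apk, model="Nexus4", api_level=22):
    """Build the argv for one instrumentation test run on Firebase Test Lab."""
    return [
        "gcloud", "firebase", "test", "android", "run",
        "--type", "instrumentation",
        "--app", app_apk,    # the app under test
        "--test", test_apk,  # the APK containing the UI tests
        "--device", f"model={model},version={api_level}",
    ]


if __name__ == "__main__":
    # Placeholder APK paths for illustration only.
    cmd = build_test_lab_command(
        "app/build/outputs/apk/app-debug.apk",
        "app/build/outputs/apk/app-debug-androidTest.apk",
    )
    print(shlex.join(cmd))
```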
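The one-class-per-emulator idea can be sketched like this. It is an assumption-heavy illustration rather than our actual tooling: the test class names and APK paths are made up, and it relies on the `--test-targets` flag of `gcloud firebase test android run` to restrict each run to a single class.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def shard_commands(app_apk, test_apk, test_classes,
                   model="Nexus4", api_level=22):
    """Return one Test Lab invocation per unique test class."""
    commands = []
    for test_class in sorted(set(test_classes)):
        commands.append([
            "gcloud", "firebase", "test", "android", "run",
            "--type", "instrumentation",
            "--app", app_apk,
            "--test", test_apk,
            "--device", f"model={model},version={api_level}",
            # Restrict this run to a single test class.
            "--test-targets", f"class {test_class}",
        ])
    return commands


def run_all(commands, max_workers=32):
    """Launch every invocation concurrently; return exit codes in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))


if __name__ == "__main__":
    # Hypothetical class names; a real job would discover them from the build.
    classes = ["com.example.LoginTest", "com.example.ReviewTest"]
    for command in shard_commands("app.apk", "app-test.apk", classes):
        print(" ".join(command))
```

With this shape, adding more test classes adds more emulators rather than more time per emulator, so total wall-clock time is bounded mostly by the slowest class.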
In order to start up emulators on our Jenkins machines, we created another Docker container that knows about the Google Cloud SDK. Luckily, one is already provided by Google, making it trivial to start up emulators.
Continuous integration is a powerful tool that requires a lot of love to get working and maintain, but if you’re willing to put in the work, you’ll get a lot out of it. Having a tool that automatically builds your commit, runs static analysis and tests on it, and then merges it into your master branch… do you feel that? Yeah, goosebumps.