Handling Flaky Tests: Reducing the Noise in Your CI/CD Pipeline

by Alastair Wilkes

Detecting Unreliable Tests

CI/CD bridges the gaps between development and operations by automating the building, testing, and deployment of applications. But there is a frequent blind spot within CI/CD that requires greater observability to combat its strain on the DevOps pipeline. Within testing, a common foe among teams is the fickle flaky test.

Flaky tests cause mistrust in the tests themselves. They also strain timelines, because teams often rerun the tests multiple times before deciding to investigate them. When no tool flags flaky tests, knowledge about them stays tribal. This lack of observability becomes a hidden barrier to a team's ability to deliver.

Having indicators of flaky tests, and insights into them, gives you greater visibility into the areas that need prioritizing.

Create Confidence with Flaky Tests Observability

Common culprits of flaky tests include infrastructure issues, poorly written tests, and memory failures. Every test failure requires additional time to determine whether it reflects a true error or a false test outcome.

Classifying flaky tests requires remembering the history of a test suite. In a large test suite, classifying flakes becomes a substantial burden and observability is often limited to tribal knowledge or siloed by individuals.
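One common way to turn that remembered history into something searchable is to look for tests that both passed and failed against the same commit. The sketch below is only an illustration of that idea: the record shape (test name, commit SHA, pass/fail) and the function name are assumptions made for this example, not part of any particular tool.

```python
from collections import defaultdict


def find_flaky_candidates(runs):
    """Return names of tests that both passed and failed on the same commit.

    `runs` is an iterable of (test_name, commit_sha, passed) tuples.
    This record shape is hypothetical, chosen for the example.
    """
    outcomes = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        # Collect the distinct outcomes seen for this test at this commit.
        outcomes[(test_name, commit_sha)].add(passed)

    # Two distinct outcomes on unchanged code suggests a flake,
    # because the code under test did not change between runs.
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


history = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),  # same commit, different outcome
    ("test_cart", "abc123", False),
    ("test_cart", "def456", True),    # outcome changed, but so did the code
]
print(find_flaky_candidates(history))
```

A test that fails on one commit and passes on the next is not flagged here, since a code change could explain the difference.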

Most teams have policies for handling flaky tests. These often include:

  • Running tests multiple times until they pass
  • Testing them later in the pipeline
  • Moving the tests out into their own runs

Regardless of the option chosen, each of these steps slows down developers and reduces team velocity. By strengthening flaky test observability and tracking historical test results, your development team can feel confident about which failures to pursue and which are flakes.

Adopting Flaky Tests Insights Into Your DevOps Pipeline

We created Flaky Tests Insights to help development teams strengthen CI observability by quickly identifying flaky tests. Unobtrusive and easy to incorporate into your test suite, our tool lets your developers efficiently pinpoint the top flaky tests in a test suite so they can prioritize fixing the right ones.

Launchable assigns each test a flakiness score from 0 to 1, with higher scores indicating a higher likelihood of flakiness, to give your developers a clearer picture of which flaky tests to tackle first. By flagging flaky tests early, you can speed up test failure analysis.
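To give a feel for what a score between 0 and 1 can look like, the sketch below uses a simple "flip rate" heuristic: the fraction of consecutive runs whose outcome changed. This is an assumption chosen for illustration and is not Launchable's scoring method, which this article does not describe. The function names and the input shape are likewise hypothetical.

```python
def flip_rate(results):
    """Fraction of adjacent result pairs whose outcome differs (0.0 to 1.0).

    `results` is a list of booleans (True = passed), ordered by run time,
    ideally collected on unchanged code.
    """
    if len(results) < 2:
        return 0.0  # not enough history to observe a flip
    flips = sum(1 for a, b in zip(results, results[1:]) if a != b)
    return flips / (len(results) - 1)


def rank_by_flakiness(history):
    """Return (test_name, score) pairs, highest score first."""
    scores = {name: flip_rate(results) for name, results in history.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


history_by_test = {
    "test_checkout": [True, False, True, False],  # alternates on every run
    "test_search": [True, True, False],           # one change in two pairs
    "test_profile": [True, True, True],           # stable
}
for name, score in rank_by_flakiness(history_by_test):
    print(f"{name}: {score:.2f}")
```

Sorting by score gives a priority list: the tests at the top are the ones whose results are least explained by anything other than flakiness.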

By increasing the accuracy and repeatability of flaky test detection, teams waste less time chasing tests with lower flakiness scores and can prioritize the critical tests that show a higher likelihood of flakiness. They can also feel confident tracking the history of their test suite in a central report, gaining further visibility into historic flaky test results.

Quiet the Noise in Your Software Delivery Pipeline

Mute the noise caused by flaky tests within CI and strengthen your DevOps pipeline. Pairing Flaky Tests Insights with Predictive Test Selection makes the feedback loop even more effective. Your tests get better, your test selection gets better, your team receives faster feedback, and problems become easier to triage, which makes your developers happier.

If flakiness is one of your team’s biggest problems, sign up for our Flakiness Dashboard Beta. It takes less than 30 minutes, and you’ll be on your way to empowering your developers and increasing your velocity.

The State of the Art in Tackling Flaky Tests

Watch the on-demand webinar to learn how software teams around the world have tackled this problem, from “idol” companies like Google and GitHub to the “next door neighbor” company just like yours. You will see how to tackle flaky tests in your own team and how this work is part of a bigger emerging movement.
