What happens when an automated test doesn’t provide a reliable, consistent result? Not only does it hinder the entire development process, but it also costs money and time to figure out what went wrong. This is test flakiness.
If you haven’t encountered flaky tests yet, this article outlines their common causes and the ways to fix them.
What Is Test Flakiness?
Here’s a riddle for you: What kind of test passes and fails periodically? A bad one!
A flaky test is one that both passes and fails with no changes to the code. Simply put, the build fails only occasionally: sometimes the test passes and sometimes it fails, without any modification to the source code or execution environment.
Why You Should Deal With Test Flakiness
Today, the need for automated testing has increased because it saves time and money. And one of the crucial elements of an automated test is its determinism: if the code hasn’t changed, the results shouldn’t either.
Flaky tests, also known as nondeterministic tests, pass or fail without an apparent reason, even though they are run with the same configuration, without any changes to test data, source code, or environment. These tests produce unreliable, inconsistent results. You can’t be sure that your code is free of bugs after a successful test run. Nor can you easily judge whether it’s worth the time and money to delete the whole test and rewrite it from scratch, even though a rewrite might restore the accuracy of its results.
These nondeterministic tests are quite costly because they demand a lot of maintenance effort. Engineers often have to retrigger entire builds on CI and then wait for the new builds to complete, time that could have been spent productively.
In other cases, if a test keeps behaving inconsistently, the tester may stop running it altogether, losing critical test coverage, which defeats the purpose of automation.
Flaky tests not only slow your CI/CD pipeline but also erode the team’s confidence in the entire automation suite.
Determining The Cause Of Flaky Tests
The reasons for test flakiness can be numerous. Any issue with the code, problems with the tests, or some external factors compromising the test results can cause flakiness.
The tests themselves can introduce flakiness because of the way they were written. Writing a test means making certain assumptions about the state of the system or the test data. When those assumptions turn out to be invalid, the test becomes flaky.
At times, issues in newly written code also cause nondeterministic output. Source code that relies on nondeterministic inputs, such as the current date, random values, or remote services, generates flaky tests.
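As a minimal sketch of how such inputs can be brought under control, the Python example below passes the date and the random number generator in as parameters, so a test can pin them to known values. The functions `is_weekend` and `pick_discount` are hypothetical and exist only for illustration.

```python
import random
from datetime import date


def is_weekend(today=None):
    """Return True on Saturday or Sunday; `today` can be injected by a test."""
    if today is None:
        today = date.today()  # nondeterministic default, used in production
    return today.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend


def pick_discount(rng=None):
    """Pick a discount percentage; `rng` can be a seeded generator in tests."""
    if rng is None:
        rng = random.Random()  # unseeded, so results vary between runs
    return rng.choice([5, 10, 15])


def test_is_weekend_is_deterministic():
    # Fixed dates instead of "whatever day the build happens to run on".
    assert is_weekend(date(2024, 1, 6))      # a Saturday
    assert not is_weekend(date(2024, 1, 8))  # a Monday


def test_pick_discount_is_repeatable():
    # Two generators with the same seed produce the same choice.
    assert pick_discount(random.Random(42)) == pick_discount(random.Random(42))


test_is_weekend_is_deterministic()
test_pick_discount_is_repeatable()
```

Remote services can be handled in the same spirit, by replacing the real call with a stub that returns a canned response during the test.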
Another primary source of nondeterministic tests is the difference in the environment, such as operating system variations, environment variables, libraries, number of CPUs, or networks. Even minor version changes in a library can introduce unexpected behavior or bugs.
Flakiness is also a consequence of order dependency: a test passes or fails depending on the order in which the tests run, typically because it relies on state that another test set up or left behind.
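One common way to remove order dependency is to give every test its own fresh state. The sketch below uses Python’s standard `unittest` module; the `Cart` class and the `run_in_order` helper are hypothetical names chosen for this example.

```python
import unittest


class Cart:
    """Tiny stand-in for application state that tests might share."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)


class CartTests(unittest.TestCase):
    def setUp(self):
        # A new cart is built before each test, so no test can see
        # items that another test left behind.
        self.cart = Cart()

    def test_add_one(self):
        self.cart.add("book")
        self.assertEqual(len(self.cart.items), 1)

    def test_starts_empty(self):
        self.assertEqual(self.cart.items, [])


def run_in_order(names):
    """Run the named tests in the given order and report overall success."""
    suite = unittest.TestSuite(CartTests(name) for name in names)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


# Both orders succeed because the tests share nothing.
forward_ok = run_in_order(["test_add_one", "test_starts_empty"])
reverse_ok = run_in_order(["test_starts_empty", "test_add_one"])
```

If the cart were a single module-level object instead, `test_starts_empty` would pass only when it happened to run first.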
Unfortunately, flaky tests are not rare. They can bring entire production to a temporary halt. However, they can be dealt with and resolved.
Fixing Test Flakiness
Although it’s difficult to completely eliminate flaky tests in automated testing, there are ways to minimize the number.
The first step to getting them under control is identifying flaky tests as soon as possible. Once you figure out the scope of the problem, you can prioritize fixing them, using criteria such as the level of business risk, test timings, or the amount of effort required. Just do not procrastinate! Fixing every flaky test right away may sound easy, especially for small builds, but it isn’t always practical on a large scale. Still, it’s the first and maybe the only safe option if the test covers a critical user path.
Another common strategy is to keep rerunning the test until it passes. This requires no debugging, but it is lazy. It hides the symptoms of the problem instead of solving it, and it slows down your test suite even more, which makes the approach impractical in the long run.
Another remedy is to isolate the tests that produce inconsistent results from your reliable ones and quarantine them in a separate test run group. This can be a feasible option if release speed is more important than the inconsistencies the test might uncover, and it even restores confidence in the rest of your test suite. However, it’s important that you make a plan to fix the quarantined test in the future.
If nothing else works, you may even choose to delete the flaky test so that it doesn’t disturb your test suite anymore. Sure, it does save you money and time since you won’t be fixing or debugging the test. But it comes at the expense of losing a bit of test coverage, and with it the bugs that test might have caught. Remember, you wrote the test for a reason!
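One simple way to identify a flaky test is to run it several times against unchanged code and see whether the outcomes agree. The sketch below is one possible approach, not a standard tool: `classify` and the simulated tests are hypothetical names, and the repetition is used only to detect flakiness, not to mask it.

```python
import itertools


def classify(test_fn, runs=20):
    """Run `test_fn` repeatedly and label it by the outcomes observed."""
    outcomes = set()
    for _ in range(runs):
        try:
            test_fn()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    if outcomes == {"pass"}:
        return "stable-pass"
    if outcomes == {"fail"}:
        return "stable-fail"
    return "flaky"  # both passes and failures were seen with no code change


def simulated_stable_test():
    pass  # never raises, so it always passes


def simulated_broken_test():
    raise AssertionError("fails on every run")


_calls = itertools.count()


def simulated_flaky_test():
    # Stand-in for real nondeterminism: fails on every third call.
    if next(_calls) % 3 == 2:
        raise AssertionError("intermittent failure")


labels = {
    "stable": classify(simulated_stable_test),
    "broken": classify(simulated_broken_test),
    "flaky": classify(simulated_flaky_test),
}
```

A test labelled `stable-fail` points to a permanently broken test or a real bug, while `flaky` marks a candidate for the prioritization described above.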
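Quarantining can be sketched as tagging the unreliable tests and splitting the suite into two run groups. The decorator `quarantined`, the helper `split_tests`, and the test names below are hypothetical; most test frameworks offer their own marking or grouping features that serve the same purpose.

```python
QUARANTINE_ATTR = "_quarantine_reason"


def quarantined(reason):
    """Tag a test function as quarantined and record why."""
    def mark(test_fn):
        setattr(test_fn, QUARANTINE_ATTR, reason)
        return test_fn
    return mark


def split_tests(tests):
    """Separate tests into (reliable, quarantined) run groups."""
    reliable = [t for t in tests if not hasattr(t, QUARANTINE_ATTR)]
    isolated = [t for t in tests if hasattr(t, QUARANTINE_ATTR)]
    return reliable, isolated


def test_login():
    assert True


@quarantined("intermittent timeout; fix planned")
def test_checkout():
    assert True


reliable, isolated = split_tests([test_login, test_checkout])

# The reliable group gates the release; the quarantined group still runs,
# but its failures are tracked separately instead of blocking the build.
for test in reliable:
    test()
```

Storing the reason next to the tag keeps the plan to fix the test visible to whoever reads the suite.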
With platforms like Testsigma, you can quickly determine if a test is flaky or permanently broken or if the software has a bug. It’s a unified, fully customizable automated testing tool with an intelligent AI that automatically fixes broken scripts, heals dynamically changing elements, and suggests fixes for test failures. You can execute tests in your local browser/device or run across 800+ browsers and 2000+ devices on Testsigma’s cloud-hosted test lab.
Regardless of how you handle flaky tests, it is a must to keep track of which tests produce unreliable results, how you dealt with each failed test result, and the reason for the failure. Document each flaky test and the steps you take to mitigate nondeterminism. It will help you and your team maintain faith in the test suite and develop best practices to prevent flakiness. Additionally, it allows you to notice recurring patterns that could potentially be resolved.
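Such a record does not need special tooling. As a rough sketch, the Python example below keeps one entry per flaky test and counts the causes that come up more than once. The `FlakyTestRecord` fields and the sample entries are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FlakyTestRecord:
    test_name: str  # which test produced unreliable results
    cause: str      # the reason for the failure, once known
    action: str     # how the team dealt with it


def recurring_causes(records, min_count=2):
    """Return the causes that appear at least `min_count` times."""
    counts = Counter(record.cause for record in records)
    return {cause: n for cause, n in counts.items() if n >= min_count}


# Invented sample entries, not real data.
log = [
    FlakyTestRecord("test_checkout", "remote service timeout", "quarantined"),
    FlakyTestRecord("test_report_date", "depends on current date", "fixed"),
    FlakyTestRecord("test_search", "remote service timeout", "fixed"),
]

patterns = recurring_causes(log)
```

Here `patterns` contains only the remote service timeout, which suggests that one shared fix might resolve several flaky tests at once.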
Conclusion
As mentioned before, test flakiness is a common problem across engineering organizations, big or small, and sadly, flaky tests will never disappear completely. As the wide variety of causes shows, keeping flakiness low in automated testing can be pretty challenging. However, with deliberate attention, flaky tests can be cut to a minimum, which will improve the overall health of your automated test suite.