Thursday, Aug 29, 2024

Why Playwright is less flaky than Selenium

Yes, this is a link post to my own post on switching Rails system tests from Selenium to Playwright, which is newer, faster, and by all accounts I've ever heard from anyone who's used both of them: better.

Since posting this, I have heard several complaints from skeptics, all along the same lines of: how could Playwright possibly be less flaky than Selenium? After all, the tests are written with the same Capybara API. And, being the default, Capybara's Selenium adapter has had many more years of bug fixes and hardening. To these people it simply does not make intuitive sense why Selenium tests would fail erratically more often than Playwright.

Here's my best answer: Playwright is so fast that it forces you to write UI tests correctly on day one. Selenium isn't.

Because UI tests that automate a browser do so asynchronously through inter-process communication, the most common way for tests to be susceptible to non-deterministic failures is when that communication is meaningfully slower than the browser itself under certain circumstances and faster in others.

Two of the most common examples follow. (I use the word "page" below very loosely, as it could apply to any visible content that is shown or hidden in response to user action.)

A script that finds a selector that exists both before and after navigation:

Be on Page A with an element matching some selector .foo
Click a button to go to Page B, which also contains a .foo element
Find .foo
Your test is now in a race condition. Either:
- Your test will search for .foo before Page B loads, causing it to fail
- Page B will load before your test searches for .foo, and continue successfully

A script that fails to properly wait:

Be on Page A
Click a button to go to Page B
Find something on the page without appropriately waiting for it to appear (the bulk of Capybara's API, as with many UI testing frameworks, is delineated between "waiting" vs. "non-waiting" search methods)
Your test is now in the same sort of race condition. Either:
- The non-waiting search will run before Page B loads, causing it to fail
- Page B will load before your non-waiting search, and continue successfully

Counter-intuitively, the faster your browser automation tool is, the more often the test will fail during race conditions like those above and those failures are a good thing.

If you select something that exists on both pages or without properly waiting, Playwright will almost always be faster than your app and you'll simply never see an improperly-written test pass, either in development or in CI. Instead, you'll be forced to write the test correctly the first time.

Selenium, meanwhile, is so slow under so many conditions and in so many environments that it's not uncommon at all to write a passing test full of race conditions like those above, have the test pass every single time you run it locally, but then see it fail some of the time in CI. Worse, as time goes on your code will become more complex and both your app and your tests will become slower in their own ways. This can lead to apps that had previously been fast enough to always pass in spite of any race conditions to begin failing with alarming frequency years later.

And of course, when that happens, you're in a real pickle. Erratic failures are inherently hard to reproduce. And if a test has been passing for years and is suddenly failing, you aren't likely to remember what you were thinking when you wrote it—meaning that if you can't reliably reproduce the failure, you're unlikely to be able to fix any such race conditions by just looking at the code.

Anyway, that's why.

Switching an existing test suite from Selenium to Playwright won't magically fix all the flaky tests that Selenium let you write by virtue of its being slower than your app. In fact, the first time you run Playwright, you're likely to see dozens of "new" errors that have in fact been hiding in your tests like land mines all along. And that's a good thing! Fix them! 🥒

justin.searls.co

Why Playwright is less flaky than Selenium

Got a taste for hot, fresh takes?