justin․searls․co

Yes, this is a link post to my own post on switching Rails system tests from Selenium to Playwright, which is newer, faster, and by all accounts I've ever heard from anyone who's used both of them: better.

Since posting this, I have heard several complaints from skeptics, all along the same lines: how could Playwright possibly be less flaky than Selenium? After all, the tests are written with the same Capybara API. And, being the default, Capybara's Selenium adapter has had many more years of bug fixes and hardening. To these people, it simply does not make intuitive sense that Selenium tests would fail erratically more often than Playwright tests.

Here's my best answer: Playwright is so fast that it forces you to write UI tests correctly on day one. Selenium isn't.

UI tests automate a browser asynchronously, through inter-process communication. The most common source of non-deterministic failures is that this communication is meaningfully slower than the browser itself under some circumstances and faster under others.

Two of the most common examples follow. (I use the word "page" below very loosely, as it could apply to any visible content that is shown or hidden in response to user action.)

A script that finds a selector that exists both before and after navigation:

  1. Be on Page A with an element matching some selector .foo
  2. Click a button to go to Page B, which also contains a .foo element
  3. Find .foo
  4. Your test is now in a race condition. Either:
    • Your test will search for .foo before Page B loads and find Page A's .foo instead, causing it to fail
    • Page B will load before your test searches for .foo, and the test will continue successfully
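
To make that first race concrete, here's a toy, deterministic model of it in plain Ruby. To be clear about assumptions: none of this is Capybara's or Playwright's real API. FakeBrowser, click_go, find_waiting, driver_overhead, and the .page-b selector are all hypothetical names invented for this sketch, and time is a simulated counter rather than a real clock.

```ruby
# A toy browser whose navigation takes a fixed amount of simulated time.
# Every name here is hypothetical; this is not Capybara's API.
class FakeBrowser
  PAGES = {
    a: { ".foo" => "foo on Page A" },
    b: { ".foo" => "foo on Page B", ".page-b" => "Page B heading" }
  }.freeze

  def initialize(navigation_ms:)
    @navigation_ms = navigation_ms
    @now_ms = 0
    @current = :a
    @pending = nil # [target_page, ready_at_ms] while a navigation is in flight
  end

  # Clicking starts a navigation that completes navigation_ms later.
  def click_go
    @pending = [:b, @now_ms + @navigation_ms]
  end

  # Simulated time the automation tool spends before its next command.
  def driver_overhead(ms)
    @now_ms += ms
  end

  # Waiting search: polls until the selector matches or the timeout elapses.
  def find_waiting(selector, timeout_ms: 2000, poll_ms: 10)
    deadline = @now_ms + timeout_ms
    loop do
      found = find_now(selector)
      return found if found
      return nil if @now_ms >= deadline
      @now_ms += poll_ms
    end
  end

  private

  # Looks only at whatever page is rendered at this instant.
  def find_now(selector)
    if @pending && @now_ms >= @pending[1]
      @current = @pending[0]
      @pending = nil
    end
    PAGES.fetch(@current)[selector]
  end
end

# Racy: .foo exists on both pages, so even a waiting search returns at once.
def racy_test(driver_overhead_ms:, navigation_ms:)
  browser = FakeBrowser.new(navigation_ms: navigation_ms)
  browser.click_go
  browser.driver_overhead(driver_overhead_ms)
  browser.find_waiting(".foo")
end

# Safer: first wait for something that only Page B has, then look for .foo.
def safe_test(driver_overhead_ms:, navigation_ms:)
  browser = FakeBrowser.new(navigation_ms: navigation_ms)
  browser.click_go
  browser.driver_overhead(driver_overhead_ms)
  browser.find_waiting(".page-b")
  browser.find_waiting(".foo")
end
```

In this model, a driver with 5ms of overhead against a 50ms navigation gets Page A's stale .foo from racy_test, while a driver with 200ms of overhead happens to get Page B's. safe_test gets Page B's element either way, because it waits on something that can't match until the navigation has finished.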

A script that fails to properly wait:

  1. Be on Page A
  2. Click a button to go to Page B
  3. Find something on the page without appropriately waiting for it to appear (the bulk of Capybara's API, as with many UI testing frameworks, is divided into "waiting" and "non-waiting" search methods)
  4. Your test is now in the same sort of race condition. Either:
    • The non-waiting search will run before Page B loads, causing the test to fail
    • Page B will load before your non-waiting search, and the test will continue successfully
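
The waiting vs. non-waiting distinction can be sketched the same way. This is a minimal model under stated assumptions, not how Capybara implements anything: SlowPage, present_now?, present_within?, and the #welcome selector are hypothetical names, and time is a simulated counter so the outcome is deterministic.

```ruby
# A toy page whose content only appears once simulated time passes its load time.
# All names are hypothetical and exist only for this sketch.
class SlowPage
  attr_reader :now_ms

  def initialize(loads_at_ms:)
    @loads_at_ms = loads_at_ms
    @now_ms = 0
  end

  # Moves the simulated clock forward.
  def advance(ms)
    @now_ms += ms
  end

  # Selectors present on the page at the current simulated time.
  def content
    @now_ms >= @loads_at_ms ? ["#welcome"] : []
  end
end

# Non-waiting search: one look at the page, no retry.
def present_now?(page, selector)
  page.content.include?(selector)
end

# Waiting search: re-checks on an interval until the timeout elapses.
def present_within?(page, selector, timeout_ms: 2000, poll_ms: 50)
  deadline = page.now_ms + timeout_ms
  until page.content.include?(selector)
    return false if page.now_ms >= deadline
    page.advance(poll_ms)
  end
  true
end
```

With a page that loads at 300ms, a driver that asks after 10ms gets false from present_now?, while one that asks after 500ms gets true: same test, different result, decided entirely by timing. present_within? returns true in both cases, and only gives up if the content never shows before the timeout.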

Counter-intuitively, the faster your browser automation tool is, the more often tests will fail during race conditions like those above, and those failures are a good thing.

If you select something that exists on both pages, or search without properly waiting, Playwright will almost always be faster than your app and you'll simply never see an improperly-written test pass, either in development or in CI. Instead, you'll be forced to write the test correctly the first time.

Selenium, meanwhile, is so slow under so many conditions and in so many environments that it's not uncommon at all to write a passing test full of race conditions like those above, have the test pass every single time you run it locally, but then see it fail some of the time in CI. Worse, as time goes on your code will become more complex and both your app and your tests will become slower in their own ways. This can cause tests that had always passed, because the app was previously fast enough to win any race, to begin failing with alarming frequency years later.

And of course, when that happens, you're in a real pickle. Erratic failures are inherently hard to reproduce. And if a test has been passing for years and is suddenly failing, you aren't likely to remember what you were thinking when you wrote it—meaning that if you can't reliably reproduce the failure, you're unlikely to be able to fix any such race conditions by just looking at the code.

Anyway, that's why.

Switching an existing test suite from Selenium to Playwright won't magically fix all the flaky tests that Selenium let you write by virtue of its being slower than your app. In fact, the first time you run Playwright, you're likely to see dozens of "new" errors that have in fact been hiding in your tests like land mines all along. And that's a good thing! Fix them! 🥒

Writing Becky a feature for managing frequently asked questions but halfway through my editor crashed and I lost all my work so now I'm fresh out of faqs

When developers have to keep thawing and fixing code during a code freeze, it can cause code freezer burn, which can be REALLY costly to fix

In theory, I really like using generated columns in Postgres (which Rails supports with t.virtual migrations), but in practice it creates too many situations where persisted Rails models lie to you because generated columns are only updated if you think to call reload

v19 - Feature Complete

Breaking Change

It is me, I am back! And after working too hard and becoming too dull I'm finally done* with that big app I've been working on.

I'm running behind on e-mails and need your best and worst takes for the next episode: podcast@searls.co.

Find out what that asterisk refers to below:

*I am, in fact, not at all done.

I got so frustrated getting the runaround from my ISP that I filed an FCC complaint and… holy shit.

  1. Complaint triggers mandatory contact from ISP within a few days
  2. Got a call from Spectrum corporate, super kind and helpful/understanding guy
  3. He delegates to a research team to look into the issue and have answers within 24 hours

Government in action.

TIL that inline JavaScript handlers actually define an event variable for you.

This actually works:

〈a onclick="event.preventDefault(); this.closest('dialog').close()"〉

Maybe everyone else knew this, but it took me over 25 years apparently.

One thing I really dig about the AI/LLM boom is that it's the first time since the 90s that otherwise "non-technical" people in my life have had any reason at all to get excited about computer-ass computing.

Seeing friends and family get interested in how ChatGPT works while improving at how they use it is way more gratifying than the last 15 years of the industry merely training the world how to doomscroll.

A decoupled approach to relaying events between Stimulus controllers

Part of the allure of Stimulus is that you can attach rich, dynamic behavior to the DOM without building out a long-lived stateful application in the browser.

The pitch is that each controller is an island unto itself, with each adding a particular kind of behavior (e.g. a controller for copying to clipboard, another for displaying upload status, another for drag-and-drop reordering), configured entirely via data attributes. This works really well when user behavior directly initiates all of the behaviors a Stimulus controller needs to implement.

This works markedly less well when a controller's behavior needs to be triggered by another controller.

You'll never guess what happens next…