Tuesday, Jun 18, 2024

Running Rails System Tests with Playwright instead of Selenium

Last week, when David declared that system tests have failed, my main reaction was: "well, yeah." UI tests are brittle, and if you write more than a handful, the cost to maintain them can quickly eclipse any value they bring in terms of confidence your app is working.

But then I had a second reaction, "come to think of it, I wrote a smoke test of a complex UI that relies heavily on Turbo and it seems to fail all the damn time." Turbo's wholesale replacement of large sections of the DOM seemed to be causing numerous timing issues in my system tests, wherein elements would frequently become stale as soon as Capybara (under Selenium) could find them.

Finally, I had a third reaction, "I've been sick of Selenium's bullshit for over 14 years. I wonder if I can dump it without rewriting my tests?" So I went looking for a Capybara adapter for the seemingly much-more-solid Playwright.

And—as you might be able to guess by the fact I bothered to write a blog post—I found one such adapter! And it works! And things are better now!

So here's my full guide on how to swap Selenium for Playwright in your Rails system tests:

Step 1: Install

Get rid of the selenium-webdriver gem and add capybara-playwright-driver in its place:

group :test do
  gem "capybara"
  gem "capybara-playwright-driver"
end

This will also pull in playwright-ruby-client, which is compatible with a specific version of the playwright package, so you need to be sure that you install the correct version. Fortunately the Ruby client ships with a Playwright::COMPATIBLE_PLAYWRIGHT_VERSION constant that will tell you what that version is. Additionally, the playwright install command will download its own browsers (chromium, webkit, and firefox by default) to a platform-dependent cache directory. If you leave everything in its default place, the Ruby client should automatically find them.

To automate this, I threw these shell commands in my project's script/setup script right after bundle install and yarn install, so anyone setting up the project can install or update Playwright as needed:

export PLAYWRIGHT_CLI_VERSION=$(bundle exec ruby -e 'require "playwright"; puts Playwright::COMPATIBLE_PLAYWRIGHT_VERSION.strip')
yarn add -D "playwright@$PLAYWRIGHT_CLI_VERSION"
yarn run playwright install

It's safe to repeatedly run the above commands, and they should take less than half a second if playwright is up-to-date and its browsers are cached.
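
If you'd rather fail fast when package.json and the Ruby client drift apart, a check along these lines could run in script/setup or CI. This is only a rough sketch: the helper name playwright_version_mismatch and the idea of reading the entry from devDependencies are assumptions for illustration, and it compares the version recorded in package.json rather than whatever is actually sitting in node_modules. The only thing it takes from the Ruby client is the constant mentioned above.

```ruby
require "json"

# Hypothetical helper: returns nil when package.json records the expected
# Playwright version, or a message describing the problem otherwise.
def playwright_version_mismatch(package_json_path, expected_version)
  package = JSON.parse(File.read(package_json_path))
  recorded = package.fetch("devDependencies", {})["playwright"]
  return "playwright is missing from devDependencies" if recorded.nil?

  # The entry may carry a range operator such as ^ or ~, so drop it
  # before comparing.
  version = recorded.sub(/\A[\^~]/, "")
  return nil if version == expected_version

  "package.json has playwright #{version}, Ruby client expects #{expected_version}"
end

# Usage, assuming playwright-ruby-client is in the bundle:
#   require "playwright"
#   expected = Playwright::COMPATIBLE_PLAYWRIGHT_VERSION.strip
#   message = playwright_version_mismatch("package.json", expected)
#   abort(message) if message
```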

Step 2: Test setup

As you might expect, configuring Playwright starts with ripping out anything having to do with selenium, probably located in test/application_system_test_case.rb:

require "test_helper"

class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  # delete this, and anything else you've added about Selenium:
  driven_by :selenium, using: :chrome, screen_size: [1400, 1400]
end

You can probably(?) just swap that for driven_by :playwright if you want, but I like to be able to control which browser I'm running and whether it's headless using ENV flags, so now my test/application_system_test_case.rb looks like this:

require "test_helper"

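Capybara.register_driver :my_playwright do |app|
  Capybara::Playwright::Driver.new(app,
    # PLAYWRIGHT_BROWSER may be "chromium", "webkit", or "firefox"
    browser_type: ENV["PLAYWRIGHT_BROWSER"]&.to_sym || :chromium,
    # false (show the browser) by default; nil when CI or PLAYWRIGHT_HEADLESS
    # is set, which falls back to running headlessly
    headless: (false unless ENV["CI"] || ENV["PLAYWRIGHT_HEADLESS"]))
end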

class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  driven_by :my_playwright
end

As you can see, this will run with Chromium and a UI by default. Setting PLAYWRIGHT_BROWSER to "webkit" or "firefox" will change which browser is launched. And setting a CI env var (as most CI services do) or PLAYWRIGHT_HEADLESS will configure the driver to run headlessly.
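
If the one-liner for headless feels too clever, the same flags can be pulled into a small method that's easy to poke at on its own. Treat this as a sketch under a couple of assumptions: playwright_driver_options is a made-up name, and it passes an explicit true or false for headless instead of the false-or-nil value used above.

```ruby
# Hypothetical helper: turns the ENV flags described above into driver
# options. It takes the environment as an argument so it can be exercised
# with a plain Hash instead of the real ENV.
def playwright_driver_options(env = ENV)
  {
    browser_type: (env["PLAYWRIGHT_BROWSER"] || "chromium").to_sym,
    # Explicit boolean: headless whenever CI or PLAYWRIGHT_HEADLESS is set.
    headless: !!(env["CI"] || env["PLAYWRIGHT_HEADLESS"])
  }
end

# Only register the driver when capybara-playwright-driver is loaded.
if defined?(Capybara::Playwright::Driver)
  Capybara.register_driver :my_playwright do |app|
    Capybara::Playwright::Driver.new(app, **playwright_driver_options)
  end
end
```

With that in place, something like PLAYWRIGHT_BROWSER=firefox PLAYWRIGHT_HEADLESS=1 bin/rails test:system should launch headless Firefox, while a bare bin/rails test:system keeps the visible Chromium default.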

Step 3: Set up CI

I use GitHub Actions for CI, and I wanted to make sure:

Playwright would install correctly
It only installed the browsers the build used (just Chromium, in my case)
It cached those browsers between runs

To accomplish this, I added these three steps immediately after yarn install in my workflow YAML:

- name: Cache Playwright Chromium browser
  id: playwright-cache
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    key: playwright-browsers-${{ runner.os }}-${{ hashFiles('yarn.lock') }}

- name: Install Playwright Chromium browser (with deps)
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: yarn run playwright install --with-deps chromium

- name: Install Playwright Chromium browser deps
  if: steps.playwright-cache.outputs.cache-hit == 'true'
  run: yarn run playwright install-deps chromium

As you might be able to suss out, if there is a cache hit, we're spared the Chromium install but still need to run the playwright install-deps command, which will make sure whatever supporting tools Playwright requires are available.

All told, this setup results in an additional ~20 seconds of setup overhead each time the action is run. Not fantastic, but I can live with it.

Step 4: Fix your tests

It's unlikely all your tests will magically work under Playwright, but I was genuinely impressed by how few issues I ran into.

My issues were all minor:

Text nodes of XML elements are returned with empty newlines intact under Selenium, but empty newlines are stripped under Playwright. I had to adjust one assertion of an Atom feed as a result.
The Selenium driver will let you call accept_confirm without a block and get a string back, but Capybara's API specifies (and Playwright expects) that whatever action leads to a confirm prompt being shown must be invoked inside a block passed to accept_confirm.
The Playwright Capybara adapter rescues a number of non-fatal errors for you, but it also prints the messages of those errors with puts, even if they're out of your control to fix, so I had to slap together an unfortunate helper to selectively squelch them and keep my console output clean.
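
For a sense of what a squelching helper could look like, here's a minimal sketch. This isn't the actual helper from my project: the name squelch_output and the patterns you'd pass in are hypothetical, and it assumes the adapter prints through Kernel#puts (and therefore through $stdout). It buffers everything printed during the block and then replays only the lines that don't match, so output shows up when the block finishes rather than as it happens.

```ruby
require "stringio"

# Hypothetical helper: runs the block with $stdout captured, then re-prints
# every captured line that matches none of the given patterns.
# Returns whatever the block returns.
def squelch_output(patterns)
  original = $stdout
  buffer = StringIO.new
  $stdout = buffer
  yield
ensure
  # Always restore the real stdout, even if the block raised.
  $stdout = original
  buffer.string.each_line do |line|
    original.print(line) unless patterns.any? { |pattern| pattern.match?(line) }
  end
end

# Usage inside a test (the pattern here is a placeholder):
#   squelch_output([/some message you cannot fix/]) do
#     visit root_path
#   end
```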

If you run into more issues, the adapter's docs were really helpful to understanding its capabilities and limitations.
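
As for the newline difference in the first issue above, one way to keep an assertion passing under either driver is to normalize the text before comparing it. A tiny sketch, with without_blank_lines being a hypothetical helper name rather than anything Capybara or the adapter provides:

```ruby
# Hypothetical helper: strips each line and drops the blank ones, so text
# compares equal whether or not the driver preserved empty newlines.
def without_blank_lines(text)
  text.each_line.map(&:strip).reject(&:empty?).join("\n")
end

# Usage in an assertion, where element is any Capybara node:
#   assert_equal "Title\nBody", without_blank_lines(element.text)
```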

Step 5: Realize that your tests are a lot more stable

I immediately noticed a dramatic improvement in test stability. Overall runtime was roughly the same, but I went from a 30% failure rate under Selenium when running my entire suite to less than 5% under Playwright.

Better yet, I found that Playwright was failing more predictably at the same two or three call sites, which made it much easier to reproduce and debug. Within about an hour of tweaking the offending tests, I'd solved each issue and was able to run the suite 200 times in a row without a single failure.
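
If you want to put a number on your own suite's flakiness, a loop like this is one way to do it. It's a sketch, not the script I ran: count_failed_runs is a made-up name, and the commented usage assumes your system tests run via bin/rails test:system.

```ruby
# Hypothetical helper: calls the block `runs` times, passing the attempt
# number, and returns how many attempts failed. The block should return
# true on success; false or nil counts as a failure.
def count_failed_runs(runs)
  (1..runs).count { |attempt| !yield(attempt) }
end

# Usage (system returns true only when the command exits with status 0):
#   runs = 200
#   failures = count_failed_runs(runs) do
#     system("bin/rails", "test:system", out: File::NULL)
#   end
#   puts "#{failures} of #{runs} runs failed"
```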

The fact that things are more "stable feeling" is a bigger deal than it might sound like. Performance under Selenium was just erratic enough that I never made any real headway in my attempts to fix the same underlying issues over the past two months, but I managed to resolve them all in a single afternoon with Playwright. I've gone from a wholly unreliable, flaky build to one that I can reasonably rely on to tell me if my app is broken.

And that's it, I guess?

I'm still a Playwright noob, so I'm sure I'll run into other issues down the road, but I have to hand it to Capybara for successfully abstracting their driver API such that third parties can implement working adapters and to Yusuke Iwaki for building the Playwright Ruby client and Capybara driver. This was a rare pleasant experience with open source, where everything more or less Just Worked the first time I tried it.

Anyway, while it's probably not enough to overcome DHH's blanket announcement that system tests were a failure, you may find that switching from Selenium to Playwright results in your tests themselves failing less often.