justin․searls․co

A lot of content around here boils down to links to someplace else, for which all I have to add is a brief call-out or commentary. As such, the headlines for each of these posts link to the original source article. (If you want a permalink to my commentary, try clicking the salt shaker.)


Yes, this is a link post to my own post on switching Rails system tests from Selenium to Playwright, which is newer, faster, and by all accounts I've ever heard from anyone who's used both of them: better.

Since posting this, I have heard several complaints from skeptics, all along the same lines of: how could Playwright possibly be less flaky than Selenium? After all, the tests are written with the same Capybara API. And, being the default, Capybara's Selenium adapter has had many more years of bug fixes and hardening. To these people it simply does not make intuitive sense why Selenium tests would fail erratically more often than Playwright.

Here's my best answer: Playwright is so fast that it forces you to write UI tests correctly on day one. Selenium isn't.

Because UI tests that automate a browser do so asynchronously through inter-process communication, the most common way for tests to be susceptible to non-deterministic failures is when that communication is meaningfully slower than the browser itself under certain circumstances and faster in others.

Two of the most common examples follow. (I use the word "page" below very loosely, as it could apply to any visible content that is shown or hidden in response to user action.)

A script that finds a selector that exists both before and after navigation:

  1. Be on Page A with an element matching some selector .foo
  2. Click a button to go to Page B, which also contains a .foo element
  3. Find .foo
  4. Your test is now in a race condition. Either:
    • Your test will search for .foo before Page B loads, causing it to fail
    • Page B will load before your test searches for .foo, and continue successfully
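To make that race concrete, here is a minimal sketch in plain Ruby. It does not use Capybara, Selenium, or Playwright at all: `FakeBrowser`, its simulated clock, the `.page-a-heading`/`.page-b-heading` selectors, and the millisecond figures are all hypothetical stand-ins invented for illustration.

```ruby
# Toy model of a browser: clicking starts a navigation, but the new page
# only becomes current once the simulated clock passes its load time.
class FakeBrowser
  PAGES = {
    page_a: [".foo", ".page-a-heading"],
    page_b: [".foo", ".page-b-heading"]
  }.freeze

  attr_reader :current_page

  def initialize(load_time_ms:)
    @load_time_ms = load_time_ms
    @now = 0
    @current_page = :page_a
    @pending = nil
  end

  # Begins navigating; the target page is not visible until it has loaded.
  def click_link_to(page)
    @pending = { page: page, ready_at: @now + @load_time_ms }
  end

  # Advances the simulated clock (stands in for driver latency or polling).
  def advance(ms)
    @now += ms
    if @pending && @now >= @pending[:ready_at]
      @current_page = @pending[:page]
      @pending = nil
    end
  end

  def has_selector?(selector)
    PAGES.fetch(@current_page).include?(selector)
  end
end

# Racy version: searches for .foo as soon as the next command arrives.
# Returns the page on which .foo was actually matched.
def find_foo_after_click(driver_latency_ms:, load_time_ms:)
  browser = FakeBrowser.new(load_time_ms: load_time_ms)
  browser.click_link_to(:page_b)
  browser.advance(driver_latency_ms)
  browser.has_selector?(".foo") ? browser.current_page : nil
end

# Safer version: wait for something only Page B has, then look for .foo.
def find_foo_after_waiting(driver_latency_ms:, load_time_ms:, timeout_ms: 2000)
  browser = FakeBrowser.new(load_time_ms: load_time_ms)
  browser.click_link_to(:page_b)
  browser.advance(driver_latency_ms)
  waited = 0
  until browser.has_selector?(".page-b-heading") || waited >= timeout_ms
    browser.advance(10) # poll interval on the simulated clock
    waited += 10
  end
  browser.has_selector?(".foo") ? browser.current_page : nil
end

puts find_foo_after_click(driver_latency_ms: 5, load_time_ms: 50)   # page_a
puts find_foo_after_click(driver_latency_ms: 80, load_time_ms: 50)  # page_b
puts find_foo_after_waiting(driver_latency_ms: 5, load_time_ms: 50) # page_b
```

In a real Capybara test, the equivalent of the safer version would be something like asserting on a Page B-only element with a waiting assertion such as `assert_selector` before touching `.foo`, though which element to wait on depends entirely on your app.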

A script that fails to properly wait:

  1. Be on Page A
  2. Click a button to go to Page B
  3. Find something on the page without appropriately waiting for it to appear (the bulk of Capybara's API, as with many UI testing frameworks, is delineated between "waiting" vs. "non-waiting" search methods)
  4. Your test is now in the same sort of race condition. Either:
    • The non-waiting search will run before Page B loads, causing it to fail
    • Page B will load before your non-waiting search, and continue successfully
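The difference between a "waiting" and a "non-waiting" search can be sketched in a few lines of plain Ruby. This is a toy model, not Capybara's implementation: `SlowPage`, `find_now`, `find_waiting`, the `.page-b-heading` selector, and the timings are hypothetical names and numbers made up for illustration, and a simulated clock stands in for real sleeping.

```ruby
# Toy page whose content only appears after a simulated load time.
class SlowPage
  def initialize(ready_at_ms:, selectors:)
    @ready_at_ms = ready_at_ms
    @selectors = selectors
    @now_ms = 0
  end

  # Moves the simulated clock forward.
  def tick(ms)
    @now_ms += ms
  end

  def present?(selector)
    @now_ms >= @ready_at_ms && @selectors.include?(selector)
  end
end

# Non-waiting search: asks once and reports whatever is true right now.
def find_now(page, selector)
  page.present?(selector)
end

# Waiting search: polls until the selector appears or the timeout elapses.
def find_waiting(page, selector, timeout_ms: 2000, interval_ms: 10)
  waited = 0
  loop do
    return true if page.present?(selector)
    return false if waited >= timeout_ms

    page.tick(interval_ms) # stands in for sleeping between polls
    waited += interval_ms
  end
end

page = SlowPage.new(ready_at_ms: 50, selectors: [".page-b-heading"])
puts find_now(page, ".page-b-heading")     # false: checked before the page loaded
puts find_waiting(page, ".page-b-heading") # true: polled until the page was ready
```

The non-waiting search is only correct if it happens to run after the page is ready, which is exactly the coin flip described above.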

Counter-intuitively, the faster your browser automation tool is, the more often the test will fail during race conditions like those above, and those failures are a good thing.

If you select something that exists on both pages or search without properly waiting, Playwright will almost always be faster than your app and you'll simply never see an improperly-written test pass, either in development or in CI. Instead, you'll be forced to write the test correctly the first time.

Selenium, meanwhile, is so slow under so many conditions and in so many environments that it's not uncommon at all to write a passing test full of race conditions like those above, have the test pass every single time you run it locally, but then see it fail some of the time in CI. Worse, as time goes on your code will become more complex and both your app and your tests will become slower in their own ways. This can cause apps that had previously been fast enough to always pass in spite of any race conditions to begin failing with alarming frequency years later.
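Here's a back-of-the-envelope sketch of that argument in plain Ruby. It assumes a deliberately crude model in which a racy test passes only when the automation tool's next command arrives after the page has finished loading. Every number here is hypothetical, chosen only to show the shape of the problem, and none of it is measured from Selenium or Playwright.

```ruby
# Crude model: a racy test "passes" only when the tool's command arrives
# after the page has already loaded.
def racy_test_passes?(driver_latency_ms:, page_load_ms:)
  driver_latency_ms >= page_load_ms
end

# Percentage of runs that pass across a range of possible driver latencies.
def pass_percentage(latencies_ms, page_load_ms:)
  passes = latencies_ms.count do |latency|
    racy_test_passes?(driver_latency_ms: latency, page_load_ms: page_load_ms)
  end
  (passes.to_f / latencies_ms.size * 100).round
end

# Hypothetical latency ranges in milliseconds.
fast_tool = (1..10)
slow_tool = (60..120)

# Fast tool, 50 ms page load: the race is lost every time, so the bug is obvious.
puts pass_percentage(fast_tool, page_load_ms: 50) # 0

# Slow tool, 50 ms page load (say, on your laptop): the bug never shows up.
puts pass_percentage(slow_tool, page_load_ms: 50) # 100

# Slow tool, 90 ms page load (say, in CI, or years later): a flaky test.
puts pass_percentage(slow_tool, page_load_ms: 90) # 51
```

The worst place to be is the middle case sliding into the last one: nothing about the test changed, only the relative speed of the app and the tool.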

And of course, when that happens, you're in a real pickle. Erratic failures are inherently hard to reproduce. And if a test has been passing for years and is suddenly failing, you aren't likely to remember what you were thinking when you wrote it—meaning that if you can't reliably reproduce the failure, you're unlikely to be able to fix any such race conditions by just looking at the code.

Anyway, that's why.

Switching an existing test suite from Selenium to Playwright won't magically fix all the flaky tests that Selenium let you write by virtue of its being slower than your app. In fact, the first time you run Playwright, you're likely to see dozens of "new" errors that have in fact been hiding in your tests like land mines all along. And that's a good thing! Fix them! 🥒

The new ‌Mac mini‌ will be the first major design change to the machine since 2010, making it Apple's smallest ever desktop computer. The new ‌Mac mini‌ will apparently approach the size of an Apple TV, but it may be slightly taller than the current model, which is 1.4 inches high. It will continue to feature an aluminum shell. Individuals working on the new device apparently say that it is "essentially an iPad Pro in a small box."

I can't be the only person thinking, "I wonder if I could plug this into a portable USB power bank, throw it in my bag, and then run Mac Virtual Display on my Vision Pro without needing to carry a laptop… can I?"

I really enjoyed this discussion with host Tim Chaten about the state of Apple Vision Pro. It was recorded a couple weeks after WWDC, which meant the memory was fresh enough to keep all of Apple's announcements top of mind but distant enough to imagine various directions things could go from here.

I gotta say, it was nice talking to someone who knows and cares more about the platform than I do. Some real "there are dozens of us!" energy around the Vision Pro right now.

Yesterday, Gruber broke what, in my opinion, is the most important news story regarding Apple Vision Pro since its launch in February. Emphasis mine:

VisionOS 2 is not getting any Apple Intelligence features, despite the fact that the Vision Pro has an M2 chip. One reason is that VisionOS remains a dripping-wet new platform — Apple is still busy building the fundamentals, like rearranging and organizing apps in the Home view. VisionOS 2 isn't even getting features like Math Notes, which, as I mentioned above, isn't even under the Apple Intelligence umbrella. But another reason is that, according to well-informed little birdies, Vision Pro is already making significant use of the M2's Neural Engine to supplement the R1 chip for real-time processing purposes — occlusion and object detection, things like that. With M-series-equipped Macs and iPads, the Neural Engine is basically sitting there, fully available for Apple Intelligence features. With the Vision Pro, it's already being used.

Not being able to run Apple Intelligence would be a devastating blow to any role Vision Pro might serve as Apple's halo car—an expensive gadget most people won't (and shouldn't) buy, but which plays an aspirational role in the lineup and demonstrates their technology and design prowess.

Now, couple this with rumors that work on Vision Pro 2 has been suspended, and it starts to look like we won't see any Apple Intelligence features on the visionOS platform until late 2026 at the earliest. How dated and limited will Apple Vision Pro seem in late 2026 if most new features coming to Apple's other platforms—including, one imagines, updated Watch, Apple TV, and HomePod hardware—don't find their way to Vision Pro, putting its user experience further and further behind?

At launch, I heard a lot of people jokingly refer to Vision Pro as, "an iPad strapped to your face." Recently, as it's become clear most people are using it to watch TV and for Mac screen sharing, Marco Arment said it was more like a mere Apple TV strapped to your face. But if the hardware really can't support Apple Intelligence and isn't going to be updated for several years, how long before Vision Pro feels like an original HomePod strapped to your face?

I was excited to be hosted by the Changelog folks for a recap discussion following Apple's WWDC keynote on Monday. If you listened to my WWDC spoiler cast, you might like this after-action report.

Changelog & Friends 48: Putting the Apple in AI – Listen on Changelog.com

A few errata and missed pitches:

  • I didn't mention this on the podcast, but I was deeply disappointed to see Apple didn't expose any system-level model that can just be invoked as a general purpose LLM API. This would have been a game-changer for small developers who are currently hobbled by figuring out how to roll out meaningful LLM features without risking that the cost of calling through to the OpenAI API will eclipse the revenue generated by app sales and subscriptions
  • When discussing why Apple Intelligence requires an iPhone 15 Pro, I whiffed on the reason (which became clear later that day) that the root cause is memory. Devices with less than 8GB of RAM probably can't run Apple Intelligence without the base operating system falling over
  • I referenced Mac Virtual Display working with the "Wireless NIC turned off". That's not quite right. The network interface needs to be on, but if neither device is connected to a wifi network, screen sharing will work over a peer-to-peer wifi connection

I hope you'll listen! I like Jerod and Adam a lot. The whole Changelog family of podcasts is fantastic. You can tell they're smart because they have yet to invite Breaking Change to join the network.

Yesterday in the Platform State of the Union, Apple celebrated 10 years of Swift and then in the very next slide announced they finally built a testing framework for it.

Digging in more, I found this forum post announcing that a "vision document" (I'm not familiar with Swift people's vernacular for this stuff) for testing direction had been accepted.

Anyway, this is all interesting in its own right and something I'll be following generally, but I have to say that for such a broad and important topic, that forum thread had an absolutely incredible time-to-mocking-derail metric. The very first reply is about mocking and then seemingly half of the subsequent posts in the thread are people spouting off their personal opinions on mocking instead of any of the more important stuff.

I don't claim to be an expert in very much, but having built several mocking frameworks, spoken about mocking practices (including a stealth mocking keynote at RailsConf and a not-so-stealth mocking closing talk at JSConf), and even named the company I co-founded after a mocking term, I feel like I have some authority to say the following: boy howdy is it a bummer that most developers only understand mocking as a utility and lack any comprehension of how to deploy mocks in a well-defined, consistently-executed software development workflow.

I'm very happy to have nailed an approach that works really well for me, but I'll probably always view it as a major failure of my career that I was never able to market that approach effectively. Even now, I don't have a single authoritative URL to point you to for my "Discovery Testing" approach that would provide a clear explanation of what it is, why it's good, and how to use it. And for me personally, the moment has probably passed and I just need to live with that failure, I imagine.

One of the things I think about a lot with respect to testing practices (including test-driven development) and their failure to really "stick" or spread more broadly is that they transcend any one testing framework or programming language. As a result, consultants like me were so absorbed just porting slightly-different versions of the various tools we needed to every new language that there's no such thing as a single README or book that could explain to a normal developer how to be successful. I tried to write a book once that would serve as a tabula rasa across languages and I almost immediately became trapped in a web of complexity. Other programming idioms and methods that apply across languages are similarly fraught, but mocking's relative unimportance to the primary task of shipping working software probably doomed it from the start.

Anyway, it's cool that mocking will be possible in Swift Testing.

Humane, of overpriced AI tchotchke and parting-foolish-VCs-with-their-money fame:

Our investigation determined that the battery supplier was no longer meeting our quality standards and that there is a potential that certain battery cells supplied by this vendor may pose a fire safety risk.

I would normally be worried about people's safety here, but I'm pretty confident that all of Humane's customers had already stopped using it months ago.

I'm happy to share that I'll be speaking at Rails World 2024. Everything I heard about last year's event was overwhelmingly positive, and my interactions with Amanda give me every confidence that this year's event in Toronto will be great, as well. Japan's RubyKaigi—which has grown to 1400 attendees and attracted dozens of corporate sponsors—has set a high bar for any conference that aims to blend community-building, professional development, and in-person collaboration to push a technology forward, but every indication suggests Rails World is on the right track.

My topic? Glad you asked.

In keeping with the "one-person framework" motif, I'm calling it "The Empowered Programmer", as a sort of sequel to my 2019 presentation, The Selfish Programmer. I'll be talking about the Rails 7 app I've been building this year, in support of my wife's eponymous Better with Becky business.

A few themes that might emerge:

  • The value of proving out the app's basic plumbing with a lower stakes proof-of-concept, so as to avoid packing one's most naive, unconventional code into its most important "MVP" features (in this case, by building Beckygram before breaking ground on the more important strength-training system)

  • Why to adopt and how to get the most out of relatively recent Rails-ecosystem tools like Hotwire, Active Storage, and Solid Queue—many teams skip omakase stack stuff out of habit or because they're upgrading an older app, but staying on the rails has greatly accelerated my productivity as a solo developer

  • The various (mis-)adventures I've had with GPT-4 as my only pairing partner

  • How nice it is to not have React or Webpack anywhere in my codebase. Seriously, Stimulus and Turbo really feel like the "JavaScript sprinkles" we should have had all along, and the amount of pain they can spare you from trying to balance a single-page JavaScript app with a Rails backend is profound

  • Plenty of other takes, served hot

XOXO is back for one last festival this August. Having always wanted to attend, I was about to buy a ticket when I thought to click through to the COVID policy mentioned in their announcement e-mail:

All XOXO participants are required to wear a high-quality mask at all times while inside Washington High (including Revolution Hall, Show Bar, and all common areas inside the venue), the reserved area for Park Pass holders in the tent, as well as at any festival event on our schedule where masks are required.

I knew XOXO was frequented by hipsters, so I'll grant that an all-N95-all-the-time policy in late-2024 is decidedly vintage.

Park Pass holders will have access to reserved seating in the tent at Washington High Park, a large, shaded, well-ventilated space for viewing the simulcast of our main stage programming. Park Pass holders are required to wear a high-quality mask at all times while in the reserved seating area.

Even outdoors, too? Pass.

John Hawthorn, a brilliant Rubyist and contributor of quite a lot of important open source in the Ruby ecosystem, references a benchmark for a gem that lets you invoke the Crystal programming language from Ruby.

I'll spoil the post here by giving you the before and after.

Before:

The "crystalized" version runs about 4x faster than the pure Ruby version.

After:

Now it's Ruby that's 5 times faster than Crystal!!! And 20x faster than our original version.

Writing a program that behaves the way you want is hard, but that's not the end of the journey. Without an understanding of how the computer will execute your instructions, you're left at the mercy of a bunch of arbitrary performance implications that can lead to misguided beliefs emerging within teams and organizations.

A common refrain in recent years is, "we're rewriting all our critical sections of Ruby into Rust." Closer inspection by an expert almost always finds flaws in the assumptions (or, if there are any, analyses) that lead to these whole-cloth "optimizations", however.

I've seen organizations try to pull off pretty much every approach you can think of to escape performance problems and technical debt. The more dramatic the maneuver—like, say, bifurcating one's codebase into two languages and then bridging them—the more likely it is to fail to accomplish the stated goal or to fail outright.

Trust me when I say it's almost always better to dance with the one who brought you. Embracing the problem usually leads to better solutions than running from it.

In a post on X (formerly Twitter), Kosutami explained that Apple has stopped production of FineWoven accessories due to its poor durability

Sorry cows, your skin is simply too good to not wrap our metal devices in.

The company may move to another non-leather material for its premium accessories in the future.

On second thought, it's very unlikely Apple would ever go back to leather. My guess is FineWoven quietly disappears and isn't immediately replaced with another "premium" material in 2024.

The co-founder of a nascent game studio learned of a significant internal leak when a reporter called him for comment. He notified his publishing partner, who responded to the leak by cutting funding. Then, after breathing what I'm sure were a series of very deep, very audible sighs, he decided to say, "fuck it," and close the whole damn studio:

As a result of the cancellation of the publishing relationship and after careful consideration, I am closing Possibility Space. Today is your last day of employment at Possibility Space and Prytania Media. Your final paycheck including pay for work through the end of today will be deposited to your account, along with any other required payments, as dictated by your work location.

And exit the industry while he's at it:

As of today I am stepping away from the game industry to focus on my family and care for Annie [Delisi Strain]. I wish all of you the best as you navigate this complex industry and the challenges and opportunities ahead.

In a recent episode of Breaking Change I talked at length about why the "AAA" gaming industry is falling apart before our eyes, so reactions like this don't seem so unreasonable to me. I've been wondering aloud for my entire career why the hell anyone gets into the gaming business when virtually everyone involved could make an order of magnitude more money working half as many hours doing literally anything else involving software. Now that the free money spigot has been shut off, investors are starting to realize the same thing and are looking for any plausible pretense to pull the plug.

I'm sure the journalist named in the piece (incidentally, by his own employer, via this article) doesn't feel awesome that his reporting indirectly led to his sources losing their jobs. And the fact he apparently never bothered to contact Annie Delisi Strain—the rare-for-the-industry woman CEO currently on medical leave—for comment is certainly a bad look. But he's probably got enough to worry about, considering that one of the few industries closer to the brink of collapse than games is games journalism.

Note that this App Store policy change is global, not only to comply with the EU's Digital Markets Act.

You can read the updated policy directly, but The Verge breaks it down:

The move should allow the retro console emulators already on Android — at least those that are left — to bring their apps to the iPhone. Game emulators have long been banned from iOS, leaving iPhone owners in search of workarounds via jailbreaking or other workarounds. They're also one of the key reasons, so far, that iPhone owners in the European Union might check out third-party app stores now that they're allowed in the region. Apple's change today could head that off.

There goes the only reason I'd ever be interested in a third-party marketplace. I suspect I'm not alone in that.

The Wall Street Journal (News+ link) with a pretty wild profile of one of Apple's most interesting and influential leaders:

People close to Schiller describe his three main hobbies as cars, Boston sports teams and Apple, where he is still known to work nearly 80 hours a week, respond to emails almost immediately and answer phone calls at any time. He is also heavily involved in philanthropic endeavors, including an institute at Boston College, his alma mater, that carries his name, the Schiller Institute for Integrated Science and Society.

If you've been a senior executive someplace for decades and your company still relies on you working 80-hour weeks, I don't know how else to say it: you fucked up. You should have figured out how to develop other leaders and delegate responsibility to them by now. And unless your unique contributions are going to make the difference in curing cancer, ending global hunger, or bringing peace to the Middle East, you're only exacerbating the problem by continuing to work around-the-clock into your mid-60s.

This is a very good list. A few things I hadn't seen before but will instantly add to my project.

Hirb is great when inspecting elements in the console. It's a mini view framework for IRB/Console. It can handle displaying information in tables and pages. It's not quite powerful enough to build a full fledge TUI application, but it's really useful for quickly inspecting data in the console. Say you want to print the attributes of the last 10 signed in users. Hirb would let you display them as a table instead of a bunch of long lines, It makes it a lot easier to visually parse information. It's not Rails-specific but comes with Active Record support out of the box.

Looks like a worthy successor to one of my favorite gems, table_print.

Apple has internally tested a new Apple Pencil with visionOS support, according to a source familiar with the matter. This would allow the Apple Pencil to be used with drawing apps on the Vision Pro, such as Freeform and Pixelmator.

One hopes you're supposed to wave it around in the air like a conductor might.

The Weibo user explained that the ‌iPad Pro‌'s new matte display option will be offered in addition to the standard, glossy glass finish. It apparently features -4° to +29° of haze and may tout some kind of blue-light blocking technology to help protect the eyes. Matte screen protectors for the iPad have become popular, so it is possible that Apple is trying to offer such an option at the point of purchase for those who want it.

I wonder what this means for display performance in direct Florida sunshine, as the current iPad lineup is worthless outdoors here.

Apple is in discussions with Google to integrate its Gemini AI engine into the iPhone as part of iOS 18, according to Bloomberg's Mark Gurman.

Through iOS 5, Maps and YouTube were native apps that Apple built and which were backed by Google services. This was advantageous for both parties at first. Apple wasn't nearly ready to roll out its own mapping service and Google was more focused on growing YouTube's reach than monetizing it. Eventually, it stopped making sense for either party, and they went their separate ways.

The primary media narratives about this focused on Steve Jobs' "thermonuclear" threat over Android's copying of the iOS UI and the degree to which the two companies had begun to compete on services. But one thing that was lost in the discussion—which never really squared with the fact Google has continued to pay Apple tens of billions a year to be Safari's default search engine—was that both companies maintain relatively-tenuous moats to lock in customers.

Right now, Google needs people to reach for its AI and search stack before a generation of users learn to "GPT it", and Apple needs an AI stack for its platform that can compete with the dozens of devices set to launch that are little more than thin candy shells on top of OpenAI's API.

I really hate the idea of this deal, and I bet executives at both companies do, too. Which is why it's so unfortunate that it also makes sense.

Gripping story, overall, and worth a read. This bit stuck out to me as something I'd never considered before, but felt obvious as soon as I was exposed to it:

Political communicators are sticking to approaches developed for an era when ticket-splitters and swing voters composed a sizeable chunk of the electorate. But with a body politic that has sorted into two highly polarized parties—with just one-tenth of voters torn between them—the logic of persuading voters to support a candidate has grown obsolete. Ad campaigns should instead promote the Democratic Party itself, Malchow proposes, particularly at moments when news events might help it win new adherents, such as after a mass shooting, which thrusts gun-control policy back into the news and voters might be ready to reconsider their allegiances.

To wit: in an era of extreme party polarization, 90% of people in the US are voting based on party affiliation, but campaign advertising is still centered on candidate choice. This isn't just inefficient, it's counter-productive, since most candidates run away from their parties in general elections because both parties' brands are so toxic. Focusing money and messaging on bolstering a party's brand seems like a much smarter way to meet this moment of overwhelmingly party-line voting.

I can only hope I'll still have meaningful insights to offer others during my final week on earth.