justin.searls.co

Glad to see Jerod properly follow up on this one:

In September of last year, I covered a post by Mike Judge arguing that AI coding claims don't add up, in which he asked this question:

If so many developers are so extraordinarily productive using these tools, where is the flood of shovelware? We should be seeing apps of all shapes and sizes, video games, new websites, mobile apps, software-as-a-service apps — we should be drowning in choice. We should be in the middle of an indie software revolution. We should be seeing 10,000 Tetris clones on Steam.

I was capital-T Triggered by this, having separately fired off my own retort to Judge's post at the time, and even going so far as creating a Certified Shovelware README badge:

[Certified Shovelware badge]

And that badge has gotten a lot of action in the intervening 2 months. Shit, just last night I released two (count 'em, two) Homebrew formulae. I wouldn't have bothered creating either were it not for the rapacious tenacity of coding agents. (Please ignore the fact that both projects exist in order to wrangle said coding agents.)

Anyway, I've been thinking a lot about that Shovelware post ever since, and again recently with all the mainstream press coverage of ClawdBot/Moltbot/OpenClaw this week, especially as I see long-term skeptics of AI's utility like Nilay Patel finally declaring this the moment he sees the value in agents. (My dude, I was automating my Mac with claude-discord-bridge and AppleScript from my doctor's office in late July!)

But I was too lazy to take those thoughts and do anything with them. Unlike me, Jerod did the work of rendering the chart that properly puts the original "where's the shovelware?" complaint to rest. Rather than hotlink his image, I encourage you to click through to see for yourself:

This, to me, looks like the canary in the coal mine; the bellwether leading the flock; the first swallow of summer; the… you get the idea.

Hard agree. I said then and continue to agree with myself now that (1) it makes no sense to start the clock in November 2022, because no AI coding products prior to terminal-based coding agents ever mattered, and (2) people woefully underestimate the degree to which programmers are actually late adopters. (Raise your hand if you're still refusing to install macOS Tahoe, for fuck's sake.)

Even today, I'd be shocked if over 5% of professional programmers worldwide have attempted to adopt a terminal-based coding agent in anger. The amount of technically-useful, mostly-broken software we're going to be inundated with a year from now will be truly mind-bending.

Why is OpenAI so stingy with ChatGPT web search?

However expensive LLM inference supposedly is, OpenAI continues to be stupidly stingy with web searches, even though any GPT-5.2 Auto request (the default) is extremely likely to be wrong unless the user intervenes by enabling web search.

Meanwhile, ChatGPT's user interface:

  • Offers no way to enable search by default
  • Has no keyboard shortcut to enable search
  • Has no app (@) or slash (/) command to trigger search
  • Ignores personalization instructions like "ALWAYS USE WEB SEARCH"
  • Frequently hides web search behind multiple clicks and taps, and aggressively A/B tests interface changes that will clearly result in fewer searches being executed

All of this raises the question: how does ChatGPT implement search? What does each search itself cost, and how much chain-of-thought reasoning does it take to aggregate and discern the extraordinary number of tokens that must be ingested from those search results?
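
I can only guess, but even a minimal sketch of a generic search-augmented answer loop makes the arithmetic legible. Everything below is an assumption: search_web and count_tokens are hypothetical stand-ins, not OpenAI's actual implementation.

```python
# Minimal sketch of a generic search-augmented answering loop. Everything
# here is hypothetical: search_web and count_tokens are stand-ins, not
# anything from OpenAI's actual stack.

from dataclasses import dataclass


@dataclass
class SearchResult:
    url: str
    snippet: str  # extracted page text fed back into the model's context


def search_web(query: str, max_results: int = 5) -> list[SearchResult]:
    # Stand-in for a real search backend; canned results keep the sketch runnable.
    return [
        SearchResult(url=f"https://example.com/{i}", snippet="lorem ipsum " * 300)
        for i in range(max_results)
    ]


def count_tokens(text: str) -> int:
    # Crude approximation: ~4 characters per token for English prose.
    return max(1, len(text) // 4)


def answer_with_search(question: str) -> None:
    # 1. Real systems first have the model rewrite the question into one or
    #    more search queries; this sketch just passes it straight through.
    results = search_web(question)

    # 2. Every snippet becomes extra prompt context that a second (often
    #    chain-of-thought) inference pass has to chew through.
    context_tokens = sum(count_tokens(r.snippet) for r in results)
    print(f"{len(results)} pages -> ~{context_tokens} extra input tokens")


if __name__ == "__main__":
    answer_with_search("Does the iPhone Air exist?")
```

Five results at a few hundred tokens apiece means thousands of extra input tokens per request before any reasoning pass even starts, which at least makes the stinginess legible, even if the product tradeoff doesn't.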

It's interesting that OpenAI is so eager to goose usage by lighting dumpsters full of venture capital on fire, but is so stingy when it comes to ensuring their flagship product knows basic facts like "iPhone Air is a product that exists."

Early in my career, I met a few COBOL developers who came out of retirement in the run-up to January 2000, getting paid $300+ per hour to remediate Y2K bugs when nobody else was left who knew COBOL.

Suspect a similar trajectory for highly-skilled, well-rounded "pre-AI" engineers.


Coding agents are a TON of fun if you (1) are extremely ambitious and (2) have middling standards.

They are still fun if you (1) are extremely ambitious and (2) have ruthlessly exacting standards, but markedly less so.


Now that I've spent ten hours with Claude Code after a few months with Codex CLI, I can say with confidence:

  • Claude is much, much faster
  • Claude makes much stupider mistakes much more often, even with Opus 4.5
  • With either agent, I end each session frustrated and exhausted

LOL, also apparently Adobe Premiere on iPad will frequently fail silently when generating captions. And there's no way to export them as text, subtitle files, etc.

This is starting to seem like bad software that nevertheless gets recommended to people ceaselessly.


Trying Adobe Premiere for the first time since version 6.0 in 2002. Paid for Creative Cloud Pro. First thing I tried: start a project on iPad, sync via cloud, finish on my Mac.

LOL, nope. Their "cloud" can't sync projects. It's just a one-way, manual upload and import. And slow.


Over the last 20 years, my time in Japan has felt like Season 1 of Pluribus. A nation acting in harmony, going from happy to see me everywhere I went to constantly signaling they "need some space."

In cities, it's thanks to over-tourism. In the country, it's anti-immigrant populism.


Tooling for coding agents is overly focused on scaling numerous parallel workers instead of ensuring correctness. Correctness continues to be where all my time goes and is the real barrier to scaling up. (e.g., Why am I exploratory testing this UI when vision models could be doing it?)
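
To make that concrete, here's a minimal sketch of what I mean, assuming a hypothetical ask_vision_model callable in place of whatever real vision API you'd actually wire up:

```python
# Hypothetical sketch: delegating an exploratory UI check to a vision model.
# ask_vision_model is injected so any real vision API could be swapped in.

from typing import Callable


def check_screen(
    screenshot: bytes,
    expectation: str,
    ask_vision_model: Callable[[bytes, str], str],
) -> bool:
    # Ask the model a strict yes/no question about the screenshot.
    prompt = (
        "Answer strictly YES or NO. Does this screenshot satisfy the "
        f"following expectation? {expectation}"
    )
    answer = ask_vision_model(screenshot, prompt).strip().upper()
    return answer.startswith("YES")


if __name__ == "__main__":
    # Fake model for demonstration; a real harness would capture an actual
    # screenshot and call a real vision endpoint.
    fake_model = lambda image, prompt: "YES"
    ok = check_screen(b"<fake png bytes>", "The login form shows both fields", fake_model)
    print("pass" if ok else "fail")
```

Point something like that at a screenshot loop and the exploratory pass I keep doing by hand becomes a thing an agent could at least attempt.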


This Opus 4.5 failure mode rarely happens with Codex: Claude copy-pasted a source listing by mistake, edited that instead of the correct file, declared success. When I pointed out its error, it confidently asserted: "The code is correct - this is a browser caching issue." 🖕


After hearing lots of Opus 4.5 hype, I decided to switch to Claude Code this month from Codex CLI (with which I have a separate set of frustrations), and I've learned two things:

  1. Claude Code is significantly more productive (parallel by default) and generally smarter at coding than it was in August

  2. When literally anything goes wrong, it routinely takes the path of least resistance (making excuses, declaring victory when things don't work, reaching for workarounds when basic reason points elsewhere)

Disappointed overall. Both GPT-5.2 and Opus 4.5 (and their CLIs' surrounding chrome) feel barely improved since last summer.
