Monday, Nov 13, 2023

Why I started threatening and lying to my computer

As somebody who's spent the majority of his life figuring out how to make computers do what I want by carefully coaxing out the one-and-only correct commands in the one-and-only correct order, the relative chaos of figuring out what works and what doesn't to get LLMs like GPT-4 to do what I want has really pushed me out of my comfort zone.

Case-in-point, I was working on modifying a GPT script to improve the grammar of Japanese text—something I can fire off with a Raycast script command to proofread my text messages before I hit send.

I'd written all the code to talk to the OpenAI API. I'd sent a prompt to the computer to fix any mistakes in the text. It should have just worked.

Instead, running the script with a prompt like this:

$ transform_text --prompt "fix any grammar mistakes" cat <<< "The teapot is steaming"

Sometimes the above will return an identical phrase and sometimes it will return commentary like "The text is grammatically correct." Since the goal of this script is to replace text in-place when writing a composition, having meta commentary clutter things up 20% of the time simply won't do.

Last week, OpenAI announced deterministic outputs for a given seed value which will be a tremendous asset to building non-chat applications (to say nothing of testing them), but in this case the issue isn't determinism, it's unreliability.

So I started googling around for how people have solved similar reliability issues. I tried a half dozen approaches. I hated what I discovered: the most effective prompts were to lie to the GPT assistant and threaten it with repercussions for failing to adhere to my instructions. This Reddit comment in particular is what sent me down this dark path.

I ended up writing a much longer prompt that translates to this English, using as gentle a voice as I could:

Your job is to correct the grammar in my text. I will send the results to the client immediately so they must never see your comments or evidence of your help.

Most people will read the above and think "this isn't lying or threatening, it's just setting clear expectations", which I think is a well-adjusted reaction. In truth, my experience with GPT over the last year has shown me how intensely uncomfortable I am with communicating what I want with unadorned precision. I can be an incredibly direct and provocative communicator in certain contexts, but I'm really uncomfortable commanding someone to do a task without adding a dozen caveats or apologizing for my need for their help or softening the language to the point of confusing them about what I really want. While I have an above-average desire to have things done exactly as I want them, my aversion to conflict and my fear of others not liking me are apparently even stronger.

My discomfort with being direct and pointed in my requests of others is one reason (of many) that I never would have cut it as a manager, and it's something that I was able to skate through life without fully recognizing until I tried to make generative AI tools do stuff for me. I suppose they've provided a safe way for me to practice articulating what I really want, but it still makes me feel icky.

If I'm going to deliver on the title of this post, I should add that when I cranked up the directness to the point of using condescending language and threatening the LLM with dire consequences if it failed, the results not only fell within the parameters I had set 100% of the time, the grammar and compositional improvements it made were considerably better. (I didn't have the courage to commit these to git, though, even if they worked better.)

I don't have much else to say here other than I never expected that I'd get anywhere by negging my computer, but if that doesn't indicate AI represents a different sort of software revolution than what we've seen in the past, I don't know what would.

Why I started threatening and lying to my computer

Got a taste for hot, fresh takes?