How to run Claude Code against a free local model
Last night, Aaron shared the week-old Claude Code demo, and I was pretty blown away by it.
I've tried the "agentic" features of some editors (like Cursor's "YOLO" mode) and have been woefully disappointed by how shitty the UX always is. They often break on basic focus changes, hang at random, and frequently require fussy user intervention with a GUI. Claude Code, however, is a simple REPL, which is all I've ever really wanted from a coding assistant. Specifically, I want to be able to write a test in my editor and then tell a CLI to go implement code to pass the test and let it churn as long as it needs.
Of course, I didn't want to actually try Claude Code, because it would have required a massive amount of expensive API tokens to accomplish anything, and I'm a cheapskate who doesn't want to have to pay someone to perform mundane coding tasks. Fortunately, it took five minutes to find an LLM-agnostic fork of Claude Code called Anon Kode and another five minutes to contribute a patch to make it work with a locally-hosted LLM server.
Thirty minutes later, I had a totally-free, locally-hosted version of the Claude Code experience demonstrated in the video above working on my machine (a MacBook Pro with M4 Pro and 48GB of RAM). I figured other people would like to try this too, so here are step-by-step instructions. All you need is an app called LM Studio and Anon Kode's `kode` CLI.
Running a locally-hosted server with LM Studio
Because Anon Kode needs to make API calls to a server that conforms to the Open AI API, I'm using LM Studio to install models and run that server for me.
- Download LM Studio
- When the onboarding UI appears, I recommend unchecking the option to automatically start the server at login
- After onboarding, click the search icon (or hit Command-Shift-M) and install an appropriate model (I started with "Qwen2.5 Coder 14B", as it can fit comfortably in 48GB)
- Once downloaded, click the "My Models" icon in the sidebar (Command-3), then click the settings gear button and set the context length to `8192` (this is Anon Kode's default token limit and it currently doesn't seem to respect other values, so increasing the context length in LM Studio to match is the easiest workaround)
- Click the "Developer" icon in the sidebar (Command-2), then in the top center of the window, click "Select a model to load" (Command-L) and choose whatever model you just installed
- Run the server (Command-R) by toggling the control in the upper left of the Developer view
- In the right sidebar, you should see an "API Usage" pane with a local server URL. Mine is (and I presume yours will be) http://127.0.0.1:1234
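Before pointing Anon Kode at it, you can sanity-check that the server actually speaks the OpenAI-style chat completions API. Here's a minimal sketch using only Python's standard library; the default model name below is an assumption, so substitute whatever identifier LM Studio reports for the model you loaded:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:1234/v1"  # as shown in LM Studio's "API Usage" pane

def chat_request(prompt, model="qwen2.5-coder-14b", base_url=BASE_URL):
    """Build an OpenAI-style chat completion POST for the local server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Only fires the request when run directly, so the server must be up
    with urllib.request.urlopen(chat_request("Say hello")) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

If that prints a chat response, Anon Kode should be able to talk to the same URL.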
Configuring Anon Kode
Since Claude Code is a command-line tool, getting this running will require basic competency with your terminal:
- First up, you'll need Node.js (or an equivalent runtime) installed. I use homebrew and nodenv to manage my Node installation(s)
- Install Anon Kode (`npm i -g anon-kode`)
- In your terminal, change into your project directory (e.g. `cd ~/code/searls/posse_party/`)
- Run `kode`
- Use your keyboard to go through its initial setup. Once prompted to choose between "Large Model" and "Small Model" selections, hit escape to exit the wizard, since it doesn't support specifying custom server URLs
- When asked if you trust the files in this folder (assuming you're in the right project directory), select "Yes, proceed"
- You should see a prompt. Type `/config` and hit enter to open the configuration panel, using the arrow keys to navigate and enter to confirm:
  - AI Provider: toggle to "custom" by hitting enter
  - Small model name: set to "LM Studio" or similar
  - Small model base URL: `http://127.0.0.1:1234/v1` (or whatever URL LM Studio reported when you started your server)
  - API key for small model: provide any string you like, it just needs to be set (e.g. "NA")
  - Large model name: set to "LM Studio" or similar
  - API key for large model: again, enter whatever you want
  - Large model base URL: `http://127.0.0.1:1234/v1`
  - Press escape to exit
- In my case, setting a custom base URL resulted in Anon Kode failing to append `v1` to the path of its requests to LM Studio until I restarted it (if this happens to you, press Ctrl-C twice and run `kode` again)
- Try asking it to do stuff and see what happens!
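For reference, the settings entered above boil down to this small mapping. (The key names are paraphrased from the `/config` panel for illustration; this is not Anon Kode's actual on-disk configuration format.)

```python
# Hypothetical summary of the /config values — field names are illustrative only
custom_provider = {
    "provider": "custom",
    "small_model_name": "LM Studio",
    "small_model_base_url": "http://127.0.0.1:1234/v1",
    "small_model_api_key": "NA",  # any non-empty string will do
    "large_model_name": "LM Studio",
    "large_model_base_url": "http://127.0.0.1:1234/v1",
    "large_model_api_key": "NA",
}
```

The point is that both the "small" and "large" model slots point at the same local server, and the API keys are placeholders that just need to be non-empty.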
That's it! Now what?
Is running a bootleg version of Claude Code useful? Is Claude Code itself useful? I don't know!
I am hardly a master of running LLMs locally, but the steps above at least got things working end-to-end so I can start trying different models and tweaking their configuration. If you try this out and land on a configuration that works really well for you, let me know!