LM Studio 0.4.16: Your Local Models Just Went Mobile

If you've read much of what I write, you know I have a soft spot for running AI locally — on my own
machines, my own hardware, no per-token meter running. I've written about
local OCR with a vision model in LM Studio,
about why price is a feature,
and about a whole practical setup for cheap and local AI.
The tool that sits at the center of most of that for me is LM Studio, and it just shipped
0.4.16 — a release that's more interesting than the version bump suggests.

A quick refresher: what LM Studio is

For anyone who hasn't used it: LM Studio is a desktop app for discovering, downloading, and running
large language models on your own computer. It runs on macOS, Windows, and Linux; it gives you a
friendly chat UI; it lets you browse and pull models (GGUF and Apple's MLX formats); and — the part I
use most — it exposes a local, OpenAI-compatible server so your own apps and agents can talk to a
model running on your machine exactly as if it were a cloud API. No data leaves the box, no API bill,
no rate limit but your own RAM.

That last point is the whole reason I keep coming back to it. When the inference happens on hardware I
own, privacy isn't a policy I have to trust — it's a physical fact.

What's new in 0.4.16

The headline of this release is that your local models are no longer stuck to the machine they run
on. Here's what shipped, in plain terms.

Locally — LM Studio's mobile app

The big one: Locally, a new iPhone and iPad app from the LM Studio team. This is the mobile
companion the local-AI crowd has been asking for — a proper, first-party way to use models from your
pocket instead of being tethered to your desk.

LM Link — your desktop's big models, on the go

On its own a phone can only run fairly small models — there's just not much RAM in your pocket. LM
Link is the clever bit that fixes that: it connects Locally on your phone to the full LM
Studio running on your desktop, so you can use your largest models — the 30B, 70B, whatever your
workstation can hold — from your phone. The heavy lifting stays on the powerful machine; your phone is
just the front end.

And critically for this release: LM Link no longer requires waitlisting. It's open to everyone now,
so you don't have to request access and wait — install Locally, link it, and go.

For me this is the genuinely exciting part. I have a beefy Mac that happily runs big models; what I
didn't have was a good way to reach it when I'm away from the desk. Now I can — and the model still
runs on my hardware, so the privacy story stays intact even though I'm querying it from a phone on the
other side of town.

A bigger default context window

Small but welcome: the default context length is now 8k tokens (up from the previous default). It
means more of your conversation and pasted material fits before the model starts forgetting the start —
a saner out-of-the-box experience, especially for the document and code work a lot of us do. (You can
always raise it further per-model if your memory budget allows.)

Security hardening and a real GPU bug fix

Rounding it out:

Security hardening — exactly the kind of unglamorous work you want in a tool that runs a local
server and now talks to a mobile app.
A GGUF multi-GPU fix — this one matters if you run multiple cards. It fixes bugs in GPU
ON/OFF selection and Priority Order that were affecting some CUDA 12, ROCm, and Vulkan
setups. If you've ever fought LM Studio over which GPU it should actually use, this release is worth
grabbing for that alone.

Why a point release deserves a post

On paper "0.4.16" is a minor version. In practice it crosses a meaningful line: local AI stops being
something that only happens at your desk. The pattern LM Link establishes — heavy model on the
powerful machine you own, lightweight client wherever you happen to be, inference never leaving your
hardware — is exactly the shape I want the whole local-AI story to take. It's the privacy and
cost-control of running your own models, without the one real downside (being chained to the box).

If you already live in LM Studio, update and try linking Locally to your desktop. And if you've been
cloud-only because "local AI means sitting at the workstation," this release quietly removes that
excuse. I'll be running my big models from my phone this week and grinning about it.

If you give it a spin, tell me how LM Link holds up on your setup — I'm always listening on the links on
the about page.