Jun 15, 2026 · 7 min read

WWDC Deep Dive, Part 1: Apple's Native AI for Developers — Foundation Models, Free Cloud Compute, and the Siri Intent Trap

This was, by most accounts, a tick year for Apple — the kind of release where the OS itself barely
moves. Liquid glass got refined, toolbars changed again on the Mac, and the headline features were
about polish and "safe for everybody" marketing. As an app developer that's not the worst outcome:
it means you can spend the year refining your own UI instead of re-learning how every button renders.

But underneath the calm, the part that actually matters to those of us who build things is the AI
story. And it's genuinely better this year — with some very Apple-shaped caveats. Let me cut through
it the way I'd want it cut for me: where does each model run, what does it cost, and how do I call
it?

The on-device Foundation Model finally got smarter

For the last year, the Foundation Models framework gave you exactly one thing: access to the small
model that ships on the phone. Everyone who tried to build something real with it hit the same wall
fast — it just wasn't a smart enough model. You'd prototype a feature, watch it fall over on anything
non-trivial, and quietly shelve it.

Two things changed:

  1. The local model itself got better. It's a new on-device model with image processing, and from
    the partnership signals (Apple has been working with Google) there's a real chance the underlying
    architecture is Gemma-flavored. Nobody outside Apple knows for sure yet — wait for someone to
    actually decode it before betting on specifics.
  2. There's now a second, bigger model you can reach — for free.

That second point is the one I'd circle on the whiteboard.

The free cloud model is the real announcement

Here is the bit that changes what a small developer can ship: if your app has fewer than two
million downloads, you can call a larger, server-backed model running on Apple's private cloud
compute, at no cost to you.

Apple didn't say how big it is. My honest guess is somewhere in the 30B-to-a-few-hundred-B
parameter range, but that's a guess. What matters is that in practice, a ~30B-class model is a
completely different animal from the on-device one. It's the difference between "this feature
technically runs" and "this feature is actually useful."

Why does free matter so much? Because the single biggest stumbling block for an indie or small-team
AI feature has never been the code — it's the billing. The moment you wire up your own API keys to
OpenAI or anyone else, you've signed up to pay per token for every user, forever, which means you now
have to build metering, subscriptions, and abuse protection just to break even. Apple absorbing that
cost removes the wall.
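
To put a rough shape on that bill, here is a back-of-the-envelope sketch. Every number in it is a placeholder I made up for illustration (user count, requests, tokens per request, price per million tokens); plug in your own provider's pricing and your own usage.

```swift
// All inputs are placeholders for illustration, not real prices or usage data.
struct UsageAssumptions {
    let activeUsers: Double
    let requestsPerUserPerMonth: Double
    let tokensPerRequest: Double
    let dollarsPerMillionTokens: Double
}

// Monthly bill = total tokens processed / 1,000,000 * price per million tokens.
func monthlyCost(_ usage: UsageAssumptions) -> Double {
    let totalTokens = usage.activeUsers
        * usage.requestsPerUserPerMonth
        * usage.tokensPerRequest
    return totalTokens / 1_000_000 * usage.dollarsPerMillionTokens
}

let example = UsageAssumptions(
    activeUsers: 10_000,
    requestsPerUserPerMonth: 30,
    tokensPerRequest: 2_000,
    dollarsPerMillionTokens: 1.0
)
print(monthlyCost(example))
```

With these made-up inputs the total is 600 million tokens a month, so the bill is 600 dollars a month at the assumed price. The point is less the figure than the shape: it scales linearly with users, and it never stops.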

Concretely, this unlocks features that were previously "nice idea, can't afford it":

  • A habit/fitness app generating an AI summary of your month from the data already in its database.
  • A podcast app doing on-device-feeling intelligence without the developer eating an OpenAI bill.
  • AI features on older hardware — an iPhone with no on-device Apple Intelligence at all can still
    reach the cloud model.

And switching between local and cloud is reportedly close to a one-line change, which is exactly
the ergonomics you want when you're prototyping.

The honest caveat: this is subsidized, and subsidized things get un-subsidized. Nobody knows the
rate limits, and a free, server-backed model is an obvious target for abuse, so expect Apple to
clamp down or start charging eventually. My take: ship the feature now, design it so you can swap the
backend later, and don't build a business that only survives if Apple's free tier lasts forever.
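
One way to keep the backend swappable is to hide the model behind a small protocol of your own. The sketch below is plain Swift with stub backends; `TextGenerator`, `LocalStub`, `CloudStub`, and `MonthlySummary` are names I invented for illustration, not Apple API, and a real backend would wrap the actual (asynchronous, throwing) model call.

```swift
// Hypothetical seam between feature code and whichever model answers.
// None of these names come from Apple's frameworks.
protocol TextGenerator {
    func generate(prompt: String) -> String
}

// Stand-ins for real backends. Real calls would be async and throwing;
// they are kept synchronous here for brevity.
struct LocalStub: TextGenerator {
    func generate(prompt: String) -> String {
        return "local: " + prompt
    }
}

struct CloudStub: TextGenerator {
    func generate(prompt: String) -> String {
        return "cloud: " + prompt
    }
}

// Feature code depends only on the protocol, never on a concrete backend.
struct MonthlySummary {
    let generator: any TextGenerator

    func summarize(entries: [String]) -> String {
        let prompt = "Summarize this month: " + entries.joined(separator: "; ")
        return generator.generate(prompt: prompt)
    }
}

// Swapping the backend is a change at the construction site only.
let feature = MonthlySummary(generator: CloudStub())
let summary = feature.summarize(entries: ["ran 5 km", "slept 8 h"])
print(summary)
```

If the free tier ever changes, the only code that has to move is the type conforming to `TextGenerator`; the feature itself stays put.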

There's a Foundation Models CLI now

Filed under "things I did not have on my bingo card": Apple, the great GUI company, shipped a
command-line tool for Foundation Models (fm). It comes in the box on macOS, it has the now-standard
fancy terminal UI, and it gives you direct access to the models from the shell.

From what's been shown, there's very little login ceremony — it appears to lean on your iCloud+
status for some rate-limited pool of tokens. Predictably, the CLI is macOS-only (no iPhone), which is
its own little joke given the GUI-platform-ships-a-CLI inversion. The interesting open question for
developers: can you exercise the cloud model from the CLI during development, before you've shipped
an app, using your dev account? If so, that's effectively free AI compute on your machine for
experimentation. Worth testing the moment you install it.

The two-framework confusion: Foundation Models vs Core AI

Here's where Apple makes you work. There are two things in play and they're easy to conflate:

  • Foundation Models — the framework you use to call Apple's models (the on-device one and the
    free cloud one). This is what 95% of app developers want. It also gained a plugin mechanism so you
    can route to non-Apple server models through it.
  • Core AI — the new framework for running your own / arbitrary models on device. If you're
    the rare developer who trains a network and wants to put your weights (or a model you pulled off
    GitHub) onto the device, this is the path.

If you've ever wondered why Apple's on-device ML story feels like quicksand, it's because this is the
n-th framework for executing neural nets on device. We've cycled through a parade of them over the
years, each announced as "the one." Core AI is this year's "the one." Make your peace with the
possibility of Core AI 2 next year, and don't over-invest in the low-level path unless you genuinely
train models.

For the plugin mechanism: yes, everyone gravitates to the OpenAI API shape as the lingua franca, but
it's not a clean standard — reasoning/thinking tokens are handled differently across providers, and
even OpenAI doesn't really promote its own classic endpoint anymore. If you depend on a specific
third-party model, expect to write (and babysit) a plugin.
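
For reference, the common core of that "OpenAI shape" is small: a model name plus a list of role/content messages. The sketch below encodes just that core with `Codable`. The helper name and the "example-model" string are placeholders of mine, and this is not Apple's plugin interface, which I haven't seen documented. Anything to do with reasoning tokens is deliberately left out, because that is exactly the part that varies by provider.

```swift
import Foundation

// The widely copied chat-completions request core: a model name plus a list
// of role/content messages. Provider-specific extras are left out on purpose.
struct ChatMessage: Codable, Equatable {
    let role: String      // typically "system", "user", or "assistant"
    let content: String
}

struct ChatRequest: Codable, Equatable {
    let model: String
    let messages: [ChatMessage]
}

// Hypothetical helper: builds the JSON body a plugin might send upstream.
func makeRequestBody(model: String, userPrompt: String) throws -> Data {
    let request = ChatRequest(
        model: model,
        messages: [ChatMessage(role: "user", content: userPrompt)]
    )
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]  // stable key order for logging and diffs
    return try encoder.encode(request)
}

let body = try makeRequestBody(model: "example-model", userPrompt: "Hello")
let json = String(data: body, encoding: .utf8) ?? ""
print(json)
```

The babysitting starts where this sketch stops: each provider adds its own fields on top of this core, so a plugin usually ends up with a per-provider request and response type rather than one shared struct.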

The Siri / App Intents trap

Now the part that genuinely frustrates me. Apple's pitch for making your app "AI native" is roughly:
index all your data in Spotlight, hand it to Apple, and expose every action of your app as an
intent. Do that, and Siri can reason over your app.

The problem is the shape of the contract. You can declare a data model, but you don't fully control
the schema; you have to pour your app's data into Apple's pre-defined schemas. You can expose
intents, but again, only by mapping them onto Apple's pre-defined intents. This is the same road
Apple keeps walking, and it's exhausting.

You could see it in the demo. This year's sample app was an origami app — which suddenly behaved like a
messaging app for the Siri segment, because the messaging schema is the one that maps cleanly onto
Siri's intents. The natural demo for an origami app would be "Siri, how do I fold a crane?" → app
surfaces the folding instructions. That wasn't the demo, and it wasn't the demo for a reason: the
schema doesn't support it.

The developer takeaway, said plainly: just let us ship MCP servers for our apps. We already have
the pattern. Developers like the pattern. It's an easy pattern, and Apple has effectively implemented
the shape of it elsewhere. Forcing every app's semantics through a fixed catalog of intents and
schemas is the thing standing between "my app cooperates with the assistant" and "my app contorts
itself to fit a messaging metaphor."

There's also a real engineering worry lurking here, the same one you hit building any tool-using
agent: if Spotlight/Shortcuts exposes thousands of actions as tools, you flood the model's context.
Shortcuts has always suffered from "open it, see 8,000 actions, close it." Natural-language Shortcuts
("just tell it what you want") is a nice front door, but the tools-as-context problem underneath is
unsolved and worth watching.
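
A common mitigation in agent code generally is to shortlist tools before the model ever sees them. The sketch below does that with naive keyword overlap. It is my illustration of the idea, not how Spotlight or Shortcuts works, and `ToolDescriptor`, `keywords`, and `shortlist` are hypothetical names. A real system would more likely use embeddings than word matching.

```swift
// Hypothetical description of an action exposed to an assistant.
struct ToolDescriptor {
    let name: String
    let summary: String
}

// Lowercased words longer than two characters, split on any character
// that is not a letter or a digit.
func keywords(in text: String) -> Set<String> {
    let parts = text.lowercased().split { !($0.isLetter || $0.isNumber) }
    return Set(parts.map { String($0) }.filter { $0.count > 2 })
}

// Keep at most `limit` tools, ranked by how many keywords they share with
// the request. Tools with no overlap are dropped entirely.
func shortlist(tools: [ToolDescriptor], for request: String, limit: Int) -> [ToolDescriptor] {
    let wanted = keywords(in: request)
    let scored = tools.map { tool -> (tool: ToolDescriptor, score: Int) in
        let overlap = wanted.intersection(keywords(in: tool.name + " " + tool.summary))
        return (tool, overlap.count)
    }
    return scored
        .filter { $0.score > 0 }
        .sorted { lhs, rhs in
            // Higher score first; ties broken by name so the order is deterministic.
            lhs.score != rhs.score ? lhs.score > rhs.score : lhs.tool.name < rhs.tool.name
        }
        .prefix(limit)
        .map { $0.tool }
}

let catalog = [
    ToolDescriptor(name: "origami.showSteps", summary: "Show folding steps for a paper crane"),
    ToolDescriptor(name: "messages.send", summary: "Send a message to a contact"),
    ToolDescriptor(name: "timer.start", summary: "Start a countdown timer"),
]
let picked = shortlist(tools: catalog, for: "How do I fold a paper crane?", limit: 2)
print(picked.map { $0.name })
```

Here only the origami tool survives, because it is the only one sharing keywords ("paper", "crane") with the request. The trade-off is the usual one: filter too hard and the right tool never reaches the model, filter too little and you are back to 8,000 actions in context.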

What I'd actually do with this

If you build apps, here's the short list:

  • Adopt the free cloud model now for one concrete, previously-too-expensive feature — a monthly
    AI summary, a smart analysis over data you already store. Keep the integration thin so you can swap
    backends.
  • Use the one-line local↔cloud switch to gracefully degrade: local model where it's good enough,
    cloud where you need the brains, and still function on old hardware.
  • Wire up App Intents and Spotlight indexing because that's the only door Apple gives you to the
    assistant — but go in clear-eyed that you're fitting their schemas, not designing your own.
  • Stay out of Core AI unless you actually train models.
  • Plan for the subsidy to end. Treat free AI compute as a runway, not a foundation.
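
The graceful-degradation item in the list above can be sketched as an ordered fallback chain. Everything here is a stand-in: `SummaryBackend`, `StubBackend`, and `GenerationError` are hypothetical names, and the `isAvailable` flag substitutes for whatever availability check the real frameworks offer.

```swift
// Hypothetical error and backend types; not Apple API.
enum GenerationError: Error {
    case unavailable
}

protocol SummaryBackend {
    func summarize(_ text: String) throws -> String
}

struct StubBackend: SummaryBackend {
    let label: String
    let isAvailable: Bool  // stand-in for a real availability check

    func summarize(_ text: String) throws -> String {
        guard isAvailable else { throw GenerationError.unavailable }
        return "[\(label)] summary of \(text.count) characters"
    }
}

// Try each backend in preference order; if all of them fail, return a
// non-AI placeholder so the feature still renders something.
func summarizeWithFallback(_ text: String, backends: [any SummaryBackend]) -> String {
    for backend in backends {
        if let result = try? backend.summarize(text) {
            return result
        }
    }
    return "Summary unavailable"
}

// An older phone: no usable local model, but the cloud is reachable.
let oldPhone: [any SummaryBackend] = [
    StubBackend(label: "local", isAvailable: false),
    StubBackend(label: "cloud", isAvailable: true),
]
print(summarizeWithFallback("hello world", backends: oldPhone))
```

The order of the array is the policy: put the local backend first when it's good enough for the task, and put the cloud first for the features that need the bigger model.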

It's a tick year for the OS. For AI features, it's quietly the most useful WWDC for small developers in
a while — as long as you build for the day the free tier grows teeth.

Next in this series: Safari goes vibe-coding, with on-the-fly extensions, the genuinely useful
"Notify Me," and Apple's quiet turn toward background web-scraping.