Open-source CLI tool exposes Apple's built-in Foundation Model on macOS 26 as an OpenAI-compatible local server — no keys, no cost.
A developer released 'apfel', an open-source tool that exposes Apple's SystemLanguageModel (shipped with macOS 26 Tahoe) as a CLI, an interactive chat, and an OpenAI-compatible HTTP server at localhost:11434. Apple ships this LLM on every Apple Silicon Mac as part of Apple Intelligence, but outside of Siri and first-party apps it is reachable only from app code via the FoundationModels Swift framework, not as a general-purpose API. Apfel wraps that framework in a server, making the model available to any HTTP client. No API keys, no billing, and all inference runs on the Neural Engine, fully offline.
Apfel turns the Apple Silicon Neural Engine into a free, always-available OpenAI-compatible endpoint at localhost:11434. Any existing code that calls the OpenAI SDK works instantly with a single base_url swap — no token budget, no rate limits, no cold starts. The caveat: macOS 26 is required, model capability is below GPT-4 class, and context window is limited — but for dev tooling, shell scripts, and privacy-sensitive pipelines, this is a legitimate zero-cost inference layer.
Swap base_url to localhost:11434 in your existing OpenAI SDK integration, then benchmark latency and quality against your current API on a real task you run daily. If it passes your quality bar, you have eliminated that API cost entirely.
Install apfel: brew install apfel (or follow the GitHub README), then run: apfel server
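For shell scripts and pipelines that should not pull in the OpenAI SDK, the same endpoint can be called with only the standard library. A sketch of the request shape, assuming the standard OpenAI chat-completions wire format; the model name "default" is again a placeholder assumption:

```python
# Sketch: call apfel's OpenAI-compatible endpoint with only the stdlib.
# The payload below is the standard chat-completions shape; the model
# name "default" is a placeholder assumption.
import json
import urllib.request

ENDPOINT = "http://localhost:11434/v1/chat/completions"

def build_request(prompt: str) -> urllib.request.Request:
    # Builds the POST request without sending it.
    payload = {
        "model": "default",  # placeholder; check apfel's docs
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def ask(prompt: str) -> str:
    # Sends the request to the local server (requires `apfel server` running).
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because there is no key and no billing, such a script can run on every shell invocation or git hook without cost concerns.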