Spanish startup Multiverse Computing launches CompactifAI app and API portal, offering compressed models from OpenAI, Meta, DeepSeek, and Mistral that run locally without cloud infrastructure.
Multiverse Computing has launched the CompactifAI app and an API portal making its quantum-inspired compressed AI models publicly accessible. The company has compressed models from OpenAI, Meta, DeepSeek, and Mistral, including a model called Gilda that runs fully offline on mobile devices. It currently serves 100+ enterprise customers including Bank of Canada, Bosch, and Iberdrola. The startup raised a $215M Series B last year and is reportedly pursuing a €500M round at a €1.5B+ valuation.
Multiverse now offers an API portal for compressed versions of major models: the same architectures you already use, but runnable on-device or in constrained environments. That makes it a direct alternative to cloud inference for latency-sensitive or air-gapped deployments. The compression uses quantum-inspired tensor decomposition rather than standard quantization, which may yield a different trade-off between model size and capability degradation.
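The distinction matters because tensor-decomposition methods shrink the weight matrices themselves rather than merely lowering numeric precision. Multiverse has not published the details of its method, so the following is only a toy sketch of the general low-rank idea, using a truncated SVD on a synthetic weight matrix (the matrix sizes and kept rank are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "weight matrix": low intrinsic rank plus a little noise.
W = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))
W += 0.01 * rng.standard_normal(W.shape)

# Truncated SVD: keep only the top-r singular directions.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 16                  # kept rank (illustrative choice)
A = U[:, :r] * s[:r]    # 512 x r factor
B = Vt[:r, :]           # r x 512 factor

original = W.size                 # parameters before compression
compressed = A.size + B.size      # parameters after compression
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {original} -> {compressed} ({compressed / original:.1%})")
print(f"relative reconstruction error: {rel_err:.4f}")
```

Real tensor-network compression factors higher-order weight tensors (e.g. into matrix product operators) and can be combined with quantization on top, but the size-versus-fidelity knob is the same: the kept rank.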
Hit the CompactifAI API portal this week and benchmark their compressed DeepSeek or Mistral variant against your current cloud inference endpoint — measure tokens/second, latency, and output quality on your top 10 production prompts to determine if the switch eliminates your external dependency.
Go to multiverse-computing.com, access the CompactifAI API portal, and run this prompt against their smallest available model: 'Summarize the following paragraph in 3 bullet points: [paste any dense technical doc].' Compare the output quality and response time to your current provider in the same tab.
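The side-by-side comparison above can be scripted. A minimal timing harness might look like the sketch below; the `generate` callable is a stand-in for whatever client wrapper you write around each provider (neither portal's real SDK is assumed), and the whitespace token count is a rough proxy for a real tokenizer:

```python
import time

def benchmark(generate, prompts):
    """Time a prompt -> text callable over a list of prompts.

    Returns, per prompt, the wall-clock latency and a rough
    tokens/second figure (whitespace tokens as a proxy).
    """
    results = []
    for prompt in prompts:
        start = time.perf_counter()
        text = generate(prompt)
        latency = time.perf_counter() - start
        tokens = len(text.split())
        results.append({
            "prompt": prompt,
            "latency_s": latency,
            "tokens_per_s": tokens / latency if latency > 0 else 0.0,
        })
    return results
```

Run it once with a wrapper around your current cloud endpoint and once with one around the compressed variant, feeding both the same top-10 production prompts, then compare the two result lists alongside a manual quality check of the outputs.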