Spanish startup Multiverse Computing launches CompactifAI app and API portal, offering compressed models from OpenAI, Meta, DeepSeek, and Mistral that run locally without cloud infrastructure.
Multiverse Computing has launched the CompactifAI app and an API portal making its quantum-inspired compressed AI models publicly accessible. The company has compressed models from OpenAI, Meta, DeepSeek, and Mistral, including a model called Gilda that runs fully offline on mobile devices. It currently serves 100+ enterprise customers including Bank of Canada, Bosch, and Iberdrola. The startup raised a $215M Series B last year and is reportedly pursuing a €500M round at a €1.5B+ valuation.
Multiverse now offers an API portal for compressed versions of major models: the same architectures you already use, but runnable on-device or in constrained environments. That makes it a direct alternative to cloud inference for latency-sensitive or air-gapped deployments. The compression uses quantum-inspired tensor decomposition rather than standard quantization, which may yield a different trade-off between model size and capability degradation.
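The distinction matters because tensor-decomposition methods shrink the weight matrices themselves rather than merely lowering numeric precision. Multiverse has not published the details of its method, so the following is only a toy sketch of the general low-rank idea, using a truncated SVD on a synthetic weight matrix (the matrix sizes and kept rank are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "weight matrix": low intrinsic rank plus a little noise.
W = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))
W += 0.01 * rng.standard_normal(W.shape)

# Truncated SVD: keep only the top-r singular directions.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 16                  # kept rank (illustrative choice)
A = U[:, :r] * s[:r]    # 512 x r factor
B = Vt[:r, :]           # r x 512 factor

original = W.size                 # parameters before compression
compressed = A.size + B.size      # parameters after compression
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {original} -> {compressed} ({compressed / original:.1%})")
print(f"relative reconstruction error: {rel_err:.4f}")
```

Real tensor-network compression factors higher-order weight tensors (e.g. into matrix product operators) and can be combined with quantization on top, but the size-versus-fidelity knob is the same: the kept rank.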
Hit the CompactifAI API portal this week and benchmark their compressed DeepSeek or Mistral variant against your current cloud inference endpoint — measure tokens/second, latency, and output quality on your top 10 production prompts to determine if the switch eliminates your external dependency.
Go to multiverse-computing.com, access the CompactifAI API portal, and run this prompt against their smallest available model: 'Summarize the following paragraph in 3 bullet points: [paste any dense technical doc].' Compare the output quality and response time to your current provider in the same tab.
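The side-by-side comparison above can be scripted. A minimal timing harness might look like the sketch below; the `generate` callable is a stand-in for whatever client wrapper you write around each provider (neither portal's real SDK is assumed), and the whitespace token count is a rough proxy for a real tokenizer:

```python
import time

def benchmark(generate, prompts):
    """Time a prompt -> text callable over a list of prompts.

    Returns, per prompt, the wall-clock latency and a rough
    tokens/second figure (whitespace tokens as a proxy).
    """
    results = []
    for prompt in prompts:
        start = time.perf_counter()
        text = generate(prompt)
        latency = time.perf_counter() - start
        tokens = len(text.split())
        results.append({
            "prompt": prompt,
            "latency_s": latency,
            "tokens_per_s": tokens / latency if latency > 0 else 0.0,
        })
    return results
```

Run it once with a wrapper around your current cloud endpoint and once with one around the compressed variant, feeding both the same top-10 production prompts, then compare the two result lists alongside a manual quality check of the outputs.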