Exla
Winter 2025 · Active
An SDK to run transformer models anywhere
Founded: 2025
Team size: 0
About
Exla aggressively quantizes AI models to minimize memory usage and maximize inference speed. Whether you're deploying LLMs, VLMs, VLAs, or custom models, Exla reduces memory footprint by up to 80% and accelerates inference by 3–20x, all with just a few lines of code.
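To give a sense of where the memory savings come from, here is a minimal sketch of post-training int8 quantization, the general technique behind claims like the one above. This is an illustrative example, not Exla's actual SDK or API: float32 weights (4 bytes each) are mapped to 8-bit integers (1 byte each) with a shared scale, a 75% reduction in storage before any further compression.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus a scale.

    Each quantized value fits in one byte instead of the four bytes a
    float32 weight occupies, so storage shrinks by roughly 75%.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weight tensor for demonstration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

Aggressive schemes go further (4-bit or mixed precision, per-channel scales, calibration on sample data), trading a little accuracy for more memory savings and faster integer arithmetic on the target hardware.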
https://cal.com/exla-ai/schedule
Founders
Pranav Nair, Co-Founder
CTO at Exla. Previously a kernel engineer at Apple, leading sleep/hibernation for all Apple devices. B.S. in Computer Science from Purdue.
Viraat Das, Founder
CEO at Exla. Previously a machine learning engineer at Amazon.