Soren

About

Soren builds autonomous agents to replace human engineers on manual AI evaluation tasks. Evals are essential for building reliable AI systems, yet they remain time-consuming. Teams spend hours maintaining their evals and digging through logs and traces to debug their systems. At scale, this level of manual work isn’t sustainable. Soren changes that with agents that work alongside your team. They reason across test cases and logs to pinpoint root causes, then run targeted experiments to surface better-performing solutions. New test cases are added whenever new behaviors are detected, so engineers can stop doing ad-hoc maintenance. We’re building a future where AI handles the work and humans provide oversight.

Founders

Kevin Xie, Founder

Kevin studied AI + Math at MIT before leaving at 19. During that time, he published two papers with Harvard, built one of the world’s largest LLM benchmarks, discovered new ways to break Chain-of-Thought reasoning, and was named a Neo Scholar Finalist. Now at Soren AI, he’s focused on building evals to make the future of AI more reliable. Previously, he worked on improving MRI techniques and played tennis, ranking in the top 150 nationally.