Show HN: We Built a Small LLM Comparison Page and Accidentally a Platform

1 point by sistillisteph 5 hours ago

My cofounder and I built a quick site comparing LLMs, basically “what model is good at what.” It was a fun and useful side project, but the feedback we got pushed us toward A/B testing models in production.

So we built Fallom (a homage to Asimov), a platform where you can compare how multiple models perform on your own evals or production data. You can see the cost and performance differences side by side and decide whether switching models is worth it.

Would love feedback from anyone who’s built internal model testing pipelines. We learned a lot the hard way and are still learning.