Production Has Opinions

The bottleneck in AI is rarely the algorithm. It's everything around it: the data, the workflows, the ownership, the incentives. Getting something to actually work in production is still surprisingly hard.

Most organizations are not struggling with models.

They are struggling with systems.

Over the past decade, I have spent a lot of time building AI and data science systems inside large enterprises. One pattern keeps repeating. The tools keep getting better. The models keep getting better. The demos look more impressive every year.

But getting something to actually work in production, in real workflows, with real data, with real users, is still surprisingly hard.

The Bottleneck Is Rarely the Algorithm

The bottleneck is everything around it:

  • Messy, fragmented data that was never designed to support the kind of inference you need
  • Workflows that do not map cleanly to model inputs and outputs
  • Unclear ownership across teams, where the people who built the model are not the people who have to live with the results
  • Systems that were never designed for AI-driven decisions: built for reporting, not for acting
  • Incentives that reward pilots and discourage the kind of long-term platform investment that production actually requires

None of these are model problems. They are organizational and systems problems, and no amount of better algorithms fixes them.

What Generative AI Has Made More Visible

Generative AI has made this gap more visible.

The barrier to a working demo has never been lower. But building something reliable enough to support real decisions still takes platform thinking, not just model thinking.

The demo almost always works. Production has opinions. Production is where real users make real decisions, where edge cases accumulate, where latency matters, where the mismatch between what the model was trained on and what the world actually looks like starts to show.

Why This Led to Toutami

This gap is one of the reasons I started Toutami. Not because the models were not good enough, but because the layer between the model and the decision was missing.

Building that layer means treating reliability as a first-class concern from the start, not something you bolt on after the demo gets approved.

The interesting work is not getting the model to run.

It is making the system reliable enough that people are willing to depend on it.