I do all of my “AI” development on top of AWS Bedrock, which hosts every available model except OpenAI's closed-source models, which are exclusive to Microsoft.
It’s extremely easy to write a library that makes switching between models trivial. I could add OpenAI support; it would just be slightly more complicated, because I'd need a separate set of API keys, whereas now I can use my AWS credentials.
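For example, a thin wrapper over Bedrock's Converse API is enough. A minimal sketch in Python using boto3; the converse call is the real API, but the model IDs below are just illustrative:

    import boto3

    # One client for every Bedrock-hosted model; auth is just my normal AWS credentials.
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

    def ask(model_id: str, prompt: str) -> str:
        # The Converse API gives a uniform request/response shape across providers,
        # so switching models is just a matter of passing a different model_id.
        response = bedrock.converse(
            modelId=model_id,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
        )
        return response["output"]["message"]["content"][0]["text"]

    # Same call, different vendors; the model IDs here are illustrative examples.
    print(ask("anthropic.claude-3-5-sonnet-20240620-v1:0", "Summarize this ticket."))
    print(ask("meta.llama3-70b-instruct-v1:0", "Summarize this ticket."))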
Also, latency would of course be theoretically worse, since by hosting on AWS and using AWS for inference you stay within the internal network (yes, I know to use VPC endpoints).
There is no moat around switching models, contrary to what Ben says.
"advertising would make ChatGPT a better product."
And with that, I will never read anything this guy writes again :)
Idk if I'm just holding it wrong, but calling Gemini 3 "the best model in the world" doesn't line up with my experience at all.
It seems to just be worse at actually doing what you ask.
It's like saying "Star Wars is the best movie in the world" - to some people it is. To others it's terrible.
I feel like it would be advantageous to move away from a "one model fits all" mindset and towards a world where we use different genres of models for different things.
Benchmark scores are becoming about as useful as Tomatometer movie scores: something can score high, but if it's not the genre you like, the high score doesn't guarantee you'll like it.
Outside of experience and experimentation, is there a good way to know what models are strong for what tasks?
It's a good model. Zvi also thought it was the best model, until Opus 4.5 was announced a few hours after he wrote his post:
https://thezvi.substack.com/p/gemini-3-pro-is-a-vast-intelli...
The analysis fails to mention that if TPUs take market share from Nvidia GPUs, JAX's software ecosystem would likely also take market share from the PyTorch+Triton+CUDA software ecosystem.
Not even Google thinks this will happen, given their insistence on offering TPU access only through their cloud.
As the OP points out, Google is now selling TPUs to at least some corporate customers.
They are not, though.
"the naive approach to moats focuses on the cost of switching; in fact, however, the more important correlation to the strength of a moat is the number of unique purchasers/users."
I was not able to find any research that posits that moat strength is determined by customer diversity.
I think customer diversity correlates instead with resilience.
I'm not suggesting I know better than the author, but I think they might be confusing a moat with a network effect.
It is indeed the case with social networks like Xwitter and Instagram that network size serves as a moat, but that's for different reasons.
Of course, ChatGPT, Claude, and friends might not have reached their final form and might yet rely on network effects in the future.