Pretraining a frontier model is an enormous upfront expense; inference is the recurring one, paid every time anyone asks the model anything. That economics shapes the products you use: free tiers route you to fast, cheap models, daily caps exist because inference costs money, and architectures like mixture of experts exist largely to make inference cheaper.
It also explains why engines do not re-read the whole web for every query. Inference has a budget, so an AI answer pulls a handful of retrieved sources, not hundreds. Being one of the few clear, authoritative pages on your topic is worth more than being one of many vague ones, because only a few make it into the answer.