Rumors are worth squat when they’re most likely put in motion by the people with a vested interest in this industry.
Let’s talk about profits when there’s real data from the IPO documentation.
You can make some educated guesses and put some bounds on inference cost by looking at third-party providers on platforms like OpenRouter. You can get a median cost per token for a given model size. Then make some educated guesses about SotA model sizes, and you get an estimate of the pure cost of serving a model. Error bars and all that, of course. But it's still a range, with some limits.
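
For what it's worth, that back-of-envelope estimate could look something like the sketch below. Everything in it is an assumption for illustration: the prices are made-up placeholders (not real OpenRouter quotes), the 70B reference size and the 200B to 1000B SotA guesses are hypothetical, and scaling cost linearly with parameter count is a crude simplification that ignores things like MoE sparsity, batching, and hardware differences.

```python
from statistics import median


def cost_range_per_mtok(provider_prices, ref_params_b, sota_low_b, sota_high_b):
    """Return a (low, high) estimate of serving cost in $ per million tokens.

    provider_prices: third-party prices ($/M tokens) for a model of ref_params_b
    billion parameters. ASSUMPTION: cost scales linearly with parameter count.
    """
    # Median price per billion parameters for the reference model size.
    per_billion = median(provider_prices) / ref_params_b
    # Scale to the guessed lower and upper SotA model sizes.
    return per_billion * sota_low_b, per_billion * sota_high_b


# Hypothetical placeholder prices for a 70B open-weights model, $/M tokens.
prices = [0.60, 0.80, 0.90, 1.20, 2.00]

# Hypothetical guesses for SotA model size: somewhere between 200B and 1000B.
low, high = cost_range_per_mtok(prices, 70, 200, 1000)
print(f"estimated serving cost: ${low:.2f} to ${high:.2f} per million tokens")
```

Since third-party providers presumably aren't selling below cost for long, their prices are more of a ceiling on serving cost than the cost itself, so the real range likely sits somewhat under whatever this spits out.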