I have personal anecdotal evidence that they're getting more efficient: I've had the same 64GB M2 laptop for three years now. Back in March 2023 it could just about run LLaMA 1, a rubbish model. Today I'm running Mistral Small 3 on the same hardware and it's giving me a March-2023-GPT-4-era experience and using just 12GB of RAM.

People I trust in this space have talked consistently and credibly about these ongoing efficiency gains. I don't think this is a case of selling compute for less than it costs to run.