
I don't think a locally hosted LLM would be powerful enough for the supposed "agentic browsing" scenarios - at least if the browser is still supposed to run on average desktop PCs.




Not yet, but we’ll hopefully get there within at most a few years.

Get there by what mechanism? In the near term a good model pretty much requires a GPU, and it needs a lot of VRAM on that GPU. And the current state of the art in quantization has already gotten us most of the RAM savings it possibly could.
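Rough arithmetic to illustrate (a sketch assuming a dense model, ignoring KV cache and runtime overhead): weight memory scales linearly with bits per parameter, so the big absolute win was already FP16 to 4-bit, and each further halving saves less while quality drops off sharply:

    # Rough weight-memory footprint of a dense LLM at different quantization levels.
    # Illustrative only: ignores KV cache, activations, and runtime overhead.
    params_billions = 8  # hypothetical example size, e.g. an 8B-parameter model
    for bits in (16, 8, 4, 2):
        gb = params_billions * 1e9 * bits / 8 / 1e9
        print(f"{bits:>2}-bit: ~{gb:.0f} GB of weights")
    # ~16 GB, ~8 GB, ~4 GB, ~2 GB: most of the possible savings are already
    # captured by 4-bit, and going lower tends to hurt quality badly.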

And it doesn't look like the average computer with Steam installed is going to get above 8 GB of VRAM for a long time, let alone the average computer in general. Even focusing on new computers, it doesn't look that promising.


By Apple M-series and AMD Strix Halo. You don't actually need a GPU: if the manufacturer knows the use case will be running transformer models, a more specialized NPU coupled with the higher memory bandwidth of on-package RAM can do the job.

This will not result in locally running SOTA-sized models, but it could result in a percentage of people running 100B-200B models, which are large enough to do some useful things.
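For a rough sense of scale (a sketch assuming 4-bit quantized dense weights, ignoring KV cache and runtime overhead), models of that size land in the range that unified-memory machines can actually hold:

    # Back-of-the-envelope: memory needed for 100B-200B models at 4-bit.
    # Assumes dense weights only; KV cache and overhead add more on top.
    for params_billions in (100, 200):
        weights_gb = params_billions * 0.5  # 4 bits = 0.5 bytes per parameter
        print(f"{params_billions}B params: ~{weights_gb:.0f} GB of weights")
    # ~50 GB and ~100 GB respectively: plausible on 64-128 GB unified-memory
    # machines, but far out of reach of an 8 GB discrete GPU.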


Those also contain powerful GPUs. Maybe I oversimplified, but I considered them.

More importantly, it costs a lot of money to get that high a bus width before you even add the memory. There is no way things like the M Pro and Strix Halo take over the mainstream in the next few years.
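The bus width matters because single-stream decoding is largely memory-bandwidth-bound: every generated token streams the active weights through memory once. A crude upper bound, assuming a dense model with no batching or speculative decoding and illustrative bandwidth figures:

    # Crude ceiling on decode speed for a memory-bound dense model:
    # tokens/sec <= memory bandwidth / bytes read per token (~ weight size).
    weights_gb = 50  # e.g. a 100B model at 4-bit (see the sketch above)
    for name, bw_gbps in (("dual-channel DDR5", 90),
                          ("Strix Halo / M Pro class", 250),
                          ("high-end discrete GPU", 1000)):
        print(f"{name}: ~{bw_gbps / weights_gb:.0f} tokens/sec max")
    # Bandwidth numbers are rough; real-world throughput is lower still.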


This is probably their plan to monetize this: they will partner with an AI company to "enhance" the browser with a paid cloud model, and there's no monetary incentive for the local model not to suck.


