I have a counterpoint from yesterday.

I looked up a medical term that is frequently misused (e.g. "retarded") and asked Gemini to compare it with similar conditions.

Because I have enough of a background in the subject matter, I could tell that it had put its answer together by mixing the many incorrect references in the training data with the far fewer correct ones.

I asked it for sources, and it failed to provide anything useful. But once I'm looking at sources anyway, I would be MUCH better off searching myself and reading only the sources that might actually be useful.

I was sitting with a medical professional at the time (who is not also a programmer), and he completely swallowed what Gemini was feeding him. He commented that he appreciates these summaries for letting him know when he is not up to date with the latest advances, and that he learned a lot from the response.

As an aside, I am not sure I appreciate that Google's profile would now associate me with that particular condition.

Scary!

This is just garbage in, garbage out. Would you be better off if I gave you an incorrect source? What about three incorrect ones? And a search engine would also associate you with this term now. Nothing you describe here seems specific to AI.

The issue is how terrible the LLM is at determining which sources are relevant. Whereas a somewhat informed human can be excellent at it. And unfortunately, the way search engines work these days, a more specific search query is often unable to filter out the bad results. And it’s worst for terms that have multiple meanings within a single field.

That word "somewhat" in "somewhat informed" is doing a lot of lifting here. That said, I do think that having a little curation in the training data probably would help. Get rid of the worst content farms and misinformation sites. But it'll never be perfect, in the same way that getting any content in the world today isn't perfect (and never has been).

It’s not even about content farms and misinformation. It’s about which of the results are even talking about the same topic at all. You should have seen what came up when I searched for info about doses of a medication that comes in multiple forms and is used for multiple purposes. Even though I specified the form and purpose I was interested in, the SERP was 95% about other forms and purposes, with only two topical results in the first two pages. (And yes, I tried the query in several forms with different techniques and got substantially the same results.) The AI summary, of course, didn’t distinguish which of those results were or were not relevant to the query, and thus was useless at best and dangerously misleading at worst.

Try the same with Perplexity?


