
Ever had to troubleshoot bit flips on a non-ECC system? One friend felt like he was going crazy as over the course of two months his system degraded from occasional random errors to random crashes, blue screens and finally to no POST. Another time, a coworker had to stare at raw bytestreams in Wireshark for hours to find a consistently flipped bit.
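For the Wireshark case, a small script can do the staring for you. This is a rough sketch, assuming you can export a known-good copy and the received copy of the same payload as raw bytes; `flipped_bits` and `flip_histogram` are made-up helper names, not part of any tool.

```python
from collections import Counter

def flipped_bits(expected: bytes, observed: bytes):
    """Yield (byte_offset, bit_index) for every bit that differs."""
    for offset, (a, b) in enumerate(zip(expected, observed)):
        diff = a ^ b  # set bits mark the positions that changed
        for bit in range(8):
            if diff & (1 << bit):
                yield offset, bit

def flip_histogram(expected: bytes, observed: bytes) -> Counter:
    """Count flips per bit index; one dominant index hints at a stuck line."""
    return Counter(bit for _, bit in flipped_bits(expected, observed))

# Synthetic data: bit 4 is flipped in every 16th byte, starting at offset 5.
good = bytes(range(64))
bad = bytes(b ^ 0x10 if i % 16 == 5 else b for i, b in enumerate(good))

print(list(flipped_bits(good, bad)))
print(flip_histogram(good, bad))
```

If the histogram is dominated by one bit index and the offsets repeat at a fixed stride, that points at a consistently flipped bit rather than random noise.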


Don't overclock your memory.


All of these were with stock, non-XMP clocks.


Well then… test your memory :)


How often do you test your memory? The nice thing about ECC is it's always testing your memory, and (if it's set up properly!) you'll get notified when it begins to fail. Without ECC, your memory may begin to fail, and you'll have to deal with the consequences between when it starts to fail and when you detect it.

(Of course, I don't run ECC on my personal systems, but at least I'm wandering knowingly into the abyss)
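As for "set up properly": on Linux, the kernel's EDAC subsystem exposes corrected and uncorrected error counters in sysfs when an ECC-capable memory controller driver is loaded. A minimal sketch of polling them, assuming the usual `/sys/devices/system/edac/mc` layout; `read_edac_counts` is a hypothetical helper, and how you alert on the numbers is up to you.

```python
from pathlib import Path

# Assumed default location of the EDAC memory-controller entries.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")

def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: (corrected, uncorrected)} from ce_count/ue_count."""
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts  # no EDAC support, or the driver is not loaded
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers we cannot read or parse
        counts[mc.name] = (ce, ue)
    return counts

if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("no EDAC memory controllers found")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing is the early warning; an empty result usually means you are not being notified of anything.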


Testing your memory detects if you have bad RAM, which ECC isn't going to help with anyway. Perfectly fine memory will experience random bit flips from environmental factors. Your PC components and UPS also degrade over time and can cause random bit flips. ECC is there to catch problems as they happen and ideally take corrective action before bad data propagates.


> Ever had to troubleshoot bit flips on a non-ECC system?

No.

> One friend felt like he was going crazy

Tell him about memtest86.


Wow I came back to post this exact reply. I set my system to a slightly high frequency, ran memtest overnight with errors.

Set it back down to a supported frequency, ran a full memtest suite again with no errors.

Never had any issues since.


> Wow I came back to post this exact reply. I set my system to a slightly high frequency, ran memtest overnight with errors.
>
> Set it back down to a supported frequency, ran a full memtest suite again with no errors.

Cool. You tested your memory at some point in the past.

How do you know it's still working properly and hasn't flipped any bits?

You don't, because you have no practical way of testing the integrity of the data without running an intrusive tool like memtest86 that basically monopolizes the computer.

Being able to detect these types of memory errors at a hardware level while the processor is doing other things is the fundamental capability that ECC gives you that you otherwise wouldn't have, no matter how thoroughly you run memtest86.
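To make "detect at a hardware level" concrete, here is a toy Hamming(7,4) code in software. It is only an illustration of the idea: ECC DIMMs typically use a wider SECDED code (64 data bits plus 8 check bits) computed by the memory controller, not this exact scheme. `encode` and `decode` are illustrative names.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                             # index 0 unused
    code[3], code[5], code[6], code[7] = d     # data bits
    for p in (1, 2, 4):                        # parity bits
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    return code[1:]

def decode(bits: list) -> tuple:
    """Return (nibble, syndrome); the syndrome is the corrected position, 0 if clean."""
    code = [0] + list(bits)
    syndrome = 0
    for p in (1, 2, 4):
        if sum(code[i] for i in range(1, 8) if i & p) % 2:
            syndrome |= p
    if syndrome:
        code[syndrome] ^= 1                    # flip the bad bit back
    d = (code[3], code[5], code[6], code[7])
    return sum(bit << i for i, bit in enumerate(d)), syndrome

word = encode(0b1011)
word[4] ^= 1          # simulate a bit flip at position 5
print(decode(word))   # (11, 5): data recovered, and the flip was noticed
```

The point is the second return value: the check runs on every read, so the error is both fixed and reported while the machine keeps doing its normal work.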


You likely wouldn't know if you had random bit flips. It'd manifest as silent data corruption. You might be okay with that. Others aren't.

It's not a matter of overclocking. Bit flips are a fact of life when running with 32+ GB of RAM. Leaving your machine on 24/7 (even if in sleep) stacks the odds against you.


Obviously this is just an anecdote, but I have a work laptop with 128GB of non-ECC RAM, use all of it every day and have never noticed any issues. I'm not saying there aren't any, but it just... works.


You have silent bit flips; they silently corrupted data instead of causing a visible error.



