
Apparently a few months ago it became known on the Chinese internet that the 980 Pro, the 970 Evo Plus with the new controller, and OEM versions are prone to developing unreadable sectors, where the SMART 'Media and Data Integrity Errors' counter increases on every read attempt.

https://www.reddit.com/r/buildapc/comments/x82mwe/samsung_ss... https://www.reddit.com/r/DataHoarder/comments/x8arle/psa_sam...

How I came across this: I ran into this last week(!) on a 6-month-old drive -- but I'm not in China... hmm. So not just one bad batch? Interestingly, it's non-deterministic: the data is backed up, but when I try ddrescue, it occasionally succeeds at reading a few kilobytes from the 5 MB, spread over several runs of 512-16384 bytes, that can't be read or written. Curious to see what happens with a firmware update and secure erase.
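A rough way to check for the counter behavior described above is to compare the drive's SMART log before and after a read attempt. This is only a sketch: it assumes nvme-cli is installed, that its JSON output exposes the counter as `media_errors`, and that the device path (here `/dev/nvme0`) matches your system. The function names are made up for illustration.

```python
import json
import subprocess


def read_smart_log(device: str) -> str:
    """Return the raw JSON SMART log for an NVMe device.

    Assumes nvme-cli is installed and the caller has sufficient privileges.
    """
    result = subprocess.run(
        ["nvme", "smart-log", device, "--output-format=json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def media_error_delta(before_json: str, after_json: str) -> int:
    """Return how much the media error counter grew between two snapshots.

    Assumes the counter appears under the key "media_errors".
    """
    before = json.loads(before_json)
    after = json.loads(after_json)
    return int(after["media_errors"]) - int(before["media_errors"])
```

Usage would be something like: take `before = read_smart_log("/dev/nvme0")`, try reading the suspect file or region, take `after = read_smart_log("/dev/nvme0")`, then look at `media_error_delta(before, after)`. A positive value after every failed read would match the reports.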



PS: I'm one of the victims, with a 970 Evo Plus. The company that provides after-sales service, Lobcom, did not want to provide any RMA service and claimed nothing wrong was found.

The scamming company in question: https://zh.lobcomgroup.com/


My anecdata:

tl;dr: All 3 of my Samsung M.2 NVMe SSDs have failed in less than 3 years. 100% failure rate.

My first SSD was a 1TB Samsung 970 EVO. It failed after 2 years and 8 months. It was replaced under warranty with a 1TB 970 EVO Plus.

That replacement has now also failed after 1 year and 9 months.

I bought a 2nd 1TB 970 EVO Plus in May 2019. It has now also failed (2 years and 7 months).

Both are expected to be replaced under warranty.

The 2 970 EVO Plus SSDs clearly had hardware errors (that were not accurately reflected in SMART data) that caused everything from system hangs, game crashes to file corruption on OTHER drives. I couldn't believe it at first but after 5 days of testing and trial and error, I had it confirmed. As soon as I removed those SSDs, my PC was completely stable again.

In the meantime, I have bought a Kingston KC3000 1TB drive as I no longer trust Samsung M.2 NVMe SSDs. On the other hand, I have a Samsung EVO 850 SATA drive which has been rock-solid.


My anecdata: I have been running 4x 500GB Samsung 850 EVOs in RAID 0 continuously without failures since early 2015.


The article mentions issues with the 900-series drives. It seems like the 800-series are still rock solid (I've also been running them for a few years now without issue).


Unfortunately there have been recent issues with the 870 EVO series also: https://www.techpowerup.com/forums/threads/samsung-870-evo-b...

There may be multiple, different issues with Samsung parts at play here. The 900 series issues seem to have been addressed with a f/w update; the 870 EVO issues were - allegedly - caused by bad NAND and the devices needed to be replaced.

ofc part of the problem here is the lack of public acknowledgement / information from Samsung on these issues.


Similarly my M.2 NVMe 950 pro has been in an always on machine that gets a ton of use since 2016.


The parent posts mentioned 970 and 980, not 850.


Is it possible that your motherboard or PSU is killing the drives?

Could also just be sheer chance, of course.


How does this happen? Got any background info?


Poor voltage regulation from the motherboard or power supply could glitch the controller of the drive causing I/O errors or failures.


As an example, an old Asus board of mine has trouble with modern M.2 drives. A PCIe M.2 adapter solved the problem and the Samsung SSD worked without issues thereafter.


I've bought 6-8 m.2 Samsung 970 EVO Plus and 980s since 2018, and none have failed to date.

Anecdata is the worst, I'm sorry to hear about this happening to you. It's surely frustrating and upsetting.


Worth checking if you have any thermal issues with it. Mine failed in a similar way due to presumably a rookie mistake of forgetting to remove the thermal pad tape on the mobo.


It's not likely that thermal issues would cause bad reliability on these things. At worst you could expect intermittently bad performance. You can check for this condition with `nvme smart-log`. If your device was often overheated, it would have "critical composite temperature time" non-zero. My Samsung that has been in service for years and has no thermal solution has a value of 1 minute and I happen to know that is because I heated it with a hair dryer to find out what would happen if it crossed the critical temperature.
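For anyone who wants to script that check, here is a minimal sketch that reads the thermal counters out of the JSON form of the SMART log (`nvme smart-log <device> --output-format=json`). It assumes nvme-cli reports them under the keys `warning_temp_time` and `critical_comp_time`, both in minutes; the function name and the returned dictionary layout are made up for illustration.

```python
import json


def thermal_summary(smart_log_json: str) -> dict:
    """Summarize how long a drive has spent above its temperature thresholds.

    Assumes the JSON keys "warning_temp_time" and "critical_comp_time",
    both counted in minutes. Missing keys are treated as zero.
    """
    log = json.loads(smart_log_json)
    warning_minutes = int(log.get("warning_temp_time", 0))
    critical_minutes = int(log.get("critical_comp_time", 0))
    return {
        "warning_minutes": warning_minutes,
        "critical_minutes": critical_minutes,
        # Any non-zero critical time means the drive crossed its critical threshold.
        "overheated": critical_minutes > 0,
    }
```

Feeding it the output of the command above for a drive like the hair-dryer one would give `critical_minutes` of 1 and `overheated` set to true; a drive that never crossed the threshold should report zero.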


"I happen to know that is because I heated it with a hair dryer to find out what would happen if it crossed the critical temperature."

Ah this is a fantastic and true hacker mindset :)

Willing to tamper with fairly expensive equipment just for the heck of it.


Ha, interesting! Makes sense, the drive is supposed to just throttle itself before it can reach unsafe temps. I’ll def try to check, didn’t know the drive recorded that - thanks for the tip. In any case, now I know RMA is in order


The controller is thinner than the NAND flash, so it doesn't make proper contact with the thermal pad. I just discovered mine is affected by this. After heavy reading, the controller is at 67C while the NAND is at 42C.

https://www.youtube.com/watch?v=I8Z09nU554Q


Hmm, that still seems like it should be ok. Tjmax is usually over 100C (though for NANDs they recommend 70C I think)


My anecdata: I've had 5 Samsung SSDs and they've all performed great.

I'd point the finger at your PSU or motherboard. That's way too many failures for it to be the SSDs.

Samsung couldn't stay in business if that was a normal failure rate.


> that caused everything from system hangs, game crashes to file corruption on OTHER drives.

Interesting. Maybe my M2 (WD 570) is the cause for the hangs in my system. Thank you very much!


I can second EVO 850 SATA. Mine has been rock-solid since 2015.


My anecdata: I have an 840 Pro, 850, 850 EVO, 970 and 980 Pro, all still running after years.


My 980 Pro failed within two months of purchasing it in late 2022.


I wonder if Qvo are still subject to the same issues.


Hmm I'm going to need to check my Samsung ssd from oct 2021 that failed the first week of Jan 2023. I had started noticing some quirks in spring 2022 but it wasn't a super important drive so I ignored it.


I have a similar issue. It started failing mid last year. Then it got more and more frequent toward the end of the year. Last month I got tired of reinstalling the OS for the 4th time and got a new system.



