PC randomly crashing

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Sun Sep 01, 2019 10:08 pm

I don't think it is just a hardware failure: everything is like new and the PC does work properly some days. If the PC crashes and I reboot, it boots every time and then fails after about 15 minutes, while on other days it won't crash all day.
EDIT: I attempted to install Linux Mint 19.1 on my SSD, and it looks like the drive unmounted itself at the beginning of the process and lost the Linux partition :( Good thing I did a backup.
EDIT 2:
OK, so I booted from the SATA drive and now I'm transferring 65 GB of files to my external backup drive. I'm at 20 GB with no crashing; it seems the NVMe only malfunctions when I boot from it.
EDIT: the transfer failed at 25 GB but didn't crash

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Mon Sep 02, 2019 12:41 am

AndreZoio wrote:
Sun Sep 01, 2019 10:08 pm
EDIT: the transfer failed at 25 GB but didn't crash
That wouldn't be unexpected with the NVMe drive now secondary. You'd probably have seen the same messages in dmesg, but since the failing drive in that case doesn't contain the root fs, the system as such can continue. Expectation/advice still as in my last message.
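
For anyone wanting to check this on their own system, the kernel log can be narrowed down to NVMe-related lines. A minimal sketch, assuming a POSIX shell; the function name nvme_kernel_messages and the search pattern are made up for illustration and are not something from this thread:

```shell
# Hypothetical helper: keep only kernel log lines that mention NVMe or
# a generic I/O error. Reads log text on stdin.
nvme_kernel_messages() {
    grep -iE 'nvme|I/O error' || true   # "no match" is not treated as a failure
}

# Typical use (reading the kernel log may require sudo):
#   sudo dmesg | nvme_kernel_messages
```

Running it right after a failed transfer, before rebooting, gives the best chance of catching the relevant lines.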

74m3_G33k
Level 1
Posts: 13
Joined: Wed Dec 15, 2010 5:37 pm
Location: UK

Re: PC randomly crashing

Post by 74m3_G33k » Mon Sep 02, 2019 5:22 am

I have been having similar problems with occasional random crashes, and until I found your post just now I was completely stuck. I have a Samsung 850 EVO 500GB SATA SSD that is my boot drive for Mint 19. I had assumed the crashes had something to do with data corruption (I get a rolling screen of error messages similar to your 2nd screenshot), but e2fsck never turned up any problems.

The fact that you're having problems with a Samsung SSD and in Windows as well suggested a search for OS crashes with Samsung SSDs and not referencing Linux. It seems a lot of people are having similar problems on Windows and not just with Samsung SSDs, e.g. https://blogs.visigo.com/chriscoulson/ssd-freezing-fix/ & https://forum.corsair.com/v3/showthread ... 2&t=105812

It looks like this is a problem with SSDs not supporting an Intel chipset feature called Rapid Storage Technology (RST) and specifically a feature of RST called Link Power Management (LPM). There seem to be ways to disable this in Windows with registry hacks, but the only fix for Linux (at least for SATA devices; not sure about NVMe devices) is to disable AHCI for the device in the BIOS by setting it to IDE mode.

Now, I've not tried this yet! When I have time to back up my root partition and I don't need to use my main PC urgently, in case this borks the system completely, I will give it a go and post results here. Also something to bear in mind: switching to IDE mode (if that is meaningful with an NVMe device; I'm not sure it is) would likely disable automatic TRIM, so this might have to be done with a cron job, I suppose.

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Mon Sep 02, 2019 4:50 pm

Hello, how can I turn on this IDE mode? Is it in Linux or in the BIOS?

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Mon Sep 02, 2019 5:07 pm

Please note that this would be a very bad idea (if still possible at all these days; it'd surprise me). IDE instead of AHCI would for one disable TRIM support, something which you need to keep an SSD functioning well over time. It is moreover undoubtedly also not the issue; Link Power Management is implied when for example suspend/resume has issues, but not when the system crashes while transferring 60G of data such as above. And doubly not since this just started happening at some point.

If you must, there's a possibility your BIOS has an "(A)LPM" setting somewhere under an advanced menu; certainly do not pick IDE even if possible. It may not be a NICE conclusion but, save rummaging through / resetting BIOS settings as also advised somewhere above, your symptoms and their history leave very little room for anything but a hardware fault: SSD, motherboard or PSU. Note: IDE would change timing very significantly and could as such change symptoms, but even if they do for you, do not take that as proof of anything. IDE on a 960 Pro is like pouring gasoline in a Tesla.
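
As an aside for SATA drives: on Linux the link power management policy can be inspected without going into the BIOS, since AHCI hosts expose it under /sys/class/scsi_host. A minimal sketch, assuming a POSIX shell; show_lpm_policy is a made-up name, and the optional directory argument exists only so the function can be tried against a test directory:

```shell
# Hypothetical helper: print the SATA link power management policy of
# each AHCI host. $1 optionally overrides the sysfs directory to scan.
show_lpm_policy() {
    root="${1:-/sys/class/scsi_host}"
    for f in "$root"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue   # glob did not match, or host has no LPM file
        printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Typical use:
#   show_lpm_policy
```

Writing max_performance into one of those files as root switches LPM off for that port; the setting is not persistent across reboots. None of this applies to an NVMe drive such as the 960 Pro, which does not sit behind an AHCI host.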

[EDIT] A much later edit... Above I say "IDE instead of AHCI would for one disable TRIM support", but while looking for something else just now I happened upon an earlier test of mine showing that this is not in fact the case: viewtopic.php?t=281294#p1551312. This is not to say that using IDE rather than AHCI mode isn't a bad idea (e.g., NCQ support is AHCI-specific), but let's still edit this in, even if only in an effort to have myself remember this time...
Last edited by rene on Mon Sep 09, 2019 11:56 pm, edited 1 time in total.

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Mon Sep 02, 2019 5:20 pm

Well, I already contacted Samsung to get an RMA, and now I'm trying a method of reviving SSDs by power cycling them... I think the days the PC works fine are the ones where, the day before, I left the PC powered but turned off to charge my bike lights on the USB ports. I'll go biking and leave the PC turned on; if it doesn't crash then maybe it's the power cycling... As of now, 15 minutes and no crashing, while downloading lots of updates.
EDIT: couldn't it be some firmware or software issue with the SSD controller? Maybe Samsung or Windows botched something; that would explain why it suddenly stopped working and the erratic behavior. There was some kind of firmware issue for my SSD model in the past.

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Mon Sep 02, 2019 5:32 pm

Fundamentally it certainly could be an SSD-firmware issue, but you'd have had to knowingly apply that firmware update through Samsung's "Magician" or the like; i.e., the assumption is that you would have made the connection yourself. Moreover, while some of my replies may perhaps be short-ish, I have in fact been googling around for the issue; a 960 Pro is a very popular model, an enthusiast model even, and while you find issues for any piece of hardware you stick into Google, nothing I found for the 960 Pro looks like a known issue similar to yours.

You may have noticed (partly by me saying so a few times...) that I'm still mostly suspicious of power delivery to the drive, perhaps more than the drive proper; I as such hope that the RMA works out...

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Mon Sep 02, 2019 6:08 pm

Oh, I should by the way add that I did find some overheating issues with the 960 Pro but when I earlier asked about its temperature and/or it being situated in enough airflow that wasn't really answered. Later then, inxi -s (was in fact useful but) did not include the SSD sensor. If you have "hddtemp" installed then inxi -Dx or sudo inxi -Dx or simply directly sudo hddtemp /dev/nvme0n1 would be interesting. And especially if you see it getting real high at the moment of trouble...

[EDIT] It was answered by you specifying the PCIe adapter with fan. So likely not the issue, but still an interesting probe...

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 2:15 pm

I checked the temps of the SSD; they never get above about 28°C, and if I touch the SSD with my finger when it crashes, it isn't very hot. I searched a little and it seems Windows users complained that their PCs froze because of the 960 Pro, and that it was a Windows update from October that caused it. I'm going to update Windows as a last resort, and if it doesn't resolve itself, I'll just RMA the SSD and meanwhile I'll use my notebook's SSD to work :( That's what I get for buying high-end components, I guess...

thx-1138
Level 7
Posts: 1922
Joined: Fri Mar 10, 2017 12:15 pm
Location: Athens, Greece

Re: PC randomly crashing

Post by thx-1138 » Tue Sep 03, 2019 3:30 pm

...googling for "failed to set APST feature (-19)" (from your screens above), returns quite a few similar bug reports...
it appears one might get around this by passing nvme_core.default_ps_max_latency_us=0 as a kernel parameter in grub?

Doesn't cost you anything trying that out obviously, but please do take this with more than a grain of salt though,
as that might turn out to eventually just be a further side-effect of something else altogether
(even more if you say such happens under Windows as well).

Edit: maybe also try researching / googling "nvme_core.default_ps_max_latency_us=0" "samsung 960"... :|
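
On Mint and Ubuntu, kernel parameters like this normally go into the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub, followed by sudo update-grub and a reboot. A minimal sketch of the edit, assuming sed and a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line; add_nvme_latency is a made-up helper that only filters text and changes nothing on disk:

```shell
# Hypothetical helper: read a grub defaults file on stdin and print it with
# nvme_core.default_ps_max_latency_us set to $1 (microseconds) on the
# GRUB_CMDLINE_LINUX_DEFAULT line. Any earlier value is replaced.
add_nvme_latency() {
    param="nvme_core.default_ps_max_latency_us"
    sed \
        -e "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/ *${param}=[0-9]*//" \
        -e "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/\"\$/ ${param}=$1\"/"
}

# Preview the change without modifying anything:
#   add_nvme_latency 0 < /etc/default/grub | diff /etc/default/grub -
```

If the diff shows only the intended one-line change, the same edit can be made in /etc/default/grub (after keeping a copy), followed by sudo update-grub and a reboot.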

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 3:44 pm

I tried that grub latency trick, but it doesn't get better :( Also, to the other guy... I took the SSD off just after a crash and it does get hot, but not very hot... However, it is strange that the temp sensor always displays low temps; maybe the sensor burned out and the SSD crashed because of the heat?

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 3:47 pm

rene wrote:
Mon Sep 02, 2019 6:08 pm
Oh, I should by the way add that I did find some overheating issues with the 960 Pro but when I earlier asked about its temperature and/or it being situated in enough airflow that wasn't really answered. Later then, inxi -s (was in fact useful but) did not include the SSD sensor. If you have "hddtemp" installed then inxi -Dx or sudo inxi -Dx or simply directly sudo hddtemp /dev/nvme0n1 would be interesting. And especially if you see it getting real high at the moment of trouble...

[EDIT] It was answered by you specifying the PCIe adapter with fan. So likely not the issue, but still an interesting probe...
andre@andre-System-Product-Name:~$ inxi -D
Drives:
Local Storage: total: 588.73 GiB used: 8.16 GiB (1.4%)
ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 960 PRO 512GB
size: 476.94 GiB
ID-2: /dev/sda vendor: Kingston model: SV300S37A120G size: 111.79 GiB
andre@andre-System-Product-Name:~$ sudo hddtemp /dev/nvme0n1
[sudo] password for andre:
ERROR: /dev/nvme0n1: can't determine bus type (or this bus type is unknown)

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Tue Sep 03, 2019 3:50 pm

thx-1138 wrote:
Tue Sep 03, 2019 3:30 pm
(even more if you say such happens under Windows as well).
Yes, the system having been historically stable, together with the absence of a deluge of screaming 960 Pro users on new kernels, meant I didn't think it would help. 19 is ENODEV; i.e., it looks like an error you'd get when something else caused the drive to disconnect first. My money's still quite firmly on a blown capacitor -- although without interchangeable parts I wouldn't know how to test where, between drive, motherboard and PSU.

As to hddtemp; ah, crap. Apparently not useful for NVMe. Never mind; your finger was a perfect sensor...

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 3:53 pm

Managed to get a SMART report.
I just don't understand: everything seems OK with the drive, the Magician software says it is OK, and it works well until it crashes. I've run out of ideas to try :( I think I'll just wait for the answer from Samsung and get a new one.

sudo nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning : 0
temperature : 31 C
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 0%
data_units_read : 6.777.003
data_units_written : 5.161.368
host_read_commands : 161.531.212
host_write_commands : 111.780.796
controller_busy_time : 638
power_cycles : 1.678
power_on_hours : 550
unsafe_shutdowns : 596
media_errors : 0
num_err_log_entries : 246
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 31 C
Temperature Sensor 2 : 33 C
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
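
A report like this can be boiled down to the handful of counters that most directly point at a failing drive. A minimal sketch, assuming awk and the "key : value" layout shown above; smart_summary is a made-up name and the choice of fields is only an illustration:

```shell
# Hypothetical helper: read `nvme smart-log` text on stdin and print a few
# selected counters, one "name=value" per line, in input order.
# Only digits are kept from each value, so locale thousands separators
# (such as the "." in "1.678") are dropped.
smart_summary() {
    awk -F':' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the field name
            gsub(/[^0-9]/, "", val)            # keep digits only
        }
        key == "critical_warning" || key == "media_errors" ||
        key == "num_err_log_entries" || key == "unsafe_shutdowns" {
            print key "=" val
        }
    '
}

# Typical use:
#   sudo nvme smart-log /dev/nvme0n1 | smart_summary
```

For the report above this prints critical_warning=0, unsafe_shutdowns=596, media_errors=0 and num_err_log_entries=246, one per line. The non-zero error-log count could be looked at more closely with sudo nvme error-log /dev/nvme0n1.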

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Tue Sep 03, 2019 3:58 pm

I also get the "unsafe shutdowns" count on some systems, and it seems not to matter in practice; I have always skipped investigating it because of that. Otherwise the report is saying "all is perfect".

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 4:14 pm

Well, that's it then, maybe there's no solution but to RMA it :( At least I managed to recover all the data from the drive... Maybe call someone who's an expert on those SSDs? This just shouldn't be happening D:
Now I'm working on the laptop's SSD and it works like a charm, just not as fast as the Samsung :(

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Tue Sep 03, 2019 4:31 pm

As far as I've been able to see, any APST issues (as implied from gm10's edited-in link; https://bugs.launchpad.net/ubuntu/+sour ... ug/1678184) should be long gone / quirked-around in Ubuntu's 4.15 kernels that you are using so, yah, together with Windows also being implied, I wouldn't think that further software-tries have high potential. Although going by that link I'd personally still try more settings for the latency parameter: 0, 250, 3000, 6000...
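
When cycling through several values like that, it is easy to lose track of which one the running kernel actually booted with. A minimal sketch for checking, assuming a POSIX shell; active_nvme_latency is a made-up name:

```shell
# Hypothetical helper: print the nvme_core.default_ps_max_latency_us value
# found on a kernel command line read from stdin. Prints nothing if the
# parameter is absent; if it is given more than once, the last one is shown.
active_nvme_latency() {
    tr ' ' '\n' | sed -n 's/^nvme_core\.default_ps_max_latency_us=//p' | tail -n 1
}

# Typical use after rebooting with a new value:
#   active_nvme_latency < /proc/cmdline
# The value the driver is using should also be readable here, if present:
#   cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
```

Noting the value next to each test run makes it possible to tell afterwards whether any setting changed how long the system stayed up.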

But hardware can break; also the good hardware that you have. I really wouldn't feel in any way bad about your purchase; it's a damn nice set of board, CPU and drive. I'd be a little wary of sending in the drive without definitive confirmation the issue's there (no family, friends, enemies with a compatible board you can test it in?) but, well, I don't know what else to say either...

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 4:49 pm

I have a friend who recently bought a high-end notebook; I'll try installing the NVMe there and see if it also crashes... I found on Samsung's site an old firmware installer for the M.2 that downgrades it from version 4 to version 2. Maybe the firmware memory got corrupted and that's what is causing the crashes? Or is there a risk that I'll fully brick it?

rene
Level 12
Posts: 4172
Joined: Sun Mar 27, 2016 6:58 pm

Re: PC randomly crashing

Post by rene » Tue Sep 03, 2019 5:02 pm

There's usually good reason to up- rather than downgrade firmware and, sure, there's always some risk involved. I.e., try not to have a power outage while updating firmware -- and come to think of it... a drive that seems to disconnect from the bus as a matter of course might be especially "exciting" in that sense. But I'd myself probably go ahead and try. Samsung will RMA it... :)

AndreZoio
Level 1
Posts: 22
Joined: Sun Sep 01, 2019 4:47 pm

Re: PC randomly crashing

Post by AndreZoio » Tue Sep 03, 2019 5:17 pm

Well, no use: the program won't recognize the SSD to update it, so I'll just wait for my friend and test it on another PC.
