Basic hardware troubleshooting

Moonstone Man
Level 16
Posts: 6054
Joined: Mon Aug 27, 2012 10:17 pm

Basic hardware troubleshooting

Post by Moonstone Man »

NOTE 1: Only steps 1), 8), 10) and 11) are suitable for laptops. The rest of the steps are for desktop units only.

NOTE 2: This is not an exhaustive list. If you think the tutorial could be enhanced by adding other tests and checks, please feel free to reply to this post. When replying, it is advisable to use the quote button " at the top right of the post you are going to reply to rather than clicking the Post Reply button at the bottom left of the post, and don't forget to trim the text, leaving the quote metadata in place; this ensures that I see your suggestion.

NOTE 3: In this tutorial I refer to crashes, but these steps are fully applicable to the machine hard locking, and especially to locking that might appear to be random.

NOTE 4: Before you suspect hardware and try these steps, post into the support forums and ask for help. There may be a software reason for your machine crashing or hanging so that should be eliminated first. The quickest way to determine if it might be a software issue is to boot from a live USB flash drive or DVD.

Desktop & laptop
1) If the machine boots, run this command in the terminal: dmesg --level=err and look for error messages. You shouldn't be able to miss them if there are any. On recent releases dmesg may refuse to run for an ordinary user; if it does, run it with sudo. If you don't know what the error messages are telling you, create a new thread in the support forums and include a copy/paste of the errors. If the same lines repeat many times, trim them; we don't need to see hundreds of repeated lines all saying the same thing. Also post the complete output of this command in your request for assistance: inxi -Fxxxzr. Always enclose pasted output in code tags: [code]output.here[/code]. You'll see the code tags icon </> when you post or reply in a thread.
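
For trimming repeated errors before posting, something like the following may help. It is only a sketch: the function name summarise_errors is made up for illustration, and it assumes each dmesg line starts with the usual bracketed timestamp.

```shell
#!/bin/sh
# Hypothetical helper: collapse repeated kernel error lines so the
# output is short enough to paste into a forum post.
# Reads dmesg-style lines on stdin, writes "count message" lines,
# most frequent first.
summarise_errors() {
    # Drop the leading "[   12.345678] " timestamp so identical
    # messages logged at different times compare as equal.
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' |
        sort | uniq -c | sort -rn |
        # Normalise uniq's padded count column to "count message".
        sed -E 's/^ *([0-9]+) /\1 /'
}

# Typical use (not run here; dmesg may need sudo):
#   sudo dmesg --level=err | summarise_errors
```

Keep in mind that dmesg only shows the current boot. If the machine crashed and was restarted, and the system keeps a persistent journal, the previous boot's kernel errors may be available with journalctl -k -b -1 -p err -o cat, which can be piped through the same function.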

Please don't post support questions in this thread.

Desktop only
2) Shut the machine down, remove the power and video cables, press the power on button to discharge any power stored in the PSU's capacitors, remove the video card completely, put it back and make sure it's firmly seated. Make sure NVMe and SATA connections are good.

If you have other devices plugged into your machine you should either reseat them all or remove them all except for the video card and try to boot. If it boots then you know that one of the other cards you removed is involved so replace the cards one by one and boot each time.

If the video card needs a 6 or 8-pin or multiple 6/8 pin connections, make sure they are provided, and if possible, make sure the multiple connectors do not connect to the same power rail, ie the connectors are on independent cables going to the power supply.

Motherboards usually require additional power from 4/6/8-pin connectors. Make sure these are connected. The power sockets are mostly located close to the CPU, and in the bottom left of the motherboard when looking at the back of the board. The back is where peripherals are plugged in.

Desktop only
Perform this step if the motherboard is not new:

3) While you've got the video card out, look for capacitor plague on the motherboard. You're looking for anything that looks like this:

[Image: three green capacitors on a motherboard]

Notice the three green caps. Two have distended tops and one has a flat top. Also note that the first green cap from the left has a brown, waxy substance on top. The cap with a flat top is ok, but the other two are plagued. If you see anything like that then your motherboard is the suspect. You can stop here if you find the plague on your motherboard because it needs replacing.

Reconnect the cables and try again.

Desktop only
4) If the machine crashes after doing 2) and 3), shut it down again, apply some pressure to the motherboard with two fingers in various places, then try again. You've applied enough pressure when you feel the board give by the tiniest amount. This is done in case there is an open circuit somewhere on the board. Making the board give just a tiny bit may temporarily close an open circuit, or it may make it worse, or it may do nothing at all. If the machine doesn't crash after this, there was a loose or open connection somewhere and it is now rectified, either permanently or temporarily. If it's temporary, the machine will often crash when it gets warmer. If applying pressure actually gets the machine to work, you'll just have to keep an eye on it.

If this is a newly built machine and not shop bought, make doubly-sure that there is sufficient thermal grease on the CPU, make sure the CPU is correctly oriented, and make sure there are no bent pins on the CPU. Also make sure there is no run-off thermal grease on the pins or pads of the CPU. Note that some CPUs have gold pads rather than pins.

NOTE: Don't void your warranty if this is a shop-bought build. When you've performed all of the tests, you may need to consider sending it back.

Desktop only
5) With the machine turned off, apply the same two-finger pressure to your RAM sticks or, alternatively, remove them and reseat them. Try to boot the machine again. As in step 4), if it doesn't crash after this, there was a loose or open connection somewhere and it is now rectified, either permanently or temporarily. If it's temporary, the machine will often crash when it gets warmer. If applying pressure actually works, you'll just have to keep an eye on it.

As a further check, you should remove all memory sticks except one and see if it stabilises. If it doesn't, replace that one stick with another. Rinse, repeat until all sticks have been individually tested.

Desktop only
6) This, along with the remaining steps, is where it can get expensive. Try a different video card. If need be, beg, borrow or temporarily steal one. You could monitor your GPU temperatures to see if they're spiking but that's beyond the scope of this tutorial.

Desktop only
7) The power supply might be faulty and require replacement. If this is the case then searching online for how to test the PSU won't help unless you have appropriate electronic test gear, so again, you may have to beg, borrow or steal a power supply to test this.

Desktop & laptop
8) Stress test the CPU and RAM.

Code:

# Install the stress utility
sudo apt-get install stress
# Load 12 CPU threads for 90 seconds
sudo stress --cpu 12 --timeout 90

Adjust the CPU stress command above to suit your CPU, ie change the value 12 to the number of threads that your CPU supports. Note that it is the number of threads, not the number of cores. If it doesn't crash after 90 seconds to 2 minutes, you can assume that at least the CPU, cooler and thermal grease are in working order. You could monitor your CPU temperatures to see if they're spiking but that's also beyond the scope of this tutorial. The point of stress-testing the CPU is to make it crash quicker than it might otherwise crash under normal usage, so it's best to boot the machine cold and then run this test. A cold boot is any power-on following a complete shutdown.
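
If you're not sure of your thread count, the adjustment can be sketched like this. The function name build_stress_cmd is made up for illustration, and it assumes the number of logical CPUs reported by nproc is the number of threads your CPU supports. It only prints the command; it doesn't run it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the stress package): print the
# stress command for a given thread count and timeout in seconds.
# With no arguments it falls back to nproc's count of logical CPUs
# and keeps the 90 second timeout used in the tutorial.
build_stress_cmd() {
    threads=${1:-$(nproc)}
    timeout=${2:-90}
    printf 'sudo stress --cpu %s --timeout %s\n' "$threads" "$timeout"
}
```

Running build_stress_cmd 12 90 prints the same command as above; with no arguments it fills in the thread count for the machine it is run on. Copy the printed line into the terminal once you're happy with it.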

Consider running prime95 or memtest86 for long periods, ie more than a few minutes and up to 12 or 24 hours.

Desktop only
9) If the machine crashes in step 8), carefully take the CPU out and clean off all of the thermal grease, also carefully, using methylated spirits and a small clean cloth. Be very careful not to bend any pins. Reseat the CPU, apply a matchstick head's worth of new thermal grease, clean the dust and muck out of the cooler, put it back and try again.

Desktop & laptop
10) Soak test the machine. Just leave it running for a full 24 hours, but turn off display power management (DPMS) and only engage the screen saver. For this test, when the screen saver kicks in, you can turn the display off manually. You want DPMS disabled because you don't want the video card to go to sleep. If the soak test fails, ie the machine crashes when doing almost nothing, you know by this stage that you've got deeper problems.
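
On an X11 session, the DPMS state can be checked from the terminal before starting the soak test. A minimal sketch, assuming xset is installed and that its status report contains a line reading "DPMS is Enabled" or "DPMS is Disabled"; the helper name dpms_state is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `xset q` output on stdin and print
# "Enabled" or "Disabled" according to its "DPMS is ..." line.
# Prints nothing if no such line is present.
dpms_state() {
    sed -nE 's/^ *DPMS is (Enabled|Disabled).*$/\1/p'
}

# Typical use on an X11 desktop (not run here):
#   xset -dpms            # turn display power management off
#   xset q | dpms_state   # should now print "Disabled"
```

The desktop's own power settings may switch DPMS back on, so it's still worth setting the screen to never turn off in the power management settings as well.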

Desktop & laptop
11) Take it to a computer repair technician, and prepare for the worst.
Last edited by Moonstone Man on Wed Aug 11, 2021 5:47 am, edited 14 times in total.
SimonPeter
Level 5
Posts: 579
Joined: Tue Jul 13, 2021 5:13 am

Re: Basic hardware troubleshooting

Post by SimonPeter »

Kadaitcha Man wrote: Thu Jul 29, 2021 8:14 am
Great tutorial.

BTW: inxi -Fxxxrz may be better.
Last edited by karlchen on Thu Jul 29, 2021 3:46 pm, edited 1 time in total.
Reason: Removed the full post quote of the post right above.
DAMIEN1307

Re: Basic hardware troubleshooting

Post by DAMIEN1307 »

Great tutorial KM...The only thing that i would suggest, is to highlight this one sentence in red along with "B" and "I", (Bold and Italicised), to try to avoid the support questions getting added to this...Just a thought I had and needed to grab it quick before it died of loneliness...lol...DAMIEN
Please don't post support questions in this thread.
Moonstone Man
Level 16
Posts: 6054
Joined: Mon Aug 27, 2012 10:17 pm

Re: Basic hardware troubleshooting

Post by Moonstone Man »

DAMIEN1307 wrote: Thu Jul 29, 2021 3:29 pm Great tutorial KM...The only thing that i would suggest, is to highlight this one sentence in red along with "B" and "I", (Bold and Italicised), to try to avoid the support questions getting added to this...
I would, but it's already in red at the top of the page. I could make it 300 pixels high and people would still post support questions here.
Every time I boot up the OS, the sound may work or it may not. I have no idea what could cause this. I already know that when I do not hear the OS "Greetings" sound - the problem is back. But for some reason some applications work. Spotify and Element work all the time; VLC works sometimes; Firefox and other applications don't work - I cannot hear anything. In Sound Settings though it tells me that "No application is currently playing or recording audio" regardless.

Any ideas?
Moderator fodder.
Moonstone Man
Level 16
Posts: 6054
Joined: Mon Aug 27, 2012 10:17 pm

Re: Basic hardware troubleshooting

Post by Moonstone Man »

SimonPeter wrote: Thu Jul 29, 2021 2:23 pm BTW: inxi -Fxxxrz may be better.
Done, thank you.
RollyShed
Level 8
Posts: 2434
Joined: Sat Jan 12, 2019 8:58 pm
Location: South Island, New Zealand

Re: Basic hardware troubleshooting

Post by RollyShed »

The Capacitor Plague -
https://en.wikipedia.org/wiki/Capacitor_plague

Note how long ago it started. It's been the bane of my life or what kept me employed, depending on how you look at it.
https://www.collinsdictionary.com/dicti ... glish/bane