Hardware Error 0 then proper number, long boot

Post by baku85 »

Code: Select all

[ 0.990451] kernel: Freeing SMP alternatives memory: 40K
[ 0.994334] kernel: smpboot: CPU0: Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz (family: 0x6, model: 0x55, stepping: 0x4)
[ 0.994475] kernel: mce: [Hardware Error]: Machine check events logged
[ 0.994478] kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 12: c802444000200e0f
[ 0.994483] kernel: mce: [Hardware Error]: TSC 0 MISC 40000000000000
[ 0.994488] kernel: mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1646677545 SOCKET 0 APIC 0 microcode 2006b06
[ 0.994631] kernel: Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
[ 0.994640] kernel: ... version: 4
[ 0.994640] kernel: ... bit width: 48
[ 0.994641] kernel: ... generic registers: 4
[ 0.994642] kernel: ... value mask: 0000ffffffffffff
[ 0.994642] kernel: ... max period: 00007fffffffffff
[ 0.994643] kernel: ... fixed-purpose events: 3
[ 0.994644] kernel: ... event mask: 000000070000000f
[ 0.994745] kernel: rcu: Hierarchical SRCU implementation.
[ 1.004406] kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[ 1.005495] kernel: smp: Bringing up secondary CPUs ...
[ 1.005608] kernel: x86: Booting SMP configuration:
[ 1.005609] kernel: .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19
[ 1.160262] kernel: .... node #1, CPUs: #20
[ 0.001872] kernel: smpboot: CPU 20 Converting physical 0 to logical die 1
[ 1.254132] kernel: mce: [Hardware Error]: Machine check events logged
[ 1.254135] kernel: mce: [Hardware Error]: CPU 20: Machine Check: 0 Bank 5: c802444000200e0f
[ 1.254141] kernel: mce: [Hardware Error]: TSC 0 MISC 40000000000000

Code: Select all

uname -r 
5.4.0-100-generic
My system takes a very long time to start. The messages above are what I see in the logs and on screen before it boots. What could be the reason?

I am currently on a fresh Linux Mint install, but I had the same problem on Ubuntu 20.04.
Last edited by karlchen on Thu Jun 22, 2023 5:04 am, edited 2 times in total.
Reason: inserted missing linebreaks into the boot messages

Post by deepakdeshp »

Stress-test your CPU. I suggest trying stress-ng with the cpu stressor and verify mode enabled:

Code: Select all


sudo apt-get install stress-ng
stress-ng --cpu 0 --verify --verbose --timeout 5m  
If I have helped you solve a problem, please add [SOLVED] to your first post title, it helps other users looking for help.
Regards,
Deepak

Mint 21.1 Cinnamon 64 bit with AMD A6 / 8GB
Mint 21.1 Cinnamon AMD Ryzen3500U/8gb

Post by SMG »

baku85 wrote: Tue Mar 08, 2022 2:47 am currently I am on fresh linux mint, but had the same on ubuntu 20.04
Welcome to the forum, baku85.

MCE stands for machine check exception.
These messages can indicate a hardware problem, which may come from the hardware itself or from the settings used in the BIOS/UEFI.
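
To pull just the machine-check records out of a long kernel log, something like the following might help. It is only a minimal sketch: `mce_records` is a made-up helper name, and the pattern assumes the message layout shown in the first post (CPU number, bank number, status word).

```shell
# Read kernel log text on stdin and print one summary line per
# "Machine Check" record. mce_records is a hypothetical helper name;
# the pattern assumes the log layout quoted in this thread.
mce_records() {
  sed -n 's/.*\[Hardware Error]: CPU \([0-9][0-9]*\): Machine Check: [0-9][0-9]* Bank \([0-9][0-9]*\): \([0-9a-f][0-9a-f]*\).*/CPU \1 bank \2 status \3/p'
}
```

It could be fed from the current boot's kernel messages, for example `journalctl -k -b | mce_records` or `sudo dmesg | mce_records`.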

Please give us information about your install by entering this command in a terminal: inxi -Fxxxrz
Click </> from the mini toolbar above the textbox where you type your reply and then place your cursor between the code tags and paste the results of the command between the code tags [code]Results[/code]. This will let us know how Mint sees your hardware.

Post by baku85 »

thanks for the reply!

here you go:

Code: Select all

System:    Kernel: 5.4.0-100-generic x86_64 bits: 64 compiler: gcc v: 9.3.0 Desktop: Cinnamon 5.2.7 wm: muffin 5.2.1 
           dm: LightDM 1.30.0 Distro: Linux Mint 20.3 Una base: Ubuntu 20.04 focal 
Machine:   Type: Desktop System: Dell product: Precision 7820 Tower v: N/A serial: <filter> Chassis: type: 3 serial: <filter> 
           Mobo: Dell model: 05WNJ2 v: A01 serial: <filter> UEFI [Legacy]: Dell v: 1.3.3 date: 02/02/2018 
CPU:       Topology: 2x 20-Core model: Intel Xeon Gold 6138 bits: 64 type: MT MCP SMP arch: Skylake rev: 4 L2 cache: 55.0 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 320081 
           Speed: 1000 MHz min/max: 1000/3700 MHz Core speeds (MHz): 1: 1000 2: 1001 3: 1001 4: 1000 5: 1001 6: 1001 7: 1001 
           8: 1000 9: 1001 10: 1000 11: 1001 12: 1001 13: 1000 14: 1001 15: 1001 16: 1000 17: 1000 18: 1001 19: 1001 20: 1001 
           21: 1001 22: 1001 23: 1001 24: 1000 25: 1001 26: 1000 27: 1001 28: 1001 29: 1000 30: 1001 31: 1001 32: 1000 
           33: 1000 34: 1001 35: 1001 36: 1001 37: 1001 38: 1001 39: 1001 40: 1000 41: 1000 42: 1000 43: 1001 44: 1000 
           45: 1380 46: 1000 47: 1001 48: 1001 49: 1001 50: 1000 51: 1000 52: 1001 53: 1001 54: 1001 55: 1000 56: 1001 
           57: 1000 58: 1001 59: 1000 60: 1001 61: 1000 62: 1001 63: 1001 64: 1001 65: 1000 66: 1000 67: 1000 68: 1001 
           69: 1000 70: 1001 71: 1001 72: 1000 73: 1000 74: 1001 75: 1001 76: 1001 77: 1001 78: 1000 79: 1000 80: 1728 
Graphics:  Device-1: NVIDIA driver: nvidia v: 510.47.03 bus ID: 9e:00.0 chip ID: 10de:2230 
           Display: x11 server: X.Org 1.20.13 driver: nvidia unloaded: fbdev,modesetting,nouveau,vesa 
           resolution: 1920x1080~60Hz, 2560x1080~60Hz 
           OpenGL: renderer: NVIDIA RTX A6000/PCIe/SSE2 v: 4.6.0 NVIDIA 510.47.03 direct render: Yes 
Audio:     Device-1: Intel vendor: Dell driver: snd_hda_intel v: kernel bus ID: 00:1f.3 chip ID: 8086:a1f0 
           Device-2: NVIDIA driver: snd_hda_intel v: kernel bus ID: 9e:00.1 chip ID: 10de:1aef 
           Sound Server: ALSA v: k5.4.0-100-generic 
Network:   Device-1: Intel Ethernet I219-LM vendor: Dell driver: e1000e v: 3.2.6-k port: 0780 bus ID: 00:1f.6 
           chip ID: 8086:15b9 
           IF: enp0s31f6 state: down mac: <filter> 
           Device-2: Intel Wireless 7265 driver: iwlwifi v: kernel port: 0780 bus ID: 04:00.0 chip ID: 8086:095a 
           IF: wlp4s0 state: up mac: <filter> 
           IF-ID-1: br-f73460aa952c state: down mac: <filter> 
           IF-ID-2: docker0 state: down mac: <filter> 
Drives:    Local Storage: total: 465.76 GiB used: 57.70 GiB (12.4%) 
           ID-1: /dev/sda vendor: Seagate model: BarraCuda 120 SSD ZA500CM10003 size: 465.76 GiB speed: 6.0 Gb/s 
           serial: <filter> rev: D013 scheme: MBR 
Partition: ID-1: / size: 114.85 GiB used: 29.02 GiB (25.3%) fs: ext4 dev: /dev/sda7 
           ID-2: /boot size: 944.3 MiB used: 211.0 MiB (22.3%) fs: ext4 dev: /dev/sda1 
           ID-3: /home size: 167.88 GiB used: 14.24 GiB (8.5%) fs: ext4 dev: /dev/sda6 
           ID-4: swap-1 size: 9.31 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda5 
Sensors:   System Temperatures: cpu: 50.0 C mobo: N/A sodimm: 34.0 C gpu: nvidia temp: 36 C 
           Fan Speeds (RPM): cpu: 1513 mobo: 795 gpu: nvidia fan: 30% 
Repos:     Active apt repos in: /etc/apt/sources.list 
           1: deb [arch=amd64] https://repo.fortinet.com/repo/6.4/ubuntu/ /bionic multiverse
           Active apt repos in: /etc/apt/sources.list.d/additional-repositories.list 
           1: deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable
           2: deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /
           Active apt repos in: /etc/apt/sources.list.d/azure-cli.list 
           1: deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ focal main
           Active apt repos in: /etc/apt/sources.list.d/deadsnakes-ppa-focal.list 
           1: deb http://ppa.launchpad.net/deadsnakes/ppa/ubuntu focal main
           Active apt repos in: /etc/apt/sources.list.d/google-chrome.list 
           1: deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main
           Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list 
           1: deb http://packages.linuxmint.com una main upstream import backport #id:linuxmint_main
           2: deb http://archive.ubuntu.com/ubuntu focal main restricted universe multiverse
           3: deb http://archive.ubuntu.com/ubuntu focal-updates main restricted universe multiverse
           4: deb http://archive.ubuntu.com/ubuntu focal-backports main restricted universe multiverse
           5: deb http://security.ubuntu.com/ubuntu/ focal-security main restricted universe multiverse
           6: deb http://archive.canonical.com/ubuntu/ focal partner
           Active apt repos in: /etc/apt/sources.list.d/vscode.list 
           1: deb [arch=amd64,arm64,armhf] http://packages.microsoft.com/repos/code stable main
Info:      Processes: 974 Uptime: 6h 24m Memory: 125.54 GiB used: 8.11 GiB (6.5%) Init: systemd v: 245 runlevel: 5 Compilers: 
           gcc: 9.4.0 alt: 9 Shell: zsh v: 5.8 running in: guake inxi: 3.0.38 

Post by baku85 »

deepakdeshp wrote: Wed Mar 16, 2022 4:19 am Stress Test your CPU.I suggest trying stress-ng with the cpu stressor and verify mode enabled:

Code: Select all


sudo apt-get install stress-ng
stress-ng --cpu 0 --verify --verbose --timeout 5m  
did not help :(

Post by deepakdeshp »

The test is meant to put the CPUs under load so that faults show up (with --cpu 0, stress-ng starts one worker per online CPU). Did you get any errors?

Post by baku85 »

I had no errors

Post by SMG »

Is this a problem on a new install of Linux Mint or is this a new issue on an existing install?

The information I found for this computer (Dell Precision 7820 Drivers) indicates the latest BIOS/UEFI is version 2.18.0, so it seems your computer is missing a lot of firmware updates.
Machine:
Type: Desktop System: Dell product: Precision 7820 Tower v: N/A serial: <filter> Chassis: type: 3 serial: <filter>
Mobo: Dell model: 05WNJ2 v: A01 serial: <filter> UEFI [Legacy]: Dell v: 1.3.3 date: 02/02/2018

I don't think it is related to your problem, but I noticed you have several "bionic" repos which are the base for LM19. The base for LM20 is "focal".

Code: Select all

Repos:     Active apt repos in: /etc/apt/sources.list 
           1: deb [arch=amd64] https://repo.fortinet.com/repo/6.4/ubuntu/ /bionic multiverse
           Active apt repos in: /etc/apt/sources.list.d/additional-repositories.list 
           1: deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable
           2: deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /

Post by baku85 »

It's a fresh Linux Mint install; I also tried Ubuntu, with the same result.
Windows, for example, has no issues at all (I barely use Windows, just for gaming; Linux is for work).
Yes, I also saw that the BIOS could be upgraded, but then why is Windows OK?

It's a desktop from my company, so I think I will need to ask them first before doing any BIOS upgrade ;)

Post by deck_luck »

Code: Select all

[ 1.005495] kernel: smp: Bringing up secondary CPUs ...
[ 1.005608] kernel: x86: Booting SMP configuration:
[ 1.005609] kernel: .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19
[ 1.160262] kernel: .... node #1, CPUs: #20
[ 0.001872] kernel: smpboot: CPU 20 Converting physical 0 to logical die 1
[ 1.254132] kernel: mce: [Hardware Error]: Machine check events logged
[ 1.254135] kernel: mce: [Hardware Error]: CPU 20: Machine Check: 0 Bank 5: c802444000200e0f
[ 1.254141] kernel: mce: [Hardware Error]: TSC 0 MISC 40000000000000
With the "Machine Check: 0 Bank 5: c802444000200e0f" error, the machine's random access memory (RAM) could be a suspect. Do you have access to Dell Diagnostics to run on your machine? I know Lenovo and Hewlett Packard provide hardware diagnostics, and I assume Dell does the same. Also, from your Linux Mint live/installation media you can boot into the Memtest86+ memory test.
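
For anyone who wants to read more out of the status word than the raw hex, here is a rough sketch that prints the architectural flag bits of an IA32_MCi_STATUS value. The bit positions (VAL 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57, MCA error code in bits 15:0) follow the Intel SDM layout as I understand it; `decode_mci_status` is a made-up name, and the output does not say which component is at fault.

```shell
# Print the architectural flags of a 64-bit IA32_MCi_STATUS value given as
# 16 hex digits (with or without a leading 0x). The value is split into two
# 32-bit halves so the arithmetic stays in range on shells with signed
# 64-bit integers. The body runs in a subshell, so its variables stay local.
# decode_mci_status is a hypothetical helper name.
decode_mci_status() (
  hex=${1#0x}
  [ "${#hex}" -eq 16 ] || { echo "expected 16 hex digits" >&2; exit 1; }
  hi=$(( 0x$(printf '%s' "$hex" | cut -c1-8) ))   # status bits 63..32
  lo=$(( 0x$(printf '%s' "$hex" | cut -c9-16) ))  # status bits 31..0
  # Each entry is NAME:bit-position-within-the-high-half.
  for field in VAL:31 OVER:30 UC:29 EN:28 MISCV:27 ADDRV:26 PCC:25; do
    printf '%s=%d ' "${field%%:*}" $(( (hi >> ${field#*:}) & 1 ))
  done
  printf 'MCACOD=0x%04x\n' $(( lo & 0xffff ))
)
```

Fed the value from this thread (`decode_mci_status c802444000200e0f`), it reports UC=0 and PCC=0, which under that layout would describe an error the hardware logged as corrected rather than a fatal one. Treat that as a hint for further testing, not as a diagnosis.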
🐧Linux Mint 20.3 XFCE (UEFI - Secure Boot Enabled) dual boot with Windows 11

Give a friend a fish, and you feed them for a day. Teach a friend how to fish, and you feed them for a lifetime. ✝️

Post by SMG »

baku85 wrote: Wed Mar 23, 2022 10:23 am Yeah I saw also bios version which could be upgraded, but how then windows is ok.
Windows is a different operating system. Not all operating systems use the information in the firmware (BIOS/UEFI) the same way.
baku85 wrote: Wed Mar 23, 2022 10:23 am Its desktop from my company, I think I will need to ask them first before doing any bios upgrade ;)
I can understand that. :)

It is possible the reason for the long boot is separate from the error message. What are the outputs of

Code: Select all

systemd-analyze
and

Code: Select all

systemd-analyze critical-chain

Post by baku85 »

systemd-analyze

Code: Select all

Startup finished in 39.379s (kernel) + 11.138s (userspace) = 50.517s 
graphical.target reached after 11.115s in userspace
systemd-analyze critical-chain
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

Code: Select all

graphical.target @11.115s
└─multi-user.target @11.115s
  └─docker.service @10.078s +1.036s
    └─network-online.target @10.075s
      └─NetworkManager-wait-online.service @3.058s +7.015s
        └─NetworkManager.service @2.840s +215ms
          └─dbus.service @2.835s
            └─basic.target @2.802s
              └─sockets.target @2.802s
                └─snapd.socket @2.800s +1ms
                  └─sysinit.target @2.794s
                    └─systemd-timesyncd.service @2.731s +62ms
                      └─systemd-tmpfiles-setup.service @2.697s +28ms
                        └─local-fs.target @2.681s
                          └─home.mount @2.673s +8ms
                            └─systemd-fsck@dev-disk-by\x2duuid-acfc2731\x2d6d87\x2d4420\x2da877\x2d85aa9d30a02f.service @2.586s +84ms
                              └─dev-disk-by\x2duuid-acfc2731\x2d6d87\x2d4420\x2da877\x2d85aa9d30a02f.device @2.570s

Post by SMG »

baku85 wrote: Thu Mar 24, 2022 4:04 am
systemd-analyze

Code: Select all

Startup finished in 39.379s (kernel) + 11.138s (userspace) = 50.517s 
graphical.target reached after 11.115s in userspace
Relatively speaking, that seems like an extremely long kernel time so it is possible the long boot relates to the error message.
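
To put a number on how much of the boot is spent in the kernel, the summary line can be reduced to a percentage. This is a small sketch with a made-up helper name (`kernel_share`); it assumes the times are printed in plain seconds, as in the output quoted above, and would need adjusting for totals shown as "1min 5s".

```shell
# Read `systemd-analyze` output on stdin and print the kernel's share of
# the total startup time. kernel_share is a hypothetical helper name.
# Assumes all times are given in plain seconds (e.g. "39.379s").
kernel_share() {
  awk '/^Startup finished in/ {
    k = ""
    # The kernel time is the field right before the "(kernel)" label.
    for (i = 2; i <= NF; i++) if ($i == "(kernel)") k = $(i - 1)
    t = $NF                      # the total is the last field on the line
    sub(/s$/, "", k); sub(/s$/, "", t)
    if (k != "" && t + 0 > 0)
      printf "kernel %.3fs of %.3fs total (%.0f%%)\n", k, t, 100 * k / t
  }'
}
```

Run as `systemd-analyze | kernel_share`; with the figures posted above, the kernel phase comes out at roughly 78% of the total.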

Is it possible for you to run any tests suggested by deck_luck?

Post by baku85 »

Hi guys,

Ah, I forgot to reply.
I upgraded my BIOS. Luckily Dell has SupportAssist and I had a Windows 11 license tied to this desktop computer, so I could easily do the upgrade.
But it is still the same, so the old BIOS is not the cause.

Not sure if I can do anything more about it.

Post by deepakdeshp »

As requested earlier, please test the memory (RAM).

Post by baku85 »

Code: Select all

sudo memtester 200M 1
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 200MB (209715200 bytes)
got  200MB (209715200 bytes), trying mlock ...locked.
Loop 1/1:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         
  8-bit Writes        : ok
  16-bit Writes       : ok

Done.

Looks like everything is OK with the memory.
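
One thing worth checking before drawing conclusions from this run is how much of the installed RAM it actually covered. The arithmetic below is only a sketch (`memtest_coverage` is a made-up helper name); it takes the tested size in MiB and the total RAM in GiB, here the 200M passed to memtester and the 125.54 GiB reported by inxi.

```shell
# Print the percentage of installed RAM covered by one memtester run.
# Arguments: tested size in MiB, total RAM in GiB (may be fractional).
# memtest_coverage is a hypothetical helper name.
memtest_coverage() {
  awk -v tested="$1" -v total="$2" \
    'BEGIN { printf "%.2f%%\n", 100 * tested / (total * 1024) }'
}
```

`memtest_coverage 200 125.54` gives about 0.16%, so a clean result here says little about the rest of the memory. memtester accepts larger sizes, but it can only test memory the running system is able to lock.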

Post by SMG »

I do not have any other ideas related to hardware.

You could try a newer kernel to see if that helps the boot time. The 5.13 kernel is available in Update Manager. Let us know if you want to try it and need directions.

Post by Marie SWE »

I'm no expert in Linux; I still see myself as a noob.

But the first thing that comes to my mind is:
Have you installed the Intel microcode firmware for Intel CPUs?
if you want my attention...quote me so I get a notification
Nothing is impossible, the impossible just takes a little longer to solve..
It is like it is.. because you do as you do.. if you hadn't done it as you did.. it wouldn't have become as it is. ;)

Post by deepakdeshp »

Run memtest while the OS is not booted, that is, from separate boot media.

https://www.google.com/amp/s/www.wikiho ... 86%3famp=1

Post by baku85 »

Marie SWE wrote: Sat May 14, 2022 1:52 pm I'm no expert in linux I still see myself as a noob

But the first thing that comes to my mind is..
Have you installed Intel microcode firmware for intel cpu's?
Hmm, interesting.

So I installed it, and it looks like I don't have that error anymore:

Code: Select all

dmesg | grep microcode
[   35.718680] microcode: sig=0x50654, pf=0x80, revision=0x2006d05
[   35.721273] microcode: Microcode Update Driver: v2.2.
(Before installing it, I saw those errors when running this command.)
So no such error anymore! Huge thanks.
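
To confirm that the package really loaded a newer revision, the value can be compared with the one printed in the original MCE record ("microcode 2006b06"). A minimal sketch, with made-up helper names (`microcode_revision`, `microcode_is_newer`) and assuming both values are the same hexadecimal revision field:

```shell
# Extract the revision from a "microcode: sig=..., revision=0x..." log
# line read on stdin. Hypothetical helper name.
microcode_revision() {
  sed -n 's/.*microcode: .*revision=\(0x[0-9a-f][0-9a-f]*\).*/\1/p'
}

# Succeed if revision $2 is newer than revision $1. Both are hex values,
# with or without a leading 0x. Hypothetical helper name.
microcode_is_newer() {
  [ $(( 0x${2#0x} )) -gt $(( 0x${1#0x} )) ]
}
```

For example, `microcode_is_newer 2006b06 "$(sudo dmesg | microcode_revision)"` succeeds for the 0x2006d05 shown above. The revision currently in use should also be visible with `grep -m1 microcode /proc/cpuinfo`.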

But the boot still takes a long time. Later I will write the memtest ISO to a USB stick and run memtest before booting.
Last edited by SMG on Wed May 18, 2022 10:27 am, edited 1 time in total.
Reason: Changed img tags to rimg tags so link is clickable.