Frequent unsolicited reboot or session break off [SOLVED]

ezijlstra
Level 1
Posts: 25
Joined: Tue Nov 30, 2021 4:43 pm

Post by ezijlstra »

In October 2021 one of my three NUCs suddenly stopped working after having performed flawlessly since January 2020. I had it repaired under warranty and it came back with a new motherboard. I put the memory and drives back in. All seemed fine, but since then it randomly reboots some time after I stop actively using it. When I come back, sometimes it resumes normally; sometimes it appears to have rebooted, or it doesn't respond to mouse or keyboard activity and I have to force a hard shutdown. Sometimes the restart presents a GRUB menu, sometimes it just starts up normally. Sometimes there is no reboot in the system log and apparently the session was closed suddenly.
In some cases, but not every time, the system journal reports a BERT fatal failure: [Hardware Error]: section type: unknown, 81212a96-09ed-4996-9471-8d729c8e69ed. There seem to be more errors in the journal, but I need guidance to make sense of these messages.

I wonder whether the hardware is faulty or whether there is some kind of incompatibility, as the new motherboard has a newer production date (5/2020 on the bottom) than the old one (11/2019 on the bottom). If I swap the SSDs with another NUC, the software runs fine on the other hardware, while the other NUC's software on the repaired NUC shows the same symptoms of rebooting and so on. Until now, all reboots and other nuisances have happened when I was not actively using the system and the screen was black in screen-saver mode.

I hope someone can help me find out what is wrong and how to fix it.

Here are the system details:

System:    Host: NUC3 Kernel: 4.15.0-163-generic x86_64 bits: 64 compiler: gcc v: 7.5.0 Desktop: Cinnamon 4.2.4 
           wm: muffin 4.2.2 dm: LightDM 1.26.0 Distro: Linux Mint 19.2 Tina base: Ubuntu 18.04 bionic 
Machine:   Type: Mini-pc System: Intel Client Systems product: NUC8i7BEH v: J72992-309 serial: <filter> 
           Chassis: Intel Corporation type: 35 v: 2.0 serial: <filter> 
           Mobo: Intel model: NUC8BEB v: J72688-309 serial: <filter> UEFI: Intel v: BECFL357.86A.0087.2020.1209.1115 
           date: 12/09/2020 
CPU:       Topology: Quad Core model: Intel Core i7-8559U bits: 64 type: MT MCP arch: Kaby Lake rev: A L2 cache: 8192 KiB 
           flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 43198 
           Speed: 600 MHz min/max: 400/4500 MHz Core speeds (MHz): 1: 551 2: 502 3: 534 4: 565 5: 511 6: 548 7: 552 8: 506 
Graphics:  Device-1: Intel driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:3ea5 
           Display: x11 server: X.Org 1.19.6 driver: modesetting unloaded: fbdev,vesa resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa DRI Intel Iris Plus Graphics 655 (CFL GT3) v: 4.6 Mesa 20.0.8 compat-v: 3.0 
           direct render: Yes 
Audio:     Device-1: Intel driver: snd_hda_intel v: kernel bus ID: 00:1f.3 chip ID: 8086:9dc8 
           Sound Server: ALSA v: k4.15.0-163-generic 
Network:   Device-1: Intel driver: iwlwifi v: kernel port: 4000 bus ID: 00:14.3 chip ID: 8086:9df0 
           IF: wlp0s20f3 state: down mac: <filter> 
           Device-2: Intel Ethernet I219-V driver: e1000e v: 3.2.6-k port: efa0 bus ID: 00:1f.6 chip ID: 8086:15be 
           IF: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter> 
Drives:    Local Storage: total: 931.52 GiB used: 482.10 GiB (51.8%) 
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 500GB size: 465.76 GiB speed: 31.6 Gb/s lanes: 4 
           serial: <filter> rev: 2B2QEXM7 scheme: GPT 
           ID-2: /dev/sda vendor: Samsung model: SSD 860 EVO 500GB size: 465.76 GiB speed: 6.0 Gb/s serial: <filter> rev: 3B6Q 
Partition: ID-1: / size: 456.96 GiB used: 150.55 GiB (32.9%) fs: ext4 dev: /dev/nvme0n1p2 
           ID-2: swap-1 size: 2.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-0 
Sensors:   System Temperatures: cpu: 39.0 C mobo: 27.8 C 
           Fan Speeds (RPM): N/A 
Repos:     No active apt repos in: /etc/apt/sources.list 
           Active apt repos in: /etc/apt/sources.list.d/cisofy-lynis.list 
           1: deb https://packages.cisofy.com/community/lynis/deb/ stable main
           Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list 
           1: deb http://packages.linuxmint.com tina main upstream import backport #id:linuxmint_main
           2: deb http://archive.ubuntu.com/ubuntu bionic main restricted universe multiverse
           3: deb http://archive.ubuntu.com/ubuntu bionic-updates main restricted universe multiverse
           4: deb http://archive.ubuntu.com/ubuntu bionic-backports main restricted universe multiverse
           5: deb http://security.ubuntu.com/ubuntu/ bionic-security main restricted universe multiverse
           6: deb http://archive.canonical.com/ubuntu/ bionic partner
           Active apt repos in: /etc/apt/sources.list.d/skype-stable.list 
           1: deb [arch=amd64] https://repo.skype.com/deb stable main
           Active apt repos in: /etc/apt/sources.list.d/teams.list 
           1: deb [arch=amd64] https://packages.microsoft.com/repos/ms-teams stable main
Info:      Processes: 271 Uptime: 2h 47m Memory: 15.54 GiB used: 2.88 GiB (18.5%) Init: systemd v: 237 runlevel: 5 Compilers: 
           gcc: 7.5.0 alt: 7 Shell: bash v: 4.4.20 running in: gnome-terminal inxi: 3.0.32 
karlchen
Level 23
Posts: 18222
Joined: Sat Dec 31, 2011 7:21 am
Location: Germany

Re: Frequent unsolicited reboot or session break off

Post by karlchen »

Hello, ezijlstra.

Is kernel series 4.15.0-xx not a bit too old for machines from 2019 and 2020?

I ask because on a Dell Notebook from 2019 I had to switch from kernel 4.15.0-xx to 5.0.0-xx first, later to 5.4.0-xx in order to make it function properly.
To be precise, with kernel 4.15.0-xx this Dell Notebook did not boot at all.
Only Mint 19.3 with kernel 5.0.0-xx on-board out of the box booted and could be installed.
By now the machine runs happily on kernel 5.4.0-91.

So my advice will be:
Upgrade kernel series 4.15.0-xx to 5.4.0-91 with the help of the Update Manager.
Inside Update Manager go to View > Linux Kernels. Select kernel 5.4.0-91 and click on [Install].
Once installed, close Update Manager and reboot.

HTH.
Karl

Post by ezijlstra »

Hi Karl,
Thank you for your suggestion.
My hesitation is that it seems a bit trial and error.
Is the 4.15 kernel series too old? My other two NUCs run happily on the same sw.
I was hoping for some diagnostic analysis first to find a clue.
Are there no tools or ways to diagnose?
Kind regards, Erik

Post by karlchen »

Hello, ezijlstra
ezijlstra wrote: Tue Nov 30, 2021 6:12 pm I was hoping for some diagnostic analysis first to find a clue.
Well, the most detailed source for all system messages is the system journal. The terminal command to inspect it is journalctl.

Note:
journalctl has quite a few command-line arguments and options, which govern its behaviour. Reading journalctl output is not really trivial. Yet, it is very likely that its output can be used to correlate logged (error) messages to unexpected reboots and aborted graphical user sessions.

Please, consult the output of journalctl --help and man journalctl in order to learn how to use journalctl.

In case you want to redirect the output of journalctl to a file, e.g. in order to share it with us, you will have to use the option --no-pager in order to make sure that journalctl does not interrupt its output after every screenful.
Also be aware that such redirected journalctl output files can easily be too large to upload directly. In such a case compress the file with the help of gzip before uploading it here.
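
As a rough sketch of the steps described above, the export and compression could be wrapped in a small helper function. The function name and the default output file name are made up for illustration. The journalctl options used (-b, -p, --no-pager) are standard ones, but check man journalctl on your own system. Reading the previous boot (-b -1) assumes the journal is stored persistently.

```shell
# Hypothetical helper: write all messages of priority "err" or worse from
# one boot to a file and compress it for uploading.
# $1 = boot offset (0 = current boot, -1 = previous boot)
# $2 = output file name
export_boot_errors() {
    boot="${1:-0}"
    out="${2:-journal-errors.txt}"
    # --no-pager keeps journalctl from pausing after every screenful
    journalctl -b "$boot" -p err --no-pager > "$out" || return 1
    # gzip replaces the file with "$out.gz"
    gzip -f "$out"
}
```

Running journalctl --list-boots first shows which boots are recorded. After an unexpected reboot, export_boot_errors -1 prev-boot.txt would then leave prev-boot.txt.gz in the current directory.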

--
About trial and error approach of switching to kernel 5.4.0-91:
Yes. It is trial and error. But reverting to the previous kernel 4.15.0-xx, which is in use now, would be as simple as:
1. rebooting
2. entering the Grub boot menu
3. selecting the previous kernel from the "Advanced options" submenu.
4. pressing <Enter> to boot the selected (previous) kernel.
In case you want to get rid of kernel 5.4.0-91, you could uninstall it again, after you have booted the previous kernel as explained above.
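
Before and after switching kernels it can help to see which kernel is running and which kernel packages are installed. The following is a minimal sketch. The function name is made up, and it assumes a Debian/Ubuntu-based system such as Mint where kernel images are packaged as linux-image-*.

```shell
# Hypothetical helper: print the running kernel, then the kernel image
# packages that are fully installed.
list_kernels() {
    echo "running: $(uname -r)"
    # "ii" at the start of a dpkg line means the package is fully installed
    dpkg --list 'linux-image-*' | awk '/^ii/ {print $2}'
}
```

After booting the previous kernel from the Grub menu, the "running:" line should show the 4.15.0-xx version again, while the 5.4.0-91 image stays in the list until it is uninstalled.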

Regards,
Karl
SMG
Level 25
Posts: 31911
Joined: Sun Jul 26, 2020 6:15 pm
Location: USA

Post by SMG »

ezijlstra wrote: Tue Nov 30, 2021 6:12 pm Is the 4.15 kernel series too old?
For your listed CPU, yes, it is. The 4.15 kernel was released 28 January 2018. While there are bug fixes and patches still being released for it, it has not received any new features since it was released.

The Intel Core i7-8559U was released to market in April 2018, which is after the kernel was released. Therefore, that kernel cannot have been optimized for use with that CPU. It might be able to limp along using it, but the situation is not ideal. If you were running LM20, I would be recommending you use a kernel newer than 5.4 for it.

However, the spontaneous reboots you describe sound like they could be a power supply unit (PSU) issue, while the freezes are probably related to the kernel version.
ezijlstra wrote: Tue Nov 30, 2021 5:08 pm All seemed fine, but since then it randomly reboots some time after I stop actively using it. When I come back, sometimes it resumes normally; sometimes it appears to have rebooted, or it doesn't respond to mouse or keyboard activity and I have to force a hard shutdown.
These types of freezes are why I recommend that those using the 5.4 kernel on LM20 upgrade to a newer one. That has cleared the issues for all of them.
ezijlstra wrote: Tue Nov 30, 2021 5:08 pm Sometimes the restart presents a GRUB menu, sometimes it just starts up normally. Sometimes there is no reboot in the system log and apparently the session was closed suddenly.
That "session was closed suddenly" sounds like an issue I helped someone with about a year ago. In that situation, it was discovered the PSU was not sized properly for all the rails on the computer. The voltage tolerance was such that periodically power dipped slightly below what was needed to keep the hard drive powered. A sudden loss of power stops the session immediately and the logs just stop.

Post by ezijlstra »

Thank you both for helping me.

The new motherboard is less than a year younger than my other NUCs that are doing fine.
Would Intel have made changes in the meantime that would account for the problems?

Before trying to update the kernel I want to understand the potential consequences.
Does upgrading to 5.4.0-91 imply that I have to upgrade to LM19.3 or to LM20?
And if so does that imply that I have to upgrade applications?
I do not use very special things, I guess that Firefox and Thunderbird will run without problems, but it took some effort to get Adobe Digital running under Wine.

Kind regards, Erik

Post by SMG »

ezijlstra wrote: Wed Dec 01, 2021 8:38 am The new motherboard is less than a year younger than my other NUCs that are doing fine.
Would Intel have made changes in the meantime that would account for the problems?
Is your question whether they would have made changes to the board they sent you (compared to the original one)? Yes, that might have happened, but I do not think we have any way of knowing. Engineers make changes to processing lines all the time, but the changes are supposed to be within a certain tolerance.
ezijlstra wrote: Wed Dec 01, 2021 8:38 am Before trying to update the kernel I want to understand the potential consequences.
Does upgrading to 5.4.0-91 imply that I have to upgrade to LM19.3 or to LM20?
Many people are using the 5.4 LTS kernel on LM19 versions. I believe most are using LM19.3, but I think there are people using older versions as well. I am not aware of a specific need or reason that you would have to upgrade in order to use the 5.4 kernel.

We have people using LM20 who are using kernels much newer than the 5.4 LTS because their hardware needs it. They are obviously not using LM20.3 because it has not yet been released.

As karlchen explained, it is quite easy to change back to an older kernel if you decide you prefer the older kernel.

Post by ezijlstra »

Thank you again for your explanation.

I did some further research and found out that the BIOS version of the new motherboard is 0087, whereas the other NUC has 0073. On the Intel site I downloaded the release notes and there have been numerous changes from 0073 to 0087. So this may explain different behaviour.
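
The BIOS version can also be read from a running system without rebooting into the firmware setup. A small sketch, assuming the kernel exposes DMI data under /sys/class/dmi/id (the usual location on Linux). The function name and its optional directory argument are invented for illustration.

```shell
# Hypothetical helper: print the firmware version and date the kernel
# read from DMI.
# $1 = DMI directory (defaults to the standard sysfs location)
show_bios_version() {
    dmi_dir="${1:-/sys/class/dmi/id}"
    echo "BIOS version: $(cat "$dmi_dir/bios_version")"
    echo "BIOS date:    $(cat "$dmi_dir/bios_date")"
}
```

Running show_bios_version on each NUC makes it easy to compare the boards side by side. On the repaired NUC it should print the same BECFL357.86A.0087.2020.1209.1115 string that inxi reports.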

So I installed kernel 5.4.0-91. The system rebooted faster than before. I looked into the system log and there seem to be fewer errors. I hope this keeps up, and I will let you know in time whether the issues have been fixed. It looks good so far.

Kind regards, Erik

Post by ezijlstra »

Too bad: the system freezes after 6 to 45 minutes, and now not only when the screen saver is on but also while I am typing this post.

Here is the new inxi:

System:    Host: NUC3 Kernel: 5.4.0-91-generic x86_64 bits: 64 compiler: gcc v: 7.5.0 Desktop: Cinnamon 4.2.4 wm: muffin 4.2.2 
           dm: LightDM 1.26.0 Distro: Linux Mint 19.2 Tina base: Ubuntu 18.04 bionic 
Machine:   Type: Mini-pc System: Intel Client Systems product: NUC8i7BEH v: J72992-309 serial: <filter> 
           Chassis: Intel Corporation type: 35 v: 2.0 serial: <filter> 
           Mobo: Intel model: NUC8BEB v: J72688-309 serial: <filter> UEFI: Intel v: BECFL357.86A.0087.2020.1209.1115 
           date: 12/09/2020 
CPU:       Topology: Quad Core model: Intel Core i7-8559U bits: 64 type: MT MCP arch: Kaby Lake rev: A L2 cache: 8192 KiB 
           flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 43198 
           Speed: 500 MHz min/max: 400/4500 MHz Core speeds (MHz): 1: 500 2: 500 3: 500 4: 500 5: 500 6: 500 7: 500 8: 500 
Graphics:  Device-1: Intel driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:3ea5 
           Display: x11 server: X.Org 1.19.6 driver: modesetting unloaded: fbdev,vesa resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa DRI Intel Iris Plus Graphics 655 (CFL GT3) v: 4.6 Mesa 20.0.8 compat-v: 3.0 
           direct render: Yes 
Audio:     Device-1: Intel driver: snd_hda_intel v: kernel bus ID: 00:1f.3 chip ID: 8086:9dc8 
           Device-2: C-Media CM108 Audio Controller type: USB driver: hid-generic,snd-usb-audio,usbhid bus ID: 1-3.4:5 
           chip ID: 0d8c:013c 
           Sound Server: ALSA v: k5.4.0-91-generic 
Network:   Device-1: Intel driver: iwlwifi v: kernel port: 4000 bus ID: 00:14.3 chip ID: 8086:9df0 
           IF: wlp0s20f3 state: down mac: <filter> 
           Device-2: Intel Ethernet I219-V driver: e1000e v: 3.2.6-k port: efa0 bus ID: 00:1f.6 chip ID: 8086:15be 
           IF: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter> 
Drives:    Local Storage: total: 931.52 GiB used: 483.20 GiB (51.9%) 
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 500GB size: 465.76 GiB speed: 31.6 Gb/s lanes: 4 
           serial: <filter> rev: 2B2QEXM7 scheme: GPT 
           ID-2: /dev/sda vendor: Samsung model: SSD 860 EVO 500GB size: 465.76 GiB speed: 6.0 Gb/s serial: <filter> rev: 3B6Q 
Partition: ID-1: / size: 456.96 GiB used: 151.10 GiB (33.1%) fs: ext4 dev: /dev/nvme0n1p2 
           ID-2: swap-1 size: 2.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-0 
Sensors:   System Temperatures: cpu: 36.0 C mobo: N/A 
           Fan Speeds (RPM): N/A 
Repos:     No active apt repos in: /etc/apt/sources.list 
           Active apt repos in: /etc/apt/sources.list.d/cisofy-lynis.list 
           1: deb https://packages.cisofy.com/community/lynis/deb/ stable main
           Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list 
           1: deb http://packages.linuxmint.com tina main upstream import backport #id:linuxmint_main
           2: deb http://archive.ubuntu.com/ubuntu bionic main restricted universe multiverse
           3: deb http://archive.ubuntu.com/ubuntu bionic-updates main restricted universe multiverse
           4: deb http://archive.ubuntu.com/ubuntu bionic-backports main restricted universe multiverse
           5: deb http://security.ubuntu.com/ubuntu/ bionic-security main restricted universe multiverse
           6: deb http://archive.canonical.com/ubuntu/ bionic partner
           Active apt repos in: /etc/apt/sources.list.d/skype-stable.list 
           1: deb [arch=amd64] https://repo.skype.com/deb stable main
           Active apt repos in: /etc/apt/sources.list.d/teams.list 
           1: deb [arch=amd64] https://packages.microsoft.com/repos/ms-teams stable main
Info:      Processes: 274 Uptime: 7m Memory: 15.51 GiB used: 1.57 GiB (10.1%) Init: systemd v: 237 runlevel: 5 Compilers: 
           gcc: 7.5.0 alt: 7 Shell: bash v: 4.4.20 running in: gnome-terminal inxi: 3.0.32 
There are no BERT fatal hardware errors any more in the log.
My impression is that we are now seeing a problem related to Intel graphics.
This is a line in the system log pertaining to graphics that may give a clue.

dec 02 01:13:42 NUC3 kernel: i915 0000:00:02.0: Failed to program MOCS registers; expect performance issues.
What can I do in this situation?

Post by SMG »

ezijlstra wrote: Wed Dec 01, 2021 7:10 pm I did some further research and found out that the BIOS version of the new motherboard is 0087, whereas the other NUC has 0073. On the Intel site I downloaded the release notes and there have been numerous changes from 0073 to 0087. So this may explain different behaviour.
According to this BIOS Update [BECFL357], the latest is 0089.
ezijlstra wrote: Thu Dec 02, 2021 6:46 am

dec 02 01:13:42 NUC3 kernel: i915 0000:00:02.0: Failed to program MOCS registers; expect performance issues.
I recognize that message and when I went back through my posts I discovered it seemed to be a common message with Kaby Lake processors. Usually, the recommendation I make is to upgrade the kernel, but those were LM20 installs where newer kernels are available. That line is usually only seen once at start-up.

What graphics problem are you experiencing that makes you think the issue is related to graphics?

What happens when the system freezes? Does it lock up and you have to hit the power button to get out of the freeze?

Post by ezijlstra »

Hi SMG,

I am aware that there is yet another BIOS update, but it provides no functionality that I require at this time.
So I want to pursue the route to LM20 so as to be able to use kernels that are fit for my hw / BIOS configuration.

In answer to your question: yes, the screen freezes in the middle of some activity and I have to do a hard reset. As I read some posts that referred to problems with Intel graphics, I concluded that I have to move on to kernels beyond 5.4.

I rolled back to the previous kernel 4.15.x to prepare for the upgrade to 19.3 as a first step.
I am stumbling on a spurious error message "to check internet connection", which I hope to resolve with the help of post viewtopic.php?f=90&t=356432&p=2062511&h ... s#p2062511

I have to do some research on what upgrading to LM20 entails for Wine. Any recommendations?

Edit: The NUC is now on LM19.3 and I discovered something I want to share, as it may throw some light on this case. Please forgive my ignorance: for the upgrade to 19.3 I followed the instruction to disable the screen saver. After the upgrade I discovered that there is another option in the system settings under energy saving to switch off the screen, and another option to pause the system. The energy saving option was active and I have turned it off. The option to pause the system was not active and I left it that way. To save my monitor I switch off the monitor manually with the hardware button. Now the system does not freeze any more. At least it has now been running happily for a couple of hours. So perhaps it may also have something to do with the energy management. What do you think?

Thank you for your support.
I'll keep you posted.
Erik

Post by SMG »

ezijlstra wrote: Fri Dec 03, 2021 11:41 am I am aware that there is yet another BIOS update, but it provides no functionality that I require at this time.
Once in a while a BIOS update might provide new functionality, but BIOS updates usually provide fixes to issues which have been discovered.
The release notes for BIOS Version 0089 indicate:
New Fixes/Features:
• Fixed issue with OFBD module SMI handler vulnerabilities.
• Updated CPU Microcode Firmware to 0xEA for IPU2021.1
• Removed option using “ESC” key to exit out of boot menu.
• Added protection code for unauthorized write at controllable address in SMRAM.
• Fixed issue to achieve arbitrary write in SMRAM save state region.
And for BIOS Version 0088:
New Fixes/Features:
• Updated ME Firmware version to 12.0.81.1753.
• Fixed issue when changing VR Temp to CPU Temp in performance monitor.
• Updated NTFS DXE driver when parsing NTFS file system partition.
• Update BIOS code for security fixes.
The release notes are in the Documentation box on the right side of the link I provided in my prior post.
ezijlstra wrote: Fri Dec 03, 2021 11:41 am I have to do some research on what upgrading to LM20 entails for Wine. Any recommendations?
Sorry, I do not know of resources for that. I've never used Wine.
ezijlstra wrote: Fri Dec 03, 2021 11:41 am To save my monitor I switch off the monitor manually with the hardware button.
So you are doing this instead of using the Power Management setting to turn off the screen when the system is inactive?

I would think that how the changes you made in the operating system affect the overall stability of the system (hardware and software combined) might depend on the type of monitor and whether or not it has DPMS (Display Power Management Signaling) settings.
ezijlstra wrote: Fri Dec 03, 2021 11:41 am Now the system does not freeze any more. At least it has now been running happily for a couple of hours. So perhaps it may also have something to do with the energy management. What do you think?
If that works then that is great. Problem solved. :)

Did you leave the screensaver turned off? I do not understand all the underlying details, but I know there is interaction (and sometimes interference) between the screensaver and Power Management of the computer. It is possible just leaving the screensaver off was the main factor.
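
To see whether display power management (DPMS) is active in the current X session, something like the following could be used. It is only a sketch: the function name is made up, it assumes the xset utility is installed and an X11 session is running (as the inxi output earlier in the thread indicates), and settings made by Cinnamon's own Power Management or screensaver may override what xset reports or changes.

```shell
# Hypothetical helper: report whether X11 display power management is on.
# "xset q" prints a line such as "  DPMS is Enabled"; print its last word.
dpms_state() {
    xset q | awk '/DPMS is/ {print $3}'
}
```

dpms_state prints Enabled or Disabled. For a test, xset -dpms turns DPMS off and xset s off turns off the X server's own screen blanking, both for the current session only.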

Post by ezijlstra »

SMG wrote: If that works then that is great. Problem solved.
You are a bit too optimistic, I am afraid.

Have a look at these error messages:

-- Logs begin at Sun 2020-09-20 11:22:26 CEST, end at Sat 2021-12-04 16:30:45 CET. --
dec 04 09:26:12 NUC3 kernel: [drm:intel_bios_init [i915]] *ERROR* Unexpected child device config size 39 (expected 38 for VBT version 228)
dec 04 09:26:13 NUC3 systemd[1]: Failed to activate swap /swapfile.
-- Subject: Unit swapfile.swap has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit swapfile.swap has failed.
-- 
-- The result is RESULT.
dec 04 09:26:13 NUC3 kernel: iwlwifi 0000:00:14.3: BIOS contains WGDS but no WRDS
dec 04 09:26:14 NUC3 wpa_supplicant[846]: Failed to create interface p2p-dev-wlp0s20f3: -22 (Invalid argument)
dec 04 09:26:14 NUC3 wpa_supplicant[846]: nl80211: Failed to create a P2P Device interface p2p-dev-wlp0s20f3
dec 04 09:26:14 NUC3 lightdm[1101]: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
dec 04 09:26:14 NUC3 lightdm[1101]: PAM adding faulty module: pam_kwallet.so
dec 04 09:26:14 NUC3 lightdm[1101]: PAM unable to dlopen(pam_kwallet5.so): /lib/security/pam_kwallet5.so: cannot open shared object file: No such file or directory
dec 04 09:26:14 NUC3 lightdm[1101]: PAM adding faulty module: pam_kwallet5.so
dec 04 09:26:15 NUC3 lightdm[1186]: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
dec 04 09:26:15 NUC3 lightdm[1186]: PAM adding faulty module: pam_kwallet.so
dec 04 09:26:15 NUC3 lightdm[1186]: PAM unable to dlopen(pam_kwallet5.so): /lib/security/pam_kwallet5.so: cannot open shared object file: No such file or directory
dec 04 09:26:15 NUC3 lightdm[1186]: PAM adding faulty module: pam_kwallet5.so
dec 04 09:26:21 NUC3 systemd[1]: Failed to start Postfix Mail Transport Agent (instance -).
-- Subject: Unit postfix@-.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit postfix@-.service has failed.
-- 
-- The result is RESULT.
dec 04 09:26:23 NUC3 kernel: Could not find key with description: [hex]
dec 04 09:26:23 NUC3 kernel: Could not find valid key in user session keyring for sig specified in mount option: [hex]
dec 04 09:26:23 NUC3 kernel: Error parsing options; rc = [-2]
dec 04 09:26:24 NUC3 pulseaudio[1572]: [pulseaudio] backend-ofono.c: Failed to register as a handsfree audio agent with ofono: org.freedesktop.DBus.Error.ServiceUnknown: The name org.ofono was not provided by any .service files
dec 04 09:26:24 NUC3 pulseaudio[1651]: [pulseaudio] pid.c: Daemon already running.
i915 refers to the Intel graphics processor. I am not familiar with lightdm and pam_kwallet. Do you have any idea whether these errors are related to a mismatch between kernel and BIOS? Or is something missing or not installed properly?

When booting this morning the system hung and I had to force a shutdown and restart. In view of these error messages I conclude that I have to move on to newer kernels as Karlchen and you suggested before.

Kind regards and thanks for your patience,
Erik
Last edited by ezijlstra on Sun Dec 05, 2021 6:42 pm, edited 1 time in total.

Post by SMG »

ezijlstra wrote: Sat Dec 04, 2021 11:49 am Have a look at these error messages:
I see error messages all the time in logs. One has to look at them in relation to where they are in the log to know if they have any relevance. Error messages in isolation, such as what you have provided here, can be misleading.

i915 refers to the graphics driver. This message refers to the information in the BIOS not being what the Linux kernel expects it to be and is apparently a common message for Kaby Lake CPUs. It is not a problem.

dec 04 09:26:12 NUC3 kernel: [drm:intel_bios_init [i915]] *ERROR* Unexpected child device config size 39 (expected 38 for VBT version 228)
The bug report "Intel KabyLake vbt file triggers Linux error message #45" indicates:
At this point in time there is no functional problem with this but it would be better to either synchronize the Linux driver with the newer VBT so the 39 bytes are actually supported or report 38 instead of 39 bytes.
With regard to the messages below, I do not know if you did a fresh install of LM19 or you upgraded from LM18. LM18 did not have swapfiles and required one to create a swap partition. Your install has a swap partition. One does not need to have both a swap partition and a swapfile, but LM19 is expecting to see a swap file. Did you do any special setup with regards to swap?

dec 04 09:26:13 NUC3 systemd[1]: Failed to activate swap /swapfile.
-- Subject: Unit swapfile.swap has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit swapfile.swap has failed.
-- 
-- The result is RESULT.
The message below is well known and causes no issues: BIOS contains WGDS but no WRDS

dec 04 09:26:13 NUC3 kernel: iwlwifi 0000:00:14.3: BIOS contains WGDS but no WRDS
The below is related to your wireless device. Your inxi output indicates you are not using it, so I would not think this is significant.

dec 04 09:26:14 NUC3 wpa_supplicant[846]: Failed to create interface p2p-dev-wlp0s20f3: -22 (Invalid argument)
dec 04 09:26:14 NUC3 wpa_supplicant[846]: nl80211: Failed to create a P2P Device interface p2p-dev-wlp0s20f3
The messages below do not cause any issues. The error comes up because kwallet is not installed (see "Slow boot on Ubuntu 16.04 ... issue with pam_kwallet5.so"). You can install it to make the messages disappear or you can just ignore them. (kwallet is a KDE app. I do not know why the system looks for it, but everyone with Mint gets these messages.)

dec 04 09:26:14 NUC3 lightdm[1101]: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
dec 04 09:26:14 NUC3 lightdm[1101]: PAM adding faulty module: pam_kwallet.so
dec 04 09:26:14 NUC3 lightdm[1101]: PAM unable to dlopen(pam_kwallet5.so): /lib/security/pam_kwallet5.so: cannot open shared object file: No such file or directory
dec 04 09:26:14 NUC3 lightdm[1101]: PAM adding faulty module: pam_kwallet5.so
dec 04 09:26:15 NUC3 lightdm[1186]: PAM unable to dlopen(pam_kwallet.so): /lib/security/pam_kwallet.so: cannot open shared object file: No such file or directory
dec 04 09:26:15 NUC3 lightdm[1186]: PAM adding faulty module: pam_kwallet.so
dec 04 09:26:15 NUC3 lightdm[1186]: PAM unable to dlopen(pam_kwallet5.so): /lib/security/pam_kwallet5.so: cannot open shared object file: No such file or directory
dec 04 09:26:15 NUC3 lightdm[1186]: PAM adding faulty module: pam_kwallet5.so
Postfix is a mail transfer agent (a mail server program). I could not find any reference to it on my install, so I do not know if it is relevant.

dec 04 09:26:21 NUC3 systemd[1]: Failed to start Postfix Mail Transport Agent (instance -).
-- Subject: Unit postfix@-.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Unit postfix@-.service has failed.
-- 
-- The result is RESULT.
The below is related to an encrypted device (see "Could not find valid key in user session keyring for sig specified in mount option after upgrade from 16.04 to 18.04").

Code: Select all

dec 04 09:26:23 NUC3 kernel: Could not find key with description: [d917192c24749df9]
dec 04 09:26:23 NUC3 kernel: Could not find valid key in user session keyring for sig specified in mount option: [d917192c24749df9]
dec 04 09:26:23 NUC3 kernel: Error parsing options; rc = [-2]
The below apparently relates to a hands-free audio device.

Code: Select all

dec 04 09:26:24 NUC3 pulseaudio[1572]: [pulseaudio] backend-ofono.c: Failed to register as a handsfree audio agent with ofono: org.freedesktop.DBus.Error.ServiceUnknown: The name org.ofono was not provided by any .service files
dec 04 09:26:24 NUC3 pulseaudio[1651]: [pulseaudio] pid.c: Daemon already running.
ezijlstra wrote: Sat Dec 04, 2021 11:49 am
When booting this morning the system hung and I had to force a shutdown and restart. In view of these error messages I conclude that I have to move on to newer kernels as Karlchen and you suggested before.
I would suggest updating your BIOS since your system is missing the updated CPU Microcode Firmware and ME Firmware updates. Also check to make sure your swap and encryption are set up properly.
Edited to add missing close code tag.
ezijlstra
Level 1
Posts: 25
Joined: Tue Nov 30, 2021 4:43 pm

Re: Frequent unsolicited reboot or session break off

Post by ezijlstra »

Thanks again for your elaborate reply.
I am glad to learn that most of the error messages are harmless.
"I do not know if you did a fresh install of LM19 or you upgraded from LM18."
I did a fresh installation of LM19.2 and followed the installation guide. So I am not aware of having done anything special with the swap file / partition.
Is there a way to fix this? Would this constitute a problem for the upgrade to LM20? Or can I just ignore it and see what will happen?
"The below is related to your wireless device. Your inxi output indicates you are not using it, so I would not think this is significant."
Contrary to the first inxi, the system is now using wifi and the error messages pertain to this situation. Wifi is working, but sometimes quite slow. This may be the internet connection. But I sometimes wonder if there is something wrong in the system with the network connection. This is the present inxi.

Code: Select all

System:    Host: NUC3 Kernel: 4.15.0-163-generic x86_64 bits: 64 compiler: gcc v: 7.5.0 Desktop: Cinnamon 4.4.8 
           wm: muffin 4.4.4 dm: LightDM 1.26.0 Distro: Linux Mint 19.3 Tricia base: Ubuntu 18.04 bionic 
Machine:   Type: Mini-pc System: Intel Client Systems product: NUC8i7BEH v: J72992-309 serial: <filter> 
           Chassis: Intel Corporation type: 35 v: 2.0 serial: <filter> 
           Mobo: Intel model: NUC8BEB v: J72688-309 serial: <filter> UEFI: Intel v: BECFL357.86A.0087.2020.1209.1115 
           date: 12/09/2020 
CPU:       Topology: Quad Core model: Intel Core i7-8559U bits: 64 type: MT MCP arch: Kaby Lake rev: A L2 cache: 8192 KiB 
           flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 43198 
           Speed: 594 MHz min/max: 400/4500 MHz Core speeds (MHz): 1: 590 2: 586 3: 538 4: 500 5: 537 6: 571 7: 546 8: 500 
Graphics:  Device-1: Intel driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:3ea5 
           Display: x11 server: X.Org 1.19.6 driver: modesetting unloaded: fbdev,vesa resolution: 2560x1440~60Hz 
           OpenGL: renderer: Mesa DRI Intel Iris Plus Graphics 655 (CFL GT3) v: 4.6 Mesa 20.0.8 compat-v: 3.0 
           direct render: Yes 
Audio:     Device-1: Intel driver: snd_hda_intel v: kernel bus ID: 00:1f.3 chip ID: 8086:9dc8 
           Sound Server: ALSA v: k4.15.0-163-generic 
Network:   Device-1: Intel driver: iwlwifi v: kernel port: 4000 bus ID: 00:14.3 chip ID: 8086:9df0 
           IF: wlp0s20f3 state: up mac: <filter> 
           Device-2: Intel Ethernet I219-V driver: e1000e v: 3.2.6-k port: efa0 bus ID: 00:1f.6 chip ID: 8086:15be 
           IF: eno1 state: down mac: <filter> 
Drives:    Local Storage: total: 960.34 GiB used: 489.78 GiB (51.0%) 
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 500GB size: 465.76 GiB speed: 31.6 Gb/s lanes: 4 
           serial: <filter> rev: 2B2QEXM7 scheme: GPT 
           ID-2: /dev/sda vendor: Samsung model: SSD 860 EVO 500GB size: 465.76 GiB speed: 6.0 Gb/s serial: <filter> rev: 3B6Q 
           ID-3: /dev/sdb type: USB vendor: Kingston model: DataTraveler 3.0 size: 28.82 GiB serial: <filter> scheme: MBR 
Partition: ID-1: / size: 456.96 GiB used: 150.51 GiB (32.9%) fs: ext4 dev: /dev/nvme0n1p2 
           ID-2: swap-1 size: 2.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-0 
Sensors:   System Temperatures: cpu: 38.0 C mobo: 27.8 C 
           Fan Speeds (RPM): N/A 
Repos:     No active apt repos in: /etc/apt/sources.list 
           Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list 
           1: deb http://packages.linuxmint.com tricia main upstream import backport #id:linuxmint_main
           2: deb http://archive.ubuntu.com/ubuntu bionic main restricted universe multiverse
           3: deb http://archive.ubuntu.com/ubuntu bionic-updates main restricted universe multiverse
           4: deb http://archive.ubuntu.com/ubuntu bionic-backports main restricted universe multiverse
           5: deb http://security.ubuntu.com/ubuntu/ bionic-security main restricted universe multiverse
           6: deb http://archive.canonical.com/ubuntu/ bionic partner
Info:      Processes: 293 Uptime: 5h 40m Memory: 15.54 GiB used: 3.24 GiB (20.9%) Init: systemd v: 237 runlevel: 5 Compilers: 
           gcc: 7.5.0 alt: 7 Shell: bash v: 4.4.20 running in: gnome-terminal inxi: 3.0.32
I amended the PAM configuration files and the pam_kwallet messages have disappeared.
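For anyone wanting to do the same, here is a minimal sketch of how the lines behind those messages can be located. It assumes the PAM service files live in /etc/pam.d (the usual place on Mint and Ubuntu); the function name `find_kwallet_pam` is made up for illustration.

```shell
#!/bin/sh
# List every line in the PAM service files that references a kwallet module.
# The directory argument is optional and defaults to /etc/pam.d.
find_kwallet_pam() {
    dir="${1:-/etc/pam.d}"
    grep -rn 'pam_kwallet' "$dir"
}

# Example call on a real system:
#   find_kwallet_pam
```

Each hit is printed as file:line:content. Commenting out the reported lines (after backing up the file) is one way to silence the messages; installing kwallet is the other.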

I have no idea how postfix got onto the system. It is not on my other NUCs. Any idea how to get rid of it?
"I would suggest updating your BIOS since your system is missing the updated CPU Microcode Firmware and ME Firmware updates."
What is the observation that leads to this suggestion? What makes you conclude that there is something missing in the BIOS / firmware?
I got the impression that the gap between the kernel and the BIOS is too big and that I have to upgrade the kernel. In view of the Intel graphics problems 5.4 is not suitable. The first viable kernel is 5.5. But you lean toward upgrading the BIOS, which makes the gap even bigger. Or am I misinterpreting your advice?

Kind regards, Erik
SMG
Level 25
Posts: 31911
Joined: Sun Jul 26, 2020 6:15 pm
Location: USA

Re: Frequent unsolicited reboot or session break off

Post by SMG »

ezijlstra wrote: Sun Dec 05, 2021 1:06 pm
I did a fresh installation of LM19.2 and followed the installation guide. So I am not aware of having done anything special with the swap file / partition.
Is there a way to fix this? Would this constitute a problem for the upgrade to LM20? Or can I just ignore it and see what will happen?
I would suggest fixing it before upgrading. That is the first time I've seen that message in journalctl output and having swap issues can cause freezes/crashes.

What is the output of cat /etc/fstab, free, and ls -l /swapfile? (Those are three different terminal commands to run one at a time.)
ezijlstra wrote: Sun Dec 05, 2021 1:06 pm
I have no idea where postfix has entered the system. It is not on my other NUCs. Any idea how to get rid of it?
To be honest, that was the first time I had ever heard of it. I do not know its purpose.
ezijlstra wrote: Sun Dec 05, 2021 1:06 pm
I would suggest updating your BIOS since your system is missing the updated CPU Microcode Firmware and ME Firmware updates.
What is the observation that leads to this suggestion? What makes you conclude that there is something missing in the BIOS / firmware?
Because the release notes (which I posted above) for those BIOS updates indicate that is what Intel changed. I presume the Intel engineers would not make changes unless they had a good reason for doing so.
ezijlstra wrote: Sun Dec 05, 2021 1:06 pm
I got the impression that the gap between the kernel and the BIOS is too big and that I have to upgrade the kernel. In view of the Intel graphics problems 5.4 is not suitable. The first viable kernel is 5.5. But you tend to upgrading the BIOS which makes the gap even bigger. Or am I misinterpreting your advice?
The BIOS updates are completely independent of the operating system. They have nothing to do with what kernel you are using. Intel will put out BIOS updates when they find out there are problems and changing the BIOS code will fix the problems. They actually listed for each update what changes they made. Not all manufacturers give that kind of detail.

My recommendation on what kernel to use was based on feedback from others on this forum as to what stopped freeze problems they were having. It did not have anything to do with your BIOS version. However, sometimes BIOS updates can correct problems which cause freezes.
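To compare the installed firmware against Intel's release notes, the version can be read from the running system. This is a small sketch that assumes the kernel exposes DMI data under /sys/class/dmi/id, as it normally does on x86 PCs; the function name `show_bios_info` and its optional directory argument are illustrative only.

```shell
#!/bin/sh
# Print the firmware vendor, version and date from the kernel's DMI sysfs
# interface. The directory argument defaults to /sys/class/dmi/id and exists
# mainly so the function can be tried against a copy of those files.
show_bios_info() {
    dir="${1:-/sys/class/dmi/id}"
    for field in bios_vendor bios_version bios_date; do
        if [ -r "$dir/$field" ]; then
            printf '%s: %s\n' "$field" "$(cat "$dir/$field")"
        fi
    done
}

# Example call on a real system:
#   show_bios_info
```

The bios_version value should match the UEFI version that inxi prints. The microcode revision currently loaded can be checked separately with `grep -m1 microcode /proc/cpuinfo`.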
ezijlstra
Level 1
Posts: 25
Joined: Tue Nov 30, 2021 4:43 pm

Re: Frequent unsolicited reboot or session break off

Post by ezijlstra »

Hi SMG,
Here is the information you asked for:

Code: Select all

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/nvme0n1p2 during installation
UUID=05d3f7d2-2680-4d27-8da3-7b18025ec364 /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/nvme0n1p1 during installation
UUID=E73D-B9DB  /boot/efi       vfat    umask=0077      0       1
/swapfile                                 none            swap    sw              0       0
/dev/mapper/cryptswap1 none swap sw 0 0
LABEL=Backup\040bestanden /mnt/Backup\040bestanden auto nosuid,nodev,nofail 0 0

Code: Select all

$ free
              total        used        free      shared  buff/cache   available
Mem:       16293504     1819528     7729576      496884     6744400    13648384
Swap:       2096636           0     2096636

Code: Select all

$ ls -l /swapfile
-rw------- 1 root root 2147483648 dec  5 12:24 /swapfile
I found some information on Postfix. I guess it came with some tool I installed to analyse slow internet traffic. You don't have to research that.
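For anyone wanting to confirm which installation pulled Postfix in, the apt history log is one place to look. The sketch below assumes the log is at /var/log/apt/history.log and uses the usual Start-Date / Commandline / Install layout; the function name `find_install_txn` is hypothetical.

```shell
#!/bin/sh
# Print the start time and command line of each apt transaction whose
# "Install:" line lists the given package.
# $1 = package name, $2 = log file (defaults to /var/log/apt/history.log)
find_install_txn() {
    pkg="$1"
    log="${2:-/var/log/apt/history.log}"
    awk -v pkg="$pkg" '
        /^Start-Date:/  { start = $0; cmd = "" }
        /^Commandline:/ { cmd = $0 }
        /^Install:/ && index($0, " " pkg ":") {
            print start
            if (cmd != "") print cmd
        }
    ' "$log"
}

# Example call on a real system:
#   find_install_txn postfix
```

Older transactions are in rotated, compressed logs (history.log.1.gz and so on), which would need to be unpacked with zcat first. `apt-cache rdepends --installed postfix` lists the installed packages that depend on it, and `sudo apt purge postfix` removes it if nothing else needs it.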

kind regards, Erik
SMG
Level 25
Posts: 31911
Joined: Sun Jul 26, 2020 6:15 pm
Location: USA

Re: Frequent unsolicited reboot or session break off

Post by SMG »

ezijlstra wrote: Sun Dec 05, 2021 6:40 pm Hi SMG,
Here is the information you asked for:
Unless someone else is following along and can give advice, I would suggest creating a new topic on this swap issue. I know basic information, but I do not have experience helping others fix swap issues, especially when it involves encrypted swap.

Here is the most recent inxi info you posted which shows you have a swap partition that seems to be encrypted (the dm-0 part). I do not know why it is encrypted because it does not appear your install is encrypted and the swap partition is too small for hibernation.

Code: Select all

Partition: ID-1: / size: 456.96 GiB used: 151.10 GiB (33.1%) fs: ext4 dev: /dev/nvme0n1p2 
           ID-2: swap-1 size: 2.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-0 
The info from fstab shows both a swapfile and a swap partition.

Code: Select all

# <file system>       <mount point>   <type>  <options>     <dump>  <pass>
/swapfile              none            swap     sw             0      0
/dev/mapper/cryptswap1 none            swap     sw             0      0
This shows you do have a swapfile.

Code: Select all

$ ls -l /swapfile
-rw------- 1 root root 2147483648 dec  5 12:24 /swapfile
The default size of a swapfile created by Mint is 2GB, but that is also the size showing for your swap partition. So I am not sure which 2GB is showing below when taking into account the error message that was in the log (-- Subject: Unit swapfile.swap has failed).

Code: Select all

$ free
              total        used        free      shared  buff/cache   available
Mem:       16293504     1819528     7729576      496884     6744400    13648384
Swap:       2096636           0     2096636
Based on what I have mentioned here, do you recall anything special which might have been done with regards to swap? Did you at any time try to optimize swap?
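The question of which 2GB is active can usually be settled from the kernel's own list of swap areas. This is a minimal sketch assuming the standard /proc/swaps layout (a header line, then one line per active area with the size in KiB in the third column); the function name `active_swaps` is illustrative.

```shell
#!/bin/sh
# Print the path and size (in KiB) of every swap area the kernel is using.
# The file argument defaults to /proc/swaps.
active_swaps() {
    src="${1:-/proc/swaps}"
    awk 'NR > 1 { print $1, $3 }' "$src"
}

# Example call on a real system:
#   active_swaps
```

If only /dev/dm-0 is listed, the swapfile entry in fstab is the one failing to activate, which would fit both the inxi output and the swap total of roughly 2GB reported by free. `swapon --show` gives the same information in a friendlier table.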
ezijlstra
Level 1
Posts: 25
Joined: Tue Nov 30, 2021 4:43 pm

Re: Frequent unsolicited reboot or session break off

Post by ezijlstra »

"do you recall anything special which might have been done with regards to swap? Did you at any time try to optimize swap?"
The only thing that I can think of is that I have an encrypted home directory. This may explain why the swap has to be encrypted too, and perhaps this is implemented by means of an encrypted partition.
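Whether the cryptswap1 mapping was set up this way can be checked in the crypttab file. The sketch below assumes the standard four-field crypttab layout (name, source device, key file, options) and the default path /etc/crypttab; the function name `crypt_swap_entries` is made up for this example.

```shell
#!/bin/sh
# List mappings in a crypttab-style file whose options include "swap".
# Prints the mapping name and its source device for each match.
crypt_swap_entries() {
    file="${1:-/etc/crypttab}"
    awk '$1 !~ /^#/ && $4 ~ /(^|,)swap(,|$)/ { print $1, $2 }' "$file"
}

# Example call on a real system:
#   crypt_swap_entries
```

A cryptswap1 line with /dev/urandom as the key file would mean the swap is re-keyed on every boot, which is how encrypted swap is normally set up alongside an ecryptfs home directory.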

I read somewhere that ecryptfs is better replaced by full disk encryption. I think I will go for the latter and do a fresh installation of LM20.2.

I will let you know if this results in a stable system.

Thanks again for your support. Erik
ezijlstra
Level 1
Posts: 25
Joined: Tue Nov 30, 2021 4:43 pm

Re: Frequent unsolicited reboot or session break off

Post by ezijlstra »

After my previous post I started preparations for installing LM20.2, but then I discovered that it comes with kernel 5.4.x and not with a newer kernel. So I decided not to upgrade as 5.4 gave frequent sudden freezes.
In the meantime the system has been fairly stable with the automatic monitor switch-off disabled.
Only when booting does the system every now and then fail to reach the login screen.
I edited the grub configuration according to the release notes, trying nomodeset, but this led to a complaint about the video driver.
Next I booted with pcie=noaer, which seems to perform well, but the system still sometimes fails to reach the login screen when booting.

Then I tried to trap the system at the failing point by editing the booting commands: setting gfxmode text and removing quiet splash and vt.handoff.
This produces a lot of messages flying over the screen, but the system boots perfectly.
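To make such boot options permanent rather than editing them at the GRUB menu each time, they go into the GRUB defaults file. As a minimal sketch, assuming the usual /etc/default/grub location on Mint, the current default options can be read back like this (the function name `grub_default_cmdline` is illustrative):

```shell
#!/bin/sh
# Print the value of GRUB_CMDLINE_LINUX_DEFAULT from a GRUB defaults file.
# The file argument defaults to /etc/default/grub.
grub_default_cmdline() {
    file="${1:-/etc/default/grub}"
    sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file"
}

# Example call on a real system:
#   grub_default_cmdline
```

After changing that line (for instance removing quiet splash), `sudo update-grub` regenerates the boot menu, and `cat /proc/cmdline` shows which options the running kernel was actually started with.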

Today I switched the computer on and missed the timeout for editing the grub command lines. And guess what: the system booted, but no login screen.
I subsequently noticed that the on/off switch of the monitor didn't have any effect.
I removed the power from the monitor and put it back on: and there was the login screen.

Any idea how to diagnose this problem?
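One possible starting point, offered as a sketch rather than a confirmed fix: since power-cycling the monitor brought the login screen back, it may help to see whether the kernel considers the monitor connected while the screen is black. This assumes the DRM connectors are exposed under /sys/class/drm, as they are with the i915 driver, and that the machine can be reached over SSH at that moment; the function name `drm_connector_status` is made up for illustration.

```shell
#!/bin/sh
# Print "connector: status" for each DRM connector the kernel exposes.
# The directory argument defaults to /sys/class/drm.
drm_connector_status() {
    base="${1:-/sys/class/drm}"
    for f in "$base"/card*-*/status; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Example call on a real system:
#   drm_connector_status
```

If the connector reads connected while the screen stays black, the monitor itself or its link negotiation is the more likely suspect. Errors from the previous boot can be reviewed with `journalctl -b -1 -p err`.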