system load average pegged at 1.3, but system is mostly idle

rmbjr60
Level 1
Posts: 19
Joined: Sun Feb 18, 2018 3:33 pm

system load average pegged at 1.3, but system is mostly idle

Post by rmbjr60 »

I frequently get my system into a state in which it has a continual load average of ~1.3 and the fan is running high, yet I've closed all applications. According to top, almost nothing is running on the system.

I used vmstat and noticed I've got a steady rate of ~40k interrupts/sec and ~4k context switches/sec ... both of which seem *very* high and should not be steady at these high rates.

This state usually occurs after I've been running a couple of browsers, each with many tabs open, including tabs at sites with ads playing almost continually. But I close all browsers, and yet the situation continues.

I'm on LM 21 (but I've observed this problem since at least LM 20). My system has 24G of memory and a 2G swap space. Most of each type of memory is free (see top output below). I've got a 4-core/8-thread i7-8550U CPU @ 1.8GHz.

Any ideas what is causing these interrupts and context switches???

Here's the output of vmstat and top...

Code: Select all


% vmstat 5 55
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0   1280 12047476 2148684 9079024    0    0     8    34   27   13  3  1 96  0  0
 0  0   1280 12074440 2148688 9062704    0    0     0   218 39825 4117  0  0 100  0  0
 1  0   1280 11971408 2149216 9063120    0    0   185     0 39940 4337  6  4 90  0  0
 0  0   1280 12066148 2149224 9063180    0    0     2    27 39496 4109  2  2 96  0  0
 0  0   1280 12073708 2149224 9063116    0    0     0     0 39803 4104  0  0 100  0  0
 0  0   1280 12064636 2149224 9063116    0    0     0     0 39782 4059  0  0 100  0  0
 1  0   1280 11918284 2149376 9094520    0    0  5103   163 39718 5581  3  1 94  2  0
 0  0   1280 11862432 2149484 9092956    0    0   486  1816 39999 10421  9  1 90  0  0
 0  0   1280 11862180 2149488 9092892    0    0     0   436 39816 4501  0  0 100  0  0
 0  0   1280 11860720 2149488 9092892    0    0     0    94 39830 4317  0  0 100  0  0
 2  0   1280 11860216 2149496 9092892    0    0     0     3 39797 4331  0  0 100  0  0
^C

%  top -b
top - 01:03:50 up 2 days, 20:02,  1 user,  load average: 1.36, 1.78, 1.95
Tasks: 272 total,   1 running, 271 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  1.5 sy,  0.0 ni, 97.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7.5/24351496 [|||||||                                                                                             ]
KiB Swap:  0.1/2097148  [                                                                                                    ]

S USER        VIRT    RES   SWAP    SHR   CODE    DATA nMaj nMin vMj vMn OOMs  OOMa nTH WCHAN      TTY       %CPU  %MEM   TIME     PID COMMAND
S root      167576  13104      0   8272    896   20840  148  74k   0   2    0     0   1 -          ?          6.2   0.1   0:05       1 /sbin/init splash
S r6m0bjr  4783372 180380      0 101508      4  215820    6  60k   0   0  670     0  21 do_poll.c+ ?          6.2   0.7   0:03  195640 cinnamon --replace
R r6m0bjr    13240   4292      0   3432     80    1444    0  340   0   5  666     0   1 -          pts/0      6.2   0.0   0:00  196098 top -b

top - 01:03:54 up 2 days, 20:02,  1 user,  load average: 1.33, 1.76, 1.94
Tasks: 272 total,   1 running, 271 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  0.3 sy,  0.0 ni, 99.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7.5/24351496 [|||||||                                                                                             ]
KiB Swap:  0.1/2097148  [                                                                                                    ]

S USER        VIRT    RES   SWAP    SHR   CODE    DATA nMaj nMin vMj vMn OOMs  OOMa nTH WCHAN      TTY       %CPU  %MEM   TIME     PID COMMAND
S r6m0bjr   406624  50432      0  30968   2756   53144    1 5960   0   0  667     0   4 do_poll.c+ ?          1.2   0.2   0:00  196046 mintreport-tray
D root           0      0      0      0      0       0    0    0   0   0    0     0   1 -          ?          0.7   0.0  16:13     264 [irq/51-DELL0804]
S r6m0bjr  4783372 180380      0 101508      4  215820    6  60k   0   1  670     0  21 do_poll.c+ ?          0.5   0.7   0:03  195640 cinnamon --replace
S root      276148  10756      0   9748    344   25928   20  616   0   0  666     0   4 -          ?          0.2   0.0   4:38     881 /usr/sbin/thermald --systemd --dbus-enable --adaptive
S root     1726348  55624      0  35572  18972  163672    0 7426   0   0    2   -16  14 -          ?          0.2   0.2   0:30   84374 /usr/bin/containerd
S root     1552164 101892      0  67712   1696  162388   17  21k   0  30  668     0  17 -          tty7       0.2   0.4   0:01  195252 /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
S r6m0bjr   540544  42220      0  32764    168   58760  128 4822   0  25  667     0   4 -          ?          0.2   0.2   0:00  195905 /usr/libexec/gnome-terminal-server
I root           0      0      0      0      0       0    0    0   0   0    0     0   1 -          ?          0.2   0.0   0:00  196043 [kworker/0:1-events]
R r6m0bjr    13240   4292      0   3432     80    1444    0  345   0   5  666     0   1 -          pts/0      0.2   0.0   0:00  196098 top -b

top - 01:03:58 up 2 days, 20:03,  1 user,  load average: 1.33, 1.76, 1.94
Tasks: 272 total,   1 running, 271 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.2 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7.5/24351496 [|||||||                                                                                             ]
KiB Swap:  0.1/2097148  [                                                                                                    ]

S USER        VIRT    RES   SWAP    SHR   CODE    DATA nMaj nMin vMj vMn OOMs  OOMa nTH WCHAN      TTY       %CPU  %MEM   TIME     PID COMMAND
S r6m0bjr  4783372 180444      0 101572      4  215820    6  61k   0  12  670     0  20 do_poll.c+ ?          0.7   0.7   0:03  195640 cinnamon --replace
D root           0      0      0      0      0       0    0    0   0   0    0     0   1 -          ?          0.5   0.0  16:13     264 [irq/51-DELL0804]
I root           0      0      0      0      0       0    0    0   0   0    0     0   1 -          ?          0.2   0.0   3:16       9 [kworker/0:1H-events_highpri]
I root           0      0      0      0      0       0    0    0   0   0    0     0   1 -          ?          0.2   0.0   1:00      14 [rcu_sched]
I root           0      0      0      0      0       0    0    0   0   0    0     0   1 -          ?          0.2   0.0   0:09     149 [kworker/7:1H-events_highpri]
S root     1552164 101892      0  67712   1696  162388   17  21k   0   5  668     0  17 -          tty7       0.2   0.4   0:01  195252 /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
R r6m0bjr    13240   4292      0   3432     80    1444    0  346   0   1  666     0   1 -          pts/0      0.2   0.0   0:00  196098 top -bq^C



deck_luck
Level 6
Posts: 1373
Joined: Mon May 27, 2019 6:57 pm
Location: R-4808 North

Re: system load average pegged at 1.3, but system is mostly idle

Post by deck_luck »

rmbjr60 wrote:
Fri Sep 16, 2022 1:22 am
...
I used vmstat and noticed I've got a steady rate of ~40k interrupts/sec and ~4k context switches/sec ... both of which seem *very* high and should not be steady at these high rates.
...
That is a good catch; most desktop users would never notice the interrupts/sec. Back in my HP-UX days, one of our customers ran into a cripplingly slow system response. Eventually, it was tracked down to a very high interrupt rate. It was caused by an external cable that was connected to the PA-RISC computer at one end but disconnected at the other. The cable was acting like an antenna and caused spurious interrupts.

Troubleshooting:

First, you might try a minimum configuration by disconnecting everything except the bare essentials like the keyboard, mouse, monitor, etc. If that fails, try re-seating the boards on the motherboard as well as any internal cables.
rmbjr60
Level 1
Posts: 19
Joined: Sun Feb 18, 2018 3:33 pm

Re: system load average pegged at 1.3, but system is mostly idle

Post by rmbjr60 »

I've tried disconnecting various bits of hardware (usb drive, keyboard, mouse) - initially, to no avail.

The problem would recur even if I rebooted the system. However, as of three days ago the problem has reduced drastically (but not completely disappeared). This was shortly after connecting/disconnecting various devices (which, as I said, did not resolve the problem) and rebooting the system. I still don't know what caused the improvement. Anyway...

After googling around I came upon what might be going on. This is an issue that has been reported since at least 2013 (probably earlier) and is related to interrupts from ACPI (Advanced Configuration and Power Interface). Searching on "acpi interrupts high cpu" returns tons of matches regarding this problem.

Here are typical examples..

https://unix.stackexchange.com/question ... 300#528300

https://bbs.archlinux.org/viewtopic.php ... 0#p1696370

To identify whether the problem is occurring, the first step is to find out if any of the acpi interrupts are increasing at a high rate. The typical way to find this is...

Code: Select all

grep . -r /sys/firmware/acpi/interrupts
... and then looking through the output for very high integers. If there is a match, repeat the command a few times and see if that high integer increases at a "high" rate (presumably approximately increasing at the rate of interrupts reported by vmstat?)

I preferred to narrow down my grep results with the longer format...

Code: Select all

grep . -r /sys/firmware/acpi/interrupts | grep /gpe | grep -Eav ':[ \t][ \t]*.[ \t][ \t]*|gpe_all'
... although this longer command is not necessary. It simply makes the output less cluttered.

On my system, I did indeed have one particular interrupt that increased by leaps and bounds every time I repeated the grep...

Code: Select all

grep . -r /sys/firmware/acpi/interrupts | grep /gpe | grep -Eav ':[ \t][ \t]*.[ \t][ \t]*|gpe_all'
/sys/firmware/acpi/interrupts/gpe6E: 1955895 EN enabled unmasked
... that number, 1955895, stood out like a sore thumb when I ran the first grep. And, it was increasing.
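
The "repeat the grep and compare" step can be scripted. What follows is only a sketch: the function name gpe_deltas and the snapshot file names are made up for illustration, and it assumes each snapshot line has the "path: count flags" shape shown in the grep output above.

```shell
# gpe_deltas BEFORE AFTER
# Compare two snapshots produced by:
#   grep . -r /sys/firmware/acpi/interrupts > FILE
# and print "delta path" for every counter that grew, largest first.
gpe_deltas() {
    awk -F: '
        # Each snapshot line looks like: /sys/.../gpe6E: 1955895 EN enabled unmasked
        { split($2, f, " "); count = f[1] + 0 }
        NR == FNR { before[$1] = count; next }      # first file: remember counts
        ($1 in before) && count > before[$1] {      # second file: report growth
            print count - before[$1], $1
        }
    ' "$1" "$2" | sort -rn
}

# Usage (not run here; the /tmp file names are placeholders):
#   grep . -r /sys/firmware/acpi/interrupts > /tmp/gpe.before
#   sleep 5
#   grep . -r /sys/firmware/acpi/interrupts > /tmp/gpe.after
#   gpe_deltas /tmp/gpe.before /tmp/gpe.after | head
```

Aggregate counters such as gpe_all will show up next to the individual gpeXX entries, so look for the single gpeXX line that accounts for most of the growth.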

These acpi interrupts are specific to a system manufacturer's firmware (e.g., Dell, HP, Lenovo, etc.) and are often used to signal events such as the closing of a laptop's lid, or the power button having been pressed. You can disable one of these interrupts and you'll only lose the ability to detect that particular event. Unfortunately, it is difficult to find out which interrupt is associated with which event. Nevertheless, the advised workaround I've found in all answers is to disable the offending interrupt, no matter which interrupt it is (as long as it is an acpi interrupt).

How to disable? According to the first link above, the current method is as follows..

Code: Select all

echo mask | sudo tee /sys/firmware/acpi/interrupts/gpeXX
.. where 'XX' corresponds to the last two hex digits of the interrupt name in the grep output. For example, in my case (see output, above) the interrupt path is /sys/firmware/acpi/interrupts/gpe6E ... so for me the value of 'XX' is '6E'.

There are variations on the form of the command (e.g., echo mask > /sys/firmware/acpi/interrupts/gpeXX), but you may run into "permission denied". Doing "sudo echo mask ..." won't work because the I/O redirection is handled by your own shell, not by the command being run by sudo. There are other ways to get sudo to work. Using "... | sudo tee ..." is just one of many.
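
To make the redirection point concrete, here is a small sketch. The helper set_gpe_state and its SUDO variable are hypothetical names introduced only for this example; the two commands in the comments are the usual ways of letting a privileged process open the file.

```shell
# Why "sudo echo mask > FILE" fails: the calling (unprivileged) shell opens
# FILE for the redirection before sudo ever runs. Two common ways around it
# let a privileged process open the file instead:
#   echo mask | sudo tee FILE
#   sudo sh -c 'echo mask > FILE'

# set_gpe_state FILE STATE
# Hypothetical helper: writes STATE (e.g. "mask") to FILE through tee.
# The SUDO variable is an assumption of this sketch; it defaults to "sudo"
# and can be set to an empty string to try the function on a scratch file.
set_gpe_state() {
    printf '%s\n' "$2" | ${SUDO-sudo} tee "$1" > /dev/null
}

# Usage (not run here):
#   set_gpe_state /sys/firmware/acpi/interrupts/gpe6E mask
```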

Also, it appears an earlier workaround was to use "echo disable" rather than "echo mask". According to the first of the links above, "echo mask" is the newer, correct, way.

This should immediately stop that interrupt from occurring. However, it will come back after reboot. There are multiple solutions for making it persistent. I won't give one here (they're fairly easy to find) because I'm not sure which is the best (if there is a 'best'), and also it seems some solutions depend on the flavor of linux you're using.

For a nice explanation of why the gpe interrupt flooding occurs, check this out: https://git.kernel.org/pub/scm/linux/ke ... e58fbafc11
Petermint
Level 8
Posts: 2474
Joined: Tue Feb 16, 2016 3:12 am

Re: system load average pegged at 1.3, but system is mostly idle

Post by Petermint »

Interesting issue you present. How did you display the interrupts? I might check my machines as a reference point for comparison, in case they have the same problem.

As a general rule for diagnosis, measure, change one thing, reboot, measure. When you fix your system, please post the result for comparison. There may be many other machines with a similar problem undetected.

As an example my desktop server has bunches of stuff plugged in and some cables hanging off it where things were plugged in. I do not know how active it is and what would be the idle level of activity if cleaned up.
rmbjr60
Level 1
Posts: 19
Joined: Sun Feb 18, 2018 3:33 pm

Re: system load average pegged at 1.3, but system is mostly idle

Post by rmbjr60 »

... yes, it is a *very* interesting situation!

I've learned to check my interrupts in various ways. I currently use all of them..

(1) the first way I learned was via "vmstat", which prints the interrupt rate *and* the context switches rate. I prefer it with the -t (timestamps) and -w (wide output) switches. For example..

Code: Select all

% vmstat -t -w 5 5

--procs-- -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- -----timestamp-----
   r    b         swpd         free         buff        cache   si   so    bi    bo   in   cs  us  sy  id  wa  st                 EDT
   0    0            0     20281960       238936      2104328    0    0   143    19  245  507   4   2  94   0   0 2022-09-26 22:15:27
   0    0            0     20289576       238944      2096116    0    0     0    42 3280 7316   6   3  91   0   0 2022-09-26 22:15:32
   0    0            0     20304192       238952      2104328    0    0     0    25 3286 7305   6   3  91   0   0 2022-09-26 22:15:37
   0    0            0     20305244       238952      2096116    0    0     0    14 3242 7300   6   3  90   0   0 2022-09-26 22:15:42
   0    0            0     20288136       238952      2104332    0    0     0     0 3278 7235   6   3  91   0   0 2022-09-26 22:15:47
.. the first sample can always be discarded: it reports averages since boot rather than a rate measured over the sampling interval.

In the above example the system is operating "normally" ... i.e., the rate of interrupts, ~3.2K/second, is minuscule in comparison to the rate during one of my interrupt storms (~40k/sec).

Here's how vmstat looks when a storm kicks in..

Code: Select all

   0    0          596      8292252       733740      8957624    0    0     0    11 1233 2249  63  37   0   0   0 2022-09-26 17:05:52
   0    0          596      8285376       733820      8957556    0    0     0    23 1299 2278  59  41   0   0   0 2022-09-26 17:06:52
   1    0          596      8314628       733896      8959412    0    0     0    19 4122 3318  70  30   0   0   0 2022-09-26 17:07:52
   0    0          596      8152048       734100      9084252    0    0    43    60 17787 5755  61  27   0  12   0 2022-09-26 17:23:15
   0    0          596      8180104       734184      9063972    0    0     0    74 40900 7298  55  45   0   0   0 2022-09-26 17:24:15
   0    0          596      8174888       734244      9064000    0    0     0    14 40899 7198  54  46   0   0   0 2022-09-26 17:25:15
   0    0          596      8157948       734344      9064012    0    0     0    14 40900 7223  56  44   0   0   0 2022-09-26 17:26:15
   0    0          596      8139664       734396      9064244    0    0     0     9 40899 7244  56  44   0   0   0 2022-09-26 17:27:15
   0    0          596      8127700       734456      9064064    0    0     0     8 40857 7253  55  45   0   0   0 2022-09-26 17:28:15
   0    0          596      8114152       734504      9064068    0    0     0    12 40785 7144  54  46   0   0   0 2022-09-26 17:29:15
   0    0          596      8098280       734560      9064184    0    0     0    10 40782 7114  54  46   0   0   0 2022-09-26 17:30:15
... there is no mistaking that something happened between 17:22 and 17:23! From that point onward the interrupt rate was pegged at ~40k/second ... which persisted until I rebooted the system.
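
When vmstat is left running for hours, the storm samples can be picked out automatically. A rough sketch, assuming the standard vmstat column order in which "in" is the 11th field; the function name flag_storm and the 20000 threshold are arbitrary choices for illustration:

```shell
# flag_storm THRESHOLD
# Reads vmstat output on stdin and prints every sample whose interrupt rate
# (the "in" column, field 11) exceeds THRESHOLD. Header lines are skipped,
# and so is the first sample, which holds averages since boot.
flag_storm() {
    awk -v limit="$1" '
        $1 !~ /^[0-9]+$/ { next }          # header or blank line
        !seen++          { next }          # first numeric row: since-boot averages
        $11 + 0 > limit  { print }
    '
}

# Usage (not run here):
#   vmstat -t -w 60 | flag_storm 20000
```

With the -t switch the timestamp stays on each printed line, so the first line printed marks roughly when the storm began.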

(2) Another way I keep an eye on the interrupts is by doing a sequence of 'cat /proc/interrupts' and looking for any value that increases at a high rate. I found a great visual tool to assist in this approach: 'watch' (from the 'procps' package); it runs a console command periodically, refreshing the screen with each iteration and highlighting any/all on-screen characters that changed from one iteration to the next. I was able to identify the interrupt that increased at the highest rate (by at least an order of magnitude): interrupt 16, which is associated with (according to "cat /proc/interrupts"): "IR-IO-APIC 16-fasteoi i801_smbus, idma64.0, i2c_designware.0". Sadly, I don't understand what it all means ... but the fact that its count is increasing at such a high rate is a significant clue! Furthermore, when the problem is *not* occurring, there is no interrupt in the 'watch' output that increases anywhere nearly as quickly as this one. Clearly this is a smoking gun.
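
For a record that can be saved and compared later, two copies of /proc/interrupts can be diffed. This is only a sketch under the assumption that the file has its usual layout (a header row of CPU names, then one row per interrupt with one count per CPU); irq_deltas and the /tmp file names are hypothetical.

```shell
# irq_deltas BEFORE AFTER
# Compare two saved copies of /proc/interrupts and print, largest first,
# "delta irq description" for every interrupt line whose total count
# (summed over all CPU columns) increased.
irq_deltas() {
    awk '
        FNR == 1 { ncpu = NF; next }               # header row: CPU0 CPU1 ...
        {
            total = 0
            for (i = 2; i <= ncpu + 1 && i <= NF; i++) {
                if ($i !~ /^[0-9]+$/) break         # short rows such as ERR/MIS
                total += $i
            }
            desc = ""
            for (; i <= NF; i++) desc = desc " " $i
        }
        NR == FNR { before[$1] = total; next }
        ($1 in before) && total > before[$1] {
            print total - before[$1], $1 desc
        }
    ' "$1" "$2" | sort -rn
}

# Usage (not run here):
#   cat /proc/interrupts > /tmp/irq.before; sleep 5
#   cat /proc/interrupts > /tmp/irq.after
#   irq_deltas /tmp/irq.before /tmp/irq.after | head
# Interactive equivalent with highlighting of changed characters:
#   watch -d -n 1 cat /proc/interrupts
```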

(3) An earlier method I used turned out to be a red herring for me. There has been an ongoing issue with acpi interrupts for several years. The best way to identify if an acpi interrupt flood is occurring is to run..

Code: Select all

grep . -r /sys/firmware/acpi/interrupts
.. and looking for any particular interrupt that increases at a high rate. Initially I thought gpe6E was my culprit; so I disabled it. Unfortunately, even though I confirmed gpe6E was disabled (and masked), my system's interrupt storm continued.

So, those are the three ways I've looked at interrupts. A few days ago I thought it was just a matter of disabling gpe6E. After that proved to be ineffective I started attempting to associate particular system events with the time that an interrupt storm occurred.

Tonight I found what I hoped was a smoking gun ... the interrupt storm appeared to kick in roughly at the time I did one of four activities: open the lid on my laptop, plug my laptop into A/C power, attach an external monitor, and reconnect a USB drive... all of which are activities I took at approximately the same moment my interrupt storm commenced this evening (I had been collecting 'vmstat' results 1x/minute for several hours, so I have a fairly accurate gauge as to when the problem started).

I am now in the stage of trying to reproduce the issue via one of these four events. As of yet, I've not succeeded in reproducing the problem on a reliable basis.
Petermint
Level 8
Posts: 2474
Joined: Tue Feb 16, 2016 3:12 am

Re: system load average pegged at 1.3, but system is mostly idle

Post by Petermint »

Thanks for the info. 8) I ran the same test. 400 ins at idle with just a browser and email open. Started System monitor. Ins jumped to 1000. Told email to check the server. 3000 Ins. Tomorrow I will run same test on the server with all the junk attached.

That weird item you mention. I searched and found references to a support chip for Celerons and kernel 6.0.
Pjotr
Level 23
Posts: 18163
Joined: Mon Mar 07, 2011 10:18 am
Location: The Netherlands (Holland) 🇳🇱

Re: system load average pegged at 1.3, but system is mostly idle

Post by Pjotr »

Very interesting, but also very fruitless for potential helpers, because you've provided far too little system info. :wink:

Please generate an overview of your system like this:
- Launch a terminal window;
- Make the terminal window full screen, to avoid chopped lines;
- Copy/paste this command into the terminal:

Code: Select all

inxi -Fxpmrz
(if you type: the letter F is a capital letter, and don't omit the space after inxi!)

Press Enter.

Copy/paste the output in your next message.
rmbjr60
Level 1
Posts: 19
Joined: Sun Feb 18, 2018 3:33 pm

Re: system load average pegged at 1.3, but system is mostly idle

Post by rmbjr60 »

Here is the output you requested...

Code: Select all

% inxi -Fxpmrz
System:
  Kernel: 5.15.0-48-generic x86_64 bits: 64 compiler: gcc v: 11.2.0
    Desktop: Cinnamon 5.4.12 Distro: Linux Mint 21 Vanessa
    base: Ubuntu 22.04 jammy
Machine:
  Type: Laptop System: Dell product: Inspiron 5379 v: N/A
    serial: <superuser required>
  Mobo: Dell model: 01R32P v: A00 serial: <superuser required> UEFI: Dell
    v: 1.11.0 date: 01/15/2019
Battery:
  ID-1: BAT0 charge: 25.6 Wh (100.0%) condition: 25.6/42.0 Wh (60.9%)
    volts: 12.9 min: 11.4 model: Samsung SDI DELL CYMGM7A status: Full
  Device-1: hidpp_battery_0 model: Logitech Wireless Keyboard
    charge: 55% (should be ignored) status: Discharging
Memory:
  RAM: total: 23.22 GiB used: 4.67 GiB (20.1%)
  RAM Report:
    permissions: Unable to run dmidecode. Root privileges required.
CPU:
  Info: quad core model: Intel Core i7-8550U bits: 64 type: MT MCP
    arch: Coffee Lake rev: A cache: L1: 256 KiB L2: 1024 KiB L3: 8 MiB
  Speed (MHz): avg: 856 high: 1062 min/max: 400/4000 cores: 1: 800 2: 800
    3: 800 4: 800 5: 800 6: 1062 7: 988 8: 800 bogomips: 31999
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: Intel UHD Graphics 620 vendor: Dell driver: i915 v: kernel
    bus-ID: 00:02.0
  Device-2: Realtek Integrated_Webcam_HD type: USB driver: uvcvideo
    bus-ID: 1-5:3
  Display: x11 server: X.Org v: 1.21.1.3 driver: X: loaded: modesetting
    unloaded: fbdev,vesa gpu: i915 resolution: 1: 1920x1080~60Hz
    2: 1920x1080~60Hz
  OpenGL: renderer: Mesa Intel UHD Graphics 620 (KBL GT2)
    v: 4.6 Mesa 22.0.5 direct render: Yes
Audio:
  Device-1: Intel Sunrise Point-LP HD Audio vendor: Dell
    driver: snd_hda_intel v: kernel bus-ID: 00:1f.3
  Sound Server-1: ALSA v: k5.15.0-48-generic running: yes
  Sound Server-2: PulseAudio v: 15.99.1 running: yes
  Sound Server-3: PipeWire v: 0.3.48 running: yes
Network:
  Device-1: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
    vendor: Dell driver: ath10k_pci v: kernel bus-ID: 01:00.0
  IF: wlp1s0 state: up mac: <filter>
  IF-ID-1: docker0 state: down mac: <filter>
Bluetooth:
  Device-1: Qualcomm Atheros type: USB driver: btusb v: 0.8 bus-ID: 1-7:4
  Report: hciconfig ID: hci0 rfk-id: 2 state: up address: <filter>
    bt-v: 2.1 lmp-v: 4.2
Drives:
  Local Storage: total: 4.09 TiB used: 3.35 TiB (81.9%)
  ID-1: /dev/sda vendor: HP model: SSD S700 500GB size: 465.76 GiB
  ID-2: /dev/sdb type: USB vendor: Western Digital model: WD easystore 2648
    size: 3.64 TiB
Partition:
  ID-1: / size: 456.89 GiB used: 229.43 GiB (50.2%) fs: ext4 dev: /dev/sda2
  ID-2: /boot/efi size: 511 MiB used: 5.2 MiB (1.0%) fs: vfat
    dev: /dev/sda1
  ID-3: /easystore size: 3.58 TiB used: 3.13 TiB (87.4%) fs: ext4
    dev: /dev/sdb1
Swap:
  ID-1: swap-1 type: file size: 2 GiB used: 0 KiB (0.0%) file: /swapfile
Sensors:
  System Temperatures: cpu: 49.0 C pch: 49.0 C mobo: 48.0 C sodimm: SODIMM C
  Fan Speeds (RPM): cpu: 0
Repos:
  Packages: 2585
  No active apt repos in: /etc/apt/sources.list
  Active apt repos in: /etc/apt/sources.list.d/docker.list
    1: deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu vanessa stable
  Active apt repos in: /etc/apt/sources.list.d/giuspen-ppa-jammy.list
    1: deb http://ppa.launchpad.net/giuspen/ppa/ubuntu jammy main
    2: deb-src http://ppa.launchpad.net/giuspen/ppa/ubuntu jammy main
  Active apt repos in: /etc/apt/sources.list.d/google-chrome-beta.list
    1: deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main
  No active apt repos in: /etc/apt/sources.list.d/google-chrome.list
  Active apt repos in: /etc/apt/sources.list.d/megasync.list
    1: deb [signed-by=/usr/share/keyrings/meganz-archive-keyring.gpg] https://mega.nz/linux/repo/xUbuntu_22.04/ ./
  Active apt repos in: /etc/apt/sources.list.d/official-dbgsym-repositories.list
    1: deb http://ddebs.ubuntu.com jammy main restricted universe multiverse
    2: deb http://ddebs.ubuntu.com jammy-updates main restricted universe multiverse
  Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list
    1: deb http://packages.linuxmint.com vanessa main upstream import backport
    2: deb http://archive.ubuntu.com/ubuntu jammy main restricted universe multiverse
    3: deb http://archive.ubuntu.com/ubuntu jammy-updates main restricted universe multiverse
    4: deb http://archive.ubuntu.com/ubuntu jammy-backports main restricted universe multiverse
    5: deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse
  Active apt repos in: /etc/apt/sources.list.d/official-source-repositories.list
    1: deb-src http://packages.linuxmint.com vanessa main upstream import backport
    2: deb-src http://archive.ubuntu.com/ubuntu jammy main restricted universe multiverse
    3: deb-src http://archive.ubuntu.com/ubuntu jammy-updates main restricted universe multiverse
    4: deb-src http://archive.ubuntu.com/ubuntu jammy-backports main restricted universe multiverse
    5: deb-src http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse
Info:
  Processes: 344 Uptime: 1d 13h 50m Init: systemd runlevel: 5 Compilers:
  gcc: 11.2.0 Shell: Bash v: 5.1.16 inxi: 3.3.13
rmbjr60
Level 1
Posts: 19
Joined: Sun Feb 18, 2018 3:33 pm

Re: system load average pegged at 1.3, but system is mostly idle

Post by rmbjr60 »

I have new info to report.

First of all, I can report with 100% confidence that the problem commences when my system comes out of suspend mode. I can reproduce this every time.

Second, it seems to be an occurrence of the following bug...

https://bugzilla.kernel.org/show_bug.cgi?id=177311

.. which is fixed in kernel 5.16. Since I am running 5.15, it makes sense that I'm seeing the problem. The only reason my situation might not be an exact match for that bug is that the suggested workaround does not completely resolve my situation. Nevertheless my symptoms mirror that bug very closely, and I'm hopeful an upgrade to 5.16 will resolve the issue.

The workaround, btw, was to add this to my kernel boot command: "i2c-i801.disable_features=0x10". A nice description of how to do this is here: https://sleeplessbeastie.eu/2022/04/17/ ... upt-storm/.
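
On Ubuntu-based systems such as Mint, kernel parameters are usually added to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, followed by "sudo update-grub" and a reboot. The linked article may describe the steps differently; the helper below (add_kernel_param, a made-up name) is just a sketch of that edit, assumes GNU sed, and is best tried on a copy of the file first.

```shell
# add_kernel_param FILE PARAM
# Appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT="..." line of FILE unless it
# is already there. FILE would normally be (a copy of) /etc/default/grub.
# PARAM must not contain the characters | & or \ (they are special to sed).
add_kernel_param() {
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$1" | grep -qF -- "$2"; then
        return 0                                   # already present
    fi
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $2\"|" "$1"
}

# Usage (not run here; /tmp/grub.new is a placeholder):
#   cp /etc/default/grub /tmp/grub.new
#   add_kernel_param /tmp/grub.new 'i2c-i801.disable_features=0x10'
#   sudo cp /tmp/grub.new /etc/default/grub && sudo update-grub
# After rebooting, confirm the parameter is active:
#   cat /proc/cmdline
```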

I confirmed the interrupts for this module are indeed disabled ...

Code: Select all

dmesg | grep -Ei "i2c|i801|smbus"
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-48-generic root=UUID=10e1f970-a4e4-434b-8924-3c19c41ae2d5 ro i2c-i801.disable_features=0x10 quiet splash
[    0.095051] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-48-generic root=UUID=10e1f970-a4e4-434b-8924-3c19c41ae2d5 ro i2c-i801.disable_features=0x10 quiet splash
[    1.380303] i801_smbus 0000:00:1f.4: Interrupt disabled by user
[    1.380584] i801_smbus 0000:00:1f.4: SPD Write Disable is set
[    1.380603] i801_smbus 0000:00:1f.4: SMBus using polling
[    1.797941] i2c i2c-0: 2/2 memory slots populated (from DMI)
[    1.798593] i2c i2c-0: Successfully instantiated SPD at 0x50
...
.. but I can reproduce the interrupt storm simply by coming out of suspend mode.
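
One way to put numbers on the before/after comparison is to read the system-wide interrupt total from /proc/stat on either side of a suspend. A minimal sketch; the function names intr_total and intr_rate are invented here, and it relies on the first number of the "intr" line being the total since boot:

```shell
# intr_total [FILE]
# Prints the total number of interrupts serviced since boot: the first number
# on the "intr" line of /proc/stat (FILE defaults to /proc/stat).
intr_total() {
    awk '$1 == "intr" { print $2; exit }' "${1:-/proc/stat}"
}

# intr_rate SECONDS
# Average interrupts per second over the next SECONDS seconds.
intr_rate() {
    start=$(intr_total)
    sleep "$1"
    end=$(intr_total)
    echo $(( (end - start) / $1 ))
}

# Usage (not run here):
#   intr_rate 5          # baseline before suspending
#   systemctl suspend    # then wake the machine
#   intr_rate 5          # compare against the baseline
```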

As a secondary, possibly related issue, I set my power manager settings to *NOT* go to suspend mode when the lid of my laptop is closed. Nevertheless, it always enters suspend mode when I close the lid.
Petermint
Level 8
Posts: 2474
Joined: Tue Feb 16, 2016 3:12 am

Re: system load average pegged at 1.3, but system is mostly idle

Post by Petermint »

A few distros skipped 5.16 and went to 5.18 as it handles intel gen 11 and other new hardware. MX Linux has a special edition with Xfce and kernel 5.18. I do not know of a Cinnamon 5.18. Certainly worth testing.