Consistently repeating file system corruption

rambo919
Level 5
Posts: 620
Joined: Wed May 22, 2013 3:11 pm

Post by rambo919 »

Not sure if this is the correct section, mods can move it if they want to.

I have been having this bug for years now and it's part of why I don't use LM as my main OS. I thought I had traced the problem to ESET AV but it has not been installed since the upgrade to LM21.

It seems somehow linked to synaptic/apt because I can't remember it ever happening without having used either, especially if a package being updated is currently running.

Everything is going fine and then suddenly something goes wrong and the drive is remounted read-only. A lot of the time this leads to a total freeze as the system keeps trying to write to files, though I think there can be a bit of a delay before the freeze. The only way to fix this is to boot from a live USB and run fsck on both the / and /home partitions.

Could this perhaps have something to do with the swap file? I have heard they can randomly become corrupt. Should I switch to a swap partition instead?

I also thought this was perhaps due to HDD failure but this is the second drive the bug has occurred on.

I also have LM on another machine but it has never had this problem. I have noticed it sometimes running fsck automatically on startup, though. Is this normal or a bug?
Cosmo.
Level 24
Posts: 22055
Joined: Sat Dec 06, 2014 7:34 am

Post by Cosmo. »

rambo919 wrote:
Sat Sep 24, 2022 1:19 pm
I also have LM on another machine but it has never had this problem
With this you contradict all your speculations about the source of the problem on the first computer.
BTW: Which version of LM on that machine?
And which desktop on both machines?

How did you upgrade to LM 21?
Do I understand correctly, that the problem was identical on LM 20.3 and LM 21?

How do you shut down the system? Do you hold the power button?
Gotcha!

Post by rambo919 »

Both are LM21 upgraded from LM20.3 using the upgrade tool, Cinnamon.

The problem has been persisting since LM20.0

Due to the system going into a tailspin shutting down via power button was the only option left.

The first time was with a second-hand HDD that ended up failing; now it's with a perfectly fine SSD that, as I just noticed, the BIOS does not even detect anymore.

Either I have terrible luck with storage devices or something is breaking them.

Post by Cosmo. »

rambo919 wrote:
Sat Sep 24, 2022 2:25 pm
Due to the system going into a tailspin shutting down via power button was the only option left.
That is what I assumed. With this method you damage the file system, and for this reason fsck starts repeatedly. But those checks do not change the fact that your file system is damaged. The only way to correct this is to format the drives (all of them), and this of course means a new install.

Besides that: Whenever you come into the "tailspin":
Press and hold the Alt and the Print (Print Screen/SysRq) keys together and then type, one after the other, S, U and B, with a short pause between U and B. The computer will reboot.
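
The S, U, B sequence only works if the kernel's Magic SysRq setting permits those functions. Here is a minimal sketch for checking this in advance, assuming the bitmask meanings documented for /proc/sys/kernel/sysrq (1 enables everything, 16 = sync, 32 = remount read-only, 128 = reboot/poweroff). The helper name sysrq_allows is made up for illustration.

```shell
#!/bin/sh
# sysrq_allows VALUE BIT
# Succeeds when the kernel.sysrq setting VALUE permits the function
# whose bitmask value is BIT. VALUE 1 means "all functions enabled",
# 0 means "disabled"; anything else is treated as a bitmask.
sysrq_allows() {
    value=$1
    bit=$2
    [ "$value" -eq 1 ] && return 0
    [ $(( value & bit )) -ne 0 ]
}

# Report on the three keys used above, reading the live setting if present.
if [ -r /proc/sys/kernel/sysrq ]; then
    current=$(cat /proc/sys/kernel/sysrq)
    for pair in "S:16" "U:32" "B:128"; do
        key=${pair%%:*}
        bit=${pair##*:}
        if sysrq_allows "$current" "$bit"; then
            echo "Alt+SysRq+$key is enabled"
        else
            echo "Alt+SysRq+$key is disabled (kernel.sysrq=$current)"
        fi
    done
fi
```

Ubuntu-based systems commonly ship a value of 176 (128 + 32 + 16), which would cover all three keys, but it is worth checking on the machine in question rather than assuming.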

Post by rambo919 »

Cosmo. wrote:
Sat Sep 24, 2022 2:53 pm
That is what I assumed. With this method you damage the file system, and for this reason fsck starts repeatedly. But those checks do not change the fact that your file system is damaged. The only way to correct this is to format the drives (all of them), and this of course means a new install.
No, on the affected machine it never auto-starts checks; that happens on the seemingly fine machine, which never needs a hard shutdown.

Many times I have been able to soft reboot when the filesystem goes into read only mode but not always. This last time seems to have broken the SSD too..... it's as if the universe is saying "DOOONT UUUUSE LINUUUX OOOOH" because drives on the affected machine running it keep failing.
Besides that: Whenever you come into the "tailspin":
Press and hold the Alt and the Print (Print Screen/SysRq) keys together and then type, one after the other, S, U and B, with a short pause between U and B. The computer will reboot.
That is something I am going to have to print out to remember.
karlchen
Level 22
Posts: 16631
Joined: Sat Dec 31, 2011 7:21 am
Location: Germany

Post by karlchen »

Hello, rambo919.

If Linux Mint consistently has to complain about filesystem corruption, even across fresh Mint installations, this strongly indicates a hardware-related root cause.

The most common suspect would be an old harddisk or an old SSD, which is simply dying.
You ruled out the old harddisk as the root cause, because at some point in time it has been replaced by a new SSD.

A faulty data cable between SSD and motherboard controller could be another reason for recurring file system corruption.

Yet another reason could be either faulty or poorly seated memory modules (RAM).
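
For the disk and cable suspects, SMART data is a quick first check. This is a rough sketch, assuming smartmontools is installed and that the drive prints the usual ATA attribute table from smartctl -A, where the raw value is the 10th column. The helper name smart_raw is hypothetical, /dev/sdX is a placeholder, and attribute names vary by vendor.

```shell
#!/bin/sh
# smart_raw ATTRIBUTE_NAME
# Reads `smartctl -A` output on stdin and prints the raw value (10th
# column) of the named attribute, or nothing if it is not listed.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Example (needs root and a real device; /dev/sdX is a placeholder):
#   sudo smartctl -H /dev/sdX
#   sudo smartctl -A /dev/sdX | smart_raw Reallocated_Sector_Ct
#   sudo smartctl -A /dev/sdX | smart_raw UDMA_CRC_Error_Count
```

A growing reallocated sector count tends to point at the drive itself, while a growing CRC error count is more often associated with the cable or connector. Neither is conclusive on its own.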

Apart from checking the mentioned hardware components, you should also carefully check the system journal. The terminal command journalctl -b shows the system journal written during the current system session.
The journal is quite likely to reveal recurring hardware-related problems.
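
Since the full journal is long, it may help to narrow it down to kernel messages and filter for lines that usually accompany storage trouble. This is a minimal sketch; the helper name fs_error_lines and the pattern list are assumptions for illustration, not an exhaustive set of error strings.

```shell
#!/bin/sh
# fs_error_lines
# Filters log text on stdin down to lines that typically accompany
# storage trouble: ext4 errors, block-layer I/O errors, a remount to
# read-only, or failed ATA commands. The pattern list is an assumption
# and is not exhaustive.
fs_error_lines() {
    grep -iE 'ext4-fs error|i/o error|remounting filesystem read-only|ata[0-9]+(\.[0-9]+)?: (failed command|exception)'
}

# Example: kernel messages (-k) from the current boot.
#   journalctl -b -k --no-pager | fs_error_lines
# Previous boot, if the journal is kept across reboots:
#   journalctl -b -1 -k --no-pager | fs_error_lines
```

An empty result does not prove the hardware is fine; it only means none of these particular strings were logged in that boot.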

Last thing:
You forgot to share your Linux Mint System Information with us. (Steps how to do so here: viewtopic.php?f=90&t=318644)
Regards,
Karl
The people of Alderaan keep on bravely fighting back the clone warriors sent out by the unscrupulous Sith Lord Palpatine.
The Prophet's Song

Post by Cosmo. »

rambo919 wrote:
Sat Sep 24, 2022 3:02 pm
No, on the affected machine it never auto-starts checks
That doesn't change the fact that, seen from a distance, your file system appears to be damaged. As you have had the problem for years and upgraded Mint in place, the underlying problem still exists.

Post by rambo919 »

karlchen wrote:
Sat Sep 24, 2022 3:09 pm
You ruled out the old harddisk as the root cause, because at some point in time it has been replaced by a new SSD.
The same SSD had been used before as a data SSD in Windows but never had more than 100GB in writes. Never gave any problems before.

I dunno, it's just too strange that two drives using two separate cables decided to fail on Linux while the other 3 connected to Windows keep on going. The only thing I can do, it seems, is change nothing else, install LM on one of the unaffected drives and see what happens.

EDIT: Lemme elaborate, connected to the machine is:
1x Win10 SSD
2x data NTFS HDD
1x LM21 SSD (if it ever connects again)

The unaffected machine, which for some reason keeps doing fsck checks upon boot, only has a LM & MX dual boot HDD. I installed MX this week but the weird fsck thing has been happening for a long time now.

Post by rambo919 »

karlchen wrote:
Sat Sep 24, 2022 3:09 pm
Last thing:
You forgot to share your Linux Mint System Information with us. (Steps how to do so here: viewtopic.php?f=90&t=318644)
Regards,
Karl
Yes, sorry, I can't boot there now so this is from the live USB I have on hand. Will update if I can get it to boot again; the machine seems to be picking the drive up again for now.

System:    Kernel: 5.10.0-17-amd64 [5.10.136-1] x86_64 bits: 64 compiler: gcc v: 10.2.1 
           parameters: quiet splasht nosplash 
           Desktop: KDE Plasma 5.20.5 wm: kwin_x11 vt: 7 dm: SDDM 
           Distro: MX-21.2_KDE_x64 Wildflower 27 August 2022 base: Debian GNU/Linux 11 (bullseye) 
Machine:   Type: Desktop Mobo: Gigabyte model: Q370M D3H GSM PLUS v: x.x serial: <filter> 
           UEFI-[Legacy]: American Megatrends v: F1 MS date: 05/23/2018 
CPU:       Info: 6-Core model: Intel Core i7-8700 bits: 64 type: MT MCP arch: Kaby Lake 
           note: check family: 6 model-id: 9E (158) stepping: A (10) microcode: 84 cache: 
           L2: 12 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 76799 
           Speed: 1666 MHz min/max: 800/4600 MHz Core speeds (MHz): 1: 1666 2: 3187 3: 1812 4: 811 
           5: 3104 6: 3557 7: 1403 8: 2778 9: 4313 10: 3593 11: 3835 12: 3756 
           Vulnerabilities: Type: itlb_multihit status: KVM: VMX disabled 
           Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable 
           Type: mds status: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable 
           Type: meltdown mitigation: PTI 
           Type: mmio_stale_data 
           status: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable 
           Type: retbleed mitigation: IBRS 
           Type: spec_store_bypass status: Vulnerable 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 
           mitigation: IBRS, IBPB: conditional, RSB filling, PBRSB-eIBRS: Not affected 
           Type: srbds status: Vulnerable: No microcode 
           Type: tsx_async_abort 
           status: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable 
Graphics:  Device-1: Intel CoffeeLake-S GT2 [UHD Graphics 630] vendor: Gigabyte driver: i915 
           v: kernel bus-ID: 00:02.0 chip-ID: 8086:3e92 class-ID: 0380 
           Device-2: NVIDIA GP107 [GeForce GTX 1050 Ti] vendor: Gigabyte driver: nouveau v: kernel 
           bus-ID: 01:00.0 chip-ID: 10de:1c82 class-ID: 0300 
           Display: x11 server: X.Org 1.20.14 compositor: kwin_x11 driver: loaded: modesetting 
           unloaded: fbdev,vesa display-ID: :0 screens: 1 
           Screen-1: 0 s-res: 3840x1080 s-dpi: 96 s-size: 1016x285mm (40.0x11.2") 
           s-diag: 1055mm (41.5") 
           Monitor-1: HDMI-4 res: 1920x1080 hz: 60 dpi: 93 size: 527x296mm (20.7x11.7") 
           diag: 604mm (23.8") 
           Monitor-2: HDMI-5 res: 1920x1080 hz: 60 dpi: 102 size: 476x268mm (18.7x10.6") 
           diag: 546mm (21.5") 
           OpenGL: renderer: NV137 v: 4.3 Mesa 22.0.5 direct render: Yes 
Audio:     Device-1: Intel Cannon Lake PCH cAVS vendor: Gigabyte driver: snd_hda_intel v: kernel 
           alternate: snd_soc_skl,snd_sof_pci bus-ID: 00:1f.3 chip-ID: 8086:a348 class-ID: 0403 
           Device-2: NVIDIA GP107GL High Definition Audio vendor: Gigabyte driver: snd_hda_intel 
           v: kernel bus-ID: 01:00.1 chip-ID: 10de:0fb9 class-ID: 0403 
           Sound Server-1: ALSA v: k5.10.0-17-amd64 running: yes 
           Sound Server-2: PulseAudio v: 14.2 running: yes 
Network:   Device-1: Intel Ethernet I219-LM vendor: Gigabyte driver: e1000e v: kernel port: efa0 
           bus-ID: 00:1f.6 chip-ID: 8086:15bb class-ID: 0200 
           IF: eth1 state: down mac: <filter> 
           Device-2: Intel I211 Gigabit Network vendor: Gigabyte driver: igb v: kernel port: 3000 
           bus-ID: 04:00.0 chip-ID: 8086:1539 class-ID: 0200 
           IF: eth0 state: up speed: 100 Mbps duplex: full mac: <filter> 
Drives:    Local Storage: total: 6.61 TiB used: 0 KiB (0.0%) 
           SMART Message: Unable to run smartctl. Root privileges required. 
           ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 1TB size: 931.51 GiB 
           block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD 
           serial: <filter> rev: 2B4QFXO7 temp: 32.9 C scheme: MBR 
           ID-2: /dev/sda maj-min: 8:0 vendor: Seagate model: ST2000DM001-9YN164 size: 1.82 TiB 
           block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 
           serial: <filter> rev: CC4B scheme: MBR 
           ID-3: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 750 EVO 250GB size: 232.89 GiB 
           block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> 
           rev: 1B6Q scheme: MBR 
           ID-4: /dev/sdc maj-min: 8:32 vendor: Toshiba model: HDWE140 size: 3.64 TiB block-size: 
           physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 serial: <filter> 
           rev: FP2A scheme: GPT 
           ID-5: /dev/sdd maj-min: 8:48 type: USB vendor: Kingston model: DataTraveler 3.0 
           size: 14.41 GiB block-size: physical: 512 B logical: 512 B type: N/A serial: <filter> 
           scheme: MBR 
           SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure? 
Partition: Message: No partition data found. 
Swap:      Alert: No swap data was found. 
Sensors:   System Temperatures: cpu: 41.0 C mobo: 27.8 C gpu: nouveau temp: 49.0 C 
           Fan Speeds (RPM): N/A gpu: nouveau fan: 0 
Repos:     Packages: note: see --pkg apt: 2294 lib: 1285 flatpak: 0 
           No active apt repos in: /etc/apt/sources.list 
           Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list 
           1: deb http://deb.debian.org/debian bullseye-updates main contrib non-free
           Active apt repos in: /etc/apt/sources.list.d/debian.list 
           1: deb http://deb.debian.org/debian bullseye main contrib non-free
           2: deb http://security.debian.org/debian-security bullseye-security main contrib non-free
           Active apt repos in: /etc/apt/sources.list.d/mx.list 
           1: deb http://mxrepo.com/mx/repo/ bullseye main non-free
           2: deb http://mxrepo.com/mx/repo/ bullseye ahs
Info:      Processes: 292 Uptime: 0m wakeups: 1 Memory: 31.19 GiB used: 1.21 GiB (3.9%) 
           Init: SysVinit v: 2.96 runlevel: 5 default: 5 tool: systemctl Compilers: gcc: N/A 
           alt: 10 Client: shell wrapper v: 5.1.4-release inxi: 3.3.06 
Boot Mode: BIOS (legacy, CSM, MBR)

Post by karlchen »

Which one of the following 2 disks is the Linux Mint system disk?
/dev/sdb or /dev/sdc?

           ID-3: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 750 EVO 250GB size: 232.89 GiB 
           block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> 
           rev: 1B6Q scheme: MBR 

           ID-4: /dev/sdc maj-min: 8:32 vendor: Toshiba model: HDWE140 size: 3.64 TiB block-size: 
           physical: 4096 B logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 7200 serial: <filter> 
           rev: FP2A scheme: GPT 
What confuses me a bit is that all the other disks use the traditional MBR partition table.
Only /dev/sdc uses the GUID Partition Table (GPT), which requires UEFI, as far as I know.

I am not sure whether mixing disk devices having an MBR/traditional partition table and disk devices having GPT could be related to the recurring file system corruption issues.

Post by rambo919 »

The LM installation is on /dev/sdb.

/dev/sdc is the main data drive. I had to format it as GPT because otherwise I could not allocate the full capacity to a single partition due to the 2TB limit of MBR. It works perfectly fine with MBR OS installations. I never use UEFI installs on anything.

Now the one thing I did differently when installing LM on the SSD, compared to when it was installed on the now dead HDD, is add a Timeshift partition on /dev/sda, but that is unlikely to have any impact on the recurring problem since I did not do it the first time.

Post by rambo919 »

The drive works now and I managed to run a successful fsck on it. The machine being powered off for the night has helped with that.

It does not really properly boot though, will try to get to the journal later today when I have time.

Post by Cosmo. »

To clear this up:
GPT on a non-EFI system is no problem. Although GPT is part of the EFI specification, GPT can also be used on legacy systems. There are isolated cases where GPT on a legacy system gives trouble, but this only affects booting, and only if the system itself is stored on the GPT drive. In the case of data partitions this is no problem at all.

Also a mix of MBR and GPT is no problem (although unusual, as long as not forced by an installed Windows system).
spamegg
Level 9
Posts: 2785
Joined: Mon Oct 28, 2019 2:34 am

Post by spamegg »

5.10.0-17-amd64
If you are on Mint 21, switch to the 5.15 kernel and see if this gets rid of the file system corruption (of course you have to fix the existing damage with fsck first). As far as I'm aware, this consistent file system corruption issue happened to a lot of people with kernels between 5.4 and 5.13, including myself (it was happening every single day on brand new drives). Switching to 5.15 got rid of the issue.

Post by karlchen »

Hello, spamegg.

The kernel 5.10.0-17 is not used by Mint 21, but by MX Linux 21.2.

System:    Kernel: 5.10.0-17-amd64 [5.10.136-1] x86_64 bits: 64 compiler: gcc v: 10.2.1 
           parameters: quiet splasht nosplash 
           Desktop: KDE Plasma 5.20.5 wm: kwin_x11 vt: 7 dm: SDDM 
           Distro: MX-21.2_KDE_x64 Wildflower 27 August 2022 base: Debian GNU/Linux 11 (bullseye) 
Regards,
Karl

Post by rambo919 »

It's no use, Jim. I can mount the partitions and fsck them now from the live USB but I can't make it boot. I was even able to make a backup of my /home folder.

Manually reading the journals is a useless exercise. 8MB, damn that's large; it's just so much, everything crammed into one file. Is there a way of doing it via live USB?

Post by spamegg »

Karlchen,

Ah I see, thanks. Since this is the Mint forum I did not read the system report carefully and assumed it was Mint. My bad.

Anyway, the issue is the same. LMDE users also reported similar things so the Debian kernels have the same issue.

rambo should consider changing to a kernel higher than 5.13 (which still had the issue according to many posts here). I'm not sure how MX Linux handles kernels. Here's the manual for MX 21: https://github.com/MX-Linux/mx-docs/raw/master/mxum.pdf In section "7.6 The kernel" it says to "open MX package installer and click on the 'Kernel' category". Of course I would recommend taking a Timeshift snapshot first, just in case.

Post by rambo919 »

I think you completely misunderstand. I cannot boot into the affected OS so had to get the report from a MX live USB I happen to have ATM. I was running kernel 5.15 since the upgrade to LM21.

On the other hand I have to apologize for the same error, due to how crazy busy I have been I saw but did not properly read what you wrote. So you think the damage might have happened while I was running LM20.3 with kernel 5.10? (or whatever the kernel was I was running, can't remember now)

Post by rambo919 »

Now I have copied the /var/log folder to a separate directory on another PC (let's call it PC2) which actually can boot, but how do I make the log viewer read it?

I don't wanna just copy everything to the /var/log/ of PC2 and none of the log viewers I have seen (quick google) seem to have the option to load external logs.

Alternatively, which file can I manually open to look for such errors? There are a LOT of files in that folder.
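
Two approaches that may be worth trying on a copied log folder: journalctl can read journal files from a directory other than the running system's via --directory (short form -D), and the plain-text kern.log and syslog can simply be searched with grep. The sketch below assumes the copy has the usual Ubuntu/Mint layout. The helper name scan_copied_logs and the path $HOME/pc1-logs are hypothetical, and the pattern list is an assumption.

```shell
#!/bin/sh
# scan_copied_logs DIR
# Searches the plain-text logs in a copied /var/log directory for lines
# that mention file system or disk errors, prefixing each hit with the
# file it came from. Rotated .gz files are skipped here; zgrep could
# handle those. The pattern list is an assumption, not exhaustive.
scan_copied_logs() {
    dir=$1
    for f in "$dir"/kern.log "$dir"/kern.log.1 "$dir"/syslog "$dir"/syslog.1; do
        [ -f "$f" ] || continue
        grep -iHE 'ext4-fs error|i/o error|remounting filesystem read-only' "$f" || true
    done
}

# Example with a hypothetical copy location:
#   scan_copied_logs "$HOME/pc1-logs"
# The binary journal in the same copy can be read without importing it,
# here limited to messages of priority "err" and worse:
#   journalctl --directory="$HOME/pc1-logs/journal" --no-pager -p err
```

Reading the copy in place this way avoids mixing the other machine's logs into PC2's own /var/log.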
mikeflan
Level 12
Posts: 4206
Joined: Sun Apr 26, 2020 9:28 am
Location: Houston, TX

Post by mikeflan »

Partition: Message: No partition data found.
You have 5 drives and partition data cannot be found for any of them :shock:

I suggest you remove sdc and sdd, and try to get sda and sdb working. If that doesn't work, image sdb, remove sda and try a new install on sdb with /home placed on sdb. I suspect that will work. If not, I would focus on the /dev/nvme0n1.

Then maybe another install with /home on another reliable drive is called for.