LM17.3 Cinnamon - Full System crash !!

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Thu Jan 21, 2016 5:13 pm

Hi guys, this is my first post here! I'm desperate!!

I'm an electrical & computer engineer with adequate knowledge of hardware and software. I've been using Linux with great love for the last 5 years and I believe I'm at an intermediate level. (Just a short background so you know at which level to guide me.)

My problem is that randomly, most often when a video is playing on YouTube or during a Skype video call, my computer hangs completely, looping the last sound sample left in the sound buffer. No mouse, keyboard or other response! This is completely random. One day it may not hang at all; another day it hangs multiple times!

During these hangs the magic SysRq rescue sequence REISUO rarely works; usually nothing responds and I have to do a hard reset.
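
Side note on REISUO: which keys of the sequence the kernel honours depends on the kernel.sysrq setting, readable from /proc/sys/kernel/sysrq. Below is a minimal Python sketch that decodes that value, assuming the bitmask meanings given in the kernel's SysRq documentation. The function names are made up for illustration, and 176 is only an example value (it is believed to be the default on Ubuntu-based systems, but check your own machine).

```python
from pathlib import Path

# Bit required for each key of the REISUB/REISUO sequence, as listed in the
# kernel's SysRq documentation (assumption: these values apply to your kernel).
REQUIRED_BIT = {
    "R": 4,    # unraw: take the keyboard back from X
    "E": 64,   # send SIGTERM to all processes
    "I": 64,   # send SIGKILL to all processes
    "S": 16,   # sync filesystems
    "U": 32,   # remount filesystems read-only
    "B": 128,  # reboot
    "O": 128,  # power off
}


def allowed_sysrq_keys(mask, sequence="REISUO"):
    """Map each key in `sequence` to True if the sysrq `mask` permits it."""
    if mask == 1:  # the special value 1 means "all functions enabled"
        return {key: True for key in sequence}
    # mask == 0 disables everything; the bit test covers that case too.
    return {key: bool(mask & REQUIRED_BIT[key]) for key in sequence}


def read_sysrq_mask(path="/proc/sys/kernel/sysrq"):
    """Read the current setting (Linux only)."""
    return int(Path(path).read_text().strip())
```

Calling `allowed_sysrq_keys(read_sysrq_mask())` on the affected machine shows which keys can work at all. With a value of 176 (128 + 32 + 16), only S, U and O would be honoured, so R, E and I doing nothing would not by itself prove that the kernel is dead.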

I have tried the following to find the problem, without success. I have run out of ideas, and believe me, I tried a lot of things:

-- Changed the graphics card: I had an old ATI X1950 PRO which was only supported by the open source driver. The problem was there too. With great hope I replaced it with an Nvidia GTS 450, which has a current proprietary driver, but the problem remains!!
-- Ran memtest on the RAM for more than 12 hours and found no problems.
-- Connected an oscilloscope to the internal voltages of the motherboard to look for hardware issues, but found no problem. I monitored the 3.3V, 5V and 12V rails and everything was fine, no voltage drops or anything, so I believe it is not the power supply!
-- Used sensors, top, the system monitor and other programs to check what is happening at the moment of the crash, but didn't see any strange behavior or unexpected loads, voltages or temperatures.
-- Tried with the compositor ON and OFF and the problem remains in both cases!
-- Tried several kernels and the problem remains!
-- Used the Phoronix Test Suite to stress the system but didn't get a crash.
-- The log files don't show any problem; probably the system hangs before anything is reported.
-- Searched all the forums but couldn't find anything similar. That's why I created a new thread.
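
Since the logs show nothing at the moment of the hang, one option is a small heartbeat logger that writes a timestamp and the load average to disk every few seconds and forces each line out with fsync. After a hard reset, the last line in the file brackets the time of the hang. This is only a sketch: the file name `heartbeat.log`, the interval and the function names are assumptions, and `os.getloadavg()` is available on Unix-like systems only.

```python
import os
import time


def heartbeat_line(now, load):
    """Format one log line: local timestamp plus 1/5/15-minute load averages."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "{} load={:.2f},{:.2f},{:.2f}\n".format(stamp, *load)


def run_heartbeat(path="heartbeat.log", interval=5.0, beats=0):
    """Append one line every `interval` seconds; beats=0 means run forever."""
    count = 0
    with open(path, "a") as log:
        while beats == 0 or count < beats:
            log.write(heartbeat_line(time.time(), os.getloadavg()))
            log.flush()             # push Python's buffer to the kernel
            os.fsync(log.fileno())  # ask the kernel to write it to disk now
            count += 1
            if beats == 0 or count < beats:
                time.sleep(interval)
```

Running `run_heartbeat()` in a terminal before starting a video would leave a trail like `2016-01-21 17:13:05 load=0.42,0.37,0.30` (illustrative values); the same loop could be extended to record temperatures.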


I have no clue what could be wrong, thus I need your help! This is my main development computer and I cannot afford losing work randomly!
I still cannot tell whether it is a hardware or a software issue. I think that in XP on the same machine I never had a hang, but I boot it so rarely that I cannot verify this anymore!

My suspicions are:
-- Either a motherboard or a RAM issue, although memtest didn't show any problem.
-- A software issue related to videos and maybe to Firefox. I do recall the system hanging even without Firefox running, though. When I work on the computer I usually have Firefox open, so it's difficult for me to isolate the problem.

Things to try more:
-- The system has 800MHz RAM. Maybe I should try running for some days with the RAM at 667MHz.
-- Reverting the BIOS to a previous version. I cannot recall whether the problem started after a BIOS update, although a couple of years ago I didn't have this problem, probably before a BIOS update. Could the BIOS be responsible here?

Any suggestions are welcome! Thank you all in advance!
Last edited by cptX on Fri Jan 22, 2016 8:32 am, edited 3 times in total.

Laurent85
Level 15
Posts: 5753
Joined: Tue May 26, 2015 10:11 am

Post by Laurent85 » Thu Jan 21, 2016 5:28 pm

Did you try running from the live USB flash drive for several hours? Does the live system also crash?

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Thu Jan 21, 2016 6:08 pm

It's a good idea! I thought of it but haven't done it yet. It can take a long time until it happens (actually it may happen soon or take ages...), so I left this option for last. Of course if nothing else works I'll try this too. For now I have switched the FSB to 667MHz and am waiting.

Cosmo.
Level 23
Posts: 17824
Joined: Sat Dec 06, 2014 7:34 am

Post by Cosmo. » Thu Jan 21, 2016 6:36 pm

Is this a fresh installation of LM 17.3, or was there an older Mint version before, which you upgraded? If so, how did you upgrade? And if so, did the problem exist in the previous Mint version?

I ask this because LM 17.3 has a new graphics stack, but the graphics stack does not get updated if the system is upgraded with the update manager.

If the graphics stack is the culprit, I would assume that the problem also appears in the live system of 17.3, but not in the live systems of LM 17 to 17.2 (the graphics stack is identical in these 3 versions).

Start any video before you go to bed and let it run overnight.

austin.texas
Level 20
Posts: 12054
Joined: Tue Nov 17, 2009 3:57 pm
Location: at /home

Post by austin.texas » Thu Jan 21, 2016 7:58 pm

The first things I would suspect would be CPU temperature, or a failing power supply.

daveinuk
Level 7
Posts: 1545
Joined: Tue Mar 23, 2010 7:52 pm
Location: Manchester, England.

Post by daveinuk » Thu Jan 21, 2016 8:44 pm

Welcome aboard, it may help if you post some info about your machine's specs so others can help.

Open a terminal from the menu, paste in the code below and press enter, then post back the output in a reply, wrapped in the 'code' tags in your reply.

Code:

inxi -Fxz

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Fri Jan 22, 2016 4:46 am

@ Cosmo: I think I have upgraded sequentially from Linux Mint 17.1 to 17.2 and finally to 17.3. In all these versions I had crashes.

@ austin.texas: I used some monitoring programs (sensors, top, system monitor etc.) to check the voltages and the temperatures of both the CPU and the graphics card just before the moment of the crash and didn't see any problems. I also used an oscilloscope to check the power supply and again didn't see any problems!

Cosmo.
Level 23
Posts: 17824
Joined: Sat Dec 06, 2014 7:34 am

Post by Cosmo. » Fri Jan 22, 2016 6:21 am

cptX wrote: @ Cosmo: I think I have upgraded sequentially from Linux Mint 17.1 to 17.2 and finally to 17.3. In all these versions I had crashes.
That leaves Laurent85's advice.

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Fri Jan 22, 2016 8:35 am

Can somebody tell me whether a BIOS version can be responsible for random hangs? In my opinion probably not, but maybe somebody knows better...

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Fri Jan 22, 2016 8:40 am

This guy seems to have the same problem, but he doesn't provide more details:
http://forums.linuxmint.com/viewtopic.p ... 8&t=214475

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Fri Jan 22, 2016 8:04 pm

I believe my problems come from the RAM and I'm investigating in this direction. I have started experimenting with RAM voltage and timings in the BIOS. I'll report back as soon as my system proves stable or unstable. Meanwhile, other suggestions are welcome!

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Sun Jan 24, 2016 6:07 am

My motherboard is an Asus M2N-SLI Deluxe which is overclocking capable, thus the BIOS allows many configurations for RAM timings and voltage.

Initially I had 2x 1GB in dual-channel configuration. A couple of years ago I added an extra 2x 2GB from a different manufacturer, again in dual-channel mode. I cannot recall whether the crashes started before or after this addition, but I have now started researching in this direction.

I booted into Windows. It is 32-bit XP and recognizes only 3GB, which may be why I had no crashes in Windows. In Windows I used the CPU-Z application to check the nominal timings of the RAM. I was surprised to see that the two sets have different nominal timings.
The original RAM is specified for 5-5-5-18 at 400MHz and 1.8V, and the added RAM for 6-6-6-18 at 400MHz and 1.8V.
I had set the RAM speed in the BIOS manually to 800MHz but left all the other voltage and timing settings on auto. The only thing I know is that the BIOS automatically selected the timings 5-5-5-15. Which voltage it selected is not visible!
At first I changed the timings to 6-6-6-18 and the voltage to 1.8V, exactly what the manufacturer specifies for the added modules, but I had a crash again, and I would say sooner than before.
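
To compare these ratings it helps to convert the timings from clock cycles to nanoseconds. A short sketch, assuming the CPU-Z figures above refer to a 400MHz memory clock (DDR2-800) and that the four numbers are CL-tRCD-tRP-tRAS in the usual order; the function and variable names are illustrative only.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Convert a timing given in clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


CLOCK_MHZ = 400.0  # memory clock as reported in the post (DDR2-800)

# (CL, tRCD, tRP, tRAS) in clock cycles
profiles = {
    "original 2x 1GB rating": (5, 5, 5, 18),
    "added 2x 2GB rating": (6, 6, 6, 18),
    "BIOS auto": (5, 5, 5, 15),
}

for name, timings in profiles.items():
    in_ns = [cycles_to_ns(c, CLOCK_MHZ) for c in timings]
    print(name, in_ns)
```

At 400MHz one cycle is 2.5 ns, so the BIOS auto setting (5-5-5-15) runs CL at 12.5 ns and tRAS at 37.5 ns, while the added modules are rated for 15 ns and 45 ns. In other words, auto is tighter than what the added modules are specified for.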

Then I increased the voltage to 1.95V and set the timings back to auto (the BIOS reports 5-5-5-15), and since then I have had no crash!!! To prove that it is stable I have to wait a few more days, but 2 days ago I did a burn-in test with YouTube videos, camera video in Cheese, Skype and Firefox all running at the same time. The CPU was at its limit for 2 hours and the system was rock solid.

So probably, when adding extra RAM modules, the RAM voltage needs to be raised a bit to compensate for the extra consumption. Unfortunately, I still don't know which voltage the BIOS had selected automatically before, but 1.95V now seems stable.
I will continue using the PC over the next days to prove the stability before we declare it solved!

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Fri Jan 29, 2016 7:13 pm

Exactly one week later, under very high stress, I got another crash today. For the whole week the system was rock solid.

Today the CPU was at its maximum for a couple of hours with Nemo trying to copy around half a million files from an NTFS external hard disk to an EXT4 internal disk. The system just hung again during the file transfer and did not respond to REISUO. Until today I thought I had solved the problem. I don't know if the cause was different this time.

Is it possible for a software issue to bring the computer to a state where it will not respond even to REISUO?

I have to continue testing until the system proves stable.

(By the way, Nemo is completely useless when trying to open a directory with more than 100,000 files, and worst of all the whole Cinnamon desktop hangs!!!)

LinuxJim
Level 5
Posts: 659
Joined: Tue Jan 26, 2016 8:01 pm
Location: Oregon, USA

Post by LinuxJim » Sat Jan 30, 2016 6:40 pm

cptX wrote: Is it possible for a software issue to bring the computer to a state where it will not respond even to REISUO?
If the kernel has gone to la-la land, then REISUO (or REISUB, or any of the other SysRq keys) will not work.
Be sure to wait a few seconds between each key. Going too fast will not work correctly either.
cptX wrote: (By the way, Nemo is completely useless when trying to open a directory with more than 100,000 files, and worst of all the whole Cinnamon desktop hangs!!!)
Who puts 100,000 files in a single directory??? Every GUI system I have ever seen (including Windows and Mac OS X) will slow to a crawl or crash with more than a couple of thousand, and that is still way too many...

ralplpcr
Level 5
Posts: 643
Joined: Tue Jul 28, 2015 10:11 am

Post by ralplpcr » Sat Jan 30, 2016 7:03 pm

Alright, this may seem a bit out there, but have you checked the capacitors on your board? According to a quick Google search, that board was released mid-2006... which happens to coincide with an industry-wide problem with a couple of bad Chinese capacitor vendors. In a nutshell, somebody stole the electrolyte formula for a capacitor and sold it to several companies who supply capacitors to all the big motherboard manufacturing companies. Problem was that the formula they stole wasn't complete, and it led to early failure of these capacitors and massive warranty issues for all the big vendors.

If it is a capacitor problem, it's only going to get worse. These boards have been known to work without a hitch for quite some time, but when the electrolyte inside the capacitors starts to break down, errors/freeze-ups/sudden crashes will start to occur. It may start slowly, but will increase in frequency until eventually you can't boot up at all.

You may have to remove cards, heatsinks, etc... but I suspect that with a board from that time period, it's quite a good possibility that this is what you may be experiencing. The tops of all your board's capacitors should be flat, with no swelling on the seals. Check for a "bulge" in the top of any capacitors on your board. It's not always obvious, like if fluid were leaking...but if you do see any that are bulging they should be replaced. (Or, get a newer motherboard... but it all depends on how much you wish to spend)

Edit: some reference, with pictures of what I'm describing: https://en.wikipedia.org/wiki/Capacitor_plague

LinuxJim
Level 5
Posts: 659
Joined: Tue Jan 26, 2016 8:01 pm
Location: Oregon, USA

Post by LinuxJim » Sat Jan 30, 2016 7:52 pm

ralplpcr wrote:Alright, this may seem a bit out there, but have you checked the capacitors on your board?
That is a good point. Nearly all electronics (computers, routers, TVs, anything) made between ~2004 and ~2008 have a max projected lifetime of 2 years because of that darned capacitor fiasco. I have a 2006 iMac that lasted 3 1/2 years before I had to replace the capacitors, which is well beyond the norm. It went up in a puff of smoke about 6 years ago, but it still serves me as a file server today...

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Tue Feb 02, 2016 10:05 am

@LinuxJim:
I was using REISUO, pressing the keys slowly...

Regarding the files, it's not uncommon at all to have that many thousands of files. Windows, as far as I know, had no problem handling them. Bash could handle them really easily too. Nemo had the problem because by default it tries to read information from every file, calculate sizes etc. So I think Nemo should check in some clever way how many files there are and stop preprocessing them if there are too many.
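
The difference between just listing names and inspecting every file can be sketched as follows. `os.scandir` yields directory entries lazily and does not call stat on each of them, so a program can count first and skip the expensive per-file work once some threshold is exceeded. The threshold of 10000 and the function names are assumptions for illustration; this is not a description of how Nemo works internally.

```python
import os
from itertools import islice


def count_entries(path, limit):
    """Count directory entries, stopping early once `limit` is reached."""
    with os.scandir(path) as entries:
        return sum(1 for _ in islice(entries, limit))


def should_skip_details(path, threshold=10000):
    """Return True if the directory holds more than `threshold` entries."""
    # Reading one entry past the threshold is enough to decide.
    return count_entries(path, threshold + 1) > threshold
```

Because the count stops at `threshold + 1`, the check stays cheap even for a directory with half a million files.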

Most importantly, whenever I killed Nemo (because it was not responding), all the icons on the Cinnamon desktop disappeared. I don't get the relationship between Cinnamon and Nemo. Why are they tied together?

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Tue Feb 02, 2016 10:10 am

@ ralplpcr:
The idea of problematic capacitors is a good point. All electrolytic capacitors lose capacitance with age. The applied voltage, the hours under power, the temperature and other parameters can age them faster. So it's a good idea, but because I cannot measure them without desoldering them, I have to continue focusing on the RAM voltage parameters.
It's good to know, though, that a software issue could also trigger a complete hang, and this time my system was really under stress.

cptX
Level 1
Posts: 21
Joined: Thu Jan 21, 2016 4:37 pm

Post by cptX » Fri Feb 05, 2016 3:29 pm

I just had another crash! This time the system restarted on its own while I was watching a video on a web page.

I checked dmesg and found this:
[ 14.444689] ACPI Warning: SystemIO range 0x0000000000001c40-0x0000000000001c7f conflicts with OpRegion 0x0000000000001c40-0x0000000000001c45 (\_SB_.PCI0.SM01) (20141107/utaddress-258)
[ 14.445904] ACPI Warning: SystemIO range 0x0000000000001c40-0x0000000000001c7f conflicts with OpRegion 0x0000000000001c40-0x0000000000001c45 (\_SB_.PCI0.SM00) (20141107/utaddress-258)
[ 14.445910] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

And in syslog I found this:
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Pm1aEventBlock: 32/8 (20141107/tbfadt-623)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Pm1aControlBlock: 16/8 (20141107/tbfadt-623)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/PmTimerBlock: 32/8 (20141107/tbfadt-623)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 64/8 (20141107/tbfadt-623)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe1Block: 128/8 (20141107/tbfadt-623)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): Invalid length for FADT/Pm1aEventBlock: 8, using default 32 (20141107/tbfadt-704)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): Invalid length for FADT/Pm1aControlBlock: 8, using default 16 (20141107/tbfadt-704)
Feb 5 20:22:58 athlon kernel: [ 0.000000] ACPI BIOS Warning (bug): Invalid length for FADT/PmTimerBlock: 8, using default 32 (20141107/tbfadt-704)

I don't know whether these are related to the problem, though!
Any clue?
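
To see at a glance which ACPI complaints recur, the log lines can be grouped and counted. A minimal sketch that strips the optional syslog prefix and the kernel timestamp and counts distinct messages starting with `ACPI`; the regular expression is written against the lines quoted above and may need adjusting for other log formats, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the kernel timestamp such as "[ 14.444689] " and captures the rest.
KERNEL_LINE = re.compile(r"\[\s*\d+\.\d+\]\s+(?P<msg>.*)$")


def acpi_messages(lines):
    """Count distinct kernel messages that start with 'ACPI'."""
    counts = Counter()
    for line in lines:
        match = KERNEL_LINE.search(line)
        if match and match.group("msg").startswith("ACPI"):
            counts[match.group("msg")] += 1
    return counts
```

For example, `acpi_messages(open("/var/log/syslog"))` would show how often each warning appears, assuming that is where the kernel messages end up on the system. Note that the timestamps in the excerpts above (0.000000 and about 14 seconds) place all of these messages during boot, not at the moment of the crash.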

ralplpcr
Level 5
Posts: 643
Joined: Tue Jul 28, 2015 10:11 am

Post by ralplpcr » Fri Feb 05, 2016 8:14 pm

That looks to me like hardware failures on your board - possibly related to power regulation. I still stick by my suggestion to check your capacitors. You don't need to de-solder them - - just a visual check to see if they're bulging ought to tell you if they're affected by the "plague" linked in my earlier post.

Have you tried booting into a live media session (USB or DVD) to see if you can replicate the failure? If so, then at least you know it isn't your software.
