Periodic Boot Failure
Hi all,
Firstly I apologise in advance for the somewhat vague description which follows...
Long-time Mint user, and I've had the hardware setup listed below for some time without issue. However, I'm now in the situation where at irregular intervals I get a (very long) stack trace upon boot with no obvious 'ah nouveau died' style culprit. The only common reference point I have suspicions about is that I recently (~3 months ago) reinstalled on an older hard drive I had around to give my Win 7 instance its own drive. Prior to the Mint install I ran Spinrite to clear up / test out the old drive to see if I was wasting my time, and all went through smoothly.
Preceding the session I'm typing this from, the previous 5 restarts all resulted in failures (but before that it had been a week or more). Failures occur in many guises: sometimes it hangs with the stack trace, sometimes it reboots itself, sometimes the display just goes black and other times the mint loading logo comes up and never goes any further.
Any ideas welcome, I'm slightly confused as to what can cause irregular errors like this other than bad disk sectors - but I'm not convinced that's the issue. I will of course take a picture of the stack trace next time it happens and add to this post - I'm in the process of backing up currently so that might be another day.
Thanks,
Michael.
Hardware:
MB: GA-790A-D3 w/ F3 BIOS
CPU: AMD Phenom(tm) II X4 965 Processor (fam: 10, model: 04, stepping: 03)
RAM: 8GB (2x4GB) Corsair XMS 1333MHz Dual Channel (in the correct dual channel slots)
GPU: Nvidia GTS450
HDD0: /dev/sda: (WINDOWS 7 ULTIMATE x64)
ATA device, with non-removable media
Model Number: WDC WD2500KS-00MJB0
Serial Number: WD-WCANKH221xxx
Firmware Revision: 02.01C03
HDD1: /dev/sdb: (LINUX MINT 15 x64)
ATA device, with non-removable media
Model Number: MAXTOR STM3500320AS
Serial Number: 9QM14xxx
Firmware Revision: MX15
Transport: Serial
- michaelm
Re: Periodic Boot Failure
If I understand what you are saying, you have a dual-boot Linux/Windows setup, but on two hard disks. Is it possible that when you boot into Windows, Windows' UEFI system (check if you have that enabled) screws up Linux?
Re: Periodic Boot Failure
Hm, interesting idea. I don't think that is the case; however, the point at which I moved to Win 7 also coincided with moving from a dual-boot-single-disk to a dual-boot-dual-disk setup, so I will definitely take a look at that. Also, the Linux disk/boot can go from working to not working regardless of whether I have booted Windows in between, but UEFI is uncharted territory for me, being a long-time Linux user, so I wouldn't rule it out! Thanks for the suggestion.
Edit: Note I should add that the dual-boot-single-disk was XP and mint 14, not win7

- michaelm
Re: Periodic Boot Failure
I just booted win7 -> linux -> win7 -> linux and no issues. Not sure exactly whether that proves anything or not, but thanks for the idea - I think you might be in the right ball-park.
- michaelm
Re: Periodic Boot Failure
I have the same problem, but it has been getting worse lately. The error comes and goes for some reason, and the stack trace message is not always the same. I only have 1 hard drive on the machine and I don't have Windows installed on it. However, I do have the drive set up to multiboot Mint 15 Cinnamon, CentOS 6, and openSUSE 12.3, and Mint 15 is the only OS that gives me this boot error.
Hardware:
AMD Phenom X2 550 processor
ECS NFORCE6M-A2 mother board w/ American Megatrends BIOS
HD0: WDC WD5000AAKS
RAM: 2X2GB Corsair XMS DDR3 dual channel.
NVIDIA 210 graphics card.
Open SUSE controls GRUB.
I also have 2 ATA DVD drives(set master, slave), a floppy drive, and a USB multi card reader on the machine and the BIOS is set to boot from the hard drive last. Maybe that makes a difference?
I also have an Intel machine (Sandy Bridge Pentium on an H61 board) that has only Mint 15 installed and it runs flawlessly. It boots in EFI mode, if that makes a difference.
Thanks.
Re: Periodic Boot Failure
Don't know if it was the same problem as mine but...
Machine: Mobo: ASUSTeK model: M5A78L-M/USB3 version: Rev X.0x Bios: American Megatrends version: 1401 date: 08/28/2012
CPU: Hexa core AMD FX-6100 Six-Core
Dual boot single disk Win 7 & Mint 13
Upgraded to Dual boot single disk Win 7 & Mint 15 and Nvidia 630 2g.
Problems In Mint 15 = would not shut down cleanly, intermittent artefacts on screen, live USB failing to boot getting 1/2 way through boot process , Multiboot USB would not get to grub screen.
Win 7 runs perfectly games play well - usb faults as above.
========================================================
Removed nvidia card - mint runs perfectly
http://goo.gl/DXKgM LinuxMint tutorials.
Running LinuxMint 17.3 Mate. Pepermint 6 & Manjaro 15.12 Capella XFCE
http://goo.gl/WFu0u Installing Mint - the screen cast videos.
linuxcounter #368850
Re: Periodic Boot Failure
z1p101 wrote:I have the same problem but has been getting worse lately. ... Mint 15 is the only OS that gives me this boot error.
Very interesting situation there..
z1p101 wrote: I also have 2 ATA DVD drives(set master, slave), a floppy drive, and a USB multi card reader on the machine and the BIOS is set to boot from the hard drive last. Maybe that makes a difference?
I have a SATA DVD drive also, but I didn't mention that as I imagine it's fairly irrelevant, but mine is set to boot from cd/dvd prior to disk also.
z1p101 wrote: I also have an Intel machine (Sandy Bridge Pentium on an H61 board) that has only Mint 15 installed and it runs flawlessly. It boots in EFI mode, if that makes a difference.
This does seem to be AMD related given the above and usbtux's issue also. I just booted today and received a ~10-15 second cycling stack trace. Which actually came in handy for taking pictures (see attached, apologies for the blur).
There were definite soft and hard cpu lockups involved even if they aren't shown here. These images were in sequence and they 'should' follow on from each other, but when the screen reloads I can't be sure a whole load of stack-trace doesn't fly off the screen.
Last edited by michaelm on Tue Aug 27, 2013 1:43 pm, edited 1 time in total.
- michaelm
Re: Periodic Boot Failure
michaelm,
Open the file /etc/default/grub and change the line from GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" to GRUB_CMDLINE_LINUX_DEFAULT="debug=*" then run
Code: Select all
sudo /usr/sbin/update-grub
It might provide a little more info.
I notice you're running kernel version 3.8.0-19. Try doing a dist-upgrade to get the kernel to 3.8.0-29. Change that file above so the change carries over to the new kernel.
Install smartmontools and run
Code: Select all
sudo smartctl -a /dev/sda
to have a look at the health of the disk.
Also, does this computer lock up after a while? I didn't notice any mention of that in the prior posts. Do you get the same message about cpu#3 when it fails?
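For anyone who would rather script the /etc/default/grub change than edit it by hand, here is a minimal sketch. The function name set_grub_cmdline_default is made up for illustration, and it assumes the file contains a single GRUB_CMDLINE_LINUX_DEFAULT= line. It keeps a .bak copy next to the file, and it is probably safest to try it on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite GRUB_CMDLINE_LINUX_DEFAULT in a grub defaults file.
# $1: path to the file (for example a copy of /etc/default/grub)
# $2: new value, without the surrounding double quotes
set_grub_cmdline_default() {
    file=$1
    value=$2
    # Keep a backup next to the file before changing anything.
    cp "$file" "$file.bak" || return 1
    # Read from the backup and write the edited result back to the file.
    # '|' is the sed delimiter so values containing '/' need no escaping;
    # values containing '|', '&' or '\' are not handled by this sketch.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$value\"|" \
        "$file.bak" > "$file"
}

# Example on a scratch copy (the real file needs root, then update-grub):
#   cp /etc/default/grub /tmp/grub.test
#   set_grub_cmdline_default /tmp/grub.test 'debug=*'
```

The change only takes effect once the menu is regenerated with the update-grub command given above.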




Re: Periodic Boot Failure
Michaelm
I'm just trying to weed out the knowns and unknowns. Last night I looked into it being an nvidia problem because I know that the nvidia kernel module is finicky about being matched up to the correct kernel build and I don't know how the repo handles that. Gave up on that one quick.
I think WharfRat is on to something with the problem being connected to grub. I created a custom grub boot script following the instructions here (http://www.dedoimedo.com/computers/grub-2.html) and this is kinda what it looks like.
Code: Select all
#!/bin/sh -e
echo "Adding Mint to GRUB 2"
cat << EOF
menuentry "Mint 15 partition 6" {
set root='hd0,msdos6'
linux /vmlinuz root=UUID=xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx ro quiet splash
initrd /initrd.img
}
EOF
The grub syntax has changed a little since the tutorial was done. I pulled most of the code out of my grub.cfg file and removed some of the flags like acpi=force and $vt_handoff. I can't guarantee it is fixed, but the system seems to boot faster and with a lot less "static", and with no errors yet.
WharfRat, with me the system either boots or it doesn't. Once running, everything is fine. One other thing: sometimes the last line of my stack trace reads something like "recursive error fixed but reboot is needed". Can't quote it exactly, but that is the basic message.
Thanks.
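A script like the one above only shows up in the menu if it sits in the GRUB script directory (normally /etc/grub.d) and is executable, because the config generator skips non-executable files there. A rough sketch of that install step follows. The function name install_grub_script and the file name 40_mint are assumptions for illustration, not something taken from the tutorial.

```shell
#!/bin/sh
# Hypothetical helper: copy a custom menu script into a grub.d-style
# directory and mark it executable so the config generator will run it.
# $1: script to install
# $2: target directory (normally /etc/grub.d, which needs root)
# $3: file name to use; the numeric prefix controls ordering in the menu
install_grub_script() {
    src=$1
    dir=$2
    name=$3
    cp "$src" "$dir/$name" && chmod 755 "$dir/$name"
}

# Example against a scratch directory rather than the live system:
#   mkdir -p /tmp/grub.d
#   install_grub_script ./mint-entry.sh /tmp/grub.d 40_mint
```

After installing into the real directory, the menu has to be regenerated. On Mint that is update-grub; since openSUSE controls GRUB on this machine, the equivalent there is likely grub2-mkconfig run from openSUSE.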
Re: Periodic Boot Failure
z1p101 wrote: WharfRat, with me the system either boots or it doesn't. Once running, everything is fine. One other thing: sometimes the last line of my stack trace reads something like "recursive error fixed but reboot is needed". Can't quote it exactly, but that is the basic message.
Thanks.
Mine is the same in both the above regards.
I'll do a kernel/grub update and see if that resolves anything. Thanks for the suggestions.
- michaelm
Re: Periodic Boot Failure
WharfRat wrote: michaelm,
Open the file /etc/default/grub and change the line from GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" to GRUB_CMDLINE_LINUX_DEFAULT="debug=*" then run
Code: Select all
sudo /usr/sbin/update-grub
It might provide a little more info.
...
Install smartmontools and run
Code: Select all
sudo smartctl -a /dev/sda
to have a look at the health of the disk
Done both of the above, no disk errors, as I expected. I'll give it a few cycles to test the newer grub before doing the kernel upgrade; that way I'll at least know what fixed it (assuming one of them will).
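When reading the smartctl output for signs of bad sectors, the attributes usually worth a look are the reallocated, pending and uncorrectable sector counts. The sketch below filters those out of the attribute table. The function name smart_sector_summary is made up, and it assumes the usual table layout where the second column is the attribute name and the tenth is RAW_VALUE. Note also that the first post lists the Mint disk as /dev/sdb, so that may be the device worth checking as well.

```shell
#!/bin/sh
# Hypothetical helper: print NAME=RAW_VALUE for the SMART attributes that
# most directly relate to failing sectors. Reads a smartctl attribute
# table on stdin; column 2 is assumed to be the attribute name and
# column 10 the raw value.
smart_sector_summary() {
    awk '
        $2 == "Reallocated_Sector_Ct"  ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" { print $2 "=" $10 }
    '
}

# Example (needs root and smartmontools installed):
#   sudo smartctl -A /dev/sdb | smart_sector_summary
```

Non-zero raw values here would point towards the disk; all zeros makes bad sectors a less likely explanation, though it does not rule out other disk or cabling faults.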

- michaelm
Re: Periodic Boot Failure
Grub update doesn't seem to have worked, see attached for new screenshots - essentially the same as before. I've also added a shot of the successful boot's grub debug output. Hopefully this is of use to Mint developers.
I will try kernel update next and report back when/if it fails again.
- michaelm
Re: Periodic Boot Failure
michaelm,
Those photos are a great help. Your first lockup referenced cpu 2 and the latest referenced cpu 3. Either you actually have a failing cpu or nmi watchdog is acting too quickly in detecting a hardware hang.
If your system runs OK when it does boot, then try disabling watchdog by adding nmi_watchdog=0 to your kernel line.
It appears to have gotten through the initramfs OK, but leave the debug=* there anyway just in case.
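One way to add nmi_watchdog=0 persistently while leaving debug=* in place is to append it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. A minimal sketch follows. The function name append_kernel_param is hypothetical, and it assumes a single double-quoted assignment line and a simple parameter with no '|', '&' or '\' characters in it.

```shell
#!/bin/sh
# Hypothetical helper: append one kernel parameter to
# GRUB_CMDLINE_LINUX_DEFAULT unless it is already present.
# $1: path to a grub defaults file, $2: parameter such as nmi_watchdog=0
append_kernel_param() {
    file=$1
    param=$2
    line=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file") || return 1
    # Strip the variable name and the quotes, then look for the parameter
    # as a whole space-separated word.
    current=$(printf '%s' "$line" | sed 's/^[^=]*=//; s/"//g')
    case " $current " in
        *" $param "*) return 0 ;;   # already there, nothing to do
    esac
    cp "$file" "$file.bak" || return 1
    # Insert the parameter just before the closing quote of the value.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"|\1 $param\"|" \
        "$file.bak" > "$file"
}

# Example on a scratch copy:
#   cp /etc/default/grub /tmp/grub.test
#   append_kernel_param /tmp/grub.test nmi_watchdog=0
```

As with the earlier change, update-grub has to be run afterwards. For a one-off test, the parameter can instead be typed onto the linux line from the GRUB menu's edit screen.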


Re: Periodic Boot Failure
I wouldn't think it's the CPU; it's relatively new, plus I never had any issues on Mint 13 or 14 over the last year.
The system runs completely without issue after booting. Also, those images are just a sample - I'm sure it's been on every core (0-3) at least once by now. I'll do the nmi watchdog change and see what that does.
Thanks again.
- michaelm
Re: Periodic Boot Failure
nmi watchdog not the solution either, got another failure today - see attached screenshots for latest.
- michaelm
Re: Periodic Boot Failure
Apologies for adding yet another screenshot to this post, but today's result was 7 straight failures; only booting into recovery mode allowed me to post this (running software graphics also). There were several Mint updates due, so they've been applied and I'll reboot again after posting this. Hopefully it comes back again, at least on 1 in 3 attempts. Does anyone know which area is the most likely cause of this (kernel, hardware compatibility, hardware errors etc.), so at least I can inform the relevant people? Eagerly awaiting the next Mint release to see if that sorts things out.
Regards,
Michael.
- michaelm
Re: Periodic Boot Failure
FYI, this issue still persists. It is manageable, in that it usually takes no more than 3 reboots before I'm good to go. If anyone else knows of similar issues which have been resolved, I'd be keen to hear the solution. Thanks again all.
- michaelm