Periodic Boot Failure

michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Periodic Boot Failure

Post by michaelm »

Hi all,
Firstly I apologise in advance for the somewhat vague description which follows...

Long-time Mint user, and I've had the hardware setup listed below for some time without issue. However, I'm now in the situation where at irregular intervals I get a (very long) stack trace on boot with no obvious 'ah, nouveau died' style culprit. The only change I'm suspicious of is that I recently (~3 months ago) reinstalled Mint on an older hard drive I had lying around, to give my Win 7 instance its own drive. Before installing Mint I ran SpinRite to clean up and test the old drive to see if I was wasting my time, and it all went through smoothly.

Preceding the session I'm typing this from, the previous 5 restarts all resulted in failures (but before that it had been a week or more). Failures occur in many guises: sometimes it hangs with the stack trace, sometimes it reboots itself, sometimes the display just goes black and other times the mint loading logo comes up and never goes any further.

Any ideas welcome, I'm slightly confused as to what can cause irregular errors like this other than bad disk sectors - but I'm not convinced that's the issue. I will of course take a picture of the stack trace next time it happens and add to this post - I'm in the process of backing up currently so that might be another day.

Thanks,
Michael.

Hardware:

MB: GA-790A-D3 w/ F3 BIOS
CPU: AMD Phenom(tm) II X4 965 Processor (fam: 10, model: 04, stepping: 03)
RAM: 8GB (2x4GB) Corsair XMS 1333MHz Dual Channel (in the correct dual channel slots)
GPU: Nvidia GTS450
HDD0: /dev/sda: (WINDOWS 7 ULTIMATE x64)

ATA device, with non-removable media
Model Number: WDC WD2500KS-00MJB0
Serial Number: WD-WCANKH221xxx
Firmware Revision: 02.01C03

HDD1: /dev/sdb: (LINUX MINT 15 x64)

ATA device, with non-removable media
Model Number: MAXTOR STM3500320AS
Serial Number: 9QM14xxx
Firmware Revision: MX15
Transport: Serial

- michaelm
aless80
Level 2
Posts: 82
Joined: Wed Nov 28, 2012 5:20 am

Re: Periodic Boot Failure

Post by aless80 »

If I understand what you are saying, you have a dual-boot Linux/Windows setup, but on two hard disks. Is it possible that when you boot into Windows, Windows' UEFI system (check if you have that enabled) screws up Linux?
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

Hm, interesting idea. I don't think that is the case; however, the point at which I moved to Win 7 also coincided with moving from dual-boot-single-disk to dual-boot-dual-disk, so I will definitely take a look at that. Also, the Linux disk/boot can go from working to not working regardless of whether I have booted Windows in between, but UEFI is uncharted territory for me as a long-time Linux user, so I wouldn't rule it out! Thanks for the suggestion :)

Edit: Note I should add that the dual-boot-single-disk was XP and mint 14, not win7

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

I just booted win7 -> linux -> win7 -> linux and no issues. Not sure exactly whether that proves anything or not, but thanks for the idea - I think you might be in the right ball-park.

- michaelm
z1p101
Level 1
Posts: 31
Joined: Tue Jul 24, 2012 7:29 pm

Re: Periodic Boot Failure

Post by z1p101 »

I have the same problem, but it has been getting worse lately. The error comes and goes for some reason and the stack trace message is not always the same. I only have 1 hard drive in the machine and I don't have Windows installed on it. However, I do have the drive set up to multiboot Mint 15 Cinnamon, CentOS 6, and openSUSE 12.3, and Mint 15 is the only OS that gives me this boot error.

Hardware:

AMD Phenom X2 550 processor
ECS NFORCE6M-A2 motherboard w/ American Megatrends BIOS
HD0: WDC WD5000AAKS
RAM: 2x2GB Corsair XMS DDR3 dual channel
NVIDIA 210 graphics card
openSUSE controls GRUB.

I also have 2 ATA DVD drives (set master, slave), a floppy drive, and a USB multi-card reader on the machine, and the BIOS is set to boot from the hard drive last. Maybe that makes a difference?

I also have an Intel machine (Sandy Bridge Pentium on an H61 board) that has only Mint 15 installed and it runs flawlessly. It boots in EFI mode, if that makes a difference.

Thanks.
usbtux
Level 5
Posts: 974
Joined: Tue Dec 28, 2010 10:37 am

Re: Periodic Boot Failure

Post by usbtux »

Don't know if it was the same problem as mine but...

Machine: Mobo: ASUSTeK model: M5A78L-M/USB3 version: Rev X.0x Bios: American Megatrends version: 1401 date: 08/28/2012
CPU: Hexa core AMD FX-6100 Six-Core
Dual boot single disk Win 7 & Mint 13

Upgraded to Dual boot single disk Win 7 & Mint 15 and Nvidia 630 2g.

Problems in Mint 15: would not shut down cleanly, intermittent artefacts on screen, live USB failing to boot (getting halfway through the boot process), multiboot USB would not get to the GRUB screen.

Win 7 runs perfectly and games play well; the USB boot faults are as above.
========================================================
Removed nvidia card - mint runs perfectly
http://goo.gl/DXKgM LinuxMint tutorials.
Running LinuxMint 17.3 Mate. Pepermint 6 & Manjaro 15.12 Capella XFCE
http://goo.gl/WFu0u Installing Mint - the screen cast videos.
linuxcounter #368850
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

z1p101 wrote:I have the same problem but has been getting worse lately. ... Mint 15 is the only OS that gives me this boot error.


Very interesting situation there..
z1p101 wrote: I also have 2 ATA DVD drives (set master, slave), a floppy drive, and a USB multi-card reader on the machine, and the BIOS is set to boot from the hard drive last. Maybe that makes a difference?


I have a SATA DVD drive also, but I didn't mention that as I imagine it's fairly irrelevant, but mine is set to boot from cd/dvd prior to disk also.
z1p101 wrote: I also have an Intel machine (Sandy Bridge Pentium on an H61 board) that has only Mint 15 installed and it runs flawlessly. It boots in EFI mode, if that makes a difference.


This does seem to be AMD related, given the above and usbtux's issue as well. I just booted today and got a stack trace that cycled every ~10-15 seconds, which actually came in handy for taking pictures (see attached, apologies for the blur).
There were definitely soft and hard CPU lockups involved, even if they aren't shown here. These images were taken in sequence and they 'should' follow on from each other, but when the screen reloads I can't be sure a whole load of stack trace doesn't fly off the screen.
Last edited by michaelm on Tue Aug 27, 2013 1:43 pm, edited 1 time in total.

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

Rest of the stack trace pics.

- michaelm
WharfRat
Level 21
Posts: 13909
Joined: Thu Apr 07, 2011 8:15 pm

Re: Periodic Boot Failure

Post by WharfRat »

michaelm,

Open the file /etc/default/grub and change the line from GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" to GRUB_CMDLINE_LINUX_DEFAULT="debug=*" then run

Code: Select all

sudo /usr/sbin/update-grub
It might provide a little more info.
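
For anyone following along, a minimal sketch of making that edit from a script rather than by hand. The function name `set_grub_cmdline` is made up for illustration, and it assumes GNU sed (for `-i`). It keeps a `.bak` copy of the file before changing it.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRUB): replace the value of
# GRUB_CMDLINE_LINUX_DEFAULT in the given file, keeping a backup copy.
set_grub_cmdline() {
    file=$1
    value=$2
    cp "$file" "$file.bak" || return 1
    # Assumes GNU sed. "|" is the delimiter, so the value must not
    # contain "|" or "&".
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$value\"|" "$file"
}

# On the real system, as root, then regenerate grub.cfg:
#   set_grub_cmdline /etc/default/grub 'debug=*' && update-grub
```

Restoring the `.bak` copy and re-running update-grub should put the original "quiet splash" line back once the debugging is done.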

I notice you're running kernel version 3.8.0-19. Try doing a dist-upgrade to get the kernel to 3.8.0-29. Change that file above so the change carries over to the new kernel.
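
Before and after the dist-upgrade it can help to confirm which kernel is actually running. A rough sketch, assuming GNU `sort` with `-V` (version sort); `kernel_older_than` is a made-up helper name and the version numbers are the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 sorts strictly before $2.
# Assumes GNU sort, which provides -V (version sort).
kernel_older_than() {
    current=$1
    target=$2
    [ "$current" != "$target" ] &&
        [ "$(printf '%s\n%s\n' "$current" "$target" | sort -V | head -n 1)" = "$current" ]
}

# On the real system (uname -r prints something like 3.8.0-19-generic,
# so the flavour suffix is cut off first):
#   running=$(uname -r | cut -d- -f1,2)
#   kernel_older_than "$running" 3.8.0-29 && echo "still on the older kernel"
#   sudo apt-get update && sudo apt-get dist-upgrade
```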

Install smartmontools and run

Code: Select all

sudo smartctl -a /dev/sda
to have a look at the health of the disk
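
Going by the hardware list in the first post, Mint sits on /dev/sdb (the Maxtor) and /dev/sda is the Windows disk, so it may be worth checking both. A small sketch for picking the usual bad-sector attributes out of the `smartctl -A` attribute table; `smart_warnings` is a made-up name, and it assumes the standard table layout where the raw value is the tenth column.

```shell
#!/bin/sh
# Hypothetical filter: reads "smartctl -A" output on stdin and prints the
# sector-related attributes whose raw value (10th column) is above zero.
smart_warnings() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" {
             if ($10 + 0 > 0) print $2, $10
         }'
}

# On the real system:
#   for disk in /dev/sda /dev/sdb; do
#       echo "== $disk"; sudo smartctl -A "$disk" | smart_warnings
#   done
```

No output for a disk means none of those three counters is above zero; it does not prove the disk is healthy, so the full `smartctl -a` report is still worth reading.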

Also, does this computer lock up after a while? I didn't notice any mention of that in the prior posts. Do you get the same message about cpu#3 when it fails?
z1p101
Level 1
Posts: 31
Joined: Tue Jul 24, 2012 7:29 pm

Re: Periodic Boot Failure

Post by z1p101 »

michaelm,
I'm just trying to weed out the knowns and unknowns. Last night I looked into it being an NVIDIA problem, because I know that the NVIDIA kernel module is finicky about being matched to the correct kernel build and I don't know how the repo handles that. Gave up on that one quickly.

I think WharfRat is on to something with the problem being connected to GRUB. I created a custom GRUB boot script following the instructions at http://www.dedoimedo.com/computers/grub-2.html and this is roughly what it looks like.

Code: Select all

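#!/bin/sh -e
# Scripts in /etc/grub.d: anything written to stdout is copied into grub.cfg,
# so the progress message goes to stderr instead.
echo "Adding Mint to GRUB 2" >&2
cat << EOF
menuentry "Mint 15 partition 6" {
    # Sixth partition on the first disk (MBR partition table)
    set root='hd0,msdos6'
    linux /vmlinuz root=UUID=xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx ro quiet splash
    initrd /initrd.img
}
EOF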
The grub syntax has changed a little since the tutorial was done. I pulled most of the code out of my grub.cfg file and removed some of the flags like acpi=force and $vt_handoff. I can't guarantee it is fixed but the system seems to boot faster and with a lot less "static" and with no errors yet.

WharfRat, with me the system either boots or it doesn't. Once running, everything is fine. One other thing: sometimes the last line of my stack trace reads something like "recursive error fixed but reboot is needed". I can't quote it exactly, but that is the basic message.

Thanks.
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

z1p101 wrote:
WharfRat, with me the system either boots or it doesn't. Once running, everything is fine. One other thing: sometimes the last line of my stack trace reads something like "recursive error fixed but reboot is needed". I can't quote it exactly, but that is the basic message.

Thanks.
Mine is the same in both the above regards.

I'll do a kernel/grub update and see if that resolves anything. Thanks for the suggestions.

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

WharfRat wrote:michaelm,

Open the file /etc/default/grub and change the line from GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" to GRUB_CMDLINE_LINUX_DEFAULT="debug=*" then run

Code: Select all

sudo /usr/sbin/update-grub
It might provide a little more info.

...

Install smartmontools and run

Code: Select all

sudo smartctl -a /dev/sda
to have a look at the health of the disk
Done both of the above; no disk errors, as I expected. I'll give it a few cycles to test the newer GRUB before doing the kernel upgrade, so that I'll at least know which one fixed it (assuming one of them will).

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

Grub update doesn't seem to have worked, see attached for new screenshots - essentially the same as before. I've also added a shot of the successful boot's grub debug output. Hopefully this is of use to Mint developers.

I will try kernel update next and report back when/if it fails again.

- michaelm
WharfRat
Level 21
Posts: 13909
Joined: Thu Apr 07, 2011 8:15 pm

Re: Periodic Boot Failure

Post by WharfRat »

michaelm,

Those photos are a great help. Your first lockup referenced cpu 2 and the latest referenced cpu 3. Either you actually have a failing cpu or nmi watchdog is acting too quickly in detecting a hardware hang.

If your system runs OK when it does boot, then try disabling watchdog by adding nmi_watchdog=0 to your kernel line.

It appears to have gotten through the initramfs OK, but leave the debug=* there anyway just in case.
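
A minimal sketch of adding that parameter while leaving `debug=*` in place. `add_kernel_param` is a made-up helper name; it assumes GNU sed and that the value in the file is wrapped in double quotes, as it is in a default /etc/default/grub.

```shell
#!/bin/sh
# Hypothetical helper: append a parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file unless it is already there.
add_kernel_param() {
    file=$1
    param=$2
    # Already present on the line: nothing to do.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Assumes GNU sed and a double-quoted value; inserts the parameter
    # just before the closing quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# On the real system, as root, then regenerate grub.cfg:
#   add_kernel_param /etc/default/grub nmi_watchdog=0 && update-grub
```

After the next reboot, `cat /proc/cmdline` should list nmi_watchdog=0 if the change made it onto the kernel line.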
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

I wouldn't think it's the CPU; it's relatively new, plus I never had any issues on Mint 13 or 14 over the last year.

The system runs completely without issue after booting. Also, those images are just a sample - I'm sure it's been on every core (0-3) at least once by now. I'll do the nmi watchdog change and see what that does.

Thanks again.

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

nmi watchdog not the solution either, got another failure today - see attached screenshots for latest.

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

Apologies for adding yet another screenshot to this post, but today's result was 7 straight failures; only booting into recovery mode allowed me to post this (running software graphics as well). There were several Mint updates due, so they've been applied and I'll reboot again after posting this. Hopefully it comes back up, at least on 1 in 3 attempts. Does anyone know which area is the most likely cause of this (kernel, hardware compatibility, hardware errors etc.), so I can at least inform the relevant people? Eagerly awaiting the next Mint release to see if that sorts things out.

Regards,
Michael.

- michaelm
michaelm
Level 1
Posts: 19
Joined: Tue Feb 28, 2012 3:01 pm

Re: Periodic Boot Failure

Post by michaelm »

FYI this issue still persists, though it's manageable (in that it's usually not more than 3 reboots before I'm good to go). If anyone else knows of similar issues which have been resolved, I'd be keen to hear the solution. Thanks again all.

- michaelm