Mint died. What happened?

Posted: Mon May 28, 2007 7:39 am
by Lolo Uila
My new Mint install is dead, and I'm not sure why, or what happened?

Okay, so I did a fresh install of Cassandra on a blank 80GB hard drive this morning. There were no problems during the install, and the system did updates then restarted fine.

It ran most of the day without incident. I copied over some photos, mp3s, videos and documents from the Windows side of the system (I'm dual booting with Win2K on another hard drive). Messed around with gimp and some other apps I'm trying to familiarize myself with and everything was going fine.

Then my air conditioner started making bad noises. Since I had to shut the AC down for maintenance I decided to shut down the computers because it gets real hot in here with all the computers pumping out heat and no AC.

All the computers shut down normally (including the Mint machine). After a while I got the AC up and running again, but it was unbearably hot in here so I decided to go for a drive and let the house cool down again. I ended up having dinner with a girlfriend and got home this evening when things were nice and cool.

However, when I tried to boot up Mint, it got about a third of the way through the boot process, then the screen filled with all kinds of messages about missing files, volumes and superblocks, and fsck failing. I eventually got to a terminal prompt and tried to shut down, but even that failed! An attempted reboot yielded the same result. Something is seriously screwed up here!

Now the really odd part: I booted the Cassandra live CD and manually mounted the volumes, and everything seems to be fine. I can read all the files, view the photos, etc. I'm in the live CD now, listening to my mp3s off the supposedly failed file system.

Running fsck from the live CD yields this:

mint@mint:~$ fsck.ext3 /media/disk-2
e2fsck 1.40-WIP (14-Nov-2006)
fsck.ext3: Is a directory while trying to open /media/disk-2

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

Anyone have any idea what the heck is going on here?

Posted: Mon May 28, 2007 9:06 am
by clem
Your hard drive seems to have suffered from the heat. (I won't go into the details of why heat is bad for HDDs or how you can save an HDD by putting it in your freezer. Let's leave the hardcore stuff aside :))

fsck.ext3 should be run on your device, not on the mount point. Let's say your device is /dev/sda3:

sudo umount /dev/sda3
sudo fsck.ext3 /dev/sda3

Clem
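
If you are not sure which device sits behind a mount point such as /media/disk-2, something like the following may help. This is only a minimal sketch: the function name device_for is made up for illustration, and it assumes the volume is still mounted and that `df -P` prints the device name in the first column of its second line.

```shell
#!/bin/sh
# Print the block device that backs a mount point, using POSIX `df -P`.
# device_for is a hypothetical helper name, not part of any package.
device_for() {
    # Line 1 of `df -P` is the header; line 2 describes the filesystem.
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Usage (mount point taken from the first post):
#   dev=$(device_for /media/disk-2)
#   sudo umount "$dev" && sudo fsck.ext3 "$dev"
```

The device name is looked up before unmounting, because df can no longer resolve the mount point once the volume is unmounted.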

Posted: Mon May 28, 2007 7:41 pm
by Lolo Uila
I ran the SeaTools diagnostic on the drive and it passed all tests. There were no SMART errors recorded, and the max temp recorded was well within the operating spec of the drive.

I swapped the drive for another 80GB drive so I could run the diagnostics on it. Since I still have access to the data on it, are there any log or error files that could be helpful in figuring this out?

There really isn't anything important on the drive since it was a fresh install, but I'm wondering why this happened, because I would rather it not happen again.

Thanks for the help.
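
On the question of logs: with the old root partition mounted from the live CD, the saved system logs under var/log are a reasonable first place to look. The sketch below is an assumption-laden example, not a definitive procedure: scan_logs is a hypothetical helper, the mount point passed to it is whatever you chose, and the file names (syslog, kern.log, dmesg) are the usual ones on Debian and Ubuntu based systems and may differ elsewhere.

```shell
#!/bin/sh
# Search saved logs under a mounted root for filesystem and disk errors.
# $1 is where the broken install's root partition is mounted (your choice).
scan_logs() {
    root=$1
    for f in "$root"/var/log/syslog "$root"/var/log/kern.log "$root"/var/log/dmesg; do
        # Skip log files that do not exist on this install.
        [ -f "$f" ] || continue
        # -H prefixes each hit with the file name, -i ignores case.
        grep -H -i -E 'EXT3-fs|I/O error|superblock|fsck' "$f"
    done
    return 0
}

# Usage (assumed mount point):
#   scan_logs /media/disk
```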

Posted: Tue May 29, 2007 6:06 am
by Husse
Lolo Uila wrote: Seatools diagnostic
Is that a hardware-only test? As far as I can see from Seagate, it appears so.
I believe something in your computer did not like the heat, and you got logical errors on the disk as a result.
Run fsck.ext3 as Clem suggests and it should repair your filesystem. You don't have anything valuable on the drive, so if it turns out that you can't access part of your data, you don't have to cry floods :)

Posted: Tue May 29, 2007 8:08 pm
by Boo
If the primary superblock is dead, fsck may not fix your problem automatically.
Backup copies of the superblock are saved in many places in the file system.

So you are going to have to read the man pages on fsck and use more options.
That is the fun part.
e.g.: fsck.ext3 -b blocknumber /dev/sda3

To get the other block numbers you need to interrogate the raw device (more man pages).
e.g.: mke2fs -n /dev/sda3
(The -n flag makes mke2fs only print what it would do, including the backup superblock locations, without writing anything. Do not leave it out.)

Now that you have the other block numbers, work your way through them with the fsck command.

I hope this helps.
:D
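
Working through the backup superblocks one at a time can be sketched roughly as below. This is only an illustration under stated assumptions: backup_blocks and suggest_fsck are made-up helper names, /dev/sda3 is the example device used earlier in the thread, and the parser assumes mke2fs prints a "Superblock backups stored on blocks:" line followed by comma-separated block numbers. The script prints candidate commands and runs none of them.

```shell
#!/bin/sh
# Turn the backup list printed by `mke2fs -n` into candidate e2fsck commands.
# Nothing here touches a disk; the commands are only echoed for review.

# Read mke2fs -n output on stdin, print one backup block number per line.
backup_blocks() {
    awk '
        /Superblock backups stored on blocks:/ { found = 1; next }
        found && /^[ \t]*[0-9]/ {
            gsub(/,/, " ")                  # commas become field separators
            for (i = 1; i <= NF; i++) print $i
            next
        }
        found { exit }                      # the list ends at the first other line
    '
}

# Print (do not run) one e2fsck command per backup block for device $1.
suggest_fsck() {
    dev=$1
    backup_blocks | while read -r blk; do
        echo "sudo e2fsck -b $blk $dev"
    done
}

# Usage (assumed device; -n keeps mke2fs from writing anything):
#   sudo mke2fs -n /dev/sda3 | suggest_fsck /dev/sda3
```

Try the printed commands one at a time and stop at the first backup that e2fsck accepts. Note that mke2fs -n reports locations for the default parameters, so the list may not match a filesystem that was created with non-default options.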

Posted: Mon Jun 04, 2007 4:08 am
by Lolo Uila
Actually, what killed it is the new 2.6.20-16 kernel update. :x

http://www.linuxmint.com/forum/viewtopic.php?t=2894

Posted: Mon Jun 04, 2007 1:18 pm
by scorp123
Lolo Uila wrote: My new Mint install is dead, and I'm not sure why, or what happened?
Bingo! Same thing here :evil:

I did a normal shutdown in my office ... and now when I wanted to boot my laptop again it gives me error after error.

To shorten the story: I booted with a live CD, checked all filesystems, and what I found out is that the /usr partition is now empty ... !!! :shock:

Oh well ... looks like re-install time ... :roll: But thanks to the UNIX partitioning scheme I always use, I can now back up /etc first (because it's on the / partition) and thus preserve my working system configuration, and I don't need to touch /home at all ...

It just sucks that something like this could even happen ... :?

Posted: Mon Jun 04, 2007 2:38 pm
by Lolo Uila
Wow, that sucks.

On my SATA test system I found I can still boot if I select the previous 2.6.20-15 kernel from the grub boot menu. It didn't hose my data, just my mount points.

Do you think it was the file system checks that run during the boot error that wiped your /usr partition?

Sorry you got hit with it too. This new kernel really messed up a lot of people. You should see all the activity about this on the Ubuntu forums. A lot of unhappy people there.

Aloha, Tim

Posted: Mon Jun 04, 2007 4:50 pm
by scorp123
Lolo Uila wrote: This new kernel really messed up a lot of people. You should see all the activity about this on the Ubuntu forums. A lot of unhappy people there.
That's precisely what I mean when I say that Ubuntu is still lacking a lot of polish and is not really for newbies .... Too many strange things can happen. :wink:

Posted: Tue Jun 05, 2007 5:19 am
by Husse
But this is heavily hardware dependent, i.e. ICH chipsets got more or less knocked out, while nforce4 (which I have) isn't touched at all...
Maybe the devs should have a few more computers to test on...