Well, it happened again - the hard drive in my good old RedHat machine got corrupted. Strangely enough, the previous failures also occurred around this time of year over the past few years. One of those failures was so bad that I had to buy hard drive recovery software to salvage my CVS repository. This time there seem to be 220+ bad blocks (about 1MB, at 4K/block), and most of the content is still accessible, although I still don't know the extent of the damage.
I have to say that I miss Windows' chkdsk, which not only reports bad blocks, but also the names of files or directories affected by the damaged blocks. e2fsck, on the other hand, comes up with pretty cryptic messages, such as "Attempt to read block from filesystem resulted in short read" or "...Force rewrite(y)?". I'm also more careful this time and so far haven't allowed e2fsck to auto-fix the partition - the last time I did this, it cost me the entire hard drive.
To add insult to injury, it turned out that my Vantec NAS unit cannot handle files larger than 4GB when accessed over the network, and the last full backup was just over 4GB, so I don't have a full backup of the damaged drive either.
So, I will spend a bit more time trying to recover the existing content, but if it turns out to be unrecoverable, I will probably just not ship Fedora Core binaries in the next release, as I don't have a spare machine for FC builds and, with two donations per year, I'm not in a position to buy one.
April 10th, 2008
Last night I ran dump a few times, adding bad inode numbers to the exclusion list after every run, and was eventually able to back up almost the whole drive. Once this was done, I ran e2fsck with -y. Half an hour later I had a usable hard drive with a few holes here and there. e2fsck's messages may be cryptic, but it did the job.
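For anyone facing the same chore, the retry loop is easy to sketch in Python. This is only an illustration of the idea, not the exact commands that were run: the function names are made up, `run_once` is a hypothetical callback that performs one backup attempt and reports the inode that broke it, and the command builder assumes that dump accepts a comma-separated inode list through `-e`, which is worth checking against the man page of your dump version.

```python
from typing import Callable, List, Optional


def build_dump_command(device: str, archive: str, excluded: List[int]) -> List[str]:
    """Build a level-0 dump command line (assumes dump's -e takes
    a comma-separated list of inode numbers to skip)."""
    cmd = ["dump", "-0", "-f", archive]
    if excluded:
        cmd += ["-e", ",".join(str(inode) for inode in excluded)]
    cmd.append(device)
    return cmd


def dump_with_exclusions(
    run_once: Callable[[List[int]], Optional[int]],
    max_runs: int = 50,
) -> List[int]:
    """Repeat the backup, excluding one more bad inode after each failed run.

    run_once (hypothetical) receives the current exclusion list and returns
    the inode number that broke the run, or None if the run completed.
    Returns the final exclusion list.
    """
    excluded: List[int] = []
    for _ in range(max_runs):
        bad_inode = run_once(list(excluded))
        if bad_inode is None:
            return excluded
        if bad_inode in excluded:
            # Excluding this inode did not help, so stop instead of looping forever.
            raise RuntimeError(f"inode {bad_inode} failed even though it was excluded")
        excluded.append(bad_inode)
    raise RuntimeError(f"gave up after {max_runs} runs")
```

How `run_once` finds the failing inode is left open on purpose: it could parse dump's error output or simply ask the person at the keyboard, which is effectively what happens when doing this by hand.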
May 3rd, 2008
I finally figured out what the problem was. One day I noticed that the CD drive was performing erratically, sometimes failing to read perfectly clean CDs. Suspicious of this, I replaced the drive with a spare I had and, voilà, the IDE interface stopped throwing the sporadic SDA errors I had been seeing once every few days.