Motherboard I/O controller problems
Well, I came home today and powered on my server's monitor to see a wonderful array of disk I/O errors, from god knows when. I didn't get a shot of them, but they definitely looked to have been caused by a hardware defect. Upon pressing Ctrl+Alt+Del, bash returned that "/sbin/shutdown" couldn't be loaded.
I haven't powered the server back on since, but I'll go get my UBCD and try to run a hard disk diagnostic soon.
This is really aggravating. If this hard drive is dead, I'm out of options: this WD800 was pretty much the last medium-small drive in my lineup to use for the server. Now I'll have to go spend $70 on a 500GB SATA II drive, which I really don't want to do. Let me clarify: I think that the motherboard's I/O controller has been killing hard drives all along. It would make a lot of sense, but at the same time, it's now cost me a LOT in data storage, especially with the increased price of drives now.
The server (hardware-wise, a desktop) consists of the following:
- Gigabyte EP45-UD3R (rev. 1.0)
- C2Q/Q6600 (2.4 GHz)
- 3GB DDR2 800
- WD800 EIDE
Including its current WD800, this computer has successfully ravaged the following hard disks:
- Maxtor SATA I (~160GB)
- Another Maxtor, SATA I (~200GB). The two Maxtors were in the computer when I got it, and died shortly after powering it up. Each had developed 2200+ reallocated sectors (though I'm not sure over how long). They ran somewhat hot, IIRC, but that's beside the point.
- Seagate, SATA II, 160GB; I bought this drive on eBay, and the seller claimed it was working. After 4 or 5 hours of operation, it developed 600+ reallocated sectors.
- Western Digital WD600 EIDE (60GB); I'm not sure it ever died. It was the drive in the machine before I attempted the swap (see the next drive). It always had its perpetual 11 reallocated sectors, and it would vibrate and buzz loudly for about half an hour after initial spinup, but as a server drive it worked pretty well for a while. So this one doesn't exactly count; it had a running life of 7.6 years according to its SMART tables. I liked that drive, lol.
- Samsung SATA II, (1TB); this one it only killed halfway and I was able to revive it. The drive kept developing "pending sectors" while I had it installed, and I would get repeated I/O errors in either dmesg or file transfers, especially in my last emergency backup of it. I put it back in its initial computer, and it wiped successfully and the "pending" sectors went away. Hmm... (there's another thread about this here...)
- And now, of course: Western Digital WD800, EIDE, (80GB); apparently just failed.
I was still in the process of configuring the Linux installation on this machine and while I hadn't put any non-transitory data back onto the drive, it's going to suck to have to start all over, as well as to have to find a new drive, or motherboard, or server.
I guess that's what I get for being a freeloader. I had to replace the case, PSU, hard drives, optical drive, and GPU, and apparently now the entire computer is useless. Bummer. So I guess I have an extra case, and some extra peripherals. It's a pity, because buying another LGA775 board wouldn't be cost efficient, though this Q6600 is quite a nice CPU for a freebie. The board, actually, was always kind of bent, or permanently flexed. I tried to bend it back when I switched cases, but I think it permanently stays that way. Probably not the main cause of the problem, but a symptom of whoever built it before being an idiot...
Comments
My only "extra" PC with SATA would be, I guess, the Asus Terminator C3, because I don't want to screw with the Dell. Hmm...
Anyway, should I just trash the rest of this PC, or tear it down and assimilate it into my component storage and stuff? Or is this just a really annoying coincidence?
You did mention replacing the PSU... I would suggest that a bad PSU could have been the initial cause, damaging the southbridge and then the hard drives.
But then I would also make sure that the replacement PSU hasn't gone bad as well.
What does smartctl report? (smartctl --test=short /dev/sdX starts a short self-test; smartctl -a /dev/sdX shows the results once it finishes)
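Since reallocated and pending sector counts keep coming up in this thread, here is a minimal sketch of pulling those two numbers out of the attribute table that `smartctl -A` prints. It assumes the usual ten-column table layout; the function names and the sample rows are made up for illustration and are not output from any drive mentioned here.

```python
import subprocess

# Attributes that track remapped or currently unreadable sectors.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_smart_attributes(text):
    """Map attribute name -> integer raw value from `smartctl -A` output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # rows whose raw value is not a plain integer are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def nonzero_watched(attrs):
    """Return only the watched attributes whose raw value is above zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


def read_attributes(device):
    """Run smartctl against a real device (needs smartmontools and root)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_attributes(result.stdout)


# Made-up sample rows, NOT taken from any drive in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       11
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       1234
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(nonzero_watched(parse_smart_attributes(SAMPLE)))
```

Logging these counts for the same drive in two different machines would help separate a dying drive from a controller that only produces errors in one box.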
The PSU it recently used (I've since taken it apart) was a nicer Antec 500W ATX, history unknown, but its voltages were no more than +/- 0.1V from specs.
stitch: I should've run "smartctl"; however, I've since taken it apart almost entirely, as it threw bootloader errors, even though the hard drive was recognized by the BIOS.
Currently I intend to retest the various hard drives in the C3 and see if I can wipe them. The original two may have actually died because they were clearly used with the crap PSU for some time, but we'll see. Then I guess I'll swap graphics cards and make the Phenom II a hybrid server/workstation.
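For reference, a rough sketch of checking rail readings against the ATX tolerance of +/- 5% on the positive rails. The rail table and function name are made up for this example, and the readings are illustrative values 0.1V off nominal, matching the deviation described above. A static multimeter reading like this says nothing about ripple or behaviour under load, so passing it does not clear the PSU.

```python
# Nominal voltage and allowed fractional deviation for the main positive rails.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
}


def out_of_tolerance(readings):
    """Return the rails whose measured voltage falls outside the tolerance."""
    bad = {}
    for rail, measured in readings.items():
        nominal, tolerance = ATX_RAILS[rail]
        if abs(measured - nominal) > nominal * tolerance:
            bad[rail] = measured
    return bad


if __name__ == "__main__":
    # Illustrative readings, each 0.1V away from nominal.
    print(out_of_tolerance({"+3.3V": 3.4, "+5V": 5.1, "+12V": 11.9}))
```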
What I must have been confused by was the presence of the Gigabyte SATA 2 chip on the board, which I believe semi-killed that TB drive. Maybe it's only semi-dead. Either way I think if anything I'll buy a PCI controller. Else, I'll switch computers and make the Phenom the server as mentioned before.
(In which case, anyone want a Zalman CPU fan and a Q6600? :P)
As for diagnostic software, I'm not aware of anything that tests the chipset specifically. I also can't say that I've really looked.
<Editing to add something I forgot>
Also yeah, it will usually. If one part of the southbridge is dying, is it really worth it to risk more parts dying? That could also be only the things you know that are dying..
Anyway, I'm working on testing the hard drives. So far: Maxtor I (120GB, actually) is quite dead. A simple zero failed halfway through. Maxtor II (250GB, actually) seems fine, it wiped successfully despite previous records of reallocated sectors. I don't think I'll use it for a main drive, but maybe a backup. Next up is the Seagate which I think I'll have better chances with.
If you want to get rid of that Q6600 I would be tempted to buy that off of you for either cash or assorted working server hardware.
That's not very hard to do..
Though at the same time, if you get bent pins, they're probably far easier to straighten on a 1.5" square, separate from the board, than on the board's socket itself. But that's generally bad anyway.
LGA and built-in thermal protection, come on, AMD :P
I got the Q6600's HSF off, and I have the CPU now, cleaned of its previous TIM.
I also figured out how to remove the Athlon / Socket A (462) heatsinks and had a lot of fun ripping like 5 of those apart. I found it interesting that an Athlon XP 2800+ (IIRC) still had the 1999 copyright date like my other Durons... I thought they were way ahead of those chips but perhaps I was wrong...
I hope so because LGA is amazing.
I hope not because I'd like to see FM1 be current for a few more years. FM1 systems are such a good value now, for compact systems without graphics cards.
Although Intel is STILL the absolute best value.. lol.
K.
And anything that requires the i3 is probably going to use an external graphics card anyway.. and even if it didn't, the APU graphics aren't really that good either. You're still not going to be able to play games or anything.
Plus having the LGA1155 socket means that if you still want, you can upgrade it to an i5, which is pretty much the fastest chip that doesn't cost an arm and a leg..