Motherboard I/O controller problems

edited June 2012 in Hardware
Well I came home today and powered on my server's monitor to see a wonderful array of disk I/O errors, from god knows when. I didn't get a shot of them, but they definitely looked to have been caused by a hardware defect. Upon pressing C+A+D, bash returned that "/sbin/shutdown" couldn't be loaded.

I haven't powered the server back on since, but I'll go get my UBCD and try to run a hard disk diagnostic soon.

This is really aggravating. This WD800 was pretty much the last medium-small drive I had lined up for the server, so if it has died, I'll have to go spend $70 on a 500GB SATA II drive, which I really don't want to do. Let me clarify: I think that the motherboard's I/O controller has been killing hard drives all along. It would make a lot of sense, but at the same time, it's now cost me a LOT in data storage, especially with the increased price of drives now.

The server (hardware-wise, a desktop) consists of the following:
- Gigabyte EP45-UD3R (rev. 1.0)
- C2Q/Q6600 (2.4 GHz)
- 3GB DDR2 800
- WD800 EIDE
Including its current WD800, this computer has successfully ravaged the following hard disks:
- Maxtor SATA I (~160GB)
- Another Maxtor, SATA I (~200GB) (the two Maxtors were in the computer when I got it, and died shortly after powering it up. Each had (though I wasn't sure for how long) developed 2200+ reallocated sectors. They ran somewhat hot, IIRC, but that's beside the point.)
- Seagate, SATA II, 160GB; I bought this drive on eBay, which the seller claimed was working. After 4 or 5 hours of operation, it developed 600+ reallocated sectors.
- Western Digital WD600 EIDE (60GB); I'm not sure it ever died. It was the drive in the machine before I attempted the swap (see the next drive). It always had its perpetual 11 reallocated sectors, and it would vibrate and buzz loudly for about half an hour after initial spinup, but as a server drive it worked pretty well for a while. So this one doesn't exactly count; it had a running life of 7.6 years according to its SMART tables. I liked that drive, lol.
- Samsung SATA II, (1TB); this one it only killed halfway and I was able to revive it. The drive kept developing "pending sectors" while I had it installed, and I would get repeated I/O errors in either dmesg or file transfers, especially in my last emergency backup of it. I put it back in its initial computer, and it wiped successfully and the "pending" sectors went away. Hmm... (there's another thread about this here...)
- And now, of course: Western Digital WD800, EIDE, (80GB); apparently just failed.
I was still in the process of configuring the Linux installation on this machine and while I hadn't put any non-transitory data back onto the drive, it's going to suck to have to start all over, as well as to have to find a new drive, or motherboard, or server.

I guess that's what I get for being a freeloader. I had to replace the case, PSU, hard drives, optical drive, and GPU, and apparently now the entire computer is useless. Bummer. So I guess I have an extra case, and some extra peripherals. It's a pity, because buying another LGA775 board wouldn't be cost efficient, though this Q6600 is quite a nice CPU for a freebie. The board, actually, was always kind of bent, or permanently flexed. I tried to bend it back when I switched cases, but I think it permanently stays that way. Probably not the main cause of the problem, but a symptom of whoever built it before being an idiot...

Comments

  • What both annoys and surprises me about this is that both SATA and IDE drives have been killed by this board... does this mean both controllers are dead? That's lame. Gigabyte's terrible.
  • On the positive side, I still have (I think) those dead hard drives from before, which may in fact be salvageable. That would actually be great news, that's like 3 extra SATA II hard drives... we'll see.

    My only "extra" PC with SATA, would be I guess the Asus Terminator C3, because I don't want to screw with the Dell. Hmm...

    anyway, should I just trash the rest of this PC, or tear it down and assimilate it into my component storage and stuff? Or is this just a really annoying coincidence?
  • It's possible that it's the motherboard, probably a bad southbridge.

    You did mention replacing the PSU... I would suggest that a bad PSU could have been the initial cause, damaging the southbridge and then the hard drives.

    But then I would also make sure that the replacement PSU hasn't gone bad as well.
  • Have you ever stuck the 'dead' drives into another PC to see if the I/O errors persist? It's highly unlikely that the SATA chipset is outright *killing* the drives. If it's the SATA controller and your PSU is good, just order a cheap PCI controller off Newegg; they're like $15.

    What does smartctl report? (smartctl --test=short /dev/sdX starts a short self-test; smartctl -a /dev/sdX prints the results once it finishes)
  • The original PSU definitely may have been a factor; it powered on but was a crappy non-ATX one, specific to its Antec case.

    The PSU it recently used (I've since taken it apart) was a nicer Antec 500W ATX, history unknown, but its voltages were no more than +/- 0.1V from specs.

    stitch: I should've run "smartctl"; however, I've since taken the machine apart almost entirely, as it threw bootloader errors, even though the hard drive was recognized by the BIOS.

    Currently I intend to retest the various hard drives in the C3 and see if I can wipe them. The original two may have actually died because they were clearly used with the crap PSU for some time, but we'll see. Then I guess I'll swap graphics cards and make the Phenom II a hybrid server/workstation.
  • Also, what concerns me is that both IDE and SATA drives have died. If indeed I can salvage the board, etc., even a PCI IDE controller would be a good idea; I believe I have an extra...
  • IDE and SATA controllers are functions of the same chip, i.e. the southbridge.
  • ah, that would make sense, thanks for explaining.
    What I must have been confused by was the presence of the Gigabyte SATA 2 chip on the board, which I believe semi-killed that TB drive. Maybe it's only semi-dead. Either way I think if anything I'll buy a PCI controller. Else, I'll switch computers and make the Phenom the server as mentioned before.
  • My only question now is how I would confirm whether the southbridge chip is bad, i.e. via a diagnostic. Also, wouldn't a bad southbridge then affect the onboard audio, USB, and other critical I/O, regardless? (Even if I buy a controller card...)

    (In which case, anyone want a Zalman CPU fan and a Q6600? :P)
  • Well, it depends on what's wrong with the chip. It's possible that only certain functions are impaired. It's also possible that the other functions are impaired and you don't realize it yet.

    As for diagnostic software, I'm not aware of anything that tests the chipset specifically. I also can't say that I've really looked.
  • If you're getting rid of it, I'll take the CPU and the cooler.. maybe the RAM if it's DDR2. My grandma's PC is LGA775 and could really use a good boost. :P

    <Editing to add something I forgot>

    Also yeah, it will usually. If one part of the southbridge is dying, is it really worth it to risk more parts dying? The drive controllers could also be only the parts you know are dying..
  • right. Well anyway, I don't think I'll get rid of it entirely, and while I don't *need* a Q6600, they still fetch some money on the eBays and whatnot, but we'll see...

    Anyway, I'm working on testing the hard drives. So far: Maxtor I (120GB, actually) is quite dead. A simple zero-fill failed halfway through. Maxtor II (250GB, actually) seems fine; it wiped successfully despite previous records of reallocated sectors. I don't think I'll use it for a main drive, but maybe a backup. Next up is the Seagate, which I think I'll have better chances with.
  • If you want to get rid of that Q6600 I would be tempted to buy that off of you for either cash or assorted working server hardware.
  • That actually is an option I may be willing to consider; PM me and we can discuss that. Basically I just have to figure out how to remove the HSF :P.
  • That's not very hard to do..
  • yes, I know, I'm not entirely retarded lol... I just hadn't looked it up yet, and normally they are somewhat hard to remove especially when the thermal paste sticks the processor to it pretty well... I normally use a hair drier around the base of the HSF to heat it up first if the computer isn't in working condition, otherwise I've had the CPU come out with the heatsink, without unlatching the pins, which is generally bad...
  • That's the beauty of LGA. The pins are on the motherboard and it's pretty much impossible (or at least, hard to do) to remove the CPU accidentally.
  • that is actually nice. And you don't have to be as careful storing the CPUs, as the contact points aren't exposed nearly as much.

    Though at the same time, if you get bent pins, they're probably far easier to straighten on a 1.5" square, separate from the board, than on the board's socket itself. But that's generally bad anyway.

    LGA and built-in thermal protection, come on, AMD :P
  • I've seen some newer Opterons in LGA sockets.
  • That's cool, though I wish AMD would use LGA for desktop boards and such. Ah well.

    I got the Q6600's HSF off, and I have the CPU now, cleaned of its previous TIM.

    I also figured out how to remove the Athlon / Socket A (462) heatsinks and had a lot of fun ripping like 5 of those apart. I found it interesting that an Athlon XP 2800+ (IIRC) still carried the 1999 copyright date like my other Durons... I thought they were way ahead of those chips but perhaps I was wrong...
  • I'm pretty sure AMD will move to LGA for desktop boards eventually. As stitch mentioned, they're already starting to on server boards.
  • I both hope so, and don't hope so.

    I hope so because LGA is amazing.

    I hope not because I'd like to see FM1 be current for a few more years. FM1 systems are such a good value now, for compact systems without graphics cards.

    Although Intel is STILL the absolute best value.. lol.
  • Intel. Value.


    K.
  • The Core i3 vastly outperforms everything AMD has right now in single-threaded tasks.

    And anything that requires the i3 is probably going to use an external graphics card anyway.. and even if it didn't, the APU graphics aren't really that good either. You're still not going to be able to play games or anything.

    Plus having the LGA1155 socket means that if you still want, you can upgrade it to an i5, which is pretty much the fastest chip that doesn't cost an arm and a leg..
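
For the SMART check suggested in the comments, here is a minimal Python sketch, assuming smartmontools is installed. It reads the attribute table that `smartctl -A` prints and flags nonzero reallocated or pending sector counts, the two attributes discussed in this thread. The function names and the SAMPLE text are made up for illustration; the sample is not output from any of the drives above.

```python
import subprocess

# SMART attributes that came up in this thread.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")

# Illustrative text shaped like a `smartctl -A` attribute table (not real drive output).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       11
194 Temperature_Celsius     0x0022   038   052   000    Old_age   Always       -       38 (Min/Max 20/52)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for each row of the attribute table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header and any other lines are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def suspicious(attrs):
    """Return only the watched attributes whose raw count is above zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


def read_drive(device):
    """Run `smartctl -A` on a device such as /dev/sdX (usually needs root)."""
    # smartctl's exit status is a bitmask and can be nonzero even when the
    # table was printed, so the return code is not treated as an error here.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smart_attributes(result.stdout)
```

Calling `suspicious(parse_smart_attributes(SAMPLE))` exercises the parser without touching a disk; `suspicious(read_drive("/dev/sdX"))` would do the same against a real drive. A nonzero count is only a hint: as the thread itself suggests, retesting the drive in another machine is what separates a bad drive from a bad controller or PSU.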