List of Node Failures
Node | Failure Mode | Solution / Notes |
[1,5] | PXE Halt - Locks up during execution of PXE code | Multiple resets (more than 1) may be required; might require node change |
[1,5] | Dead Node ID box top LED (the blinking one) | Power cycle fixed it; Rabbit issue? |
[3,8] | First Power on Halt | Locks during the first attempt; POSTs after reset |
[17,4] | First Power on Halt | Locks during the first attempt; no serial console output |
[1,14] | First Power on Halt | Locks during the first attempt; reset fixes it; has new disk |
[20,19] | Disk Failure | Kernel throws errors during imaging; disk changed |
[12,9] | Disk Controller Failure | Disk controller was having issues, disks were being incorrectly recognised |
[3,18] | Disk Failure | Disk write errors; disk replaced |
[5,11] | Disk Failure | Disk write errors; disk replaced |
[14,11] | Disk Failure | Disk write errors; disk replaced |
[13,5] | Lock Up | Rabbit and node were halted; power cycled |
[4,11] | Disk Failure | Disk write errors; disk replaced |
[5,9] | Disk Failure | Disk write errors; disk replaced |
[9,11] | Disk Failure | Disk write errors; disk replaced |
[3,19] | Bad Node | Motherboard failure, refused to boot; replaced |
[14,8] | Disk Failure | Kernel throws disk errors; disk changed |
[17,9] | Disk Failure | Disk write halts, imaging times out; disk replaced |
[18,3] | Overheat | CM measures internal temp at 106°F; fails to boot reliably |
[20,2] | Disk Failure | Disk write errors; disk replaced |
[8,13] | Disk Failure | Disk write errors; disk replaced |
[9,10] | Disk Failure | Disk write errors; disk replaced |
[5,2] | Disk Failure | Disk write errors; disk replaced |
[17,13] | Disk Failure | Disk write errors; disk replaced |
[12,1] | Disk Failure | Disk write errors; disk replaced |
[6,14] | Disk Failure | Disk write errors; disk replaced |
[17,19] | Memory Failure | Memory pins did not make proper contact; bent case and reinserted memory |
[7,2] | Disk Failure | Disk write errors; disk replaced |
[5,15] | Lock Up | Rabbit and node were halted, node ID box LED was solid; power cycled |
[7,2] | Lock Up | Rabbit and node were halted, node ID box LED was off; power cycled |
[16,1] | Lock Up | Rabbit and node were halted; power cycled |
[1,9] | Intermittent failure | Power cycled |
[1,5] | Disk Failure | Failing disk caused disk controller to fail; CM also had issues, both replaced |
[9,4] | Disk Failure | Failing disk caused disk controller to fail; CM also had issues, both replaced |
[15,6] | Disk Failure | Disk write errors; disk replaced |
[18,16] | Disk Failure | Disk write errors; disk replaced |
[3,11] | Disk Failure | Disk write errors; disk replaced |
[16,19] | Disk Failure | Disk write errors; disk replaced |
[5,17] | Disk Failure | Disk write errors; disk replaced |
[20,4] | Node Failure | Node was replaced |
[15,4] | Node Failure | Node was replaced, bad left antenna connector. Replacement was used |
[5,14] | Overheat | Fan was not plugged in |
[17,4] | Disk Failure | Smartctl reports impending disk death |
[9,9] | Memory Failure | Memory pins did not make proper contact; bent case and reinserted memory |
[11,4] | Disk Failure | Disk write errors; disk replaced |
[12,7] | Disk Failure | Disk write errors; disk replaced |
[13,2] | Disk Failure | Successfully booted from disk, but kernel was throwing disk errors |
[16,6] | Disk Failure | SMART overall-health self-assessment test result: FAILED! |
[13,5] | Disk Failure | Kernel throwing disk errors |
[17,3] | Disk Failure | Kernel throwing disk errors |
[14,12] | PXE Halt - Locks up during execution of PXE code | Not fixed |
[11,15] | Network Failure | PXE gives media check failure; node replaced |
[19,6] | PXE Halt | Powers down during PXE |
[15,7] | PXE Halt | Halts at random stages in the PXE image download process, before control is handed over to the kernel |
[16,8] | CM crash | Power cycled |
[20,20] | CM crash | CM light stays solid; power cycled |
[7,2] | CM crash | Node ID light stays off; power cycled |
[2,20] | CM crash | CM light stays solid; power cycled |
[14,12] | Disk Failure | Disk write errors; disk replaced |
[10,7] | Disk Failure | Disk write errors; disk replaced |
[11,18] | Disk Failure | Disk write errors; disk replaced |
[1,15] | Disk Failure | Disk write errors; disk replaced |
[8,3] | Disk Failure | Disk write errors; disk replaced |
[2,11] | Disk Failure | Disk write errors; disk replaced |
[11,16] | Disk Failure | Disk write errors; disk replaced |
[7,8] | Disk Failure | BIOS does not detect disk; disk replaced |
[18,7] | Disk Failure | BIOS does not detect disk; disk replaced |
[2,17] | Disk Failure | BIOS does not detect disk; disk replaced |
[5,19] | Disk Failure | BIOS does not detect disk; disk replaced |
[7,2] | Disk Failure | Kernel throwing disk errors; disk replaced |
[12,4] | Disk Failure | Kernel throwing disk errors; disk replaced |
[1,8] | Disk Failure | Kernel throwing disk errors; disk replaced |
[18,18] | Disk Failure | Kernel throwing disk errors; disk replaced |
[14,20] | Disk Failure | Kernel throwing disk errors; disk replaced |
[9,16] | Disk Failure | Kernel throwing disk errors; disk replaced |
[4,6] | Disk Failure | Kernel throwing disk errors; disk replaced |
[6,8] | Disk Failure | Kernel throwing disk errors; disk replaced |
[3,13] | Disk Failure | Kernel throwing disk errors; disk replaced |
[5,4] | Disk Failure | Kernel throwing disk errors; disk replaced |
[10,5] | Disk Failure | Kernel throwing disk errors; disk replaced |
[10,8] | Disk Failure | Kernel throwing disk errors; disk replaced |
[8,8] | Network Failure | Kernel throws network hardware complaint during DHCP |
[12,4] | Disk Failure | BIOS does not detect disk; disk replaced |
[8,10] | Disk Failure | BIOS does not detect disk; disk replaced |
[15,17] | Disk Failure | BIOS does not detect disk; disk replaced |
[10,2] | Disk Failure | BIOS does not detect disk; disk replaced |
[1,6] | Disk Failure | Kernel throwing disk errors; disk replaced. Kernel log: hda: dma_timer_expiry: dma status == 0x21 |
[18,12] | Disk Failure | Kernel throwing disk errors; disk replaced. Kernel log: hda: dma_timer_expiry: dma status == 0x21 |
[1,10] | Disk Failure | Kernel throwing disk errors; disk replaced |
[13,8] | Disk Failure | Kernel throwing disk errors; disk replaced |
[12,12] | Disk Failure | Kernel throwing disk errors; disk replaced |
[8,8] | Disk Failure | Kernel throwing disk errors; disk replaced |
[2,3] | Disk Failure | Kernel throwing disk errors; disk replaced |
[2,14] | Disk Failure | Kernel throwing disk errors; disk replaced |
[13,17] | Disk Failure | Kernel throwing disk errors; disk replaced |
[16,17] | Disk Failure | Kernel throwing disk errors; disk replaced |
[1,2] | Node Failure | Can't isolate problem: seems to overheat and kernel panic |
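
A log like this is easier to act on once the rows are tallied by failure mode. The following is a minimal sketch of one way to do that in Python, assuming the rows are available as plain text in the same `node | failure mode | notes |` layout used above. The names `SAMPLE_ROWS`, `parse_row` and `tally_modes` are hypothetical helpers introduced for illustration and are not part of any existing tooling; the sample holds only a handful of rows from the table, not the full list.

```python
from collections import Counter

# A handful of rows from the table, in its pipe-delimited layout
# (illustrative subset only, not the complete failure list).
SAMPLE_ROWS = """\
[12,9] | Disk Controller Failure | Disk controller was having issues, disks were being incorrectly recognised |
[20,4] | Node Failure | Node was replaced |
[5,14] | Overheat | Fan was not plugged in |
[17,4] | Disk Failure | Smartctl reports impending disk death |
[13,2] | Disk Failure | Successfully booted from disk, but kernel was throwing disk errors |
[16,6] | Disk Failure | SMART overall-health self-assessment test result: FAILED! |
"""


def parse_row(line):
    """Split one 'node | failure mode | notes |' row into a tuple.

    Returns ((x, y), mode, notes), or None for anything that is not a
    data row (header line, blank line, stray text).
    """
    cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
    if len(cells) != 3 or not cells[0].startswith("["):
        return None
    x, y = (int(part) for part in cells[0].strip("[]").split(","))
    # Keep only the short label before " - " and ignore case, so that
    # "Pxe Halt - Locks up ..." and "PXE Halt" count as the same mode.
    mode = cells[1].split(" - ")[0].strip().casefold()
    return (x, y), mode, cells[2]


def tally_modes(text):
    """Count how many rows fall under each (normalised) failure mode."""
    parsed = (parse_row(line) for line in text.splitlines())
    return Counter(row[1] for row in parsed if row is not None)


if __name__ == "__main__":
    for mode, count in tally_modes(SAMPLE_ROWS).most_common():
        print(f"{count:3d}  {mode}")
```

Normalising the label (case-folding and dropping the text after " - ") is a design choice made here so that spelling variants of the same failure mode are counted together; the notes column is returned untouched.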