List of Node Failures
| Node | Failure Mode | Solution / Notes |
|---|---|---|
| [1,5] | PXE Halt - Locks up during execution of PXE code | Multiple resets (more than one) may be required; might require a node change |
| [1,5] | Dead Node ID box top LED (the blinking one) | Power cycle fixed it. Rabbit issue? |
| [3,8] | First Power on Halt | Locks during the first attempt; POSTs after reset |
| [17,4] | First Power on Halt | Locks during the first attempt; no serial console output |
| [1,14] | First Power on Halt | Locks during the first attempt; reset fixes it. Has new disk. |
| [20,19] | Disk Failure | Kernel throws errors during imaging; disk changed |
| [12,9] | Disk Controller Failure | Disk controller was having issues; disks were being incorrectly recognised |
| [3,18] | Disk Failure | Disk write errors; disk replaced |
| [5,11] | Disk Failure | Disk write errors; disk replaced |
| [14,11] | Disk Failure | Disk write errors; disk replaced |
| [13,5] | Lock Up | Rabbit and node were halted; power cycled |
| [4,11] | Disk Failure | Disk write errors; disk replaced |
| [5,9] | Disk Failure | Disk write errors; disk replaced |
| [9,11] | Disk Failure | Disk write errors; disk replaced |
| [3,19] | Bad Node | Motherboard failure, refused to boot; replaced |
| [14,8] | Disk Failure | Kernel throws disk errors; disk changed |
| [17,9] | Disk Failure | Disk write halts, imaging times out; disk replaced |
| [18,3] | Overheat | CM measures internal temp at 106 °F; fails to boot reliably |
| [20,2] | Disk Failure | Disk write errors; disk replaced |
| [8,13] | Disk Failure | Disk write errors; disk replaced |
| [9,10] | Disk Failure | Disk write errors; disk replaced |
| [5,2] | Disk Failure | Disk write errors; disk replaced |
| [17,13] | Disk Failure | Disk write errors; disk replaced |
| [12,1] | Disk Failure | Disk write errors; disk replaced |
| [6,14] | Disk Failure | Disk write errors; disk replaced |
| [17,19] | Memory Failure | Memory pins did not make proper contact; bent case and reinserted memory |
| [7,2] | Disk Failure | Disk write errors; disk replaced |
| [5,15] | Lock Up | Rabbit and node were halted, node ID box LED was solid; power cycled |
| [7,2] | Lock Up | Rabbit and node were halted, node ID box LED was off; power cycled |
| [16,1] | Lock Up | Rabbit and node were halted; power cycled |
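
To see which failure modes dominate, the table can be tallied with a short script. The following is only a sketch: the function name `tally_failure_modes` and the `TABLE_EXCERPT` variable are hypothetical, the excerpt holds just a few rows copied from the table above, and it assumes that no cell contains a `|` character.

```python
from collections import Counter

# A few rows copied from the table above (hypothetical variable name).
# Paste the full table here to tally every entry.
TABLE_EXCERPT = """
| Node | Failure Mode | Solution / Notes |
| [20,19] | Disk Failure | Kernel throws errors during imaging; disk changed |
| [12,9] | Disk Controller Failure | Disk controller was having issues |
| [3,18] | Disk Failure | Disk write errors; disk replaced |
| [13,5] | Lock Up | Rabbit and node were halted; power cycled |
| [3,19] | Bad Node | Motherboard failure, refused to boot; replaced |
"""


def tally_failure_modes(table_text):
    """Count rows per failure mode in a pipe-delimited table."""
    counts = Counter()
    for line in table_text.splitlines():
        # Drop the outer pipes, then split into the three columns.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip blank lines, the header row, and anything that is not a
        # three-column row starting with a node coordinate like [3,18].
        if len(cells) != 3 or not cells[0].startswith("["):
            continue
        counts[cells[1]] += 1
    return counts


if __name__ == "__main__":
    for mode, count in tally_failure_modes(TABLE_EXCERPT).most_common():
        print(f"{mode}: {count}")
```

Matching on the leading `[` of the node coordinate keeps the header and separator rows out of the count without hard-coding their text.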