1) Clear some space in the node repair area.  Obtain a Phillips-head
screwdriver and a bin for garbage.  On a network-connected computer,
open a web browser and ssh sessions to dhcp1.orbit-lab.org and
repository2.orbit-lab.org (probably through gw.orbit-lab.org).
          
          
          
          
2) Make a page in the orbit-lab.org wiki with a name matching the
template Internal/RepairYYYYMMDD (for example, Internal/Repair20070520).
Record the current time and the names of everyone helping with the
repairs on this wiki page.
          
          
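The page name in step 2 follows a fixed date pattern, so it can be
generated rather than typed.  A minimal Python sketch follows; the
function name repair_page_name is an illustrative choice and is not
part of any ORBIT tool.

```python
from datetime import date


def repair_page_name(day: date) -> str:
    """Build the wiki page name Internal/RepairYYYYMMDD for the given day."""
    return "Internal/Repair" + day.strftime("%Y%m%d")


# The example date from step 2.
print(repair_page_name(date(2007, 5, 20)))  # Internal/Repair20070520
```

Calling repair_page_name(date.today()) gives the name for a repair
session started today.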
          
          
3) Determine the set of nodes you are going to repair.  These will be
any nodes marked as red on orbit-lab.org/wiki/Status, or nodes that
the CM cannot reliably power up.  Do not repair more than ten at a
time.  Write down the coordinates of these nodes on the wiki page for
the repair.  Note which of those node positions are supposed to have
Atheros 802.11 hardware and which are supposed to have Intel.  It
simplifies things if you can do all Atheros or all Intel nodes in a
particular round of repairs.
          
          
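The two constraints in step 3 (at most ten nodes per round, and
preferably one 802.11 vendor per round) can be applied mechanically
when the list of broken nodes is long.  The sketch below is only an
illustration: it assumes nodes are written as (x, y) coordinate pairs
with the expected vendor alongside, and plan_rounds is a hypothetical
helper name.

```python
from collections import defaultdict

MAX_NODES_PER_ROUND = 10  # limit stated in step 3


def plan_rounds(nodes):
    """Group (coordinate, vendor) pairs into single-vendor rounds of at most ten."""
    by_vendor = defaultdict(list)
    for coord, vendor in nodes:
        by_vendor[vendor].append(coord)

    rounds = []
    for vendor in sorted(by_vendor):
        coords = by_vendor[vendor]
        # Cut each vendor's list into chunks no longer than the limit.
        for start in range(0, len(coords), MAX_NODES_PER_ROUND):
            rounds.append((vendor, coords[start:start + MAX_NODES_PER_ROUND]))
    return rounds


# Made-up coordinates, for illustration only.
broken = [((1, 1), "Atheros"), ((1, 2), "Intel"), ((3, 4), "Atheros")]
for vendor, coords in plan_rounds(broken):
    print(vendor, coords)
```

Each returned round can then be copied to the wiki page for that
repair session.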
          
          
4) Comment out the lines for these nodes in
dhcp1:/etc/dhcp3/dhcpd.conf.  Restart dhcpd on dhcp1.
          
          
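Commenting out the entries in step 4 by hand is error-prone when node
names share prefixes.  The following Python sketch shows one way to do
it on a copy of the file contents.  It rests on assumptions that are
not confirmed by this procedure: that each node's entry occupies a
single line, and that the line contains the node's host name.  The
host names and addresses in the example are made up, and
comment_out_nodes is a hypothetical helper.  Restarting dhcpd is left
as a manual step.

```python
import re


def comment_out_nodes(conf_text, node_names):
    """Prefix '#' to every uncommented line that mentions one of node_names.

    Assumes one dhcpd.conf line per node entry (an assumption, see above).
    """
    # Match the name only as a whole token, so node1-1 does not hit node1-10.
    patterns = [
        re.compile(r"(?<![\w-])" + re.escape(name) + r"(?![\w-])")
        for name in node_names
    ]
    out = []
    for line in conf_text.splitlines():
        already_commented = line.lstrip().startswith("#")
        if not already_commented and any(p.search(line) for p in patterns):
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


# Made-up sample entries, for illustration only.
sample = (
    "host node1-1 { fixed-address 10.0.1.1; }\n"
    "host node1-10 { fixed-address 10.0.1.10; }\n"
)
print(comment_out_nodes(sample, ["node1-1"]), end="")
```

Keeping the original text around makes the later comparison against
the gendhcpconf output straightforward.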
          
          
5) Remove each node to be repaired from its mounting in the grid,
leaving the node id box attached.  As you remove nodes, take them and
their node id boxes back to the node repair area.  One or two other
people can work on nodes in the node repair area while one person
moves nodes back and forth from the grid.  Note any exceptional
hardware or incorrectly installed connections on the wiki page.
          
          
          
          
6) Once a node is in the node repair area, remove its node id box and
then the yellow node enclosure.  Verify that the node id boxes match
the list of nodes to be repaired on the wiki page, and that the 802.11
hardware vendor matches what is expected.  Note exceptions on the wiki
page.
          
          
          
          
7) Replace the power supply.  Take care to put old power supplies in
the garbage bin.  If the 802.11 hardware vendor did not match what is
expected, correct the hardware.  Put the enclosure back on, then
reattach the node id box.
          
          
          
          
8) Calibrate the node (NYI, not yet implemented).
          
          
          
          
9) Return the node to its position in the grid.  Verify the node id
box against two adjacent nodes.

10) Once all nodes have been repaired and returned to the grid, verify
that the nodes are not red on the orbit-lab.org/wiki/Status page; that
is, that each CM reports back to the CMC correctly.

11) Turn the repaired nodes on.  Because they obtain pool addresses
from dhcp, they will load an 'inventory' image (NYI).  Wait five
minutes for the inventory image to finish loading.  Then command the
CMC to run the inventory command on each node.

12) Run the gendhcpconf script on repository2.  Compare its output
with the entries you commented out in step 4.  Correct
dhcp1:/etc/dhcp3/dhcpd.conf if needed.

13) During the following maintenance slot, verify that you can image
all nodes that have been repaired since the last maintenance slot by
running the CM stress experiment (NYI).
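
The comparison between the gendhcpconf output and the entries
commented out in step 4 can be done with a small script instead of by
eye.  The Python sketch below assumes, without confirmation from this
procedure, that gendhcpconf prints entries in the same one-line format
used in dhcpd.conf.  The names normalize and compare_entries are
hypothetical, and the sample entries are made up.

```python
def normalize(line):
    """Drop a leading comment marker and collapse runs of whitespace."""
    return " ".join(line.strip().lstrip("#").split())


def compare_entries(commented_lines, generated_lines):
    """Return (only_in_old, only_in_new) after normalizing both sides."""
    old = {normalize(l) for l in commented_lines if normalize(l)}
    new = {normalize(l) for l in generated_lines if normalize(l)}
    return sorted(old - new), sorted(new - old)


# Made-up sample entries, for illustration only.
commented = [
    "#host node1-1 { hardware ethernet 00:00:00:00:00:01; }",
    "#host node1-2 { hardware ethernet 00:00:00:00:00:02; }",
]
generated = [
    "host node1-1 { hardware ethernet 00:00:00:00:00:01; }",
    "host node1-2 {  hardware ethernet 00:00:00:00:00:0a; }",
]
stale, fresh = compare_entries(commented, generated)
print("only in old config:", stale)
print("only in gendhcpconf output:", fresh)
```

An entry that appears in both returned lists under the same host name
is one whose details changed during the repair and needs correcting
in dhcpd.conf.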