Old/Tutorials/HowToImage – Orbit

Version 12 (modified by thierry, 18 years ago) ( diff )

How to install a disk image on the nodes of a Testbed (using imageNodes4)

If you have not done so yet:

Register for an account
Make a reservation on the Scheduler for a given testbed

Let's assume that you registered as user 'bob' and made a reservation for the 'grid' testbed. You can then log in to the 'grid' console with the command:

 ssh bob@console.grid.orbit-lab.org

Then, to image some or all of the nodes on the 'grid' testbed, use one of the following commands:

 imageNodes4 all baseline-7.11.ndz 
 # writes the disk image named 'baseline-7.11.ndz' to all the nodes on the 'grid'

 imageNodes4 [1..10,1..5] baseline-7.11.ndz 
 # same as above, but only on the nodes whose coordinates are x=[1..10] and y=[1..5]
 
 imageNodes4 [[3,1],[5,6]] baseline-7.11.ndz 
 # same as above, but only on nodes [3,1] and [5,6]
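
To make the rectangular selection concrete, the following sketch enumerates the positions covered by a node set such as [1..10,1..5]. The helper expand_rect is hypothetical and is not part of the ORBIT tools; it assumes that both ranges are inclusive, as the comment above suggests.

```shell
# Hypothetical helper, not part of imageNodes4.
# expand_rect X1 X2 Y1 Y2: print every [x,y] position in the rectangle,
# one per line, assuming both ranges are inclusive.
expand_rect() {
    _x=$1
    while [ "$_x" -le "$2" ]; do
        _y=$3
        while [ "$_y" -le "$4" ]; do
            printf '[%d,%d]\n' "$_x" "$_y"
            _y=$((_y + 1))
        done
        _x=$((_x + 1))
    done
}

# Count the positions selected by [1..10,1..5] (10 columns times 5 rows).
expand_rect 1 10 1 5 | wc -l
```

Square brackets are glob characters in most shells, so quoting the node set (for example '[1..10,1..5]') guarantees that it reaches imageNodes4 unchanged even if a matching file name exists in the current directory.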

The output of the "imaging" process will look like the following:

Imaging nodes: '[1..20,1..20]' with image 'baseline-7.11.ndz' on default domain (retrieved from hostname)
 INFO init: NodeHandler Version 4.2.0 (1272)
 INFO init: Experiment ID: grid_2007_07_03_03_13_50
 ...
 WARN -:topo:image: Ignoring missing node '5@14'
 ...
 INFO stdlib: Waiting for nodes (Up/Down/Total): 0/387/387 - (still down: n_20_11,n_17_14,n_15_18)
 INFO stdlib: Waiting for nodes (Up/Down/Total): 34/353/387 - (still down: n_20_11,n_17_14,n_15_18)
 INFO stdlib: Waiting for nodes (Up/Down/Total): 61/326/387 - (still down: n_20_11,n_17_14,n_15_18)
 ...
 WARN stdlib: Giving up on node n_4_1
 INFO whenAll: *: 'status[@value='UP']' fires
 INFO exp: Progress(0/0/214): 0/0/0 min(n_20_11)/avg/max (216) - Timeout: 581 sec.
 ...
 INFO exp: Progress(208/14/214): 0/91/100 min(n_18_19)/avg/max (216) - Timeout: 6 sec.
 ...
 INFO Experiment: DONE!
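
If you save this output to a file, a couple of small filters can summarize it. This is only a sketch: the function names and the log file name imaging.log are hypothetical, and the patterns assume the exact line formats shown in the sample above.

```shell
# Hypothetical filters for a saved copy of the imaging output.
# A copy could be captured with, for example:
#   imageNodes4 all baseline-7.11.ndz 2>&1 | tee imaging.log

# node_counts: read the output on stdin and print "up down total"
# for every "Waiting for nodes" line.
node_counts() {
    sed -n 's|.*Waiting for nodes (Up/Down/Total): \([0-9]*\)/\([0-9]*\)/\([0-9]*\).*|\1 \2 \3|p'
}

# given_up: read the output on stdin and print the name of each node
# reported in a "Giving up on node" warning.
given_up() {
    sed -n 's/.*WARN stdlib: Giving up on node \(n_[0-9]*_[0-9]*\).*/\1/p'
}

# Usage (assuming the output was saved as imaging.log):
#   node_counts < imaging.log | tail -n 1
#   given_up < imaging.log
```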

At the end of the "imaging" process, you will have 3 topology files within your user directory:

system_topo_active_grid.rb - a topology with all the nodes that have successfully been imaged
system_topo_failed_grid.rb - a topology with all the nodes that have failed during the "imaging" process (possibly due to some disk read/write errors)
system_topo_timedout_grid.rb - a topology with all the nodes that timed out during the "imaging" process. These nodes started writing the image to their disk correctly, but did not finish before the default timeout of 800 seconds.
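
A quick way to see which of these files a run produced is sketched below. The function report_topos is hypothetical; it assumes that the "user directory" is your home directory and that other testbeds follow the same system_topo_<kind>_<testbed>.rb naming shown here for 'grid'.

```shell
# Hypothetical helper, not part of the ORBIT tools.
# report_topos DIR TESTBED: for each of the three result files,
# print its path if it exists in DIR, or "missing" otherwise.
report_topos() {
    for _kind in active failed timedout; do
        _f="$1/system_topo_${_kind}_$2.rb"
        if [ -f "$_f" ]; then
            echo "$_kind: $_f"
        else
            echo "$_kind: missing"
        fi
    done
}

# Usage (assumes the files are written to the home directory):
#   report_topos "$HOME" grid
```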

You can then:

use the information in the system_topo_active_grid.rb file to check or select which nodes to use in your experiments
or directly use the topology defined in this file within your experiment scripts, as described in detail in this tutorial.

Attachments (6)

topo_grid_active.rb (1.5 KB ) - added by thierry 19 years ago.
topo_grid_failed.rb (424 bytes ) - added by thierry 19 years ago.
topo_grid_timedout.rb (355 bytes ) - added by thierry 19 years ago.
system_topo_active_grid.rb (2.1 KB ) - added by thierry 18 years ago.
system_topo_failed_grid.rb (536 bytes ) - added by thierry 18 years ago.
system_topo_timedout_grid.rb (421 bytes ) - added by thierry 18 years ago.
