Changes between Version 50 and Version 51 of Old/NodeHandler/Multicast


Timestamp:
Apr 10, 2006, 8:06:38 PM (19 years ago)
Author:
sswami
  • Old/NodeHandler/Multicast

    v50 v51  
    77
    88== Introduction ==
    9 The current NodeHandler code works satisfactorily on the small grid and the sandboxes. But this same code fails to work correctly on the big grid. This is due to the fact that in the current grid consisting of 400 nodes, packet loss is a major problem. And this problem escalates sharply with the increase in the number of nodes. Specifically, when trying to image more than 150 nodes in a single attempt, the high packet loss prevents successful completion. To alleviate this problem, it has been decided to explore the use of a reliable multicast protocol. The implementation being considered here is MCLv3, which is an Open Source
    10 Implementation of the ALC and NORM Reliable Multicast Protocols.
     9The current NodeHandler code works satisfactorily on the small grid and the sandboxes. But this same code fails to work correctly on the big grid. This is due to the fact that in the current grid consisting of 400 nodes, packet loss is a major problem. And this problem escalates sharply with the increase in the number of nodes. Specifically, when trying to image more than 150 nodes in a single attempt, the high packet loss prevents successful completion. To alleviate this problem, it has been decided to explore the use of broadcast instead of multicast.
    1110
    1211== Major Design Requirements ==
    1312'''R.1:'''
    1413{{{
    15 It has been decided that a feedback-free reliable multicast protocol will be used and that all
    16 feedbacks will be sent through TCP. This is because then
     14It has been decided that all communications from the NodeHandler to the NodeAgent will be through broadcast and that all feedbacks from the NodeAgent to the NodeHandler will be sent through TCP. This is because then
    1715
    1816- reliable feedbacks can then be ensured,
     
    2321  serve the dual purpose of providing feedbacks too.
    2422
    25 MCLv3 is an Open Source Implementation of the ALC and the NORM Reliable Multicast Protocols.
    26 Of these 2 protocols, only the use of the ALC/LCT protocol is being explored here. This is
    27 because the ALC/LCT protocol is feedback-free and also it provides an unlimited scalability.
    28 NORM lacks both these attributes.
    2923}}}
    3024
     
    3226{{{
    3327All communication will be handled in the communication layer which will be a separate process.
    34 ALC/LCT is a multi-threaded implementation and so we are not sure of the issues that may arise
    35 if it is made into a loadable library instead of a separate process. The present focus is on
    36 exploring reliable multicast and once this issue is resolved, the issues pertaining to
    37 converting this process into a loadable library will be addressed.
     28The present focus is on exploring reliable communication with minimum packet loss and once this issue is resolved, the issues pertaining to converting this process into a loadable library will be addressed.
    3829
    39 At this time, only changes to the communication layer in the NodeHandler are being considered.
    40 Similar changes to the communication layer in the NodeAgent will be considered later. At the
    41 moment, minor changes will be made to the current NodeAgent communication layer. The changes
    42 made will be limited to conforming to the new NodeHandler communication layer, e.g. existing
    43 UDP socket calls and socket processing code will be changed to that for TCP sockets.
     30This will need changes to the communication layer in both the NodeHandler and the NodeAgent.
    4431}}}
    4532
     
    4835The communication layer will use two separate approaches, one for sending messages and the
    4936other for receiving messages. Messages being sent from the NodeHandler to the NodeAgent will
    50 use ALC/LCT. A single message will be sent by the NodeHandler using ALC/LCT and this message
     37use broadcast. A single message will be broadcast by the NodeHandler and this message
    5138will be received by all the NodeAgents.
    5239
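The sending side described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the NodeHandler's actual code: the limited-broadcast address, the port number, the JSON message layout, and the function names `make_broadcast_socket`, `encode_command`, and `broadcast_command` are all hypothetical and do not come from this page.

```python
import json
import socket

# Assumptions for illustration only: neither value comes from the design above.
BROADCAST_ADDR = "255.255.255.255"  # IPv4 limited-broadcast address
COMMAND_PORT = 9006                 # hypothetical port the NodeAgents listen on


def make_broadcast_socket():
    """Create a UDP socket that is allowed to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def encode_command(seq, targets, command):
    """Serialize one command; the field names are a hypothetical layout."""
    message = {"seq": seq, "targets": targets, "command": command}
    return json.dumps(message).encode("utf-8")


def broadcast_command(sock, seq, targets, command):
    """Send a single datagram that every NodeAgent on the subnet can receive."""
    payload = encode_command(seq, targets, command)
    return sock.sendto(payload, (BROADCAST_ADDR, COMMAND_PORT))
```

A caller would create the socket once with `make_broadcast_socket()` and then call `broadcast_command(sock, 1, ["*"], "some command")` for each command. The datagram itself carries no delivery guarantee, which is why feedback travels separately over TCP.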
     
    5845{{{
    5946The messages sent from the NodeHandler to the NodeAgent consist of commands to be executed on
    60 the NodeAgent. These messages may be sent to all the nodes in the multicast group or to a
    61 subset of the nodes in the multicast group based on node Alias. If the message has to be sent
    62 to a subset of the nodes, then the NodeHandler will indicate as such to the communication
    63 layer and also identify the set of nodes which will receive the message. Otherwise, by
    64 default, the communication layer will send the message to all the nodes.
     47the NodeAgent. Since the communication layer will broadcast the message to all the nodes, the NodeAgents will have filters to determine whether a message is to be accepted or rejected. The current NodeAgent code has such filters, and these will be enhanced only if necessary.
    6548
    6649After a message is sent, the communication server will wait for ACKs from the NodeAgent, which
     
    7255next command.
    7356
    74 This amounts to an error correcting mechanism on top of reliable multicast, but it has been
    75 deemed necessary because the ALC/LCT implementation is not fully reliable in the sense that it
    76 doesn't guarantee reliable delivery.
    7757}}}
    7858
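The receive-side filtering and the ACK wait described above might look like the sketch below. It is a minimal illustration, not the existing NodeAgent filter code: the `"*"` wildcard, the alias strings, and the names `accepts` and `AckTracker` are assumptions introduced here. What the communication layer does about nodes that never ACK is left to the caller, since this page does not spell it out.

```python
def accepts(node_alias, targets):
    """NodeAgent-side filter: accept a broadcast message only if it is
    addressed to every node ("*", a hypothetical wildcard) or names this
    node's alias explicitly."""
    return "*" in targets or node_alias in targets


class AckTracker:
    """NodeHandler-side bookkeeping for ACKs that arrive over TCP."""

    def __init__(self, expected_aliases):
        # Nodes that were addressed and have not acknowledged yet.
        self.pending = set(expected_aliases)

    def record(self, node_alias):
        """Mark one node as having acknowledged; duplicate ACKs are harmless."""
        self.pending.discard(node_alias)

    def missing(self):
        """Return the aliases still waiting on an ACK, in sorted order."""
        return sorted(self.pending)

    def complete(self):
        """True once every addressed node has acknowledged."""
        return not self.pending
```

In use, the NodeHandler would build an `AckTracker` with the addressed aliases right after broadcasting, call `record()` for each ACK read from a TCP connection, and check `complete()` or `missing()` before moving on to the next command.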