
Version 27 (modified by kishore, 19 years ago) ( diff )

Experiment description

A single sender broadcast 512-byte packets (Ethernet broadcast) to 238 receivers, each running tcpdump. The wired interface eth1 was used on all senders and receivers. The number of packets sent and the interval between consecutive packets were varied at the sender.
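The sender loop described above can be sketched as follows. This is only an illustration, not the program used in the experiments: it assumes UDP broadcast over an IPv4 socket, and the function names, the 4-byte sequence-number header and the port in the usage note are all hypothetical.

```python
import socket
import time


def make_broadcast_socket():
    """Create a UDP socket that is allowed to send to a broadcast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def send_burst(sock, dest, count, interval_s, payload_size=512):
    """Send `count` datagrams of `payload_size` bytes to `dest`.

    The first 4 bytes carry a big-endian sequence number (an assumption made
    here so that receivers can detect gaps); the rest is zero padding.
    interval_s == 0 means "as fast as possible", i.e. no sleep between sends.
    """
    for seq in range(count):
        payload = seq.to_bytes(4, "big").ljust(payload_size, b"\x00")
        sock.sendto(payload, dest)
        if interval_s > 0:
            time.sleep(interval_s)
    return count
```

With these assumptions, a run with 10000 packets at a 10 ms interval would look like `send_burst(make_broadcast_socket(), ("255.255.255.255", 9000), 10000, 0.010)`, and passing `0` as the interval gives the "as fast as possible" case.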

Results

Metrics chosen were

  • the number of packets successfully received,
  • the percentage of missing packets and
  • the time taken to receive all packets

on each receiving node.
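As a rough sketch, the three metrics could be computed per receiver from a parsed capture. The input format (a list of sequence-number and timestamp pairs) and the function name are assumptions; the notes do not say how the tcpdump output was processed.

```python
def receiver_metrics(arrivals, packets_sent):
    """Compute the three metrics for one receiver.

    arrivals: list of (sequence_number, timestamp_seconds) pairs.
    Returns (packets received, percentage missing, seconds from first to
    last arrival). Duplicate sequence numbers are counted once.
    """
    received = len({seq for seq, _ in arrivals})
    missing_pct = 100.0 * (packets_sent - received) / packets_sent
    times = [t for _, t in arrivals]
    duration = max(times) - min(times) if times else 0.0
    return received, missing_pct, duration
```

Here the time taken is measured between the first and the last packet seen at the receiver, which is one possible reading of "time taken to receive all packets".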

Experiment 1

10000 packets were sent with an interval of 10ms between consecutive packets. The figure (on the left) shows the number of packets successfully received at each receiver. All receivers except one (node14-3) received all 10000 packets. A second run of the same experiment showed the same behavior, as did a run with 5000 packets instead of 10000, so there seems to be a problem with node14-3.

The figure above (on the right) shows the number of seconds it takes each receiver to receive all packets.

Experiment 2

10000 packets were sent "as fast as possible": there was no "sleep" statement between two consecutive send events at the sender. The figures above show the number of packets lost per receiver (top left) and the number of successful packet receptions (top right). The figure on the bottom left shows the total time taken to receive all packets at each receiver. These results do not conclusively identify the source of the loss, but we surmise that it is due to buffer overflows at the sender. Our argument in favor of this hypothesis is as follows. We waited 5 minutes after the sender terminated before terminating the tcpdump application on each receiver, which is "long enough" for every sent packet to have arrived, so the missing packets were not merely delayed. In addition, each receiver received roughly the same number of packets, and the total time to receive all packets was only around 2 seconds, which suggests the loss occurred at a point common to all receivers rather than at individual receivers.

A way to test this hypothesis would be to increase the size of the send socket buffer and re-run the same experiment.
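A minimal sketch of that change, assuming the sender uses an ordinary socket: the standard `SO_SNDBUF` socket option requests a larger send buffer, and reading the option back shows what the kernel actually granted. The helper name and the 1 MiB figure in the usage note are illustrative only.

```python
import socket


def enlarge_send_buffer(sock, requested_bytes):
    """Request a larger send buffer and return the size the kernel reports.

    The kernel may clamp the request to a system-wide limit, so the value
    read back is what matters, not the value requested.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
```

For example, `enlarge_send_buffer(sock, 1 << 20)` asks for 1 MiB. On Linux the granted size is capped by `net.core.wmem_max`, and the value read back is double the value set because the kernel reserves room for bookkeeping, so the returned number should be logged alongside the loss figures when the experiment is re-run.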

Experiment 3

5000 packets were sent "as fast as possible": there was no "sleep" statement between two consecutive send events at the sender. The figures above show the number of packets lost per receiver (top left) and the number of successful packet receptions (top right). The figure on the bottom left shows the total time taken to receive all packets at each receiver. The current experiments do not show where this loss occurs.

Design Notes for nodehandler

Underlying model

  • nodehandler broadcasts each command to all the nodes using a C program

System Parameters

  • Underlying traffic model: Figure out
    • how many packets
    • of what size
    • and with what interarrival are generated by a "typical" experiment script
  • For this traffic model, test the performance in terms of packet loss and average latency
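One way to characterize such a traffic model is to summarize a recorded trace of the commands that an experiment script generates. The sketch below assumes the trace is available as time-sorted pairs of timestamp and packet size; both the trace format and the function name are hypothetical.

```python
from statistics import mean


def summarize_traffic(trace):
    """Summarize a trace given as time-sorted (timestamp_seconds, size_bytes) pairs.

    Returns the packet count, the mean packet size and the mean gap between
    consecutive packets (the interarrival time).
    """
    times = [t for t, _ in trace]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packets": len(trace),
        "mean_size": mean(size for _, size in trace) if trace else 0.0,
        "mean_interarrival": mean(gaps) if gaps else 0.0,
    }
```

The resulting count, size and interarrival values could then be used as the parameters of the broadcast tests described above.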

Building reliability into the protocol

Some thoughts on building reliability into the protocol

  1. Each nodeagent sends a cumulative ACK every 5 packets
  2. Stagger the sending of ACKs to reduce collisions: each nodeagent delays its ACK by N * backoff interval, where N is the node's row number
  3. A separate nodehandler thread maintains a bitmap over all packets and sets a "one" for every missing packet reported by a nodeagent
  4. nodehandler rebroadcasts these missing packets after a timeout
  5. This timeout is based on the time taken by all nodes to send an ACK to the nodehandler using the convention of Step 2
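The bookkeeping in the steps above can be sketched as follows. This is an illustration under stated assumptions rather than a design decision: it assumes each cumulative ACK lists the sequence numbers missing from the last window of 5 packets, and that the rebroadcast timeout equals the delay of the highest row. All names are hypothetical.

```python
ACK_EVERY = 5  # cumulative ACK every 5 packets (Step 1)


def ack_delay(row, backoff_s):
    """Step 2: a node in row N waits N * backoff interval before sending its ACK."""
    return row * backoff_s


def rebroadcast_timeout(max_row, backoff_s):
    """Step 5 (assumed form): wait until the highest row has had its turn to ACK."""
    return ack_delay(max_row, backoff_s)


def missing_in_window(received, window_start):
    """Nodeagent side: sequence numbers of the current window that never arrived."""
    window = range(window_start, window_start + ACK_EVERY)
    return [seq for seq in window if seq not in received]


class MissingBitmap:
    """Step 3: nodehandler-side bitmap with a one for every packet reported missing."""

    def __init__(self, total_packets):
        self.bits = [0] * total_packets

    def report_missing(self, seqs):
        for seq in seqs:
            self.bits[seq] = 1

    def to_rebroadcast(self):
        """Step 4: the packets to broadcast again once the timeout expires."""
        return [seq for seq, bit in enumerate(self.bits) if bit]
```

Because the bitmap is shared across all nodeagents, a packet reported missing by several nodes is still rebroadcast only once.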

Fetching OML xml schema

  • Replace the existing Ruby web server with a proper Apache server to serve the XML files
  • nodeagents still use wget to fetch these files
