Getting Frisbee traffic over OpenFlow Enabled Switches
Frisbee is a fast disk imaging process in which a server distributes chunks of a disk image to one or more clients in a multicast group. Frisbee is used extensively for imaging nodes, so the OpenFlow deployment here needs to be able to handle its traffic.
This page records one attempt to see if OpenFlow can be made to handle Frisbee traffic, and if so, measure its performance. Success here means being able to image a node in a reasonable amount of time.
I General Setup steps taken
A regular SandBox (SB6) with 2 nodes was used to carry out these steps. The control plane switch for SB6 is an NEC IP8800 (sw-sb-01), shared by this SandBox and SandBoxes 1, 2, 5, 7, and 8.
1.1. Network setup - enable OpenFlow mode on switch
The SB6 control VLAN was set in OpenFlow mode (i.e. made into a virtual switch, or "VSI"). The command(s) for doing this differ depending on firmware version, which can be found with the command show version in user mode. The VSI was pointed to our BSN controller, kvm-big.
- 11.1.C (old - this is what sw-sb-01 is running)
setvsi 12 31,32,48.12 tcp 172.16.0.14:6633 dpid 0x001010213231
- 11.1.Ae (the theoretical steps, not confirmed; see note 1)
(config)# openflow openflow-id 1 virtual-switch
!sw-gp(config-of) controller controller-name kvm-big 1 172.16.0.14 tcp
!sw-gp(config-of) dpid 0000001010213231
!sw-gp(config-of) openflow-vlan 27
!sw-gp(config-of) enable
In the first code block, tcp 172.16.0.14:6633 points the switch to kvm-big. Association can be confirmed with the show switch command on the controller, in which you should see the VSI's DPID in the list of switches the controller has seen:
kvm-big> sh sw
Switch DPID             Alias Active Last Connect Time       IP Address    Socket Address       Max Packets Max Tables
-----------------------|-----|------|-----------------------|-------------|--------------------|-----------|----------
...
00:00:00:10:10:21:32:31       True   2011-09-20 21:34:47 EDT 172.16.0.253  /172.16.0.253:64372  256         2
...
Note, the controller lists DPIDs in a form similar to MAC addresses, but with '00:00' tacked in front.
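To make the relationship between the two notations concrete, here is a minimal sketch that converts the DPID as given to the switch (0x001010213231) into the colon-separated form the controller prints. The function name format_dpid is made up for illustration; it is not part of any switch or controller tooling.

```python
def format_dpid(dpid_hex):
    """Render a DPID as 8 colon-separated bytes, the way the controller lists it.

    Hypothetical helper for illustration only.
    """
    digits = dpid_hex.lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) > 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("not a valid 64-bit DPID: %r" % dpid_hex)
    # Left-pad to 16 hex digits (64 bits); this is where the leading 00:00 comes from.
    digits = digits.zfill(16)
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


print(format_dpid("0x001010213231"))
```

The same function accepts the already padded 16-digit form used in the 11.1.Ae example (0000001010213231) and yields the same string.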
1.2. Network setup - Add static flow entry to BSN Controller
Once the VSI is associated, add a static flow entry to the controller.
kvm-big> en
kvm-big# config
kvm-big(config)# switch 00:00:00:10:10:21:32:31
kvm-big(config-switch)# flow-entry floodall
kvm-big(config-flow-entry)# ether-type 2048
kvm-big(config-flow-entry)# dst-ip 224.0.0.5
kvm-big(config-flow-entry)# actions output=flood
kvm-big(config-flow-entry)# active True
- ether-type must be specified in decimal. As in this case, if doing a flow based on IP-layer parameters, ether-type must be set to 2048, for IP (hex ether-type 0x800).
- actions are specified in a syntax similar to that used in dpctl. It roughly follows the syntax: actions output=[port number|flood|drop]
Multiple output ports may be specified in a comma-separated chain of "output=", e.g.: actions output=2,output=4
- The flow must be enabled with active True, with a capitalized T.
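As a small illustration of the two points above (decimal ether-type, comma-chained outputs), the sketch below converts a hex ether-type to the decimal value the CLI expects and builds an actions line in the syntax described. Both helper names are hypothetical; they only produce strings to type into the CLI and do not talk to the controller.

```python
def ether_type_decimal(hex_value):
    """Convert a hex ether-type such as '0x800' to the decimal the CLI wants."""
    return int(hex_value, 16)


def actions_line(outputs):
    """Build an 'actions' line from port numbers or the keywords flood/drop.

    Hypothetical helper; the syntax follows the rough form described above.
    """
    return "actions " + ",".join("output=%s" % o for o in outputs)


print("ether-type %d" % ether_type_decimal("0x800"))
print(actions_line(["flood"]))
print(actions_line([2, 4]))
```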
1.3. Client/server setup
To avoid the overhead of the omf commands, both the Frisbee server (frisbeed) and client (frisbee) were run manually. The client must be booted into a pre-boot environment prior to running frisbee. This can be done by rebooting the node after manually adding a link (as root) to /tftpboot/pxelinux.cfg/ on repository1:
ln -s omf-5.2.4 0A100102
Where omf-5.2.4 is the name of the preboot environment and 0A100102 is the IP address of the node (10.16.1.2) in hex, one byte per octet. Once back up, the node should be in a busybox-like environment with frisbee as one of the command options.
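The hex link name can be derived from the node's IP address rather than worked out by hand. A minimal sketch, assuming a Python interpreter is available wherever you prepare the link; pxe_hex_name is a made-up helper name:

```python
import ipaddress


def pxe_hex_name(ip):
    """Return the 8-digit uppercase hex form of an IPv4 address.

    Hypothetical helper: the result is the link name used under
    /tftpboot/pxelinux.cfg/ in the step above.
    """
    return "%08X" % int(ipaddress.IPv4Address(ip))


print(pxe_hex_name("10.16.1.2"))
```

For the node used here (10.16.1.2) this gives 0A100102, matching the link created above.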
Once the client is ready to go, you can start the processes with the following commands:
- Server: on repository1:
/usr/sbin/frisbeed -i 10.16.0.42 -m 224.0.0.5 -p 7060 -W 50000000 /var/lib/omf-images-5.2/ubuntu10.10.ndz -ddd
Where:
- -i 10.16.0.42 : Interface to use for session
- -m 224.0.0.5 : Multicast address server/clients will use to send/receive disk image chunks, respectively
- -p 7060 : UDP port to use for session
- -W 50000000 : target throughput in bits per second (50 Mbps)
- -ddd : verbose mode
- Client: on node1-2:
frisbee -i 10.16.1.2 -m 224.0.0.5 -p 7060 /dev/hda -ddd
Where the flags have the same meanings as for frisbeed. In the client's case, -ddd will give you a bunch of run-time messages, as well as the summary at the end:
Client 271061107 Performance:
  runtime:                3387.386 sec
  start delay:            0.000 sec
  real data written:      1039785984 (306993 Bps)
  effective data written: 12002000896 (3543549 Bps)
Client 271061107 Params:
  chunk/block size:     1024/1024
  chunk buffers:        64
  disk buffering:       64MB
  readahead/inprogress: 2/8
  recv timo/count:      30000/3
  re-request delay:     1000000
  writer idle delay:    1000
  randomize requests:   1
Client 271061107 Stats:
  net thread idle/blocked:        6173/0
  decompress thread idle/blocked: 1053451/0
  disk thread idle:        1197
  join/request msgs:       1/52576
  dupblocks(chunk done):   1080
  dupblocks(in progress):  31043
  partial requests/blocks: 50501/18254631
  re-requests:             50501
The above, in fact, was the result of the imaging process done with the configuration steps described in this page. Note the runtime - 3387.386 sec, or approximately 56 min!
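For a quick sanity check of the summary, the arithmetic can be redone from the reported values. The sketch below uses only numbers copied from the client output above; the variable and function names are mine, and the recomputed rates differ slightly from the ones frisbee printed, presumably because frisbee divides by a slightly different interval.

```python
RUNTIME_SEC = 3387.386           # "runtime" from the client summary
REAL_BYTES = 1039785984          # "real data written"
EFFECTIVE_BYTES = 12002000896    # "effective data written"


def minutes(seconds):
    """Convert seconds to minutes."""
    return seconds / 60.0


def bytes_per_sec_to_mbit(bytes_per_sec):
    """Convert bytes per second to megabits per second (10^6 bits)."""
    return bytes_per_sec * 8 / 1e6


real_rate = REAL_BYTES / RUNTIME_SEC
effective_rate = EFFECTIVE_BYTES / RUNTIME_SEC

print("runtime: %.1f min" % minutes(RUNTIME_SEC))
print("real: %.0f Bps (%.2f Mbit/s)" % (real_rate, bytes_per_sec_to_mbit(real_rate)))
print("effective: %.0f Bps (%.2f Mbit/s)" % (effective_rate, bytes_per_sec_to_mbit(effective_rate)))
```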
II Data collection
In addition to the steps above, several other things were done for troubleshooting.
2.1 On kvm-big
- A trace of the control channel traffic can be observed with the command:
show switch 00:00:00:10:10:21:32:31 trace detail
Appending >> filename will spit the trace into a file in user bsn's home directory.
- The controller has a debug shell one can reach with the command debug bash from the CLI. From here, a tcpdump session was started to sniff the control port:
tcpdump -i eth0 -s 1514 -vvv not ether proto 0x88cc and not port 22 and port 6633 and not port 69 -w frisbee.pcap
The filters exclude LLDP (ether proto 0x88cc), SSH (port 22), and TFTP (port 69), and keep only traffic on the OpenFlow control port (6633). The capture was saved to a file for later inspection.
2.2 On sw-sb-01
- Commands showswitch and showflow allow you to check the status of, and flows being pushed to, the device.
- show cpu seconds allows you to monitor switch CPU load:
sw-sb-02# show cpu seconds
Date 2011/08/17 21:04:13 UTC
*** second ***
date   time              cpu average
Aug 17 21:03:12-21:03:21 100 100 100 100 100 100 100 100 100 100
Aug 17 21:03:22-21:03:31 100 100 100 100 100 100 100  99 100 100
Aug 17 21:03:32-21:03:41 100 100 100 100 100 100 100 100  92 100
Aug 17 21:03:42-21:03:51 100 100 100 100 100 100 100 100 100 100
Aug 17 21:03:52-21:04:01 100 100 100 100 100 100 100 100 100 100
Aug 17 21:04:02-21:04:11 100 100 100 100 100 100  93 100 100 100
High load as seen above is bad. Expect normal load to be mostly single digits or in the teens.
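If the show cpu seconds output is saved to a file, the per-second samples can be checked automatically. This is a rough sketch that assumes the row layout shown above (month, day, time range, then per-second percentages); the function names and the threshold of 20% are my own choices, picked to match "single digits or in the teens".

```python
SAMPLE = """\
Aug 17 21:03:12-21:03:21 100 100 100 100 100 100 100 100 100 100
Aug 17 21:03:32-21:03:41 100 100 100 100 100 100 100 100 92 100
"""


def parse_cpu_seconds(text):
    """Return a list of (time_range, [percentages]) for each data row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like: <month> <day> <hh:mm:ss-hh:mm:ss> <n> <n> ...
        if len(parts) >= 4 and "-" in parts[2] and all(p.isdigit() for p in parts[3:]):
            rows.append((parts[2], [int(p) for p in parts[3:]]))
    return rows


def is_overloaded(rows, threshold=20):
    """True if the mean CPU sample exceeds the (assumed) threshold."""
    samples = [s for _, values in rows for s in values]
    return bool(samples) and sum(samples) / len(samples) > threshold


rows = parse_cpu_seconds(SAMPLE)
print(is_overloaded(rows))
```

Header lines such as the Date line and the column titles are skipped because they do not match the row layout.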
1. I don't have a manual for the new OpenFlow features.