A flow does not hit the expected OpenFlow flow entry in OVS - sdn

My questions
A flow doesn't hit an OpenFlow flow entry even though its header fields match the entry perfectly. How can this happen? Could you give me a little hint for troubleshooting?
What I did to find the answer
(1) I made a small network using Mininet, ONOS, and iperf.
(2) I generated a UDP flow (srcIP=10.0.0.3, dstIP=10.0.0.2, dstPORT=50000).
(3) I added flow rules to each Mininet switch using the ONOS REST API (a sketch of such a request is shown after the flow dump below). The two relevant flow rules, taken from the raw dump, are:
1) cookie=0x4c0000ef7faa8a, duration=332.717s, table=0, n_packets=8974, n_bytes=557090858, idle_age=153, priority=65050,ip,nw_dst=10.0.0.2 actions=output:4
2) cookie=0x4c0000951b3b33, duration=332.636s, table=0, n_packets=10, n_bytes=460, idle_age=168, priority=65111,udp,nw_src=10.0.0.3,nw_dst=10.0.0.2,tp_dst=50000 actions=output:3
(4) I found that, although flow rule 2) has more match fields and a higher priority, most packets in the flow hit flow rule 1).
(5) I used Wireshark to check whether the traffic is generated properly, and there is no problem (srcIP=10.0.0.3, dstIP=10.0.0.2, dstPORT=50000).
Here is the raw flow dump:
nimdrak@nimdrak-VirtualBox:~$ sudo ovs-ofctl dump-flows s1
NXST_FLOW reply (xid=0x4):
cookie=0x4c0000ef7faa8a, duration=332.717s, table=0, n_packets=8974, n_bytes=557090858, idle_age=153, priority=65050,ip,nw_dst=10.0.0.2 actions=output:4
cookie=0x4c0000ef7fb20c, duration=332.679s, table=0, n_packets=127, n_bytes=36814, idle_age=305, priority=65050,ip,nw_dst=10.0.0.4 actions=output:3
cookie=0x4c0000ef7f9b86, duration=332.736s, table=0, n_packets=518, n_bytes=102960, idle_age=138, priority=65050,ip,nw_dst=10.0.0.254 actions=output:5
cookie=0x4c0000ef7fae4b, duration=332.698s, table=0, n_packets=270, n_bytes=49059, idle_age=138, priority=65050,ip,nw_dst=10.0.0.3 actions=output:2
cookie=0x4c0000ef7fa6c9, duration=332.751s, table=0, n_packets=125, n_bytes=36646, idle_age=305, priority=65050,ip,nw_dst=10.0.0.1 actions=output:1
cookie=0x10000487f5557, duration=348.362s, table=0, n_packets=285, n_bytes=23085, idle_age=66, priority=40000,dl_type=0x88cc actions=CONTROLLER:65535
cookie=0x10000487f63a1, duration=348.362s, table=0, n_packets=285, n_bytes=23085, idle_age=66, priority=40000,dl_type=0x8942 actions=CONTROLLER:65535
cookie=0x10000488ebd5d, duration=348.362s, table=0, n_packets=12, n_bytes=504, idle_age=148, priority=40000,arp actions=CONTROLLER:65535
cookie=0x10000464443e2, duration=348.362s, table=0, n_packets=0, n_bytes=0, idle_age=348, priority=5,arp actions=CONTROLLER:65535
cookie=0x4c0000951a5275, duration=332.671s, table=0, n_packets=0, n_bytes=0, idle_age=332, priority=65050,udp,nw_src=10.0.0.3,nw_dst=10.0.0.1,tp_dst=50000 actions=output:1
cookie=0x4c0000951b3b33, duration=332.636s, table=0, n_packets=10, n_bytes=460, idle_age=168, priority=65111,udp,nw_src=10.0.0.3,nw_dst=10.0.0.2,tp_dst=50000 actions=output:3
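For reference, a rule like 2) can be installed with a single POST to the ONOS REST API. Below is a minimal Python sketch, assuming ONOS listens on 127.0.0.1:8181 with its default onos/rocks credentials, that s1 has device ID of:0000000000000001, and that org.onosproject.cli is an acceptable application ID (all of these are assumptions; adjust them to your setup):
import requests

ONOS = "http://127.0.0.1:8181/onos/v1"      # assumed ONOS address
DEVICE = "of:0000000000000001"              # assumed device ID of s1

# Selector and treatment mirror flow rule 2) above:
# udp, nw_src=10.0.0.3, nw_dst=10.0.0.2, tp_dst=50000 -> output:3
flow = {
    "priority": 65111,
    "timeout": 0,
    "isPermanent": True,
    "deviceId": DEVICE,
    "treatment": {"instructions": [{"type": "OUTPUT", "port": "3"}]},
    "selector": {"criteria": [
        {"type": "ETH_TYPE", "ethType": "0x0800"},
        {"type": "IP_PROTO", "protocol": 17},
        {"type": "IPV4_SRC", "ip": "10.0.0.3/32"},
        {"type": "IPV4_DST", "ip": "10.0.0.2/32"},
        {"type": "UDP_DST", "udpPort": 50000},
    ]},
}

resp = requests.post("%s/flows/%s" % (ONOS, DEVICE),
                     params={"appId": "org.onosproject.cli"},  # assumed app ID
                     json=flow,
                     auth=("onos", "rocks"))
resp.raise_for_status()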

Summary
The cause is related to UDP fragmentation: the fragmented packets don't hit the flow entry.
In detail
(1) I set the UDP datagram size to 63 KB when sending, so it is fragmented at the IP layer (see the sketch after this list).
(2) Only the first fragment carries the UDP header, so only that packet hits the UDP flow entry properly; the other fragments match only the IP-level rule.
(3) To solve this problem, I used jumbo frames, which means OVS can handle packets with a larger MTU. We should also raise the NIC MTU (http://docs.openvswitch.org/en/latest/topics/dpdk/jumbo-frames/).
(4) We also updated OVS to a version above 2.6.0
(it is better to follow http://docs.openvswitch.org/en/latest/intro/install/general/ than other guides found by googling).
(5) After enabling jumbo frames, the flow table matching works properly even for larger UDP datagrams.
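To illustrate (1) and (2): a datagram much larger than the link MTU is split into IP fragments by the sending kernel, and only the first fragment carries the UDP header (and hence the tp_dst=50000 that rule 2) matches on). A minimal Python sketch of such a sender, with the address and size taken from the question (the original test used iperf):
import socket

DST = ("10.0.0.2", 50000)   # destination IP and UDP port from flow rule 2)
PAYLOAD = b"x" * 63000      # ~63 KB datagram, far above a 1500-byte MTU

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# The kernel splits this single send into dozens of IP fragments.
# Only the first fragment contains the UDP ports; the later fragments
# match only the IP-level fields, so they hit rule 1) instead of rule 2).
sock.sendto(PAYLOAD, DST)
sock.close()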

Related

DPDK SRIOV multiple vlan traffic over single VF of SRIOV passthrough

When trying to use the RTE APIs for VLAN offload and VLAN filtering, I observe that both VLAN-tagged and untagged packets are being sent out.
APIs used:
rte_eth_dev_set_vlan_offload
rte_eth_dev_vlan_filter
DPDK - 18.08
RHEL - 7.6
Driver - igb_uio
Is there a way to allow only VLAN-tagged packets to be sent out?
Not sure if I understand correctly - are you trying to strip VLAN tags from TX packets? Why would you want to offload that? If you forward packets from somewhere else, their tags have already been stripped by RX offload. If you create them yourself, well - you're in control.
Regardless, if you want to offload TX VLAN insertion:
rte_eth_dev_set_vlan_offload only sets RX offload flags.
You'll probably have to set the TX offload flag in your port config manually, as in this abridged snippet from the DPDK Flow Filtering example code:
struct rte_eth_conf port_conf = {
    .txmode = {
        /* Enable hardware VLAN tag insertion on TX; the tag value itself
         * is taken from each transmitted mbuf's vlan_tci field. */
        .offloads = DEV_TX_OFFLOAD_VLAN_INSERT,
    },
};

openflow rule with multiple action ports

I am a little bit confused about how to interpret the action part of the following rule:
cookie=0x2b000000000000a5, duration=528.939s, table=0, n_packets=176, n_bytes=33116, idle_age=0, priority=2,in_port=1 actions=output:4,output:2
We have multiple output ports in a certain order, but when checking "restconf/operational/opendaylight-inventory:nodes/" in the ODL controller, the ports appear in a different order:
"action": [
{ "order": 0,"output-action": {
"max-length": 65535,
"output-node-connector": "2" }
{"order": 1, "output-action": {
"max-length": 65535,
"output-node-connector": "4" }
}
I am not sure how packets hitting such an entry will be forwarded. Are they replicated and sent over both ports? Are they load-balanced over all ports?
What does max-length refer to?
Is there any documentation explaining all the fields in detail?
With several output actions in the action list, as here, the packet is replicated and a copy is sent out of each listed port; it is not load-balanced. If you need richer multi-port behaviour (such as load balancing), you can use the group-table functionality. You can read the OpenFlow 1.3 spec documentation for details (Sections 5.6 and 5.6.1).
For max-length, again from the same document (Section A.2.5):
An Output action uses the following structure and fields:
Action structure for OFPAT_OUTPUT, which sends packets out 'port'. When the
'port' is the OFPP_CONTROLLER, 'max_len' indicates the max number of
bytes to send. A 'max_len' of zero means no bytes of the packet should
be sent. A 'max_len' of OFPCML_NO_BUFFER means that the packet is not
buffered and the complete packet is to be sent to the controller.

How to set Open vSwitch to evict newest flows when memory is full instead of the oldest ones?

I am currently trying to overflow the OvS flow tables and make the switch reject new rules and, subsequently, new packets.
I found this in the documentation:
Flow Table Configuration
Limit flow table 0 on bridge br0 to a maximum of 100 flows:
ovs-vsctl -- --id=@ft create Flow_Table flow_limit=100 over‐
flow_policy=refuse -- set Bridge br0 flow_tables=0=@ft
So I guess I first need to set overflow_policy=refuse, and do it for all 255 tables. Nevertheless, whenever I try to run this command, it returns:
ubuntu@ubuntu:~$ sudo ovs-vsctl -- --id=@ft create Flow_Table flow_limit=100 over‐flow_policy=refuse -- set Bridge br0 flow_tables=0=@ft
ovs-vsctl: Flow_Table does not contain a column whose name matches "over‐flow_policy"
Is there any way to set the policy to refuse for all the tables, and why do I get this error?
You should use overflow_policy instead of over‐flow_policy (the hyphen is just a line-wrap artifact from the documentation); then it works.
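To apply the refuse policy to every table, one option is simply to run the corrected command once per table ID. A minimal Python sketch, assuming a bridge named br0 and a flow limit of 100 (both illustrative), the ovs-vsctl map syntax flow_tables:N=@ft, and sufficient privileges:
import subprocess

BRIDGE = "br0"       # bridge to configure (assumption)
FLOW_LIMIT = 100     # maximum flows per table (illustrative)

for table_id in range(255):   # OpenFlow table IDs 0..254
    subprocess.check_call([
        "ovs-vsctl", "--",
        "--id=@ft", "create", "Flow_Table",
        "flow_limit=%d" % FLOW_LIMIT,
        "overflow_policy=refuse",            # note: no hyphen in the column name
        "--", "set", "Bridge", BRIDGE,
        "flow_tables:%d=@ft" % table_id,     # attach the new record to this table
    ])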

The receiveBufferSize not being honored. UDP packet truncated

netty 4.0.24
I am passing XML over UDP. When receiving the UDP packet, the packet is always of length 2048, truncating the message, even though I have attempted to set the receive buffer size to something larger (4096, 8192, 65536); it is not being honored.
I have verified the UDP sender using another UDP ingest mechanism: a standalone Java app using java.net.DatagramSocket. The XML is around 45 KB.
I was able to trace the stack to DatagramSocketImpl.createChannel (line 281). Stepping into DatagramChannelConfig, it has a receiveBufferSize of whatever I set (great), but a rcvBufAllocator of 2048.
Does the rcvBufAllocator override the receiveBufferSize (SO_RCVBUF)? Is the message coming in multiple buffers?
Any feedback or alternative solutions would be greatly appreciated.
I should also mention that I am using an ESB called vert.x, which uses netty heavily. Since I was able to trace the problem down to netty, I was hopeful that I could find help here.
The maximum size of incoming datagrams copied out of the socket is actually not a socket option, but rather a parameter of the socket read() function that your client passes in each time it wants to read a datagram. One advantage of this interface is that programs accepting datagrams of unknown/varying lengths can adaptively change the size of the memory allocated for incoming datagram copies such that they do not over-allocate memory while still getting the whole datagram. (In netty this allocation/prediction is done by implementors of io.netty.channel.RecvByteBufAllocator.)
In contrast, SO_RCVBUF is the size of a buffer that holds all of the datagrams your client hasn't read yet.
Here's an example of how to configure a UDP service with a fixed max incoming datagram size with netty 4.x using a Bootstrap:
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelOption;
import io.netty.channel.FixedRecvByteBufAllocator;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.DatagramPacket;
import io.netty.channel.socket.nio.NioDatagramChannel;
import java.net.InetSocketAddress;

int maxDatagramSize = 4092;
String bindAddr = "0.0.0.0";
int port = 1234;
SimpleChannelInboundHandler<DatagramPacket> handler = . . .;
InetSocketAddress address = new InetSocketAddress(bindAddr, port);
NioEventLoopGroup group = new NioEventLoopGroup();
Bootstrap b = new Bootstrap()
    .group(group)
    .channel(NioDatagramChannel.class)
    .handler(handler);
// Every read allocates a fixed maxDatagramSize buffer, so datagrams up to that size arrive intact.
b.option(ChannelOption.RCVBUF_ALLOCATOR, new FixedRecvByteBufAllocator(maxDatagramSize));
b.bind(address).sync().channel().closeFuture().await();
You could also configure the allocator with ChannelConfig.setRecvByteBufAllocator
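The same truncation behaviour can be seen outside netty with a plain UDP socket: the buffer size passed to each receive call, not SO_RCVBUF, bounds how much of a datagram you get back. A minimal Python sketch (not netty; the port number is arbitrary):
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)  # sizes the queue of unread datagrams
sock.bind(("0.0.0.0", 9999))

# Analogous to a 2048-byte RecvByteBufAllocator: a 45 KB datagram is
# truncated to 2048 bytes and the remainder of that datagram is dropped.
data, _ = sock.recvfrom(2048)
print(len(data))

# A large enough per-read buffer returns the next datagram in full.
data, _ = sock.recvfrom(65535)
print(len(data))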

Discarded UDP datagram over MTU size with IPv6

I've found that when I send a UDP datagram that gets fragmented (over 1452 bytes with MTU=1500), all the fragments are received on the target machine according to tcpdump, but then no message is received on the socket. This happens only with IPv6 addresses (both global and link-local); with IPv4 everything works as expected (and with non-fragmented datagrams as well).
When the datagram is discarded, this ICMP6 message is sent:
05:10:59.887920 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 69) 2620:52:0:105f::ffff:74 > 2620:52:0:105f::ffff:7b: [icmp6 sum ok] ICMP6, destination unreachable, length 69, unreachable port[|icmp6]
There are repeated neighbour solicitations/advertisements going on, and I can see the entry reach the neighbour cache (via ip neigh).
One minute later I get another ICMP6 message saying that the fragment has timed out.
What's wrong with the settings? The reassembled packet should not be discarded when it can be delivered, right?
The system is RHEL6, kernel 2.6.32-358.11.1.el6.x86_64.