How does openvswitch handle packet interrupts?

I found that the interrupt handler of the ixgbe driver isn't called when the interface is registered as a port in openvswitch.
How, then, does openvswitch get interrupted for incoming packets?
Thanks

Delay of incoming network packets on Linux - How to analyse?

The problem is: Sometimes tcpdump shows that the reception of a UDP packet is held back until the next incoming UDP packet, although a network tap device shows it passing through the cable without delay.
Scenario: My Profinet stack on Linux (located in user space) has a cyclic connection over which it receives and sends Profinet protocol packets every 4 ms (via raw sockets). About every 30 ms it also receives UDP packets in another thread on a UDP socket and replies to them immediately, according to that protocol. CPU load is around 10%. Sometimes such received UDP packets seem to be stuck in the network driver. After 2 seconds the next UDP packet comes in, and both the delayed packet and the new one are received. There are no dropped packets.
My measuring:
I use tcpdump -i eth0 --time-stamp-precision=nano --time-stamp-type=adapter_unsynced -w /tmp/tcpdump.pcap to record the UDP traffic to a RAM disk file.
At the same time I use a network tap device to record the traffic.
Question:
How to find out where the delay comes from (or is it a known effect)?
(Second question: What does the timestamp that tcpdump attaches to each packet tell me? I mean, which OSI layer does it refer to; in other words, at what point is it taken?)
Topology: "embedded device with Linux and eth0" <---> tap-device <---> PLC. The program "tcpdump" is running on the embedded device. The tap device is listening on the cable. The actual Profinet connection is between PLC and embedded device. A PC is connected on the tap device to record what it is listening to.
Wireshark (via tap and tcpdump): see here (packet no. 3189 in tcpdump.pcap)
It was a bug in the Freescale Fast Ethernet driver (fec_main.c), which NXP's excellent support has now fixed.
The actual answer (to the question "How to find out where the delay comes from?") is: build a Linux kernel with tracing enabled, patch the driver code with trace points, and then analyse the trace with the kernel tracing tool trace-cmd. It's quite involved, but I'm very happy it is fixed now:
trace-cmd record -o /tmp/trace.dat -p function -l fec_enet_interrupt -l fec_enet_rx_napi -e 'fec:fec_rx_tp' tcpdump -i eth0 --time-stamp-precision=nano --time-stamp-type=adapter_unsynced -w /tmp/tcpdump.pcap

Modbus RTU over TCP

I'm trying to communicate with a Modbus slave using RTU over TCP, but it does not work. I don't get any exception; the program seems to wait for a response, and no timeout exception is thrown.
When I change the Modbus slave to plain TCP, it works fine and I can read the holding registers. Does PLC4X support Modbus RTU over TCP? If yes, is there any example? If no, are you planning to implement it?

Can IPv6 multicasting work when one or more receivers are unable to bind to the program's well-known port?

Consider a simple IPv6 multicast application:
A "talker" program periodically sends out IPv6 UDP packets to a well-known multicast-group, sending them to a well-known port.
Zero or more "listener" programs bind themselves to that well-known port and join the well-known multicast group, and they all receive the UDP packets.
That all works pretty well, except when one or more of the listener programs is unable to bind to the well-known UDP port because a socket in some other (unrelated) program has already bound to that UDP port (without setting the SO_REUSEADDR and/or SO_REUSEPORT options to allow it to be shared). AFAICT the listener program is then simply out of luck; there is nothing it can do to receive the multicast data, short of asking the user to terminate the interfering program in order to free up the port.
Or is there? For example, is there some technique or approach that would allow a multicast listener to receive all the incoming multicast packets for a given multicast-group, regardless of which UDP port they are being sent to?
If you want to receive all multicast traffic regardless of port, you'd need to use raw sockets to get the complete IP datagram. You could then directly inspect the IP header, check if it's using UDP, then check the UDP header before reading the application layer data. Note that methods of doing this are OS specific and typically require administrative privileges.
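As a concrete illustration of the header-inspection step described above, here is a minimal sketch in Python. It assumes you have already obtained a complete raw IPv6 datagram (e.g. from an AF_PACKET socket on Linux); the raw-socket setup itself is omitted since, as noted, it is OS specific and needs privileges. The parser also assumes no IPv6 extension headers between the fixed header and the UDP header, which is the common case.

```python
import struct

def parse_ipv6_udp(datagram: bytes):
    """Parse a raw IPv6 datagram; return (dst_port, payload) if UDP, else None."""
    if len(datagram) < 40:                 # IPv6 fixed header is 40 bytes
        return None
    next_header = datagram[6]              # protocol of the payload
    if next_header != 17:                  # 17 = UDP
        return None
    udp = datagram[40:]
    if len(udp) < 8:                       # UDP header is 8 bytes
        return None
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", udp[:8])
    return dst_port, udp[8:]

# Build a synthetic IPv6+UDP datagram to demonstrate the parser.
payload = b"hello"
udp_header = struct.pack("!HHHH", 5000, 4321, 8 + len(payload), 0)
ipv6_header = (struct.pack("!IHBB", 0x60000000, 8 + len(payload), 17, 64)
               + bytes(16) + bytes(16))    # zeroed src/dst addresses
datagram = ipv6_header + udp_header + payload
print(parse_ipv6_udp(datagram))            # → (4321, b'hello')
```

With this in place, the listener can filter on any destination port it likes, regardless of who has bound it.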
Regarding SO_REUSEADDR and SO_REUSEPORT: programs that set these options allow multiple sockets to receive multicast packets sent to a given port. However, if you also need to receive unicast packets, this method has issues. Incoming unicast packets may be sent to both sockets, may always be sent to one specific socket, or may be sent to each in an alternating fashion. This also differs based on the OS.
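A short sketch of the port-sharing setup just described, in Python. SO_REUSEPORT is not available on every platform (hence the hasattr guard), and as noted the exact sharing semantics are OS specific; this was written with Linux in mind.

```python
import socket

def make_shared_udp_socket(port: int) -> socket.socket:
    """Create a UDP socket that tolerates other sockets bound to the same port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):        # not present on all platforms
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("", port))
    return s

# Without these options the second bind() below would fail with EADDRINUSE.
a = make_shared_udp_socket(0)                  # port 0: kernel assigns a free port
port = a.getsockname()[1]
b = make_shared_udp_socket(port)               # second socket binds the same port
port_b = b.getsockname()[1]
a.close(); b.close()
print(port == port_b)                          # → True
```

Both sockets can then join the same multicast group and each will receive a copy of the group traffic.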

What happens between receiving network data on the ethernet port and apache2 doing something?

This one is kind of a vague question, because my own understanding is about as vague. I'm interested in what needs to happen for sporadic voltages on the network cables to cause a program running on your computer to do something.
Say I'm running apache2 on my webserver. Somebody triggers the correct sequence of events on their own internet-connected computer, which results in network data arriving at the server. Then what?
My guess is that there is some peripheral component on the motherboard which listens to the data, which then raises an interrupt in the CPU. Somehow, in the interrupt service routine, Linux must ask the apache2 code to do something. Is this correct? If so, would anyone be willing to share a few extra details?
Thanks
I'll outline what happens from the bottom up, making references to code wherever possible.
Layer 1 (PHY)
Ethernet card (NIC) receives and decodes the signal on the wire, and pushes it into a shift register
See Ethernet over twisted-pair for the line codes details for each variant of *BASE-T Ethernet
When full ethernet frame is received, it is placed into a receive (RX) queue in hardware
NIC raises an interrupt, using bus-specific mechanism (either PCI IRQ line, or message-signaled interrupt)
Interrupt controller (APIC) receives interrupt and directs it to a CPU
CPU saves running context and switches to interrupt context
CPU loads interrupt handler vector and begins executing it
IRQs can be shared by multiple devices. The kernel has to figure out which device is actually interrupting. I'll refer to e100.c driver as it is implemented in one C file and well-commented.
Linux kernel looks at all devices that share this IRQ, calling their driver to determine which device actually raised this interrupt. The driver function called is whatever was passed by the driver to request_irq. (See for_each_action_of_desc() in __handle_irq_event_percpu()).
Each driver whose device shares this IRQ looks at its device's status register to see whether it has an interrupt pending
NIC driver interrupt handler (e.g. e100_intr()) sees the NIC indeed interrupted. It disables the device interrupt (e.g. e100_disable_irq()) and schedules a NAPI callback (__napi_schedule()). NIC driver "claims" the interrupt by returning IRQ_HANDLED. Interrupt ends.
Linux kernel NAPI subsystem calls back NIC driver (e.g. e100_poll) which reads the packet from the NIC RX queue and puts it into a struct sk_buff (SKB), and pushes it into the kernel network stack (e.g. e100_rx_indicate()).
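The interrupt/poll handoff above (disable the device interrupt, poll until the RX queue is drained, then re-enable) can be sketched as a userspace simulation. This is purely illustrative; the class and method names are made up and nothing here is real driver code.

```python
from collections import deque

class FakeNic:
    """Userspace simulation of the interrupt -> NAPI poll pattern."""
    def __init__(self):
        self.rx_queue = deque()
        self.irq_enabled = True
        self.poll_scheduled = False
        self.delivered = []

    def receive_frame(self, frame: bytes):
        """Hardware side: frame lands in the RX queue; raise IRQ if enabled."""
        self.rx_queue.append(frame)
        if self.irq_enabled:
            self.interrupt()

    def interrupt(self):
        """Interrupt handler: disable further IRQs, schedule the poll."""
        self.irq_enabled = False          # compare e100_disable_irq()
        self.poll_scheduled = True        # compare __napi_schedule()

    def poll(self, budget: int = 64):
        """NAPI poll: drain up to `budget` frames, then re-enable the IRQ."""
        while self.rx_queue and budget:
            self.delivered.append(self.rx_queue.popleft())  # into the stack
            budget -= 1
        if not self.rx_queue:             # queue empty: back to interrupt mode
            self.poll_scheduled = False
            self.irq_enabled = True

nic = FakeNic()
nic.receive_frame(b"frame-1")     # raises the (simulated) interrupt
nic.receive_frame(b"frame-2")     # arrives while IRQs are off: no new interrupt
nic.poll()
print(nic.delivered)              # → [b'frame-1', b'frame-2']
```

The point of the pattern is visible in the trace: the second frame causes no second interrupt, because under load the driver stays in polling mode instead of taking one interrupt per packet.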
The whole TCP/IP stack is implemented in the Linux kernel for performance reasons:
Layer 2 (MAC)
Kernel ethernet layer looks at ethernet packet and verifies that it is destined for this machine's MAC address
Kernel ethernet layer sees Ethertype == IP, hands it to IP layer
Note, the protocol is actually set by the device driver (e.g. in e100_rx_indicate()).
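The layer-2 decision above can be sketched as a few lines of header parsing. Note that in reality the NIC usually filters on destination MAC in hardware, and multicast MACs are ignored in this simplified sketch.

```python
import struct

ETH_P_IP = 0x0800                           # EtherType for IPv4

def eth_demux(frame: bytes, our_mac: bytes):
    """Return the payload if the frame is for us and carries IP, else None."""
    if len(frame) < 14:                     # dst(6) + src(6) + EtherType(2)
        return None
    dst = frame[:6]
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if dst not in (our_mac, b"\xff" * 6):   # not unicast-to-us or broadcast
        return None
    if ethertype != ETH_P_IP:               # would go to another protocol handler
        return None
    return frame[14:]                       # hand the payload to the IP layer

mac = bytes.fromhex("020000000001")
frame = mac + bytes(6) + struct.pack("!H", ETH_P_IP) + b"ip-payload"
print(eth_demux(frame, mac))                # → b'ip-payload'
```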
Layer 3 (IP)
Kernel IP layer receives packet (ip_rcv())
Kernel IP layer queues up all IP fragments
When all IP fragments are received, it processes the IP packet. It looks at the protocol field, sees that it is TCP, and hands it to the TCP layer
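The fragment-queuing step can be sketched as follows. Real reassembly (ip_defrag in the kernel) keys fragments by (src, dst, id, protocol), measures offsets in 8-byte units, and handles timeouts and overlaps; this toy version only shows the core idea of buffering until the payload has no holes.

```python
def reassemble(fragments):
    """Toy IP-fragment reassembly: each fragment is (offset, more_flag, data)."""
    frags = sorted(fragments)                    # order by offset
    if not frags or frags[-1][1]:                # last fragment must have MF == 0
        return None
    payload, expected = b"", 0
    for offset, more, data in frags:
        if offset != expected:                   # a hole: keep queuing, wait
            return None
        payload += data
        expected += len(data)
    return payload

# Fragments may arrive out of order; MF (more fragments) is 1 on all but the last.
frags = [(5, 1, b" worl"), (0, 1, b"hello"), (10, 0, b"d")]
print(reassemble(frags))                         # → b'hello world'
```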
Layer 4 (TCP)
Kernel TCP layer receives packet (tcp_v4_rcv()).
Kernel TCP layer looks at src/dst IP/port and matches it up with an open TCP connection (socket) (tcp_v4_rcv() calls __inet_lookup_skb()).
If it is a SYN packet (new connection):
TCP will see that there is a listening socket open for port 80
TCP creates a new connection object for this new connection
Kernel wakes up the task that is sleeping, blocked on an accept call - or select
If it is not a SYN packet (there is data):
Kernel queues up the TCP data from this segment on the socket
Kernel wakes up a task that is asleep, blocked on a recv call - or select (sock_def_readable())
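The layer-4 steps above amount to demultiplexing by 4-tuple plus a SYN/data split, which can be sketched like this. The names and data structures are made up for illustration; the kernel's real lookup is __inet_lookup_skb() over hashed socket tables.

```python
LISTEN_PORT = 80
connections = {}          # (src_ip, src_port, dst_ip, dst_port) -> queued data
accept_queue = []         # new connections waiting for accept()

def tcp_rcv(src_ip, src_port, dst_ip, dst_port, syn=False, data=b""):
    """Sketch of tcp_v4_rcv(): match the segment to a socket, then dispatch."""
    key = (src_ip, src_port, dst_ip, dst_port)
    if syn:
        if dst_port == LISTEN_PORT:       # a listening socket exists for port 80
            connections[key] = []         # create the new connection object
            accept_queue.append(key)      # would wake a task blocked in accept()
        return
    if key in connections and data:
        connections[key].append(data)     # would wake a task blocked in recv()

tcp_rcv("10.0.0.2", 40000, "10.0.0.1", 80, syn=True)
tcp_rcv("10.0.0.2", 40000, "10.0.0.1", 80, data=b"GET / HTTP/1.1\r\n")
print(accept_queue)       # → [('10.0.0.2', 40000, '10.0.0.1', 80)]
print(connections[("10.0.0.2", 40000, "10.0.0.1", 80)])
```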
Layer 5 (Application - HTTP)
Apache (httpd) will wake up, depending on the system call it is blocked on:
accept() returns when a new child connection is available (this is handled with a wrapper called apr_socket_accept())
recv() returns when a socket has new data, which has been read into a userspace buffer
Apache processes the buffer, parsing HTTP protocol strings
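The application-layer side (block in accept(), block in recv(), parse the request line, reply) can be shown end to end with real sockets on loopback. This is a one-shot sketch, not how Apache is built: a real server loops, uses a pool of workers, and parses HTTP far more carefully.

```python
import socket
import threading

def serve_once(srv: socket.socket):
    """accept() -> recv() -> parse -> reply, exactly once."""
    conn, _ = srv.accept()            # blocks until a connection is established
    request = conn.recv(4096)         # blocks until data is queued on the socket
    path = request.split(b"\r\n")[0].split(b" ")[1]   # crude request-line parse
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n" + path)
    conn.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))            # port 0: kernel assigns a free port
srv.listen(1)
t = threading.Thread(target=serve_once, args=(srv,))
t.start()

cli = socket.create_connection(srv.getsockname())
cli.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")
reply = b""
while chunk := cli.recv(4096):        # read until the server closes the connection
    reply += chunk
cli.close(); t.join(); srv.close()
print(reply)                          # → b'HTTP/1.0 200 OK\r\n\r\n/index.html'
```

Every blocking call here (accept, recv on both ends) is a point where the kernel puts the task to sleep and later wakes it from the receive path described above.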
Additional Resources
Linux networking stack from the ground up

Should an IPv6 UDP socket that is set up to receive multicast packets also be able to receive unicast packets?

I've got a little client program that listens on an IPv6 multicast group (e.g. ff12::blah:blah%en0) for multicast packets that are sent out by a server. It works well.
The server would also like to sometimes send a unicast packet to my client (since if the packet is only relevant to one client there is no point in bothering all the other members of the multicast group with it). So my server just does a sendto() to my client's IP address and the port that the client's IPv6 multicast socket is listening on.
If my client is running under MacOS/X, this works fine; the unicast packet is received by the same socket that receives the multicast packets. Under Windows, OTOH, the client never receives the unicast packet (even though it does receive the multicast packets without any problems).
My question is, is it expected that a multicast-listener IPv6 UDP socket should also be able to receive unicast packets on that same port (in which case perhaps I'm doing something wrong, or have Windows misconfigured)? Or is this something that "just happens to work" under MacOS/X but isn't guaranteed, so the fact that it doesn't work for me under Windows just means I had the wrong expectations?
It should work fine. As long as you bind to IN6ADDR_ANY, then join the multicast groups, you should be able to send and receive unicast packets with no problem.
It's important to bind to IN6ADDR_ANY (or INADDR_ANY for IPv4) when using multicast. If you bind to a specific address, multicast reception breaks on Linux, because the destination address of a multicast packet is the group address rather than your local address and so no longer matches the binding.
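A small sketch of the recommended setup in Python. The group address "ff12::1" and interface index 0 are placeholders, not the asker's real values, and the join is guarded because it can fail in environments without a multicast-capable interface; the unicast part is what this question is about and works regardless.

```python
import socket
import struct

def make_listener(port: int) -> socket.socket:
    """Bind an IPv6 UDP socket to the wildcard address, then join the group."""
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    s.bind(("::", port))                       # IN6ADDR_ANY: receive on all addresses
    try:
        group = socket.inet_pton(socket.AF_INET6, "ff12::1")   # placeholder group
        mreq = group + struct.pack("@I", 0)    # ipv6_mreq: group + ifindex (0 = any)
        s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
    except OSError:
        pass                                   # no multicast-capable interface
    return s

# The same socket also receives plain unicast datagrams sent to its port:
rx = make_listener(0)
rx.settimeout(5)
port = rx.getsockname()[1]
tx = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
tx.sendto(b"unicast hello", ("::1", port))
data, addr = rx.recvfrom(1024)
tx.close(); rx.close()
print(data)                                    # → b'unicast hello'
```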