Centos 6.3 Server ignoring IGMP Queries - udp

I am using a centos 6.3 server to subscribe to UDP multicast data and I noticed that my server doesn't answer to the IGMP queries sent by the switch it is connected to.
As a result, when I open my multicast socket I start receiving multicast data until my IGMP subscription timeout since the server doesn't renew its subscription. (To insure that the problem doesn't comes from any code of mine, I am simply using smcroute to open multicast subscriptions)
I search online for a while and none of the tips I found helped me to fix this problem.
Here is an screenshot of the IGMP communications on any interface of my server:
http://img521.imageshack.us/img521/9953/capture10y.png
As we can see, my server first send 2 IGMP joins but a few minutes after, when the switch send IGMP query, nobody answers.
The version of the IGMP protocol set for the concerned interface is V2:
[root#localhost ~]# cat /proc/net/igmp
Idx Device : Count Querier Group Users Timer Reporter
1 lo : 0 V2
010000E0 1 0:00000000 0
2 eth0 : 5 V2
FB0000E0 1 0:00000000 1
010000E0 1 0:00000000 0
5 tap0 : 5 V3
FB0000E0 1 0:00000000 0
010000E0 1 0:00000000 0
7 eth1.371: 13 V2
414000E0 1 0:00000000 1
404000E0 1 0:00000000 1
3F4000E0 1 0:00000000 1
504000E0 1 0:00000000 1
524000E0 1 0:00000000 1
494000E0 1 0:00000000 1
4A4000E0 1 0:00000000 1
4B4000E0 1 0:00000000 1
FB0000E0 1 0:00000000 0
010000E0 1 0:00000000 0
The rp_filter is disabled on this interface:
[root#localhost ~]# cat /proc/sys/net/ipv4/conf/eth1.371/rp_filter
0
Thanks a lot for any help you could give me.
Best,
Laurent

Try to temporary disable iptables:
# service iptables stop
and see if it will help.

It might be coming from asymmetric route rules.
DO you have a rule to reach 170.19.52.5 from your server?
(you can use route -n)

Related

TFTP fails: server error "unknown option"

i am trying to get or put with TFTP protocol. I set the TFPT server and everything worked but stopped working lately. I am able to connect to the server but am getting an error "unknown option -?" when inspecting the Syslog. When running Tcpdump i see the message is "19 RRQ filename netascii"
i have looked for a couple of days in the net but with no success.
from RFC 1350
opcode operation
1 Read request (RRQ)
2 Write request (WRQ)
3 Data (DATA)
4 Acknowledgment (ACK)
5 Error (ERROR)
2 bytes string 1 byte string 1 byte
------------------------------------------------
| Opcode | Filename | 0 | Mode | 0 |
------------------------------------------------
This is what you have to see in your RRQ or WRQ packet

How do I target the MSS value in a TCP packet using BPF

I am learning BPF and converting some iptables rules to BPF bytcode. I am primarily using the nfbpf_compile application to do this, rather than trying to write C or Assembler. I am having a lot of luck but the syntax of one rule is escaping me.
I'd like to drop packets with the syn flag set that is also missing an MSS value. In iptables the MSS is targetted with --tcp-option 2. I know that MSS is in the TCP options that start at byte 22 of the TCP packet, and MSS is 'kind' 2. I am able to filter the MSS by using tcp[22:2]==$NUMBER in BPF syntax. However, what I want to do is target SYN packets where the MSS is missing entirely.
I have tried every variant of "null" I can think of but am having no luck.
Does anyone know the equivalent of iptables ! --tcp-option 2 in BPF syntax?
An example of something I have tried:
$ ./nfbpf_compile RAW 'tcp[22:2]==0x0' (I know this won't work..it's an example)
12,48 0 0 0,84 0 0 240,21 0 8 64,48 0 0 9,21 0 6 6,40 0 0 6,69 4 0 8191,177 0 0 0,72 0 0 22,21 0 1 0,6 0 0 65535,6 0 0 0
# iptables -I INPUT -m bpf --bytecode '12,48 0 0 0,84 0 0 240,21 0 8 64,48 0 0 9,21 0 6 6,40 0 0 6,69 4 0 8191,177 0 0 0,72 0 0 22,21 0 1 0,6 0 0 65535,6 0 0 0
' -j DROP
TL;DR If you know there are only 0 or 1 TCP options, or if you know the MSS option is always the first option, then you can use the following filter:
tcp && (tcp[tcpflags] == tcp-syn) && ((((tcp[12] & 0xf0) >> 2) < 21) || tcp[20] != 2)
If you don't know this (there are several TCP options and the MSS option may be any of them), which is generally the case, I don't think it's possible to express a matching filter with nfbpf_compile's syntax. In that case, I would recommend writing a C program and loading it with -m bpf --object-pinned /path/to/pinned/bpf.
Let me explain the above filter first. You have two cases to match: 1) there is no TCP option or 2) the first TCP option is not the MSS:
tcp[12] & 0xf0 extracts the data offset field from the TCP header, i.e., the number of 32bit word in the TCP header.
(tcp[12] & 0xf0) >> 2 multiplies this by 4 to get the number of bytes.
If you have less than 21 bytes in your TCP header, then you know there are no TCP options.
tcp[20] != 2 checks that the Option-Kind field of the first TCP option (starts at offset 20) is not 2, the Option-Kind for MSS.
Why is the general case harder to match? TCP options have a variable length (depending on their Option-Kind) and there is a variable, bounded number of TCP options. Say you want to extend the above filter to match on the second TCP option. You first need to know where that option starts; the first option has a variable length so this is not a fixed offset.
With cBPF (the BPF bytecode emitted by nfbpf_compile), you might be able to express that by storing the current option offset in the X register and then loading a byte into register A with the 2nd addressing mode (see the Linux documentation, BPF engine and instruction set). However, I do not think you can do this with the limited nfbpf_compile syntax (assuming it's the same syntax as tcpdump's).

How to set up RSS hash fuction in XL710 to receive IPv4 flow type?

In DPKD the ETH_RSS_IPV4 data flow is not activated by default for XL710 Intel NIC. So, when you want to distribute packets among lcores you have to select other IPv4 data flows which are supported by XL710, namely ETH_RSS_FRAG_IPV4, ETH_RSS_NONFRAG_IPV4_TCP, ETH_RSS_NONFRAG_IPV4_UDP, ETH_RSS_NONFRAG_IPV4_SCTP, and ETH_RSS_NONFRAG_IPV4_OTHER. However you will face a silly problem when you are dealing with the fragmented IP packets. If you choose to go with ETH_RSS_FRAG_IPV4 and ETH_RSS_NONFRAG_IPV4_TCP options then some fragmented packets of a connection will fall into another queue, because they don't have L4 port numbers. If you exclude ETH_RSS_NONFRAG_IPV4_TCP function then the ETH_RSS_FRAG_IPV4 hash function will not be applied to non-fragmented packets and those packets will go to queue 0. All other combination of hash functions will not work. So, what should we do?
The behavior of XL710 is not compatible with the conventions in DPDK. So, you must directly work with the API offered by i40e driver in order to set up RSS for ETH_RSS_IPV4. As mentioned in the Intel® Ethernet Controller 710 Series Specification Update, page 18 (release Jan 2017):
Functions that require the Hash (RSS) filters on IPv4 packets should
set all IPv4 PCTYPEs in the PFQF_HENA / VFQF_HENA (PCTYPEs 31, 33…36)
Supported packet types (PCTYPE) are mentioned in Intel® Ethernet Controller 710 Series Datasheet pages 597 and 598 (release Jan 2017). You can see that there is no packet type defined for IPv4.
However there is a solution. The clue is to modify the input set for all required flow types (or packet types). Let's try it with testpmd tool which is provided by DPDK in app folder. After compiling DPDK and the app, run the testpmd application:
./app/test-pmd/testpmd -c ff -n 2 -w 0a:00.0 -w 0a:00.1 -- -i --rxq=4 --txq=4
We have two XL710 in our system. With the following commands you can configure XL710 to behave as you want to support IPv4 data flow.
port config all rss all
set_hash_input_set 0 ipv4-tcp src-ipv4 select
set_hash_input_set 0 ipv4-tcp dst-ipv4 add
set_hash_input_set 0 ipv4-udp src-ipv4 select
set_hash_input_set 0 ipv4-udp dst-ipv4 add
set_hash_input_set 1 ipv4-tcp src-ipv4 select
set_hash_input_set 1 ipv4-tcp dst-ipv4 add
set_hash_input_set 1 ipv4-udp src-ipv4 select
set_hash_input_set 1 ipv4-udp dst-ipv4 add
set_hash_global_config 0 default ipv4-frag enable
set_hash_global_config 0 default ipv4-tcp enable
set_hash_global_config 0 default ipv4-udp enable
set_hash_global_config 1 default ipv4-frag enable
set_hash_global_config 1 default ipv4-tcp enable
set_hash_global_config 1 default ipv4-udp enable
It selects the proper input set for TCP and UDP flow types by removing the L4 port section. The set_hash_global_config command enables the symmetric hash if you need it. By modifying the TCP input set, it behaves just like Frag IPv4 flow type and as a result all packets belonging to the same connection go to the same lcore.
Note that the default input set for Frag IPv4 and NonFIPv4, Other is IP4-S and IP4-D. So it doesn't need to be modified. Remember to modify all other IPv4 flows input set and symmetric quality of them.
You can find the API functions of those commands by looking at the source code of the testpmd application.

How to monitor the process in a container?

I currently look into the LXC container API. I am trying to figure out how can I make the operating system know to which container the currently running process belongs. In this way, OS can allocate resource for processes according to the container.
I am assuming your query is - Given a PID, how to find the container in which this process is running?
I will try to answer it based on my recent reading on Linux containers. Each container can be configured to start with its own user and group id mappings.
From https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html:
lxc.id_map
Four values must be provided. First a character, either 'u', or 'g', to specify whether user or group ids are being mapped. Next is
the first userid as seen in the user namespace of the container. Next
is the userid as seen on the host. Finally, a range indicating the
number of consecutive ids to map.
So, you would add something like this in config file (Ex: ~/.config/lxc/default.conf):
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
The above basically means that uids/gids between 0 and 65536 are mapped to numbers between 100000 and 1655356. So, a uid of 0 (root) on container will be seen as 100000 on host
For Example, inside container it will look something like this:
root#unpriv_cont:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 02:18 ? 00:00:00 /sbin/init
root 157 1 0 02:18 ? 00:00:00 upstart-udev-bridge --daemon
But on host the same processes will look like this:
ps -ef | grep 100000
100000 2204 2077 0 Dec12 ? 00:00:00 /sbin/init
100000 3170 2204 0 Dec12 ? 00:00:00 upstart-udev-bridge --daemon
100000 1762 2204 0 Dec12 ? 00:00:00 /lib/systemd/systemd-udevd --daemon
Thus, you can find the container of a process by looking for its UID and relating it to the mapping defined in that container's config.

LVS: All connections are InActConn

All connections are InActConn
I'm a newbie using LVS. I've tried LVS/TUN and LVS/DR, the result is the same, all connections are InActConn. But the realservers can be reach (through PING). Pls help!!!
OS: CentOS 6.2
RemoteAddress:Port Forward Weight ActiveConn InActConn
UDP 192.168.10.240:2345 rr
-> 192.168.10.251:2345 Tunnel 1 0 10
-> 192.168.10.252:2345 Tunnel 1 0 9
-> 192.168.10.253:2345 Tunnel 1 0 9
This is the expected behavior for services not maintaining connections, like UDP. You may want to read the LVS Howto, especially the part about Active/Inactive connections :
http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.ipvsadm.html#ActiveConn
Old Question : But I got to this post from Google and want to paste my findings here.
In the above answer, the link pasted by #remi-ggacogne missed 1 step for Real server.
You have to turn rp_filter off (esp. in Centos / RHEL ) https://www.slashroot.in/linux-kernel-rpfilter-settings-reverse-path-filtering
Open /etc/sysctl.conf and paste below lines ( as per your network interface )
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.tunl0.rp_filter = 0
To make the above active -->
$systcl -p