LVS: All connections are InActConn - load-balancing

I'm a newbie using LVS. I've tried LVS/TUN and LVS/DR, and the result is the same: all connections are InActConn. But the realservers can be reached (through ping). Please help!
OS: CentOS 6.2
RemoteAddress:Port        Forward Weight ActiveConn InActConn
UDP 192.168.10.240:2345 rr
-> 192.168.10.251:2345    Tunnel  1      0          10
-> 192.168.10.252:2345    Tunnel  1      0          9
-> 192.168.10.253:2345    Tunnel  1      0          9

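This is the expected behavior for services that do not maintain connections, like UDP: ipvsadm counts a connection as active only while it is in the ESTABLISHED state, and UDP has no such state. You may want to read the LVS HOWTO, especially the part about active/inactive connections: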
http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.ipvsadm.html#ActiveConn

Old question, but I got to this post from Google and want to add my findings here.
In the above answer, the link pasted by @remi-ggacogne misses one step for the real server.
You have to turn rp_filter off (especially on CentOS / RHEL): https://www.slashroot.in/linux-kernel-rpfilter-settings-reverse-path-filtering
Open /etc/sysctl.conf and add the lines below (adjusted for your network interface):
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.tunl0.rp_filter = 0
To make the above active, run:
$ sysctl -p
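Why both the "all" and the "tunl0" entries matter: the kernel's ip-sysctl documentation says the value used for source validation on an interface is the maximum of conf/all/rp_filter and conf/<interface>/rp_filter. A minimal Python sketch of that rule, assuming sysctl.conf-style text as input (the function names are made up for illustration, and missing entries are treated as 0, which is a simplification):

```python
def parse_rp_filter(sysctl_text):
    """Return {interface: value} for every net.ipv4.conf.<iface>.rp_filter line."""
    prefix, suffix = "net.ipv4.conf.", ".rp_filter"
    settings = {}
    for line in sysctl_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith(prefix) and key.endswith(suffix):
            iface = key[len(prefix):-len(suffix)]
            settings[iface] = int(value)
    return settings


def effective_rp_filter(settings, iface):
    """Value the kernel would use on iface: max(conf/all, conf/<iface>)."""
    # Assumption: an entry that is absent from the text counts as 0.
    return max(settings.get("all", 0), settings.get(iface, 0))


example = """
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.tunl0.rp_filter = 0
"""
print(effective_rp_filter(parse_rp_filter(example), "tunl0"))
```

If only the tunl0 line were set to 0 while "all" stayed at 1, the effective value on tunl0 would still be 1, so the filter would remain on.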


Understanding iptables output for filtering packets

I am working on a SLES 12 system where iptables is configured this way:
num  target  prot  opt  source    destination
1    ACCEPT  tcp   --   anywhere  anywhere                                     tcp dpt:bctp owner UID match dd-test-user
2    DROP    tcp   --   anywhere  anywhere                                     tcp dpt:bctp
3    DROP    all   --   anywhere  instance-data.us-west-2.compute.internal     owner GID match test
4    ACCEPT  all   --   anywhere  ip-100-34-0-0.us-west-2.compute.internal/21  owner GID match test
Can someone please help me understand this?
Does rule 2 mean that all packets will be dropped?
What does dpt:bctp mean here? I could not find anything about it in the manual.
Does rule 4 even get a chance to be applied for a process running under the group ID of the "test" group?
I tried searching the online iptables documentation, but I could not find an answer.
I found out that bctp is the name of one of the services defined on the system:
cat /etc/services | grep bctp
bctp 8999/tcp # Brodos Crypto Trade Protocol [Alexander_Sahler]
bctp 8999/udp # Brodos Crypto Trade Protocol [Alexander_Sahler]
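So dpt:bctp means destination port 8999. Rules are evaluated in order and the first matching ACCEPT or DROP decides: rule 1 accepts TCP packets to port 8999 sent by processes owned by dd-test-user, and rule 2 drops every other TCP packet to that port. Rules 3 and 4 are evaluated for the remaining traffic (all other ports and protocols) and match only packets from processes running with the group "test".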
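The first-match behavior can be sketched in a few lines of Python. This is only a toy model of the four rules above, not how iptables is implemented: the packet fields ("proto", "dport", "uid", "gid", "dst") and the destination labels "instance-data" and "subnet" are hypothetical stand-ins for the real matches.

```python
# Each rule is (target, predicate); order matters, as in the chain listing.
RULES = [
    ("ACCEPT", lambda p: p["proto"] == "tcp" and p["dport"] == 8999
                         and p["uid"] == "dd-test-user"),
    ("DROP",   lambda p: p["proto"] == "tcp" and p["dport"] == 8999),
    ("DROP",   lambda p: p["dst"] == "instance-data" and p["gid"] == "test"),
    ("ACCEPT", lambda p: p["dst"] == "subnet" and p["gid"] == "test"),
]


def verdict(packet, rules=RULES, policy="POLICY"):
    """Return (rule number, target) of the first matching rule."""
    for number, (target, matches) in enumerate(rules, start=1):
        if matches(packet):
            return number, target
    return None, policy          # nothing matched: the chain policy applies


# A process in group "test" talking to port 443 inside the /21 network:
https = {"proto": "tcp", "dport": 443, "uid": "alice", "gid": "test", "dst": "subnet"}
# The same process talking to port 8999 (bctp):
bctp = dict(https, dport=8999)

print(verdict(https))
print(verdict(bctp))
```

In this model rule 4 is reached for the port 443 packet, while the port 8999 packet from the same group never gets past rule 2 unless it is owned by dd-test-user.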

How do I target the MSS value in a TCP packet using BPF

I am learning BPF and converting some iptables rules to BPF bytecode. I am primarily using the nfbpf_compile application to do this, rather than trying to write C or assembler. I am having a lot of luck, but the syntax of one rule is escaping me.
I'd like to drop packets that have the SYN flag set but are missing an MSS value. In iptables the MSS is targeted with --tcp-option 2. I know that MSS is TCP option 'kind' 2 and that the TCP options start at byte 20 of the TCP header, so when MSS is the first option its value sits at bytes 22-23. I am able to filter on the MSS value by using tcp[22:2]==$NUMBER in BPF syntax. However, what I want to do is target SYN packets where the MSS is missing entirely.
I have tried every variant of "null" I can think of but am having no luck.
Does anyone know the equivalent of iptables ! --tcp-option 2 in BPF syntax?
An example of something I have tried:
$ ./nfbpf_compile RAW 'tcp[22:2]==0x0' (I know this won't work..it's an example)
12,48 0 0 0,84 0 0 240,21 0 8 64,48 0 0 9,21 0 6 6,40 0 0 6,69 4 0 8191,177 0 0 0,72 0 0 22,21 0 1 0,6 0 0 65535,6 0 0 0
# iptables -I INPUT -m bpf --bytecode '12,48 0 0 0,84 0 0 240,21 0 8 64,48 0 0 9,21 0 6 6,40 0 0 6,69 4 0 8191,177 0 0 0,72 0 0 22,21 0 1 0,6 0 0 65535,6 0 0 0' -j DROP
TL;DR If you know there are only 0 or 1 TCP options, or if you know the MSS option is always the first option, then you can use the following filter:
tcp && (tcp[tcpflags] == tcp-syn) && ((((tcp[12] & 0xf0) >> 2) < 21) || tcp[20] != 2)
If you don't know this (there are several TCP options and the MSS option may be any of them), which is generally the case, I don't think it's possible to express a matching filter with nfbpf_compile's syntax. In that case, I would recommend writing a C program and loading it with -m bpf --object-pinned /path/to/pinned/bpf.
Let me explain the above filter first. You have two cases to match: 1) there is no TCP option or 2) the first TCP option is not the MSS:
tcp[12] & 0xf0 extracts the data offset field from the TCP header (the upper four bits of byte 12), i.e., the number of 32-bit words in the TCP header.
(tcp[12] & 0xf0) >> 2 converts that to bytes: shifting right by 4 would give the number of words, and multiplying by 4 shifts left by 2 again, which leaves a net right shift of 2.
If the TCP header is shorter than 21 bytes (that is, exactly the 20-byte minimum), then you know there are no TCP options.
tcp[20] != 2 checks that the Option-Kind field of the first TCP option (which starts at offset 20) is not 2, the Option-Kind for MSS.
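To make the byte arithmetic concrete, here is a minimal Python sketch that applies the same tests to a raw TCP header. It is an illustration only, not something nfbpf_compile produces; the helper names tcp_header and syn_without_leading_mss are made up for this example.

```python
import struct

TCP_SYN = 0x02


def tcp_header(flags, options=b""):
    """Build a TCP header for testing; options must be a multiple of 4 bytes."""
    assert len(options) % 4 == 0
    data_offset = (20 + len(options)) // 4          # header length in 32-bit words
    fixed = struct.pack("!HHIIBBHHH", 12345, 80, 0, 0,
                        data_offset << 4, flags, 65535, 0, 0)
    return fixed + options


def syn_without_leading_mss(tcp):
    """Mirror of: (tcp[tcpflags] == tcp-syn) && (hdrlen < 21 || tcp[20] != 2)."""
    if len(tcp) < 20 or tcp[13] != TCP_SYN:        # SYN set and no other flag
        return False
    header_len = (tcp[12] & 0xF0) >> 2              # data offset, in bytes
    if header_len < 21:                             # bare 20-byte header: no options
        return True
    return len(tcp) > 20 and tcp[20] != 2           # first option is not MSS


print(syn_without_leading_mss(tcp_header(TCP_SYN)))
print(syn_without_leading_mss(tcp_header(TCP_SYN, b"\x02\x04\x05\xb4")))
```

The first call has no options and matches; the second carries an MSS of 1460 as its first option and does not.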
Why is the general case harder to match? TCP options have a variable length (depending on their Option-Kind) and there is a variable, bounded number of TCP options. Say you want to extend the above filter to match on the second TCP option. You first need to know where that option starts; the first option has a variable length so this is not a fixed offset.
With cBPF (the BPF bytecode emitted by nfbpf_compile), you might be able to express that by storing the current option offset in the X register and then loading a byte into register A with the 2nd addressing mode (see the Linux documentation, BPF engine and instruction set). However, I do not think you can do this with the limited nfbpf_compile syntax (assuming it's the same syntax as tcpdump's).
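For comparison, the general case is easy to express in a language with loops. The following Python sketch walks the option list the way a C program would have to; it only illustrates the walk and is not a BPF program. The function name has_mss_option and the hdr helper are assumptions made for this example.

```python
def has_mss_option(tcp):
    """Walk the TCP options and report whether an MSS option (kind 2) exists."""
    header_len = (tcp[12] & 0xF0) >> 2      # data offset, in bytes
    end = min(header_len, len(tcp))
    offset = 20                             # options start after the fixed header
    while offset < end:
        kind = tcp[offset]
        if kind == 0:                       # End of Option List
            break
        if kind == 1:                       # No-Operation: a single byte, no length
            offset += 1
            continue
        if offset + 1 >= end:               # truncated option: stop walking
            break
        length = tcp[offset + 1]
        if length < 2:                      # malformed length: stop walking
            break
        if kind == 2:
            return True
        offset += length                    # skip over this variable-length option
    return False


def hdr(options=b""):
    """Minimal SYN header for the demo: only byte 12, byte 13 and the options matter."""
    return bytes(12) + bytes([((20 + len(options)) // 4) << 4, 0x02]) + bytes(6) + options


# NOP, NOP, Timestamps (kind 8, length 10), then MSS as the last option.
late_mss = b"\x01\x01\x08\x0a" + bytes(8) + b"\x02\x04\x05\xb4"
print(has_mss_option(hdr(late_mss)))
```

Here the MSS option sits behind two NOPs and a timestamp option, which is exactly the situation a fixed-offset filter such as tcp[20] != 2 cannot handle.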

Modem escape sequence (+++) passed as data

One question regarding the Hayes modem escape sequence.
First, to explain what is happening:
==> ATD 123\r\n
<== +CR: REL ASYNC\r\n
<== CONNECT 9600\r\n
After this point I have an online session. When I want to hang up, I do the following:
< no data 1.5 seconds >
==> +++ (no \r\n)
**+++ is received on destination side (why?)**
<== OK
< no data 1.5 seconds >
==> ATH\r\n
<== OK
Destination side gets NO CARRIER.
The problem for me is that the escape sequence is received as regular data on the destination side.
Does anyone have an idea what I should do? Some modem configuration tweak?
Thanks!
I will answer my own question.
I did not find a way to do that.
Instead, in order to hang up I use the DTR (Data Terminal Ready) signal. When DTR switches from active to inactive, the modem treats that as a hangup (if it is configured with AT&D2).
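A minimal Python sketch of the DTR approach, assuming a pyserial-style port object with a writable dtr attribute (pyserial's serial.Serial exposes one). The function name, the one-second hold time, and the FakePort class are assumptions for illustration; the fake port only lets the sketch run without hardware.

```python
import time


def hang_up_with_dtr(port, hold_seconds=1.0, sleep=time.sleep):
    """Drop DTR, keep it low for hold_seconds, then raise it again."""
    port.dtr = False      # active -> inactive: with AT&D2 the modem hangs up
    sleep(hold_seconds)   # hold time is a guess; check the modem's manual
    port.dtr = True       # re-assert DTR so the next call can be placed


class FakePort:
    """Hypothetical stand-in for a serial port; records every DTR change."""

    def __init__(self):
        self.history = []

    @property
    def dtr(self):
        return self.history[-1] if self.history else True

    @dtr.setter
    def dtr(self, value):
        self.history.append(value)


port = FakePort()
hang_up_with_dtr(port, hold_seconds=0.0)
print(port.history)
```

With real hardware the same function would be called with an open serial.Serial object instead of FakePort.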

How to set up RSS hash function in XL710 to receive IPv4 flow type?

In DPDK the ETH_RSS_IPV4 data flow is not activated by default for the Intel XL710 NIC. So, when you want to distribute packets among lcores, you have to select other IPv4 data flows that the XL710 supports, namely ETH_RSS_FRAG_IPV4, ETH_RSS_NONFRAG_IPV4_TCP, ETH_RSS_NONFRAG_IPV4_UDP, ETH_RSS_NONFRAG_IPV4_SCTP, and ETH_RSS_NONFRAG_IPV4_OTHER. However, you will face a silly problem when dealing with fragmented IP packets. If you go with the ETH_RSS_FRAG_IPV4 and ETH_RSS_NONFRAG_IPV4_TCP options, some fragmented packets of a connection will fall into a different queue than its non-fragmented packets, because fragments are hashed without L4 port numbers. If you exclude ETH_RSS_NONFRAG_IPV4_TCP, the ETH_RSS_FRAG_IPV4 hash function is not applied to non-fragmented packets, and those packets go to queue 0. No other combination of hash functions works either. So, what should we do?
The behavior of the XL710 is not compatible with the conventions in DPDK, so you must work directly with the API offered by the i40e driver in order to set up RSS for ETH_RSS_IPV4. As mentioned in the Intel® Ethernet Controller 710 Series Specification Update, page 18 (release Jan 2017):
Functions that require the Hash (RSS) filters on IPv4 packets should
set all IPv4 PCTYPEs in the PFQF_HENA / VFQF_HENA (PCTYPEs 31, 33…36)
Supported packet types (PCTYPE) are mentioned in Intel® Ethernet Controller 710 Series Datasheet pages 597 and 598 (release Jan 2017). You can see that there is no packet type defined for IPv4.
However, there is a solution. The trick is to modify the input set for all required flow types (or packet types). Let's try it with the testpmd tool, which DPDK provides in its app folder. After compiling DPDK and the app, run the testpmd application:
./app/test-pmd/testpmd -c ff -n 2 -w 0a:00.0 -w 0a:00.1 -- -i --rxq=4 --txq=4
We have two XL710s in our system. With the following commands you can configure them to handle the IPv4 data flow the way you want:
port config all rss all
set_hash_input_set 0 ipv4-tcp src-ipv4 select
set_hash_input_set 0 ipv4-tcp dst-ipv4 add
set_hash_input_set 0 ipv4-udp src-ipv4 select
set_hash_input_set 0 ipv4-udp dst-ipv4 add
set_hash_input_set 1 ipv4-tcp src-ipv4 select
set_hash_input_set 1 ipv4-tcp dst-ipv4 add
set_hash_input_set 1 ipv4-udp src-ipv4 select
set_hash_input_set 1 ipv4-udp dst-ipv4 add
set_hash_global_config 0 default ipv4-frag enable
set_hash_global_config 0 default ipv4-tcp enable
set_hash_global_config 0 default ipv4-udp enable
set_hash_global_config 1 default ipv4-frag enable
set_hash_global_config 1 default ipv4-tcp enable
set_hash_global_config 1 default ipv4-udp enable
These commands select the proper input set for the TCP and UDP flow types by removing the L4 ports from it. The set_hash_global_config command enables the symmetric hash if you need it. With the modified input set, the TCP flow type hashes just like the Frag IPv4 flow type, and as a result all packets belonging to the same connection go to the same lcore.
Note that the default input set for the Frag IPv4 and NonF IPv4, Other flow types is already IP4-S and IP4-D, so it does not need to be modified. Remember to modify the input set and the symmetric setting of every other IPv4 flow type you use.
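Why dropping the L4 ports from the input set helps can be shown with a toy model in Python. This is not the NIC's algorithm: zlib.crc32 stands in for the hardware hash, and the function name rx_queue and its parameters are made up for this sketch. The only point is that a fragment (no ports available) and a full TCP packet hash over the same fields once the ports are excluded.

```python
import zlib


def rx_queue(src_ip, dst_ip, src_port=None, dst_port=None,
             use_ports=True, num_queues=4):
    """Pick a queue from a hash over the selected input set (toy model)."""
    fields = [src_ip, dst_ip]
    # Fragments carry no usable L4 header, so ports are only hashed when
    # they are present AND the input set includes them.
    if use_ports and src_port is not None:
        fields += [str(src_port), str(dst_port)]
    return zlib.crc32("|".join(fields).encode()) % num_queues


src, dst = "10.0.0.1", "10.0.0.2"

# Input set reduced to source and destination address only:
full_packet = rx_queue(src, dst, 40000, 80, use_ports=False)
fragment = rx_queue(src, dst, use_ports=False)
print(full_packet == fragment)
```

With use_ports=True the two calls hash over different fields, so they may or may not land on the same queue, which is the inconsistency described above.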
You can find the API functions of those commands by looking at the source code of the testpmd application.

Centos 6.3 Server ignoring IGMP Queries

I am using a CentOS 6.3 server to subscribe to UDP multicast data, and I noticed that my server doesn't answer the IGMP queries sent by the switch it is connected to.
As a result, when I open my multicast socket I receive multicast data only until my IGMP subscription times out, since the server doesn't renew its subscription. (To ensure that the problem doesn't come from my own code, I am simply using smcroute to open the multicast subscriptions.)
I searched online for a while, and none of the tips I found helped me fix this problem.
Here is a screenshot of the IGMP traffic captured on any interface of my server:
http://img521.imageshack.us/img521/9953/capture10y.png
As we can see, my server first sends 2 IGMP joins, but a few minutes later, when the switch sends an IGMP query, nobody answers.
The IGMP version set for the interface concerned is V2:
[root@localhost ~]# cat /proc/net/igmp
Idx Device    : Count Querier   Group    Users Timer      Reporter
1   lo        :     0      V2
                                010000E0     1 0:00000000 0
2   eth0      :     5      V2
                                FB0000E0     1 0:00000000 1
                                010000E0     1 0:00000000 0
5   tap0      :     5      V3
                                FB0000E0     1 0:00000000 0
                                010000E0     1 0:00000000 0
7   eth1.371  :    13      V2
                                414000E0     1 0:00000000 1
                                404000E0     1 0:00000000 1
                                3F4000E0     1 0:00000000 1
                                504000E0     1 0:00000000 1
                                524000E0     1 0:00000000 1
                                494000E0     1 0:00000000 1
                                4A4000E0     1 0:00000000 1
                                4B4000E0     1 0:00000000 1
                                FB0000E0     1 0:00000000 0
                                010000E0     1 0:00000000 0
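The Group column is the multicast address printed as a hexadecimal integer. A small Python sketch for decoding it, assuming the output comes from a little-endian host such as x86 (the function name decode_igmp_group is made up for this example):

```python
import socket
import struct


def decode_igmp_group(hex_group):
    """Turn a Group value from /proc/net/igmp into dotted-quad notation."""
    # Assumption: little-endian host, so the bytes appear reversed in the hex.
    return socket.inet_ntoa(struct.pack("<I", int(hex_group, 16)))


for group in ("010000E0", "FB0000E0", "414000E0"):
    print(group, decode_igmp_group(group))
```

This decodes 010000E0 to 224.0.0.1 (all hosts), FB0000E0 to 224.0.0.251, and 414000E0 to 224.0.64.65.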
The rp_filter is disabled on this interface:
[root@localhost ~]# cat /proc/sys/net/ipv4/conf/eth1.371/rp_filter
0
Thanks a lot for any help you could give me.
Best,
Laurent
Try temporarily disabling iptables:
# service iptables stop
and see if it will help.
It might be coming from asymmetric route rules.
Do you have a route to reach 170.19.52.5 from your server?
(you can use route -n)