Is iperf2 latency a two-way or a one-way measurement?

iperf2 (version 2.0.9) reports latency in its output as shown below.
Is it a two-way or a one-way latency measurement?
Server listening on UDP port 5001 with pid 5167
Receiving 1470 byte datagrams
UDP buffer size: 208 KByte (default)
[ 3] local 192.168.1.102 port 5001 connected with 192.168.1.101 port 59592
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Latency avg/min/max/stdev PPS
[ 3] 0.00-1.00 sec 122 KBytes 1.00 Mbits/sec 0.063 ms 0/ 6254 (0%) 659.932/659.882/660.502/ 8.345 ms 6252 pps
[ 3] 1.00-2.00 sec 122 KBytes 1.00 Mbits/sec 0.020 ms 0/ 6250 (0%) 660.080/659.919/666.878/ 0.110 ms 6250 pps
[ 3] 2.00-3.00 sec 122 KBytes 1.00 Mbits/sec 0.020 ms 0/ 6250 (0%) 660.113/659.955/660.672/ 0.047 ms 6250 pps
[ 3] 3.00-4.00 sec 122 KBytes 1.00 Mbits/sec 0.022 ms 0/ 6250 (0%) 660.153/659.994/660.693/ 0.047 ms 6250 pps
[ 3] 4.00-5.00 sec 122 KBytes 1.00 Mbits/sec 0.021 ms 0/ 6250 (0%) 660.192/660.034/660.617/ 0.049 ms 6250 pps

It's one-way, which requires the clocks to be synchronized to a common reference. You may want to look into Precision Time Protocol (PTP). Also, tell your hosting provider that you want better clocks in their data centers: a GPS-disciplined atomic clock is quite accurate, and the signal is free.
There is a lot more work going on in iperf 2.0.14 related to TCP write-to-read latencies. Version 2.0.14 will require --trip-times on the client before any end-to-end or one-way latency measurements are presented; this way the user tells iperf that the systems' clocks are synchronized to an accuracy the user deems sufficient. We also produce a Little's law inP (bytes in progress) metric along with network power. See the man pages for more. The hope is to have iperf 2.0.14 released by early 2021.
[rjmcmahon@localhost iperf2-code]$ src/iperf -s -i 1
[ 4] local 192.168.1.10%enp2s0 port 5001 connected with 192.168.1.80 port 47420 (trip-times) (MSS=1448) (peer 2.0.14-alpha)
[ ID] Interval Transfer Bandwidth Reads Dist(bin=16.0K) Burst Latency avg/min/max/stdev (cnt/size) inP NetPwr
[ 4] 0.00-1.00 sec 1.09 GBytes 9.34 Gbits/sec 18733 2469:2552:2753:2456:2230:2272:1859:2142 2.988/ 0.971/ 3.668/ 0.370 ms (8908/131072) 3.34 MByte 390759.84
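A client invocation against a 2.0.14 server might look like the following sketch (flag names per the iperf 2.0.14 man page; the address is the one from the output above, and it assumes both hosts' clocks are already synchronized):

```shell
# --trip-times asserts that client and server clocks are synchronized,
# enabling one-way latency reporting on the server side;
# -e enables the enhanced output columns
iperf -c 192.168.1.10 -e --trip-times -t 10
```

Without --trip-times, 2.0.14 withholds the one-way latency columns rather than print numbers that depend on unsynchronized clocks.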
Note: For my testing during iperf 2 development, I have GPS-disciplined oven-controlled oscillators from Spectracom in my systems. These cost about $2.5K each and require a GPS signal.

Related

How to scale Cisco Joy capture speed beyond 5 Gbps

Currently I am replaying network traffic with tcpreplay at about 800 Mbps, but I want to scale beyond 5 Gbps.
I am running Joy on a server with 16 GB RAM and 8 cores.
Tcpreplay Output:
`Actual: 2427978 packets (2098973496 bytes) sent in 20.98 seconds
Rated: 100003501.6 Bps, 800.02 Mbps, 115678.59 pps
Flows: 49979 flows, 2381.11 fps, 2426216 flow packets, 1756 non-flow
Statistics for network device: vth0
Successful packets: 2427978
Failed packets: 0
Truncated packets: 0
Retried packets (ENOBUFS): 0
Retried packets (EAGAIN): 0`
Total Packets Captured: 2412876
I am running Joy with 4 threads, but even with 24 threads I do not see any significant change in capture speed.
Joy uses AF_PACKET with a zero-copy ring buffer, and Cisco Mercury uses the same mechanism to write packets, yet they claim Mercury can run at 40 Gbps on server-class hardware. Does anyone have any suggestions?
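Before blaming the AF_PACKET path itself, it may be worth checking the NIC-side knobs; a hypothetical checklist (interface name taken from the output above; whether each command is supported depends on the driver):

```shell
ethtool -g vth0           # show current RX ring sizes
ethtool -G vth0 rx 4096   # enlarge the RX ring, if the driver allows it
ethtool -l vth0           # show RSS channel count; more queues help threads scale
```

If all capture threads are fed from a single NIC queue, adding threads will not increase throughput, which would match the symptom described.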

WSL2 I/O speeds on the Linux filesystem are slow

Trying out WSL2 for the first time. Running Ubuntu 18.04 on a Dell Latitude 9510 with an SSD. I noticed build speeds of a React project were brutally slow. Per all the articles on the web, I'm running the project out of ~ and not the Windows mount. I ran a benchmark using sysbench --test=fileio --file-test-mode=seqwr run in ~ and got:
File operations:
reads/s: 0.00
writes/s: 3009.34
fsyncs/s: 3841.15
Throughput:
read, MiB/s: 0.00
written, MiB/s: 47.02
General statistics:
total time: 10.0002s
total number of events: 68520
Latency (ms):
min: 0.01
avg: 0.14
max: 22.55
95th percentile: 0.31
sum: 9927.40
Threads fairness:
events (avg/stddev): 68520.0000/0.00
execution time (avg/stddev): 9.9274/0.00
If I'm reading this correctly, that wrote about 47 MiB/s. The same test on my Mac mini gave 942 MiB/s. Is this normal? It seems like Linux I/O speeds on WSL2 are unusably slow. Any thoughts on ways to speed this up?
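The same thing can be sanity-checked with a raw dd write (a sketch; the file name is arbitrary, and fdatasync forces the data to disk before dd reports throughput):

```shell
# Sequential write inside the WSL2 ext4 filesystem; repeat with the
# output file under /mnt/c to compare against the (much slower) 9p
# Windows mount
dd if=/dev/zero of="$HOME/ddtest.bin" bs=1M count=64 conv=fdatasync
rm -f "$HOME/ddtest.bin"
```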
---edit---
Not sure if this is a fair comparison, but the output of winsat disk -drive c on the same machine from the Windows side. Smoking fast:
> Dshow Video Encode Time 0.00000 s
> Dshow Video Decode Time 0.00000 s
> Media Foundation Decode Time 0.00000 s
> Disk Random 16.0 Read 719.55 MB/s 8.5
> Disk Sequential 64.0 Read 1940.39 MB/s 9.0
> Disk Sequential 64.0 Write 1239.84 MB/s 8.6
> Average Read Time with Sequential Writes 0.077 ms 8.8
> Latency: 95th Percentile 0.219 ms 8.9
> Latency: Maximum 2.561 ms 8.7
> Average Read Time with Random Writes 0.080 ms 8.9
> Total Run Time 00:00:07.55
---edit 2---
Windows version: Windows 10 Pro, Version 20H2 Build 19042
Late answer, but I had the same issue and wanted to post my solution for anyone who has the problem:
Windows Defender seems to destroy read speeds in WSL. I added the entire rootfs folder as an exclusion. If you're comfortable turning off Windows Defender entirely, I recommend that as well. Any antivirus probably has similar issues, so adding the WSL directories as an exclusion is probably your best bet.
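Adding the exclusion can be scripted; a sketch in PowerShell, run as Administrator (the package folder name is an assumption — it varies per distro and version, so check under %LOCALAPPDATA%\Packages first):

```shell
# PowerShell (admin): exclude the WSL distro's rootfs from Defender scans.
# The Ubuntu 18.04 package name below is illustrative only.
Add-MpPreference -ExclusionPath "$env:LOCALAPPDATA\Packages\CanonicalGroupLimited.Ubuntu18.04onWindows_79rhkp1fndgsc\LocalState"
```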

How to enable RNDIS gadget on u-boot

I am using an stm32mp157c-dk2 board and I added the RNDIS gadget to the config file. When I run U-Boot on the board I get "No ethernet found". This is the boot log:
U-Boot SPL 2018.11-stm32mp-r2.1-00026-g161ca183f1-dirty (Jan 31 2020 - 12:34:38 +0200)
Model: STMicroelectronics STM32MP157C-DK2 Discovery Board
RAM: DDR3-1066/888 bin G 1x4Gb 533MHz v1.41
Trying to boot from MMC1
U-Boot 2018.11-stm32mp-r2.1-00026-g161ca183f1-dirty (Jan 31 2020 - 12:34:38 +0200)
CPU: STM32MP157CAC Rev.B
Model: STMicroelectronics STM32MP157C-DK2 Discovery Board
Board: stm32mp1 in basic mode (st,stm32mp157c-dk2)
Board: MB1272 Var2 Rev.C-01
DRAM: 512 MiB
Clocks:
- MPU : 650 MHz
- MCU : 208.878 MHz
- AXI : 266.500 MHz
- PER : 24 MHz
- DDR : 533 MHz
*******************************************
* WARNING 500mA power supply detected *
* Current too low, use a 3A power supply! *
*******************************************
NAND: 0 MiB
MMC: STM32 SDMMC2: 0, STM32 SDMMC2: 1
Loading Environment from EXT4... OK
In: serial
Out: serial
Err: serial
Net: No ethernet found.
Hit any key to stop autoboot: 0
Boot over mmc0!
Do you have any suggestions? Thanks!
That big warning about an undersized power supply is the first thing to look at. A lack of power tends to mean that not all blocks of the SoC can be brought up or made available.
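For reference, enabling the gadget generally involves Kconfig symbols along these lines (a sketch — the exact set depends on the U-Boot version, and these names are from mainline U-Boot, so they may differ in the ST tree):

```shell
# Hypothetical defconfig fragment for the USB ethernet (RNDIS) gadget
CONFIG_USB_GADGET=y
CONFIG_USB_ETHER=y
CONFIG_USB_ETH_RNDIS=y
```

Even with those set, "Net: No ethernet found" at boot can simply mean no ethernet driver probed at that point; resolving the power warning first is the safer order.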

Optimizing Disk I/O when running Redis and PostgreSQL on a single machine

Background:
I have a live Django app that uses 4 Redis instances.
The first two are big: their backups amount to ~2 GB and ~4.4 GB respectively. The other two are small: ~85 MB and ~15 MB.
redis-server --version yields Redis server v=4.0.2 sha=00000000:0 malloc=jemalloc-4.0.3 bits=64 build=401ce53d7b0383ca.
The problem:
It's a busy server, also running PostgreSQL 9.6.5. The PG data and the Redis backups both live on the secondary drive, xvdb.
I've noticed that whenever my big Redis instances start backing up, disk I/O naturally spikes and PostgreSQL commit statements start piling up in the slow log. Behold:
21:49:26.171 UTC [44861] ubuntu@myapp LOG: duration: 3063.262 ms statement: COMMIT
21:49:26.171 UTC [44890] ubuntu@myapp LOG: duration: 748.307 ms statement: COMMIT
21:49:26.171 UTC [44882] ubuntu@myapp LOG: duration: 1497.461 ms statement: COMMIT
21:49:26.171 UTC [44893] ubuntu@myapp LOG: duration: 655.063 ms statement: COMMIT
21:49:26.171 UTC [44894] ubuntu@myapp LOG: duration: 559.743 ms statement: COMMIT
21:49:26.172 UTC [44883] ubuntu@myapp LOG: duration: 1415.733 ms statement: COMMIT
As a consequence, this is what my PostgreSQL commit durations look like every day:
The question:
Is there anything I can do on the Redis side to help smooth out these spikes? I'd like Redis and PostgreSQL to live in as much harmony as possible on a single machine.
More information:
Ask for more information if you need it.
Machine specs:
AWS EC2 m4.4xlarge (16 cores, 64GB RAM)
Elastic Block Store gp2 volumes (105 IOPS, burst up to 3,000 IOPS)
The following config exists in the Append Only Mode section of my Redis conf files:
appendonly no
appendfilename "appendonly.aof"
# appendfsync always
appendfsync everysec
# appendfsync no
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble no
Typical iostat -xmt 3 values are:
10/15/2017 08:28:35 PM
avg-cpu: %user %nice %system %iowait %steal %idle
10.44 0.00 0.93 0.15 0.06 88.43
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 2.00 0.00 0.04 38.67 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 2.67 0.00 44.67 0.00 0.41 18.99 0.13 2.81 0.00 2.81 1.07 4.80
Compare that to the same around the time slow commits are logged:
10/15/2017 10:18:11 PM
avg-cpu: %user %nice %system %iowait %steal %idle
8.16 0.00 0.65 11.90 0.04 79.24
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 4.00 0.00 1.00 0.00 0.02 48.00 0.00 1.33 0.00 1.33 1.33 0.13
xvdb 0.00 0.00 1.67 1312.00 0.01 163.50 254.90 142.56 107.64 25.60 107.75 0.76 100.00
The first Redis instance has the following snapshotting config:
save 7200 1
#save 300 10
#save 60 10000
The second Redis instance has the following snapshotting config:
save 21600 10
#save 300 10
#save 60 10000
I can suggest one solution: Docker. Docker can limit the OS resources a particular container uses; in your case, the resource is disk I/O.
The --device-read-bps flag limits the read rate (bytes per second)
from a device. For example, this command creates a container and
limits its read rate from /dev/sda to 1 MB per second:
$ docker run -it --device-read-bps /dev/sda:1mb ubuntu
Docker also has a --device-read-iops flag, which limits the read rate (I/O operations per second) from a device, along with matching --device-write-bps and --device-write-iops flags for writes.
Your image here would be Redis, but it wouldn't hurt to move Postgres into Docker as well. This lets you cap disk I/O on the Redis containers so that Postgres gets more of it.
This should work for you, though the infrastructure changes might be a little painful.
You can refer to this SO post on limiting Docker I/O speed.
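Since the spikes come from Redis background saves, the write-side limits are probably the relevant ones here; a hypothetical invocation (device path, limits, and volume layout are illustrative only):

```shell
# Cap the Redis container's writes to the backup device so background
# saves cannot saturate the disk that PostgreSQL shares
docker run -d --name redis-capped \
  --device-write-bps /dev/xvdb:50mb \
  --device-write-iops /dev/xvdb:500 \
  -v /data/redis:/data \
  redis:4.0
```

Choosing the cap is a trade-off: set it too low and the RDB save itself takes much longer, stretching out the fork's copy-on-write window.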

Iperf: Transfer of data

I have a question about how iperf works; I am using the command below.
What I don't understand is how 6945 datagrams can be sent: if 9.66 MBytes are transferred, then 9.66M / 1458 = 6625 datagrams should be transferred, by my understanding.
If 10.125 MBytes (2.7 Mbps * 30 sec) had been transferred, then 6944 datagrams would have been sent (excluding UDP and other headers).
Please clarify if someone knows.
(I have also run Wireshark on both client and server, and there the number of packets is greater than the number reported by iperf.)
umar#umar-VPCEB11FM:~$ iperf -t 30 -c 192.168.3.181 -u -b 2.7m -l 1458
------------------------------------------------------------
Client connecting to 192.168.3.181, UDP port 5001
Sending 1458 byte datagrams
UDP buffer size: 208 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.3.175 port 47241 connected with 192.168.3.181 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-30.0 sec 9.66 MBytes 2.70 Mbits/sec
[ 3] Sent 6946 datagrams
[ 3] Server Report:
[ 3] 0.0-92318.4 sec 9.66 MBytes 878 bits/sec 0.760 ms 0/ 6945 (0%)
iperf uses base 2 for M and K, meaning that K = 1024 and M = 1024*1024.
When you do the math that way, you get 9.66 MB / 1458 B per datagram = 6947 datagrams, which is within precision error (the transfer is reported with a resolution of 0.01 MB, i.e. a rounding error of up to 0.005 MB ≈ 3.6 datagrams).
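The arithmetic can be checked directly (a one-liner sketch; 1 MB here is 1024 * 1024 bytes, as iperf uses it):

```shell
# 9.66 MiB divided into 1458-byte datagrams
awk 'BEGIN { printf "%.1f\n", 9.66 * 1024 * 1024 / 1458 }'
# prints 6947.4, i.e. ~6947 datagrams -- consistent with the 6946 sent
```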