tftp retry timeout exceeded - tftp

My problem is that the retry count is exceeded when I download a kernel image to an Econa processor board (Econa is an ARM-based processor) via TFTP, as shown below:
CNS3000 # tftp 0x4000000 bootpImage.cns3420.uclibc
MAC PORT 0 : Initialize bcm53115M
MAC PORT 2 : Initialize RTL8211
TFTP from server 192.168.0.219; our IP address is 192.168.0.112
Filename 'bootpImage.cns3420.uclibc'.
Load address: 0x4000000
Loading: T T T T T T T T T T
Retry count exceeded; starting again
The following points may help in finding the cause of this error.
The ping response is OK:
CNS3000 # ping 192.168.0.219
MAC PORT 0 : Initialize bcm53115M
MAC PORT 2 : Initialize RTL8211
host 192.168.0.219 is alive
To verify that the TFTP server is running, I tried the following, and it seems to be working. I placed a small file in /tftpboot:
# echo "Hello, embedded world" > /tftpboot/hello.txt
Then I fetched it via localhost:
# tftp localhost
tftp> get hello.txt
Received 23 bytes in 0.1 seconds
tftp> quit
Please note that there is no firewall or SELinux on my machine.
Please verify that the locations of these files are OK. I have placed the kernel image file bootpImage.cns3420.uclibc in /tftpboot. The TFTP service file is located in /etc/xinetd.d/tftp.
My TFTP service file is:
service tftp
{
socket_type =dgram
protocol=udp
wait=yes
user=root
server=/usr/sbin/in.tftpd
server_args=-s /tftpboot -b 512
disable=no
per_source=11
cps=100 2
flags=ipv4
}
The printenv output in U-Boot is:
CNS3000 # printenv
bootargs=root=/dev/mtdblock0 mem=256M console=ttyS0
baudrate=38400
ethaddr=00:53:43:4F:54:54
netmask=255.255.0.0
tftp_bsize=512
udp_frag_size=512
mmc_init=mmcinit
loading=fatload mmc 0 0x4000000 bootpimage-82511
running=go 0x4000000
bootcmd=run mmc_init;run loading;run running
serverip=192.168.0.219
ipaddr=192.168.0.112
bootdelay=5
port=1
bootfile=/tftpboot/bootpImage.cns3420.uclibcl
stdin=serial
stdout=serial
stderr=serial
verify=n
Environment size: 437/4092 bytes
Regards
Waqas

Loading: T T T T T T T T T T
means there is no transfer at all. This can be caused by a wrong interface setting, e.g.
U-Boot is configured for 100 Mbit full duplex and you try to connect via half duplex or 10 Mbit (or some mix of these). Another point is the MTU size, which should be 1500 (U-Boot cannot handle packet fragmentation).
Hint for Windows/VMware users:
TFTP timeouts from U-Boot can be caused by Windows IP forwarding.
1) If you have a home network: switch it off.
2) If you are running the Routing and Remote Access service: shut down the service.
3) Check the registry for IP forwarding:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\IPEnableRouter
set value to 0 (and maybe reboot)
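The localhost test in the question only shows that the daemon answers on the loopback interface. A rough way to exercise the real network path is to send a bare read request from a second machine on the same LAN. The following is a minimal sketch in Python, not a full client: the function names are made up for this example, the packet layout follows RFC 1350, and it only fetches the first 512-byte block.

```python
import socket
import struct

# TFTP opcodes from RFC 1350
OP_RRQ, OP_DATA, OP_ACK, OP_ERROR = 1, 3, 4, 5


def build_rrq(filename, mode="octet"):
    """Build a read request: opcode, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def parse_packet(packet):
    """Split a DATA or ERROR packet into (opcode, block or error code, payload)."""
    opcode, number = struct.unpack("!HH", packet[:4])
    return opcode, number, packet[4:]


def fetch_first_block(server, filename, timeout=5.0):
    """Request a file and return its first data block.

    Raises socket.timeout if the server never answers, which is the
    same situation U-Boot reports as 'T T T ...'.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        # The server replies from a new ephemeral port, so keep the peer address.
        packet, peer = sock.recvfrom(4 + 512)
        opcode, number, payload = parse_packet(packet)
        if opcode == OP_ERROR:
            message = payload.rstrip(b"\0").decode("ascii", "replace")
            raise RuntimeError("TFTP error %d: %s" % (number, message))
        if opcode != OP_DATA:
            raise RuntimeError("unexpected opcode %d" % opcode)
        # Acknowledge the block so the server does not keep retransmitting it.
        sock.sendto(struct.pack("!HH", OP_ACK, number), peer)
        return payload
```

Calling fetch_first_block("192.168.0.219", "hello.txt") from another host on the LAN should return the file contents. If it times out there as well, the problem is more likely on the server or network side than on the board.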

ASP.NET Core SignalR websocket connection limit

I am load testing a SignalR (ASP.NET Core) application hosted on Windows Server 2016 Standard, using Microsoft.AspNetCore.SignalR.Client.
.NET Core Hosting 2.1.1 is installed.
I cannot create more than about 3000 (2950-3050) connections.
I have already tried the recommendations described here:
How to configure concurrency in .NET Core Web API?
Limiting performance factors of WebSocket in ASP.NET 4.5?
Set limit concurrent connections for websocket on iis 8
Added limits to UseKestrel (this seems to work if I set the values to 100 or 1000):
var host = new WebHostBuilder()
.UseKestrel(options =>
{
options.Limits.MaxConcurrentConnections = 50000;
options.Limits.MaxConcurrentUpgradedConnections = 50000;
})
Changed all aspnet.config files by adding this:
<system.web>
<applicationPool maxConcurrentRequestsPerCPU="50000" />
</system.web>
Executed this command:
cd %windir%\System32\inetsrv\
appcmd.exe set config /section:system.webserver/serverRuntime /appConcurrentRequestLimit:50000
Added performance counters for Web Service\Current Connections and Maximum Connections. Maximum Connections rises to 3300 and stops.
There are no exceptions in the server logs, but I suspect there is some restriction in the system.
The server IIS logs contain only this:
GET /messageshub id=A_3x1sH9kHM1Rc3oPSgP6w 80 - 172.20.192.11 - - 404 0 0 3
The client exceptions are basically the following:
System.Net.Http.HttpRequestException: Error while copying content to a
stream. ---> System.IO.IOException: Unable to read data from the
transport connection: An existing connection was forcibly closed by
the remote host.
On Windows you may have a dynamic port assignment issue.
By default, Windows has 5000 port numbers available to assign to TCP connections, and 1024 of them are reserved for the OS itself, which leaves 3977 ports free to be assigned.
In your case the number is 3300, as you mentioned, but it is possible that 3300 of the connections are established and 677 of them are in TIME_WAIT.
In any case I recommend using
netstat -an | find /c "ESTABLISHED"
netstat -an | find /c "TIME_WAIT"
netstat -an | find /c "CLOSE_WAIT"
to find the number of established, TIME_WAIT, and CLOSE_WAIT connections at the time you received the IO exception. If the total is close to 5000, add this to your registry, reboot, and test again:
[HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters]
MaxUserPort = 65534 (Default = 5000, Max = 65534)
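If you would rather capture the netstat output once and count all states in one pass, something like the following could work. It is only a sketch: the function name is made up for this example, and it assumes the usual netstat -an layout where the state is the last column of each TCP line.

```python
from collections import Counter

# States that matter when checking for ephemeral port exhaustion
STATES = ("ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT")


def count_tcp_states(netstat_output):
    """Count TCP connections per state in captured 'netstat -an' text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP lines carry a state; UDP lines and headers are skipped.
        if fields and fields[0].upper().startswith("TCP") and fields[-1] in STATES:
            counts[fields[-1]] += 1
    return counts
```

Save the output with netstat -an > netstat.txt at the moment the exception occurs, then pass the file contents to count_tcp_states and add up the three numbers.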

How to solve: UDP send of xxx bytes failed with error 11 in Ubuntu?

UDP send of XXXX bytes failed with error 11
I am running a WebRTC streaming app on Ubuntu 16.04.
It streams video and audio from a Logitech HD Webcam C930e within an Electronjs desktop app.
It all works fine and runs smoothly on my other machine, a MacBook Pro. But on my Ubuntu machine I receive errors 10-20 seconds after the peer connection is established:
[2743:0513/193817.691636:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1019 bytes failed with error 11
[2743:0513/193817.691775:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1020 bytes failed with error 11
[2743:0513/193817.696615:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1020 bytes failed with error 11
[2743:0513/193817.696777:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1020 bytes failed with error 11
[2743:0513/193817.712369:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1029 bytes failed with error 11
[2743:0513/193817.712952:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1030 bytes failed with error 11
[2743:0513/193817.713086:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1030 bytes failed with error 11
[2743:0513/193817.717713:ERROR:stunport.cc(282)] Jingle:Port[0xa5faa3df800:audio:1:0:local:Net[wlx0013ef503b67:192.168.0.x/24:Wifi]]: UDP send of 1030 bytes failed with error 11
By the way, if I do NOT stream audio but video only, I get the same error, only with "video" instead of "audio" in the log lines.
Somewhere in between those lines I also got one line that says:
[3441:0513/195919.377887:ERROR:stunport.cc(506)] sendto: [0x0000000b] Resource temporarily unavailable
I also looked into sysctl.conf and increased the values there. My current sysctl.conf looks like this:
fs.file-max=1048576
fs.inotify.max_user_instances=1048576
fs.inotify.max_user_watches=1048576
fs.nr_open=1048576
net.core.netdev_max_backlog=1048576
net.core.rmem_max=16777216
net.core.somaxconn=65535
net.core.wmem_max=16777216
net.ipv4.tcp_congestion_control=htcp
net.ipv4.ip_local_port_range=1024 65535
net.ipv4.tcp_fin_timeout=5
net.ipv4.tcp_max_orphans=1048576
net.ipv4.tcp_max_syn_backlog=20480
net.ipv4.tcp_max_tw_buckets=400000
net.ipv4.tcp_no_metrics_save=1
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_syn_retries=2
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_wmem=4096 65535 16777216
vm.max_map_count=1048576
vm.min_free_kbytes=65535
vm.overcommit_memory=1
vm.swappiness=0
vm.vfs_cache_pressure=50
as suggested here: https://gist.github.com/cdgraff/7920db287988463aafd7ea09eef6f9f0
It does not seem to help. I am still getting these errors and I experience lag on the other side.
Additional info: on Ubuntu the Electronjs app connects to a Heroku server (Node.js), and the other side of the peer connection (Chrome browser) also connects to it. The Heroku server acts as the handshaking server to establish the WebRTC connection. Both have this configuration:
{'urls': 'stun:stun1.l.google.com:19302'},
{'urls': 'stun:stun2.l.google.com:19302'},
and also an additional TURN server from numb.viagenie.ca.
The connection is established, and within the first 10 seconds the quality is very high and there is no lag at all. But then after 10-20 seconds there is lag, and on the Ubuntu console I get these UDP errors.
The PC that Ubuntu is running on:
PROCESSOR / CHIPSET:
CPU Intel Core i3 (2nd Gen) 2310M / 2.1 GHz
Number of Cores: Dual-Core
Cache: 3 MB
64-bit Computing: Yes
Chipset Type: Mobile Intel HM65 Express
RAM:
Memory Speed: 1333 MHz
Memory Specification Compliance: PC3-10600
Technology: DDR3 SDRAM
Installed Size: 4 GB
Rated Memory Speed: 1333 MHz
Graphics
Graphics Processor Intel HD Graphics 3000
Could anyone please give me some hints or anything that could solve this problem?
Thank you
==============EDIT=============
Somewhere in my very large strace log I found these two lines:
7671 sendmsg(17, {msg_name(0)=NULL, msg_iov(1)=[{"CHILD_PING\0", 11}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 11
7661 <... recvmsg resumed> {msg_name(0)=NULL, msg_iov(1)=[{"CHILD_PING\0", 12}], msg_controllen=32, [{cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS, {pid=7671, uid=0, gid=0}}], msg_flags=0}, 0) = 11
On top of that, somewhere near when the error happens (at the end of the log file, just before I quit the application) I see in the log file the following:
https://gist.github.com/Mcdane/2342d26923e554483237faf02cc7cfad
First, to get an impression of what is happening, I'd look with strace. Start your application with
strace -e network -o log.strace -f YOUR_APPLICATION
If your application looks for another running process to hand the work to, start it with parameters so that it doesn't do that. For instance, for Chrome, pass in a --user-data-dir value that is different from your default.
Look for EAGAIN in the output file log.strace afterwards (strace prints a failed call as = -1 EAGAIN (Resource temporarily unavailable); a plain = 11 is just a call that returned 11 bytes), and look at what happened before and after. This will give you a rough picture of what is happening, and you can exclude silly mistakes like sendtos to 0.0.0.0 or so. (For this reason, this is also very important information to include in a Stack Overflow question, for instance by uploading the output to a gist.)
It may also be helpful to use Wireshark or another packet capture program to get a rough overview of what is being sent.
Assuming you can confirm with strace that a valid send call is taking place, you can then further analyze the error conditions.
Error 11 is EAGAIN. The documentation of send says when this error is supposed to happen:
EAGAIN (...) The socket is marked nonblocking and the requested operation would block. (...)
EAGAIN (Internet domain datagram sockets) The socket referred to by
sockfd had not previously been bound to an address and, upon
attempting to bind it to an ephemeral port, it was determined that all
port numbers in the ephemeral port range are currently in use. See
the discussion of /proc/sys/net/ipv4/ip_local_port_range in
ip(7).
Both conditions could apply.
The first will be obvious by the strace log if you trace the creation of the socket involved.
To exclude the second, you can run netstat -una (or, if you want to know the programs involved, sudo netstat -unap) to see which ports are open (if you want Stack Overflow users to look into it, post the output on gist or similar and link to it here). Your port range net.ipv4.ip_local_port_range=1024 65535 is not the standard 32768 60999; it looks like you have already attempted to do something about a shortage of port numbers. It would help to trace back why you changed that parameter and what convinced you to do so.
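To make the first condition concrete: on a non-blocking UDP socket, EAGAIN simply means the kernel send buffer is full right now, and the usual reaction is to wait until the socket is writable and try again. The sketch below illustrates that pattern in Python. It is not what Chromium's WebRTC stack does internally, and the function name, retry count, and wait time are assumptions made for the example.

```python
import select
import socket


def send_datagram(sock, payload, address, wait_seconds=0.05, attempts=3):
    """Send one datagram on a non-blocking UDP socket.

    Returns True once the datagram was handed to the kernel, or False
    if the send buffer stayed full for every attempt.
    """
    for _ in range(attempts):
        try:
            sock.sendto(payload, address)
            return True
        except BlockingIOError:
            # Python raises BlockingIOError for EAGAIN/EWOULDBLOCK.
            # Wait until the kernel reports the socket as writable again.
            select.select([], [sock], [], wait_seconds)
    return False
```

If a real-time sender hits this path all the time, as in the log above, the data is being produced faster than the interface (here a Wi-Fi adapter) can send it, so the next thing to look at is the link and the send rate rather than the socket code.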

Aerospike heartbeat configuration for single server, error "Unable to find any suitable network device for node ID"

I want to run an Aerospike server in single-server mode.
I currently have this configuration:
service {
paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
service-threads 4
transaction-queues 4
transaction-threads-per-queue 4
proto-fd-max 15000
}
logging {
console {
context any info
}
}
network {
service {
address 127.0.0.1
port 3000
}
heartbeat {
mode multicast
multicast-group 239.1.99.222
port 9918
# To use unicast-mesh heartbeats, remove the 3 lines above, and see
# aerospike_mesh.conf for alternative.
interval 150
timeout 10
}
fabric {
port 3001
}
info {
port 3003
}
}
namespace test {
replication-factor 1
memory-size 20M
default-ttl 1d # 1 day, use 0 to never expire/evict.
storage-engine memory
}
When I try to start the server, I get this error in the log:
"Unable to find any suitable network device for node ID"
I don't want the server to be reachable from the internet.
How can I achieve this and fix the issue?
The node ID is generated using the MAC address of an interface on the host.
https://github.com/aerospike/aerospike-server/blob/master/cf/src/socket.c#L2470
If you don't have any of the default interface names that Aerospike is aware of, you might get this error.
To fix this problem, you can specify your interface name.
http://www.aerospike.com/docs/operations/troubleshoot/startup#problem-with-network-interface
To avoid exposing your Aerospike node on the internet, you can bind it to localhost only or to a private interface only, or use other network tools or devices, such as a firewall or ACL, to keep the server port from being exposed. The best way is to ensure that the server hosting Aerospike is not exposed to the internet at all. If that is not doable, use a firewall to restrict access to the Aerospike port to the IP addresses of your Aerospike clients. You can also use the database credentials available in the Enterprise Edition.
http://www.aerospike.com/docs/guide/security.html
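Before specifying an interface name, it helps to see which names the host actually has and whether each one has a MAC address. A small sketch for that, assuming a Linux host (the /sys/class/net path is Linux-specific, and the function name is made up for this example):

```python
import socket


def interfaces_with_mac():
    """Map each network interface name to its MAC address, or None if unreadable."""
    result = {}
    for _index, name in socket.if_nameindex():
        try:
            # Linux exposes the hardware address of each interface here.
            with open("/sys/class/net/%s/address" % name) as handle:
                result[name] = handle.read().strip()
        except OSError:
            result[name] = None
    return result
```

An interface that shows 00:00:00:00:00:00 (typically lo) is of no use for deriving a node ID, so pick one that reports a real hardware address.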

Nagios host notifications not sending via email or logging

I am re-doing our nagios infrastructure with puppet but I am currently stopped at a seemingly simple problem (most likely a config issue).
Using puppet, I spit out some basic nagios config files on disk. Nagios reloads fine and everything looks okay in the UI but, when I mark a host down, it does not send a notification.
nagios.log shows:
[1470699491] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;divcont01;1;test notification
[1470699491] PASSIVE HOST CHECK: divcont01;1;test notification
[1470699491] HOST ALERT: divcont01;DOWN;HARD;1;test notification
In production (where I have changed nothing), I see in nagios.log (after marking a host down in ui):
[1470678186] EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;PALTL12;1;test ey
[1470678187] PASSIVE HOST CHECK: PALTL12;1;test ey
[1470678187] HOST ALERT: PALTL12;DOWN;HARD;1;test ey
[1470678187] HOST NOTIFICATION: pal_infra;PALTL12;DOWN;host-notify-by-pom;test ey
[1470678187] HOST NOTIFICATION: pal_infra;PALTL12;DOWN;host-notify-by-email;test ey
[1470678192] HOST ALERT: PALTL12;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.81 ms
[1470678192] HOST NOTIFICATION: pal_infra;PALTL12;UP;host-notify-by-pom;PING OK - Packet loss = 0%, RTA = 0.81 ms
[1470678192] HOST NOTIFICATION: pal_infra;PALTL12;UP;host-notify-by-email;PING OK - Packet loss = 0%, RTA = 0.81 ms
As seen in the logs, there is a HOST NOTIFICATION logged and sent directly after the HOST ALERT in prod. I have been exhaustively comparing config files today and I cannot find a reason why the new config stops short of the notification.
I have verified that notifications are enabled at the top level. I have verified that email can be sent from this box (though, I am using the logs to verify functionality, not email). I have also tried multiple other google suggestions (and will continue my search too).
Relevant config details below. Please pardon the verbosity of my configuration and lackluster stack-overflow formatting. Thank you in advance.
hosts/divcont01.cfg:
define host {
address snip
host_name divcont01
use generic-host-puppetized
}
host-templates/generic-host-puppetized.cfg:
define host {
check_command check-host-alive
check_interval 1
contact_groups generic-contactgroup
checks_enabled 1
event_handler_enabled 0
flap_detection_enabled 0
name generic-host-puppetized
hostgroups +generic-host-puppetized
max_check_attempts 4
notification_interval 4
notification_options d,u,r
notification_period 24x7
notifications_enabled 1
process_perf_data 0
register 0
retain_nonstatus_information 1
retain_status_information 1
}
hostgroups/generic-host-puppetized.cfg:
define hostgroup {
hostgroup_name generic-host-puppetized
}
contactgroups/generic-contactgroup.cfg:
define contactgroup {
contactgroup_name generic-contactgroup
members generic-puppetized-contact
}
contacts/generic-puppetized-contact.cfg:
define contact {
use generic-contact
contact_name generic-puppetized-contact
email <my email>
}
objects/templates.cfg (generic-contact config only):
define contact{
use my email
name generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
host_notification_commands generic-puppetized-contact-host-notify-by-email-low
service_notification_commands notify-by-email,service-notify-by-pom
service_notification_options u,c,r,f ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,r,f ; send notifications for all host states, flapping events, and scheduled downtime events
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
}
commands/generic-puppetized-contact-host-notify-by-email-low.cfg:
define command {
command_line /etc/nagios/global/scripts/nagios-mailx.sh -t my email -s "** notification Host Alert: hostname is hoststate **" -m "***** Nagios ***** Notification Type: notification type Host: host State: hoststate Address: address Info: output Date/Time: date"
command_name generic-puppetized-contact-host-notify-by-email-low
}
Figured it out...I was building my system within another pre-existing system (dangerous, I know) and my contacts were actually pointing to a generic-contact that had its notifications disabled.
Whoops :)
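For anyone hitting the same thing: a quick scan of all the config files for contact definitions with notifications switched off would have found this sooner. The following is a rough sketch in Python, with made-up function names. It assumes simple define blocks with no "}" inside directive values, and it does not resolve "use" inheritance, so you still have to follow the template chain by hand.

```python
import re

# Matches 'define <type> { ... }' blocks in Nagios object config text
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)
FLAGS = ("host_notifications_enabled", "service_notifications_enabled")


def parse_objects(text):
    """Return a list of (object type, {directive: value}) pairs."""
    objects = []
    for obj_type, body in DEFINE_RE.findall(text):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop trailing ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1] if len(parts) > 1 else ""
        objects.append((obj_type, directives))
    return objects


def contacts_with_notifications_disabled(text):
    """Map each contact or contact template to the notification flags set to 0."""
    disabled = {}
    for obj_type, directives in parse_objects(text):
        if obj_type != "contact":
            continue
        off = [flag for flag in FLAGS if directives.get(flag) == "0"]
        if off:
            label = directives.get("name") or directives.get("contact_name", "?")
            disabled[label] = off
    return disabled
```

Concatenate every .cfg file that Nagios loads, including the ones belonging to the pre-existing system, and pass the text in. Any template name that shows up in the result and is also referenced by a "use" line is worth a closer look.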

u-Boot VxWorks TFTP boot failure: " ERROR: booting os 'Unknown OS' (14) is not supported"

I am trying to boot VxWorks over TFTP on a Zynq.
I have set the environment variables ipaddr, serverip, and netmask accordingly, and the files are loaded into RAM successfully. However, I get the following error when trying to boot the VxWorks image. There is no problem with the VxWorks image itself, as I can boot successfully when I write the same files to an SD card and boot from the SD card.
zynq-uboot> bootm 0x5000000 - 0x4000000
ERROR: booting os 'Unknown OS' (14) is not supported
Here is the complete console output:
zynq-uboot> setenv ipaddr 192.168.88.169;setenv serverip 192.168.88.88;setenv netmask 255.255.255.0
zynq-uboot> tftp 0x8000000 BOOT.bin
Trying to set up GEM link...
Phy ID: 01410E40
Resetting PHY...
PHY reset complete.
Waiting for PHY to complete auto-negotiation...
Link is now at 1000Mbps!
Using zynq_gem device
TFTP from server 192.168.88.88; our IP address is 192.168.88.169
Filename 'BOOT.bin'.
Load address: 0x8000000
Loading: T ########################
done
Bytes transferred = 345180 (5445c hex)
zynq-uboot> tftp 0x5000000 uVxWorks && tftp 0x4000000 zynq-7000.dtb
Using zynq_gem device
TFTP from server 192.168.88.88; our IP address is 192.168.88.169
Filename 'uVxWorks'.
Load address: 0x5000000
Loading: T T #################################################################
#################################################################
###############################################################
done
Bytes transferred = 2829468 (2b2c9c hex)
Using zynq_gem device
TFTP from server 192.168.88.88; our IP address is 192.168.88.169
Filename 'zynq-7000.dtb'.
Load address: 0x4000000
Loading: #
done
Bytes transferred = 3588 (e04 hex)
zynq-uboot> bootm 0x5000000 - 0x4000000
## Booting kernel from Legacy Image at 05000000 ...
Image Name: vxWorks
Image Type: ARM Unknown OS Kernel Image (uncompressed)
Data Size: 2829404 Bytes = 2.7 MiB
Load Address: 00200000
Entry Point: 00200000
Verifying Checksum ... OK
Loading Kernel Image ... OK
OK
ERROR: booting os 'Unknown OS' (14) is not supported
zynq-uboot> <INTERRUPT>
Solution:
I had to load vxWorks.bin as well, and then it worked.
zynq-uboot> tftp 0x200000 vxWorks.bin
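As a side note, the "Image Name", "Load Address", and OS id that bootm prints come from the 64-byte legacy image header at the start of uVxWorks (2829468 bytes transferred minus 2829404 bytes of data is exactly 64). If you want to check those fields on the host before transferring the file, a small parser along these lines could help. This is a sketch with a made-up function name; the field layout is the big-endian legacy uImage header.

```python
import struct

UIMAGE_MAGIC = 0x27051956
# magic, header CRC, timestamp, data size, load address, entry point,
# data CRC, then os/arch/type/compression bytes and a 32-byte name
HEADER = struct.Struct(">7I4B32s")


def parse_legacy_header(data):
    """Decode the 64-byte legacy U-Boot image header from the start of a file."""
    if len(data) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    (magic, _hcrc, _time, size, load, entry, _dcrc,
     os_id, arch, img_type, comp, name) = HEADER.unpack_from(data)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy U-Boot image")
    return {
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "size": size,
        "load": load,
        "entry": entry,
        "os": os_id,
        "arch": arch,
        "type": img_type,
        "comp": comp,
    }
```

For the image above, parse_legacy_header(open("uVxWorks", "rb").read(64)) should report os 14 and a load address of 0x200000, the same address vxWorks.bin is loaded to in the solution.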