Aerospike heartbeat configuration for single server, error "Unable to find any suitable network device for node ID" - aerospike

I want to run Aerospike server in single-server mode.
Now I have this configuration:
service {
paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
service-threads 4
transaction-queues 4
transaction-threads-per-queue 4
proto-fd-max 15000
}
logging {
console {
context any info
}
}
network {
service {
address 127.0.0.1
port 3000
}
heartbeat {
mode multicast
multicast-group 239.1.99.222
port 9918
# To use unicast-mesh heartbeats, remove the 3 lines above, and see
# aerospike_mesh.conf for alternative.
interval 150
timeout 10
}
fabric {
port 3001
}
info {
port 3003
}
}
namespace test {
replication-factor 1
memory-size 20M
default-ttl 1d # 30 days, use 0 to never expire/evict.
storage-engine memory
}
And when I try to start server I got error in the log:
"Unable to find any suitable network device for node ID"
I don't want server to be available to internet.
How to achieve this and fix the issue?

The Node ID is generated using the MAC id of the interface on the host.
https://github.com/aerospike/aerospike-server/blob/master/cf/src/socket.c#L2470
If you dont have any of the default interface names that aerospike is aware of, then you might get this error.
To fix this problem, you can specify your interface name.
http://www.aerospike.com/docs/operations/troubleshoot/startup#problem-with-network-interface
To avoid exposing your aerospike node on internet, you can bind it only to localhost or to a private interface only or use other network tools/devices to avoid exposing the server port such as firewall or ACL. Best way to avoid exposing aerospike on internet is to ensure that the server hosting aerospike is not exposed to internet. If that is not doable then restrict your aerospike port access to your aerospike clients IP only using firewall. Also, you can use database credentials available in enterprise edition.
http://www.aerospike.com/docs/guide/security.html

Related

org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable inside kubernetes environment

We have a setup wherein, one ignite server node serves 15 to 20 thick client nodes and 40 to 50 thin client nodes, thin client connection is singlton,
In operation, some times we get below error,
org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable [sock=Socket[addr=hostnm19.hostx.com/10.13.10.19,port=30519,localport=57552]]
On the Server node, we are inserting data inside a third party store using CacheStoreAdapters
Don't know where it goes wrong since out of 100 operations one operation fails with the above error.
Also, let me know what can we do for this failure handling.
Apache Ignite version: 2.8
Edits: (Code Snippet)
ClientConfiguration cfg = new ClientConfiguration()
.setAddresses("host:port");
IgniteClient client = Ignition.startClient(cfg); // this client is singleton
client.getOrCreateCache("ABC_CACHE").put(key, val);
StatckTrace:
org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable [sock=Socket[addr=hostnm19.hostx.com/10.13.10.19,port=30519,localport=57552]]
at org.apache.ignite.internal.client.thin.TcpClientChannel.handleIOError(TcpClientChannel.java:499)
at org.apache.ignite.internal.client.thin.TcpClientChannel.handleIOError(TcpClientChannel.java:491)
at org.apache.ignite.internal.client.thin.TcpClientChannel.access$100(TcpClientChannel.java:92)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.read(TcpClientChannel.java:538)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.readInt(TcpClientChannel.java:572)
at org.apache.ignite.internal.client.thin.TcpClientChannel.processNextResponse(TcpClientChannel.java:272)
at org.apache.ignite.internal.client.thin.TcpClientChannel.receive(TcpClientChannel.java:234)
at org.apache.ignite.internal.client.thin.TcpClientChannel.service(TcpClientChannel.java:171)
at org.apache.ignite.internal.client.thin.ReliableChannel.service(ReliableChannel.java:160)
at org.apache.ignite.internal.client.thin.ReliableChannel.request(ReliableChannel.java:187)
at org.apache.ignite.internal.client.thin.TcpIgniteClient.getOrCreateCache(TcpIgniteClient.java:114)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.read(TcpClientChannel.java:535)
... 36 more
You probably have network or NAT configured which will reset connections when not used, or even sporadically.
In this case, you will have to reconnect.
Another option, are you sure you are connecting to thin client port and not some other port?

IPVS (keepalived) doesn't balance UDP connections

I have two load balancer with Debian 8 and three Graylog server with Debian 9.
Every server in my network sends logs via rsyslog to a virtual server configured on the LB. The connection is UDP.
The problem is that the packets are not balanced. (all connections goes on the first real server on the list)
In case of failover the packets are correctly sent to the others real servers.
The only way I found to re-balance the connection is to remove all real server from the LB and the restart keepalived service.
I already tied to set:
ipvsadm --set 0 0 1
Timeout (tcp tcpfin udp): 900 120 1
I already set these two variables:
echo 1 > /proc/sys/net/ipv4/vs/expire_nodest_conn
echo 1 > /proc/sys/net/ipv4/vs/expire_quiescent_template
IPVS is configure as follow:
vrrp_instance logserver {
state MASTER
interface eth0
virtual_router_id 195
priority 200
advert_int 1
authentication {
auth_type keepalived
auth_pass xxxxxx
}
virtual_ipaddress {
10.20.20.195/22
}
}
virtual_server 10.20.20.195 0 {
delay_loop 60
protocol UDP
lb_algo wrr
lb_kind DR
persistence_timeout 30
real_server 10.20.20.196 0 {
weight 100
MISC_CHECK {
connect_timeout 3
misc_path "/etc/keepalived/checkgraylog 10.20.20.196"
}
}
real_server 10.20.20.197 0 {
weight 100
MISC_CHECK {
connect_timeout 3
misc_path "/etc/keepalived/checkgraylog 10.20.20.197"
}
}
real_server 10.20.20.198 0 {
weight 100
MISC_CHECK {
connect_timeout 3
misc_path "/etc/keepalived/checkgraylog 10.20.20.198"
} } }
Is there a way to effective balance UDP connection with Direct Routing?
Thank you
virtual_server 10.20.20.195 12333 {
delay_loop 60
protocol UDP
lb_algo wrr
lb_kind DR
ops # <<< - Try this. Works for me (Ubuntu 18.04, Keepalived v1.3.9, ipvsadm v1.28)
real_server 10.20.20.196 12333 {
Option ops for me works only if either:
Virtual server port is explicitly defined.
fwmark is used together with in virtual_server definition.
Does not work for virtual_server_IP 0 form - in that case ipvsadm -Ln shows that persistent option is used as well.

ASP.NET Core SignalR websocket connection limit

I produce load testing of SignalR (ASP.NET Core) application hosted at Windows Server 2016 standard using Microsoft.AspNetCore.SignalR.Client.
Dotnet core hosting 2.1.1 installed
And i can not create more than 3000 (2950-3050) connections.
Already tried recomendations as described here:
How to configure concurrency in .NET Core Web API?
Limiting performance factors of WebSocket in ASP.NET 4.5?
Set limit concurrent connections for websocket on iis 8
Added limits to UseKestrel (this seems to work if i set values to 100 or 1000):
var host = new WebHostBuilder()
.UseKestrel(options =>
{
options.Limits.MaxConcurrentConnections = 50000;
options.Limits.MaxConcurrentUpgradedConnections = 50000;
})
Changed all aspnet.config files by adding this:
<system.web>
<applicationPool maxConcurrentRequestsPerCPU="50000" />
</system.web>
Executed this command:
cd %windir%\System32\inetsrv\ appcmd.exe set config /section:system.webserver/serverRuntime /appConcurrentRequestLimit:50000
Added performance counter for Web Service\Current Connections - Maximum Connections. And Maximum Connections increases to 3300 and stops.
There are no exceptions in server logs. But I feel that there are some restrictions in system.
Server IIS logs contains only this:
GET /messageshub
id=A_3x1sH9kHM1Rc3oPSgP6w
80 - 172.20.192.11 - - 404 0 0 3
Client exceptions is basically the following:
System.Net.Http.HttpRequestException: Error while copying content to a
stream. ---> System.IO.IOException: Unable to read data from the
transport connection: An existing connection was forcibly closed by
the remote host.
On Windows you may have dynamic port assignment issue .
Windows by default has 5000 port numbers ready to be assigned to TCP connections and 1024 of them are reserved for the OS itself which you will end up with 3977 ports free to be assigned .
In your case the number is 3300 as you mentioned but it's possible that 3300 of the connections are established and 677 of them are Time_Waited.
In any case i recommend to use
netstat -an | find 'Established" -c
netstat -an | find 'TIME" -c
netstat -an | find 'CLOSED" -c
In order to figure out the number of established & time_wait & close_wait connections at the time you received the IO exception and if the number is close to 5000 just add this to your registry and reboot and test again
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \Tcpip \Parameters]
MaxUserPort = 5000 (Default = 5000, Max = 65534)

Aerospike - “n_bytes_memory went negative” with in memory-only namespace with ttl

We have a namespace configured to store data in memory only with couple of minutes default ttl. After starting putting some data into it, when expiration kicks in, we're getting these messages in the log (a lot, for ~30% of expired records):
WARNING (namespace): (namespace.c::762) set_id 1 - n_bytes_memory went negative!
I have simple client app with server config that can reproduce this: https://github.com/akkomar/aerospike-test (it's based on docker and is very easy to start)
Any advice what might be the reason?
Edit:
I checked this on versions 3.6.4, 3.7.0.1 and 3.7.4
Configuration file used for testing (from https://github.com/akkomar/aerospike-test/blob/master/etc/aerospike.conf):
service {
user root
group root
paxos-single-replica-limit 1
pidfile /var/run/aerospike/asd.pid
service-threads 4
transaction-queues 4
transaction-threads-per-queue 4
proto-fd-max 1024
}
logging {
file /var/log/aerospike/aerospike.log {
context any info
}
console {
context any info
context namespace detail
}
}
network {
service {
address any
port 3000
}
heartbeat {
mode mesh
port 3002
mesh-port 3002
interval 150
timeout 10
}
fabric {
port 3001
}
info {
port 3003
}
}
namespace test_ns {
replication-factor 2
memory-size 1G
default-ttl 10S
storage-engine memory
}
Edit2:
It seems that it's happening only if I update records via UDF. The simplest one that reproduces this:
local VAL_KEY = "v"
function add_data(rec, val_to_add, ttl_to_set)
if aerospike:exists(rec) then
rec[VAL_KEY] = val_to_add
aerospike:update(rec)
else
rec[VAL_KEY] = val_to_add
aerospike:create(rec)
end
end
When I execute the same operation via Java API - everything seems to work fine (example github repo mentioned earlier is updated with Java API example)
The meaning of the error message is that the space we have accounted for the set in memory went to a negative number which should not be possible.
This has been logged in our internal bug tracking system for resolution in future releases
It turned out it was a bug in Aerospike.
It's fixed in version 3.7.4.1 (detailed explanation in https://discuss.aerospike.com/t/problem-with-expiring-records-in-memory-only-namespace-n-bytes-memory-went-negative/2560/6)

tftp retry timeout exceeded

My issue is retry count exceeds when I download kernel image to Econa processor board (Econa is ARM based processor) via TFTP as shown below
CNS3000 # tftp 0x4000000 bootpImage.cns3420.uclibc
MAC PORT 0 : Initialize bcm53115M
MAC PORT 2 : Initialize RTL8211
TFTP from server 192.168.0.219; our IP address is 192.168.0.112
Filename 'bootpImage.cns3420.uclibc'.
Load address: 0x4000000
Loading: T T T T T T T T T T
Retry count exceeded; starting again
Following are the points which may help you in finding the cause of this error.
Ping response is OK
CNS3000 # ping 192.168.0.219
MAC PORT 0 : Initialize bcm53115M
MAC PORT 2 : Initialize RTL8211
host 192.168.0.219 is alive
When I tried to verify TFTP is running, I tried as shown below. It seems TFTP server is working. I placed a small file in /tftpboot:
# echo "Hello, embedded world" > /tftpboot/hello.txt"
Then I did localhost
# tftp localhost
tftp> get hello.txt
Received 23 bytes in 0.1 seconds
tftp> quit
Please note that there is no firewall or SELinux on my machine.
Please verify location of these files are OK. I have placed kernel image file bootpImage.cns3420.uclibc in /tftpbootTFTP service file is located in /etc/xinetd.d/tftp.
My TFTP service file is:
service tftp
{
socket_type =dgram
protocol=udp
wait=yes
user=root
server=/usr/sbin/in.tftpd
server_args=-s /tftpboot -b 512
disable=no
per_source=11
cps=100 2
flags=ipv4
}
printenv response in U-boot is:
CNS3000 # printenv
bootargs=root=/dev/mtdblock0 mem=256M console=ttyS0
baudrate=38400
ethaddr=00:53:43:4F:54:54
netmask=255.255.0.0
tftp_bsize=512
udp_frag_size=512
mmc_init=mmcinit
loading=fatload mmc 0 0x4000000 bootpimage-82511
running=go 0x4000000
bootcmd=run mmc_init;run loading;run running
serverip=192.168.0.219
ipaddr=192.168.0.112
bootdelay=5
port=1
bootfile=/tftpboot/bootpImage.cns3420.uclibcl
stdin=serial
stdout=serial
stderr=serial
verify=n
Environment size: 437/4092 bytes
Regards
Waqas
Loading: T T T T T T T T T T
Means there is no transfer at all; this can be caused by wrong interface setting i.e.
u-boot is configured for 100Mbit full duplex, and you try to connect via half duplex or 10Mbit (or some mix of it). Another point is the MTU size, should be 1500 (u-boot cannot handle packet fragmentation)
Hint for windows/vmware users:
tftp timeouts from u-boot are caused by windows ip-forwarding.
1) If you have a home network : switch it of.
2) You are running Routing and Remote Access service : shut down service
3) check registry for ip forwarding:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\IPEnableRouter
set value to 0 (and maybe reboot)