Redis stops working and becomes unresponsive

I have an application using Redis as a cache.
I have sporadic episodes where Redis suddenly becomes unresponsive. Even a simple curl test fails:
~ curl localhost:6379
curl: (7) Failed to connect to localhost port 6379: Connection refused
The Redis console logs show no errors.
I am using this metrics exporter: https://github.com/oliver006/redis_exporter
During these episodes, redis_up is 0 and redis_exporter_last_scrape_error is 1 with no error detail.
FYI, I have 46 million short keys in Redis with empty string values.
How do I go about diagnosing and hopefully resolving this?
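Connection refused with nothing in the Redis log usually means the redis-server process itself has died (or been killed) rather than being merely slow, and with 46 million keys the kernel OOM killer is a prime suspect, since even empty-valued keys carry per-key overhead. A back-of-envelope sketch, assuming a hypothetical ~90 bytes of overhead per short key (the real figure depends on key length and encoding):

```shell
# Rough memory estimate for 46 million short keys with empty values.
# bytes_per_key=90 is an assumed overhead figure (hash-table entry,
# key object, metadata), not a measured one.
keys=46000000
bytes_per_key=90
echo "estimated footprint: $(( keys * bytes_per_key / 1024 / 1024 )) MB"
```

If that lands anywhere near the host's RAM or a configured maxmemory, check for oom-kill lines with dmesg -T or journalctl -k around the outage times, and compare the output of redis-cli info memory against available memory while the server is healthy.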

Related

GitLab CI/CD runner is failing to start redis server

I am trying to run a GitLab CI/CD pipeline, but the job that runs the test script is failing because the redis server fails to start. I am getting an error saying
Starting redis ... error
ERROR: for redis Cannot start service redis: driver failed programming external connectivity on endpoint redis (8c425d45729aecef06c5c15b082ac96867a73986400f5c8bae29ebab55eb5fdf): Error starting userland proxy: listen tcp 0.0.0.0:6379: bind: address already in
I have tried running the gitlab-runner restart command from the terminal, but the job is still failing.
I also tried setting the Maximum job timeout to 10 mins, but the job is still failing.
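The "bind: address already in use" part of that error means something on the runner host is already listening on 6379, often a stale container or a locally installed redis-server, so restarting gitlab-runner or changing the job timeout will not help. A quick way to confirm is a bash /dev/tcp probe; this is a minimal sketch, demonstrated against a throwaway listener on a hypothetical port 26379 so it is self-contained, whereas on the runner you would probe 6379 directly:

```shell
# Start a throwaway listener on 26379 (a hypothetical free port) so the
# probe has something to find; on the runner host, probe 6379 instead.
python3 -m http.server 26379 --bind 127.0.0.1 >/dev/null 2>&1 &
pid=$!
sleep 1

probe() {
  # bash-only: opening /dev/tcp succeeds iff something is listening
  if (echo > /dev/tcp/127.0.0.1/"$1") 2>/dev/null; then
    echo "port $1: in use"
  else
    echo "port $1: free"
  fi
}

probe 26379
probe 26380
kill "$pid" 2>/dev/null
```

Once a port is confirmed in use, ss -ltnp (iproute2) or lsof -i :6379 will show the owning process; stopping the stale container or local redis-server frees the port for the pipeline.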

Docker containers no longer allow web access

I have a very strange issue on my local development environment. I have a couple of Docker containers that run a couple of different environments, but both fronted with Apache. Both are connected to the same bridge network and one has port 80 exposed and the other port 8010. When the containers are running I can connect using telnet as follows:
telnet localhost 80
or
telnet localhost 8010
However, from the browser, nothing happens and in the end it just times out. The logs on the Docker containers show nothing to indicate an inbound connection.
From the Docker containers' shell, I can access the HTTP server using curl without issue.
I tried deleting the bridge network and adding it again but that didn't help.
I've tried turning the macOS firewall off but that doesn't help.
If I stop the docker containers and then try the above telnet command it errors with "Connection refused" as would be expected, so the telnet command is definitely connecting to the docker container.
Also, this setup had been working fine for some time until today.
I'm lost as to what to try next and have found nothing similar Googling.
Any ideas of how to resolve this would be gratefully received.
To resolve this I did:
docker-compose rm -f
docker images --no-trunc --format '{{.ID}}' | xargs docker rmi
and then rebuilt the images / containers.
Be careful with the above as they are destructive commands.
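A less scorched-earth variant, assuming the containers are managed by a single docker-compose project (so only that project's containers, networks, and locally built images are removed, not every image on the machine):

```shell
# Remove this project's containers, networks, locally built images and
# anonymous volumes, then rebuild everything from scratch.
docker-compose down --rmi local --volumes --remove-orphans
docker-compose up --build -d
```

This requires a running Docker daemon and a compose project, so it is shown as a sketch rather than something verifiable here.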

Docker: how to force graylog web interface over https?

I'm currently struggling to get graylog working over https in a docker environment. I'm using the jwilder/nginx-proxy and I have the certificates in place.
When I run:
docker run --name=graylog-prod --link mongo-prod:mongo --link elastic-prod:elasticsearch -e VIRTUAL_PORT=9000 -e VIRTUAL_HOST=test.myserver.com -e GRAYLOG_WEB_ENDPOINT_URI="http://test.myserver.com/api" -e GRAYLOG_PASSWORD_SECRET=somepasswordpepper -e GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918 -d graylog2/server
I get the following error:
We are experiencing problems connecting to the Graylog server running
on http://test.myserver.com:9000/api. Please verify that the server is
healthy and working correctly.
You will be automatically redirected to the previous page once we can
connect to the server.
This is the last response we received from the server:
Error message
Bad request Original Request
GET http://test.myserver.com/api/system/sessions Status code
undefined Full error message
Error: Request has been terminated
Possible causes: the network is offline, Origin is not allowed by Access-Control-Allow-Origin, the page is being unloaded, etc.
When I go to the URL in the message, I get a reply: {"session_id":null,"username":null,"is_valid":false}
This is the same reply I get when running Graylog without https.
Nothing is mentioned in the Docker log file from the Graylog container.
docker ps:
CONTAINER ID   IMAGE             COMMAND                  CREATED         STATUS         PORTS                 NAMES
56c9b3b4fc74   graylog2/server   "/docker-entrypoint.s"   5 minutes ago   Up 5 minutes   9000/tcp, 12900/tcp   graylog-prod
When running docker with the option -p 9000:9000 all is working fine without https, but as soon as I force it to go over https I get this error.
Does anyone have an idea what I'm doing wrong here?
Thanks a lot!
Did you try GRAYLOG_WEB_ENDPOINT_URI="https://test.myserver.com/api" ?
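That suggestion amounts to changing only the endpoint environment variable in the original docker run command, so the web interface tells browsers to call the API over https. A sketch, untested and reusing the exact names from the question:

```shell
docker run --name=graylog-prod \
  --link mongo-prod:mongo --link elastic-prod:elasticsearch \
  -e VIRTUAL_PORT=9000 -e VIRTUAL_HOST=test.myserver.com \
  -e GRAYLOG_WEB_ENDPOINT_URI="https://test.myserver.com/api" \
  -e GRAYLOG_PASSWORD_SECRET=somepasswordpepper \
  -e GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918 \
  -d graylog2/server
```

The TLS termination itself still happens in the jwilder/nginx-proxy container; this only changes the URI the web interface hands to clients.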

SSH drops a few seconds after connecting

I have a TP-Link MR-3020 router that is hardwired to my "real" router. The TP-Link has OpenWRT installed on it and has the static IP address 192.168.1.111 assigned to it. From my laptop, I can
ssh root@192.168.1.111 into the router.
Often times it will say
ssh: connect to host 192.168.1.111 port 22: Connection refused.
If I wait a few seconds and try again, I may get the same error or I may be prompted for the password. If I'm able to successfully log in I will often be kicked out with a broken pipe. Sometimes it's 5 seconds after logging in, sometimes it's 5 hours.
All of this is happening internally on a network and I've confirmed there aren't any other 192.168.1.111 devices out there being assigned by DHCP. What are some things I can do to debug why I keep losing my connection?
First of all, you should determine whether it is an SSH-only issue, or whether your router reboots / goes offline completely. To do so, start a ping and watch whether it still works when SSH doesn't allow you to connect.
If ping works but SSH doesn't, then watch the process ID of the dropbear process (the SSH daemon) by issuing the pgrep dropbear command. If it changes, then the SSH process is being restarted, and that needs further investigation.
Also, post your logs (the logread and dmesg commands) - they might contain useful information on this issue. Try to note the time when SSH goes off and look for corresponding lines in the logs.
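The "is it the network or just SSH?" check above can be run from the laptop as a small watch loop; this is a sketch assuming a Linux host with iputils ping and coreutils timeout, probing ICMP and the SSH port side by side so drops can be correlated:

```shell
host=192.168.1.111   # the OpenWrt router

check() {
  ts=$(date +%T)
  # ICMP reachability: distinguishes "router offline" from "SSH-only" issues
  ping -c1 -W1 "$host" >/dev/null 2>&1 && icmp=up || icmp=down
  # TCP probe of the SSH port via bash's /dev/tcp, bounded by timeout(1)
  timeout 2 bash -c "echo > /dev/tcp/$host/22" 2>/dev/null \
    && sshport=open || sshport=closed
  echo "$ts icmp=$icmp ssh_port=$sshport"
}

for i in 1 2 3; do   # raise the count and interval for a real soak test
  check
  sleep 2
done
```

If icmp goes down together with the port, the router itself is rebooting or losing link; if icmp stays up while the port closes, focus on dropbear (its PID, logread, and dmesg on the router).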

Remotely shut down RabbitMQ server

I have been trying to remotely kill a rabbitmq server but haven't been lucky so far. I can easily connect to it and publish and receive messages using the pika library.
Steps I have tried so far:
Used RabbitMQ's HTTP API to DELETE a connection
/api/connections/name
An individual connection. DELETEing it
will close the connection. Optionally set the "X-Reason"
header when DELETEing to provide a reason.
When I tried something like http://localhost:15672/api/connection/127.0.0.1:31332, I get an error:
{"error":"Object Not Found","reason":"\"Not Found\"\n"}
Used rabbitmqadmin locally
Tried to use rabbitmqctl to remotely shut down the rabbitmq server
rabbitmqctl
Here is how to do that using rabbitmqctl
set RABBITMQ_CTL_ERL_ARGS=-setcookie FWQUGISFBWECSKWFVFRP
rabbitmqctl.bat -n rabbit@gabriele-VirtualBox stop
Erlang
Here is one way to kill a remote node using Erlang:
erl -setcookie FXQUEISFFRECSKWCVB -sname thekiller@gabriele-VirtualBox
Eshell V6.4 (abort with ^G)
(thekiller@gabriele-VirtualBox)1> net_adm:ping('rabbit@gabriele-VirtualBox').
pong
(thekiller@gabriele-VirtualBox)2> rpc:call('rabbit@gabriele-VirtualBox', init, stop, []).
ok
(thekiller@gabriele-VirtualBox)3>
Start an erl console using your .erlang.cookie and with -sname
using the same rabbitmq domain (in my case gabriele-VirtualBox).
Test if you can reach the node using ping.
Call rpc:call('rabbit@gabriele-VirtualBox', init, stop, []).
Done, you killed the remote node.
After troubleshooting all over again, I was able to use the HTTP API to kill active connections. The trick was that the whole connection name had to be URL-encoded.
In my case the connection name was:
127.0.0.1:31332 -> 127.0.0.1:15672
So when I tried the following I got an error:
http://localhost:15672/api/connection/127.0.0.1:31332 ==> object not found error
It worked only after I URL-encoded the connection name and sent a curl DELETE like this:
http://localhost:15672/api/connection/127.0.0.1%3A31332%20-%3E%20127.0.0.1%3A15672
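The encoding step is easy to get right mechanically rather than by hand; a sketch using python3's urllib from the shell (any percent-encoder works):

```shell
name='127.0.0.1:31332 -> 127.0.0.1:15672'
# safe="" forces every reserved character (':', ' ', '>') to be percent-encoded
encoded=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$name")
echo "$encoded"   # prints 127.0.0.1%3A31332%20-%3E%20127.0.0.1%3A15672
# Then DELETE it via the management API; guest:guest is the default
# credential (an assumption) and "connections" is the plural path the
# API documentation quotes above:
# curl -u guest:guest -X DELETE "http://localhost:15672/api/connections/$encoded"
```

The connection name comes straight from the "name" field of GET /api/connections, so the listing and the DELETE can be scripted together.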