Redis cluster continuously prints WSA_IO_PENDING log messages

When I start up all the redis-server instances of the Redis cluster, all of these servers continuously print log entries like WSA_IO_PENDING and clusterWriteDone:
[9956] 03 Feb 18:17:25.044 # WSA_IO_PENDING writing to socket fd --------------------------------------------------------
[9956] 03 Feb 18:17:25.062 # clusterWriteDone written 2520 fd 15-----------------------------------------------------------
[9956] 03 Feb 18:17:25.545 # WSA_IO_PENDING writing to socket fd --------------------------------------------------------
[9956] 03 Feb 18:17:25.568 # WSA_IO_PENDING writing to socket fd --------------------------------------------------------

There is no way to specifically turn those "warnings" off in the 3.2.x port of Redis for Windows, as the logging statements use the highest LL_WARNING level. This issue has been reported in my fork of the unmaintained MSOpenTech repo (which I updated to Redis 4.0.2) and has been fixed by decreasing that level to LL_DEBUG. More details: https://github.com/tporadowski/redis/issues/14
This change will be included in the next release (4.0.2.3), or you can get the latest source code and build it yourself.
Current releases can be found here: https://github.com/tporadowski/redis/releases

An issue was opened in the official Redis repo 10 months ago about this problem. Unfortunately, it seems to have been abandoned and hasn't been solved yet:
Redis cluster print "WSA_IO_PENDING writing to socket..." continuously, does it matter?
However, that issue may not be related to Redis itself, but to the Windows Sockets API, as Cy Rossignol pointed out in the comments. It's the Winsock API that returns that status to the application, as seen in the documentation:
WSA_IO_PENDING (997)
Overlapped operations will complete later.
The application has initiated an overlapped operation that cannot be completed immediately. A completion indication will be given later when the operation has been completed. Note that this error is returned by the operating system, so the error number may change in future releases of Windows.
Maybe it didn't get much attention because it's not a bug, although it's indeed an inconvenience that floods the system logs. In that case, you may not get help there.
Seems like there's no temporary fix. The Windows Redis fork is archived and I don't know if you could get any help there either.

Go to this location: C:\Program Files\Redis
Open the file redis.windows-service.conf in Notepad.
You will find a section like below:
# Specify the server verbosity level.
# This can be one of:
# debug (a lot of information, useful for development/testing)
# verbose (many rarely useful info, but not a mess like the debug level)
# notice (moderately verbose, what you want in production probably)
# warning (only very important / critical messages are logged)
loglevel notice
# Specify the log file name. Also 'stdout' can be used to force
# Redis to log on the standard output.
logfile "Logs/redis_log.txt"
Here you can change the value of loglevel as per your requirements. I think changing it to warning will solve this issue, because it will then log only essential errors.
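For example, assuming the rest of the file stays as it is, the edited line would simply read:
loglevel warning
After saving the file, restart the Redis Windows service so the new setting is picked up.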

Related

More concise log on Redis startup

As I use Redis to start up with a bunch of other processes via Foreman, I find its output on startup quite verbose.
Redis writes more than twice the number of lines to stdout than any other process in my Procfile, mainly because of the ASCII art that gets printed to the log.
Is there a (startup) option to keep the log more concise, for example by turning off the output of the logo?
TLDR: If you have redis version 4.0 or higher you can do redis-server | cat to trick it into thinking it's not running in a tty.
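For instance, in a Foreman Procfile the piped entry might look like this (the process name is just an example):
redis: redis-server | cat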
Original answer:
I've had a quick check in the config docs and you shouldn't be seeing this. Can you maybe check your config file and see if you've set always-show-logo to yes?
The comment that accompanies it is as follows:
# By default Redis shows an ASCII art logo only when started to log to the
# standard output and if the standard output is a TTY. Basically this means
# that normally a logo is displayed only in interactive sessions.
#
# However it is possible to force the pre-4.0 behavior and always show a
# ASCII art logo in startup logs by setting the following option to yes.
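If it turns out that option is set, flipping it back in your redis.conf should suppress the banner, e.g.:
always-show-logo no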
I guess if you're on a version < 4.0 then that might explain what you're seeing.
Here is the issue/fix from 2014 https://github.com/antirez/redis/issues/1935

Large commit stalls halfway through

I have a problem with our Subversion server. Doing small commits works fine, but as soon as someone tries to commit a large collection of sizeable files, the commit stalls halfway through and the client finally times out. My test set consists of roughly 2000 files and the total size of the commit is about 1 GB. When I commit the files the upload starts, but about halfway through the transfer rate drops to 0 kB/s and the commit just stalls and never recovers. If I split the commit into smaller pieces (<150 MB) everything works just fine, but that breaks the atomicity of the commit and is something I really want to avoid.
When I look at the logs generated by Apache there are no error messages.
When I bumped the log level from debug to trace6 on the Apache server, some errors appear at the moment the upload stalls:
...
OpenSSL: I/O error, 2229 bytes expected to read on BIO
OpenSSL: read 1460/2229 bytes from BIO
...
Versions used:
We run the connection to Subversion via Apache, mod_dav, mod_dav_svn, mod_authz_svn and mod_auth_digest. The client connects via https.
Server:
OpenSuse 42.3
svnserve: 1.9.7
Apache: 2.4.23
Client:
Windows 10 enterprise
svn client: 1.10.0-dev.
What I tried so far:
I have tried increasing the Timeout value in the Apache configuration. The only difference is that the client stays in the stalled state longer before posting the timeout message.
I have tried increasing MaxKeepAliveRequests from 100 to 1000. No change.
I have tried adding SVNAllowBulkUpdates Prefer to the SVN settings (see the configuration sketch after this list). No change.
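For reference, a rough sketch of how those directives sit in the Apache configuration (the Timeout value, location, and repository path are only illustrative):
# httpd.conf / vhost configuration
Timeout 600
MaxKeepAliveRequests 1000
<Location /svn>
    DAV svn
    SVNParentPath /srv/svn
    SVNAllowBulkUpdates Prefer
</Location>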
Has anyone got any hints on how to debug these types of errors?

CentOS server hung due to postfix/sendmail spam emails

My CentOS server is running web applications on a LAMP stack. A couple of days back, the server was not responding for about 10 minutes and I got an HTTP response failure alert from my monitoring tool. When I checked the httpd error log I found a huge number of log entries (~12,000 lines) related to sendmail.
14585 sendmail: fatal: open /etc/postfix/main.cf: Permission denied
The server ran out of memory and stopped responding.
14534 [Fri Aug 19 22:14:52.597047 2016] [mpm_prefork:error] [pid 26641] (12)Cannot allocate memory: AH00159: fork: Unable to fork new process
14586 /usr/sbin/sendmail: error while loading shared libraries: /lib64/librt.so.1: cannot allocate version reference table: Cannot allocate memory
We are not using sendmail in any of our applications. How can I stop this attack in the future?
Thank you in advance!
Sorry, I have no comment facilities; it looks like one of your website pages is vulnerable to code injection, and finding out where and on which page may be a huge job. Focus on input (form) variables, and always sanitize input variables before using them! P.S. PHP uses "sendmail": even if you use Postfix, PHP will invoke a sendmail binary to send mail, and that binary will hand the message over to Postfix. If your forms work well and the 12k error log lines came out of the blue, then I would think someone is trying to inject code through your website (which happens all the time, by the way).

Solr issue: ClusterState says we are the leader, but locally we don't think so

So today we ran into a disturbing Solr issue.
After a restart of the whole cluster, one of the shards stopped being able to index/store documents.
We had no hint about the issue until we started indexing (querying the server looked fine).
The error is:
2014-05-19 18:36:20,707 ERROR o.a.s.u.p.DistributedUpdateProcessor [qtp406017988-19] ClusterState says we are the leader, but locally we don't think so
2014-05-19 18:36:20,709 ERROR o.a.s.c.SolrException [qtp406017988-19] org.apache.solr.common.SolrException: ClusterState says we are the leader (http://x.x.x.x:7070/solr/shard3_replica1), but locally we don't think so. Request came from null
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:503)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:267)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:550)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:126)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:101)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:65)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
We run Solr 4.7 in cluster mode (5 shards) on Jetty.
Each shard runs on a different host with one ZooKeeper server.
I checked the ZooKeeper log and I cannot see anything there.
The only difference is that in the /overseer_election/election folder I see this specific server repeated 3 times, while the other servers are only mentioned twice.
45654861x41276x432-x.x.x.x:7070_solr-n_00000003xx
74030267x31685x368-x.x.x.x:7070_solr-n_00000003xx
74030267x31685x369-x.x.x.x:7070_solr-n_00000003xx
Not even sure if this is relevant. (Can it be?)
Any clue what other checks we can do?
We've experienced this error under 2 conditions.
Condition 1
On a single zookeeper host there was an orphaned Zookeeper ephemeral node in
/overseer_elect/election. The session this ephemeral node was associated with no longer existed.
The orphaned ephemeral node cannot be deleted.
Caused by: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
This condition will also be accompanied by a /overseer/queue directory that is clogged-up with queue items that are forever waiting to be processed.
To resolve the issue you must restart the Zookeeper node in question with the orphaned ephemeral node.
If after the restart you see "Still seeing conflicting information about the leader of shard shard1 for collection <name> after 30 seconds", you will need to restart the Solr hosts as well to resolve the problem.
Condition 2
Cause: a mis-configured systemd service unit.
Make sure you have Type=forking and have PIDFile configured correctly if you are using systemd.
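A minimal sketch of such a unit (the paths, port, and PID file location are assumptions, not taken from the affected system):
[Unit]
Description=Apache Solr
After=network.target

[Service]
Type=forking
# bin/solr start daemonizes and writes a PID file named solr-<port>.pid
PIDFile=/var/solr/solr-8983.pid
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop -p 8983
User=solr

[Install]
WantedBy=multi-user.target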
systemd was not tracking the PID correctly; it thought the service was dead, but it wasn't, and at some point 2 services were started. Because the 2nd service cannot start (they both can't listen on the same port), it seems to just sit there hanging in a failed state, or it fails to start the process but messes up the other Solr processes somehow, possibly by overwriting temporary clusterstate files locally.
Solr logs reported the same error the OP posted.
Interestingly enough, another symptom was that ZooKeeper listed no leader for our collection in /collections/<name>/leaders/shard1/leader; normally this zk node contains contents such as:
{"core":"collection-name_shard1_replica1",
"core_node_name":"core_node7",
"base_url":"http://10.10.10.21:8983/solr",
"node_name":"10.10.10.21:8983_solr"}
But that node was completely missing on the cluster where the duplicate Solr instances were attempting to start.
This error also appeared in the Solr Logs:
HttpSolrCall null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /roles.json
To correct the issue, kill all instances of Solr (or of java, if you know it's safe), and restart the Solr service.
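A rough sketch of those steps (assuming a systemd-managed Solr service; adjust to your environment):
ps -ef | grep [s]olr       # list the Solr JVMs that are running
pkill -f solr              # or kill the listed PIDs individually
systemctl restart solr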
We figured it out!
The issue was that Jetty didn't really stop, so we had 2 running processes; for whatever reason this was fine for reading but not for writing.
Killing the older java process solved the issue.
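In case it helps anyone else, a quick way to spot the older of the two JVMs (purely illustrative):
ps -eo pid,lstart,cmd | grep [j]ava   # shows each java process with its start time
kill <pid-of-the-older-process>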

How to prevent stdout.out in WebLogic from growing heavily in size (Windows)

I have deployed a system integrated with WebLogic, but I'm facing a problem: WebLogic keeps growing the stdout.out file heavily (by gigabytes per week), which causes the system to load more and more slowly.
Is there any way to prevent it from growing so heavily, or to redirect it into a .log file?
Thanks a lot.
As David Herget says above, using the WebLogic Scripting Tool (WLST) to redirect StdOut and StdErr did not actually work for me either; I had to also do so through the web console (even though they appear to be set on the console) and restart the relevant jvms.
I can't reply to David's comment above due to being a newbie. [Edited since for clarity]
I'm not totally sure I fully understand your question.
Are you talking about the {server_name}.out file located in {Domain_Path}/servers/{server_name}/logs?
If so, I've never found any way to rotate those logs automatically, so I run a script each day to rotate the file (basically copying it to another name, zipping it, echoing a NULL into the original file, and erasing the older archives afterwards).
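In case it's useful, a rough sketch of that daily rotation (the domain path and server name are only placeholders):
#!/bin/sh
OUT=/path/to/domain/servers/myserver/logs/myserver.out
STAMP=$(date +%Y%m%d)
cp "$OUT" "$OUT.$STAMP"            # copy the current .out to a dated name
gzip "$OUT.$STAMP"                 # zip the copy
: > "$OUT"                         # truncate the original in place (the "echo NULL" step)
find "$(dirname "$OUT")" -name "*.out.*.gz" -mtime +30 -delete   # drop archives older than 30 days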
If you are talking about redirecting StdOut to the logs though, that can be done within the console for each server in the logging tab by checking "Redirect stdout logging enabled". Configuration to rotate those logs can also be done within that tab.
On that note, StdErr can also be redirected, but not from the console (in WL9). You have to set "RedirectStderrToServerLogEnabled" to true in the MBean tree via WLST (it's located at /Servers/{server_name}/Log/{server_name}).
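A short WLST sketch of that change (the server name and admin URL are just examples):
# connect to the admin server and switch to the edit tree
connect('username', 'password', 't3://adminhost:7001')
edit()
startEdit()
cd('/Servers/myserver/Log/myserver')
cmo.setRedirectStderrToServerLogEnabled(true)
cmo.setRedirectStdoutToServerLogEnabled(true)
save()
activate()
disconnect()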
I know the question was asked a long time ago, but I hope it helps nonetheless.
WebLogic provides log file rotation based on file size or a time interval.
You can try rotating the log files based on size. You would need to configure the log rotation policy from the admin console. Please refer to the link below for further details.
http://docs.oracle.com/cd/E12840_01/wls/docs103/ConsoleHelp/taskhelp/logging/RotateLogFiles.html
If you want to rotate the log files on demand, you can use the WLST script below.
C:\>java weblogic.WLST
#connect WLST to an Administration Server
wls:/offline> connect('username','password')
#navigate to the ServerRuntime MBean hierarchy
wls:/mydomain/serverConfig> serverRuntime()
wls:/mydomain/serverRuntime>ls()
#navigate to the server LogRuntimeMBean
wls:/mydomain/serverRuntime> cd('LogRuntime/myserver')
wls:/mydomain/serverRuntime/LogRuntime/myserver> ls()
-r-- Name myserver
-r-- Type LogRuntime
-r-x forceLogRotation java.lang.Void :
#force the immediate rotation of the server log file
wls:/mydomain/serverRuntime/LogRuntime/myserver> cmo.forceLogRotation()
wls:/mydomain/serverRuntime/LogRuntime/myserver>
http://docs.oracle.com/cd/E12840_01/wls/docs103/logging/config_logs.html#wp1001654