Repcached issue - Apache

Anyone using repcached?
Basically, I am experimenting with it in order to store sessions there, with failover provided by its built-in replication.
I have 2 nodes running CentOS 5.4. The replication works fine when testing it and running some benchmarks with ab.
However, I am running the following test.
I start both nodes replicating and begin an ab test. While the benchmark is running, I take down one of the nodes, just to check the failover.
At that point Apache's error log starts printing
[Fri Oct 15 21:39:02 2010] [notice] child pid 2941 exit signal Segmentation fault (11)
It seems that some requests fail during the failover.
Anyone encountered such behavior?
Thanks

It turns out that this is caused by PHP's memcache client.
Installing version 3.0.3 from PECL did the trick.
The error appears with versions 3.0.4 and 3.0.5.
I hope this saves people some time.
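For anyone trying to reproduce the setup, here is a rough sketch of the client side; the node hostnames, ports and ini settings are my own assumptions, not taken from the original post:

# pin the PECL memcache client to the working version
pecl uninstall memcache
pecl install memcache-3.0.3

# php.ini - store sessions in memcached/repcached, listing both nodes so the
# client can fail over to the surviving node
extension = memcache.so
session.save_handler = memcache
session.save_path = "tcp://node1:11211, tcp://node2:11211"
memcache.allow_failover = 1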

Related

"Tidying up remote" - JMeter

I am trying to run a distributed test using one host and one slave machine. While trying to run in non-GUI mode I get the output below:
Tidying up remote # Wed Nov 09 10:00:00 IST YYYY (1320849384380)
... end of run
I have tried different scenarios to work around it, but nothing helped. Has anyone run into this issue before?
"Tidying up remote" message is not an error, it's for debug purposes and it's printing it always to System.out.
It can't be disabled (without an enhancement) and you can ignore it if you don't need it
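For anyone searching for this message: it shows up at the end of a normal non-GUI distributed run, which (roughly, with a placeholder plan name and slave address) looks like:

jmeter -n -t testplan.jmx -R 192.168.1.10 -l results.jtl
# -n non-GUI mode, -t test plan, -R comma-separated remote/slave hosts, -l results file

The "Tidying up remote ..." and "... end of run" lines are just JMeter shutting the remote engines down after the run finishes.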

CentOS server hung due to Postfix/sendmail spam emails

My CentOS server is running web applications on a LAMP stack. A couple of days back, the server was not responding for about 10 minutes and I got an HTTP response failure alert from my monitoring tool. When I checked the httpd error log I found a huge number of log entries (~12000 lines) related to sendmail.
14585 sendmail: fatal: open /etc/postfix/main.cf: Permission denied
The server ran out of memory and stopped responding.
14534 [Fri Aug 19 22:14:52.597047 2016] [mpm_prefork:error] [pid 26641] (12)Cannot allocate memory: AH00159: fork: Unable to fork new process
14586 /usr/sbin/sendmail: error while loading shared libraries: /lib64/librt.so.1: cannot allocate version reference table: Cannot allocate memory
We are not using sendmail in any of our applications. How can I stop this attack in the future?
Thank you in advance!
Sorry, I have no comment facilities; it looks like one of your website pages is vulnerable to code injection. Finding out where and on what page may be a huge job; focus on input (form) variables, and always sanitize input variables before using them! P.S. PHP uses "sendmail": even if you use Postfix, PHP will call a sendmail binary to send mail, and that sendmail binary hands the message off to Postfix. If your forms work well and the 12k error log lines came out of the blue, then I would think someone is trying to inject code through your website (it happens all the time, by the way).
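Not part of the answer above, but a rough triage sketch for this situation; the log path assumes a stock CentOS/Apache layout:

mailq | tail                              # see how much spam is still sitting in the Postfix queue
postsuper -d ALL                          # drop everything queued (only if you are sure it is all spam)
grep POST /var/log/httpd/access_log | awk '{print $7}' | sort | uniq -c | sort -rn | head
# the last command lists the most frequently hit POST URLs; correlate them with the
# timestamps of the sendmail bursts to find the abused form or script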

Solr issue: ClusterState says we are the leader, but locally we don't think so

So today we ran into a disturbing Solr issue.
After a restart of the whole cluster, one of the shards stopped being able to index/store documents.
We had no hint of the issue until we started indexing (querying the server looked fine).
The error is:
2014-05-19 18:36:20,707 ERROR o.a.s.u.p.DistributedUpdateProcessor [qtp406017988-19] ClusterState says we are the leader, but locally we don't think so
2014-05-19 18:36:20,709 ERROR o.a.s.c.SolrException [qtp406017988-19] org.apache.solr.common.SolrException: ClusterState says we are the leader (http://x.x.x.x:7070/solr/shard3_replica1), but locally we don't think so. Request came from null
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:503)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:267)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:550)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:126)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:101)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:65)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
We run Solr 4.7 in cluster mode (5 shards) on Jetty.
Each shard runs on a different host with one ZooKeeper server.
I checked the ZooKeeper log and I cannot see anything there.
The only difference is that in the /overseer_elect/election folder I see this specific server repeated 3 times, while the other servers are only mentioned twice.
45654861x41276x432-x.x.x.x:7070_solr-n_00000003xx
74030267x31685x368-x.x.x.x:7070_solr-n_00000003xx
74030267x31685x369-x.x.x.x:7070_solr-n_00000003xx
Not even sure if this is relevant. (Can it be?)
Any clue what other checks we can do?
We've experienced this error under 2 conditions.
Condition 1
On a single ZooKeeper host there was an orphaned ZooKeeper ephemeral node in
/overseer_elect/election. The session this ephemeral node was associated with no longer existed.
The orphaned ephemeral node cannot be deleted.
Caused by: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
This condition will also be accompanied by a /overseer/queue directory that is clogged-up with queue items that are forever waiting to be processed.
To resolve the issue you must restart the ZooKeeper node that holds the orphaned ephemeral node.
If after the restart you see "Still seeing conflicting information about the leader of shard shard1 for collection <name> after 30 seconds", you will need to restart the Solr hosts as well to resolve the problem.
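A minimal sketch of how the orphaned node can be located and cleared; the ZooKeeper install path, ensemble address and the <node> name are placeholders, adjust for your setup:

/opt/zookeeper/bin/zkCli.sh -server zk1:2181
# inside zkCli:
ls /overseer_elect/election
stat /overseer_elect/election/<node>      # ephemeralOwner shows the owning session id
# compare that session id with the live sessions from: echo cons | nc zk1 2181
# if the owning session no longer exists, restart that ZooKeeper node:
/opt/zookeeper/bin/zkServer.sh restart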
Condition 2
Cause: a mis-configured systemd service unit.
Make sure you have Type=forking and have PIDFile configured correctly if you are using systemd.
systemd was not tracking the PID correctly: it thought the service was dead, but it wasn't, and at some point 2 services were started. Because the 2nd service cannot start (they can't both listen on the same port), it seems to just sit there hanging in a failed state, or it fails to start the process but messes up the other Solr processes somehow, possibly by overwriting temporary clusterstate files locally.
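As an illustration only (not our actual unit file; the paths and pid file name are assumptions based on a default Solr install), the relevant part of a forking-style unit looks something like:

[Service]
Type=forking
PIDFile=/var/solr/solr-8983.pid
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop

With Type=forking and a correct PIDFile, systemd follows the daemonized Solr process instead of assuming the service died when the launcher exits.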
Solr logs reported the same error the OP posted.
Interestingly enough, another symptom was that ZooKeeper listed no leader for our collection in /collections/<name>/leaders/shard1/leader. Normally this ZK node contains contents such as:
{"core":"collection-name_shard1_replica1",
"core_node_name":"core_node7",
"base_url":"http://10.10.10.21:8983/solr",
"node_name":"10.10.10.21:8983_solr"}
But the node was completely missing on the cluster where duplicate Solr instances were attempting to start.
This error also appeared in the Solr Logs:
HttpSolrCall null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /roles.json
To correct the issue, kill all instances of Solr (or java, if you know it's safe) and restart the Solr service.
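Roughly, the cleanup meant here (the service name is an assumption; adapt to how Solr is managed on your hosts):

ps aux | grep -i solr          # confirm there is more than one Solr/Jetty java process
kill <older-pid>               # or: killall java  (only if nothing else on the box runs on Java!)
systemctl restart solr         # or: service solr restart on older init.d setups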
We figured it out!
The issue was that Jetty didn't really stop, so we had 2 running processes; for whatever reason this was fine for reading but not for writing.
Killing the older java process solved the issue.

Rails server goes down frequently in production mode

I'm using Rails 3.0.5 and Ruby 1.9.2 in my application.
It works fine in development mode, but in production mode the server goes down every 3-4 days.
It gives the error below in the /var/log/https/error.log file.
[Sun Oct 21 09:39:03 2012] [error] [IP_ADDRESS] **Premature end of script headers:**
[ pid=24971 thr=1 file=ext/apache2/Hooks.cpp:817 time=2012-10-21 09:39:03.371 ]:
The backend application (process 29805) did not send a valid HTTP response; instead, it sent nothing at all. It is possible that it has crashed; please check whether there are crashing bugs in this application.
I can't figure out the reason the server goes down.
Which server are you using? WEBrick or something else? I had a problem like this in the past where the server went down. I changed the server from WEBrick to Mongrel; it's faster than WEBrick.
Sorry, I can't comment so I'm answering. I personally haven't come across this problem, but there seems to be quite some talk about it. Here are a few resources that I came across:
Dalibor Nasevic's explanation as to why this is happening
Premature end of script headers — Rails
Intermittent “premature end of script headers” with Rails 3.1
Hope it helps.

"signal Segmentation fault". Where this error is coming from?

From time to time my Apache server logs this error:
[Sat Nov 07 05:35:01 2009] [notice] child pid 2795 exit signal Segmentation fault (11)
What may be the reason behind the error?
Thanks!
Perhaps it helps to reduce the value of MaxRequestsPerChild in your apache2.conf. In addition, it might be helpful to disable all Apache modules you have no need for.
It looks like you are running a CGI of some sort that is segfaulting under certain conditions. Check what CGIs you have and then test them. Most likely they will be C or C++ based CGIs, since it's a segfault, but there is no guarantee.
A segfault is basically caused by an attempt to access memory in an unauthorized way. To determine where the problem occurred, a core file may have been generated on your system. If necessary, the system has to be configured to produce those files; this depends on your system, see coreadm(1M) for instance.
Once you have the core file, you can get the stack trace of the process that caused the fault with a utility such as pstack, and much more with a debugger.
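A hedged sketch of how that can be set up for Apache on Linux (directory, paths and core file name are assumptions; coreadm(1M) is the Solaris-side equivalent):

# httpd.conf - tell Apache where to drop core files
CoreDumpDirectory /tmp/apache-cores

# prepare the directory and allow cores before (re)starting Apache
mkdir -p /tmp/apache-cores && chown apache:apache /tmp/apache-cores
ulimit -c unlimited
apachectl restart

# after the next child segfaults, read the stack trace from the core file
gdb /usr/sbin/httpd /tmp/apache-cores/core.XXXX    # then type: bt
# on Solaris, pstack <corefile> gives a quick trace without a full debugger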