CouchDB terminates unexpectedly - crash

Today I wrote a performance-testing program that inserts data and adds attachments to CouchDB. The server software is Couchbase (a wrapper around CouchDB) and the operating system is Windows 2003 Server. The client is developed in C#, using the LoveSeat driver (which wraps CouchDB's HTTP methods). The data is quite simple, but the attachments are not small: about 70 KB each. There were about 200 attachments, and I was attaching them repeatedly. I started 5 threads in the client program.
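The client is C#, but the load pattern boils down to the sketch below (shown in Python against CouchDB's plain HTTP API; the "test" database name and document IDs are placeholders):

# Minimal sketch of the load pattern, assuming a CouchDB/Couchbase HTTP
# endpoint on localhost:5984 and a database named "test" (placeholders).
# Each of 5 workers repeatedly fetches a document's revision and PUTs a
# ~70 KB attachment onto it, mirroring the GET/PUT lines in the log below.
import threading

import requests

BASE = "http://localhost:5984/test"
ATTACHMENT = b"x" * 70 * 1024  # stand-in for a ~70 KB attachment

def worker(doc_id: str, iterations: int) -> None:
    for _ in range(iterations):
        # GET /test/<id> to learn the current revision; create the doc if missing.
        resp = requests.get(f"{BASE}/{doc_id}")
        if resp.status_code == 404:
            rev = requests.put(f"{BASE}/{doc_id}", json={"type": "perf"}).json()["rev"]
        else:
            rev = resp.json()["_rev"]
        # PUT /test/<id>/f?rev=... to (re-)attach the file, as in the log below.
        requests.put(
            f"{BASE}/{doc_id}/f",
            params={"rev": rev},
            data=ATTACHMENT,
            headers={"Content-Type": "application/octet-stream"},
        )

threads = [threading.Thread(target=worker, args=(str(i), 1000)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()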
Everything looked all right until the CouchDB server terminated unexpectedly. It was not the first time it had crashed. At first I suspected the client program, but in the end I found it was CouchDB that crashed. I don't think the data or the attachments are to blame, because I was inserting the same data and the same attachments over and over. The program ran for about an hour without problems before the crash. Here is the tail of the server log:
[Tue, 05 Jul 2011 11:00:19 GMT] [info] [<0.142.0>] 192.168.1.135 - - 'GET' /test/67366 200
[Tue, 05 Jul 2011 11:00:19 GMT] [info] [<0.108.0>] 192.168.1.135 - - 'GET' /test/7136 200
[Tue, 05 Jul 2011 11:00:19 GMT] [info] [<0.108.0>] 192.168.1.135 - - 'GET' /test/47306 200
[Tue, 05 Jul 2011 11:00:19 GMT] [info] [<0.108.0>] 192.168.1.135 - - 'GET' /test/27257 200
[Tue, 05 Jul 2011 11:00:19 GMT] [info] [<0.108.0>] 192.168.1.135 - - 'PUT' /test/7136/f?rev=1-334efd144dcdc52fd3a3a981dce4472f 201
[Tue, 05 Jul 2011 11:00:25 GMT] [error] [<0.145.0>] ** Generic server <0.145.0> terminating
** Last message in was {pread_iolist,4294342003}
** When Server state == {file,{file_descriptor,prim_file,{#Port<0.3143>,1464}},
0,4295164786}
** Reason for termination ==
** {{badmatch,{ok,<<183,92,29,219,169,127,153,2,50,217,252,186,178,175,202,
144,215,209,191,69,109,230,227,154,114,174,173,157,231,
153,246,124,105,239,174,51,143,24,108,175,101,215,175,
221,35,99,53,124,108,109,249,112,202,29,85,87,81,176,94,
219,11,103,129,231,25,111,242,108,246,207,107,72,173,172,
57,246,195,16,236,79,243,134,211,93,131,218,180,93,240,
173,213,199,226,175,176,217,250,154,89,39,237,157,250,77,
173,151,156,139,248,106,85,21,134,253,85,234,108,85,208,
67,177,130,124,247,161,98,77,173,126,170,111,80,84,45,
212,201,72,149,90,138,252,89,23,85,165,252,105,187,191,
41,86,125,148,106,149,175,252,78,185,198,154,207,172,142,
148,101,83,140,99,222,102,26,41,131,206,132,221,31,74,3,
172,176,158,236,136,71,120,169,63,35,161,251,208,86,202,
1,95,208,25,51,76,250,100,182,177,122,31,91,230,249,214,
245,229,250,212,118,86,167,120,116,6,173,78,113,18,171,
143,215,191,38,207,51,92,150,10,10,83,164,98,154,181,157,
......... (followed by a long run of numbers, truncated)

I'm sorry you hit this error. May I suggest posting this question on the Couchbase forums? Our support crew monitors those more closely than SO: http://www.couchbase.org/forums/

I think I've found the cause: the database file size reached 4 GB. According to this wiki page, Erlang/OTP release R14B01 no longer has this bug, so I suspect Couchbase bundles an embedded Erlang/OTP older than R14B01. (I also installed a standalone Erlang/OTP R14B03 on the machine, but it did not seem to be used.)
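The offsets in the crash log line up with the 32-bit boundary, which is easy to check:

# Numbers taken straight from the crash log above.
pread_offset = 4294342003   # from "Last message in was {pread_iolist,...}"
file_size    = 4295164786   # from the file server state
limit        = 2 ** 32      # 4294967296 bytes = 4 GiB

print(limit - pread_offset)  # 625293: the failing read started just below 4 GiB
print(file_size - limit)     # 197490: the file had just grown past 4 GiB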

The final conclusion:
Couchbase 1.02 does NOT support data files larger than 4 GB on a Windows 2003 32-bit machine.
Couchbase 2.0 developer preview DOES support data files larger than 4 GB on a Windows 2003 32-bit machine, but as far as I can tell it is at least 5 times slower than 1.02.
CouchDB 1.1 from this link DOES support files larger than 4 GB on a Windows 2003 32-bit machine, but it is as slow as Couchbase 2.0.
CouchDB is very slow on Windows (at least in my use case). In the end I tried MySQL for storing the files, and it turned out to be about 8 times faster: inserting an attachment into CouchDB takes 650 ms, while MySQL needs only 80 ms.
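For what it's worth, the comparison can be reproduced with a rough timing harness like the sketch below (Python; the server addresses, the "test" database, and the MySQL files table with a LONGBLOB column are all assumptions):

# Rough latency sketch: one ~70 KB attachment PUT into CouchDB versus one
# BLOB INSERT into MySQL. Assumes a local CouchDB with a "test" database
# and a local MySQL table: CREATE TABLE files (id INT AUTO_INCREMENT
# PRIMARY KEY, data LONGBLOB). All names are hypothetical.
import time

import pymysql  # assumes the PyMySQL driver is installed
import requests

blob = b"x" * 70 * 1024  # ~70 KB payload, as in the original test

# CouchDB: create a document, then time the attachment PUT.
doc = requests.put("http://localhost:5984/test/timing-doc", json={}).json()
start = time.perf_counter()
requests.put(
    "http://localhost:5984/test/timing-doc/f",
    params={"rev": doc["rev"]},
    data=blob,
    headers={"Content-Type": "application/octet-stream"},
)
print(f"couchdb attachment PUT: {(time.perf_counter() - start) * 1000:.0f} ms")

# MySQL: time an equivalent BLOB insert.
conn = pymysql.connect(host="localhost", user="root", password="", database="test")
with conn.cursor() as cur:
    start = time.perf_counter()
    cur.execute("INSERT INTO files (data) VALUES (%s)", (blob,))
    conn.commit()
print(f"mysql BLOB INSERT:      {(time.perf_counter() - start) * 1000:.0f} ms")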

Related

How to trace memory allocations in Apache httpd server?

I am running Apache Bench (ab) against a local httpd 2.4.52 server. I want to track how many memory allocations the server makes, and of what sizes.
I run 'valgrind --trace-malloc=yes ab -n 10 http://127.0.0.1/'.
But the number of allocations is ~4.6k regardless of the number of requests (I tried 10, 100, and 1000).
Is this because Apache uses its own custom memory allocator?
How can I track the allocations (specifically the number of allocations and the total/average allocation size) for this custom allocator?
This page mentions an option named ALLOC_USE_MALLOC in the apr code, but I could not find this option in the apr source code (I checked versions 1.7.0, 1.4.8, and 1.4.2, as well as httpd 2.0.51).
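For background on that hypothesis: APR pools grab large blocks from malloc up front and hand out per-request memory as slices of those blocks; request pools are cleared and reused rather than freed, so the malloc count valgrind reports barely moves with request volume. A loose conceptual sketch of pool allocation, not APR's real API:

# Conceptual pool allocator: mallocs happen when blocks are created,
# not on every per-request allocation, so allocation counts stay flat.
class Pool:
    BLOCK_SIZE = 8192

    def __init__(self) -> None:
        self.blocks = [bytearray(self.BLOCK_SIZE)]  # one upfront "malloc"
        self.used = 0

    def palloc(self, size: int) -> memoryview:
        """Hand out a slice of the current block; grow only when full."""
        if self.used + size > self.BLOCK_SIZE:
            self.blocks.append(bytearray(max(size, self.BLOCK_SIZE)))  # rare "malloc"
            self.used = 0
        view = memoryview(self.blocks[-1])[self.used:self.used + size]
        self.used += size
        return view

    def clear(self) -> None:
        """Reset for reuse; blocks are kept, so no free/malloc churn."""
        self.used = 0

pool = Pool()
for _request in range(1000):   # many requests...
    buf = pool.palloc(256)     # ...but no new malloc on this hot path
    pool.clear()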

Redis cluster continuously prints WSA_IO_PENDING log messages

When I start up all the redis-server processes of the Redis cluster, they continuously print log lines like WSA_IO_PENDING and clusterWriteDone:
[9956] 03 Feb 18:17:25.044 # WSA_IO_PENDING writing to socket fd
[9956] 03 Feb 18:17:25.062 # clusterWriteDone written 2520 fd 15
[9956] 03 Feb 18:17:25.545 # WSA_IO_PENDING writing to socket fd
[9956] 03 Feb 18:17:25.568 # WSA_IO_PENDING writing to socket fd
There is no way to specifically turn those "warnings" off in the 3.2.x port of Redis for Windows, as the logging statements use the highest level, LL_WARNING. This issue has been reported in my fork of that unmaintained MSOpenTech repo (which I updated to Redis 4.0.2) and has been fixed by decreasing that level to LL_DEBUG. More details: https://github.com/tporadowski/redis/issues/14
This change will be included in the next release (4.0.2.3), or you can get the latest source code and build it yourself.
Current releases can be found here: https://github.com/tporadowski/redis/releases
An issue was opened in the official Redis repo 10 months ago about this problem. Unfortunately it seems to be abandoned, and it hasn't been solved yet:
Redis cluster print "WSA_IO_PENDING writing to socket..." continuously, does it matter?
However, that issue may not be related to Redis itself, but to the Windows Sockets API, as pointed out by Cy Rossignol in the comments. It's the Winsock API that returns that status to the application, as described in the documentation:
WSA_IO_PENDING (997): Overlapped operations will complete later. The application has initiated an overlapped operation that cannot be completed immediately. A completion indication will be given later when the operation has been completed. Note that this error is returned by the operating system, so the error number may change in future releases of Windows.
Maybe it didn't get much attention because it's not a bug, although it's certainly an inconvenience that floods the system logs. In that case, you may not get help there.
It seems there's no temporary fix. The Windows Redis fork is archived, and I don't know whether you could get any help there either.
Go to this location: C:\Program Files\Redis
Open the file redis.windows-service.conf in Notepad.
You will find a section like the one below:
# Specify the server verbosity level.
# This can be one of:
# debug (a lot of information, useful for development/testing)
# verbose (many rarely useful info, but not a mess like the debug level)
# notice (moderately verbose, what you want in production probably)
# warning (only very important / critical messages are logged)
loglevel notice
# Specify the log file name. Also 'stdout' can be used to force
# Redis to log on the standard output.
logfile "Logs/redis_log.txt"
Here you can change the value of loglevel as per your requirement. I think changing it to warning will address this issue, because at that level only very important / critical messages are logged.
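The loglevel directive is a simple severity threshold: Redis drops messages below the configured level and logs everything at or above it. A minimal sketch of the idea, not Redis's actual code:

# Severity threshold filtering, as used by the loglevel directive.
LEVELS = {"debug": 0, "verbose": 1, "notice": 2, "warning": 3}

def should_log(message_level: str, configured_level: str) -> bool:
    # Log only messages at or above the configured severity threshold.
    return LEVELS[message_level] >= LEVELS[configured_level]

assert should_log("warning", "notice")     # warnings pass a "notice" threshold
assert not should_log("debug", "notice")   # debug noise is suppressed

This is also why the fork mentioned above lowers the WSA_IO_PENDING messages to debug level: a message logged at warning severity passes every threshold.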

CentOS server hung due to postfix/sendmail spam emails

My CentOS server runs web applications on a LAMP stack. A couple of days back, the server stopped responding for about 10 minutes and I got an HTTP response failure alert from my monitoring tool. When I checked the httpd error log, I found a huge block of entries (~12,000 lines) related to sendmail:
14585 sendmail: fatal: open /etc/postfix/main.cf: Permission denied
The server ran out of memory and stopped responding:
14534 [Fri Aug 19 22:14:52.597047 2016] [mpm_prefork:error] [pid 26641] (12)Cannot allocate memory: AH00159: fork: Unable to fork new process
14586 /usr/sbin/sendmail: error while loading shared libraries: /lib64/librt.so.1: cannot allocate version reference table: Cannot allocate memory
We are not using sendmail in any of our applications. How can I prevent this kind of attack in the future?
Thank you in advance!
Sorry, I have no comment privileges, so answering here: it looks like one of your website pages is vulnerable to code injection. Finding out where, and on which page, may be a huge job; focus on input (form) variables, and always sanitize input variables before using them! P.S. PHP uses "sendmail": even if you run Postfix, PHP will invoke a sendmail binary to send mail, and that binary hands the message to Postfix. If your forms work correctly and the 12k error-log lines came out of the blue, then I would think someone is trying to inject code through your website (this happens all the time, by the way).
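As a minimal illustration of that advice (Python, with hypothetical form-field names), sanitizing mail-form input before it ever reaches sendmail could look like this:

import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def sanitize_mail_field(value: str, max_len: int = 256) -> str:
    """Reject CR/LF so user input cannot inject extra mail headers
    (for example Bcc: lines) into a message handed to sendmail."""
    if "\r" in value or "\n" in value:
        raise ValueError("newline in mail field: possible header injection")
    return value.strip()[:max_len]

def validate_contact_form(form: dict) -> dict:
    """Validate the fields of a hypothetical contact form."""
    sender = sanitize_mail_field(form.get("email", ""))
    if not EMAIL_RE.match(sender):
        raise ValueError("invalid sender address")
    subject = sanitize_mail_field(form.get("subject", ""))
    body = form.get("message", "")[:10000]  # the body may contain newlines
    return {"email": sender, "subject": subject, "message": body}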

Rails server goes down frequently in production mode

I'm using Rails 3.0.5 and Ruby 1.9.2 in my application.
It works fine in development mode, but in production the server goes down every 3-4 days.
It gives the error below in the /var/log/https/error.log file.
[Sun Oct 21 09:39:03 2012] [error] [IP_ADDRESS] **Premature end of script headers:**
[ pid=24971 thr=1 file=ext/apache2/Hooks.cpp:817 time=2012-10-21 09:39:03.371 ]:
The backend application (process 29805) did not send a valid HTTP response; instead, it sent nothing at all. It is possible that it has crashed; please check whether there are crashing bugs in this application.
I can't work out why the server goes down.
Which server are you using, WEBrick or something else? I ran into a similar problem in the past where the server went down; I switched from WEBrick to Mongrel, which is faster than WEBrick.
Sorry, I can't comment, so answering instead: I personally haven't come across this problem, but there seems to be quite some talk about it. Here are a few resources I came across:
Dalibor Nasevic's explanation as to why this is happening
Premature end of script headers — Rails
Intermittent “premature end of script headers” with Rails 3.1
Hope it helps.

Repcached issue

Anyone using repcached?
Basically, I am experimenting with it for storing sessions there, while providing failover with its built-in replication.
I have 2 nodes running CentOS 5.4. The replication works fine when testing it and running some benchmarks with ab.
However, I am running the following test:
I start the 2 nodes replicating and kick off an ab run. While the benchmark is running, I take down one of the nodes, just to check the failover.
At that point apache's error log starts printing
[Fri Oct 15 21:39:02 2010] [notice] child pid 2941 exit signal Segmentation fault (11)
It seems that some requests fail during the failover.
Has anyone encountered such behavior?
Thanks
It turns out that this is caused by PHP's memcache client extension.
Installing version 3.0.3 from PECL did the trick.
The error appears with versions 3.0.4 and 3.0.5.
I hope this saves people some time.