Apache benchmark: what does the total mean milliseconds represent? - apache

I am benchmarking a PHP application with ApacheBench. The server is on my local machine. I run the following:
ab -n 100 -c 10 http://my-domain.local/
And get this:
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3    3.7      2      8
Processing:   311  734  276.1    756   1333
Waiting:      310  722  273.6    750   1330
Total:        311  737  278.9    764   1341
However, if I refresh the page http://my-domain.local/ in my browser, it takes a lot longer than the 737 ms mean that ab reports to load (around 3000-4000 ms). I can repeat this many times, and loading the page in the browser always takes at least 3000 ms.
I tested another, heavier page (page load in browser takes 8-10 seconds). I used a concurrency of 1 to simulate one user loading the page:
ab -n 100 -c 1 http://my-domain.local/heavy-page/
And the results are here:
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0    0.1      0      1
Processing:    17   20    4.7     18     46
Waiting:       16   20    4.6     18     46
Total:         17   20    4.7     18     46
So what does the Total line in the ab results actually tell me? Clearly it's not the number of milliseconds the browser spends loading the web page. Is the time it takes the browser to load the page (X) linearly dependent on the total mean ab reports (Y)? In other words, if I manage to cut Y in half, have I also cut X in half?
(Also, I'm not really sure what Processing, Waiting and Total mean.)
I'll reopen this question since I'm facing the problem again.
Recently I installed Varnish.
I run ab like this:
ab -n 100 http://my-domain.local/
Apache bench reports very fast response times:
Requests per second: 462.92 [#/sec] (mean)
Time per request: 2.160 [ms] (mean)
Time per request: 2.160 [ms] (mean, across all concurrent requests)
Transfer rate: 6131.37 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0    0.0      0      0
Processing:     1    2    2.3      1     13
Waiting:        0    1    2.0      1     12
Total:          1    2    2.3      1     13
So the time per request is about 2.2 ms. When I browse the site (as an anonymous user) the page load time is about 1.5 seconds.
Here is a picture from the Firebug Net tab. As you can see, my browser waits 1.68 seconds for my site to respond. Why is this number so much bigger than the request times ab reports?

Are you running ab on the server? Don't forget that your browser is local to you, on a remote network link. An ab run on the webserver itself will have almost zero network overhead and report basically the time it takes for Apache to serve up the page. Your home browser link will have however many milliseconds of network transit time added in, on top of the basic page-serving overhead.
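If you want a client-side number to compare against ab without involving the browser, you can time a full fetch of the same URL from the client machine. Here is a minimal Python sketch; the URL is the one from the question, everything else is illustrative:

import time
import urllib.request

URL = "http://my-domain.local/"  # same URL used in the ab runs above

start = time.perf_counter()
with urllib.request.urlopen(URL) as response:
    body = response.read()  # includes connection setup and transfer time
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"fetched {len(body)} bytes in {elapsed_ms:.1f} ms")

Note that, like ab, this fetches only the single HTML document; a browser additionally downloads CSS, JavaScript and images and then renders the page, which is part of what Firebug's timeline shows.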

OK, I think I know what the problem is. While I have been measuring the page load time in the browser I have been logged in, so none of the caching is happening for my requests. The page load times in the browser as an anonymous user are much closer to the ones ab reports.

Related

Multithreading pdf workload slows the program down

I tried building a program that scans PDF files for certain elements and then outputs a new PDF file with all the pages that contain those elements. It was originally single-threaded and a bit slow to run; it took about 36 seconds on a 6-core 5600X. So I tried parallelizing it with concurrent.futures:
import PyPDF2

# "symbol" (the search string) is defined elsewhere in the script
def process_pdf(filename):
    # Open the PDF file
    f = open(filename, "rb")
    print("Searching: " + filename)
    # Create a PDF object
    pdf = PyPDF2.PdfReader(f)
    # Extract the text from each page in the PDF
    extracted_text = [page.extract_text() for page in pdf.pages]
    # Initialize a list
    matching_pages = []
    # Iterate through the extracted text
    for j, text in enumerate(extracted_text):
        # Search for the symbol in the text
        if symbol in text:
            # If the symbol is found, get a new PageObject instance for the page
            page = pdf.pages[j]
            # Add the page to the list
            matching_pages.append(page)
    return matching_pages
Multiprocessing Block:
import concurrent.futures
import os

with concurrent.futures.ThreadPoolExecutor() as executor:
    # Get a list of the file paths for all PDF files in the directory
    file_paths = [
        os.path.join(directory, filename)
        for filename in os.listdir(directory)
        if filename.endswith(".pdf")
    ]
    futures = [executor.submit(process_pdf, file_path) for file_path in file_paths]
    # Initialize a list to store the results
    results = []
    # Iterate through the futures as they complete
    for future in concurrent.futures.as_completed(futures):
        # Get the result of the completed future
        result = future.result()
        # Add the result to the list
        results.extend(result)

# Add each page to the new PDF file (output_pdf is a PdfWriter created earlier)
for page in results:
    output_pdf.add_page(page)
The parallel version works, as is evident from the printed text, but it somehow doesn't scale at all: 1 thread takes ~35 seconds, 12 threads take ~38 seconds.
Why? Where's my bottleneck?
I tried using other libraries, but most were broken or slower.
I tried using re instead of in to search for the symbol; no improvement.
In general, PDF processing consumes a lot of a device's primary resources. Each device and system will be different, so by way of illustration here is one simple PDF task:
searching for text with a common Python library component, Poppler's pdftotext (others may be faster, but the aim here is to show the normal attempts and issues).
As a ballpark yardstick: in 1 minute I scanned roughly 2,500 pages of one file for the word "AMENDMENTS" and found about 900 occurrences, such as:
page 82
[1] SHORT TITLE OF 1971 AMENDMENTS
[18] SHORT TITLE OF 1970 AMENDMENTS
[44] SHORT TITLE OF 1968 AMENDMENTS
[54] SHORT TITLE OF 1967 AMENDMENTS
page 83
[11] SHORT TITLE OF 1966 AMENDMENTS
[23] SHORT TITLE OF 1965 AMENDMENTS
[42] SHORT TITLE OF 1964 AMENDMENTS
page 84
[16] SHORT TITLE OF 1956 AMENDMENTS
[26] SHORT TITLE OF 1948 AMENDMENTS
page 85
[43] DRUG ABUSE, AND MENTAL HEALTH AMENDMENTS ACT OF 1988
Scanning the whole file (13,234 pages) would take about 5 minutes 20 seconds,
and I know from testing that 4 CPUs could process 1/4 of that file (3,308 of the 13,234 pages) in under 1 minute (there is a gain from using smaller files).
So a 4-core device should manage, say, 3,000 pages per core in a short time. Well, let's see.
If I thread that as 3 x 1,000 pages, one thread finishes after about 50 seconds, another after about 60 seconds and the third after about 70 seconds.
Overall there is little or only a slight gain, because the one application is effectively spending a third of its time in each thread.
Overall: about 3,000 pages per minute.
Let's try being clever and run 3 separate applications on the one file. Surely they can each take less time? Guess what:
one finishes after about 50 seconds, another after about 60 seconds and a third after about 70 seconds. No gain from using 3 applications.
Overall: about 3,000 pages per minute.
Let's try being even more clever and run 3 applications on 3 similar but different files. Surely they can each take less time? Guess what:
one finishes after about 50 seconds, another after about 60 seconds and a third after about 70 seconds. No gain from using 3 applications with 3 tasks.
Overall: about 3,000 pages per minute.
Whichever way I try, the resources on this device are constrained to roughly 3,000 pages per minute overall.
I may just as well let 1 thread run unfettered.
So the basic answer is to use multiple devices, the same way graphics render farming is done.
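If you want to reproduce that per-core page-range split from Python rather than by launching separate applications, here is a rough sketch using ProcessPoolExecutor, so each chunk gets its own process and the CPU-bound text extraction is not serialized by the GIL. The function names, the chunk sizing and the use of PyPDF2's PdfReader are illustrative assumptions, and on a device that is already I/O-constrained like the one above it may still plateau at roughly the same overall throughput.

import os
from concurrent.futures import ProcessPoolExecutor

from PyPDF2 import PdfReader


def scan_range(path, start, stop, symbol):
    # Each worker process opens its own reader and scans pages [start, stop)
    reader = PdfReader(path)
    hits = []
    for i in range(start, min(stop, len(reader.pages))):
        text = reader.pages[i].extract_text() or ""
        if symbol in text:
            hits.append(i)
    return hits


def scan_file(path, symbol, workers=None):
    workers = workers or os.cpu_count()
    n_pages = len(PdfReader(path).pages)
    chunk = (n_pages + workers - 1) // workers  # pages per worker, rounded up
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(scan_range, path, start, start + chunk, symbol)
            for start in range(0, n_pages, chunk)
        ]
        # Gather matching page numbers from every chunk
        return sorted(i for f in futures for i in f.result())


if __name__ == "__main__":
    # "big.pdf" and "AMENDMENTS" are placeholder inputs for the example
    print(scan_file("big.pdf", "AMENDMENTS"))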

Apache Intermittent Hang: is it Network Lag?

I have an intermittent lag on the web applications I am serving from Apache on a Debian box. Apache and MySQL check out. I am far from fully utilizing the box's CPU/memory. Still, there is an intermittent lag. My theory is that there is a network rate limit that needs to be tweaked. Stats below.
Apache Server Status
Current Time: Tuesday, 02-Jun-2020 14:36:53 EDT
Restart Time: Monday, 01-Jun-2020 01:00:03 EDT
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 day 13 hours 36 minutes 50 seconds
Server load: 2.95 3.23 3.09
Total accesses: 1213060 - Total Traffic: 22.0 GB - Total Duration: 32311929295
CPU Usage: u396.94 s164.31 cu2065.15 cs789.27 - 2.52% CPU load
8.96 requests/sec - 170.5 kB/second - 19.0 kB/request - 26636.7 ms/request
296 requests currently being processed, 66 idle workers
WR.WWWW.KWW_W._W_KWWWWWWKWWWWW_WWWWK_WK_WWW_WW_RWWWWWKCWWWWWW._W
_WW_R_W_.__K_WWWW__WWWWWWKKWWWWWWKWWWW_W____WWWWWWWW_WWW_KWWWWWW
WWWWWWWW_.WWWWWK_WWW_WWKWWWWWWKWWKWK_WWWWWRKWWW.WW_KKWKWWWKW_WWW
WW.W_.K._WWWK_WW_K_K._WW..WWWWWWW_.W_WWWW_W_W.W_WWWW_.WWKWK_WKWW
_W_WWWW_W.WWWWWW.WWWW_K__..W.WW_WWWWWWWWKRW_WWW_C.W_KW_WWW_KW.._
..WWWWWWWCWWW.WWW_WKKWWWW_._WWW.....WWW.W_W.W._.KW...W...WWW.WWW
W..W..K..WW_.W._................W..._W.W.....K.W.K_...R..K...W.W
...W..W.............................................
top
top - 14:31:14 up 79 days, 21:39, 3 users, load average: 2.26, 2.57, 2.86
Tasks: 717 total, 1 running, 716 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.3 us, 0.7 sy, 0.2 ni, 95.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
MiB Mem : 64365.1 total, 539.8 free, 8847.0 used, 54978.4 buff/cache
MiB Swap: 65477.0 total, 63810.0 free, 1667.0 used. 54580.5 avail Mem
ss -s
Total: 1934
TCP: 2362 (estab 1233, closed 1105, orphaned 2, timewait 1104)
Transport Total IP IPv6
RAW 0 0 0
UDP 0 0 0
TCP 1257 430 827
INET 1257 430 827
FRAG 0 0 0
ulimit -n
1024
ss -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n
1 Local
6 192.XXX.XXX.XXX
100 127.0.0.1
340 10.0.0.XX
866 [
ss -ntu | awk '{print $6}' | cut -d: -f1 | sort | uniq -c | sort -n
..........
This lists the number of connections per IP. Besides 127.0.0.1 and [ there are 2 IPs with more than 50 connections each.
74 104.xxx.xxx.xxx
91 12.xxx.xxx.xxx
MySQL
No processes running more than a second. Number of processes well within limits.
I do not know what stats would be relevant beyond these in diagnosing network rate limiting issues. Any pointers would be appreciated.
EDITED
CPU
lscpu https://pastebin.com/Jha6F7J8
Apache Config
apachectl -t -D DUMP_RUN_CFG https://pastebin.com/i1L2hnjH
Mysql
SHOW GLOBAL STATUS https://pastebin.com/aQX4D01k
SHOW GLOBAL VARIABLES https://pastebin.com/L8EfmHfn
SHOW FULL PROCESSLIST https://pastebin.com/GtqK2tET
mysqltuner https://pastebin.com/GLhhKA9q
Optional Very Helpful Information
top -bn1 https://pastebin.com/r94vpXe6
iostat -xm 5 3 https://pastebin.com/R8YLK3QU
ulimit -a https://pastebin.com/KUC3wqxU
Dorothy, your system is very busy. Not knowing the frequency and duration of the intermittent hangs puts us at a disadvantage. One possible cause: com_drop_table had 3,318 uses in your 83 days of uptime. Another possible cause is the volume of data read and written: it appears innodb_data_written was 484 TB in 83 days, yet MySQLTuner reports only 800K of data in 10 tables. Our General Log Analysis could likely identify the cause of this high activity. These suggestions are a starting effort; more analysis and changes should follow.
From your OS command prompt,
ulimit -n 96000 would enable many more Open Files (handles) above today's 1024 limit.
This is a dynamic operation in Linux and does not require OS restart to be implemented.
For this change to persist across OS stop/start, the following URL could be used as a guide.
Please use 96000, not the 500000 from their example documentation.
https://glassonionblog.wordpress.com/2013/01/27/increase-ulimit-and-file-descriptors-limit/
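As a quick cross-check (not part of the original suggestion), a process can report the open-files limit it actually sees via Python's standard resource module, which is handy for confirming the new value took effect for freshly started services:

import resource

# Soft and hard limits on open file descriptors for the current process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open files: soft limit {soft}, hard limit {hard}")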
Rate Per Second = RPS
Suggestions to consider for your my.cnf [mysqld] section
innodb_io_capacity=1900 # from 200 if you have SSD, 900 if you have magnetic storage to improve IOPS
net_buffer_length=32K # from 16K to reduce malloc operations
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function
key_cache_segments=16 # from 0 to reduce mutex contention with MyISAM opens
key_cache_division_limit=50 # from 100 for Hot/Warm storage to reduce key_page_reads RPS of 18
aria_pagecache_division_limit=50 # from 100 for Hot/Warm storage to reduce aria_pagecache_reads RPS of 5K
read_rnd_buffer_size=64K # from 256K to reduce handler_read_rnd_next RPS of 27,707
These changes should reduce elapsed time to complete most queries.
Additional areas to consider include the use of Slow Query Log analysis to find where an index could avoid a table scan. MySQLTuner reported more than 4 million joins performed without indexes. Our FAQ page includes information on how you could find the tables needing indexes to avoid scans. Let us know how these suggestions work for you.
Skype Talk works very well if you have the flexibility to use that form of communication.

rabbitmq-c consumer not receiving all messages

I have acks enabled on the consumer, and the producer sends 2000 messages to the server. What I see is that only around 1700 messages are received at the consumer. Can someone tell me what is wrong?
I am running the provided example code from the rabbitmq-c library:
./amqp_producer localhost 5672 1000
1000 ms: Sent 1000 - 1000 since last report (999 Hz)
PRODUCER - Message count: 2000
Total time, milliseconds: 2001
Overall messages-per-second: 999.083
root@ce-bras-mx240-e:/usr/sbin/rabbitmq_server-3.6.6 # sbin/rabbitmqctl list_connections send_cnt
Listing connections ...
2007
root@ce-bras-mx240-e:/usr/sbin/rabbitmq_server-3.6.6 # sbin/rabbitmqctl list_channels messages_unacknowledged
Listing channels ...
0
# ./amqp_consumer localhost 5672
3275 ms: Received 1 - 1 since last report (0 Hz)
3275 ms: Received 2 - 1 since last report (1919 Hz)
3277 ms: Received 3 - 1 since last report (656 Hz)
4001 ms: Received 727 - 724 since last report (999 Hz)
5000 ms: Received 1727 - 1000 since last report (1001 Hz)
Only 1727 out of 2000 are received at the consumer. The consumer has the no-ack flag set to 0.
It turned out to be a display issue only: there was a bug in the summary printing in amqp_consumer.cc in the provided library, which incremented the timestamp for collecting the next summary incorrectly.
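For anyone wanting an independent tally of how many messages actually arrive, regardless of how the C example prints its summary, a small counting consumer works. Below is a rough sketch using the Python pika client (a different client from rabbitmq-c); the queue name test_queue is an assumption and must match whatever the producer publishes to.

import pika

QUEUE = "test_queue"  # assumption: replace with the queue the producer actually uses
received = 0

def on_message(channel, method, properties, body):
    global received
    received += 1
    channel.basic_ack(delivery_tag=method.delivery_tag)  # explicit ack, no-ack disabled
    if received % 100 == 0:
        print(f"received {received} messages so far")

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue=QUEUE, passive=True)  # just check that the queue exists
channel.basic_consume(queue=QUEUE, on_message_callback=on_message, auto_ack=False)

try:
    channel.start_consuming()  # Ctrl+C to stop and print the final count
except KeyboardInterrupt:
    channel.stop_consuming()
    print(f"total received: {received}")
    connection.close()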

What could cause Redis RDB Snapshotting to Stall?

I have a Redis install on Ubuntu 14.04, and I seem to have nearly weekly issues with RDB snapshots failing to complete. The Redis version is 3.0.4, 64 bit.
3838:M 24 Feb 09:46:28.826 * Background saving terminated with success
3838:M 24 Feb 09:47:29.088 * 100000 changes in 60 seconds. Saving...
3838:M 24 Feb 09:47:29.230 * Background saving started by pid 17281
17281:signal-handler (1456338079) Received SIGTERM scheduling shutdown...
3838:M 24 Feb 13:24:19.358 # Background saving terminated by signal 9
3838:M 24 Feb 13:24:19.622 * 10 changes in 900 seconds. Saving...
3838:M 24 Feb 13:24:19.730 * Background saving started by pid 17477
What you see there is that at 9:47 am the background save started, but when I found it at 1:24 pm it appeared to be completely stalled. The forked process had basically no activity - the amount of memory it was consuming wasn't increasing. I tried to "kill" the child process, but it never actually quit, so I had to kill it with extreme prejudice (-9).
When things are getting bad, I get the following errors in my app:
2016-02-24 13:11:12,046 [2344] ERROR kCollectors.Main - Error while adding to Redis: No connection is available to service this operation: SADD ALLCH
My redis config is to do rdb snapshots only (no AOF). The load is modification heavy, with thousands of writes per second.
Currently I'm at the point where no Redis background save is succeeding, and the background process becomes so much larger than the regular process that my VM starts swapping. Here's my top output; 3838 is my Redis instance, and 17477 is the background save process (as noted above):
top - 14:06:42 up 118 days, 2:05, 1 user, load average: 1.07, 1.07, 1.13
Tasks: 81 total, 3 running, 78 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 1.5 sy, 0.0 ni, 45.8 id, 51.3 wa, 0.0 hi, 0.5 si, 0.0 st
KiB Mem: 8176996 total, 8036792 used, 140204 free, 120 buffers
KiB Swap: 6289404 total, 3968236 used, 2321168 free. 4044 cached Mem

  PID USER PR NI    VIRT    RES SHR S %CPU %MEM     TIME+ COMMAND
   36 root 20  0       0      0   0 S  2.3  0.0 288:05.05 kswapd0
 3838 rrr  20  0 7791836 3.734g 612 S  2.0 47.9 330:08.65 redis-server
17477 rrr  20  0 7792228 6.606g 364 D  1.0 84.7   0:43.49 redis-server
This is very interesting, since I don't remember ever reading about such issues, so discovering the root cause could be very useful.
So here you are reporting a child process that stays active for a long time, and even continues to allocate memory. I have no explanation for this other than data corruption in the process memory, causing the RDB process to hit unexpected conditions and loop forever in some way.
A few questions:
Does this happen even if you restart the process? (However, please DON'T DO IT if you can avoid restarting and have not restarted yet, otherwise we may no longer be able to understand the root cause.)
While the RDB saving is active, do you see high CPU usage, and is the process running according to ps/top?
Could you try to interrupt the process with gdb -p <pid> and obtain a stack trace of the process?
Could you provide Redis INFO output to check version and other configuration things and state?
Could you check free output while this happens?
TLDR: is it possible the system is out of memory and swapping a lot? The child process, while saving the RDB file, visits all the pages and forces everything into the resident set. The system can't cope with so much I/O, so it takes ages to complete the RDB saving.
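If it helps to gather the numbers asked about above from a script rather than from redis-cli, here is a rough sketch using the redis-py client; the host/port and the particular INFO fields printed are my assumptions about what is most relevant here:

import redis

r = redis.Redis(host="localhost", port=6379)

persistence = r.info("persistence")
memory = r.info("memory")
stats = r.info("stats")

# Is a background save running right now, and how did the last one go?
print("bgsave in progress:", persistence.get("rdb_bgsave_in_progress"))
print("last bgsave status:", persistence.get("rdb_last_bgsave_status"))
print("last bgsave duration (s):", persistence.get("rdb_last_bgsave_time_sec"))

# Memory pressure indicators relevant to fork/COW during BGSAVE
print("used memory:", memory.get("used_memory_human"))
print("peak memory:", memory.get("used_memory_peak_human"))
print("latest fork time (us):", stats.get("latest_fork_usec"))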
EDIT: I just noticed you reported memory info:
KiB Mem: 8176996 total, 8036792 used, 140204 free, 120 buffers
So the system is out of memory and is swapping like crazy, and this results in the behavior above. As RDB saving starts, copy-on-write will use a lot of additional memory, pushing the server over its memory limits.
Thanks.

redis config question?

I am using Redis for caching, but recently I ran into a problem with the amount of memory used: I had to restart my server since all RAM had been consumed.
It's not the biggest machine, but how should I configure Redis to avoid the same problem again?
free -m
             total       used       free     shared    buffers     cached
Mem:           240        222         17          0          6         38
-/+ buffers/cache:         177         62
Swap:          255         46        209
I have changed the following settings:
timeout 60
databases 1
save 300 1
save 60 100
maxmemory 104857600
top
top - 14:15:28 up 1:19, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 49 total, 1 running, 48 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 245956k total, 228420k used, 17536k free, 6916k buffers
Swap: 262136k total, 47628k used, 214508k free, 39540k cached
You can use the "maxmemory" directive in the config file: when this amount of memory is exceeded, Redis will expire keys that already have an expire set (the keys that would expire sooner are the first to be removed).
Unlike memcached, Redis is supposed to be a database, so it won't automatically remove old values to make room for new ones.
You have to explicitly set the expire time for each key/value, and even then you could overflow if you create key/values faster than they expire.
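To make the two points above concrete (a memory cap plus explicit expiry on every cached key), here is a rough sketch with the redis-py client; the 100 MB cap, the volatile-ttl policy choice (available in later Redis versions), and the key names/TTLs are illustrative assumptions, not settings from the question.

import redis

r = redis.Redis(host="localhost", port=6379)

# Cap Redis at roughly 100 MB (equivalent to the maxmemory line in the config above);
# this can also live in redis.conf as shown in the question.
r.config_set("maxmemory", 100 * 1024 * 1024)
# Evict keys that have a TTL, soonest-to-expire first, once the cap is reached
r.config_set("maxmemory-policy", "volatile-ttl")

# Give every cached value an explicit TTL so it is eligible for expiry/eviction
r.set("cache:user:42", "serialized payload", ex=3600)  # expires after 1 hour
r.setex("cache:page:/home", 300, "rendered html")      # expires after 5 minutes

print(r.ttl("cache:user:42"))  # seconds remaining before expiry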
Use Redis virtual memory in Redis 2.0:
http://antirez.com/post/redis-virtual-memory-story.html