I have Apache set up with Plone 4.2 and SSL, with the following rules in the Apache ssl.conf file:
RewriteEngine On
ProxyVia On
Redirect permanent / https://mywebsite.com/PloneSite/subfolder
RewriteRule ^/(.*) http://localhost:8080/VirtualHostBase/https/%{SERVER_NAME}:443/VirtualHostRoot/PloneSite/subfolder/$1 [L,P]
However, about once or twice a day (seemingly at random), the site gets really slow and eventually starts serving up 502 errors (Proxy Error). The only thing that appears to fix it is to restart Plone with "plonectl restart". I'm really at a loss as to what is causing this. Do any of the rules above appear to be incorrect?
This is not a proxy setup problem; Apache proxy rules for Plone either work or they don't. The proxy errors are caused by Plone no longer responding, which is why restarting Plone fixes the problem temporarily.
You'll need to figure out why Plone stops responding. There can be any number of reasons, and you'll have to pinpoint what is going on.
You could have a programming error in part of your site, one that ties up a thread forever. Once you run out of threads, Plone can no longer serve normal requests and you get your proxy errors. You could use Products.signalstack to peek at what your threads are up to when your site is no longer responding.
It could be that something trashed your ZODB cache; if a web crawler tried to load all of your site in short succession, for example, it may have caused so much cache churn that it takes a while to rebuild your catalog cache. Take a close look at your log files (both from Apache and from the Plone instance) and look for patterns.
In such cases you'd either have to block the crawler, or install better caching to lighten the load on your Plone server (Varnish does a great job of such caching setups, with some careful tuning).
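If a crawler does turn out to be the culprit, a minimal Apache-side sketch of blocking it would look something like the following (the "BadBot" user-agent string is purely illustrative, and this uses Apache 2.2-style access control to match the era of this setup):
# Block a misbehaving crawler by User-Agent before the request is proxied to Plone
SetEnvIfNoCase User-Agent "BadBot" bad_bot
<Location "/">
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</Location>
This keeps the offending requests from ever reaching the Plone backend; Varnish in front is still the better long-term answer for overall load.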
Some inexperienced catalog usage could have trashed your ZODB cache, with the same results. In one (very bad) case that I've seen, some code would look up all objects of a certain type from the catalog, call getObject() on those results (loading each and every object into memory), then filter the huge set down to a handful of objects that would actually be needed. Instead, the catalog should have been used to narrow the list of objects to load significantly before loading the objects themselves.
It could be that you are not taking advantage of ZODB blobs; storing large files on disk and serving them directly from there, instead of from ZODB objects, spares your memory cache significantly.
All in all this could be some work to sort out, depending on what the root cause is.
I have received some results from a security scan which say that something is executing DNS A record look-ups on the hostname in the Host header.
Having looked at the application code I can't see any such requests so I'm looking further up the stack.
I don't think Apache should be doing this but it's using mod_headers and mod_rewrite and maybe there is a configuration item in there that I have overlooked.
A long time ago, I came across an Apache httpd that was configured to do a reverse lookup on client IP addresses before logging. Although this was long denied, some requests were served quickly while others took a long time (depending on the time required for the reverse lookup), and it became obvious once we looked at the logs (DNS names mixed in with IP addresses).
I don't see any reason why mod_headers and mod_rewrite would ever need to resolve any domains; they work purely on strings and regular expressions.
My recommendation for figuring out what's going on: capture the traffic and work out which domains/addresses are looked up, and when. With DNS still being largely unencrypted, this should be fairly easy and point you to the smoking gun.
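If the lookups do turn out to come from Apache's logging rather than the application, a minimal sketch of keeping DNS out of the request path looks like this (the directives are standard Apache; the log path is illustrative):
# Log client IP addresses only; never resolve them during the request
HostnameLookups Off
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common
If you need hostnames in reports, the logs can still be resolved offline afterwards with Apache's logresolve utility.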
I've been doing further research about permanent 301 redirects.
I'm doing a site redesign and 4,000 pages are changing their URLs. The URLs are changing so much that I can't use regular expressions, so I have a few thousand individual 301 redirect statements.
I've been researching the performance differences between putting them in a .htaccess file or in httpd.conf, and I'm getting conflicting information about the benefits of each.
e.g. this, which sounds promising:
"Note, if you are using Apache then it's strongly recommended to put the redirect rules in your httpd.conf (stored in memory when Apache starts) and not .htaccess files (which are loaded on every page request)."
Source - Major site rewrite and SEO with 301 redirects
but then contradicted by this:
"You can use Include directive in httpd.conf to be able to maintain redirects in another file. But it would not be very efficient, as every request would need to be checked against a lot of regular expressions. "
Source - http://www.faqoverflow.com/serverfault/414225.html
My host said:
"No performance impact with httpd.conf, you're in effect doing the same thing as adding them to the config itself. But you are doing so in a way that will not cause issues with it or have the changes lost."
Is it correct that adding thousands of 301 re-direct statements to httpd.conf won't cause performance issues for my site?
Thousands of redirects are fine. I've heard of people attempting to benchmark this sort of thing, and there's practically no significant impact, no more so than the regular work your server's OS does anyway.
Your second quote is flat-out wrong, at least as far as Apache 2.2-2.4 is concerned. When you use the Include directive, the contents of the included file(s) are loaded as part of the server's configuration. That means they are loaded when you start the server, or when you explicitly tell Apache to reload its configuration. Apache does not re-read all the Included files on every request.
Apache uses this directive pretty liberally: most out-of-the-box configurations use Include to load entire directories of per-module and per-vhost configuration.
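As a minimal sketch (file names and target URLs here are purely illustrative), the redirect list can live in its own file and be pulled into the server configuration once at startup:
# httpd.conf -- read at startup/reload, not on every request
Include conf/redirects.conf
# conf/redirects.conf -- one line per moved page
Redirect permanent /old-about-us https://www.example.com/company/about
Redirect permanent /old-contact https://www.example.com/company/contact
If matching thousands of literal URLs ever does become a measurable cost, mod_rewrite's RewriteMap with a hashed (dbm) map file is the usual way to turn the whole list into a single keyed lookup, but for a few thousand plain Redirect lines that is rarely necessary.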
I am running a Perl CGI tool that executes a system command (Unix) which may run for a few seconds up to an hour.
After the script is finished, the tool should display the results log on the screen (in a browser).
The problem is that after about 5 minutes I get a "Gateway Time-out" message; the system command continues to run, but I'm unable to display the results of the run to the user.
In the Apache config file (httpd.conf): Timeout 300.
Is there a simple way to tell Apache to increase the timeout only for a specific run?
I don't really want to change the Apache timeout permanently (or should I?), and I'd rather not dramatically update the code (there are a lot of regression tests).
Make the script generate some output every once in a while. The timeout is not for running the program to completion, but is a timeout while Apache is waiting for data. So if you manage to get your program to output something regularly while running, you will be fine.
Note that HTTP clients, i.e. browsers, also have their own timeouts. If your browser does not get any new data from the web server for five minutes (typically), it will declare a timeout and give up even if the server is still processing. If your long-running process produces some output every now and then, that will help against browser timeouts too!
For completeness:
Though the accepted answer is the best approach (the technique is variously known as keepalive packets in TCP/IP, or tickle packets way back in the AppleTalk days), you did ask whether you can do dynamic Apache configuration.
An Apache module could do this, but that's a pain to write in C.
Remember that mod_perl (and to some extent mod_python, though it's deprecated) not only provides handlers but also wraps the internal configuration in Perl. You could write something complicated to increase the timeout in certain situations, but it would be a bear to write and test, and you're better off doing what Krisku says.
There doesn't seem to be any way to specify a timeout on the <!--#include virtual=... --> directive, but if you use mod_cgid instead of mod_cgi then starting with Apache 2.4.10 there's a configurable timeout parameter available which you can specify in httpd.conf or .htaccess:
CGIDScriptTimeout nnns
...where nnn is the number of seconds that Apache will allow a cogitating CGI script to continue to run.
Caveat: if you use PHP with Apache, then your Apache is presumably configured in /etc/httpd/conf.modules.d/00-mpm.conf to use the "prefork" MPM (because PHP requires it unless built with thread-safe flags). The default Apache installation uses mod_cgi with the prefork MPM, so you'll probably need to edit /etc/httpd/conf.modules.d/01-cgi.conf to tell Apache to use mod_cgid instead of mod_cgi.
Although the comment in 01-cgi.conf says, "mod_cgid should be used with a threaded MPM; mod_cgi with the prefork MPM," that doesn't seem to be correct, because mod_cgid seems to work fine with prefork MPM and PHP, for me, with Apache 2.4.46.
Although that doesn't give you complete control over server timeouts, you could specify a different CGIDScriptTimeout setting for a particular directory (e.g., put your slow .cgi files in the ./slowstuff/ folder).
(Of course, as krisku mentioned in the accepted answer, changing CGIDScriptTimeout won't solve the problem of the user's web browser timing out.)
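As a sketch of what that per-directory setup might look like (the paths and the one-hour value are illustrative, and this assumes Apache 2.4.10+ as noted above):
# e.g. /etc/httpd/conf.modules.d/01-cgi.conf -- load mod_cgid even with the prefork MPM
LoadModule cgid_module modules/mod_cgid.so
# httpd.conf -- give only the slow scripts a longer deadline (value in seconds)
<Directory "/var/www/cgi-bin/slowstuff">
    CGIDScriptTimeout 3600
</Directory>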
I am using CloudFront with mod_pagespeed running on the server.
When updating a CSS file or flushing the cache I see problematic behavior: the first refresh in the browser returns the original CSS (this is fine). When I refresh a second time, I get the correct manipulated CSS file name, but the content of the file served from CloudFront is still the original and not the correct manipulated content.
Why would this happen?
Any idea how to fix this?
Update:
For some reason it just stopped happening... I don't know why.
SimonW, since your original post a feature has been added to PageSpeed (in March 2013, in version 1.2.24.1) to deal with this issue directly. The behavior is controlled via the following directive:
Apache:
ModPagespeedRewriteDeadlinePerFlushMs deadline_value_in_milliseconds
Nginx:
pagespeed RewriteDeadlinePerFlushMs deadline_value_in_milliseconds;
The docs describe the directive as follows (emphasis mine):
When PageSpeed attempts to rewrite an uncached (or expired) resource it will wait for up to 10ms per flush window (by default) for it to finish and return the optimized resource if it's available. If it has not completed within that time the original (unoptimized) resource is returned and the optimizer is moved to the background for future requests. The following directive can be applied to change the deadline. Increasing this value will increase page latency, but might reduce load time (for instance on a bandwidth-constrained link where it's worth waiting for image compression to complete). Note that a value less than or equal to zero will cause PageSpeed to wait indefinitely.
So, if you specify a value of 0 for deadline_value_in_milliseconds, you should always get the fully optimized page. I would caution that the latency can be high in some cases. In my case, I really wanted this behavior, even with the latency concern, because the content was to be cached on my CDN's edge servers, and I therefore wanted the most optimized version possible to be served to the CDN for caching.
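For completeness, the Apache form of that "wait indefinitely" setting (matching the directive shown earlier) is just:
ModPagespeedRewriteDeadlinePerFlushMs 0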
This could happen if you have multiple backend servers and CloudFront is hitting a different one than the HTML request went through. In that case the resource was rewritten on the HTML server, but not on the other server. There is a short timeout and if the other server doesn't finish the rewrite in that time, it will just serve the original content with Cache-Control: private,max-age=300. It's possible CloudFront caches that for a little while (even though obviously it shouldn't), but then eventually re-requests the resource from your backend and gets the correctly rewritten version this time.
When using XAMPP (1.7.5 Beta) under Windows 7 (Ultimate, version 6.1, build 7600), it takes several seconds before pages actually show up. During these seconds, the browser shows "Waiting for site.localhost.com..." and Apache (httpd.exe, version 2.2.17) has 99% CPU load.
I have already tried to speed things up in several ways:
Uncommented "Win32DisableAcceptEx" in xampp\apache\conf\extra\httpd-mpm.conf
Uncommented "EnableMMAP Off" and "EnableSendfile Off" in xampp\apache\conf\httpd.conf
Disabled all firewall and antivirus software (Windows Defender/Windows Firewall, Norton AntiVirus).
In the hosts file, commented out "::1 localhost" and uncommented "127.0.0.1 localhost".
Executed (via cmd): netsh; interface; portproxy; add v6tov4 listenport=80 connectport=80.
Even disabled IPv6 completely, by following these instructions.
The only place where "HostnameLookups" is set, is in xampp\apache\conf\httpd-default.conf, to: Off.
Tried PHP in CGI mode by commenting out (in httpd-xampp.conf): LoadFile "C:/xampp/php/php5ts.dll" and LoadModule php5_module modules/php5apache2_2.dll.
None of these possible solutions had any noticeable effect on the speed. Does Apache have difficulty trying to find the destination host ('gethostbyname')? What else could I try to speed things up?
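For reference, the Apache-side tweaks already tried above amount to roughly the following (directive names as in Apache 2.2; nothing here beyond what is already listed in the question):
# xampp\apache\conf\extra\httpd-mpm.conf
Win32DisableAcceptEx
# xampp\apache\conf\httpd.conf
EnableMMAP Off
EnableSendfile Off
# xampp\apache\conf\extra\httpd-default.conf
HostnameLookups Off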
Read over Magento's Optimization White Paper; although it mentions Enterprise, the same methodologies will and should be applied. Magento is by no means simplistic and can be very resource intensive. Like some others mentioned, I normally run within a virtual machine on a LAMP stack and have all my optimizations (both at the server/application level and at the Magento level) preset on a base install of Magento. Running an opcode cache like eAccelerator or APC can help improve load times. Keeping Magento's caching layers enabled can help as well, but it can cripple development if you forget it's enabled during development; however, there are lots of tools available that can clear it for you, either from a single command line or via a tool like Alan Storm's eCommerce Bug.
EDIT
Optimization Whitepaper link:
https://info2.magento.com/Optimizing_Magento_for_Peak_Performance.html
Also, with PHP 7 now including OPcache, enabling it with default settings (with date/time checks), along with AOE_ClassPathCache, can help disk I/O performance.
If you are using an IDE with class lookups, keeping a local copy of the code base you are working on can greatly speed up indexing in IDEs like PhpStorm, NetBeans, etc. Atwix has a good article on Docker with Magento:
https://www.atwix.com/magento/docker-development-environment/
Some good tools for local Magento 1.x development:
https://github.com/magespecialist/mage-chrome-toolbar
https://github.com/EcomDev/EcomDev_LayoutCompiler.git
https://github.com/SchumacherFM/Magento-OpCache.git
https://github.com/netz98/n98-magerun
Use a connection profiler like Chrome's to see whether this is actually a lookup issue, or whether you are waiting for the site to return content. Since you tagged this question Magento, which is known for slowness before you optimize it, I'm guessing the latter.
Apache runs some very major sites on the internets, and they don't have several second delays, so the answer to your question about Apache is most likely no. Furthermore, DNS lookup happens between your browser and a DNS server, not the target host. Once the request is sent to the target host, you wait for a rendered response from it.
Take a look at the several questions about optimizing Magento sites on SO and you should get some ideas on how to speed your site up.