Apache - resources randomly hang (resulting in slow page loads)

HTTP requests for resources randomly - roughly 1-5% of the time (per resource, not per page load) - take extremely long to be delivered to the browser (~20 seconds), and not uncommonly they even hang indefinitely. (Server details are listed at the bottom.)
This results in roughly every fifth page load appearing to hang, because a JavaScript resource inside the <head> tag stalls.
The affected resources are CSS, JS and small image files served directly by Apache (no scripting language involved), although page loads involving PHP or Rails also hang occasionally, with the same 1-5% probability as any other resource, so this seems to be an Apache request-handling issue.
Additional information:
I've checked the idle workers on server-status and, as expected, roughly 98% of the workers are still idle. This may still be relevant, since the hangs affect static resources that Apache serves itself rather than through FastCGI.
I am not the only one with this problem; someone else is seeing the same behaviour from a different IP address.
This happens in both Google Chrome and Firefox as HTTP clients.
I have tried repeatedly force-refreshing the same JS file in a new tab; eventually it produced the same kind of hang.
The Timing tab in Google Chrome reports 34 ms waiting and 19.27 s receiving for one of these hanging requests. Would that mean Apache already had the file contents ready to deliver, and only had trouble delivering them in a sensible amount of time?
error.log doesn't show anything related to the hanging. There are some expected 404 and 500 entries, but those are genuine errors for nonexistent pages and PHP fatal errors.
I get some suspicious 206 Partial Content responses, mostly for static content, although the hanging happens more often than those partial responses do. I mostly get 200 OK responses everywhere, and I can confirm that resources which hung indefinitely were logged as 200 OK in Apache's access.log.
I do have mod_passenger installed for Redmine. I don't know whether that is relevant, but suspiciously this is the only server I've worked with that has it installed. Then again, mod_passenger shouldn't affect static content, especially not inside a non-Ruby project folder, should it?
The server is using Apache 2.4 Event MPM on Ubuntu 13.10, hosted on Digital Ocean.
What may be causing these hangs, and how could I fix this?

I had the same problem, so after reading this thread I tried setting KeepAlive Off in my Apache config (sketched below), which seems to have helped: all resources now have the expected waiting times.
Not a great "fix", but at least I am one step closer to figuring out the cause, and pages aren't taking 15 s to fully load in the meantime.
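For reference, a minimal sketch of the change, assuming a stock Ubuntu layout where the directive lives in /etc/apache2/apache2.conf (your file layout and values may differ; the commented alternative is only an illustration):

    # /etc/apache2/apache2.conf
    # Workaround: disable persistent connections entirely.
    KeepAlive Off

    # Gentler alternative to test instead of turning keep-alive off completely:
    # KeepAlive On
    # KeepAliveTimeout 2
    # MaxKeepAliveRequests 100

Reload Apache afterwards (for example with sudo service apache2 reload on Ubuntu) so the new setting takes effect.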

Related

Xdebug boosts site speed

I have a WAMP stack for development and a lot of sites are slow, but I have a really big issue with PrestaShop, where the loading time is 1 minute on average.
Although the content is loaded, the main request responds very slowly, and Chrome's waterfall shows that the delay is caused by Content Downloading, even though all assets are already downloaded (local storage) or cached.
I noticed that when I enable the Xdebug listener (in VSCode), the site responds as it should, i.e. within milliseconds.
Any idea what might be happening?

Scrapy on Ubuntu web server getting 417 error

I have been developing a crawling script for a number of news websites and using Scrapy to handle the logic.
When I run my script on an Ubuntu web server (Digital Ocean, if that helps), a lot of the websites that return 200 on my local machine return 417 instead.
I was wondering how I should fix this, if it is a problem at all. I'm actually not quite sure whether it is affecting the final output, but it seems like it is.
Some of my own research has turned up:
http://www.checkupdown.com/status/E417.html. I've tried adding an Expect header to my requests, which hasn't worked (a sketch of this kind of header experiment follows this list).
I've heard that it might be a problem with HTTP 1.1 vs 1.0. EDIT: Nope. Scrapy's HTTPDownloaderHandler automatically chooses 1.1 if it is available.
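A minimal sketch of such a header experiment, assuming Scrapy 1.x; the spider name, URL and header values below are placeholders, not part of the original setup:

    import scrapy

    class HeaderTestSpider(scrapy.Spider):
        # Fetches one page with explicit headers and logs the status code.
        name = "header_test"
        # Let 417 responses reach the callback instead of being filtered out.
        handle_httpstatus_list = [417]
        start_urls = ["http://example.com/"]

        def start_requests(self):
            for url in self.start_urls:
                yield scrapy.Request(
                    url,
                    # Mimic a regular browser; some servers reject unfamiliar clients.
                    headers={
                        "User-Agent": "Mozilla/5.0",
                        "Accept": "text/html,*/*;q=0.8",
                    },
                    callback=self.parse,
                )

        def parse(self, response):
            self.logger.info("Got %s for %s", response.status, response.url)

Running it with scrapy runspider and comparing the logged status against what a browser gets for the same URL helps narrow down whether the headers (or the client IP) are what the server objects to.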

417 Expectation Failed is the error a web server returns when the client sends an Expect request header (most commonly Expect: 100-continue) with an expectation the server cannot or will not satisfy.
This looks like a Scrapy bug or, more likely, a misconfiguration.

It seems your public IP address was either already banned, or was banned while you were scraping, by the web server of the site you want to scrape. For the first case you can reboot your instance to get a new public IP (at least this works on Amazon). For the second, here are some tips from the official documentation to avoid getting banned:
rotate your user agent from a pool of well-known ones from browsers (google around to get a list of them)
disable cookies (see COOKIES_ENABLED) as some sites may use cookies to spot bot behaviour
use download delays (2 or higher). See DOWNLOAD_DELAY setting.
if possible, use Google cache to fetch pages, instead of hitting the sites directly
use a pool of rotating IPs. For example, the free Tor project or paid services like ProxyMesh
use a highly distributed downloader that circumvents bans internally, so you can just focus on parsing clean pages. One example of such downloaders is Crawlera
Additionally, you can reduce the concurrent-requests settings in your spider; that worked for me once. A sketch of these settings is below.
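The setting names in the sketch are real Scrapy settings, but the values are only illustrative, not recommendations:

    # settings.py

    # Identify as a common browser instead of the default Scrapy user agent.
    USER_AGENT = (
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36"
    )

    # Some sites use cookies to spot bot behaviour.
    COOKIES_ENABLED = False

    # Seconds to wait between requests to the same site.
    DOWNLOAD_DELAY = 2

    # Reduce pressure on each site (Scrapy's defaults are higher).
    CONCURRENT_REQUESTS = 4
    CONCURRENT_REQUESTS_PER_DOMAIN = 2

Rotating the user agent per request (rather than setting one globally, as above) needs a small downloader middleware or a third-party package; the single USER_AGENT is just the simplest starting point.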

download disconnects and web server log shows 404

Imagine an Internet speed of 20KB/s and a 20MB file. It takes 1000 seconds to download the file completely.
Using proxies like browsec, the download fails somewhere in the middle, each time after a random number of seconds. In these situations the web server log (the server is Apache) shows a 404 error. (Problem 1)
I thought 404 means the file doesn't exist. Doesn't it? But the file does exist, and some percentage of it (again a random amount: once 50%, once 20%, ...) gets downloaded.
I don't really know what happens between the proxy and the server, but I am sure the browser doesn't show a failure message; it looks as if the file downloaded completely.
Downloading the file directly without any proxy, there is no problem and it downloads completely. But Chrome doesn't resume the interrupted download, while Firefox does. Why doesn't Chrome resume the download? (Problem 2)
Any help with these two problems is appreciated.
Thanks.

mod_python does not process multiple requests from the same browser simultaneously for the same file

I have a page which can take a long time to process, but if the same page is accessed again in the meantime (from the same system), the second instance is blocked until the first one finishes. Instead of this blocking behaviour, I would actually be happy if the second instance failed rather than hanging. Is there a way to make the same file accessible at the same time?
I have found that the same problem is also present in PHP, but those replies were PHP-specific: Apache same origin request blocking and Why does apache not process multiple requests from the same browser simultaneously discuss the same problem with PHP.

Google Chrome err_failed chrome (err2) - Web App

I'm a web application developer who runs the site http://myfav.es. We've been struggling with this issue for about a month now.
We use the HTML application cache spec - www.w3.org/TR/offline-webapps/ - with dynamically generated manifest files - myfav.es/personal.manifest - to speed up page delivery. These dynamically generated manifest files are served with the proper headers, using PHP to build a custom manifest for each user.
We also use gzip compression to serve the site from a linux/apache host.
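For context, such a manifest typically looks something like the sketch below (the entries are made up for illustration and are not the site's actual files); the key details are that it is served with the text/cache-manifest content type and, ideally, Cache-Control: no-cache so browsers re-check it on every visit:

    CACHE MANIFEST
    # v2010-08-01-1 (change this comment line to force clients to re-download)
    CACHE:
    /css/site.css
    /js/app.js
    /img/logo.png
    NETWORK:
    *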
For the life-cycle of our site, users have reported getting an err_failed similar to this screenshot in Chrome: twitpic.com/272237.
This error is intermittent, occurring roughly once every 200-300 visits, but it persists across every page refresh, including hard refreshes, which presumably means that an error using the app cache is causing them to continuously load a failed version of the site. Mysteriously, though, JUST clearing cookies causes the error to fix itself.
I'm completely out of ideas on how to approach this error, and googling the error message turns up a ton of confused users with voodoo-ish approaches to solving it. I've personally seen the error, along with a number of complaints from other Chrome users, so I'm fairly certain it cannot be caused by a particular user having abnormal settings or browser preferences.
Does anyone have any insight into the cause of this browser error and its origins? Is it likely server-side, or a byproduct of the app design?