I have two servers. I want to send some data (until now I have been doing it with HTTP GET) to a PHP file residing on the server and get some output back from it.
Lately the request rate went up to 50 requests per second, and Apache returned HTTP 500 errors for some of those requests. This server has 512 MB of RAM, and the script, in php-cli mode, usually uses around 10 MB of memory.
If it would reduce the load on the server, I would like to use SSH instead of HTTPS. Will it reduce memory usage on this server (apart from what the script itself needs)? Or will too many SSH connections still be a problem?
Note: I do not have HTTPS set up right now, but I was planning to switch over to it when this issue cropped up.
SSH will not speed up your program. What you can do is write your own server on the destination machine (the one that receives the data). A web server does much more than just receive data: it interprets HTTP headers and routes requests to files, for example. Your own server can do the same job in a much lighter way.
http://br.php.net/manual/en/sockets.examples.php has an example of how to do this.
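As a rough illustration of the "own server" idea, here is a minimal sketch in Python rather than PHP. It assumes a simple protocol of one newline-terminated request and one reply line; `process`, `start_server` and `send_request` are hypothetical names, and `process` stands in for whatever the real script does.

```python
import socket
import socketserver
import threading


def process(payload: str) -> str:
    """Hypothetical stand-in for the real work the script performs."""
    return payload.upper()


class LineHandler(socketserver.StreamRequestHandler):
    """Reads one newline-terminated request and writes one reply line."""

    def handle(self) -> None:
        line = self.rfile.readline().decode("utf-8").rstrip("\r\n")
        reply = process(line)
        self.wfile.write((reply + "\n").encode("utf-8"))


def start_server(host: str = "127.0.0.1", port: int = 0) -> socketserver.ThreadingTCPServer:
    """Start the server in a background thread; port 0 lets the OS pick a free port."""
    server = socketserver.ThreadingTCPServer((host, port), LineHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_request(host: str, port: int, payload: str) -> str:
    """Client side: send one line and read one line back."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((payload + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().rstrip("\n")
```

There is no HTTP parsing or file routing here, which is the saving the answer refers to. Authentication and encryption are left out of the sketch and would have to be added for use across two real servers.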
I would use normal HTTP, but encrypt the data sent.
How big is the data you're wanting to transfer? Depending on the size SSH (in particular, SFTP) might very well be better. I say that because... well, when was the last time you tried to upload a 10MB file via a webpage and succeeded? Uploading small files works for HTTP but, at the end of the day, it's not a file transfer protocol.
My recommendation would be to use phpseclib, a pure PHP SFTP implementation. Upload a file via SFTP and then run a PHP script on that file via SSH.
Is it possible to increase CloudFlare's time-out? If yes, how?
My code takes a while to execute and I wasn't planning on Ajaxifying it the coming days.
No, CloudFlare only offers that kind of customisation on Enterprise plans.
CloudFlare will time out if it fails to establish an HTTP handshake after 15 seconds.
CloudFlare will also wait 100 seconds for an HTTP response from your server before you see a 524 timeout error.
Other than this there can be timeouts on your origin web server.
It sounds like you need Inter-Process Communication. HTTP should not be used as a mechanism for performing blocking tasks without sending responses; these kinds of activities should instead be abstracted away to a non-HTTP service on the server. By using RabbitMQ (or any other MQ) you can then pass messages from the HTTP element of your server over to the processing service on your webserver.
I was in communication with Cloudflare about the same issue, and also with the technical support of RabbitMQ.
RabbitMQ suggested using Web Stomp which relies on Web Sockets. However Cloudflare suggested...
Websockets would create a persistent connection through Cloudflare and
there's no timeout as such, but the best way of resolving this would
be just to process the request in the background and respond asynchronously, and serve a 'Loading...' page or similar, rather than having the user to wait for 100 seconds. That would also give a better user experience to the user as well
UPDATE:
For completeness, I will also record here that
I also asked CloudFlare about running the report via a subdomain and "grey-clouding" it and they replied as follows:
I will suggest to verify on why it takes more than 100 seconds for the
reports. Disabling Cloudflare on the sub-domain, allow attackers to
know about your origin IP and attackers will be attacking directly
bypassing Cloudflare.
FURTHER UPDATE
I finally solved this problem by running the report using a thread and using AJAX to "poll" whether the report had been created. See Bypassing CloudFlare's time-out of 100 seconds
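The thread-plus-polling approach can be sketched roughly as follows. This is an illustration in Python, not the code from the linked answer; `build_report`, `start_job` and `job_status` are hypothetical names, and the in-memory job table assumes a single server process.

```python
import threading
import time
import uuid

# In-memory job table; an assumption that only holds for a single process.
_jobs = {}
_jobs_lock = threading.Lock()


def build_report(seconds: float) -> str:
    """Hypothetical slow task standing in for the real report."""
    time.sleep(seconds)
    return "report ready"


def start_job(seconds: float) -> str:
    """Kick off the work in a thread and return a job id immediately."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "result": None}

    def worker() -> None:
        result = build_report(seconds)
        with _jobs_lock:
            _jobs[job_id] = {"status": "done", "result": result}

    threading.Thread(target=worker, daemon=True).start()
    return job_id


def job_status(job_id: str) -> dict:
    """What a polling endpoint would return for each AJAX request."""
    with _jobs_lock:
        return dict(_jobs.get(job_id, {"status": "unknown", "result": None}))
```

The request that starts the report returns the job id straight away, and every later poll is a short request of its own, so no single HTTP response comes near the 100 second limit.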
Cloudflare doesn't trigger 504 errors on timeout
504 is a timeout triggered by your server - nothing to do with Cloudflare.
524 is a timeout triggered by Cloudflare.
See: https://support.cloudflare.com/hc/en-us/articles/115003011431-Troubleshooting-Cloudflare-5XX-errors#502504error
524 error? There is a workaround:
As @mjsa mentioned, Cloudflare only offers timeout settings to Enterprise clients, which is not an option for most people.
However, you can disable Cloudflare proxying for that specific (sub)domain by turning the orange cloud grey in the DNS settings.
Note: it will disable extra functionalities for that specific (sub)domain, including IP masking and SSL certificates.
As Cloudflare state in their documentation:
If you regularly run HTTP requests that take over 100 seconds to
complete (for example large data exports), consider moving those
long-running processes to a subdomain that is not proxied by
Cloudflare. That subdomain would have the orange cloud icon toggled to
grey in the Cloudflare DNS Settings . Note that you cannot use a Page
Rule to circumvent Error 524.
I know this cannot really be treated as a solution, but there are two ways of avoiding the problem.
1) Since this timeout is often caused by something that takes a long time to generate, this kind of work can be done through crontab, or, if you have SSH access, by running the PHP script directly from the command line. In this case the connection does not go through Cloudflare, so the script runs for as long as your configuration allows. Search for how to run PHP scripts from the command line, or how to schedule them in crontab using /usr/bin/php /direct/path/to/file.php
2) You can create a subdomain that is not added to Cloudflare, move your script there, and run it directly through a URL, an Ajax call, or whatever.
There is a good answer on Cloudflare community forums about this:
If you need to have scripts that run for longer than around 100 seconds without returning any data to the browser, you can’t run these through Cloudflare. There are a couple of options: Run the scripts via a grey-clouded subdomain or change the script so that it kicks off a long-running background process and quickly returns a status which the browser can poll until the background process has completed, at which point the full response can be returned. This is the way most people do this type of action as keeping HTTP connections open for a long time is unreliable and can be very taxing also.
This topic on Stack Overflow ranks high in search results, so I decided to write down this answer for those who will find it useful.
https://support.cloudflare.com/hc/en-us/articles/115003011431-Troubleshooting-Cloudflare-5XX-errors#502504error
Cloudflare 524 error results from a web page taking more than 100 seconds to completely respond.
This can be overridden to (up to) 600 seconds ... if you switch to an "Enterprise" Cloudflare account. The cost of Enterprise is roughly $40k per year (annual contract required).
If you are getting your results with curl, you could use the resolve option to directly access your IP, not using the Cloudflare proxy IP:
For example:
curl --max-time 120 -s -k --resolve lifeboat.com:443:127.0.0.1 -L https://lifeboat.com/blog/feed
The simplest way to do this is to increase your proxy read timeout.
If you are using Nginx, for instance, you can add this line to /etc/nginx/sites-available/your_domain:
location / {
    ...
    proxy_read_timeout 600s; # raises the timeout to 10 minutes; adjust to suit your needs
    ...
}
If the issue persists, make sure you use Let's Encrypt to secure your server alongside Nginx, and then disable the orange cloud for that specific subdomain on Cloudflare.
Here are some resources you can check to help do that
installing-nginx-on-ubuntu-server
secure-nginx-with-let's-encrypt
I've been using proxy services, and I want to know some details behind it, regarding its speed and efficiency. Consider following scenario:
There's an MP3 file on server M. A client wants to download that file but doesn't want to expose himself, so he decides to use a proxy website for the download. The GET request for the MP3 is therefore sent to proxy server P first, and the proxy server then fetches the MP3 on the client's behalf. Here's my question about the details:
Does P have to download the entire MP3 file before it can pass it on to the client? If so, the file is downloaded twice (first to the proxy server, then to the client's machine), taking about twice as long?
Proxies normally operate in two modes: HTTP and CONNECT.
The CONNECT mode is for black-box protocols like HTTPS or FTP, where most of the data is a meaningless octet stream to the proxy because it is encrypted or consists of unstructured files.
For HTTP, however, proxies are pretty smart. One of the things they do is cache content, such as the images and page contents you download when browsing a website through the proxy. Moreover, for octet streams under HTTP, proxies show the CONNECT behavior, meaning that they open a relay socket and let you download the content. In the meantime they store it locally, and if it doesn't exceed a certain size the file is also cached.
The files are also forwarded, or relayed, which is sometimes called rewriting. This here is a sample config file that shows squid configured to forward Youtube videos without caching them.
Another reason why downloading the whole file first and then forwarding it is not an option is that it would double the round trip time (RTT). It makes little sense to add another RTT that slows down an HTTP session.
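The relay behavior described above can be sketched as a chunk-by-chunk copy. This is only an illustration, not how squid or any particular proxy is implemented; `relay` is a hypothetical helper, and in-memory byte streams stand in for the two sockets.

```python
import io
from typing import BinaryIO


def relay(upstream: BinaryIO, client: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy from upstream to client one chunk at a time; returns bytes relayed."""
    total = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:
            break  # upstream finished
        client.write(chunk)  # forwarded before the next chunk is read
        total += len(chunk)
    return total


# Stand-ins for the connection to server M and the connection to the client.
source = io.BytesIO(b"mp3-bytes" * 1000)
sink = io.BytesIO()
relayed = relay(source, sink, chunk_size=1024)
```

Because each chunk is forwarded as soon as it arrives, the client starts receiving data after roughly one chunk of delay rather than after the whole file has reached the proxy.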
How do I monitor bandwidth usage of individual virtual sites on Apache? (Ubuntu 14).
On our IIS server, we use the performance monitor, save to csv file and have MRTG parse the data and display it as graphs.
Can I do this with MRTG? I read of an unsupported module for Apache (mod_monitor??) that some had tried to use but really don't want to go with unsupported software.
The short answer is that you probably cannot do it without a little additional work.
The longer answer is that, while MRTG can graph anything in theory, you have to provide it with a way to obtain the data. The throughput of a network interface is already provided via SNMP, but the network traffic per virtual server is a little harder to come by, and you need to convince Apache to hand this data over in a format you can use.
You are clearly already aware of much of this, since under IIS you used the performance monitor to obtain the data from the perfstats. In fact, with IIS, instead of dumping the stats to a file and parsing it, you can use a plugin like mrtg-nsclient to query the perfstats directly via the Nagios nsclient++ agent. However, you are using Apache...
One way to achieve it would be to run each virtual server on a separate TCP port, and then use iptables rules to count the bytes passed. The output of iptables -L -v -x (the -v flag is what makes the packet and byte counters appear, and -x prints them as exact numbers) can then be parsed by MRTG to get the counters.
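A minimal sketch of such a parser, assuming one counting rule per port in the INPUT chain (matched on dpt) and one in the OUTPUT chain (matched on spt). The sample listing, the ports and the function names are made up for illustration; the four output lines follow the format MRTG expects from an external script (in counter, out counter, uptime, target name).

```python
import re

# Hypothetical sample of `iptables -L -v -n -x` output; real rules will differ.
SAMPLE = """\
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination
    1200     834221            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8081
     310      52110            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8082

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination
     990    9120455            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp spt:8081
     280     401200            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp spt:8082
"""

PORT_RE = re.compile(r"\b(dpt|spt):(\d+)\b")


def bytes_by_port(listing: str) -> dict:
    """Return {port: {"in": bytes, "out": bytes}} from a verbose iptables listing."""
    counters = {}
    for line in listing.splitlines():
        fields = line.split()
        match = PORT_RE.search(line)
        # Rule lines start with two integers: packet count, then byte count.
        if len(fields) < 2 or not match:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        direction = "in" if match.group(1) == "dpt" else "out"
        entry = counters.setdefault(int(match.group(2)), {"in": 0, "out": 0})
        entry[direction] += int(fields[1])
    return counters


def mrtg_lines(counters: dict, port: int, name: str) -> str:
    """Format one port's counters as the four lines an MRTG external script prints."""
    entry = counters.get(port, {"in": 0, "out": 0})
    return "\n".join([str(entry["in"]), str(entry["out"]), "unknown", name])
```

In practice the listing would come from running iptables (for example through `subprocess`) rather than from a constant, and the script would be called once per virtual server from the MRTG configuration.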
If you want to use name virtual hosts, though, only Apache's internals have the relevant data.
I have an MRTG data collection plugin that obtains total traffic counts via the mod_status URL. This allows graphing of the number of active Apache threads, and total traffic. However it is not split by virtual server, so you cannot get the individual statistics. Even with ExtendedStatus on, you only see the activity of the current threads, not counts split by vhost. ExtendedStatus will allow you to see how many threads are active per vhost, but not the total bytes transferred by each vhost.
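For illustration, here is a sketch of reading the server-wide totals from mod_status. It is not the plugin mentioned above. It assumes the machine-readable status page is enabled with ExtendedStatus on; the URL and the sample text are hypothetical, and as noted the figures are totals for the whole server, not per vhost.

```python
import urllib.request

# Hypothetical excerpt of a /server-status?auto response.
SAMPLE_STATUS = """\
Total Accesses: 1523
Total kBytes: 48211
Uptime: 86400
BusyWorkers: 3
IdleWorkers: 7
"""


def parse_status(text: str) -> dict:
    """Parse the "Key: value" lines of a /server-status?auto response."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def traffic_summary(stats: dict) -> dict:
    """Pick out total traffic (in bytes) and the number of busy workers."""
    return {
        "total_bytes": int(stats["Total kBytes"]) * 1024,
        "busy_workers": int(stats["BusyWorkers"]),
    }


def fetch_status(url: str = "http://localhost/server-status?auto") -> dict:
    """Fetch and parse the live page; the default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_status(resp.read().decode("utf-8", "replace"))
```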
The output you want appears to exist in mod_watch which will output one line of statistics per vhost on the URL /watch-list. However, this is an older module and may require modification in order for it to compile against Apache 2.4. It is also very hard to come by as the author has apparently tried to bury it. It used to be on github but vanished in 2012.
Try here: https://github.com/pld-linux/apache-mod_watch for the source.
Try here: http://fossies.org/windows/www/httpd-modules-2.4-win64-VC11.zip/index_o.html for the Windows binary for Apache 2.4.
I want to communicate between Apache and an external process.
I can modify the source of the process (written in C++) as much as I want, but Apache should (hopefully) remain the same. I was thinking about just using an Intranet socket between PHP and the program, but that just seems inefficient and hard to do if there are multiple page loads at once, and using a file is even worse.
Essentially, Apache (and PHP) would query the external program, and should read or modify a hashtable. How should I go about doing this?
Make your 'external process' expose an HTTP server, then reverse-proxy from apache to that HTTP server. Done.
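A rough sketch of what that could look like, written in Python for brevity although the process in the question is C++. The hashtable is exposed as GET /key and PUT /key; all names here are hypothetical. On the Apache side one would typically point a mod_proxy ProxyPass rule at the port, and PHP would then read and write keys with ordinary HTTP requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

TABLE = {}
TABLE_LOCK = threading.Lock()  # handlers run in separate threads


class TableHandler(BaseHTTPRequestHandler):
    """GET /key reads a value, PUT /key stores the request body."""

    def _reply(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        with TABLE_LOCK:
            value = TABLE.get(self.path.lstrip("/"))
        if value is None:
            self._reply(404)
        else:
            self._reply(200, value)

    def do_PUT(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        value = self.rfile.read(length)
        with TABLE_LOCK:
            TABLE[self.path.lstrip("/")] = value
        self._reply(204)

    def log_message(self, format, *args) -> None:
        pass  # keep the sketch quiet


def start_table_server(port: int = 0) -> ThreadingHTTPServer:
    """Serve the table on localhost in a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), TableHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def table_put(port: int, key: str, value: bytes) -> int:
    """Client side of a write; returns the HTTP status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("PUT", "/" + key, body=value)
        resp = conn.getresponse()
        resp.read()
        return resp.status
    finally:
        conn.close()


def table_get(port: int, key: str):
    """Client side of a read; returns the stored bytes or None if missing."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/" + key)
        resp = conn.getresponse()
        body = resp.read()
        return body if resp.status == 200 else None
    finally:
        conn.close()
```

The lock matters because several page loads can hit the table at once, which was one of the concerns in the question.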
It seems that nginx buffers the whole request before passing it to the upstream server. That is fine for most cases, but for mine it is very bad :)
My case is like this:
I have nginx as a frontend server proxying 3 different servers:
1) Apache with a typical PHP app
2) shaveet (an open source comet server), built by me with Python and gevent
3) a file upload server, also built with gevent, that proxies uploads to Rackspace Cloud Files while still accepting the upload from the client
#3 is the problem. Right now nginx buffers the entire request and only then sends it to the file upload server, which in turn sends it to Cloud Files, instead of passing along each chunk as it arrives (which would make the upload faster, as I can push 6-7 MB/s to Cloud Files).
The reason I use nginx is to serve 3 different domains from one IP. If I can't do that, I will have to move the file upload server to another machine.
As soon as this feature [1] is implemented, Nginx will be able to act as a reverse proxy without buffering uploads (big client requests).
It should land in 1.7, which is the current mainline.
[1] http://trac.nginx.org/nginx/ticket/251
Update
This feature is available since 1.7.11 via the flag
proxy_request_buffering on | off;
http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_request_buffering
The Gunicorn documentation suggests using nginx to buffer clients and prevent slowloris attacks, so this buffering is likely a good thing. However, I do see an option further down in that documentation that talks about turning off the proxy buffer. It's not clear whether this is an nginx setting, but it looks as though it is. Of course, this assumes you are running Gunicorn, which you are not. Perhaps it's still useful to you.
EDIT: I did some research and that buffer disable in nginx is for outbound, long-polling data. Nginx states on their wiki site that inbound requests have to be buffered before being sent upstream.
"Note that when using the HTTP Proxy Module (or even when using FastCGI), the entire client request will be buffered in nginx before being passed on to the backend proxied servers. As a result, upload progress meters will not function correctly if they work by measuring the data received by the backend servers."
Now available in nginx since version nginx-1.7.11.
See documentation
http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_request_buffering
To disable buffering of uploads, specify:
proxy_request_buffering off;
I'd look into haproxy to fulfill this need.