What makes nginx/apache a web server, HAProxy not? - apache

What makes nginx/apache a web server, but not HAProxy?
What functionality does HAProxy lack that would make it a web server?

HAProxy can listen on port 80 and can speak HTTP, but that's not what people mean when they say "web server."
HAProxy is not a web server, because "web server" implies an HTTP endpoint that can serve static content from files and/or dynamic content generated from code. That's not what HAProxy is for.
Technically, there are certain capabilities in HAProxy that can be misused to emulate some capabilities of a web server -- you can serve very small static files from memory buffers and you can generate small dynamic responses using the optional embedded Lua interpreter -- but it is not intended or designed to be used as a web server. It's a proxy server -- emulating a web server toward the client, and emulating a client toward the real back-end web server(s) behind it -- because bidirectional emulation is commonly what proxies do.
With Nginx and Apache, you can specify a root directory from which files are served, and you can specify paths that are to be serviced by code running in languages like Perl, PHP, Python, etc. Not with HAProxy, because, again, that isn't what it's designed to do.
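To make the "root directory" idea concrete, here is a minimal sketch in Python rather than Nginx or Apache configuration. It uses only the standard library; the function names `make_static_server` and `serve_and_fetch` are made up for this illustration, and the temporary directory stands in for a real document root.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


def make_static_server(root, port=0):
    """Build a server that answers GET requests with files found under `root`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root)
    )
    # Port 0 lets the OS pick a free port; a public server would bind 80 or 443.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve_and_fetch(filename, content):
    """Write one file into a temporary root, serve it, and fetch it over HTTP."""
    with tempfile.TemporaryDirectory() as root:
        (pathlib.Path(root) / filename).write_text(content)
        server = make_static_server(root)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        conn = http.client.HTTPConnection(*server.server_address, timeout=5)
        try:
            conn.request("GET", "/" + filename)
            response = conn.getresponse()
            return response.status, response.read().decode()
        finally:
            conn.close()
            server.shutdown()
            server.server_close()
```

The point of the sketch is that the response body comes straight from a file on disk, which is the job a web server does and a proxy normally does not.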
Both Nginx and Apache can also be used as proxy servers, as HAProxy can, but HAProxy is specifically designed and optimized for that primary purpose -- proxying and load balancing across multiple back-ends, selecting the back-end using various rules and algorithms. In essence, HAProxy is an "intermediate router" for HTTP requests, delivering them rather than responding to them. It can also proxy and load balance non-HTTP protocols that rely on TCP.
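As a rough illustration of "selecting the back-end using various rules and algorithms", the sketch below combines one rule (match on path prefix) with one algorithm (round robin). It is plain Python, not HAProxy configuration, and the pool names, prefixes and addresses are all invented for the example.

```python
import itertools


class RoundRobinPool:
    """Hand out back-end addresses in rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


def route(path, pools, rules, default="web"):
    """Choose a pool by path prefix (the rule), then a server in it (the algorithm)."""
    for prefix, pool_name in rules:
        if path.startswith(prefix):
            return pools[pool_name].pick()
    return pools[default].pick()


# Hypothetical setup: two general web servers and one API server.
pools = {
    "web": RoundRobinPool(["10.0.0.1:8080", "10.0.0.2:8080"]),
    "api": RoundRobinPool(["10.0.1.1:9000"]),
}
rules = [("/api/", "api")]
```

Note that `route` never produces a response itself; it only decides which server should receive the request, which is the "router" role described above.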

Related

What is HTTPD exactly?

I mean is "httpd" only used by Apache for the download of the software or is it used by other websites as well? Also is it necessary to have httpd to run "cgi" or not?
And why does Apache use httpd to download the http server instead of having it in a file on their http website?
Apache HTTPD is an HTTP server daemon produced by the Apache Software Foundation. It is a piece of software that listens for network requests (which are expressed using the Hypertext Transfer Protocol) and responds to them.
It is open source and many entities use it to host their websites.
Other HTTP servers are available, including Apache Tomcat, which is designed for running server-side programs written in Java (which don't use CGI).
CGI is a protocol that allows an HTTP server to use an external piece of software to determine how to respond to a request instead of simply returning the contents of a static file. Many HTTP servers support the CGI protocol.
You can use CGI without an HTTP server, but this typically has few uses beyond allowing a developer to perform command line testing of the CGI program. (You certainly can't interact with it directly from a web browser).
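A minimal sketch of what such a CGI program can look like, written in Python. Under CGI the server passes request details in environment variables such as `QUERY_STRING`, and the program writes header lines, a blank line, and then the body to standard output. The function name `build_response` and the `name` parameter are assumptions made for this example.

```python
import os
import sys
from urllib.parse import parse_qs


def build_response(environ):
    """Build a CGI response from the variables the server puts in the environment."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"Hello, {name}!\n"
    # A CGI response is header lines, one blank line, then the body.
    return "Content-Type: text/plain\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ))
```

Saved under a hypothetical name like hello_cgi.py, it can be tested from a shell without any HTTP server by setting the variable by hand, for example `QUERY_STRING="name=Ada" python3 hello_cgi.py`.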
An HTTP daemon is a software program that runs in the background on a web server and waits for incoming requests. The daemon answers each request automatically and serves hypertext and multimedia documents over the Internet using HTTP.
Apache httpd is basically a web server used for handling requests and delivering static content, while CGI is a protocol that lets the server run a script for a request and deliver the script's output instead of simply returning static content. So it is not necessary to use CGI with Apache httpd, but for delivering dynamic content httpd and CGI can be used together.
Note that using httpd with CGI is a heavyweight way of delivering dynamic content, because it creates and destroys a process on every request-response cycle; more efficient alternatives exist.
HTTPd - HyperText Transfer Protocol Daemon
HTTPd is a software program that usually runs in the background as a process.
It plays the role of server in a client-server model using HTTP and/or HTTPS network protocols.
HTTPd waits for the incoming client requests and for each request it answers by replying with requested information.
The following are some commonly used HTTP daemons:
Apache
BusyBox
CERN HTTPd
Lighttpd
Nginx

What is a proxy server and how does it help in server architecture?

I am very confused by the terms "proxy server" and "proxy". Everywhere I look, people are using proxy programs and proxy servers. Some of them use proxy websites to unblock websites. There are also things like reverse proxies.
When I read an article about nginx, I ran into a picture labeled "proxy cache". So what is a proxy cache?
And how can I write a proxy program? What does that mean? Why do we need to use a proxy program?
Can anybody answer my questions as simply as possible? I am not very familiar with this area.
A proxy server is used to facilitate security, administrative control or caching service, among other possibilities. In a personal computing context, proxy servers are used to enable user privacy and anonymous surfing. Proxy servers are used for both legal and illegal purposes.
On corporate networks, a proxy server is associated with -- or is part of -- a gateway server that separates the network from external networks (typically the Internet) and a firewall that protects the network from outside intrusion. A proxy server may exist in the same machine with a firewall server or it may be on a separate server and forward requests through the firewall.
When a proxy server receives a request for an Internet service (such as a Web page request), it looks in its local cache of previously downloaded Web pages. If it finds the page, it returns it to the user without needing to forward the request to the Internet. If the page is not in the cache, the proxy server, acting as a client on behalf of the user, uses one of its own IP addresses to request the page from the server out on the Internet. When the page is returned, the proxy server relates it to the original request and forwards it on to the user.
To the user, the proxy server is invisible; all Internet requests and returned responses appear to be directly with the addressed Internet server. (The proxy is not quite invisible; its IP address has to be specified as a configuration option to the browser or other protocol program.)
An advantage of a proxy server is that its cache can serve all users. If one or more Internet sites are frequently requested, these are likely to be in the proxy's cache, which will improve user response time. A proxy can also log its interactions, which can be helpful for troubleshooting.
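The cache lookup and logging described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name `CachingProxy` and the `fetch_from_origin` callable are hypothetical, and a real proxy cache would also honor expiry rules instead of keeping pages forever.

```python
class CachingProxy:
    """Answer from the local cache when possible, otherwise ask the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> page body
        self._cache = {}
        self.log = []  # record of interactions, useful for troubleshooting

    def get(self, url):
        if url in self._cache:
            self.log.append(("HIT", url))
            return self._cache[url]
        # Cache miss: the proxy acts as a client on behalf of the user.
        self.log.append(("MISS", url))
        body = self._fetch(url)
        self._cache[url] = body
        return body


# Stand-in for the real Internet: counts how often the origin is contacted.
origin_calls = []


def fake_origin(url):
    origin_calls.append(url)
    return f"<html>page for {url}</html>"


proxy = CachingProxy(fake_origin)
first = proxy.get("http://example.com/")
second = proxy.get("http://example.com/")
```

After the two calls, the origin has been contacted only once; the second request is served from the proxy's cache, which is where the improved response time for all users comes from.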

Apache forward Websocket to Golang

Most of my website is delivered with PHP and Apache,
which works just fine.
However I want to use Websocket for a page (or multiple pages).
For the Websocket communication I want to use golang.
To not let the clients run into any firewall problems Websocket should use the normal webport.
(443 that is in this case - for the SSL version of Websocket).
Because Apache is already listening on that port, I need it to forward Websocket requests (or requests to a specific URL) to my golang program.
(A single golang program must listen to all incoming websocket connections, to allow for easy communication between them.)
Is there a way to achieve that?
One of the web servers must proxy for the other. So you need to either configure Apache to proxy requests to your Golang program, or incorporate a reverse proxy into your Golang program to deal with the Apache content.
It's probably easier to configure Apache as a proxy than include the reverse proxy into your Golang code, but there is a standard lib for it: http://golang.org/pkg/net/http/httputil/
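Whichever server ends up in front, it has to decide per request where to send it. The Python sketch below only illustrates that decision; it is neither Apache configuration nor Go code. It relies on the fact that a WebSocket handshake is an HTTP request carrying `Upgrade: websocket` and a `Connection` header that includes `Upgrade`. The `/ws/` prefix and the upstream labels are assumptions made up for the example.

```python
def is_websocket_handshake(headers):
    """Detect the HTTP Upgrade request that opens a WebSocket connection."""
    lowered = {name.lower(): value for name, value in headers.items()}
    connection_tokens = [
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    ]
    return (
        "upgrade" in connection_tokens
        and lowered.get("upgrade", "").lower() == "websocket"
    )


def pick_upstream(path, headers, ws_prefix="/ws/"):
    """Send WebSocket handshakes under `ws_prefix` to Go, everything else to PHP."""
    if path.startswith(ws_prefix) and is_websocket_handshake(headers):
        return "golang"
    return "apache-php"
```

Because all WebSocket traffic is routed to one upstream, a single Go process can see every connection, which is the requirement stated in the question.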

apache to tomcat: mod_jk vs mod_proxy

What are the advantages and disadvantages of using mod_jk and mod_proxy for fronting a tomcat instance with apache?
I've been using mod_jk in production for years but I've heard that it's "the old way" of fronting tomcat. Should I consider changing? Would there be any benefits?
A pros/cons comparison for those modules exists on http://blog.jboss.org/
mod_proxy
* Pros:
  o No need for a separate module compilation and maintenance. mod_proxy, mod_proxy_http, mod_proxy_ajp and mod_proxy_balancer come as part of the standard Apache 2.2+ distribution
  o Ability to use HTTP, HTTPS or AJP protocols, even within the same balancer.
* Cons:
  o mod_proxy_ajp does not support large 8K+ packet sizes.
  o Basic load balancer
  o Does not support Domain model clustering
mod_jk
* Pros:
  o Advanced load balancer
  o Advanced node failure detection
  o Support for large AJP packet sizes
* Cons:
  o Need to build and maintain a separate module
If you wish to stay in Apache land, you can also try the newer mod_proxy_ajp, which uses the AJP protocol to communicate with Tomcat instead of plain old HTTP, but which leverages mod_proxy to do the work.
AJP vs HTTP
When using mod_jk, you are using AJP. When using mod_proxy, you will typically use HTTP or HTTPS (unless you pair it with mod_proxy_ajp). And this is essentially what makes all the difference.
The Apache JServ Protocol (AJP)
The Apache JServ Protocol (AJP) is a binary protocol that can proxy inbound requests from a web server through to an application server that sits behind the web server. AJP is a highly trusted protocol and should never be exposed to untrusted clients, which could use it to gain access to sensitive information or execute code on the application server.
Pros
Easy to set up as the correct forwarding of HTTP headers is not required.
It is less resource intensive because the TCP packets are forwarded in binary format instead of doing a costly HTTP exchange.
Cons
Transferred data is not encrypted. It should only be used within trusted networks.
Hypertext Transfer Protocol (HTTP)
HTTP functions as a request–response protocol in the client–server computing model. A web browser, for example, may be the client and an application running on a computer hosting a website may be the server. The client submits an HTTP request message to the server. The server, which provides resources such as HTML files and other content, or performs other functions on behalf of the client, returns a response message to the client. The response contains completion status information about the request and may also contain requested content in its message body.
Pros
Can be encrypted with SSL/TLS making it suitable for traffic across untrusted networks.
It is flexible, as it allows the request to be modified before forwarding, for example by setting custom headers.
Cons
More overhead as the correct forwarding of the HTTP headers has to be ensured.
More resource intensive as the request is fully parsed before forwarding.
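To show what "correct forwarding of the HTTP headers" can involve, here is a hedged Python sketch of the header rewriting a reverse proxy might do before passing a request on. It drops hop-by-hop headers and records the original client in the widely used `X-Forwarded-For` and `X-Forwarded-Proto` headers. The function name `headers_for_backend` is invented for this example, and this is not what mod_proxy does internally.

```python
# Headers that apply to a single connection and must not be passed along.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def headers_for_backend(incoming, client_ip, scheme):
    """Return the headers a proxy would send to the back-end server."""
    outgoing = {}
    prior_chain = None
    for name, value in incoming.items():
        lowered = name.lower()
        if lowered in HOP_BY_HOP or lowered == "x-forwarded-proto":
            continue
        if lowered == "x-forwarded-for":
            prior_chain = value  # keep addresses added by earlier proxies
            continue
        outgoing[name] = value
    # Append this client's address so the back-end can see who really asked.
    outgoing["X-Forwarded-For"] = (
        f"{prior_chain}, {client_ip}" if prior_chain else client_ip
    )
    outgoing["X-Forwarded-Proto"] = scheme
    return outgoing
```

With AJP this information travels in the binary protocol itself, which is why the AJP setup is described above as easier.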

How to put up an off-the-shelf https to http gateway?

I have an HTTP server which is in our internal network and accessible only from inside it. I would like to put another server that would listen to an HTTPS port accessible from outside, and forward the requests to that HTTP server (and send back the responses via HTTPS). I know that there are several ways to do this with some programming involved (and I myself made a temporary solution with Tomcat and a very simple servlet I wrote), but is there a way to do the same just plugging parts already made (like Apache + modules)?
This is the sort of use-case that stunnel is designed for. There is a specific example of using stunnel to wrap an HTTP server.
You should consider whether this is really a good idea, though. Web applications designed for use inside a corporate firewall are often fairly lax about security. Merely encrypting the connections prevents casual eavesdropping, but does not secure the site. If an attacker finds your outward facing server and starts connecting to it, they can still try to find exploitable flaws in the web service (SQL injection, cross-site scripting, etc).
With Apache, look into mod_proxy.
Apache 2.2 mod_proxy docs
Apache 2.0 mod_proxy docs