I've got an Apache that's proxying requests to an external entity:
ProxyPass /something https://external.example.com/somethingelse
This external site likes to switch the values of that domain based on where they want their traffic. Apache seemingly doesn't pick up the new value until it's restarted. Is there a way to force Apache to do new lookups based on certain amount of time? After some research and even looking at the code, I don't see an obvious answer. If that isn't an option, any other suggestions?
According to Apache documentation:
DNS resolution for origin domains DNS resolution happens when the
socket to the origin domain is created for the first time. When
connection reuse is enabled, each backend domain is resolved only once
per child process, and cached for all further connections until the
child is recycled.
There is ProxyPass key=value parameter to control this:
disablereuse Off This parameter should be used when you want to force
mod_proxy to immediately close a connection to the backend after being
used, and thus, disable its persistent connection and pool for that
backend. This helps in various situations where a firewall between
Apache httpd and the backend server (regardless of protocol) tends to
silently drop connections or when backends themselves may be under
round- robin DNS. When connection reuse is enabled each backend domain
is resolved (with a DNS query) only once per child process and cached
for all further connections until the child is recycled. To disable
connection reuse, set this property value to On.
Related
I don't know exactly how to ask it, so I will try to explain with an example.
I have these resources on example.com, an HTTP/2 enabled server:
//example.com/css/file.css
//example.com/js/file.js
//example.com/images/file.png
What I want is to load one of these files through an alias domain cdn.example2.com that points to the domain example.com. So, the actual resources inside the HTML should look like:
//example.com/css/file.css
//cdn.example2.com/js/file.js -> points to //example.com/js/file.js
//example.com/images/file.png
My question here is: Shall all the resources in the second example be loaded by the browser over a single connection as they will be loaded when there is no alias domain?
Thanks for help.
If the aliases resolve to different IPs, there is no way the resources can be loaded over the same connection (called "connection re-use" by HTTP/2, if I'm not mistaken). That's a problem with CDNs from here on.
But for your peace of mind and utter rejoice of CDNs, connection re-use is a tricky thing and you may not have it even if all your domains resolve to the same IP, as is the case in your question.
To be future proof, you may want to ensure that your sites have the certificate extensions configured correctly to enable connection re-use.
In the current versions of Firefox and Chrome, I haven't observed connection re-use, even after crafting the certificates with all due care, and of course being sure that the two domains point to the same IP.
And just some food for thoughts: HTTP/2 over TLS requires SNI, which happens only when openning a connection. So when you connect for the first time to one domain, say example.com, the server obtains SNI data. But the server won't obtain such data if the same connection is re-used to send a request to cdn.example.com. Some servers or usage scenarios may be sensitive to this asymmetry, and that may have something to do with the way in which browsers implement (or not) connection re-use. But these are only speculations of yours truly...
The specification doesn't require its reuse, but it does explicitly include information on when reuse is acceptable -- such as two hosts that resolve to the same IP address.
https://www.rfc-editor.org/rfc/rfc7540#section-9.1.1
Connections that are made to an origin server, either directly or
through a tunnel created using the CONNECT method (Section 8.3), MAY
be reused for requests with multiple different URI authority
components. A connection can be reused as long as the origin server
is authoritative (Section 10.1). For TCP connections without TLS,
this depends on the host having resolved to the same IP address.
For "https" resources, connection reuse additionally depends on
having a certificate that is valid for the host in the URI. The
certificate presented by the server MUST satisfy any checks that the
client would perform when forming a new TLS connection for the host
in the URI.
I am very confused with proxy server, and proxy and this word proxy. I saw everywhere people are using proxy program, proxy server. Some of them using the proxy websites to unblock the websites. There are lot of things like reverse-proxy like that..
When I read one article about nginx I ran into one pic it says proxy cache. So what's proxy cache?
And how can I write a proxy program? What does that mean ? Why we need to use a proxy program?
Anybody can answer my question as simple as possible, I am not much in to this area.
A proxy server is used to facilitate security, administrative control or caching service, among other possibilities. In a personal computing context, proxy servers are used to enable user privacy and anonymous surfing. Proxy servers are used for both legal and illegal purposes.
On corporate networks, a proxy server is associated with -- or is part of -- a gateway server that separates the network from external networks (typically the Internet) and a firewall that protects the network from outside intrusion. A proxy server may exist in the same machine with a firewall server or it may be on a separate server and forward requests through the firewall. Proxy servers are used for both legal and illegal purposes.
When a proxy server receives a request for an Internet service (such as a Web page request), it looks in its local cache of previously downloaded Web pages. If it finds the page, it returns it to the user without needing to forward the request to the Internet. If the page is not in the cache, the proxy server, acting as a client on behalf of the user, uses one of its own IP addresses to request the page from the server out on the Internet. When the page is returned, the proxy server relates it to the original request and forwards it on to the user.
To the user, the proxy server is invisible; all Internet requests and returned responses appear to be directly with the addressed Internet server. (The proxy is not quite invisible; its IP address has to be specified as a configuration option to the browser or other protocol program.)
An advantage of a proxy server is that its cache can serve all users. If one or more Internet sites are frequently requested, these are likely to be in the proxy's cache, which will improve user response time. A proxy can also log its interactions, which can be helpful for troubleshooting.
Summary/Quesiton:
I have Apache running with Prefork MPM, running php. I'm trying to use Apache mod_proxy to create a reverse proxy that I can re-route my requests through, so that I can use Apache to do connection pooling. Example impl:
in httpd.conf:
SSLProxyEngine On
ProxyPass /test_proxy/ https://destination.server.com/ min=1 keepalive=On ttl=120
but when I run my test, which is the following command in a loop:
curl -G 'http://localhost:80/test_proxy/testpage'
it doesn't seem to re-use the connections.
After some further reading, it sounds like I'm not getting connection pool functionality because I'm using the Prefork MPM rather than the Worker MPM. So each time I make a request to the proxy, it spins up a new process with its own connection pool (of size one), instead of using the single worker that maintains its own pool. Is that interpretation right?
Background info:
There's an external server that I make requests to, over https, for every page hit on a site that I run.
Negotiating the SSL handshake is getting costly, because I use php and it doesn't seem to support connection pooling - if I get 300 page requests to my site, they have to do 300 SSL handshakes to the external server, because the connections get closed after each script finishes running.
So I'm attempting to use a reverse proxy under Apache to function as a connection pool, to persist the connections across php processes so I don't have to do the SSL handshake as often.
Sources that gave me this idea:
http://httpd.apache.org/docs/current/mod/mod_proxy.html
http://geeksnotes.livejournal.com/21264.html
First of all, your test method cannot demonstrate connection pooling since for every call, a curl client is born and then it dies. Like dead people don't talk a lot, a dead process cannot keep a connection alive.
You have clients that bothers your proxy server.
Client ====== (A) =====> ProxyServer
Let's call this connection A. Your proxy server does nothing, it is just a show off. The handsome and hardworking server is so humble that he hides behind.
Client ====== (A) =====> ProxyServer ====== (B) =====> WebServer
Here, if I am not wrong, the secured connection is A, not B, right?
Repeating my first point, on your test, you are creating a separate client for each request. Every client needs a separate connection. Connection is something that happens between at least two parties. One side leaves and connection is lost.
Okay, let's forget curl now and look together at what we really want to do.
We want to have SSL on A and we want A side of traffic to be as fast as possible. For this aim, we have already separated side B so it will not make A even slower, right?
Connection pooling? There is no such thing as connection pooling at A. Every client comes and goes making a lot of noise. Only thing that can help you to reduce this noise is "Keep-Alive" which means, keeping connection alive from a client for some short period of time so this very same client can ask for other files that will be required by this request. When we are done, we are done.
For connections on B, connections will be pooled; but this will not bring you any performance since on one-server setup you did not have this part of the noise production.
How do we help this system run faster?
If these two servers are on the same machine, we should get rid of the show-off server and continue with our hardworking webserver. It adds a lot of unnecessary work to the system.
If these are separate machines, then you are being nice to web server by taking at least encyrption (for ssl) load from this poor guy. However, you can be even nicer.
If you want to continue on Apache, switch to mpm_worker from mpm_prefork. In case of 300+ concurrent requests, this will work much better. I really have no idea about the capacity of your hardware; but if handling 300 requests is difficult, I believe this little change will help your system a lot.
If you want to have an even more lightweight system, consider nginx as an alternative to Apache. It is very easy to setup to work with PHP and it will have a better performance.
Other than front-end side of things, also consider checking your database server. Connection pooling will make real difference here. Be sure if your PHP installation is configured to reuse connections to database.
In addition, if you are hosting static files on the same system, then move them out either on another web server or do even better by moving static files to a cloud system with CDN like AWS's S3+CloudFront or Rackspace's CloudFiles. Even without CloudFront, S3 will make you happy. Rackspace's solution comes with Akamai!
Taking out static files will make your web server "oh what happened, what is this silence? ohhh heaven!" since you mentioned this is a website and web pages have many static files for each dynamically generated html page most of the time.
I hope you can save the poor guy from the killer work.
Prefork can still pool 1 connection per backend server per process.
Prefork doesn't necessarily create a new process for each frontend request, the server processes are "pooled" themselves and the behavior depends on e.g. MinSpareServers/MaxSpareServers and friends.
To maximise how often a prefork process will have a backend connection for you, avoid very high or low maxspareservers or very high minspareservers as these will result in "fresh" processes acceptin new connections.
You can log %P in your LogFormat directive to help get an idea if how often processes are being reused.
The Problem in my case was, the the connection pooling between reverse proxy and backend server was not taking place because of the Backend Server Apache closing the SSL connection at the end of each HTTPS request.
The backend Apache Server was doing this becuse of the following Directive being present in the httpd.conf:
SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
This directive does not make sense when the backend server is connected via a reverse proxy and this can be removed from the backend server config.
I was interested in working with apache http server based on next parameters:
On a single server running listenin in one single port
Having condigured several Virtualhosts, one per domain
running each Virtualhost as an instance listening in por 80
been able to reload one domain configuration without having to restart the rest.
I have doubts about the memory consumption and if there's, how should i improve it.
I don't think that would be a memory problem (correct me if I'm wrong) as soon as there's only one http server running?
or maybe yes because each instance comsumes independent memory?
should be same memory compsumption as running all the VirtuallHosts on the main apache config file?
Many thanks, I mainly want to run one instance per domain because I want to be able to restart each VirtualHost configuration when is needed without having to restart the others.
Thanx
First I don't think you can run several apache instance if they are all listening to port 80. Only one process can bind the port.
Apache will have several child processes, all child of the process listenign on port 80, but each child process can be used for any VirtualHost.
You could achieve it by binding different IP on port 80, so having IP based VirtualHosts. Or by using one Apache as a proxy for other Apache instances binded on other ports.
But the restart problem is not a real problem. Apache can perform safe-restart (reload on some distributions) where each child process is reloaded after the end of his running job. So it's a transparent restart, without any HTTP request killed. Adding or removing a VirtualHost does not need a restart, a simple reload is enought.
I have to think there are ways of achieving what you want without individual instances. Seriously large virtual hosting companies use apache, I am hard pressed to believe your needs are more complex than theirs. Example: http://httpd.apache.org/docs/2.0/vhosts/mass.html
Maybe you should run two apache servers to do a rolling restart when it is really needed, which would prevent any individual site from being down as well.
If two web servers are configured in between a load balancer and a weblogic cluster, will the two Apache server maintain session stickiness?
Say for example, the load balancer forwards the first request to the 1st apache and in turn 1st apache forwards to 1st WL managed instance. Even if the second req from the same user is forwarded by the load balancer to the second apache, will the second apache be able to forward it to the 1st WLManaged instance which served the first request rather than the second WLManaged instance which is not aware of the session information at all.
What should ideally be the behaviour of the weblogic apache plugin? The catch is I don't want to enable session replication on the wl server cluster.
According to the section "Failover, Cookies, and HTTP Sessions" of the Apache HTTP Server Plug-In:
When a request contains session information stored in a cookie or in the POST data, or encoded in a URL, the session ID contains a reference to the specific server instance in which the session was originally established (called the primary server) and a reference to an additional server where the original session is replicated (called the secondary server). A request containing a cookie attempts to connect to the primary server. If that attempt fails, the request is routed to the secondary server. If both the primary and secondary servers fail, the session is lost and the plug-in attempts to make a fresh connection to another server in the dynamic cluster list. See Figure 3-1 Connection Failover.
Note: If the POST data is larger than 64K, the plug-in will not parse the POST data to obtain the session ID. Therefore, if you store the session ID in the POST data, the plug-in cannot route the request to the correct primary or secondary server, resulting in possible loss of session data.
Figure 3-1 Connection Failover
In other words, yes, both Apache servers will be able to forward an incoming request to the "right" WebLogic instance as the session ID contains all the required information for that. Note that there is no real need to confirm this with testing but it would very easy though.
UPDATE: Answering the following comment from the OP
I think this document stands good for only one apache server. In my case I have two and the load balancer forwards the requests to both the servers in a 50:50 manner. I did test this and the weblogic plugin is not maintaining the stickiness.
I understood you are using two apache fontend and I'm not sure this document applies to configuration with one apache server only. As explained, the session ID contains a reference of the primary server (and the secondary server as well) so both apache should be able to deal with it. At least, this is my understanding. Actually, I've worked with a similar configuration in the past but can't remember if things were working as I think they should or if the load balancer was configured to handle stickiness too (i.e. forward to a given Apache server). I have a little doubt now...
Could post your plugin configuration (of both apache server if they differ)? Could you also confirm that things are working as expected when only one apache server is up (and test this with both apache if their configuration differ, which shouldn't be the case though)?
When you have 2 Apache instances with a TCP load balancer in front, the stateflow diagram is not applicable anymore, because the Apache instances do not share their states.
I guess that the WebLogic plug-in maintains a state with a directional mapping [IPAddress+Port -> JVMID]. If it receives a cookie with a JVMID it does not know yet (for instance, it has never sent a request to this server yet), it has no way to know which IPAdress+Port it refers to, so it will not be able to reuse these JVMID and it will reassign new primary/secondary ones, which will be identical for 2 instances (maybe swapped), and which might be different if there are strictly more than 2 instances.
I did not confirm it by running specific tests, but on paper it seems not to work in all cases.
The answer is yes. We've got a write up of this on our blog http://blog.c2b2.co.uk/2012/10/basic-clustering-with-weblogic-12c-and.html which provides step by step instructions on setting up web session failover in a cluster.
Essentially the jsessionid cookie encodes the primary and secondary weblogic servers. Mod-wl parses the cookie and routes the request to the primary server. In your case Managed Server 1. If it is down it will automatically route the request to the backup server Managed Server 2.
The diagram above holds true for 2 Apache servers connected to the same WL cluster. The cookie session info contains details on what WLS to connect to and the plugin will respect that. If the primary (the server it originally connected to) WL server ins't available, then the request would be sent to the secondary server (designated such at the time of the first request based on the rules defined in selecting a "Preferred Replication Group"). This secondary server maintains the same session state as the primary WLS server and should be able to handle the request.
If session replication isn't setup (I think this is OFF by default), then there would be no session copied to another server and if the original/primary WL server goes down, you lose the session.
The answer is NO. As you have 2 Apache webserver, you need to implement stickiness at both hardware and software loadbalancer level in order to achieve your requirement.
Means you already have sticky session implemented in Weblogic plug-in for Apache level, but you also need Source IP based stickiness at the hardware loadbalancer level. This will allow your hardware loadbalancer to send the subsequent request from same user to same apace web server.