Two sets of cache with one server (Varnish Cache) - reverse-proxy

Is it possible to set up Varnish Cache with two independent cache stores?
Then, based on a custom HTTP header, use either cache1 or cache2.
For example:
Request 1 comes in with header (store=Cache1); this should go to the Cache1 store on Varnish Cache.
Request 2 comes in, exactly like Request 1 but with header (store=Cache2); this should go to the Cache2 store on Varnish Cache.
This use case occurs when the backend responds with a different body based on the header (but with the same URL), which is a legitimate use case.

You could handle this exactly as described by partitioning the Varnish cache, similar to keeping a separate Varnish cache for static files.
But what you want is actually much simpler and can be handled by adjusting your VCL. You only need to tell Varnish that the cached object should differ based on that particular header. So in your VCL, you would specify:
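sub vcl_hash {
    # Add the header value to the hash key when the header is present.
    if (req.http.store) {
        hash_data(req.http.store);
    }
    # No return here, so the built-in vcl_hash still runs and adds the URL and host.
}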
This vcl_hash tells Varnish to keep a separate cached copy for each value of the store HTTP header.
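To illustrate the effect, here is a minimal Python sketch (not Varnish code) of how a lookup key changes once the header value is hashed in. The function name `cache_key`, the separator byte, and the use of SHA-256 are assumptions made for illustration only; the point is simply that the same URL and host yield different keys for different header values.

```python
import hashlib

def cache_key(url, host, store=None):
    """Build an illustrative cache key from the request (hypothetical helper)."""
    h = hashlib.sha256()
    # Mirror the custom vcl_hash: mix in the header only when it is present.
    if store:
        h.update(store.encode() + b"\0")
    # Mirror the default behaviour: the URL and host are always part of the key.
    h.update(url.encode() + b"\0")
    h.update(host.encode() + b"\0")
    return h.hexdigest()

key1 = cache_key("/page", "example.com", store="Cache1")
key2 = cache_key("/page", "example.com", store="Cache2")
print(key1 != key2)  # identical URL and host, but separate cache entries
```

Requests that send the same header value map to the same key, so each value effectively gets its own partition of a single cache.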


Define custom load balancing algorithm

Here is the situation:
I have a number of web servers, say 10. I need to use a (software) load balancer, which can be implemented with a reverse proxy server such as HAProxy or Varnish. All the traffic we serve is over HTTPS rather than HTTP, so Varnish is out of the question.
I want to divide the users' requests into a few categories that depend on one of the input (POST) parameters of the request. Depending on that parameter, I need to distribute the requests among the servers, because different servers would respond differently based on it (even if all other POST parameters are the same).
So I need to define a custom load balancing algorithm such that, for one particular value of that parameter, the load is spread across 3 specific servers (say); for some other value, across 2 specific servers; and for any other value(s), across the remaining 5.
Since I cannot use Varnish, as it cannot be used to terminate SSL (defining a custom algorithm would have been easy in VCL), I am thinking of using HAProxy.
So, here is the question:
Can anyone help me with how to define a custom load balancing function using HAProxy?
I've researched a lot and could not find any documentation on how to do this. So, if it is not possible with HAProxy, can you refer me to some other reverse proxy that can also be used as a load balancer and meets both of the above criteria (SSL termination and the ability to define custom load balancing)?
EDIT:
This question follows on from one of my previous questions: "Varnish to be used for https".
I'm not sure what your goal is, but I'd suggest NOT doing custom routing based on the HTTP request body at all. This will perform very poorly, and likely outweigh any benefit you are trying to achieve.
Anything that has to parse values beyond typical HTTP headers at your load balancer will slow things down. Cookies by themselves are generally a bad idea if you can avoid them.
If you can control the path/route values, that is likely a much better idea than parsing every POST for certain values.
You can probably achieve what you want via NGINX with Lua scripts (the Kong platform is based on them), but I can't say how hard that would be for you...
https://github.com/openresty/lua-nginx-module#readme
Here's an article with a specific example of setting different upstreams based on Lua input.
http://sosedoff.com/2012/06/11/dynamic-nginx-upstreams-with-lua-and-redis.html
server {
    ...BASE CONFIG HERE...
    port_in_redirect off;
    location /somepath {
        lua_need_request_body on;
        set $upstream "default.server.hostname";
        rewrite_by_lua '
            ngx.req.read_body() -- explicitly read the req body
            local data = ngx.req.get_body_data()
            if data then
                -- use data: see
                -- https://github.com/openresty/lua-nginx-module#ngxreqget_body_data
                ngx.var.upstream = some_deterministic_value
            end
        ';
        ...OTHER PARAMS...
        proxy_pass http://$upstream;
    }
}

JMeter issue when using Cookie Manager and Regular Expression Extractor

So basically I need to extract an auth token from the response headers of the 1st HTTP request and then use the extracted data in the cookies of the 2nd (and all following) HTTP requests.
The issue is that I have a Cookie Manager set for the whole controller, and instead of the actual data I get the name of the variable in my cookie: ".authToken=${auth}".
I am guessing the reason is that the variable is not yet declared when the test reaches the Cookie Manager, but I would expect JMeter to be smart enough to declare the variable when it gets to the Regular Expression Extractor.
Structure
Thread
    Cache Manager
    Cookie Manager (Cookie Policy:compatibility; Implementation:HC3)
    Controller
        Http Request
            Regular expression extractor
        Http request (I need to use value extracted above in Request Cookie here)
        Http request (I need to use the same value in Request Cookie here)
        Http request (I need to use the same value in Request Cookie here)
        .....
Details:
All the HTTP requests are recorded with implementation HttpClient3.1.
I'm pretty sure I have everything configured correctly (variable names, regular expression), since it works in one very specific case:
The only time it seemed to work correctly was when I had a Cookie Manager inside the HTTP request and disabled the 'main' Cookie Manager (the one for the whole controller). Then the value got extracted correctly, but that would be a really silly workaround for such a basic requirement, and I have many HTTP requests (over 100) where I need to use the extracted value.
JMeter doesn't need to use the variable before it's declared by the Regular Expression Extractor. I made sure that the domain is correct and that the variable is used for the first time only after it should have been extracted.
Another workaround I thought of would be to have separate threads, link them, and pass the variable between them, launching the next one once the data gets extracted, but that seems a little too drastic.
What I tried:
Splitting the HTTP requests into 2 different controllers and using 2 different Cookie Managers - got "${auth}" instead of a value
Defining a user variable above the controller and then using the "Apply to: JMeter Variable" option - again got just the string "${auth}" instead of a value
Moving the Cookie Manager to a position after the HTTP request used for the extraction - again "${auth}" instead of a value
Setting a different cookie policy (not all of them, but a few)
Setting "CookieManager.save.cookies=true" in jmeter.properties (it is still set to true)
Any help/ideas are appreciated. I have been trying to figure this out for about an hour and I think I must be missing something very simple.
Alright, finally got this resolved after roughly 2 hours.
Thanks to this article, I was able to do what I needed
https://capacitas.wordpress.com/2013/06/11/thats-the-way-the-cookie-crumbles-jmeter-style-part-2/
In a nutshell: you need to use a BeanShell PreProcessor and add the cookie manually.
Here is the BeanShell script in case the site dies:
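import org.apache.jmeter.protocol.http.control.CookieManager;
import org.apache.jmeter.protocol.http.control.Cookie;
// "sampler" and "vars" are provided by JMeter to the PreProcessor script.
CookieManager manager = sampler.getCookieManager();
// Arguments: name, value, domain, path, secure, expires. Replace the placeholder strings with your own values.
Cookie cookie = new Cookie("CookieName", vars.get("YourExtractedVariable"), "Domain", "Path", false, 0);
manager.add(cookie);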

Cache-control: Is it possible to ignore query parameters when validating the cache?

Is it possible to set a Cache-Control header that tells a reverse proxy to ignore query parameters when determining what counts as a unique URI? In short: can a cached entry be treated as valid even if some query parameters have changed?
Sometimes query parameters have nothing to do with the rendering of the page, at least from a server-side perspective, for instance all the utm_* variables from Google AdWords. These are needed by the JavaScript on your page, so you don't want to strip them and redirect to a cached page, but at the same time it would be advantageous not to treat two URIs that are basically the same but have different utm_* parameters as unique when communicating with a reverse proxy.
An example:
http://www.example.com/search?sort=price
http://www.example.com/search?sort=price&utm_campaign=shoes
Is there any way to tell the reverse proxy, using the HTTP 1.1 spec (i.e. some type of HTTP header), that it can treat these two pages as the same?
You can filter the query string in vcl_recv and there is also a Varnish module for that [1].
Also, you have to keep in mind that query string parameter order matters in this case [2]
See also this related question [3]
[1] https://www.varnish-cache.org/vmod/querystring
[2] http://cyberroadie.wordpress.com/2012/01/05/varnish-reordering-query-string/
[3] Stripping out select querystring attribute/value pairs so varnish will not vary cache by them
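As a rough illustration of what such filtering does to the cache key, here is a Python sketch (not VCL) that drops ignorable parameters and sorts the rest so that parameter order no longer matters. The function name `normalize_url` and the default `utm_` prefix list are assumptions for this example; the original query string would still be passed through to the browser untouched.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def normalize_url(url, ignored_prefixes=("utm_",)):
    """Return a cache-key form of the URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep only the parameters that affect server-side rendering.
    pairs = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(ignored_prefixes)
    ]
    # Sort so that ?a=1&b=2 and ?b=2&a=1 map to the same key.
    pairs.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(pairs), parts.fragment)
    )

a = normalize_url("http://www.example.com/search?sort=price")
b = normalize_url("http://www.example.com/search?sort=price&utm_campaign=shoes")
print(a == b)
```

Only the key used for the cache lookup changes here, which is why the utm_* values can remain available to the JavaScript on the page.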

Rails caching: expires_in writes to cache; how can I ignore certain params?

I've been using expires_in to add a Cache-Control header to my responses. This way, when a given user hits the same page again (e.g. when they hit the back button), the browser won't bother hitting the server again until the cache expires.
What I did not realize is that Rails also writes a copy of the HTML to its cache if you specify public: true. This seems innocuous, but if you have a lot of AdSense traffic you will find that the cache quickly fills up, because the gclid param (which is unique for each visitor) is not ignored by expires_in. This is especially problematic if you are using some kind of in-memory cache like Redis or Memcache.
With caches_action I am able to specify a :cache_path argument, and I use that to ignore certain parameters, such as gclid. Is there a way to do something similar with expires_in? Or is the only solution to use 'public: false'?

Does the `Expires` HTTP header need to be consistent across multiple cold-cache requests?

I'm implementing a custom web server of sorts and am looking into adding support for the Expires header. However, I'm a little unsure of exactly how to implement it.
If multiple cold-cache requests are made for the same unchanged resource and the server returns a different Expires header each time (say it computes the Expires date relative to the request time, e.g. +6 hours), does that invalidate the cache on all the proxy servers in between as well? Or can this not happen (per the spec)?
Does the Expires HTTP header need to be consistent across multiple cold-cache requests?
OK, never mind, I found the relevant information under the Cache Revalidation and Reload Controls section of the HTTP spec.
Basically, you can serve all the different validators you want, but you must be aware that in that case proxies may hold a set of different validators, from their own cache and from the various user agents communicating with them. They may choose to send one of them to you, and it might not be the correct or most optimal one for the end users. However, a "best approach" is suggested in the spec.
I suppose this should cover Expires headers as well as ETags, Cache-Control and whatnot.
Here's the relevant excerpt, in case anyone's interested:
When an intermediate cache is forced, by means of a max-age=0 directive, to revalidate its own cache entry, and the client has supplied its own validator in the request, the supplied validator might differ from the validator currently stored with the cache entry. In this case, the cache MAY use either validator in making its own request without affecting semantic transparency. However, the choice of validator might affect performance.
The best approach is for the intermediate cache to use its own validator when making its request. If the server replies with 304 (Not Modified), then the cache can return its now validated copy to the client with a 200 (OK) response. If the server replies with a new entity and cache validator, however, the intermediate cache can compare the returned validator with the one provided in the client's request, using the strong comparison function. If the client's validator is equal to the origin server's, then the intermediate cache simply returns 304 (Not Modified). Otherwise, it returns the new entity with a 200 (OK) response.
If a request includes the no-cache directive, it SHOULD NOT include min-fresh, max-stale, or max-age.
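The "best approach" in the excerpt can be read as a small decision procedure. The Python sketch below is one possible reading of it, assuming ETag-style validators where a `W/` prefix marks a weak validator; the function names `strong_match` and `relay_status` are hypothetical and not part of any library.

```python
def strong_match(a, b):
    """Strong comparison: validators are identical and neither is weak."""
    return a == b and not a.startswith("W/") and not b.startswith("W/")

def relay_status(client_validator, origin_status, origin_validator=None):
    """Status the intermediate cache returns to the client after revalidating
    with its own validator (illustrative only)."""
    if origin_status == 304:
        # The cache's own copy was validated, so it is served in full.
        return 200
    if (
        client_validator is not None
        and origin_validator is not None
        and strong_match(client_validator, origin_validator)
    ):
        # The client already holds the entity the origin just sent.
        return 304
    # Otherwise the new entity is passed on to the client.
    return 200

print(relay_status('"v2"', 200, '"v2"'))
```

In this reading, validators that differ between cold-cache responses cost extra transfers rather than correctness, which matches the excerpt's remark that the choice of validator might affect performance.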