Varnish 4 rewrite URL transparently - apache

I am looking after a website that is currently running a pretty standard Varnish/Apache setup. The client needs to add a new domain that transparently serves from a path/query string in order to create a lightweight version of their site. For example:
The user visits mobile.example.com, which points to the same server as example.com.
Varnish rewrites the mobile.example.com request to example.com/mobile?theme=mobile.
The user receives the page served from example.com/mobile?theme=mobile by Apache, but stays on mobile.example.com.
We need to hit both a path and add a query string here, as well as maintain any path the user has entered, e.g. mobile.example.com/test should serve the content at example.com/mobile/test?theme=mobile.
Any tips for doing this with Varnish 4? Is it possible?

Got it working!
sub vcl_recv {
    if (req.http.host ~ "^mobile\.example\.com") {
        # Serve the content from the main domain while the user stays on mobile.example.com
        set req.http.host = "example.com";
        # Prefix the path with /mobile/ and append the theme query string
        set req.url = regsub(req.url, "^/", "/mobile/");
        set req.url = regsub(req.url, "$", "?theme=mobile");
    }
}
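If the incoming request can already carry a query string of its own (e.g. mobile.example.com/test?foo=1), blindly appending "?theme=mobile" would produce a second "?". A small variant of the same vcl_recv logic that guards against that (a sketch, not tested against the original setup):
sub vcl_recv {
    if (req.http.host ~ "^mobile\.example\.com") {
        set req.http.host = "example.com";
        set req.url = regsub(req.url, "^/", "/mobile/");
        # Use & when a query string is already present, ? otherwise
        if (req.url ~ "\?") {
            set req.url = req.url + "&theme=mobile";
        } else {
            set req.url = req.url + "?theme=mobile";
        }
    }
}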

Related

ktor redirecting to 0.0.0.0 when doing an https redirect

I've added an http redirect to my Ktor application and it's redirecting to https://0.0.0.0 instead of to the actual domain's https URL.
@ExperimentalTime
fun Application.module() {
    if (ENV.env != LOCAL) {
        install(ForwardedHeaderSupport)
        install(XForwardedHeaderSupport)
        install(HttpsRedirect)
    }
Intercepting the route and printing out the host:
routing {
    intercept(ApplicationCallPipeline.Features) {
        val host = this.context.request.host()
I seem to be getting 0:0:0:0:0:0:0:0 for the host.
Do I need to add any special headers to Google Cloud's Load Balancer for this https redirect to work correctly? It seems like it's not picking up the correct host.
As your Ktor server is hidden behind a reverse proxy, it isn't tied to the "external" host of your site. Ktor has a specific feature for working behind a reverse proxy, so it should be as simple as install(XForwardedHeaderSupport) during configuration and referencing request.origin.remoteHost to get the actual host.
Let's try to see what's going on.
You create a service under http://example.org. On port 80 of the host for example.org, there is a load balancer. It handles all the incoming traffic, routing it to the servers behind it.
Your actual application is running on another virtual machine. It has its own IP address, internal to your cloud, and accessible by the load balancer.
Let's see a flow of HTTP request and response for this system.
An external user sends an HTTP request to GET / with Host: example.org on port 80 of example.org.
The load balancer gets the request, checks its rules and finds an internal server to direct the request to.
The load balancer crafts a new HTTP request, mostly copying the incoming data, but updating the Host header and adding several X-Forwarded-* headers to keep information about the proxied request (see here for info specific to GCP).
The request hits your server. At this point you can analyze X-Forwarded-* headers to see if you are behind a reverse proxy, and get needed details of the actual query sent by the actual user, like original host.
You craft the HTTP response, and your server sends it back to the load balancer.
The load balancer passes this response to the external user.
Note that although there is RFC 7239 for specifying information on request forwarding, the GCP load balancer seems to use the de-facto standard X-Forwarded-* headers, so you need XForwardedHeaderSupport, not ForwardedHeaderSupport (note the additional X).
So it seems either Google Cloud Load Balancer is sending the wrong headers or Ktor is reading the wrong headers or both.
I've tried
install(ForwardedHeaderSupport)
install(XForwardedHeaderSupport)
install(HttpsRedirect)
or
//install(ForwardedHeaderSupport)
install(XForwardedHeaderSupport)
install(HttpsRedirect)
or
install(ForwardedHeaderSupport)
//install(XForwardedHeaderSupport)
install(HttpsRedirect)
or
//install(ForwardedHeaderSupport)
//install(XForwardedHeaderSupport)
install(HttpsRedirect)
All these combinations are working on another project, but that project is using an older version of Ktor (this being the one that was released with 1.4 rc) and that project is also using an older Google Cloud load balancer setup.
So I've decided to roll my own.
This line will log all the headers coming in with your request,
log.info(context.request.headers.toMap().toString())
then just pick the relevant ones and build an https redirect:
routing {
    intercept(ApplicationCallPipeline.Features) {
        if (ENV.env != LOCAL) {
            log.info(context.request.headers.toMap().toString())
            // workaround for call.request.host that contains the wrong host
            // and not redirecting properly to the correct https url
            val proto = call.request.header("X-Forwarded-Proto")
            val host = call.request.header("Host")
            val path = call.request.path()
            if (host == null || proto == null) {
                log.error("Unknown host / port")
            } else if (proto == "http") {
                val newUrl = "https://$host$path"
                log.info("https redirecting to $newUrl")
                // redirect the browser and stop processing this call
                this.context.respondRedirect(url = newUrl, permanent = true)
                this.finish()
            }
        }
    }
}

Remove a header based on query param with varnish

I want to remove a Cache-Control header from URLs with a specific query param, e.g. when the query parameter ajax=1 is present.
e.g.
www.domain.com?p=3&scroll=1&ajax=1&scroll=1
These are getting cached by Chrome browsers for longer than I would like and I would like to stop that in this specific case. I have tried .htaccess, which works for static files but not for the URLs mentioned above.
RewriteEngine on
RewriteCond %{QUERY_STRING} (^|&)ajax=1(&|$)
Header unset "Cache-Control"
I could use a cache buster in the next website release, but that is difficult in production, and I am worried it would unnecessarily cache lots of files in users' browsers, so I would rather achieve this server side.
My server has Cloudflare, then NGINX terminating SSL, then Varnish, then Apache with a Magento 2 instance running on it. So I am thinking I could possibly achieve this with the NGINX or Varnish configs, or even Cloudflare. However, I couldn't find a way to achieve this with page rules in Cloudflare, nor could I find examples for Varnish or NGINX.
I'm assuming you don't want to cache when ajax=1 is part of your URL params?
You can do this in Varnish using the following VCL snippet:
sub vcl_backend_response {
    if (bereq.url ~ "\?([^&]*&)*ajax=1(&[^&]*)*$") {
        set beresp.http.Cache-Control = "private, no-cache, no-store";
        set beresp.uncacheable = true;
    }
}
This snippet will make sure Varnish doesn't cache responses where the URL contains an ajax=1 URL parameter. It will also make sure any caching proxy that sits in front will not cache, because of the Cache-Control: private, no-cache, no-store.
Is this what you're looking for?
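If you literally want the Cache-Control header removed rather than overridden (as the question title suggests), a sketch along the same lines is shown below. Note that with no Cache-Control at all, browsers may still apply heuristic caching, so the explicit private, no-cache, no-store above is usually the more reliable option.
sub vcl_backend_response {
    if (bereq.url ~ "\?([^&]*&)*ajax=1(&[^&]*)*$") {
        # Drop the backend's Cache-Control and keep the response out of Varnish's cache
        unset beresp.http.Cache-Control;
        set beresp.uncacheable = true;
    }
}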

cloudflare worker rewrite Host Header

How do I set up another Host Header in the cloudflare worker?
For example, I have set up an IP of 1.2.3.4 for my site's www record.
By default, www requests are sent with the Host header www.ex.com, but I want to send the www requests with a Host header of new.ex.com.
You need to configure a DNS record for new.ex.com so that it points to the same IP address. Then, you can make a fetch() request with new.ex.com in the URL.
If you cannot make new.ex.com point at the right IP, another alternative is to make a fetch() request using the resolveOverride option to specify a different hostname's IP address to use:
fetch("https://new.ex.com", {cf: {resolveOverride: "www.ex.com"}});
Note that this only works if both hostnames involved are under your zone. Documentation about resolveOverride can be found here.
You cannot directly set the Host header because doing so could allow bypassing of security settings when making requests to third-party servers that also use Cloudflare.
// Parse the URL.
let url = new URL(request.url)
// Change the hostname.
url.hostname = "check-server.example.com"
// Construct a new request
request = new Request(url, request)
Note that this will affect the Host header seen by the origin
(it'll be check-server.example.com). Sometimes people want the Host header to remain the same.
// Tell Cloudflare to connect to `check-server.example.com`
// instead of the hostname specified in the URL.
request = new Request(request,
    {cf: {resolveOverride: "check-server.example.com"}})

Varnish, using URL for backend instead of IP address

I started setting up a reverse proxy server with Varnish. I'm not experienced at setting up Varnish.
I am trying to use the URL of the backend instead of the IP address, with no luck:
1- Approach a:
backend default {
    .host = "www.backend.mysite.com";
    .port = "80";
}
Issue a: restarting Varnish keeps failing.
2- Approach b:
sub vcl_recv {
    set req.http.Host = "www.backend.mysite.com";
    ...
}
Issue b: with this approach, when I enter mysite.com in the browser bar, it gets redirected to www.backend.mysite.com.
I don't think this is the expected behavior for this rule. Correct me if I am wrong.
Thanks,
Shab
Your first try should work, but your Varnish server needs to have access to the internet, or at least to DNS servers.
When you start Varnish, it will do a DNS lookup and replace www.backend.mysite.com with the first IP it is given by DNS.
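For reference, a minimal sketch of such a backend definition (assuming Varnish 4; the hostname is resolved only once, when the VCL is loaded, so it must be resolvable at that point and should map to a single address):
backend default {
    # Resolved via DNS once, at VCL load time
    .host = "www.backend.mysite.com";
    .port = "80";
    # Optional: the Host header Varnish sends to the backend
    .host_header = "www.backend.mysite.com";
}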

How to tell varnish to cache specific file types

I have a dedicated server that hosts my own websites. I have installed Varnish with the default VCL file. Now I want to tell Varnish to do the following:
Cache only the following static file types (.js, .css, .jpg, .png, .gif). These are the file types the server actually serves, not URLs ending with those extensions.
Do not cache files that are bigger than 1 MB.
Caching of any file should expire in 1 day (or whatever period).
Caching may happen only when Apache sends a 200 HTTP code.
Otherwise leave the request intact so it is served by Apache or whatever backend.
What should I write in the VCL file to achieve those requirements? Or what should I do?
You can do all of this in the vcl_fetch subroutine. This should be considered pseudocode.
import std;

sub vcl_fetch {
    if (beresp.http.Content-Type ~ "text/javascript|text/css|image/.*") {
        if (std.integer(beresp.http.Content-Length, 0) < 1048576) { /* 1 MB max size */
            if (beresp.status == 200) { /* backend returned 200 */
                set beresp.ttl = 86400s; /* cache for one day */
                return (deliver);
            }
        }
    }
    set beresp.ttl = 120s;
    return (hit_for_pass); /* won't be cached */
}
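vcl_fetch only exists up to Varnish 3; on Varnish 4 the equivalent logic would go into vcl_backend_response, roughly like this (a sketch under the same assumptions, using beresp.uncacheable instead of hit_for_pass):
sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "text/javascript|text/css|image/.*") {
        if (std.integer(beresp.http.Content-Length, 0) < 1048576 &&
            beresp.status == 200) {
            set beresp.ttl = 1d; /* cache JS/CSS/images under 1 MB for one day */
            return (deliver);
        }
    }
    /* everything else: keep out of the cache for 2 minutes (hit-for-pass) */
    set beresp.ttl = 120s;
    set beresp.uncacheable = true;
    return (deliver);
}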
What I did:
1- Isolate all static content on another domain (i.e. the domain that serves the dynamic pages is different from the domain that serves the static content).
2- Assign another dedicated IP address to the domain that serves the static content.
3- Tell Varnish to listen only on that IP (i.e. the static content IP) on port 80.
4- Use the Apache conf to control the caching period for each static content type (Varnish will just obey those headers).
Pros:
1- Varnish will not even listen to or process the requests it should leave intact. Those requests (for dynamic pages) go directly to Apache, since Apache listens on the original IP (performance).
2- No need to change the default VCL file (unless you want to debug), which is helpful for those who don't know the principles of the VCL language.
3- You control everything from the Apache conf.
Cons:
1- You have to buy a new dedicated IP if you don't have a spare one.
Thanks