How to disable caching of a rewrite rule which proxies an internal server? - apache

I'm using an htaccess rule to proxy to an internal server, using the answer recommended on this question, "Can ProxyPass and ProxyPassReverse Work in htaccess". I'm using htaccess because that is all I have access to. The suggested method works, but when I make a change on one of the internal pages and reload (from the external server), I don't even see the request hit the internal server, even after clearing the browser cache. In fact, if I load the page from another browser that has never loaded it before, it too gets the old copy.
This suggests something is being cached on the server, but how to change this? The apparent caching is rather annoying as I am trying to fix some issues that only occur on the proxied page.
If I hit the internal server directly and reload after a change, I always get the latest page.
I have tried a <filesMatch ...> rule for the affected pattern (using the same pattern as in the RewriteRule) in the following manner:
<filesMatch "^/?somedir/(.*)$">
Header set Cache-Control "max-age=0, private, no-store, no-cache, must-revalidate"
</filesMatch>
My rewrite rule looks like this, and comes after the filesMatch directive:
RewriteEngine On
RewriteRule ^/?somedir/(.*)$ https://internal.local.net:8000/$1 [L,P]
But this has not had any effect. I have also tried "NoCache *" but this directive causes an error as it is not allowed in an .htaccess file.

The P flag in your RewriteRule causes the request to be proxied to the internal server using mod_proxy. mod_proxy by itself does not cache content, so the caching is probably the result of mod_cache also being enabled on the server. Unfortunately, the settings that disable caching for your internal server can only be made in the server or virtual-host config. The workaround is to add what you tried to the configuration of the internal server instead, which tells mod_cache not to cache any response from your internal server:
Using .htaccess
Header set Cache-Control "max-age=0, private, no-store, no-cache, must-revalidate"
or PHP
header('Cache-Control: no-cache, no-store, must-revalidate'); // HTTP 1.1.
header('Pragma: no-cache'); // HTTP 1.0.
header('Expires: 0'); // Proxies.

Try adding this in an htaccess file in your "somedir" directory:
ExpiresActive On
ExpiresDefault "now"

Related

Remove a header based on query param with varnish

I want to remove the Cache-Control header from URLs that carry a specific query parameter, e.g. when the query parameter ajax=1 is present, as in:
www.domain.com?p=3&scroll=1&ajax=1&scroll=1
These are being cached by Chrome for longer than I would like, and I want to stop that in this specific case. I have tried .htaccess, which works for static files but has no effect on the URLs mentioned above.
RewriteEngine on
RewriteCond %{QUERY_STRING} (^|&)ajax=1(&|$)
Header unset "Cache-Control"
I could use a cache buster in the next website release, but that is difficult in production, and I'm worried it would unnecessarily cache lots of files in users' browsers, so I would rather solve this server side.
My server has Cloudflare, then NGINX terminating SSL, then Varnish, then Apache with a Magento 2 instance running on it. So I'm thinking I could achieve this with NGINX or Varnish configs, or even Cloudflare. However, I couldn't find a way to do this with page rules in Cloudflare, and could not find examples for Varnish or NGINX.
I'm assuming you don't want to cache when ajax=1 is part of your URL params?
You can do this in Varnish using the following VCL snippet:
sub vcl_backend_response {
    if (bereq.url ~ "\?([^&]*&)*ajax=1(&[^&]*)*$") {
        set beresp.http.cache-control = "private, no-cache, no-store";
        set beresp.uncacheable = true;
    }
}
This snippet will make sure Varnish doesn't cache responses where the URL contains an ajax=1 URL parameter. It will also make sure any caching proxy that sits in front will not cache, because of the Cache-Control: private, no-cache, no-store.
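To sanity-check which URLs the pattern in the VCL above catches, the same regular expression can be tried outside Varnish. This is a minimal Python sketch, assuming Python's re module treats this particular pattern the same way as the regex engine in Varnish; the helper name and the sample URLs are made up for illustration.

```python
import re

# Same pattern as in the VCL snippet above.
AJAX_PARAM = re.compile(r"\?([^&]*&)*ajax=1(&[^&]*)*$")


def has_ajax_flag(url: str) -> bool:
    """Return True if the query string has a parameter that is exactly ajax=1."""
    return AJAX_PARAM.search(url) is not None


if __name__ == "__main__":
    # Hypothetical sample URLs, not taken from a real site.
    for sample in (
        "/?p=3&scroll=1&ajax=1&scroll=1",
        "/?ajax=1",
        "/?ajax=10",
        "/?myajax=1",
        "/catalog/page.html",
    ):
        print(sample, has_ajax_flag(sample))
```

Only the first two sample URLs should be flagged; ajax=10 and myajax=1 are different parameters and are left alone.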
Is this what you're looking for?

Why nginx does not forward Vary header sent by Apache in proxy mode?

I'm using Plesk (seems to be 17.8.11, provided by OVH) and nginx is configured as a proxy. My PHP script returns images in WebP format when the browser accepts it; otherwise it returns the original format (JPG or PNG).
In .htaccess I return header Vary: Accept so proxies know that the content depends on the Accept header.
In nginx settings of Plesk I only checked the 'Proxy mode' option, other checkboxes are cleared.
When I fetch the image, the Vary: Accept header is not present. I cannot imagine that nginx does not handle this header; please help me figure this out.
For the Vary: header to be allowed and understood by nginx, you need the gzip on and gzip_vary on settings in your /etc/nginx/nginx.conf.
Plesk actually has documentation about it; did you check the Plesk Support website?
https://support.plesk.com/hc/en-us/articles/213380049-How-to-enable-disable-gzip-compression-in-nginx-on-a-Plesk-server
By the way, your Plesk version is quite old, I would recommend you update it.
I finally found the reason: I was not sending the "Vary: Accept" header for the ".webp" extension, only for ".jpg" and ".png". My URLs end with .jpg or .png, never .webp, and this works fine with Apache. Here were my htaccess directives:
<IfModule mod_setenvif.c>
SetEnvIf Request_URI "\.(jpe?g|png)$" REQUEST_image
</IfModule>
<IfModule mod_headers.c>
Header append Vary Accept env=REQUEST_image
</IfModule>
To fix it, I added .webp to the URL filter:
<IfModule mod_setenvif.c>
SetEnvIf Request_URI "\.(jpe?g|png|webp)$" REQUEST_image
</IfModule>
<IfModule mod_headers.c>
Header append Vary Accept env=REQUEST_image
</IfModule>
Now it's all good.
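The reason Vary: Accept matters in this thread is that the same URL can return different bytes depending on the request's Accept header, so caches need to key on that header. A rough Python sketch of that negotiation follows; the function names and the simplified header parsing (q-values are ignored) are assumptions for illustration, not the PHP script from the question.

```python
def negotiate_image_type(accept_header: str, original_type: str) -> str:
    """Pick the media type to serve for an image URL.

    Simplified: ignores q-values and only checks whether image/webp
    is listed explicitly in the Accept header.
    """
    offered = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    return "image/webp" if "image/webp" in offered else original_type


def response_headers(accept_header: str, original_type: str) -> dict:
    """Build the response headers; Vary: Accept is sent on every variant."""
    return {
        "Content-Type": negotiate_image_type(accept_header, original_type),
        "Vary": "Accept",
    }
```

Sending Vary: Accept on both the WebP and the fallback response is what lets an intermediate cache keep the two variants apart.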

Clear web browser cache programmatically

I am working on a website with PHP in the backend and AngularJS in the frontend, served via Apache 2.4.
My problem is that when I update my website to a new version, some users cannot see the latest modifications. I added this .htaccess to force the cache to expire every hour, but it doesn't work as I expected.
FileETag None
<ifModule mod_headers.c>
Header unset ETag
Header set Cache-Control "max-age=3600, must-revalidate, private"
</ifModule>
Could you give me the right cache configuration to force the browsers to get the last update whenever a new version is available?
Within your build process, you could append a query parameter to your static files such as JS / CSS, like app.js?1476109496 (where the number is a unique reference such as the deployment epoch, a commit hash or similar). This causes browsers to request a new version without you needing to mess with your .htaccess.
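As an illustration of that build step, here is a minimal Python sketch. It uses a short content hash as the unique reference rather than the deployment epoch mentioned above, which is a swapped-in variant; the function name, the v= parameter name, and the sample file contents are hypothetical.

```python
import hashlib


def versioned_url(path: str, content: bytes, length: int = 8) -> str:
    """Append a short content hash to a static asset URL as a query parameter."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    # Use "&" if the path already carries a query string.
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}v={digest}"


if __name__ == "__main__":
    # Hypothetical file contents for two releases of the same asset.
    print(versioned_url("app.js", b"console.log('release 1');"))
    print(versioned_url("app.js", b"console.log('release 2');"))
```

With a content hash, the URL only changes when the file itself changes, so unchanged assets stay cached across deployments.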

How to disable X-Powered-By on OVH mutualized server using apache?

I tried to disable both the X-Powered-By and Server headers for security reasons by adding the following to my .htaccess on an OVH mutualized server.
<IfModule mod_headers.c>
# Security disable headers. http://www.shanison.com/2012/07/05/unset-apache-response-header-protect-your-server-information/
Header unset Server
Header unset X-Powered-By
</IfModule>
But it doesn't work; I still get these headers when running HTTP requests. Why? Is it impossible because mod_headers is somehow not loaded on a mutualized server?
There is a PHP function as well which is able to do this:
<?php
header_remove("X-Powered-By");
?>
http://php.net/manual/en/function.header-remove.php
Hope this helps.
You can add these lines to your .htaccess:
Header always unset X-Powered-By
Header unset X-Powered-By

Why is this FilesMatch not matching correctly?

We have been attempting to configure our server not to cache our .htm files as it is causing a few issues with our analytics package as well as not displaying the pages correctly if the visitor hits the back button in their browser.
We have attempted to tackle it by adding:
<FilesMatch "\.(htm)$">
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
Header set Warning "Testing"
</FilesMatch>
to our httpd file, but it does not appear to execute. However, when we move the Header set directives outside of the FilesMatch, they appear to execute fine.
Anyone have any ideas where we are going wrong?
I recently needed to figure out the same kind of problem and, although this post pointed me in the right direction, I wanted to share some clarifying information for the edification of those who search on this topic in the future.
David, your initial FilesMatch was not working because FilesMatch only works on real, physical files that exist on your filesystem. http://httpd.apache.org/docs/current/sections.html states it as:
The Directory and Files directives, along with their regex counterparts, apply directives to parts of the filesystem.
This is also why your second post using LocationMatch resolved the issue. Also from http://httpd.apache.org/docs/current/sections.html, it states:
The Location directive and its regex counterpart, on the other hand, change the configuration for content in the webspace. < SNIP > The directive need not have anything to do with the filesystem. For example, the following example shows how to map a particular URL to an internal Apache HTTP Server handler provided by mod_status. No file called server-status needs to exist in the filesystem.
<Location /server-status>
SetHandler server-status
</Location>
The Apache docs summarize this behavior with the following statement:
Use Location to apply directives to content that lives outside the filesystem. For content that lives in the filesystem, use Directory and Files. An exception is <Location />, which is an easy way to apply a configuration to the entire server.
For those that want to understand more of the mechanics, this is how I understand the internals:
Location directives match based on the HTTP request URI (e.g. example.com/this/is/a/uri.htm without the example.com part).
Directory and Files directives, on the other hand, match based on whether there is a directory path or file in the filesystem of the DocumentRoot that matches the respective part of the HTTP request URI.
The Apache docs summarize this behavior as:
What to use When
Choosing between filesystem containers and webspace containers is actually quite easy. When applying directives to objects that reside in the filesystem always use Directory or Files. When applying directives to objects that do not reside in the filesystem (such as a webpage generated from a database), use Location.
[IMPORTANT!] It is important to never use Location when trying to restrict access to objects in the filesystem. This is because many different webspace locations (URLs) could map to the same filesystem location, allowing your restrictions to be circumvented.
This issue has now been resolved.
In order to get it to work we have changed from using FilesMatch to LocationMatch and now the headers are being set perfectly.
We believe this is because the page is being redirected from a JSP page to an HTML page.
<LocationMatch "\.(htm|html)$">
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
Header set Warning "Testing"
</LocationMatch>
Hopefully others will find this helpful.