How to add X-Robots-Tag "noindex" for multiple subdirectories (from .htaccess / shared host) - apache

I have a list of folders (named as numbers) located in the domain.com/user/uploaded/ directory (for example: ../435/, ../580/ etc.).
I'm trying to use Header set X-Robots-Tag "noindex" from .htaccess for these folders, for example:
domain.com/user/uploaded/435/
domain.com/user/uploaded/580/
etc. for other folders within /user/uploaded/{number} folders.
That means the directories named /435/, /580/ etc. should have 'X-Robots-Tag: noindex' added.
I only have access to .htaccess (it's shared host / litespeed). I tried to add this:
<FilesMatch "^user/uploaded/?$">
Header set X-Robots-Tag: "noindex"
</FilesMatch>
but it doesn't seem to work.

You should put a new .htaccess in the user/uploaded/ directory and place the rule in that file:
Header set X-Robots-Tag "noindex"
You don't need FilesMatch unless you want to target specific files.
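A minimal sketch of that suggestion, assuming the host (LiteSpeed here) honours mod_headers-style Header directives in .htaccess. The file location follows the answer; the curl check in the comment is only a suggested way to verify the result. Because .htaccess settings apply to the directory and everything beneath it, the header would also be sent for anything stored directly in user/uploaded/, not just for the numbered folders.

```apache
# user/uploaded/.htaccess
# Applies to user/uploaded/ itself and to every folder below it (435/, 580/, ...).
Header set X-Robots-Tag "noindex"

# To check the result from a shell (URL taken from the question):
#   curl -I https://domain.com/user/uploaded/435/
# and look for an "X-Robots-Tag: noindex" line in the response headers.
```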

Related

Why the files directive doesn't work in Apache's httpd.conf?

I had to noindex PDF files. I have done this many times, so in this case I used a Files directive to add a noindex header with X-Robots-Tag, as Google recommends:
<Files ~ "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
When I used this before, it worked like a charm. But in this case, neither the X-Robots-Tag header nor its value (noindex, nofollow) appeared in the response headers. mod_headers was enabled.
I tried
<FilesMatch ~ "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
with no luck.
After a lot more trial and error, I got it working with
<LocationMatch ~ "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</LocationMatch>
But I don't really understand why the rule I used for years stopped working and the rule I blindly tried suddenly works.
Could somebody explain it to me?
The Apache documentation states that FilesMatch takes a regular expression pattern (<FilesMatch regexp>) and is preferred over <Files ~ "regexp">:
The <FilesMatch> directive limits the scope of the enclosed directives by filename, just as the <Files> directive does. However, it accepts a regular expression.
In regex terms, the pattern is tested against the file name and is unanchored unless you add ^ or $, whereas a plain <Files> argument is a literal file name or a shell-style wildcard pattern.
To match the whole name of every named file explicitly, you can tweak your existing code slightly:
<FilesMatch ".+\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
If you expect to have a file named just .pdf that should also get the header, replace + in that expression with *. This is due to how regular expressions match:
. Matches any single character.
+ The preceding character or group must occur one or more times.
* The preceding character or group may occur zero or more times.
This means .+ matches all files with at least one character before .pdf in the filename, and .* matches all files ending in .pdf, including one named just .pdf.
As for an explanation on why your Files directive doesn't work:
The Files directive may be overridden by other Files directives appearing later in the same configuration or within a .htaccess file in the directory you're keeping the pdf files in. Furthermore, there's an order in which the directives are handled and they can all override previous steps:
Directory < Files in Directory < .htaccess < Files in .htaccess < Location. So most probably a different part of the configuration is overriding the Files directive.
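For comparison, here is a hedged sketch of the two regex sections written the way the documentation gives their syntax: the *Match directives take the pattern directly, without the ~ that <Files ~ "regexp"> uses. The httpd.conf placement comes from the question title; nothing else about the asker's setup is known, so treat this as an illustration, not a diagnosis.

```apache
# httpd.conf or a virtual host
<IfModule mod_headers.c>
    # Filesystem view: applies when the request is served from a file on disk
    # whose name ends in .pdf. Also allowed in .htaccess.
    <FilesMatch "\.pdf$">
        Header set X-Robots-Tag "noindex, nofollow"
    </FilesMatch>

    # URL view: applies to any request whose URL path ends in .pdf, even if
    # the response is generated or proxied. Not allowed in .htaccess.
    <LocationMatch "\.pdf$">
        Header set X-Robots-Tag "noindex, nofollow"
    </LocationMatch>
</IfModule>
```

Normally only one of the two sections is needed. If the PDFs are not plain files on disk (for example, they are produced by a script or a proxy), only the Location-based section would be expected to apply.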

.htaccess to allow only pdf files in a subdirectory

I am trying to write an .htaccess file to only allow access to pdf files in a subdirectory. I'm going to deploy the file on a host that I don't control, so I can't make changes to the apache configuration.
I want to only allow access to .pdf files in the Foo directory. I have attempted:
Deny From All
<FilesMatch ".+\/Foo\/.+\.pdf$">
Allow From All
</FilesMatch>
However, when I attempt to access example.com/bar/Foo/baz.pdf, I am given an HTTP 403 Forbidden response.
How can I deny access to everything, except for pdf files in one particular directory?
Thanks
Put this inside the root .htaccess as your very first rule:
RewriteEngine On
# forbid every request under foo/ that does not end in .pdf
RewriteRule ^foo/(?!.*\.pdf$) - [F,NC]
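One likely reason the attempt in the question returns 403 is that <FilesMatch> is tested against the file name only (for example baz.pdf), not against its path, so a pattern containing /Foo/ never matches. A sketch of an alternative without mod_rewrite, assuming you can place a second .htaccess inside the Foo directory and that the host still accepts the 2.2-style Order/Allow/Deny directives used in the question:

```apache
# Foo/.htaccess (hypothetical placement: inside the directory to open up)
Order Deny,Allow
Deny from all

# FilesMatch sees only the file name, so no directory part appears in the regex.
<FilesMatch "\.pdf$">
    Allow from all
</FilesMatch>
```

The root .htaccess would keep its plain "Deny From All", so everything outside Foo stays blocked.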

.htaccess deny all except some directories

I have a folder that I wish to deny access to, but I want one subdirectory (with all its files and any subdirectories) to remain accessible.
Sample directory structure:
/modules/
/modules/gallery/public/manifest.xml
/modules/gallery/public/js/core.js
/modules/gallery/public/css/master.css
/modules/news/public/images/status.png
/modules/news/public/css/style.css
The .htaccess file needs to be in "modules" because its subdirectories are user-provided (they are plugins to a CMS). Each user-provided folder might have a "public" directory, and only files and folders in "public" should be accessible.
You can set an environment variable if the request contains /public/, doing something like this in your .htaccess file in the modules directory:
SetEnvIf Request_URI /public/ ispublic=1
Order Deny,Allow
Deny from all
Allow from env=ispublic
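If you want to be even more restrictive, you can tweak the /public/ regex to include depth, for example only one directory below modules. Request_URI holds the full URL path, so the pattern below assumes modules is served at the site root; if it is reached as /modules/, the pattern would need to be ^/modules/[^/]+/public/ instead: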
SetEnvIf Request_URI ^/[^/]+/public/ ispublic=1
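The Order/Deny/Allow directives above are the Apache 2.2 form. On Apache 2.4 the same idea can be sketched with Require, assuming mod_setenvif and mod_authz_core are available and the host allows these directives in .htaccess:

```apache
# modules/.htaccess, Apache 2.4 syntax
# Flag any request whose URL path contains /public/ ...
SetEnvIf Request_URI "/public/" ispublic=1

# ... and grant access only when that flag is set; everything else is refused.
Require env ispublic
```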

Set Content-Disposition header to attachment only on files in a certain directory?

I've got this rule in my .htaccess file to force linked files to download rather than open in the browser:
<FilesMatch "\.(gif|jpe?g|png)$">
ForceType application/octet-stream
Header set Content-Disposition attachment
</FilesMatch>
Is there a way to alter the RegExp so it only applies to files in a certain directory?
Thanks
As @gumbo said, put the .htaccess file in the highest-level folder you want to affect, and those settings will trickle down to its subfolders. You may also want to make sure the headers module is enabled before using this in your .htaccess file. The following line will generate an error if the headers module is not enabled:
Header set Content-Disposition attachment
Here's an example that forces download of mp3 files only if the headers module is enabled:
<IfModule mod_headers.c>
<FilesMatch "\.(mp3|MP3)$">
ForceType audio/mpeg
Header set Content-Disposition "attachment"
Allow from all
</FilesMatch>
</IfModule>
Note: it does not enable the module, it just ignores anything inside the IfModule tags if the module is not enabled.
To enable Apache modules you'll either need to edit your httpd.conf file or, in WampServer, click the wamp tray icon, select "Apache -> Apache Modules -> headers_module" and make sure it is checked.
You will probably need to put the directives in the .htaccess file in the particular directory.
Put it in a <Location> directive (which is only available in the server or virtual-host configuration, not in .htaccess), and/or modify the regex to exclude slashes, or as appropriate.
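A minimal sketch of the per-directory approach, combining the rule from the question with the IfModule guard shown above. The directory name downloads/ is hypothetical; use whichever folder should force downloads.

```apache
# downloads/.htaccess (hypothetical directory name)
# Applies only to this folder and its subfolders.
<IfModule mod_headers.c>
    <FilesMatch "\.(gif|jpe?g|png)$">
        ForceType application/octet-stream
        Header set Content-Disposition "attachment"
    </FilesMatch>
</IfModule>
```

The regex itself stays unchanged because FilesMatch only ever sees the file name; the directory restriction comes from where the .htaccess file is placed.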

.htaccess allow one specific file format only in directory listing

I have a directory but only want one file type to be listed.
I've tried the following:
<FilesMatch "\.(?!ext).*$">
Order Allow,Deny
Deny from all
</FilesMatch>
However it gives me a 403.
Is there any way to do this?
Check out the IndexIgnore directive:
The IndexIgnore directive adds to the list of files to hide when listing a directory. File is a shell-style wildcard expression or full filename. Multiple IndexIgnore directives add to the list, rather than replacing the list of ignored files. By default, the list contains . (the current directory).
IndexIgnore README .htaccess *.bak *~