Hotlink protection for only one directory - Apache

I want to block hotlinking of just one directory, say http://www.example.com/pictures.
Every file in other directories should be available as normal.
I have gone through a referrer-checking solution; that can be easily bypassed.
I simply want to show the images in the pictures directory on a page, say view.php.
If images in this directory are accessed from anywhere else, just redirect to an image that says "hotlinking not allowed".
Thanks

I have gone through a referrer checking solution. That can be easily bypassed.
Yes, it can. But so can any hotlink-prevention scheme. It'll deter casual/careless use, but that's about it.
For what it's worth, referrer-checking is pretty much the way to do it.
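For completeness, here is roughly what a referrer check scoped to just the pictures directory might look like, as an .htaccess file dropped inside /pictures. This is only a sketch: the domain, the image extensions, and the /nohotlink.png placeholder are assumptions to adapt to your site (and the placeholder image must live outside /pictures so the rule can't loop).
# /pictures/.htaccess - a hedged sketch, requires mod_rewrite
RewriteEngine On
# Let through requests with an empty referrer (direct visits, strict privacy settings)
RewriteCond %{HTTP_REFERER} !^$
# Let through requests referred by your own pages, e.g. view.php
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Everything else asking for an image gets the "hotlinking not allowed" picture
RewriteRule \.(gif|jpe?g|png)$ /nohotlink.png [R=302,L]
As the answer says, anyone can fake or strip the Referer header, so treat this as a deterrent rather than security.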

Related

Archiving an old PHP website: will any webhost let me totally disable query string support?

I want to archive an old website which was built with PHP. Its URLs are full of .phps and query strings.
I don't want anything to actually change from the perspective of the visitor -- the URLs should remain the same. The only actual difference is that it will no longer be interactive or dynamic.
I ran wget --recursive to spider the site and grab all the static content. So now I have thousands of files such as page.php?param1=a&param2=b. I want to serve them up as they were before, so that means they'll mostly have Content-Type: text/html, and the webserver needs to treat ? and & in the URL as literal ? and & in the files it looks up on disk -- in other words it needs to not support query strings.
And ideally I'd like to host it for free.
My first thought was Netlify, but deployment on Netlify fails if any files have ? in their filename. I'm also concerned that I may not be able to tell it that most of these files are to be served as text/html (and one as application/rss+xml) even though there's no clue about that in their filenames.
I then considered https://surge.sh/, but hit exactly the same problems.
I then tried AWS S3. It's not free but it's pretty close. I got further here: I was able to attach metadata to the files I was uploading so each would have the correct content type, and it doesn't mind the files having ? and & in their filenames. However, its webserver interprets ?... as a query string, and it looks up and serves the file without that suffix. I can't find any way to disable query strings.
Did I miss anything -- is there a way to make any of the above hosts act the way I want them to?
Is there another host which will fit the bill?
If all else fails, I'll find a way to transform all the filenames and all the links between the files. I found how to get wget to transform ? to #, which may be good enough. It would be a shame to go this route, however, since then the URLs are all changing.
I found a solution with Netlify.
I added the wget options --adjust-extension and --restrict-file-names=windows.
The --adjust-extension part adds .html at the end of filenames which were served as HTML but didn't already have that extension, so now we have for example index.php.html. This was the simplest way to get Netlify to serve these files as HTML. It may be possible to skip this and manually specify the content types of these files.
The --restrict-file-names=windows alters filenames in a few ways, the most important of which is that it replaces ? with #. This is needed since Netlify doesn't let us deploy files with ? in the name. It's a bit of a hack; this is not really what this option is meant for.
This gives static files with names like myfile.php#param1=value1&param2=value2.html and myfile.php.html.
I did some cleanup. For example, I needed to adjust a few link and resource paths to be absolute rather than relative due to how Netlify manages presence or lack of trailing slashes.
I wrote a _redirects file to define URL rewriting rules. As the Netlify redirect options documentation shows, we can test for specific query parameters and capture their values. We can use those values in the destinations, and we can specify a 200 code, which makes Netlify handle it as a rewrite rather than a redirection (i.e. the visitor still sees the original URL). An exclamation mark is needed after the 200 code if a "query-string-less" version (such as mypage.php.html) exists, to tell Netlify we are intentionally shadowing.
/mypage.php param1=:param1 param2=:param2 /mypage.php#param1=:param1&param2=:param2.html 200!
/mypage.php param1=:param1 /mypage.php#param1=:param1.html 200!
/mypage.php param2=:param2 /mypage.php#param2=:param2.html 200!
If not all query parameter combinations are actually used in the dumped files, not all of the redirect lines need to be included of course.
There's no need for a final /mypage.php /mypage.php.html 200 line, since Netlify automatically looks for a file with a .html extension added to the requested URL and serves it if found.
I wrote a _headers file to set the content type of my RSS file:
/rss.php
  Content-Type: application/rss+xml
I hope this helps somebody.

Best robots.txt practice to hide secret folder

I have a secret folder in my website and I don't want search engines to know about it. I didn't put the folder name in the Disallow rule of robots.txt because writing this folder name in robots.txt means telling my visitors about that secret folder.
My question is, will search engines be able to know about this folder / crawl it even if I don't have any links published to this folder?
The only truly reliable way to hide a directory from everyone is to put it behind a password. If you absolutely cannot put it behind a password, one band-aid solution is to name the folder something like:
http://example.com/secret-aic7bsufbi2jbqnduq2g7yf/
and then block just the first part of the name, like this:
User-agent: *
Disallow: /secret-
This effectively blocks the directory without revealing its full name: it will keep any crawler that obeys robots.txt out of the directory, while still not handing hostile crawlers the full path. Just don't mistake this for actual security. It will keep the major search engines out, but there are no guarantees beyond that. Again, the only truly reliable way to keep everyone out of a secret directory is to put the directory behind a password.
Yes, they can crawl it.
Your folder is not "secret" at all. Do a quick search for a curl command line that downloads a whole site, then try it on your site to convince yourself that this security approach is invalid.
Here is a good example: download all folders, subfolders and files using wget
You can use .htaccess to prevent agents from requesting the directory listing, and this will probably protect you fairly well as long as you don't give the folder an obvious name like "site", but I'd test it.
See: deny direct access to a folder and file by htaccess
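As a rough illustration of that .htaccess approach (the details below are assumptions on my part, not taken from the linked answers): this turns off the automatic listing so the folder contents can't be enumerated, while still letting the randomly named files be fetched by direct URL.
# .htaccess inside the "secret" folder - a sketch
# (requires AllowOverride Options or All for this directory)
# Don't generate a directory listing when the folder itself is requested
Options -Indexes
# If you instead wanted to block all direct requests into the folder
# (Apache 2.4 syntax), you could use:
# Require all denied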

Need to ignore .htaccess when it contains wrong content

I will explain what I want to do.
We have an eshop which generates an .htaccess file (the server is Unix). Sometimes it crashes while generating the .htaccess file, and then the whole site (frontend, admin, cron scripts => everything) returns a 500 server error because of the unfinished content in the .htaccess file.
And here is my question. I created a script which regenerates the .htaccess file, but this script can't live anywhere in the root (or its subdirectories), because it would also return a 500. The subdomains are also in the root, in the subdirectory /_sub.
Is there any chance to put it somewhere where our customer will be able to execute it himself?
I can't use another domain, so I am asking: is there any way to ignore the .htaccess file when it contains wrong content?
Thanks a lot.
EDITED:
I know of one possible solution, but I think it would take much more time to code it all (or maybe not):
on another domain, write a script which connects over FTP to the eshop domain
it deletes the .htaccess file and recreates it with basic content
then it triggers the cron job which regenerates the whole .htaccess file
This should work, I think.
The question is how much of a layman your customer is. The options: 1) you run this from cron, 2) you expose it as a CGI script.
But AFAIK the best approach would be for a checking script to run from cron and reconstruct only the broken .htaccess files.
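One way to make the CGI option workable, sketched below, is to serve the repair script from a directory outside the document root via an Alias in the main server configuration; the broken .htaccess in the docroot is never consulted for requests mapped there. The paths, the alias name, and the Apache 2.4 access syntax are all assumptions, not something from the original question.
# In the main server config / vhost (not in an .htaccess file)
Alias /fix-htaccess/ /var/www/tools/fix-htaccess/
<Directory "/var/www/tools/fix-htaccess">
    # Never read .htaccess files for this directory
    AllowOverride None
    # Apache 2.4 access control; 2.2 would use Order/Allow instead
    Require all granted
    # Let the repair script run as a CGI (mod_cgi/mod_cgid must be loaded)
    Options +ExecCGI
    AddHandler cgi-script .cgi
</Directory>
The customer could then trigger the repair by visiting something like /fix-htaccess/rebuild.cgi (a hypothetical script name) even while the rest of the site is returning 500s.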

Can someone look into a web server's folders?

More a web server security question.
Is it possible for someone to "probe" into a folder on a web server, even if an index file is in place?
I assume they can't, but if I want to store .pdf applications under random names (93fe3509edif094.pdf), I want to make sure there's no way to list all the PDFs in the folder.
Thank you.
Just disable the directory listing in your web server.
Generally, no. Instead of creating an "index" file, you may also turn off Apache's Options Indexes.
Generally speaking, no. Especially if you explicitly turn off the directory listing for that specific directory.
<Directory /path/to/directory>
    Options -Indexes
</Directory>
Source: http://httpd.apache.org/docs/1.3/misc/FAQ.html
However, you should be securing files through some sort of authentication process rather than relying on file names alone. What you propose can be defeated by simply brute-forcing the file name. Also, people can share URLs, others can sniff the traffic and find the URL, etc. Use a better method.
Web servers have a setting that controls whether or not the directory listing can be browsed. Apache's is called Options Indexes:
Indexes
If a URL which maps to a directory is requested, and there is no DirectoryIndex (e.g., index.html) in that directory, then the server will return a formatted listing of the directory.
However, if anyone knows the URL in advance, or can easily guess the filename, they can still load the pdf.
It depends on the server; the server always decides what the client may and may not see. In your case (Apache), see Mitro's answer.

Updating Files on Apache

I'm having trouble with my Apache Web Server. I have a folder (htdocs\images) where I have a number of images already in place. I can browse them and see them on my web server (and access them via HTML). I added a new image in there today, and went to browse to it, and it can't be found. I double and triple checked the path and everything. I even restarted Apache and that didn't seem to help.
I'm really confused as to what's going on here. Anybody have any suggestions?
Thank you.
Edit: I just turned on the ability for the images directory to be listed, browsed to it (http://127.0.0.1/images/), and I was able to see all the previous images in the folder, but not the new one.
Turn directory indexes on for htdocs\images, remove (or move out of the way) any index.* files, and point your browser at http://yoursite/images/
That should give you a full listing of files in that directory. If the file you're looking for isn't there, then Apache is looking at a different directory than you think it is. You'll have to search your httpd.conf for clues -- DocumentRoot, Alias, AliasMatch, Redirect, RedirectMatch, RewriteRule -- there are probably dozens of apache directives that could be causing the web server to get its documents from somewhere other than where you think it's looking.
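For reference, turning the listing on for just that directory might look like the snippet below in httpd.conf. The Windows-style path and the Apache 2.4 Require syntax are assumptions; adjust them to your own layout and version.
# httpd.conf - assuming htdocs is the DocumentRoot
<Directory "C:/Apache/htdocs/images">
    # Temporarily allow an auto-generated listing so you can see
    # exactly which files Apache is serving from this directory
    Options +Indexes
    Require all granted
</Directory>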
Make sure the case and spelling are 100% correct.
There is no magic in programming (some may disagree :), so look for silly errors. Wrong server? Wrong case? Wrong extension?
There's a chance it could be due to the cookies stored on your device. I would delete all cookies for the website you're working on before you refresh again.