S3 no trailing slash removes query parameters

I have a website set up using S3 as a static website host. A link such as "xxx.com/play?test=1" gets a 302 redirect to "xxx.com/play/" with the query parameter stripped.
I am trying to find a way to preserve the query string parameters. I cannot change the original link (xxx.com/play?test=1), but it feels like I should be able to make this work either with the redirect rules or with the objects themselves. Is this possible?

Don't know if you found a solution by now... but here it is for future reference.
I'm guessing you have a "play" folder in your bucket and that's why it's redirected.
The solution is to create a "play" object/file (alongside the folder) without the ".html" extension and set its Content-Type metadata to "text/html".

Related

Is it possible to get folder names from a root URL using vb.net

Say I have a root URL https://google.com/Files
I have 2 folders Temp1 and Temp2 under it (which can be accessed using https://google.com/Files/Temp1 and https://google.com/Files/Temp2. NOTE: I do not want to access them in this way)
How can I get the file path dynamically for these 2 folders using just the root URL? The folders might increase as and when required.
I had tried using the GetDirectories method but realized it won't work on URLs.
Any leads will be appreciated, thanks!

Apache rewriting eats one level of escaping (%23)

I want to use fancy URLs for a tag filter on my website. The URLs should look like http://example.com/source/+tag1+tag2. This should filter for all items tagged with tag1 and tag2. I came up with the following rewrite rule for that, which I have saved to the root directory of the site:
RewriteRule ^([^+]+)(\+.+)$ $1?tags=$2 [L]
This works fine for all normal tag names, but it fails for the tag name "c#". I know that the hash character is not sent to the server, so the tag name is url-encoded and the link in the HTML page is like this: ./+c%23. But the target page will only see the "c" in its tags parameter; the "#" and anything after it are gone.
I have enabled Apache's rewrite logging and saw that it already logs the incoming URL request like …/+c#. This made me think that another level of escaping could be required. So I tried with %2523 which actually passed the rewriting successfully and the whole string "c#" turned up in my page.
But then again, when I access the page with its internal URL like ./?tags=c%23, it already works, too. So why is Apache eating up one level of escaping? Is there a hidden rewrite flag I can use to avoid that? Do I need to use public URLs that are double-encoded for my fancy URLs to work? Or will it be too messy and I should instead just rename my tag to "csharp"?

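I think you need the B flag (so use [L,B]). It tells mod_rewrite to escape the back-references again before they are inserted into the substitution.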
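A rough way to see where the level of escaping goes: mod_rewrite matches against the already-unescaped path, so "%23" has turned back into "#" by the time $2 is substituted. The JavaScript sketch below only imitates that order of operations. The rewrite function and its escapeBackrefs switch are hypothetical stand-ins for the rule without and with the B flag, and encodeURIComponent is only an approximation of Apache's own escaping.

```javascript
// Hypothetical model of: RewriteRule ^([^+]+)(\+.+)$ $1?tags=$2
// This is NOT Apache code; it only imitates the order of operations.
function rewrite(requestPath, escapeBackrefs) {
  // Apache unescapes the path before the pattern is matched.
  const unescaped = decodeURIComponent(requestPath);
  const match = /^([^+]+)(\+.+)$/.exec(unescaped);
  if (match === null) {
    return requestPath; // the rule does not apply
  }
  // With the B flag the back-reference is escaped again before substitution.
  const tags = escapeBackrefs ? encodeURIComponent(match[2]) : match[2];
  return match[1] + "?tags=" + tags;
}

console.log(rewrite("source/+c%23", false)); // source/?tags=+c#
console.log(rewrite("source/+c%23", true));  // source/?tags=%2Bc%23
```

In the first result the raw "#" ends the query string, which fits the symptom of the page only seeing "c".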

Hash character in URLs (accessing and redirecting in Apache)

It looks as though this question has been asked in part by some others, but I can't find the answer I'm looking for specifically, so I thought I'd pose my particular scenario in case anyone is able to help.
We have an old website (developed externally by a third party) that is due to be retired and replaced by a new site designed in house. For reasons best known to themselves, the developers of the old site used the hash character as part of the URL for the old site (www.mysite.com/#/my-content-stuff). To assist with the transition and help with SEO I need to set up 301 redirects for the top performing URLs from the old site. As I'm now discovering however, I'm not able to set up a simple redirect in the .htaccess file as I believe it takes the hash character to be a comment and ignores the remainder of the line. I've tried escape characters, using %23 instead, wildcard matching, nothing seems to work.
As a workaround, I wondered about simply creating dummy files with the same paths and URLs as the old site had, then simply creating HTML redirects within them to drive traffic to the correct new pages, but it looks as though the server is doing something similar regarding the hash character in the URL, and ignoring anything after it. So, if I create a sub-folder on my new server called '#' and create a file in there called 'test.html', I expected to be able to just go to 'www.myNEWsite.com/#/test.html', but it just takes me to the default root file of my site.
Please can anyone shed any light on how I might get around this? I must admit I'm not that clued up on Apache so I'm having to learn a lot as I go.
Many thanks in advance for any pointers or info anyone can provide.
Cheers,
Rich

A hash character in the URL specifies the anchor, and it's not even sent to your webserver. A redirect is impossible on the server side, and the old developer probably did it using JavaScript. Implement fallback URLs without the hash instead, and have a global JavaScript script detect these URLs and redirect automatically.
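To make that concrete, here is a minimal sketch using the standard URL class (available in browsers and Node.js). The helper name splitForRequest is made up for illustration; it simply separates the part of an address that goes into the HTTP request from the fragment the browser keeps to itself.

```javascript
// Split an address into what is sent to the server and what stays in the browser.
function splitForRequest(address) {
  const url = new URL(address);
  return {
    sentToServer: url.pathname + url.search, // request target
    keptByBrowser: url.hash,                 // fragment, never sent
  };
}

console.log(splitForRequest("http://www.mysite.com/#/my-content-stuff"));
// { sentToServer: '/', keptByBrowser: '#/my-content-stuff' }
```

For every old-style URL the server only ever sees "/", which is why no .htaccess rule can tell them apart.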

Hash fragments cannot be read by the server. They are regarded as locations within the document and are therefore never sent to the server; the client is the only one that sees them. The best you could do is use a "meta refresh" tag or, alternatively, use JavaScript to inspect the URL and, if it's one that requires a 301 redirect, use "window.location" to move the user to a full URL where mod_rewrite or a PHP page can issue a 301 header.
However, neither approach is SEO friendly, and they only really solve the issue for users who click on an old link from an external site:
<!-- Put in head tag so the page does not wait to load the content-->
<script type="text/javascript">
if (window.location.hash !== "") {
  // Strip the leading "#" and an optional "/" to get the old path
  var h = window.location.hash.match(/#\/?(.*)/i)[1];
  switch (h) {
    case "something_old":
      window.location = "/something_new.html";
      break;
    case "something_also_old":
      window.location = "/something_also_new.html";
      break;
  }
}
</script>

How to Apache Rewrite URL without redirect?

Is it possible to replace a url name like http://mysite.com/sub/ with http://sub.mysite.com using htaccess?
I don't want a redirect; I just want to map a sub-directory address to a sub-domain address. So when a person types an address like http://sub.mysite.com/image.jpg, this address remains in the browser but the content is read from http://mysite.com/sub/image.jpg

Yes, it is possible, but you need root access to the server to start with. You will also need to make some DNS record changes, so make sure you have that access too before starting.
I have used both methods previously and they both work. However, I found that using folders was the winner at the end of the day for our usage: it simplified things significantly for us, and we didn't have to worry about changing links in scripts, e.g. from http://www.my-site.com/images to http://images.my-site.com, depending on the code structure being used.
Instead of typing these long instructions out I am going to give you 2 references that have slightly different approaches depending on if you have a physical folder to use or if it is a variable in the URL. They say it probably as well as I can anyway ;-)
Physical folder method :: URL variable method
I hope this helps you

Difficulties with .htaccess and Blocking Specific File Extensions

I have a rather complicated situation where I run a personal blog where every Friday and Sunday, I will post up music on the blog by uploading the mp3s into a folder, where a Flash mp3 player accesses it and plays it for the world.
Recently, a website called Dizzler, which acts like a spider for mp3 files (like the ones I host on my server!), started letting people play them via its own proprietary player. Now, I normally wouldn't be against other people using my server for their own gain, but this recently got out of hand. In the last week of December, they managed to rack up 100k hits on one song and used up 6GB of bandwidth.
In that last week of December, I edited my .htaccess file to block outside access to the mp3s on my server without taking away my own access to them (so "deny all" isn't an option!), and I used this code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^(www\.)?mydomain.com [NC]
RewriteRule \.(mp3)$ - [NC,F]
Options -Indexes
It worked pretty well with one exception - it broke every Wordpress installation on my server. What I mean is that outside of the index page, if you clicked on an entry in Wordpress, it wouldn't be able to find it. My host's solution was to add "RewriteEngine on" to every .htaccess file for every installation and in the root of the web server root.
That was a great fix and all the pages work again - but it is no longer blocking my mp3 files in that folder.
What can I do?
PS. For clarification, the code above is in an .htaccess file in the folder containing the mp3s. Hope that helps!

Huge thanks to Vinko Vrsalovic for all the help, definitely helped point me in the right direction, currently using the following code:
SetEnvIfNoCase Referer www\.dizzler\.com bad_referer
SetEnvIfNoCase Referer ".*(dizzler|beemp3|skreemr).*" BlockedReferer
SetEnvIfNoCase REMOTE_ADDR ".*(220\.181\.38\.82|202\.108\.23\.172|66\.232\.150\.219).*" BlockedAddress
# deny any matches from above and send a 403 denied
<FilesMatch "\.mp3$">
order deny,allow
deny from env=bad_referer
deny from env=BlockedReferer
deny from env=BlockedAddress
</FilesMatch>
Testing it out tonight, will report back tomorrow if it works!

I'm posting this as another answer instead of adding this to my other post because it approaches the problem from a different angle. Here I am assuming that all your mp3s are in the same folder.
The problem you are facing is due to sloppy coding on the part of whoever made the media-player thing that wordpress uses. What happens is that the player runs on the visiting user's machine, and actually downloads the mp3 and plays it locally. The problem arises because the player does not provide any useful headers at all: the useragent is that of your browser, the referrer is blank, etc. As such, it is completely impossible to tell if the request is coming from the player, or from a browser that clicked your link in an audio search engine. Really, the only way to protect your mp3s from being indexed is to change the link as often as possible.
Which is precisely the plan. In a nutshell, here is what we are going to do:
change the path to your mp3s. This stays SECRET.
create a script to proxy for the mp3s, which requires a valid key which changes every hour
change all your uses of the mp3 player to use the mp3 proxy script but with a placeholder key
create a script to proxy for your webserver, which replaces the key placeholder with the actual key
use .htaccess to rewrite all requests to your server to use the webserver proxy script.
The upshot of all of this is that your user experience will not change, but if a crawler crawls your links, they will only be valid until the end of the current hour, at which point requests to that URL will result in a snippy message (or even an mp3 of you asking them to please not download your stuff).
Ready? OK, let's go!
Step 1:
First things first, make sure you rename your mp3s folder! This will break all existing links (and failing to do this will mean all the links already crawled will remain valid). Secondly, create a robots.txt file to stop Google and other search engines from indexing your mp3s folder.
Now, create a file in your root directory called mp3serve.php with the following contents:
<?php
/* This script checks 'key', and if it's valid, serves the mp3
* A valid key is defined as the md5 of the current date in
* yyyy-mm-dd-hh format concatenated with the string
* "Hello there :)"
*
* The key can be anything so long as we are consistent in this
* and the viewer proxy thing we're going to make.
*/
// edit this variable to reflect your server
$music_folder = "/new/path/to/mp3s/";
// get inputs of 'file' and 'key'
// 'file' should be the filename of the mp3 WITHOUT the extension
$file = $_GET['file'];
$key = $_GET['key'];
// get todays date
$date = date("Y-m-d-H");
// calculate the valid key
$valid = md5($date . "Hello there :)"); // "." concatenates strings in PHP; "+" is numeric addition
if ($key == $valid)
{
// if the key is valid, get the song in the path:
print(file_get_contents("$music_folder/$file.mp3"));
}
else
{
// if the key is invalid, print an admonishing message:
print("Please don't try to download my songs, poopface.");
}
?>
What this does is it takes the filename of an MP3 and a key of some kind, and serves the file contents if the key is valid. Note that this script:
makes no checks at all that $file points to what you expect it to, other than the fact that it tries to make sure it will only ever return mp3 files.
does not return valid headers for mp3 files - they'll render as text in a browser. This is easy to fix but the correct header eludes me for the moment... and anyway the wordpress mp3 player doesn't care, so it's all good :)
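As a rough illustration of the key check on its own (not a replacement for the PHP above), here is the same idea sketched in JavaScript for Node.js. The function names hourStamp, hourlyKey and isValidKey are made up for this sketch, and it uses UTC for the hour stamp, whereas PHP's date() uses the server's configured timezone.

```javascript
const crypto = require("crypto");

// Assumption: the same secret string as in mp3serve.php.
const SECRET = "Hello there :)";

// Format a Date as yyyy-mm-dd-hh (UTC in this sketch).
function hourStamp(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
  ].join("-");
}

// md5 of the hour stamp concatenated with the secret.
function hourlyKey(date) {
  return crypto.createHash("md5").update(hourStamp(date) + SECRET).digest("hex");
}

// A key is valid only during the hour it was generated for.
function isValidKey(key, now) {
  return key === hourlyKey(now);
}

const issued = new Date(Date.UTC(2024, 0, 15, 10, 5));
const key = hourlyKey(issued);
console.log(isValidKey(key, new Date(Date.UTC(2024, 0, 15, 10, 59)))); // true
console.log(isValidKey(key, new Date(Date.UTC(2024, 0, 15, 11, 0))));  // false
```

One consequence worth knowing: a page served at 10:59 carries a key that stops working a minute later, so you may want to accept the previous hour's key as well.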
Step 2:
Now for the slightly tricky part: we have to rewrite the links dynamically. The easiest way to do this is to write a "local-proxy" thing, which really is a lot easier than it sounds. What we will do is write a script that gets what your page would have outputted and corrects the mp3 links. In my example we will edit all of your articles with mp3s in them, but if you want to get fancy this is not completely necessary.
First, edit all of your articles with mp3-players in them. You could automate this, but unless WP has a "find/replace in all articles" function I would advise against it for the sole reason that you might screw up and destroy your articles. In any case, edit them and replace the mp3 links in the players from
/path/to/mp3s/<filename>.mp3
to
/mp3serve.php?file=<filename>&key=[{mp3_file_key}]
Now, create another php script in your root directory called proxyviewer.php with the following contents:
<?php
/*
* The purpose of this file is to act as a proxy in which we can dynamically
* rewrite the page contents. Specifically, we want to get the page that the
* user WOULD have seen, and replace all instances of our key placeholder
* with the actual correct key
*/
// get the requested path
$request = $_GET['req'];
// get what the source output WOULD have been
// NOTE: depending on your server's config, you -might- have to
// replace 'localhost' with your actual site-name. This will
// however increase page-load times. If localhost doesn't work
// ask your host how to access your site locally. To clarify,
// maybe show him this file.
$source = file_get_contents("http://localhost/$request");
// The reason we need to pass the request through apache (i.e. use the whole
// "http://localhost/" thing) is because we need the PHP to be rendered, and
// I can't think of another way to do that using the original request uri
// calculate the correct key
$key = md5(date("Y-m-d-H") . "Hello there :)");
// replace all instances of "[{mp3_file_key}]" with the key
$output = str_replace("[{mp3_file_key}]",$key,$source);
//output the source
print($output);
?>
Step 3:
Now for the last part: set up your .htaccess file to redirect all requests from
http://yoursite/some/request/here
to
http://yoursite/proxyviewer.php?req=some/request/here
Unfortunately I'm really not good with .htaccess files so I won't be able to give you the exact code, but I imagine it shouldn't be too hard to do.
Congrats, you're done!
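To recap the placeholder substitution from Step 2 in isolation, here is a minimal JavaScript sketch. The function name injectKey, the file name "song" and the key "abc123" are made up for illustration; in the real setup the key would be the hourly md5 value.

```javascript
// The placeholder that the edited articles contain.
const PLACEHOLDER = "[{mp3_file_key}]";

// Replace every occurrence of the placeholder with the current key.
function injectKey(html, key) {
  // split/join treats the placeholder as a literal string, not as a regex.
  return html.split(PLACEHOLDER).join(key);
}

const page = '<a href="/mp3serve.php?file=song&key=[{mp3_file_key}]">song</a>';
const served = injectKey(page, "abc123");
console.log(served);
// <a href="/mp3serve.php?file=song&key=abc123">song</a>
```

The stored article never contains a working key, so a crawler that reads the raw link only ever gets one that expires.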
Disclaimer:
Please note that the code in here is not production-level code. First of all, I haven't tested it at all - although unless there's a typo somewhere they should all work, I would advise you to look through them carefully before going live with them. I have been fairly careful not to allow any Bad Things to happen, but it doesn't do any serious checking, and it's the wee hours of the morning here so I may have overlooked something.

FilesMatch is the directive you need:
<FilesMatch "\.mp3$">
# Deny everyone first, then let the allowed host back in
Order Deny,Allow
Deny from all
# Or the address of your player
Allow from localhost
</FilesMatch>

I think my other answer is much better, but this is still worth considering.
Reading through some of the answers, I am struck by another idea: have your page log the IP addresses of all visitors to your site within the last two (or however many) hours. Then create a job that runs every 2 seconds or so and rewrites your .htaccess file to allow access to mp3 files only from the IP addresses in the log.
That way, only those users who have been served a page from your website in the last two hours will have access to your music. The vast majority of people finding your mp3s through audio search engines will not be among them.
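Here is a minimal sketch of what such a job could compute, written in JavaScript. Everything in it is an assumption for illustration: the visitor log is a plain in-memory Map from IP address to last page view, the function names recentIps and buildHtaccess are made up, and the generated block uses the same Order/Deny/Allow directives as the other answers. Actually writing the file and scheduling the job are left out.

```javascript
// Hypothetical visitor log: IP address -> time of the last page view (ms since epoch).
const WINDOW_MS = 2 * 60 * 60 * 1000; // two hours

// Return the IPs that viewed a page within the window, in a stable order.
function recentIps(lastSeen, now) {
  return [...lastSeen.entries()]
    .filter(([, seenAt]) => now - seenAt <= WINDOW_MS)
    .map(([ip]) => ip)
    .sort();
}

// Build the text of an .htaccess block that only admits the recent visitors.
function buildHtaccess(ips) {
  const lines = ['<FilesMatch "\\.mp3$">', "Order Deny,Allow", "Deny from all"];
  for (const ip of ips) {
    lines.push("Allow from " + ip);
  }
  lines.push("</FilesMatch>");
  return lines.join("\n");
}

const now = Date.UTC(2024, 0, 15, 12, 0);
const log = new Map([
  ["203.0.113.7", now - 30 * 60 * 1000],      // 30 minutes ago
  ["198.51.100.4", now - 3 * 60 * 60 * 1000], // 3 hours ago, too old
]);
console.log(buildHtaccess(recentIps(log, now)));
```

With the sample log above, only 203.0.113.7 ends up with an "Allow from" line.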
