How to use spaces in the URL path in mod_alias’ Redirect directive?

Following on from an earlier question, I've also got problems with file paths that contain spaces. I was told to simply enclose the full path in quotes, and that should work, but Webmaster Tools keeps listing this page in its error log as not found:
redirect 301 /"News/Press Releases/Press_Release_Update_Oct_2006.pdf" /index.html
still gives me a 404.
Thanks Martin

Have you tried putting the quote before the /?
redirect 301 "/News/Press Releases/Press_Release_Update_Oct_2006.pdf" /index.html
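For what it's worth, here is a sketch of two ways to write this. mod_alias compares the rule against the URL path after percent-decoding, so a %20 in the request shows up as a plain space. The first directive is the quoted form from this answer. The RedirectMatch variant is only my assumption of an alternative, with the space written as \s so that no quoting is needed. Use one or the other, not both.

```apache
# Quoted form: the opening quote goes before the leading slash.
Redirect 301 "/News/Press Releases/Press_Release_Update_Oct_2006.pdf" /index.html

# Hypothetical alternative: a regex match where \s stands for the space.
RedirectMatch 301 ^/News/Press\sReleases/Press_Release_Update_Oct_2006\.pdf$ /index.html
```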

Related

Redirect strange URLs with two periods in them

I had a static HTML website that I recently converted over to Drupal. I have been monitoring my site's 404 errors in Webmaster Tools and the Drupal reports, and have noticed that Google has indexed strange URLs. My guess is that they came from relative links that were created improperly on the older static HTML site.
Here is an example:
www.example.com/items../item-page.html
The actual page is:
www.example.com/items/item-page.html
The new Drupal site doesn't even have .html extensions. I am using the URL redirect and path auto modules and have redirects set up for all of the old URLs to make sure they are 301'd to the new URL structure (e.g. www.example.com/items/item-page.html would be 301'd to www.example.com/items/item-page).
I have access to the server, so I am doing my redirects in the Apache httpd.conf file instead of .htaccess. I tried the following code to redirect ../ to / but I am not having any luck:
RewriteRule ^\.\./(.*) /$1 [R=301,NC,L]
This rule doesn't do anything when I go to a URL with the ../ in it. Is there a RewriteRule that can match ../ and remove it from any URL?
NOTE: I have other redirects in the Apache httpd.conf that are working fine, such as:
RewriteRule ^items/pdf/(.*)$ /sites/default/files/documents/items/$1 [R=301,NC,L]
So I don't think it's my server configuration.
EDIT:
I noticed that the rewrite rule above for rewriting the pdf directory works even with the .. in the URL. Example:
http://www.example.com/items../pdf/somedocument.pdf
redirects to
http://www.example.com/sites/default/files/documents/items/somedocument.pdf
So it looks like the .. is completely ignored in rewrite rules, which is why I can't get anything to work. Does anyone know a way around that?
I believe you need to escape your forward slash.
I believe RewriteRule matches against the HTTP request URI, not the page link.
Consequently, I believe you will need to remove the caret (^) to find a match, and then check your legitimate URLs to see whether the unanchored pattern would have a negative impact elsewhere.
You can use this
RewriteRule ^/items\.\.(.+)$ /items/$1 [L,R]
This will redirect /items..foobar to /items/foobar
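For the literal example in the question (/items../item-page.html), the rule above would leave a double slash, since $1 captures "/item-page.html". A more general pattern might look like the sketch below. It assumes the stray ".." always sits directly before a slash. The optional leading slash (^/?) is my own addition so the same rule can be tried in httpd.conf, where the matched path starts with a slash, or in .htaccess, where it does not. It is untested against the asker's server, where ".." reportedly never reached the rules.

```apache
RewriteEngine On

# Hypothetical sketch: /items../item-page.html -> /items/item-page.html
# $1 = everything before "../", $2 = everything after it.
RewriteRule ^/?(.+)\.\./(.*)$ /$1/$2 [R=301,L]
```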
I was not able to fix the issue using rewrite rules in Apache, because the rewrite rules did not find ".." in the URL for unknown reasons.
My solution was to create a custom Drupal module that checks whether ".." is in the URL. If the ".." string is found, the module redirects to the URL without ".." in it using built-in Drupal functions. Here is the code I used in my module.
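function doubledot_fix_init() {
  // Look up the path alias of the page being requested.
  $destination = drupal_get_destination();
  $alias = drupal_get_path_alias($destination['destination']);
  // Strip every ".." and count how many were removed.
  $fixpath = str_replace("..", "", $alias, $count);
  if ($count > 0) {
    // Permanently redirect to the cleaned path.
    drupal_goto($fixpath, array(), 301);
  }
}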
I do not see any reason this fix would break anything, because ".." should never really be found in any URL. If anyone can think of a situation in which this fix could cause an issue, or if you know of a better solution, please let me know.

Remove and block unwanted postback string

Google's webmaster page reports duplicate content due to the following:
Take this dynamic search page: example.com/armin-music-page-1
Google has indexed it with a postback string appended after "page-1", as shown in the examples below:
example.com/armin-music-page-1$dneix
example.com/armin-music-online-page-1&q=sa=x&ei=-a
example.com/music-dance-club-mix-page-1%balbla
example.com/armin-search-page-1#einx
and many other random postback strings.
My question: how do I remove, or redirect to a 404, anything that is generated after "page-1" using Apache mod_rewrite in .htaccess, so that Google finds the clean URL only?
Thank you in advance!
You can get rid of the stuff after page-1 by redirecting to the URL where that's removed:
RewriteRule ^(.+-page-1)(.+)$ /$1? [L,R=301]
(rule needs to be near the top of the htaccess file)
Or if you want to send to 404:
RewriteRule ^(.+-page-1)(.+)$ - [L,R=404]
But one thing you can't do is deal with requests that look like this:
example.com/armin-search-page-1#einx
because the #einx part of the URL is never sent to the server, so there's no way for the server to match against it. All Apache and mod_rewrite see is /armin-search-page-1.
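One caveat with the patterns above: (.+-page-1)(.+) also matches page-10, page-12 and so on, and would redirect them to page-1. A sketch of a stricter variant follows. It assumes that clean URLs always end in the page number, so that anything after the digits is junk.

```apache
RewriteEngine On

# Hypothetical sketch: capture the full page number, then require at least
# one non-digit character of trailing junk before redirecting.
# The trailing "?" on the target drops any query string.
RewriteRule ^(.+-page-\d+)[^0-9].*$ /$1? [R=301,L]
```

On Apache 2.4 and later, the QSD flag can be used in place of the trailing "?".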

How to remove trailing %20target= from URL using htaccess

Recently I have been using Stack Overflow to fix many of my server errors. Now I have run into a "404 Not Found" error in the following case.
Webmaster Central shows several hundred URLs with the trailing string "%20target=", which I want to remove so that the URLs resolve.
Here is a sample. I want to change the following URL from
www.example.com/one/two/three/four/five/For-Sale-26264.html%20target=
to
www.example.com/one/two/three/four/five/For-Sale-26264.html
How can I achieve this using htaccess?
The %20 is actually a URL-encoded space, so you can use a lazy group (.+?) to match as few characters as possible before the space, match the space itself with \s, and then match target=. The grouped match is available as the backreference $1 for the rewrite.
RewriteEngine On
RewriteBase /
RewriteRule (.+?)\starget= $1 [L]
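Note that the rule above rewrites internally, so the broken URL keeps working but is never replaced in the index. If the goal is for crawlers to pick up the clean URL instead, an external redirect may be preferable. A minimal sketch, assuming the .htaccess file sits in the web root:

```apache
RewriteEngine On

# Hypothetical sketch: answer with a 301 pointing at the clean URL.
# \s matches the decoded %20, and the pattern is anchored at both ends
# so only a trailing " target=" is stripped.
RewriteRule ^(.+?)\starget=$ /$1 [R=301,L]
```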

301 redirect special character issue

URLs with special characters such as é and ü do not redirect correctly.
For example : Redirect 301 /de/pages/schuhe_mit_schmaler_füßbreite http://www.mydomain.com/de/schuhe-für-schmale-füße/l/
I already tried using the suggestions in this stack post, but it does not work for me.
Working rule:
Redirect 301 /de/pages/duo_latest http://www.mydomain.com/de/entdecken-sie-duo/neues/
Non-working rule:
Redirect 301 /fr/pages/duo_latest http://www.mydomain.com/fr/découvrez-duo/nouvelles/
The non-working rule contains the special character é.
Make sure that the text editor you are using to save to your .htaccess file supports UTF-8 encoding. If you are using Notepad, configure the settings so that it is not saving as ANSI.
If you cannot get it to save to UTF-8, create a completely new .htaccess file that is saved in UTF-8 and replace the old file with the new one.
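Another thing that may be worth trying is to sidestep the file encoding question by percent-encoding the target. The sketch below rewrites the non-working rule with "é" as its UTF-8 byte sequence (%C3%A9), so the file contains only ASCII. It assumes the destination site expects UTF-8 URLs, and it only covers special characters in the target, not in the path being matched.

```apache
# Hypothetical sketch: same redirect, with "é" written as %C3%A9.
Redirect 301 /fr/pages/duo_latest http://www.mydomain.com/fr/d%C3%A9couvrez-duo/nouvelles/
```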
This probably works fine:
RewriteRule ^türen/(.*) http://google.de?$1 [L]
What is your problem? Can you post some more details?

How can I redirect people accessing my files as directories?

I have the following situation:
On my webserver I have an instance of websvn running, where specific repositories and revisions can be accessed by a URL like
http://www.myhost.com/listing.php?repname=repository1&path=%2Ftrunk%2Fbackend
Somehow, out there in the wild, a wrong URL is being used to access this
http://www.myhost.com/listing.php/?repname=repository1&path=%2Ftrunk%2Fbackend
(Notice the slash after listing.php)
Now, although the URL works and websvn still shows the webpage, images and stylesheets do not get loaded correctly, since they are referenced with relative paths.
I tried to add an .htaccess file to the webroot to redirect people accessing the file as a directory to the correct URL.
I have tried multiple variations and ended up with this file:
RewriteEngine on
RewriteRule ^/listing.php/ listing.php [R=301,QSA]
But, since I am writing here, you already guessed it: It doesn't work.
I also tried
RewriteEngine on
RewriteRule ^/listing.php(.*) listing.php$1 [R=301,QSA]
What am I doing wrong?
Perhaps among other things, a RewriteRule within .htaccess that starts with “^/” will never match anything at all. (Examples that include a leading slash are for the global configuration file.) Remove the leading forward slash and see if that helps.
Also, I recommend changing the 301 to a 307 until you get it working. Otherwise, your browser will cache the 301 result, redirecting on subsequent references without consulting your server at all and likely giving you very confusing results.
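Putting both suggestions together, a minimal sketch for an .htaccess file in the web root might look like this. The absolute target (/listing.php) is my assumption, chosen to avoid relying on RewriteBase. QSA is left out because the query string is passed through unchanged when the target has none of its own.

```apache
RewriteEngine On

# Hypothetical sketch: no leading slash in the pattern, the dot escaped,
# and an absolute target path.
# 307 while testing; switch to 301 once it behaves as expected.
RewriteRule ^listing\.php/$ /listing.php [R=307,L]
```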