Apache 301 Redirect and preserving post data - apache

I have implemented SEO URLs using Apache 301 redirects to a 'redirect.cfm' in the root of the website which handles all URL building and content delivering.
Post data is lost during a 301 redirect.
I haven't found a solution so far. I have tried excluding the POST method from the rewrites; worst case, we could keep using the old-style URLs for POST requests.
Is there something that can be done?
Thanks

Using a 307 should be exactly what you want
307 Temporary Redirect (since HTTP/1.1)
In this case, the request should be repeated with another URI; however, future requests
should still use the original URI. In contrast to how 302 was historically implemented,
the request method is not allowed to be changed when reissuing the original request. For
instance, a POST request should be repeated using another POST request.
- Wikipedia
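As a minimal sketch of what that could look like with mod_rewrite, assuming the rule lives in a .htaccess file in the document root and that old-page.cfm and /new-page are placeholder names (neither appears in the question):

```apache
RewriteEngine On

# Hypothetical mapping: old-style URL -> SEO URL.
# R=307 tells the client to repeat the request with the same method and body,
# so a POST stays a POST. R=308 is the permanent counterpart of 307.
RewriteRule ^old-page\.cfm$ /new-page [R=307,L]
```

Whether 307 or 308 fits better depends on whether the redirect is meant to be temporary or permanent.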

Update circa 2021
The original answer here was written before the 307 status code worked consistently across browsers. As per Hashbrown's answer, the 307 status code should be used.
Old Answer
POST data is discarded on redirect as a client will perform a GET request to the URL specified by the 301. Period.
The only option is to convert the POST parameters to GET parameters and tack them onto the end of the URL you're redirecting to. This cannot be done in a .htaccess file rewrite.
One option is to catch POST requests to the URL being redirected and pass them to a page that handles the redirect. You'd transpose the parameters in code, then issue the redirect header with the parameters appended to the new URL.
Update: As pointed out in the comments to this answer, if you redirect to another URL with the POST parameters appended as GET parameters, and that URL is also accessed without parameters (or the parameters vary), you should specify a link to the canonical URL for the page.
Say the POST form redirects transposed to the following GET resource:
http://www.example.com/finalpage.php?form_data_1=123&form_data_2=666
You would add this link record to the head section of the page:
<link rel="canonical" href="http://www.example.com/finalpage.php" />
This would ensure all SEO value would be given to http://www.example.com/finalpage.php and avoid possible issues with duplicate content.

Using 301 redirects for general URL rewriting is not the way to go.
This is a performance issue (especially for mobile, but also in general), since it doubles the number of requests for your page.
Think about using a URL rewriting tool like Tuckey's UrlRewriteFilter or Apache mod_rewrite, which rewrite the request internally instead of sending the browser a redirect.
What Ray said is all true, this is just an additional comment on your general approach.
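A rough sketch of the internal-rewrite approach with mod_rewrite, assuming a .htaccess file in the document root next to the question's redirect.cfm; the "path" parameter name is hypothetical:

```apache
RewriteEngine On

# Leave real files and directories alone (this also stops redirect.cfm
# from being rewritten to itself)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Hand everything else to redirect.cfm internally. No 3xx response is sent,
# so the browser makes a single request and a POST body arrives intact.
RewriteRule ^(.*)$ /redirect.cfm?path=$1 [QSA,L]
```

Because no redirect is issued, the address bar keeps the SEO URL and the extra round trip disappears.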

Related

Make the entire subdirectory 410

I have a subdomain which has been indexed in Google. The pages (a WordPress development project) are no longer there, so I want Google to realise that. I figured that a 410 is the way to go, but rather than putting them on individual posts that no longer exist, I was thinking maybe it could be a catch-all for the entire folder.
Would that be possible and would it be a good idea?
In the subdirectory that this subdomain resolves to you could use the following mod_rewrite directive that returns a 410 Gone for all requests:
RewriteEngine On
RewriteRule ^ - [G]
The ^ matches all requests. - indicates no substitution. The G flag results in a 410 Gone response being sent to the client (shorthand for R=410).
The default Apache 410 response will be sent to the client - unless you have defined a custom error document.
Yes, this is a good idea in order to get Google to drop the URLs from its search results in the shortest time possible. You could also consider using Google's URL removal tool.
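If you do want a custom error document, a sketch along these lines may work, assuming the .htaccess file sits in the subdomain's document root and gone.html is a placeholder file name. The extra rule is there because the request for the error page would otherwise match the catch-all and produce a second 410:

```apache
ErrorDocument 410 /gone.html

RewriteEngine On

# Let the error page itself through so serving it does not trigger another 410
RewriteRule ^gone\.html$ - [L]

# Everything else is Gone
RewriteRule ^ - [G]
```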

How to rewrite a URL while keeping POST data?

I'm using Apache and its proxy settings to serve a web page over HTTPS (more detail in my previous question).
In the previous question, I was struggling with why the POST data was disappearing between my browser and my server. Now I know that it was caused by using Apache's RewriteRule. So I tried working around that with proxies, but this resulted in the web page sending out all other requests on the main domain, instead of the sub domain it's at. For example: My main web page is at myUrl.com/sprinklers. This goes through a proxy, which goes to localhost:8091. The main HTML page loads, but ALL other calls it makes, it makes at myUrl.com/any/path/it/needs, while it should be at myUrl.com/sprinklers/any/path/it/needs.
Sadly, I'm stuck in the middle:
Using RewriteRule means that everything works, but I lose the POST data, which I need.
Using proxies means that the POST data works, but also that I get a ton of 404's, because the web page somehow now expects things to be at the root of the domain, instead of the subdomain it's at.
The trailing slash needs to be there, since without it, the same happens as when I use proxies, I get a ton of 404's for all bits and pieces of the web page.
I tried using ProxyHTMLURLMap in all shapes and forms (all found online), but none worked.
TL;DR:
I need to enable two-way traffic between myUrl.com/sprinklers/.* and localhost:port/.*, while also retaining POST data. How do I do that?
As always, ask and you shall find the answer yourself...
It turned out to be a lot simpler than I imagined. Simply telling RewriteRule to use HTTP status code 307 did the trick. It redirects like the other redirect codes, but with a 307 the client must repeat the request with the same method, so the POST data is kept.
For those wondering how to do this in Apache:
RewriteRule ^/sprinklers$ /sprinklers/ [R=307]
That's it, fixed.
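For anyone combining this with the proxy, here is one way the pieces might fit together. It is a sketch only, assuming the directives sit in the server or virtual host configuration (not .htaccess), that mod_proxy and mod_proxy_http are loaded, and that the backend listens on localhost:8091 as in the question:

```apache
RewriteEngine On

# Send the slash-less URL to the slash version; 307 makes the client
# repeat the request with the same method and body
RewriteRule ^/sprinklers$ /sprinklers/ [R=307,L]

# Forward everything under /sprinklers/ to the local backend and
# rewrite Location headers in its responses back to the public path
ProxyPass        /sprinklers/ http://localhost:8091/
ProxyPassReverse /sprinklers/ http://localhost:8091/
```

In a .htaccess file the RewriteRule pattern would need to drop the leading slash (^sprinklers$), since per-directory patterns are matched without it.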

htaccess 301 redirect - how to disable it?

I added a 301 redirect to my website by mistake (because I was doing maintenance). Now lots of people can't get back to my website, because they are still redirected to the other page, even though I removed the redirection (I even deleted the .htaccess file). As far as I can tell from searching around, this is because the 301 redirect is cached in users' browsers, and I wasn't able to find any solution for this. Is there any way to fix this? I can't just lose hundreds of visitors because of something like this.
This page explains what is going on in good detail:
301 Redirects: The Horror That Cannot Be Uncached
Basically, modern browsers cache the redirect response for 301 for some indeterminate amount of time and will not make an updated request to your old web page to refresh it. Users can manually clear the cache and, because it is a cache, data can be purged if the browser needs more space for other data (like other redirects).
This SuperUser question resolves the caching issue from the client's end:
How can I make Chrome stop caching redirects?
One interesting answer is:
//superuser.com/a/660522/178910
In this answer, the user points out that the browser treats http://example.com/ and http://example.com/? as two different URLs. You could go to the "new" site and set up an HTTP 302 redirect pointing back to the original page with a ? on the end, and it should load. If the original page already had a query as part of the URL, you can simply add an & to the end to achieve the same result.
It's not perfect -- it is a different URL after all -- but at least they'll be able to view your old site.
Note that your web application may try to redirect empty queries or invalid queries back to a "clean" page, which you may have to disable to get the intended result.
UPDATE
One other option is to put a redirect from the new site back to the old site (make this a 302 or 307 redirect to avoid the 301 problem you're currently having). From my testing, Chrome will remove the old redirect when it does this. It may throw a "redirect loop" error, but only once. I was unable to reproduce the cached redirect problem at all with the latest version of Firefox. Other browsers' behavior is probably going to be inconsistent.
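A minimal sketch of that reverse redirect, assuming it goes in a .htaccess file on the site the mistaken 301 pointed to, and with old.example.com standing in for the original site's host name:

```apache
RewriteEngine On

# old.example.com is a placeholder for the original site.
# 302 is used deliberately so browsers do not cache this redirect
# the way they cached the 301.
RewriteRule ^ http://old.example.com/ [R=302,L]
```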

POST Requests seen as GET by server

Got a really strange problem here. When sending post requests to my PHP script
$_SERVER['REQUEST_METHOD']
returns "GET" instead of "POST".
It works fine for every other HTTP method,
so this is what I get:
GET -> GET
POST-> GET
PUT -> PUT
DELETE -> DELETE
It only happens on one of my servers, so I'm assuming it's an Apache problem, and I've managed to figure out that it only happens if I add "www" to my URL.
I.e
www.something.com
causes the problem but
something.com
does not
I have tested on different sites on the same server and I get the same thing so I'm assuming it's global config.
Any thoughts?
As the HTTP spec says for response codes 301 and 302:
Note: For historic reasons, a user agent MAY change the request method
from POST to GET for the subsequent request. If this behavior is
undesired, the 307 (Temporary Redirect) status code can be used
instead.
A third (but unlikely) possibility is you're getting a 303 response to the initial URI. The solution is twofold:
Configure the clients which are under your control to POST to the canonical URI so they are not redirected at all.
Configure your server to redirect using 307 in this case instead of 301/302.
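For the second point, a sketch of a www to non-www redirect that keeps the request method, assuming mod_rewrite is what issues the current redirect and that the site is served over plain http (adjust the scheme otherwise):

```apache
RewriteEngine On

# Match any host starting with "www." and capture the rest as %1
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]

# 307 instead of 301/302, so a POST is repeated as a POST
RewriteRule ^ http://%1%{REQUEST_URI} [R=307,L]
```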

Url rewrite without redirect in ASP.NET

We have a CMS system that creates long URLs with many parameters. We would like to change the way they are presented, to make them more friendly.
Since we have many sites already built on this CMS, it's a little difficult to rewrite the CMS to create friendly URLs (although it's a method we're considering, if no alternative is found), so we're looking for a method where, when a user clicks on a long URL, the URL changes into a friendly one in the browser without using Response.Redirect().
In WordPress such a method exists (I'm not sure whether it's done in code or in Apache), and I'm wondering if it could be done in ASP.NET 2.0 too.
Another thing to take into consideration is that the change between the urls has to be done by accessing the DB.
UPDATE: We're using IIS6
If you're using IIS 7, the easiest way to do this is to use the URL Rewrite Module. According to that link, you can:
Define powerful rules to transform
complex URLs into simple and
consistent Web addresses
URL Rewrite allows Web administrators
to easily build powerful rules using
rewrite providers written in .NET,
regular expression pattern matching,
and wildcard mapping to examine
information in both URLs and other
HTTP headers and IIS server variables.
Rules can be written to generate URLs
that can be easier for users to
remember, simple for search engines to
index, and allow URLs to follow a
consistent and canonical host name
format. URL Rewrite further simplifies
the rule creation process with support
for content rewriting, rule templates,
rewrite maps, rule validation, and
import of existing mod_rewrite rules.
Otherwise you will have to use the techniques described by Andrew M or use Response.Redirect. In any case, I'm fairly certain all of these methods result in an HTTP 301 response. I mention this because it's not clear why you don't want to use Response.Redirect. Is this a coding constraint?
Update
Since you're using IIS 6 you'll need to use another method for URL rewriting.
This article from Scott Mitchell describes in detail how to do it.
Implementing URL Rewriting
URL rewriting can be implemented
either with ISAPI filters at the IIS
Web server level, or with either HTTP
modules or HTTP handlers at the
ASP.NET level. This article focuses on
implementing URL rewriting with
ASP.NET, so we won't be delving into
the specifics of implementing URL
rewriting with ISAPI filters. There
are, however, numerous third-party
ISAPI filters available for URL
rewriting, such as:
ISAPI Rewrite
IIS Rewrite
PageXChanger
And many others!
The article goes on to describe how to implement HTTP Modules or Handlers.
Performance
An HTTP 301 redirect response usually contains only a small amount of data (< 1K), so I would be surprised if it was noticeable.
For example, the difference in page load between these URLs isn't noticeable:
"https://stackoverflow.com/q/4144940/119477"
"https://stackoverflow.com/questions/4144940/url-rewrite-without-redirect-in-asp-net"
(I have confirmed using ieHTTPHeaders that http 301 is what is used for the change in URL)
Page Rank
This is what Google's Webmaster Central site has to say about 301:
If you need to change the URL of a
page as it is shown in search engine
results, we recommend that you use a
server-side 301 redirect. This is the
best way to ensure that users and
search engines are directed to the
correct page.
In response to extra comments, I think what you need to do is bite the bullet and modify the CMS to write the new links out into the pages. You've already said that you have normal URL rewriting which can translate the new URLs to old when they're incoming. If you were to also write out the new URLs in your markup then everything should simply work.
From an SEO point of view, if the pages your CMS produces have the old links, then that's what the search engines will see and index. There's nothing much you can do about that, javascript, redirect or otherwise. (although a permanent redirect would get you a little way there).
I also think that what you must have been seeing in WordPress was probably a redirect. Without finding an example I can't be sure, though. The thing to do would be to use Fiddler or another HTTP debugger to see what happens when you follow one of these links.
For perfect SEO, once you've got the new URLs working outbound and inbound, what you'd want to do is decide that your new URLs are the definitive URLs. Make the old URLs redirect to the new URLs, and/or use a canonical link tag back to the new URL from the old one.
I'm not certain what you're saying here, but basically a page the user is already reading contains an old, long, URL, and you'd like it to change to the new, short URL, dynamically on the client side, before the browser requests the page from the server?
The only way I think this could be done would be to use JavaScript to change the URL in response to onclick or document.ready, but it would be pointless. You'd need to know the new short URL for the JavaScript to rewrite to, and if you knew that, why not simply render that URL into the link in the first place?
It sounds more like you want URL routing, as included in ASP.NET 4 and 3.5?
Standard URL rewriting modifies the incoming request object on the server, so the client browser submits the new URL, and the downstream page handlers see the old URL. I believe the routing features extend this concept to the outgoing response too, rewriting old URLs in the response page into new URLs before they're sent to the client.
Scott Gu covers the subject here:
http://weblogs.asp.net/scottgu/archive/2009/10/13/url-routing-with-asp-net-4-web-forms-vs-2010-and-net-4-0-series.aspx
Scott Gu also has an older post on normal URL rewriting outlining several different ways to do it. Perhaps you could extend this concept by hooking into Application_PreSendRequestContent and manually modifying all the href values in the response stream, but I wouldn't fancy it myself.
http://weblogs.asp.net/scottgu/archive/2007/02/26/tip-trick-url-rewriting-with-asp-net.aspx