Google Webmaster Tools reports duplicate content for the following reason.
Take this dynamic search page: example.com/armin-music-page-1
Google has found URLs with a postback string appended after "page-1", as in the examples below:
example.com/armin-music-page-1$dneix
example.com/armin-music-online-page-1&q=sa=x&ei=-a
example.com/music-dance-club-mix-page-1%balbla
example.com/armin-search-page-1#einx
and many other random postback strings.
My question: how do I remove, or redirect to a 404, anything that is generated after "page-1" using Apache mod_rewrite in .htaccess, so that Google finds only the clean URL?
Thank you in advance!
You can get rid of the stuff after page-1 by redirecting to the URL where that's removed:
RewriteRule ^(.+-page-1)(.+)$ /$1? [L,R=301]
(rule needs to be near the top of the htaccess file)
Or if you want to send to 404:
RewriteRule ^(.+-page-1)(.+)$ - [L,R=404]
But one thing you can't do is deal with requests that look like this:
example.com/armin-search-page-1#einx
because the #einx part of the URL (the fragment) is never sent to the server, so there's no way for the server to match against it. All Apache and mod_rewrite see is /armin-search-page-1.
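One caveat with the pattern above: the (.+) after page-1 also matches further digits, so a URL such as /armin-music-page-10 would be redirected to /armin-music-page-1. A minimal sketch that avoids this, assuming the clean URL always ends in "-page-" followed by digits only (an assumption; the question only shows page-1) and that the .htaccess file sits in the document root:

```apache
RewriteEngine On

# Capture everything up to and including "-page-<digits>", then require at
# least one non-digit character of trailing junk before redirecting.
# The trailing "?" on the target drops any query string.
RewriteRule ^(.+-page-[0-9]+)[^0-9].*$ /$1? [L,R=301]
```

With this version /armin-music-page-1$dneix redirects to /armin-music-page-1, while /armin-music-page-10 is left alone. Requests ending in a #fragment are still invisible to the server, as explained above.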
So, I have a fully working CRUD app. The problem is that, because of my file structure, my URLs looked something like https://localhost/myapp/resources/views/add-product.php, which was too ugly. After some research and another post here, I was able to use a .htaccess file to make the links look like https://localhost/myapp/add-product (removing the .php extension and the directories), and I'm also using it to enforce HTTPS. Most of the views work fine, but my Mass Delete view uses POST data from a form on my index page. After restructuring the code now that the redirect works, the Mass Delete view receives an empty array. If I remove the redirect and use the "ugly" URLs, it works fine. Here's what my .htaccess file looks like:
Options +FollowSymLinks +MultiViews
RewriteEngine On
RewriteBase /myapp/
RewriteRule ^resources/views/(.+)\.php$ $1 [L,NC,R=301]
RewriteCond %{DOCUMENT_ROOT}/myapp/resources/views/$1.php -f
RewriteRule ^(.+?)/?$ resources/views/$1.php [END]
RewriteCond %{HTTPS} off
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
I didn't actually write any of it; it's a mash-up of answered questions and research. I tried changing the L flag to a P, following this post: Is it possible to redirect post data?, but that gave me the following error:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator at admin@example.com to inform them of the time this error occurred, and the actions you performed just before this error.
More information about this error may be available in the server error log.
Apache/2.4.52 (Win64) OpenSSL/1.1.1m PHP/8.1.2 Server at localhost Port 443
POST information getting lost in .htaccess redirect
You shouldn't be redirecting the form submission in the first place. Ideally, you should be linking directly to the "pretty" URL in your form action. If you are unable to change the form action in the HTML then include an exception in your .htaccess redirect to exclude this particular URL from being redirected.
Redirecting the form submission is not really helping anyone here. Users and search engines can still see the "ugly" URL (it's in the HTML source) and you are doubling the form submission that hits your server (and doubling the user's bandwidth).
"Redirects" like this are only for when search engines have already indexed the "ugly" URL and/or it is linked to by external third parties that you have no control over. This is in order to preserve SEO, just like when you change any URL structure. All internal "ugly" URLs should have already been converted to the "pretty" version. The "ugly" URLs are then never exposed to users or search engines.
So, using a 307 (temporary) or 308 (permanent) status code to get the browser to preserve the request method across the redirect should not be necessary in the first place. For redirects like this it is common to see an exception for POST requests (because the form submission shouldn't be redirected). Or only target GET requests. For example:
RewriteCond %{REQUEST_METHOD} GET
:
Changing this redirect to a 307/8 is a workaround, not a solution. And if this redirect is for SEO (as it only should be) then this should be a 308 (permanent), not a 307 (temporary).
Aside:
RewriteCond %{HTTPS} off
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
Your HTTP to HTTPS redirect is in the wrong place. This needs to go as the first rule, or make sure you are redirecting to HTTPS in the current first rule and include this as the second rule, before the rewrite (to ensure you never get a double redirect).
With this rule placed last, HTTP requests to /resources/views/<something>.php (or /<something>) are never upgraded to HTTPS, because an earlier rule has already handled them.
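Putting the two points above together, the rewrite part of the file might be reordered roughly as follows. This is only a sketch built from the rules in the question: it assumes the .htaccess file stays where it is, that RewriteBase /myapp/ is still correct, and that Apache 2.4 is in use (the END flag needs it).

```apache
RewriteEngine On
RewriteBase /myapp/

# 1. Upgrade to HTTPS first, so no later rule answers over plain HTTP.
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# 2. Redirect "ugly" URLs to "pretty" ones, but only for GET requests,
#    so a form POST is never turned into a redirect.
RewriteCond %{REQUEST_METHOD} GET
RewriteRule ^resources/views/(.+)\.php$ $1 [R=301,L,NC]

# 3. Internally map the "pretty" URL back to the real file.
RewriteCond %{DOCUMENT_ROOT}/myapp/resources/views/$1.php -f
RewriteRule ^(.+?)/?$ resources/views/$1.php [END]
```

The form itself should still post to the "pretty" HTTPS URL; a POST sent over plain HTTP would be caught by rule 1 and lose its body in the same way.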
The url showing in the address bar: www.testsite.com/news#tab-1
The url which I want to show: www.testsite.com/news
The url showing in the address bar: www.testsite.com/news#tab-2
The url which I want to show: www.testsite.com/events
I tried rewriting rule using htaccess
RewriteCond %{REQUEST_URI} /news#tab-2$
RewriteRule .* /news[L]
and
RewriteRule www.testsite.com/test www.testsite.com/news#tab-1
But it didn't work. Please help.
You can't rewrite anchors (URL fragments) with .htaccess. You need to use something client-side, like JavaScript, to do so.
I found this in another similar question; you can read it here:
Remove fragment in URL with JavaScript w/out causing page reload
Browsers do not send the "#" character, or anything after it, to the server. If you have access to the server logs you will see that all the server gets is "GET /news"; the rest is omitted. "#" is interpreted on the client side only.
If you insist on sending it to the server, you will have to percent-encode it in the URL (as %23), but it is probably better to use an ordinary URI path or a query string ("?") if you want to do internal redirections on the server.
As a friendly side note: do not use .htaccess if you are the admin of the Apache HTTPD server; put the rules in the main server configuration instead. .htaccess is not required for redirects or rewrites, it makes them more complicated, and it adds overhead because the server has to look for and read the file on every request.
Stackoverflow users,
I have an Apache application that needs to accept data POSTed to the following paths:
/sample/HostChange/Submit
/sample/HostChange/SubmittoAPI
I'm currently using the following 301 redirect rules. This is not what I want, as the POST gets redirected and the second request is a GET, losing all the data. I can see the 301 response point to the correct URL, but the second request is a GET and results in a 405 response code.
.htaccess:
RewriteEngine On
Redirect 301 /sample/HostChange/Submit /event
Redirect 301 /sample/HostChange/SubmittoAPI /date
I'm sure using Redirect is the issue. Can someone help me figure out the correct rewrite condition I need to send these POST hits to the new paths while keeping the data being submitted to the application?
Thank you mucho.
I don't think you can redirect a POST (as a POST) with a 301: browsers won't POST the data again and typically repeat the request as a GET. (A 307 or 308 redirect, by contrast, tells the client to keep the method and body.)
Otherwise you'd have to output some HTML with JavaScript to make the browser re-POST the data to the new URL.
Or, instead of redirecting, have some server-side code accept the POST data first and then dispatch it somehow internally (maybe by redirecting with an identifying token in the URL).
Or if the data is short, re-write the URL to include the data as query parameters.
Something along the lines of the following should work. I'm not near an Apache instance at the moment, so the regex is untested; please check the rewrite log to see how it behaves.
RewriteEngine On
# RewriteLog/RewriteLogLevel are Apache 2.2 only, and only valid in the
# server config (not .htaccess). On 2.4 use "LogLevel alert rewrite:trace3".
RewriteLog /var/log/httpd/rewrite.log
RewriteLogLevel 9
RewriteCond %{REQUEST_URI} /sample/HostChange/Submit
# In .htaccess, drop the leading slash from the pattern. [P] requires mod_proxy.
RewriteRule ^/sample/HostChange/Submit(.*) /event/$1 [P,L]
I had much the same problem; what helped me was not to actually redirect the request, but to rewrite it. I don't know if this is applicable to your problem.
Here are the details and it worked for me:
PHP Rewrite url and preserve posted data
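The "rewrite instead of redirect" idea can be sketched for the two paths in the question as follows. This assumes the .htaccess file is in the document root and that /event and /date are served by the same Apache instance (both assumptions; the question does not say). Because the rewrite is internal, the browser never sees a redirect and the POST body reaches the new path unchanged.

```apache
RewriteEngine On

# Internal rewrites: no R flag, so no second request is made.
# In .htaccess the pattern is matched without the leading slash.
RewriteRule ^sample/HostChange/Submit$ /event [L]
RewriteRule ^sample/HostChange/SubmittoAPI$ /date [L]

# Alternative, if an external redirect is really required: replace the
# two RewriteRule lines above with mod_alias redirects that use 308,
# which asks the client to repeat the request with the same method and body.
# Redirect 308 /sample/HostChange/Submit /event
# Redirect 308 /sample/HostChange/SubmittoAPI /date
```

The 308 variant depends on the client honouring that status code, so the internal rewrite is the more predictable of the two.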
For some reason google indexed several pages of my website as:
http://myapp.com/index.php/this-can-be-enything/1234
Now, I want to redirect with apache .htaccess those pages to correct urls:
http://myapp.com/this-can-be-enything/1234
I've googled and tried many options but with no success.
Any tip will be helpful.
I've added the following lines to my .htaccess file:
RewriteCond %{THE_REQUEST} ^.*index.php.*
RewriteRule ^(.*)index.php(.*)$ $1$2 [NC,R=301,L]
I don't know if this is the best solution, but it works OK for me.
There are two parts to the problem:
1. To make Google aware that an indexed page has moved to another destination, you need to handle that at the Apache level and issue a 301 (Moved Permanently).
2. You need a handler that maps the cached (old) URL to the new URL; the handler from #1 can do this itself.
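The rule above leaves the dots in index.php unescaped and matches "index.php" anywhere in the path. A slightly tighter variant might look like the sketch below; it assumes the .htaccess file is in the document root and that only URLs of the form /index.php/<something> need redirecting.

```apache
RewriteEngine On

# Only act when the browser itself asked for /index.php/..., not when an
# internal rewrite to index.php is being processed (avoids redirect loops).
RewriteCond %{THE_REQUEST} \s/index\.php/ [NC]
# Redirect /index.php/<rest> to /<rest>.
RewriteRule ^index\.php/(.*)$ /$1 [R=301,L]
```

Here /index.php/this-can-be-enything/1234 would be redirected to /this-can-be-enything/1234.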
I have a bunch of URLs from an old site that I recently updated to EE2. These URLs look like this:
http://www.example.com/old/path.asp?dir=Folder_Name
These URLs will need to redirect to:
http://www.example.com/new_path/folders/folder_name
Not all Folder_Name strings match up with folder_name URL segments, so these will most likely need to be static redirects.
I tried the following Rule for a particular folder called "Example_One" which maps to a page on the new site called "example1":
Redirect 301 /old/path.asp?dir=Example_One http://www.example.com/new_path/folders/example1
But this doesn't work. Instead of redirecting I just get a 404 error telling me that http://www.example.com/old/path.asp?dir=Example_One cannot be found.
EDIT:
There's a secondary problem here too which may or may not be related: I have a catch-all rule that looks like this:
redirect 301 /old/path.asp http://www.example.com/new_path
Using that rule, requests like the first one above are redirected to:
http://www.example.com/new_path?dir=Folder_Name
which triggers a 404 error.
Just had to scour Google a bit more to find the proper syntax for mod_rewrite.
Using the example from above:
RewriteCond %{QUERY_STRING} ^dir=Example_One$
RewriteRule ^/old/path\.asp$ /new_path/folders/example1? [R=301,L]
This fixes both of the problems above -- as to why EE is 404ing with one parameter in the Query String, that problem is still unsolved, but this works as a workaround to that problem.
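The reason the original attempt failed is that mod_alias's Redirect matches only the URL path, never the query string, so "?dir=Example_One" can never be part of its first argument. Extending the fix to several folders plus the catch-all could look like the sketch below. Example_Two and example2 are hypothetical names added for illustration; the optional leading slash (^/?) lets the same rules work in both the server config and .htaccess.

```apache
RewriteEngine On

# One static mapping per old folder name. The trailing "?" on the target
# drops the old query string.
RewriteCond %{QUERY_STRING} ^dir=Example_One$
RewriteRule ^/?old/path\.asp$ /new_path/folders/example1? [R=301,L]

# Hypothetical second mapping, same shape.
RewriteCond %{QUERY_STRING} ^dir=Example_Two$
RewriteRule ^/?old/path\.asp$ /new_path/folders/example2? [R=301,L]

# Catch-all for any other (or no) query string; must come last.
RewriteRule ^/?old/path\.asp$ /new_path? [R=301,L]
```

Order matters here: the specific mappings have to appear before the catch-all, because the first matching rule with the L flag wins.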
You can also redirect URLs to a specific page where the parameter may have a different value each time. One example of this is Google UTM Campaign tracking (in situations like this where the tracking query string triggers a 404):
Link: http://www.example.com/?utm_source=xxx&..... (triggers 404)
Should Redirect to: http://www.example.com
RewriteCond %{QUERY_STRING} ^utm_source=
RewriteRule ^$ http://www.example.com? [R=301,L]
Note: This will only redirect those requests to the homepage, as defined by ^$. If you want to redirect utm requests to a different page, you'll need to change the first part of that RewriteRule.
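As a sketch of that change, the pattern can be widened so that the tracking query string is stripped on any path rather than only on the homepage. This assumes the rules live in the document root's .htaccess or in the server config, and that dropping every query parameter (not just the utm_ ones) is acceptable.

```apache
RewriteEngine On

# Any request whose query string starts with utm_source= is redirected to
# the same path with the whole query string removed.
RewriteCond %{QUERY_STRING} ^utm_source=
RewriteRule ^/?(.*)$ /$1? [R=301,L]
```

Because the redirected request no longer has a query string, the condition fails on the second pass and there is no redirect loop.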