I think one of my sites recently got delisted from Google because it found and started indexing my dev site, which is basically a replica of my main site (dev.site.com and site.com).
Anyway, is there a way to create one robots.txt that would prevent anything on dev.site.com from being indexed, while leaving site.com fully indexed?
I know I could just have separate robots.txt files for each, but it would be easier to have one that covers both, especially since I work with a whole lot of sites that have dev sites. I would like an easy workflow where I do not have to change the robots.txt files when I push new versions of a site to live.
Perhaps you could serve the robots.txt file dynamically, e.g. via PHP:
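<?php
// Serve robots.txt dynamically, based on the requested host.
header('Content-Type: text/plain');
if ($_SERVER['HTTP_HOST'] === 'dev.site.com') {
    echo "..."; // rules for the dev site
} else {
    echo "..."; // rules for the live site
}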
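To make the host check concrete, here is a minimal sketch of the same idea in Python. The `DEV_HOSTS` set, the function name and the two rule bodies are my own assumptions for illustration; they are not what the PHP above echoes, since that part was left elided.

```python
# Hypothetical set of hosts that should be kept out of the index.
DEV_HOSTS = {"dev.site.com"}


def robots_txt_for_host(host: str) -> str:
    """Return a robots.txt body for the given Host header value."""
    # Drop an optional port and normalise case before comparing.
    hostname = host.split(":", 1)[0].strip().lower()
    if hostname in DEV_HOSTS:
        # Ask all crawlers to stay away from the whole dev site.
        return "User-agent: *\nDisallow: /\n"
    # An empty Disallow places no restriction on the live site.
    return "User-agent: *\nDisallow:\n"


if __name__ == "__main__":
    print(robots_txt_for_host("dev.site.com"))
    print(robots_txt_for_host("site.com"))
```

The same file can then be deployed to both sites unchanged, which is the workflow the question asks for.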
Another approach is to add a line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
This is often recommended over robots.txt: robots.txt only stops crawling, so if something links to your dev site, the search engines can still list that link (even though they do not index the content). Note that the Header directive requires mod_headers. The reasoning is explained here:
http://yoast.com/prevent-site-being-indexed/
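If the dev and live sites share one codebase, the header would need to be sent only for the dev host. Here is a rough sketch of that as a Python WSGI wrapper; the host set, the tiny demo app and the helper names are assumptions of mine, not part of the .htaccess approach above.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical set of hosts that should send the noindex header.
DEV_HOSTS = {"dev.site.com"}


def noindex_on_dev(app):
    """Wrap a WSGI app so responses on dev hosts carry X-Robots-Tag."""
    def wrapped(environ, start_response):
        host = environ.get("HTTP_HOST", "").split(":", 1)[0].lower()

        def patched_start(status, headers, exc_info=None):
            if host in DEV_HOSTS:
                headers = list(headers) + [("X-Robots-Tag", "noindex, nofollow")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start)
    return wrapped


def hello_app(environ, start_response):
    """Stand-in application used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def response_headers(app, host):
    """Call the app once for `host` and return its response headers as a dict."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_HOST"] = host
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    list(app(environ, start_response))
    return captured["headers"]


if __name__ == "__main__":
    print(response_headers(noindex_on_dev(hello_app), "dev.site.com"))
    print(response_headers(noindex_on_dev(hello_app), "site.com"))
```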
Under the robots.txt standard, each host (including each subdomain) is served its own robots.txt, so dev.site.com needs one of its own; a subdirectory such as site.com/dev would be covered by the file for site.com.
I am currently migrating my website from Apache to nginx, but my .htaccess file is not working. My website is inside the /usr/share/nginx/html/mywebsite folder. How can I use .htaccess in my nginx server?
This is my .htaccess file:
RewriteEngine on
RewriteRule video/watch/([a-zA-Z0-9_#$*-]+)/?$ "videos-single.php?id=$1" [NC]
Nginx doesn't support .htaccess (see here: "You can’t do this. You shouldn’t. If you need .htaccess, you’re probably doing it wrong.").
You have two choices (as far as I know):
import your .htaccess to nginx.conf (maybe the htaccess to nginx converter helps you)
use authd-htpasswd (I didn't try it)
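Before porting the rule by hand, it can help to pin down exactly what it matches. The following is only a sketch that emulates the RewriteRule from the question with Python's `re` module; the function name is mine, and it assumes the path is given without a leading slash, as mod_rewrite sees it in .htaccess context.

```python
import re

# Same pattern as the RewriteRule in the question; [NC] becomes IGNORECASE.
PATTERN = re.compile(r"video/watch/([a-zA-Z0-9_#$*-]+)/?$", re.IGNORECASE)


def rewrite(path):
    """Return the rewritten target for `path`, or None if the rule does not apply."""
    match = PATTERN.search(path)  # the rule is not anchored at the start
    if match is None:
        return None
    return f"videos-single.php?id={match.group(1)}"


if __name__ == "__main__":
    for candidate in ("video/watch/abc-123/", "Video/Watch/XyZ", "video/list"):
        print(candidate, "->", rewrite(candidate))
```

A handful of such inputs gives you a checklist to run against the converted nginx configuration.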
Disclosure: I am the author of htaccess for nginx, which is now open source software.
Over the past years, I created a plugin that implements htaccess behaviour in nginx, especially things like RewriteRule, Allow and Deny, which can be crucial for web security. I use the plugin in my own production environments without problems.
I fully agree with the emphasis on efficiency and speed in nginx, and I understand why they didn't implement htaccess.
However, think about it: using nginx plus htaccess cannot make things worse. You still keep the great performance of nginx, and you can run your legacy applications effortlessly on one web server.
This is not supported officially in nginx. If you need this kind of functionality you will need to use Apache or some other http server which supports it.
That said, the official nginx reasoning is flawed because it conflates what users want to do with the way it is done. For example, nginx could easily check the directories only every 10 seconds / minute or so, or it could use inotify and similar mechanisms. This would avoid the need to check it on every request... But knowing that doesn't help you. :)
You could get around this limitation by writing a script that would wait for nginx config files to appear and then copy them to /etc/nginx/conf.d/. However there might be some security implications - as there is no native support for .htaccess in nginx, there is also no support for limiting allowed configuration directives in config files. YMMV.
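A minimal sketch of such a script, in Python, might look like the following. The function name and the `*.conf` pattern are assumptions; it does no validation of directives at all, which is exactly the security gap mentioned above, and a real version would also need to run `nginx -t` and `nginx -s reload` after copying.

```python
import shutil
import tempfile
from pathlib import Path


def sync_configs(src_dir, dest_dir, pattern="*.conf"):
    """Copy new or changed config files to dest_dir; return the names copied."""
    src, dest = Path(src_dir), Path(dest_dir)
    copied = []
    for path in sorted(src.glob(pattern)):
        target = dest / path.name
        # Copy only when the file is missing or its content differs.
        if not target.exists() or path.read_bytes() != target.read_bytes():
            shutil.copy2(path, target)
            copied.append(path.name)
    return copied


def demo():
    """Run two passes over temporary directories and return both results."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dest:
        Path(src, "mysite.conf").write_text("# hypothetical site config\n")
        Path(src, "notes.txt").write_text("ignored: does not match the pattern\n")
        first = sync_configs(src, dest)   # copies the new file
        second = sync_configs(src, dest)  # nothing changed, nothing copied
        return first, second


if __name__ == "__main__":
    print(demo())
```

In practice the destination would be /etc/nginx/conf.d/ and the function would be called from a loop or a file-watching hook.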
Using the config file is one option, but the cool thing about the .htaccess file is that it provided a way for a web developer to have some control over server settings without having root access to the server. There doesn't seem to be anything like this on nginx which is a real bummer.
I understand how the way it's set up on Apache slows down response times, but I hoped there could be an nginx way to do the same thing without the performance hit... At least a way to do rewrites with regex on URLs, if nothing else.
"Is there no nginx way to do bulk redirects using regular expressions that doesn't slow down response times?"
Just edit your database with phpMyAdmin.
Open phpMyAdmin, select your database, then find your "yourprefix_posts" table.
Open it, then click the "Search" tab, then "Find and Replace".
Select "post_content" in the dropdown.
In the "Find" field, type the URL you want to change: "website.com/oldURL".
In the "Replace" field, type the new URL: "website.com/newURL".
(To use a regular expression, tick the "Regular Expression" box.)
NOTE: You can test this out by simply leaving the "Replace" field blank.
ALWAYS BACK UP your database before making changes. This might sound scary, but it's really not. It's super simple and can be used to quickly replace just about anything.
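For anyone who prefers to see what the find-and-replace does under the hood, here is a rough sketch. It uses Python's built-in sqlite3 as a stand-in for the real MySQL database, so the table layout and function name are assumptions; the dry-run count mirrors the "test it first" step above.

```python
import sqlite3


def replace_in_posts(conn, table, old, new, dry_run=True):
    """Count rows whose post_content contains `old`; rewrite them unless dry_run."""
    # Table names cannot be bound as parameters, so validate before interpolating.
    if not table.replace("_", "").isalnum():
        raise ValueError("unexpected table name")
    count = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE instr(post_content, ?) > 0", (old,)
    ).fetchone()[0]
    if not dry_run:
        conn.execute(
            f"UPDATE {table} SET post_content = REPLACE(post_content, ?, ?) "
            "WHERE instr(post_content, ?) > 0",
            (old, new, old),
        )
        conn.commit()
    return count


def demo():
    """Build a throwaway table, run the replacement, and return the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourprefix_posts (id INTEGER PRIMARY KEY, post_content TEXT)"
    )
    conn.executemany(
        "INSERT INTO yourprefix_posts (post_content) VALUES (?)",
        [("see website.com/oldURL for details",), ("nothing to change",)],
    )
    matched = replace_in_posts(
        conn, "yourprefix_posts", "website.com/oldURL", "website.com/newURL",
        dry_run=False,
    )
    rows = [r[0] for r in conn.execute(
        "SELECT post_content FROM yourprefix_posts ORDER BY id"
    )]
    return matched, rows


if __name__ == "__main__":
    print(demo())
```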
Here's the problem: we have a family (approx. 8) of websites, each hosted on a different subdomain of a single domain common to every member of the family. E.g.,
ecommerce.my_domain.com
forums.my_domain.com
signup.my_domain.com
For various reasons, each subdomain is administered separately from the others, i.e., on different servers and completely autonomously with respect to nearly every development decision, including choice of web framework. For instance, two are Django, one is Zend, and so on (though all run Apache 2.2). We want to fix this, and someday we will; it just won't be anytime soon.
One direct consequence of this structure is that we have multiple Default page names. By 'Default Page' I'm referring to the page the server defaults to when no page on the subdomain is given: sometimes it's 'index.html', sometimes 'index.php', etc. (I know what they are; it's the fact that there are multiple pages that's a problem.)
(The Default page is the webpage to which your server defaults when no page on the domain is specified. For example, if the "index.html" page is served when you enter "www.my_domain.com", "index.html" is the Default page.)
Here's one problem it causes: our analytics code (JavaScript) will count page views to subdomain1.my_domain.com and subdomain1.my_domain.com/index.html as two separate pages unless the correct default page is specified. By itself, this can cause a two-fold error in the basic page view measurement. In addition, the analytics system (Google Analytics) only allows a single Default Page to be specified.
After looking into this, it seems one way to do this is at the server (Apache 2.2): (i) create a CGI directory without using ScriptAlias; (ii) use DirectoryIndex to specify a default document when only the directory is requested.
I suppose this can also be done within the web framework that supports each subdomain property, though given we have multiple different frameworks, that option is certainly less appealing.
I would be grateful for the Community's view on the preferred way to do this.
What about using a .htaccess file to handle it?
Just add a line like the following, listing the files in the order you want Apache to look for them as the default:
DirectoryIndex index.html default.php default.htm foo.html index.cgi
.htaccess reference docs: http://httpd.apache.org/docs/current/howto/htaccess.html
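The lookup order is the important part: the first listed file that exists in the directory wins. A small sketch of that selection logic in Python (the function name is mine, and this only models the ordering, not Apache itself):

```python
# Candidate list taken from the DirectoryIndex line above.
CANDIDATES = "index.html default.php default.htm foo.html index.cgi".split()


def resolve_directory_index(files_in_dir, candidates=CANDIDATES):
    """Return the first candidate present in the directory, or None."""
    available = set(files_in_dir)
    for name in candidates:
        if name in available:
            return name
    return None


if __name__ == "__main__":
    print(resolve_directory_index(["index.cgi", "default.php"]))
    print(resolve_directory_index(["readme.txt"]))
```

So a directory holding both default.php and index.cgi would serve default.php, because it appears earlier in the list.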
Slightly strange question, I want my Wordpress blog to use subdomains as permalinks, a bit like the popular website "The Setup".
I already have Apache configured to load the site irrespective of the subdomain, I just need to work out how to make Wordpress load the right post. I have my permalink structure set as "/%postname%/" as I will manually ensure the post names are unique.
All I need to work out is how to get the subdomains mapped to the postname - I'm guessing it requires mod_rewrite? But I'm unsure how to proceed.
Thanks
I'm using nginx for The Setup, so I'm not sure if that will help you much at all (nor can I guarantee that this is the best way to do it) but my config looks something like this:
location = / {
    if ($host ~* "^([a-z0-9+\.\-]+)\.usesthis\.com$") {
        set $interview $1;
    }
    if ($interview !~* '^(www)?$') {
        rewrite ^(.*)$ /interviews/$interview.html;
    }
}
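In case the nginx syntax is unfamiliar, here is my reading of that mapping as a small Python sketch. The function name is an assumption, and it only models the host-to-path decision, not the request handling.

```python
import re

# Same two patterns as the nginx config; ~* and !~* are case-insensitive.
HOST_RE = re.compile(r"^([a-z0-9+.\-]+)\.usesthis\.com$", re.IGNORECASE)
SKIP_RE = re.compile(r"^(www)?$", re.IGNORECASE)


def interview_path(host):
    """Return the internal path for a subdomain, or None when no rewrite applies."""
    match = HOST_RE.match(host)
    # An unmatched host leaves the variable empty, like an unset nginx variable.
    interview = match.group(1) if match else ""
    if SKIP_RE.match(interview):
        return None  # bare domain or www: serve the normal front page
    return f"/interviews/{interview}.html"


if __name__ == "__main__":
    for host in ("jane.doe.usesthis.com", "www.usesthis.com", "usesthis.com"):
        print(host, "->", interview_path(host))
```

For WordPress, the equivalent step would be to turn the captured subdomain into the post name before WordPress resolves the permalink.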
The best way to accomplish this with Wordpress core features seems to be the "Create a Network" feature - formerly Wordpress Multi-Site (MU).
All you have to do is add a line to your wp-config.php file to begin the setup options from the Wordpress dashboard.
There are details here: http://codex.wordpress.org/Create_A_Network
However, be advised that at this time they are slightly out of date in terms of what the screens look like, and the workflow. (For example, once you've gone through the process, now instead of entering the Network Admin interface from the top left link in the admin dash bar, the link is in the top right and looks a bit different).
A website I work on is currently running three different instances of Wordpress blogs, all with their own set of users & permissions, plugins and themes.
Unfortunately, I don't think Wordpress Network is a viable option due to the restrictions around where sub-blogs can be in terms of URLs.
Here are the url structures of each blog:
sub.domain.com/blog-1 (lives in web_root/blog)
sub.domain.com/folder/blog-2 (lives in web_root/blog-2, being aliased to folder via Apache)
sub.domain.com/blog-2 (lives in web_root/blog-2)
To complicate matters, sub.domain.com is a Zend Framework website where all requests for files that don't physically exist are rewritten to sub.domain.com/index.php.
Any thoughts on how I can consolidate these instances into one Wordpress install? Thanks.
I am a huge fan of wordpress MU. There is no problem having separate user stores, and using separate plugins.
It will make updating Wordpress software 3 times easier.
Export all of the blogs and then import them into one Wordpress instance. The only strange one is sub.domain.com/folder/blog-2 but you can use the same apache mod_rewrite trick to redirect it to the lower folder.
We are putting up a company blog at companyname.com/blog but for now the blog is a Wordpress installation that lives on a different server (blog.companyname.com).
The intention is to have the blog and web site both on the same server in a month or two, but that leaves a problem in the interim.
At the moment I am using mod_rewrite to do the following:
http://companyname.com/blog/article-name redirects to http://blog.companyname.com/article-name
Can I somehow keep the address bar displaying companyname.com/blog even though the content is coming from the latter blog.companyname.com?
I can see how to do this if it is on the same server and vhost, but not across different servers.
Thanks
Rather than using mod_rewrite, you could use mod_proxy to set up a reverse proxy on companyname.com, so that requests to http://companyname.com/blog/article-name are proxied (rather than redirected) to http://blog.companyname.com/article-name.
Here are more instructions and examples.
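The key difference from a redirect is that the proxy translates the URL internally and the browser never sees the upstream address. A minimal sketch of just that translation step, in Python (the constant and function names are mine; a real proxy would also fetch the upstream page and relay it):

```python
from urllib.parse import urlsplit

PUBLIC_PREFIX = "/blog/"                   # path the visitor sees
UPSTREAM = "http://blog.companyname.com/"  # where the content actually lives


def upstream_url(request_url):
    """Map a public /blog/ URL to its upstream URL, or None if out of scope."""
    parts = urlsplit(request_url)
    if not parts.path.startswith(PUBLIC_PREFIX):
        return None
    target = UPSTREAM + parts.path[len(PUBLIC_PREFIX):]
    if parts.query:
        target += "?" + parts.query
    return target


if __name__ == "__main__":
    print(upstream_url("http://companyname.com/blog/article-name"))
    print(upstream_url("http://companyname.com/about"))
```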
There is functionality with ZoneEdit called webforwards which could probably do this and hide what you are actually doing (unless someone looked into it).
On its own, mod_rewrite can only send HTTP redirects across servers (unless it is combined with mod_proxy via the [P] flag), and those redirects always result in the browser address bar showing the real URL.
You should instead consider writing a 404 script that 'reflects' the blog. This would essentially be a transparent proxy, and many are already written.
The script would check whether the requested page (the one that 404'd) started with http://companyname.com/blog/. If it did, it would download the blog page and associated files and then send them on to the client (probably caching them as well).
So requesting http://companyname.com/blog/article_xyz would cause the 404 script to download and send http://blog.companyname.com/article_xyz.
It's probably more work than it's worth, but you might be able to design a simple enough 404 script that it's worthwhile.
-Adam