I am redesigning my site, and the new version is located in a subfolder of the website directory. Google has indexed the new site from the subfolder, which is affecting the search results for my live site.
Is there a specific way to remove the subfolder from the Google search index and search results?
E.g. my live site is www.xyz.com and
I am redesigning at www.xyz.com/newsite
Is there any way to remove /newsite from the Google search index and results?
Refer to http://www.robotstxt.org/robotstxt.html
Add this to your robots.txt file:
User-agent: *
Disallow: /newsite/
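robots.txt only stops further crawling. To remove pages that are already indexed, the better option is to get access to Google Webmaster Tools and use the URL removal tool:
https://www.google.com/webmasters/tools/url-removal?hl=en&siteUrl=
Add your website URL after the = sign.
For example: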
https://www.google.com/webmasters/tools/url-removal?hl=en&siteUrl=http://www.techplayce.com/
Yes, by uploading a robots.txt file to your site's root directory:
User-agent: *
Disallow: /newsite/
Add this code. If you have a WordPress site, install a plugin for managing robots.txt.
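The rule suggested in the answers above can be checked locally before uploading it. This is a minimal sketch using Python's standard urllib.robotparser; the helper name is_allowed and the page name about.html are made up for illustration, and the parser only approximates how Googlebot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /newsite/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether `agent` may crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.xyz.com/",
        "http://www.xyz.com/newsite/",
        "http://www.xyz.com/newsite/about.html",  # hypothetical page name
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked"
        print(url, "->", verdict)
```

Keep in mind that this only confirms what crawlers are asked not to fetch; it says nothing about pages that are already in the index.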
Related
I have included a robots.txt file in the root directory of my application in order to tell Googlebot not to crawl the URL http://www.test.com/example.aspx?id=x&date=10/12/2014, or any URL with the same path but different query string values. For that I have used the following piece of code:
User-agent: *
Disallow:
Disallow: /example.aspx/
But I found in Webmaster Tools that Google is still following this page and has cached a number of URLs with the specified extension. Is the query string causing the problem? As far as I know, Google does not care about query strings, but just in case. Am I using it correctly, or does something else need to be done to achieve this?
Your instruction is wrong:
Disallow: /example.aspx/
This blocks all URLs in the directory /example.aspx/
If you want to block all URLs of the file /example.aspx, including those with query strings, use this instruction:
Disallow: /example.aspx
You can test it with Google Webmaster Tools.
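The difference between the two rules can also be seen locally. The sketch below uses Python's standard urllib.robotparser, which applies the same prefix matching; is_blocked is a hypothetical helper name, and the result is only an approximation of Googlebot's own behaviour.

```python
from urllib.robotparser import RobotFileParser

# The URL from the question.
URL = "http://www.test.com/example.aspx?id=x&date=10/12/2014"


def is_blocked(disallow_path: str, url: str) -> bool:
    """Return True if a single 'Disallow: <path>' rule blocks `url`."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_path}"])
    return not parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    for rule in ("/example.aspx/", "/example.aspx"):
        print(f"Disallow: {rule:<15} blocks the URL: {is_blocked(rule, URL)}")
```

The rule with the trailing slash does not block the URL because the path /example.aspx?id=... does not start with /example.aspx/.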
Is there any way to prevent Google from indexing my folder? Not a "page", but a folder.
Using .htaccess, robots.txt, or any other way?
Add the line below to your robots.txt file, under a User-agent line such as User-agent: *
Disallow: /foldername/
Let me know if that helps.
I discovered that the robots.txt file on my site is causing Google's Webmaster Tools to not index my site properly. I tried removing just about everything from the file (I am using WordPress, so it will still generate it), but I keep getting the same error in their panel:
"Severe status problems were found on the site. - Check site status". When I click on the site status, it tells me that robots.txt is blocking my main page, which it is not.
http://saturate.co/robots.txt - ideas?
Edit: Marking this as solved as it seems Webmaster Tools now accepted the site and is showing no errors.
You should try adding an empty Disallow: to the end of your file, so that it looks like this:
User-agent: *
Disallow:
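An empty Disallow: value means nothing is blocked, whereas Disallow: / blocks the whole site. The following sketch checks both variants against the home page using Python's standard urllib.robotparser; homepage_allowed and the two constants are hypothetical names used only for this illustration.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty value: nothing is blocked
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # a single slash: everything is blocked


def homepage_allowed(robots_txt: str, site: str = "http://saturate.co/") -> bool:
    """Report whether a generic crawler may fetch the site's home page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", site)


if __name__ == "__main__":
    print("empty Disallow :", homepage_allowed(ALLOW_ALL))
    print("Disallow: /    :", homepage_allowed(BLOCK_ALL))
```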
I have searched but could not find a proper answer for this kind of problem.
I have a few domains set up on the same cPanel account.
One of them is the main domain, and the rest are secondary domains.
cPanel automatically creates a subdomain when you add a secondary domain, something like:
http://secondary.maindomain.com
My problem is that Google has indexed my pages under both addresses.
Like:
secondary.com/blabla.html
secondary.maindomain.com/blabla.html
How can I remove those entries from Google's index? And
how can I prevent those subdomains from being indexed in the future?
For this purpose, you can add a robots.txt file to your document root and use Disallow rules to keep Google or any other search engine from crawling your files or directories.
For example, to keep Google from crawling your subdomain, add the entries below to robots.txt and place the file in the document root of your subdomain:
User-agent: Googlebot
Disallow: /
or for all search engines:
User-agent: *
Disallow: /
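One caveat: if the automatically created subdomain and the secondary domain are served from the same directory (an assumption worth verifying for your account), a single static robots.txt with Disallow: / would apply to both hostnames. A possible workaround is to generate the file per hostname. The sketch below is purely illustrative; robots_txt_for_host, the constants, and the way it would be wired into your web server are all hypothetical.

```python
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for_host(host: str, canonical_host: str = "secondary.com") -> str:
    """Pick a robots.txt body based on the request's Host header value.

    Only the canonical hostname (with or without "www.") may be crawled;
    every other hostname, such as secondary.maindomain.com, is blocked.
    """
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    if hostname.startswith("www."):
        hostname = hostname[len("www."):]
    return ALLOW_ALL if hostname == canonical_host else BLOCK_ALL


if __name__ == "__main__":
    for host in ("secondary.com", "www.secondary.com:80", "secondary.maindomain.com"):
        print(host, "->", repr(robots_txt_for_host(host)))
```

How the Host header reaches this function depends on your hosting setup, so treat it as a starting point rather than a drop-in solution.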
I am wondering if there is a way to include a line in my robots.txt that stops Google from indexing any URL on my website that contains specific text.
I have different sections, all of which contain different pages. I don't want Google to index page 2, page 3, etc., just the main page.
The URL structure I have is as follows:
http://www.domain.com/section
http://www.domain.com/section/page/2
http://www.domain.com/section/article_name
Is there a rule I can put in my robots.txt file to NOT index any URL containing:
/page/
Thanks in advance everyone!
User-agent: Googlebot
Disallow: /section/*
or, depending on your requirement:
User-agent: Googlebot
Disallow: /section/page/*
Note that Disallow takes a path relative to the site root, not a full URL.
You may also use Google Webmaster Tools rather than the robots.txt file:
Go to GWT / Crawl / URL Parameters
Add Parameter: page
Set to: No URLs
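Rules are matched from the start of the URL path, so Disallow: /page only blocks URLs whose path begins with /page. To block /page/ inside any section, you can use the * wildcard, which Googlebot supports:
Disallow: /*/page/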
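Robots.txt rules are matched from the start of the URL path, and Googlebot additionally understands the * wildcard and a trailing $ anchor. Python's standard urllib.robotparser does not handle these wildcards, so the sketch below is a simplified matcher written only to compare how /page and /*/page/ behave against the URLs from the question. The function name google_style_match is made up, and this is not Google's actual implementation.

```python
import re


def google_style_match(pattern: str, path: str) -> bool:
    """Return True if a Googlebot-style robots.txt pattern matches `path`.

    Simplified semantics: the pattern is anchored at the start of the path,
    '*' matches any run of characters, and a trailing '$' anchors the end.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with "match anything".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored_end:
        regex += "$"
    return re.match(regex, path) is not None


if __name__ == "__main__":
    paths = ("/section", "/section/page/2", "/section/article_name")
    for rule in ("/page", "/*/page/"):
        for path in paths:
            verdict = "blocked" if google_style_match(rule, path) else "allowed"
            print(f"Disallow: {rule:<10} {path:<24} {verdict}")
```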