I found out that Google will only index the top-level URL of my site, but I also have a blog on my site that I want listed in blog directories.
So my question is:
Since I named the blog www.mydomain.com/blog.html, it is certain that Google will only take www.mydomain.com into consideration.
So what if I name my blog this way: www.blog.mydomain.com?
Would this be considered a separate entity? If so, how do I name it that way?
P.S.: My blog and site both have different content on the same topic, and I want them to be listed in both web and blog directories.
You can use the canonical meta tag:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
For example, on your blog (in both places), add the tag: <link rel="canonical" href="http://blog.example.com/" />
It applies to both static and dynamic pages.
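As a minimal sketch of placement (the URLs are placeholders), the tag sits in the <head> of every page that duplicates the canonical content:

<head>
  <title>My Blog</title>
  <link rel="canonical" href="http://blog.example.com/" />
</head>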
Subfolders are preferable to subdomains.
So http://www.domain.com/blog is better for your SEO than http://blog.domain.com.
The reason is that a subdomain is a completely different website in the eyes of Google, while a subfolder is an extension of the main website.
Therefore the subfolder will benefit from links pointing to the main website, and the main website will also benefit from links pointing to the subfolder.
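If the blog already lives on a subdomain, a hedged .htaccess sketch to consolidate it into the subfolder (assuming Apache with mod_rewrite; the domain is illustrative):

RewriteEngine On
RewriteCond %{HTTP_HOST} ^blog\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/blog/$1 [R=301,L]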
I'm trying to resolve an "issue": if I google the keyword that is my website's domain (just the company name), I get a lot of results related to the company, except for the home page... Here is what I see in the first 10 pages of results (I didn't look further):
- Crunchbase page
- Linkedin page
- Facebook page
- Our zendesk help center page
- Some external blog references on VentureBeat, pocketgamer, Forbes, etc
The site is accessible with or without the www. prefix, and over both http and https. It is a WordPress-powered website.
Here is the Google URL of the query.
I then tried googling 'site:trophit.com', per Google's recommendation for when your website doesn't show up in searches. Interestingly enough, I get results for some posts on my website but NOT PAGES. Here is that query as well.
I browsed some of the pages that are missing from the search results and found that they contain the following metadata:
<meta name="robots" content="noindex,nofollow"/>
I checked WordPress -> Settings -> Reading; the "discourage robots" option is unchecked. I am unaware of any other similar settings in WP.
Any idea why all pages have 'noindex'?
I am just a learner in the field of SEO, and I have a main domain and an addon domain. Both have separate websites. Consider main.com as my main domain and addon.com as my addon domain, which is pointed to a subdirectory called "addon".
I can access addon.com in the following three ways:
addon.com
main.com/addon
addon.main.com
Are these URLs indexed separately by search engines? If so, how can I prevent this?
Does a search engine treat main.com/addon as a page of main.com?
I am not sure whether I need to worry about all these things or just leave it as it is. I searched Google but couldn't find a clear answer.
It may be too late to answer. However, it may benefit others.
The primary domain and a subdomain or addon domain will not be linked by the search engines automatically, unless you link them purposefully or inadvertently. The exception is when all of these conditions are true:
- Your web root (normally public_html) has no index page.
- Directory indexing of your web root is enabled, eventually exposing/linking your subfolder (which is attached to your addon domain) to Google and the entire world.
In that scenario a robots.txt solution is not recommended, because search engines may ignore robots.txt rules.
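A hedged one-line fix for that exposure (assuming Apache) is to switch directory indexing off in the web root's .htaccess:

Options -Indexes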
Google will only index pages if they are linked to or listed in the sitemap. You can stop addon.main.com or main.com/addon from being indexed by using noindex tags:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
or by disallowing it in robots.txt.
The search engine will consider main.com/addon a page of main.com. If the sites are completely separate, I'd recommend using a separate domain (preferably a keyword-rich domain), but it's up to you really.
There are three domain names serving the same content, and all three return a 200 OK HTTP code, so it will look like duplicates of the same content. It would be better if there were a canonical tag on every page.
The best option would be to create a redirection in the subdomain panel in cPanel so that at least addon.main.com redirects to addon.com.
Then, you can add a robots.txt to the root path of the primary domain with:
User-agent: *
Disallow: /addon/
so that no robot will visit main.com/addon. (A bare Disallow: / there would block the whole main site, not just the addon folder.)
Google gives less weight to a site hosted on a subdomain of another domain.
Super bad for SEO.
If you are hosting for SEO and love the convenience of cPanel, then forget hosting domains as addon domains.
@Vasanthan R.P. It's an excellent question, often overlooked by SEO professionals. +1 for you.
If I generate a sitemap at http://www.xml-sitemaps.com/ for aromapersona.com, I get 9 pages; however, there are a bunch more pages that should show up. For example, aromapersona.com/candle_holder is in the same "front" directory as the other 9 pages but doesn't appear in the sitemap. Is this because no other pages on my site link to it? I'm trying to get these other URLs indexed, and I even edited the sitemap to include this URL as well as others and submitted it to Google via Webmaster Tools, and still nothing. Advice?
I'm not familiar with aromapersona.com, but the generator will only be able to list pages that are linked to from the initial page you give it (or pages those link to), unless you provide the site with FTP access (which I presume you don't).
If you include the URLs in your sitemap for Google, it should eventually list them, but linking to them from other parts of your site is probably the most effective approach.
I have not checked the website, but do also make sure the cause is not noindex/nofollow tags, robots.txt, JavaScript links, mixing http/https, etc.
In clear wording: there is no link pointing to the subpage "candle_holder", hence the XML sitemap generator (which works by following links on your site) cannot detect it.
You can add it manually to the XML, but then again, it should also be accessible from the site directly.
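For reference, a manually added entry is just another <url> element inside the sitemap's <urlset> (the URL below is the one from the question):

<url>
  <loc>http://aromapersona.com/candle_holder</loc>
</url>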
We have a blog in a subdirectory of the main URL:
http://www.domain.com/blog/
The blog is run by WordPress, and we are using Google Sitemap Generator to create the XML file.
We have an index of all of our sitemaps in the main sitemap.xml, which points to many sitemaps.
From an SEO standpoint, would it be best to link directly to the sitemap under the blog directory:
e.g. http://www.domain.com/blog/sitemap.xml
or should we run a daily cron job to copy the file to the main domain's directory:
e.g. http://www.domain.com/sitemap_blog.xml
which would be linked from the main index with the other sitemaps.
What is the best approach from an SEO standpoint?
It doesn't matter where the sitemap is. You will want to register its location with the search engines you want to find it. The main thing, though, is to link to your sitemap location in the robots.txt file using the following line:
Sitemap: <sitemap_location>
Your robots.txt file should be in your domain's root.
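For example, using the blog sitemap path from the question, the line in http://www.domain.com/robots.txt would be:

Sitemap: http://www.domain.com/blog/sitemap.xml

And rather than cron-copying the file, the main sitemap index can simply point at the blog sitemap where it already lives, e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.domain.com/blog/sitemap.xml</loc>
  </sitemap>
</sitemapindex>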
A few days ago we replaced our web site with an updated version. The original site's content was migrated to http://backup.example.com. Search engines do not know about the old site, and I do not want them to know.
While we were in the process of updating our site, Google crawled the old version.
Now when using Google to search for our web site, we get results for both the new and old sites (e.g., http://www.example.com and http://backup.example.com).
Here are my questions:
Can I update the backup site's content with the new content? Then we could get rid of all the old content. My concern is that Google will lower our page ranking due to duplicate content.
If I prevent the old site from being accessed, how long will it take for the information to clear out of Google's search results?
Can I use a disallow rule to block Google from the old web site?
You should probably put a robots.txt file on your backup site and tell robots not to crawl it at all. Google will obey the restrictions, though not all crawlers will. You might want to check out the options available to you at Google's Webmaster Central, and ask Google to see if they will remove the errant links from their data.
You can always use robots.txt on the backup.* site to disallow Google from indexing it.
Are the URL formats consistent enough between the backup and current site that you could redirect a given page on the backup site to its equivalent on the current one? If so, you could have the backup site send 301 Permanent Redirects to each of the equivalent pages on the site you actually want indexed. The redirecting pages should drop out of the index (after how much time, I do not know).
If not, definitely look into robots.txt as Zepplock mentioned. After setting up robots.txt, you can expedite removal from Google's index with their Webmaster Tools.
You can also add a rule in your scripts to send a 301 header redirecting each old page to its new one, as sketched below.
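A hedged .htaccess sketch for the backup host (assuming Apache with mod_rewrite; the target domain is taken from the question):

RewriteEngine On
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]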
Robots.txt is a good suggestion, but... Google doesn't always listen. Yeah, that's right, they don't always listen.
So, disallow all spiders, but... also put this in your header:
<meta name="robots" content="noindex, nofollow, noarchive" />
It's better to be safe than sorry. Meta commands are like yelling at Google "I DON'T WANT YOU TO DO THIS TO THIS PAGE". :)
Do both, save yourself some pain. :)
I suggest you either add a noindex meta tag to all the old pages or just disallow them via robots.txt; blocking with robots.txt is the best way. One thing more: add a sitemap to the new site and submit it in Webmaster Tools to improve your new website's indexing.
Password-protect the web pages or directories that you don't want web spiders to crawl or index by putting password-protection code in the .htaccess file (if one is present in your website's root directory on the server; otherwise create a new one and upload it).
The web spiders will never know that password and hence won't be able to index the protected directories or web pages.
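As a minimal sketch of that protection (assuming Apache; the realm name and .htpasswd path are placeholders):

AuthType Basic
AuthName "Restricted Area"
AuthUserFile /path/to/.htpasswd
Require valid-user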
You can block any particular URLs in Webmaster Tools; check there. You can also block them using robots.txt. Remove the sitemap for your old backup site and put a noindex, nofollow tag on all of your old backup pages. I handled this situation like that for one of my clients.