Suppose I have my live site at www.mywebsite.com, tracked and managed via Google Webmaster Tools. Then I want to add to the project list a subdomain like test.mywebsite.com which I use for testing purposes. Of course that subdomain shouldn't be tracked or indexed by Google, but I would like to use "fetch as Google" feature on it to see how the crawler manages the pages. Can I set up such a test environment without being indexed by Google?
I haven't had a chance to test this, but I think that if you add noindex tags to your site, it should still be possible to register the site with Webmaster Tools, since Google can still see the site's content in order to verify ownership.
I believe "Fetch as Google" then returns live results rather than what is already indexed (it wouldn't be very useful if it didn't let you check new pages or re-check updated pages), so temporarily removing the noindex tag when you run it should allow the feature to be used (it may also return some useful information without removing it).
The fact that "Fetch as Google" has a separate "Submit" button suggests to me that it will not automatically index pages found via this method, so that should not be a concern.
Adding canonical tags pointing to your main site's content would provide an additional safeguard against the test pages accidentally being listed.
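A minimal sketch of what that could look like in the head of a test page (the path in the canonical URL is a placeholder):

    <!-- keep the test page out of the index -->
    <meta name="robots" content="noindex">
    <!-- point search engines at the live version as the preferred one -->
    <link rel="canonical" href="https://www.mywebsite.com/some-page/">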
Google can't provide any information about your website if it's not indexed.
In other words, you can use Google Webmaster Tools without your website being indexed, but it will be pretty much useless, since it will not provide any data.
Google Webmaster Tools won't let you do that, but you can check a website for SEO issues and other errors, such as missing search descriptions or missing image alt attributes, with Bing Webmaster Tools.
On most of my websites I have a lot of external links, both to my other sites and to other external sites.
I need to know: when is it better to use rel="nofollow" and when rel="external" on a website?
You may use external for every link to a different website, no matter if it’s yours or not, if it’s on the same host or not.
You may use nofollow for every link that you don’t endorse (for example: search engines shouldn’t assume that it’s a relevant link and should not give any ranking credit to this link).
You may use both values for the same link:
    <a href="https://example.com/" rel="nofollow external">Foobar</a>
Note that external doesn’t convey that the link should be opened in a new window.
Note that search engine bots (that support nofollow) might still follow a nofollow link (the value doesn't forbid them from following it). FWIW, there is also the nofollow value for the meta-robots keyword (which may mean the same thing … or not, depending on which definition you follow).
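For reference, the page-wide variant mentioned above looks like this; it asks compliant bots not to follow any link on the page:

    <meta name="robots" content="nofollow">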
The nofollow attribute tells search engine bots not to follow the link.
If you have rel="nofollow", the link juice stops there.
rel="external" doesn't act like nofollow; it is still a dofollow link.
For rel="external" it means the file is on a different site to the current one.
rel="external" is the XHTML valid version that informs search engine spiders that the link is external.
However, using this does not open the link in a new window. target="_blank" and target="_new" do this, but they are not valid XHTML. I hope that helps.
I advise you to use nofollow links for the following content:
Links in comments or on forums - Anything that has user-generated content is likely to be a source of spam. Even if you carefully moderate, things will slip through.
Advertisements & sponsored links - Any links that are meant to be advertisements or are part of a sponsorship arrangement must be nofollowed.
Paid links - If you charge in any way for a link (directory submission, quality assessment, reviews, etc.), nofollow the outbound links.
If you have an external link to your own site, then use:
    <a href="https://example.com/" rel="external">Your Link</a>
If you have an external link to someone else's site that you don't trust, you can combine both values and use:
    <a href="https://example.org/" rel="external nofollow">Other Domain Link</a>
If you have an external link to someone else's site and you consider it trustworthy, then use:
    <a href="https://example.org/" rel="external">External Useful Link</a>
It depends on what you mean by "better". Those are two completely different attributes.
rel="nofollow" tells search engine crawlers not to follow this link (you probably don't want that for links to your other websites, but you would use it for links to other people's sites). Documentation: rel=nofollow - https://support.google.com/webmasters/answer/96569?hl=en
rel="external" indicates that the link is not part of the current website. By itself it does not open the link in a new window; it is typically combined with a small script so it can serve as an XHTML-valid alternative to target="_blank". Here you can learn how to use it: http://www.copterlabs.com/blog/easily-create-external-links-without-the-target-attribute/
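As a rough sketch of that kind of technique (the markup and script below are only an illustration, not the article's exact code), you can keep target="_blank" out of the markup and still open rel="external" links in a new window with a small script:

    <a href="https://example.com/" rel="external">Example external link</a>

    <script>
      // Open every link carrying rel="external" in a new window/tab,
      // without putting target="_blank" in the markup itself.
      document.querySelectorAll('a[rel~="external"]').forEach(function (link) {
        link.addEventListener('click', function (event) {
          event.preventDefault();
          window.open(link.href);
        });
      });
    </script>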
I have been advised by an SEO consultant to add the "google-site-verification" meta tag to every page of my site. This is to make sure that my pages are indexed by google.
However, I am reluctant to do this for a couple of reasons
1) My site is already verified using an alternative method of verification -by hosting a html verification file on the server.
2) I recall reading an article indicating that this meta tag does not impact crawling or page rank.
I do have some pages that are not indexed.
An example is
http://www.contractsforgeeks.com/TechJobs/Florida/Tampa.aspx
But I am making the assumption that adding this meta tag will not help the page get indexed.
Is there any value in adding the site verification meta tag to each page instead of uploading a single html verification file?
For example, what happens if I accidentally delete the verification file from my site (some time after the site has already been verified)? Does it need to be re-verified, or is the verification process a one-time deal? If it does need re-verifying, it may be safer to include the tag on each page (even though it does not help indexing).
One method is enough to verify your site. If you choose the HTML file method, you don't need to put the "google-site-verification" meta tag on every page.
Moreover, as you assumed, this meta tag doesn't help your site get indexed by Google. It doesn't impact crawling or PageRank.
If you want your site to be indexed, you can submit a sitemap.xml to Google Webmaster Tools and get more links from other sites pointing to yours.
And if you delete the verification HTML file from your site, you'll need to verify your site again; the verification process is not a one-time deal.
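For reference, the meta tag method looks like this; it only needs to be present on the page Google checks (usually the home page), and the content value shown here is a placeholder for the token Webmaster Tools gives you:

    <meta name="google-site-verification" content="YOUR-VERIFICATION-TOKEN">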
It does not help indexing. It does not help ranking. Its only purpose is to verify that you are the one claiming to be when registering at Google Webmaster Tools.
If you delete the verification, you'd need to verify your domain again. Otherwise it would still be possible to control a domain in GWT even though its owner had changed in the meantime.
If you need to argue against the use of the corresponding meta element, you could point out that in theory it could actually lower your ranking, because the extra markup makes every page slightly larger and Google prefers faster-loading pages (of course this would have no real, measurable effect).
I ran my website through a web tool that evaluates the SEO weight of page elements, and the report says that certain parts, like the description and other meta tags, are missing... Also, as a thumbnail of my site it shows a default server page. At the same time it shows the list of other pages that are linked from the index page.
I checked, and the tool's agent is not blocked in robots.txt.
Now, how can that be?
Demo
I think that the description issue is caused by the fact that you are using "META" instead of "meta" in your meta tags.
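If that tool is matching tag names case-sensitively, a lowercase tag such as the following should be picked up (the description text here is only an example):

    <meta name="description" content="A short summary of what this page is about.">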
There are many sites out there that can run similar tests on your site, such as the one you used. That site is just showing old data; you may want to submit your sitemap.xml to Bing and Google Webmaster Tools. If your site doesn't have a sitemap.xml file, you may want to consider creating one.
Is it possible to fine-tune directives to Google to such an extent that it will ignore part of a page, yet still index the rest?
There are a couple of different issues we've come across which would be helped by this, such as:
RSS feed/news ticker-type text on a page displaying content from an external source
users entering contact details (phone numbers etc.) who want them visible on the site but would rather they not be google-able
I'm aware that both of the above can be addressed via other techniques (such as writing the content with JavaScript), but am wondering if anyone knows if there's a cleaner option already available from Google?
I've been doing some digging on this and came across mentions of googleon and googleoff tags, but these seem to be exclusive to Google Search Appliances.
Does anyone know if there's a similar set of tags to which Googlebot will adhere?
Edit: Just to clarify, I don't want to go down the dangerous route of cloaking/serving up different content to Google, which is why I'm looking to see if there's a "legit" way of achieving what I'd like to do here.
What you're asking for can't really be done: Google either takes the entire page or none of it.
You could use some sneaky tricks, though, like putting the part of the page you don't want indexed in an iframe and using robots.txt to ask Google not to crawl the page that the iframe loads.
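A rough sketch of that trick, assuming the unwanted block is moved into its own document (the file and directory names here are placeholders):

In the main page:

    <iframe src="/noindex/news-ticker.html"></iframe>

In robots.txt:

    User-agent: *
    Disallow: /noindex/

Because Google can't crawl the iframe's source URL, the ticker text shouldn't end up attached to the page in the index.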
In short NO - unless you use cloaking, which is discouraged by Google.
Please check out the official documentation from here
http://code.google.com/apis/searchappliance/documentation/46/admin_crawl/Preparing.html
Go to section "Excluding Unwanted Text from the Index"
<!--googleoff: index-->
here will be skipped
<!--googleon: index-->
I found a useful resource for marking certain duplicate content so that search engines will not index it:
<p>This is normal (X)HTML content that will be indexed by Google.</p>
<!--googleoff: index-->
<p>This (X)HTML content will NOT be indexed by Google.</p>
<!--googleon: index-->
On your server, detect the search bot by IP address using PHP or ASP. Then serve the IP addresses that fall into that list a version of the page containing only what you wish to have indexed. In that search-engine-friendly version of the page, use the canonical link tag to point to the page version that you do not want to be indexed.
This way the full page is reachable only by its address, while only the content you wish to be indexed actually gets indexed. This method will not get you blocked by the search engines and is completely safe.
Yes, you can definitely stop Google from crawling some parts of your website by creating a custom robots.txt file and listing the portions you don't want crawled, such as /wp-admin/ or a particular post or page. Before creating it, check your site's existing robots.txt, for example at www.yoursite.com/robots.txt.
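A minimal sketch of such a robots.txt, with example paths standing in for whatever sections you want to keep crawlers out of:

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /private-section/

Keep in mind that robots.txt stops compliant crawlers from fetching those URLs; it doesn't remove pages that are already in the index.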
All search engines either index or ignore the entire page. The only possible way to implement what you want is to:
(a) have two different versions of the same page
(b) detect the browser or user agent making the request
(c) if it's a search engine, serve the second version of your page.
This link might prove helpful.
There are meta-tags for bots, and there's also the robots.txt, with which you can restrict access to certain directories.
I am a beginner web developer and I have a site, JammuLinks.com, built on PHP. It is a city local-listing search engine. Basically, I've written search pages which take in a parameter, fetch the matching records from the database, and display them, so the content is generated dynamically. However, if you look at the bottom of the site, I have added many static links where I have hard-coded the parameters in the link, like searchresult.php?tablename='schools'. So my questions are:
Since Google crawls the page and also the links listed on the page, will it crawl the results page data as well? How can I tell if it has? So far I tried site:www.jammulinks.com, but it returns only the homepage and the blog.
What more can I add to get the static result-page links indexed as well?
The best way to do this is to create a sitemap document (you can even get the template from the webmasters portion of Google's site, www.google.com/webmasters/, I believe).
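A minimal sitemap.xml for the kind of hard-coded result pages you describe might look like this (the URLs below are only placeholders based on your example links):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.jammulinks.com/</loc>
      </url>
      <url>
        <loc>http://www.jammulinks.com/searchresult.php?tablename=schools</loc>
      </url>
    </urlset>

Once it is in place, submit it in Google Webmaster Tools (you can also reference it from robots.txt with a Sitemap: line).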