hide text or div from crawlers [closed] - seo

Let's say I have this text:
<span class="hide">for real</span><h2 id='show'>Obama is rocking the house</h2>
<span class="hide">not real</span><h2 id='show'>Bill gates is buying stackoverflow</h2>
I need the crawler to read just these:
<h2 id='show'>Obama is rocking the house</h2>
<h2 id='show'>Bill gates is buying stackoverflow</h2>
Can we do that?
I'm a bit confused here. This question says that a hidden div is read by Google:
Does Google index pages with hidden divs?
But when I googled for a sec, I found that Google doesn't read hidden divs. So which is right?
What I have in mind is to obfuscate it, for example with CSS.
Or I could put my text in an image and output it using an image generator or something.

FYI, serving different content to users than to search engines is a violation of Google's terms of service and will get you banned if you're caught. Content that is hidden but can be accessed through some kind of trigger (a navigation menu link is hovered over, the user clicks on an icon to expand a content area, etc.) is acceptable. But in your example you are showing different content to search engines specifically for their benefit, and that is definitely what you don't want to do.

The best way to suggest that a web crawler not access content on your site is to create a robots.txt file. See http://robotstxt.org. There is no way to tell a robot not to access one part of a page.
If you are going to use CSS, remember that robots can still read CSS files! You could, though, list the CSS file in robots.txt to exclude it.
If you really must have indexed and non-indexed content on the same page, maybe you should use frames and have the non-indexed frame listed in the robots.txt file as not to be indexed.
Well behaved crawlers will follow the robots.txt guidance, e.g. Google, but naughty ones will not. So, there is no guarantee.
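To make the robots.txt suggestion above a bit more concrete, here is a minimal sketch that checks a rule set with Python's standard urllib.robotparser. The file paths and the example.com host are made up for illustration, and this only shows what a well-behaved crawler would conclude from the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: these paths are invented for the example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /hidden-frame.html
Disallow: /css/hide.css
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The main page stays crawlable; the framed page and the CSS file do not.
for path in ("/index.html", "/hidden-frame.html", "/css/hide.css"):
    allowed = parser.can_fetch("Googlebot", "http://example.com" + path)
    print(path, "allowed" if allowed else "disallowed")
```

Note that this only models crawlers that honour robots.txt; as said above, badly behaved ones can ignore it.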

I can confirm that Google does read the hidden div, even though it's not showing up in the search results.
The reason I know: I admin a website that has backlinks on a highly respected non-profit's site. As the non-profit doesn't want to show up in search results for a company website, they hide the links.
However, if I check Google's Webmaster Tools, I can see the backlinks from this non-profit.


I have a network of about 200 blogs (Wordpress Multisite), and all of them show links to all the other ones on a sidebar on the right hand side (basically 200+ links on the right hand side of every single page). I have it set to rel="nofollow" now, but I was wondering if changing it to rel="noindex, nofollow" would be a good idea?
Thank you for any input.
nofollow only means that a bot should not follow this link. If you are concerned only about Google (as your tag suggests) this will probably be of help:
How does Google handle nofollowed links?
In general, we don't follow them. This means that Google does not
transfer PageRank or anchor text across these links. Essentially,
using nofollow causes us to drop the target links from our overall
graph of the web. However, the target pages may still appear in our
index if other sites link to them without using nofollow, or if the
URLs are submitted to Google in a Sitemap. Also, it's important to
note that other search engines may handle nofollow in slightly
different ways.
However, adding this attribute is in no way a hard restriction: there is no standard, and some bots may ignore it altogether. Also, search engines may still flag the page as a link-building site depending on the content/link ratio.
noindex is not used in links by Google (I do not know about others). It is meant for the robots <meta> element in the HTML head and applies to the whole page. So it is most likely of no use to you. Example:
<meta name="robots" content="noindex"/>
200 links are however not very user-friendly either. You should seriously consider reducing the number of links by (for example) selecting those that have a similar topic.
As you read this, look to the right, yes, here on Stack Overflow, there is a "Box" titled Related. This is how you do it. Imagine them putting every single topic ever created in there... Not very useful.
Also if you do this with some logic like I suggested above and not just randomly selecting N links from the list, you can probably remove the nofollow, since the links will become useful and Google likes useful links.
You could then also add a "spotlight" for low-traffic sites (those would probably need the nofollow though)
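If you want to check what a page actually tells crawlers, a rough sketch like the following can help. It uses only Python's standard html.parser to read the page-level robots meta value and the rel tokens on each link. The sample markup and the blog URLs are invented for illustration.

```python
from html.parser import HTMLParser


class RobotsHintParser(HTMLParser):
    """Collects the robots <meta> value and the rel tokens of every link."""

    def __init__(self):
        super().__init__()
        self.meta_robots = None  # content of <meta name="robots">, if any
        self.links = []          # list of (href, set of rel tokens)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.meta_robots = attrs.get("content")
        elif tag == "a" and "href" in attrs:
            rel = set((attrs.get("rel") or "").lower().split())
            self.links.append((attrs["href"], rel))


# Hypothetical page: one nofollowed sidebar link and one normal link.
SAMPLE = """
<html><head><meta name="robots" content="noindex"/></head>
<body>
<a href="http://blog1.example.com/" rel="nofollow">Blog 1</a>
<a href="http://blog2.example.com/">Blog 2</a>
</body></html>
"""

hints = RobotsHintParser()
hints.feed(SAMPLE)
print("page-level robots:", hints.meta_robots)
for href, rel in hints.links:
    print(href, "nofollow" if "nofollow" in rel else "followed")
```

The point of the split is the one made above: noindex lives on the page, nofollow lives on the individual link.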

Schema.org siteNavigationElement

I'm having trouble getting the Webmaster Tools rich snippet testing tool to properly return markup for schema.org's WebPageElement types.
Does anyone have a site that hosts this markup?
I'm looking for solutions for a website that has undesirable snippets returned on Google search. The website is an interactive library of slide presentations, with an advanced search function.
Many different search pages on this site are being dropped from the Google index every week. The snippet returned on these pages includes the navigation menu. There is no h1 tag and the first line of the navigation menu is in bold, so Google is identifying the menu as the main content of the page and returning this info in the search results.
I need Google to put the actual page content in the search results, to increase click through rate and resolve a probable duplicate content issue.
I thought it would be good to put an h1 tag on the site, and add schema for WebPageElement, SiteNavigationElement, WPHeader, WPFooter, and WebPage.
Does anyone have examples of this markup on their site?
In the past I've used the rich snippet tool and had it return an error, and in every instance I found that my code did indeed contain an error, so I don't think it's the tool.
I have implemented several of the schema.org WebPageElement types in http://gamesforkidsfree.net/en/ including siteNavigationElement
You can check how it is being recognized by Google in Rich Snippets Testing Tool.
Also in Google Webmaster Tools, there is a section to check this kind of markup at "Optimization / Structured Data", for this case it shows:
Type                   Schema      Items    # Pages
ItemPage               schema.org  109,657  6,866
WPAdBlock              schema.org  20,727   6,973
SiteNavigationElement  schema.org  7,350    7,322
WPHeader               schema.org  7,319    7,319
WPFooter               schema.org  7,319    7,319
WebPage                schema.org  649      649
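Before waiting for Webmaster Tools to report numbers like the ones above, you can get a rough local count of which schema.org types a page declares. This is only a sketch using Python's standard html.parser; the sample markup is invented, and it says nothing about how Google itself interprets the markup.

```python
from collections import Counter
from html.parser import HTMLParser


class ItemTypeCounter(HTMLParser):
    """Counts microdata itemtype values by their short type name."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "itemtype" and value:
                # itemtype may hold several space-separated URLs
                for url in value.split():
                    self.counts[url.rstrip("/").rsplit("/", 1)[-1]] += 1


# Hypothetical page skeleton using the WebPageElement types from the question.
SAMPLE = """
<body itemscope itemtype="http://schema.org/WebPage">
  <header itemscope itemtype="http://schema.org/WPHeader"><h1>Slides</h1></header>
  <nav itemscope itemtype="http://schema.org/SiteNavigationElement">
    <a href="/search">Search</a>
  </nav>
  <footer itemscope itemtype="http://schema.org/WPFooter">Footer</footer>
</body>
"""

counter = ItemTypeCounter()
counter.feed(SAMPLE)
for type_name, count in sorted(counter.counts.items()):
    print(type_name, count)
```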
Regarding duplicate content you can have a look at one of the many Google support pages about canonicalization (isn't that duplicate content? :) e.g. canonicalization -> hints.
It would be easier to answer if you could show the actual website or a SERP screenshot. By the way I don't think that your problem can be solved using that kind of markup since there is no evidence that Google supports it even if Schema.org is a Google initiative.
From what I understand, you have two different kinds of issues:
Bad search snippets. Google shows in the search snippet a fragment of the on-page text that is relevant to the user query. So what you see in the search snippet largely depends on the query you typed in the search box. If you see a piece of the navigation menu in the snippets, it could be that there is no relevant text in the indexed page, so Google does not have anything better to show than the text in the navigation menu.
Search pages being dropped from the Google index. This is a different, and more serious, problem. Are those "search pages" a good and relevant result compared to the other pages ranking for the query you are typing? Is the main topic of the page clear and explicit (remember that sometimes you need to spoon-feed the search engines)? I'm giving you more questions than answers but, as I stated before, it is not easy to diagnose an SEO problem without seeing the web site.
All the above being said, Google does show breadcrumbs in its SERPs when you define them, and schema.org as a whole is being made by the search engine giants, so implementing it ensures some level of better understanding of your page by the bots. Search engines do not tell you everything they do, but if you follow the main standards they produce together, you pretty much ensure yourself good content availability within the SERPs.
You shouldn't count much on the impact from that though.
I suggest you focus mainly on pretty urls, canonical usage, title, description and proper implementation of schema.org itemprop for your main content type on the inner pages as well as H1 for your title.
Also try to render your main content as high as possible within the html and avoid splitting your title, summary and image… best case scenario they should be close to each other with H1, IMG and P elements and not be divided by divs, tables and so on.
You can have a look at this site http://svejo.net/1792774-protsesat-na-tsifrovizatsiya-v-balgariya-zapochva
It has a pretty good SEO on its article pages and shows up quite nicely and often in SERPs because of its on-page SEO.
I hope this helps you.

I’m building a webshop that sells tires. I think it would be best user-friendly-wise to hide my products behind a search form, where you can select tire dimension, price range etc.
I’ve been told that Google will never submit a form, when crawling a site, so if I “hide” the products by using a form, does Google ever index my products?
If no, how do I best work around this? I’ve been thinking about doing a regular menu with category submenus (By brand, price range, speed limit etc.), so that Google can crawl my links and then replace the menu with a form using javascript. Then Google will crawl the links and the user will browse by form. But if I have 3000 products, could it cause duplicate content, flag for link spam (if there is such a thing) etc. ?
If the only way to find your products is to complete and submit a form then, no, neither Google nor any other search engine will be able to find and index that content. To get around this you have a few options:
Have an HTML sitemap on your site that also links to your products. Besides being a good way to generate internal links with good anchor text, it also allows search engines an alternative means to find that content.
Submit an XML sitemap. This is similar to an HTML sitemap except it is in XML and is not shown to visitors.
Use progressive enhancement and have a menu available to users who don't have JavaScript turned on. Then using JavaScript recreate your form functionality (assuming this increases usability).
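For the XML sitemap option, a minimal sketch of generating one with Python's standard xml.etree.ElementTree might look like this. The namespace is the one defined by the sitemaps.org protocol; the product URLs are made up, and a real shop would pull them from its product database.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical product pages.
sitemap_xml = build_sitemap([
    "http://mytirestore.com/firestone/r78-13",
    "http://mytirestore.com/firestone/r78-14",
])
print(sitemap_xml)
```

Each product gets exactly one entry, so the sitemap also doubles as a list of the URLs you consider the main ones.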
You shouldn't run into any duplicate content issues unless you can get to the same product using more than one URL. None of the above should cause that to happen. But if the way you implemented your products can cause this, just use canonical URLs to identify the main URL. Then if the search engines see multiple pages for the same content, they know which one is the main page and include it in their search results.
To avoid any on-site duplicate content issues, you can use the canonical tag to indicate the primary content page. This works quite well for ecommerce sites where there are often multiple ways to reach a product listing.
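As a rough illustration of the canonical tag, the sketch below maps several URL variants of one product page to a single canonical link element. It assumes that the query string (sorting, tracking parameters and so on) never changes which product is shown; if your URLs carry the product id in the query string, this simplification does not apply.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url: str) -> str:
    """Build a <link rel="canonical"> tag, dropping query string and fragment.

    Assumption: the path alone identifies the product.
    """
    parts = urlsplit(url)
    canonical = urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))
    return '<link rel="canonical" href="{}"/>'.format(escape(canonical, quote=True))


# Hypothetical variants that all show the same tire.
variants = [
    "http://mytirestore.com/firestone/r78-13",
    "http://mytirestore.com/firestone/r78-13?sort=price",
    "http://MyTireStore.com/firestone/r78-13?ref=menu#reviews",
]
tags = {canonical_link(u) for u in variants}
print(tags)
```

All three variants end up with the same tag, which is what tells the search engine which page is the main one.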
Another way that having a separate page for each product helps SEO is that it gives your visitors a good link to share via social networking, forums, blogs and so forth. So, instead of sharing something like mytirestore.com?q=24892489892489248&p=824824 they can share something like mytirestore.com/firestone/r78-13. This kind of keyword-targeted external link will also work wonders for your SEO for product-specific keywords.
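A shareable URL like the one above can be derived from the product data itself. Here is a small sketch; the slugify and product_url helpers are hypothetical names, and the rule (lowercase, non-alphanumerics collapsed to hyphens) is just one common convention.

```python
import re


def slugify(text: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def product_url(base: str, brand: str, model: str) -> str:
    """Build a keyword-bearing product URL such as /firestone/r78-13."""
    return "{}/{}/{}".format(base.rstrip("/"), slugify(brand), slugify(model))


print(product_url("http://mytirestore.com", "Firestone", "R78-13"))
print(product_url("http://mytirestore.com/", "Some Brand", "All Season 205/55"))
```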

In SEO people talk a lot about Google PageRank. It's kind of a catch 22 because until your site is actually big and you don't really need search engines as much, it's unlikely that big sites will link to you and increase your PageRank!
I've been told that it's easiest to simply get a couple of high-quality links to point to a site to raise its PageRank. I've also been told that there are certain open directories like dmoz.org that Google pays special attention to (since they're human-managed links). Can anyone speak to the validity of this or suggest another site/technique to increase a site's PageRank?
Have great content
Nothing helps your google rank more than having content or offering a service people are interested in. If your web site is better than the competition and solves a real need you will naturally generate more traffic and inbound links.
Keep your content fresh
Use friendly url's that contain keywords
Good: http://cars.com/products/cars/ford/focus/
Bad: http://cars.com/p?id=1232
Make sure the page title is relevant and well constructed
For example: Buy A House In France :. Property Purchasing in France
Use a domain name that describes your site
Good: http://cars.com/
Bad: http://somerandomunrelateddomainname.com/
Type car into Google: out of the top 5 links, 4 have car in the domain: http://www.google.co.uk/search?q=car
Make it accessible
Make sure people can read your content. This includes a variety of different audiences:
People with disabilities: sight, motor, cognitive disabilities etc.
Search bots
In particular make sure search bots can read every single relevant page on your site. Quite often search bots get blocked by the use of javascript to link between pages or the use of frames / flash / silverlight. One easy way to do this is have a site map page that gives access to the whole site, dividing it into categories / sub categories etc..
Down level browsers
Submit your site map automatically
Most search engines allow you to submit a list of pages on your site including when they were last updated.
Google: https://www.google.com/webmasters/tools/docs/en/about.html
Inbound links
Generate as much buzz about your website as possible, to increase the likelihood of people linking to you. Blog / podcast about your website if appropriate. List it in online directories (if appropriate).
Google Search Engine Ranking Factors, by an SEO company
Creating a Google-friendly site: Best practices
Wikipedia - Search engine optimization
Good content.
Update it often.
Read and digest everything at Creating a Google-friendly site: Best practices.
Be active on the web. Comment in blogs, correspond genuinely with people, in email, im, twitter.
I'm not too sure about the domain name. Wikipedia? What does that mean? Mozilla? What word is that? Google? Was a typo. Yahoo? Sounds like that chocolate drink Yoohoo.
Trying to keyword the domain name shoehorns you anyway. And it can be construed as a SEO technique in the future (if it isn't already!)
Answer all email. Answer blog comments. Be nice and helpful.
Go watch garyvee's Better Than Zero. That'll motivate you.
If it's appropriate, having a blog is a good way of keeping content fresh, especially if you post often. A CMS would be handy too, as it reduces the friction of updating. The best way would be user-generated content, as other people make your site bigger and updated, and they may well link to their content from their other sites.
Google doesn't want you to have to engineer your site specifically to get a good PageRank. Having popular content and a well designed website should naturally get you the results you want.
An easy trick is to use
Google Webmaster Tools: https://www.google.com/webmasters/tools
And you can generate a sitemap using http://www.xml-sitemaps.com/
Then, don't forget to use www.google.com/analytics/
And be careful: most SEO guides are not correct, and playing fair is not always the best approach. For example, everyone says that spamming .edu sites is bad and ineffective, but it is effective.

At work, we have a dedicated SEO Analyst whose job is to pore over lots of data (KeyNote/Compete etc.) and generate fancy reports for the executives so they can see how we are doing against our competitors in organic search ranking. He also leads initiatives to improve the SEO rankings on our sites by optimizing things as best we can.
We also have a longstanding mission to decrease our page load time, which right now is pretty shoddy on some pages.
The SEO guy mentioned that semantic, valid HTML gets more points from crawlers than jumbled messy HTML. I've been working on a real time HTML compressor that will decrease our page sizes by a pretty good chunk. Will compressing the HTML hurt us in site rankings?
I would suggest using compression at the transport layer, and eliminating whitespace from the HTML, but not sacrificing the semantics of your markup in the interest of speed. In fact, the better you "compress" your markup, the less effective the transport layer compression will be. Or, to put it a better way, let the gzip transfer-coding slim your HTML for you, and pour your energy into writing clean markup that renders quickly once it hits the browser.
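To see why whitespace stripping matters less once gzip is in play, here is a rough experiment using only Python's standard gzip and re modules. The page is synthetic and very repetitive, and collapse_whitespace is a deliberately naive stand-in for an HTML compressor (it would be unsafe around pre, textarea or inline scripts), so treat the numbers as an illustration only.

```python
import gzip
import re


def collapse_whitespace(html: str) -> str:
    """Naive minifier: drop whitespace runs that sit between two tags."""
    return re.sub(r">\s+<", "><", html.strip())


# Synthetic, indented product list standing in for a real page.
page = "<ul>\n" + "".join(
    '    <li class="item">\n        <a href="/p/{0}">Product {0}</a>\n    </li>\n'.format(i)
    for i in range(200)
) + "</ul>\n"
minified = collapse_whitespace(page)

sizes = {
    "raw": len(page.encode()),
    "minified": len(minified.encode()),
    "raw+gzip": len(gzip.compress(page.encode())),
    "minified+gzip": len(gzip.compress(minified.encode())),
}
for label, size in sizes.items():
    print("{:<14} {:>6} bytes".format(label, size))
```

On input like this, the bytes saved by minifying shrink a lot once both versions are gzipped, because gzip already encodes repeated indentation very cheaply.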
Compressing HTML should not hurt you.
When you say HTML compressor I assume you mean a tool that removes whitespace etc. from your pages to make them smaller, right? This doesn't impact how a crawler will see your HTML, as it likely strips the same things from the HTML when it grabs the page from your site. The 'semantic' structure of the HTML exists whether compressed or not.
You might also want to look at:
Compressing pages with GZIP compression in the web server
Reducing size of images, CSS, javascript etc
Considering how the browser's layout engine loads your pages.
By jumbled HTML, this SEO person probably means the use of tables for layout and re-purposing of built-in HTML elements (e.g. <p class="headerOne">Header 1</p>). This increases the ratio of HTML tags to page content, which lowers keyword density in SEO terms. It has bigger problems though:
Longer page load times due to increased content to download, why not use the H1 tag?
It's difficult for screenreaders to understand and affects site accessibility.
Browsers may take longer to render the content depending on how they parse and layout pages with styles.
I once retooled a messy tables-for-layout to xhtml 1.0 transitional and the size went from 100kb to 40kb. The images loaded went from 200kb to just 50kb.
The reason I got such a large savings was because the site had all the JS embedded in every page. I also retooled all the JS so it was correct for both IE6 and FF2. The images were also compiled down to an image-map. All the techniques were well documented on A List Apart and easy to implement.
Use gzip compression to compress the HTML in the transport stage, then just make sure that code validates and that you are using logical tags for everything.
The SEO guy mentioned that semantic, valid HTML gets more points by crawlers than jumbled messy HTML.
If an SEO guy ever tries to provide a fact about SEO, tell him to provide a source, because to the best of my knowledge that is simply untrue. If the content is there it will be crawled. It is a common urban myth amongst SEO analysts.
However, the use of header tags is recommended. <H1> tags for the page title and <H2> for main headings, then lower down for lower headings.
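A quick way to see whether a page really uses header tags, rather than styled paragraphs, is to print its heading outline. The sketch below uses Python's standard html.parser; the sample markup is invented, and it does not claim anything about how a crawler weights headings.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects (level, text) for every h1..h6 element in document order."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


# Hypothetical page: the styled <p> is not a heading and is not picked up.
SAMPLE = """
<H1>Page title</H1>
<p class="headerOne">Looks like a header, but is not one</p>
<h2>First main heading</h2>
<h3>Sub heading</h3>
"""

outline = HeadingOutline()
outline.feed(SAMPLE)
for level, text in outline.outline:
    print("  " * (level - 1) + text)
```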
I've been working on a real time HTML compressor that will decrease our page sizes by a pretty good chunk. Will compressing the HTML hurt us in site rankings?
If it can be read on the client side without problem then it is perfectly fine. If you want to look up any of this, I recommend anything referencing Matt Cutts or the following post.
FAQ: Search Engine Optimisation
Using compression does not hurt your page ranking. Matt Cutts talks about this in his article on Crawl Caching Proxy
Your page load time can also be greatly improved by resizing your images. While you can use the height and width attributes in the img tag, this does not change the size of the image that is downloaded to the browser. Resizing the images before putting them on your pages can reduce the load time by 50% or more, depending on the number and type of images that you're using.
Other things that can improve your page load time are:
Use web standards/CSS for layout instead of tables
If you copy/paste content from MS Word, strip out the extra tags that Word generates
Put CSS and JavaScript in external files, rather than embedded in the page. This helps when users visit more than one page on your site, because browsers typically cache these files
This Web Page Analyzer will give you a speed report that shows how long different elements of your page take to download.
First, check your code: it should validate against the W3C standards for HTML and CSS.