would lazy-loading img src negatively impact SEO - seo

I'm working on a shopping site. We display 40 images in our results. We're looking to reduce the onload time of our page, and since images block the onload event, I'm considering lazy loading them by initially setting img.src="" and then setting it after onload. Note that this is not AJAX loading of HTML fragments: the image HTML, along with the alt text, is present; it's just the image src that is deferred.
Does anyone have any idea as to whether this may harm SEO or lead to a google penalty box now that they are measuring sitespeed?

Images don't block anything; they are already loaded asynchronously. The onload event notifies you that all of the content has been downloaded, including images, but that is long after the document is ready.
It might hurt your rank because of the lost keywords and empty src attributes. You'll probably lose more than you gain - you're better off optimizing your page in other ways, including your images. Gzip + fewer requests + proper expires headers + a fast static server should go a long way. There is also a free CDN that might interest you.
I'm sure Google doesn't mean for the whole web to remove their images from the source code to gain a few points. And keep in mind that they consider anything under 3s to be a good loading time; there's plenty of room to wiggle before resorting to voodoo techniques.

From a pure SEO perspective, you shouldn't be indexing search result pages. You should index your home page and your product detail pages, and have a spiderable method of getting to those pages (category pages, sitemap.xml, etc.)
Here's what Matt Cutts has to say on the topic, in a post from 2007:
In general, we’ve seen that users usually don’t want to see search results (or copies of websites via proxies) in their search results. Proxied copies of websites and search results that don’t add much value already fall under our quality guidelines (e.g. “Don’t create multiple pages, subdomains, or domains with substantially duplicate content.” and “Avoid “doorway” pages created just for search engines, or other “cookie cutter” approaches…”), so Google does take action to reduce the impact of those pages in our index.
http://www.mattcutts.com/blog/search-results-in-search-results/
This isn't to say that you're going to be penalised for indexing the search results, just that Google will place little value on them, so lazy-loading the images (or not) won't have much of an impact.

There are some different ways to approach this question.
Images don't block load. JavaScript does; stylesheets do to an extent (it's complicated); images do not. However, they will consume HTTP connections, of which the browser will only open two per domain at a time.
So, what you can do that should be worry-free and the "Right Thing" is a poor man's CDN: just serve the images from www1, www2, www3, etc. on your own site and servers. There are a number of ways to do that without much difficulty.
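For instance, a rough sketch of that kind of sharding (the hostnames and file names are placeholders; each subdomain would need to serve the same files):
<!-- spread the 40 result images across a few hostnames so the browser can open more parallel connections -->
<img src="http://www1.example.com/images/product-01.jpg" alt="red leather boot">
<img src="http://www2.example.com/images/product-02.jpg" alt="blue canvas sneaker">
<img src="http://www3.example.com/images/product-03.jpg" alt="brown suede loafer">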
On the other hand: no, it shouldn't affect your SEO. I don't think Google even bothers to load images, actually.

We display 40 images in our results.
First question: is this page even a landing page? Is it targeted at a specific keyword? Internal search result pages are not automatically landing pages. If they are not landing pages, then do whatever you want with them (and make sure they do not get indexed by Google).
If they are landing pages (pages targeted at a specific keyword), the performance of the site is indeed important - for the conversion rate of these pages, and indirectly (and to a smaller extent also directly) for Google as well. So some kind of lazy-load logic for pages with a lot of images is a good idea.
I would go for:
Load the first two (product?) images in an SEO-optimized way (as normal HTML, with targeted alt text and a targeted filename). For the rest of the images, build a lazy-load logic - but don't just set the src to blank; insert the whole img tag onload (or onscroll, or whatever) into your code.
Having a lot of broken img tags in the HTML for non-JavaScript users (i.e. Google, old mobile devices, text browsers) is not a good idea. You will not get a penalty as long as the lazy-loaded images are not misleading, but shitty markup is never a good idea.
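A rough sketch of that approach (untested; the data-* attributes and file names are my own placeholders, not any particular library):
<!-- the first two product images stay as plain, crawlable HTML with targeted alt text -->
<img src="/img/red-boot-front.jpg" alt="red leather boot, front view">
<img src="/img/red-boot-side.jpg" alt="red leather boot, side view">
<!-- the remaining images exist only as placeholders until the page has loaded -->
<span class="deferred-img" data-src="/img/red-boot-back.jpg" data-alt="red leather boot, back view"></span>
<script>
window.addEventListener("load", function() {
  // build the real <img> tags and swap them in after onload
  document.querySelectorAll(".deferred-img").forEach(function(el) {
    var img = document.createElement("img");
    img.src = el.getAttribute("data-src");
    img.alt = el.getAttribute("data-alt");
    el.parentNode.replaceChild(img, el);
  });
});
</script>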
For general SEO questions please visit https://webmasters.stackexchange.com/ (Stack Overflow is more for programming-related questions).

I have to disagree with Alex. Google recently updated its algorithm to account for page load time. According to the official Google blog
...today we're including a new signal in our search ranking algorithms: site speed. Site speed reflects how quickly a website responds to web requests.
However, it is important to keep in mind that the most important aspect of SEO is original, quality content.
http://googlewebmastercentral.blogspot.com/2010/04/using-site-speed-in-web-search-ranking.html

I have added lazy loading to my site (http://www.amphorashoes.ro) and I get a better PageRank from Google (maybe because the content loads faster) :)

First, don't use src=""; it may hurt your page. Use a small loading image instead.
Second, I don't think it will affect SEO. We always use alt="imgDesc.." to describe the image, and the spider can pick up that alt text even though it can't tell what the image really is.
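For example, something like this (just a sketch; the placeholder path and the data-src attribute are made up):
<!-- a small loading placeholder instead of an empty src; the alt text stays for the spider -->
<img src="/img/loading.gif" data-src="/img/product-photo.jpg" alt="imgDesc..">
<script>
window.addEventListener("load", function() {
  // swap the placeholder for the real image once the page has loaded
  document.querySelectorAll("img[data-src]").forEach(function(img) {
    img.src = img.getAttribute("data-src");
  });
});
</script>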

I found this tweet regarding Google's SEO
There are various ways to lazy-load images, it's certainly worth thinking about how the markup could work for image search indexing (some work fine, others don't). We're looking into making some clearer recommendations too.
12:24 AM - 28 Feb 2018
John Mueller - Senior Webmaster Trends Analyst
From what I understand, it depends on how you implement your lazy loading, and Google has yet to recommend an approach that would be SEO-friendly.
Theoretically, Google should be running the scripts on websites, so it should be OK to lazy load. However, I can't find a source (from Google) that confirms this.
So it looks like crawling lazy-loaded or deferred images may not be foolproof yet. Here's an article I wrote about lazy loading, image deferring, and SEO that talks about it in detail.
Here's a working library that I authored which focuses on lazy loading or deferring images in an SEO-friendly way.
What it basically does is cancel the image loading when the DOM is ready and continue loading the images after the window load event.
...
<div>My last DOM element</div>
<script>
(function() {
  // cancel loading: stash each image's src in a data attribute and remove it
  document.querySelectorAll("img[src]").forEach(function(img) {
    img.setAttribute("data-src", img.getAttribute("src"));
    img.removeAttribute("src");
  });
})();
window.addEventListener("load", function() {
  // restore every stashed source once the window load event fires
  document.querySelectorAll("img[data-src]").forEach(function(img) {
    img.setAttribute("src", img.getAttribute("data-src"));
  });
}, false);
</script>
</body>
You can cancel loading of an image by removing its src value or replacing it with a placeholder image. You can test this approach with Google Fetch.
You have to make sure the correct src is in place until the DOM is ready, so that Google Fetch will capture your images' original src.

Related

Hiding a page part from Google, does it hurt SEO?

We all know that showing nonexistent stuff to Google bots is not allowed and will hurt your search positioning, but what about the other way around: showing stuff to visitors that is not displayed to Google bots?
I need to do this because I have photo pages, each with a short title, the photo, and a textarea containing the embed HTML code. Googlebot is taking the embed code and using it as the page description in its search results, which is very ugly.
Please advise.
When you start playing with tricks like that, you need to consider several things.
... showing stuff to visitors that are not displayed for Google bots.
That approach is a bit tricky.
You can certainly check User-agents to see if a visitor is Googlebot, but Google can add any number of new spiders with different User-agents, which will index your images in the end. You will have to constantly monitor that.
Each code release will have to be tested against the "images and Googlebot" scenario. That will extend the testing phase and its cost.
It can also affect future development: all changes will have to be made with the "images and Googlebot" scenario in mind, which can introduce additional constraints to your system.
Personally I would choose a bit different approach:
First of all review if you can use any methods recommended by Google. Google provides a few nice pages describing that problem e.g. Blocking Google or Block or remove pages using a robots.txt file.
If that is not enough, maybe restructuring your HTML would help. Consider using JavaScript to build some customer-facing interfaces.
And whatever you do, try to keep it as simple as possible, otherwise very complex solutions can turn around and bite you.
It is very difficult to give you good advice without knowledge of your system, constraints and strategy, but I hope my answer helps you choose a good architecture / solution for your system.
It sounds like you want more than that.
Google does not judge you to be cheating just because of this; it reviews the intent. As long as your purpose is the user experience and you avoid the common cheating tactics, Google does not consider it cheating.
Just block these pages with robots.txt and you'll be fine; it is not cheating - that's why they came up with a solution like that in the first place.
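For example, a minimal robots.txt along those lines (the /photos/ path is only a guess at where the photo pages live):
User-agent: *
Disallow: /photos/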

Crawling for Eternity

I've recently been building a new web app dealing with Recurring Events. These events can recur on a daily, weekly or monthly basis.
This all is working great. But when I started creating the Event Browser Page (which will be visible to the public internet) a thought came across my mind.
If a crawler hits this page, with a next and previous button to browse the dates, it will just continue forever, right? So I opted out of using generic HTML links and used AJAX, which means that bots will not be able to follow the links.
But this method means I'm losing that functionality for users without JavaScript. Or is the number of users without JavaScript too small to worry about?
Is there a better way to handle this?
I'm also very interested in how bots like the Google crawler detect black holes like these and what they do to handle them.
Add a nofollow robots meta tag to the page, or rel="nofollow" to the individual links you don't want crawled; you can also disallow those URLs in robots.txt. See the Robots Exclusion Standard.
You may still need to think about how to fend off ill-behaved bots which do not respect the standard.
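For example (a sketch only; the /events URL pattern is made up):
<!-- in the <head> of the date-browser page: don't index it, don't follow its links -->
<meta name="robots" content="noindex, nofollow">
<!-- or, per link: let the page be indexed but keep crawlers off the endless pager -->
<a href="/events?date=2011-06-02" rel="nofollow">Next day</a>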
Even a minimally functional web crawler requires a lot more sophistication than you might imagine, and the situation you describe is not a problem. Crawlers operate on some variant of a breadth-first search, so even if they do nothing to detect black holes, it's not a big deal. Another typical feature of web crawlers that helps is that they avoid fetching a lot of pages from the same domain in a short time span, because otherwise they would inadvertently be performing a DOS attack against any site with less bandwidth than the crawler.
Even though it's not strictly necessary for a crawler to detect black holes, a good one might have all sorts of heuristics to avoid wasting time on low-value pages. For instance, it may choose to ignore pages that don't have a minimum amount of English (or whatever language) text, pages that contain nothing but links, pages that seem to contain binary data, etc. The heuristics don't have to be perfect because the basic breadth-first nature of the search ensures that no single site can waste too much of the crawler's time, and the sheer size of the web means that even if it misses some "good" pages, there are always plenty of other good pages to be found. (Of course this is from the perspective of the web crawler; if you own the pages being skipped, it might be more of a problem for you, but companies like Google that run web crawlers are intentionally secretive about the exact details of things like that because they don't want people trying to outguess their heuristics.)

How can I overcome the SEO implications of rendering content using client-side JS with FireBase?

I'm interested in using FireBase as a data-store for the creation of largely traditional, occasionally updated websites and am concerned about the SEO implications of rendering content using client-side JavaScript.
I know Google has made headway into indexing some JavaScript content, but am wondering what my best course of action is. I know I have some options:
Render content using 100% client-side JS, and probably suffer some indexing trouble
Build static HTML files on the server side (using Node, most likely) and serve them instead
First, I'm not sure how bad the problem actually is when doing everything client-side (am I solving something that actually needs to be solved?). And second, I just wonder if I'm missing some other obvious way to approach this.
Unfortunately, rendering data on the client-side generally makes it difficult to do SEO. Firebase is really intended for use with dynamic data, such as user account info, game data, etc, where SEO is not a goal.
That being said there are a few things you can do to optimize for SEO. First, you can render as much of your site as possible at compile time using a templating tool like mustache. This is what we did on the Firebase.com website (the entire site is static except for the tutorial and examples).
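A minimal sketch of that kind of compile-time rendering with mustache (file names and data are invented; this shows the general shape, not how Firebase.com actually builds its pages):
// build.js - render static HTML at build time so crawlers see real content
var fs = require("fs");
var Mustache = require("mustache");

var template = fs.readFileSync("product.mustache", "utf8");
var data = { title: "Red leather boot", price: "$120" };

// writes a plain HTML file that can be served to users and crawlers alike
fs.writeFileSync("product.html", Mustache.render(template, data));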
Second, if your app uses hash fragments in the URL for navigation (anything after the "#!"), you can provide a separate set of static or server-generated pages that correspond to your dynamic pages so that crawlers can read the data. Google has a spec for doing this, which you can see here:
https://developers.google.com/webmasters/ajax-crawling/docs/specification
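In short, that scheme (since deprecated by Google) maps the hash fragment onto a query parameter that your server answers with a static HTML snapshot, roughly:
Pretty URL the user sees:   http://example.com/#!/products/42
URL the crawler requests:   http://example.com/?_escaped_fragment_=/products/42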

SEO Superstitions: Are <script> tags really bad? [closed]

We have an SEO team at my office, and one of their dictums is that having lots of <script> blocks inline with the HTML is apocalyptically bad. As a developer that makes no sense to me at all. Surely the Google search engineers, who are the smartest people on the planet, know how to skip over such blocks?
My gut instinct is that minimizing script blocks is a superstition that comes from the early ages of search engine optimizations, and that in today's world it means nothing. Does anyone have any insight on this?
per our SEO guru, script blocks (especially those that are in-line, or occur before actual content) are very, very bad, and make the google bots give up before processing your actual content. Seems like bull to me, but I'd like to see what others say.
It's been ages since I've played the reading-Google's-tea-leaves game, but there are a few reasons your SEO expert might be saying this:
Three or four years back there was a bit of conventional wisdom floating around that the search engine algorithms would give more weight to search terms that happened sooner in the page. If all other things were equal on Pages A and B, if Page A mentions widgets earlier in the HTML file than Page B, Page A "wins". It's not that Google's engineers and PhD employees couldn't skip over the blocks, it's that they found a valuable metric in their presence. Taking that into account, it's easy to see how unless something "needs" (see #2 below) to be in the head of a document, an SEO obsessed person would want it out.
The SEO people who aren't offering a quick fix tend to be proponents of well-crafted, validating/conforming HTML/XHTML structure. Inline JavaScript, particularly the kind web-ignorant software engineers tend to favor, makes these people (I'm one) seethe. The bias against script tags themselves could also stem from some of the work Yahoo and others have done in optimizing Ajax applications (don't make the browser parse JavaScript until it has to). Not necessarily directly related to SEO, but a best practice a white hat SEO type will have picked up.
It's also possible you're misunderstanding each other. Content that's generated by Javascript is considered controversial in the SEO world. It's not that Google can't "see" this content, it's that people are unsure how its presence will rank the page, as a lot of black hat SEO games revolve around hiding and showing content with Javascript.
SEO is at best Kremlinology and at worst a field that the black hats won over a long time ago. My free unsolicited advice is to stay out of the SEO game, present your managers with estimates of how long it will take to implement their SEO-related changes, and leave it at that.
There are several reasons to avoid inline/internal JavaScript:
HTML is for structure, not behavior or style. For the same reason you should not put CSS directly in HTML elements, you should not put JS there.
If your client does not support JS, you just pushed a lot of junk. Wasted bandwidth.
External JS files are cached. That saves some bandwidth.
You'll have decentralized JavaScript. That leads to code repetition and all the known problems that come with it.
I don't know about the SEO aspect of this (because I never can tell the mumbo jumbo from the real deal). But as Douglas Crockford pointed out in one of his JavaScript webcasts, the browser always stops to parse the script at each script element. So, if possible, I'd rather deliver the whole document and enhance the page as late as possible with scripts anyway.
Something like
<head>
--stylesheets--
</head>
<body>
Lorem ipsum dolor
...
...
<script src="theFancyStuff.js"></script>
</body>
I've read in a few places that Google's spiders only index the first 100KB of a page. 20KB of JS at the top of your page would mean 20KB of content later on that Google wouldn't see, etc.
Mind you, I have no idea if this fact is still true, but when you combine it with the rest of the superstition/rumors/outright quackery you find in the dark underbelly of SEO forums, it starts to make a strange sort of sense.
This is in addition to the fact that inline JS is a Bad Thing with respect to the separation of presentation, content, and behavior, as mentioned in other answers.
Your SEO guru is slightly off the mark, but I understand the concern. This has nothing to do with whether or not the practice is proper, or whether or not a certain number of script tags is looked upon poorly by Google, but everything to do with page weight. Google stops caching after (I think) 150KB. The more inline scripts your page contains, the greater the chance important content will not be indexed because those scripts added too much weight.
I've spent some time working on search engines (not Google), but have never really done much from an SEO perspective.
Anyway, here are some factors which Google could reasonably use to penalise a page, all of which go up when you include big blocks of JavaScript inline:
Overall page size.
Page download time (a mix of page size and download speed).
How early in the page the search terms occurred (might ignore script tags, but that's a lot more processing).
Script tags with lots of inline JavaScript might be interpreted to be bad on their own. If users frequently loaded a lot of pages from the site, they'd find it much faster if the script was in a single shared file.
I would agree with all of the other comments, but would add that when a page has more than just <p> tags around the content, you are putting your faith in Google to interpret the mark-up correctly, and that is always a risky thing to do. Content is king, and if Google can't read the content perfectly then it's just another reason for Google not to show you the love.
This is an old question, but still pretty relevant!
In my experience, script tags are bad if they cause your site to load slowly. Site speed actually does have an impact on your appearance in SERPs, but script tags in and of themselves aren't necessarily bad for SEO.
Lots of SEO activities are not recommended by search engines. You can use the <script> tag, just not excessively. Even the Google Analytics snippet is code in a <script> tag.

HTML Compression and SEO? [closed]

At work, we have a dedicated SEO analyst whose job is to pore over lots of data (KeyNote/Compete etc.) and generate fancy reports for the executives so they can see how we are doing against our competitors in organic search ranking. He also leads initiatives to improve the SEO rankings on our sites by optimizing things as best we can.
We also have a longstanding mission to decrease our page load time, which right now is pretty shoddy on some pages.
The SEO guy mentioned that semantic, valid HTML gets more points from crawlers than jumbled, messy HTML. I've been working on a real-time HTML compressor that will decrease our page sizes by a pretty good chunk. Will compressing the HTML hurt us in site rankings?
I would suggest using compression at the transport layer, and eliminating whitespace from the HTML, but not sacrificing the semantics of your markup in the interest of speed. In fact, the better you "compress" your markup, the less effective the transport layer compression will be. Or, to put it a better way, let the gzip transfer-coding slim your HTML for you, and pour your energy into writing clean markup that renders quickly once it hits the browser.
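As a rough illustration of the transport-layer option (the choice of Node/Express here is mine, purely for the sketch; any web server can gzip):
// server.js - let the web server gzip responses instead of hand-minifying markup
var express = require("express");
var compression = require("compression");

var app = express();
app.use(compression());             // gzip/deflate each response on the fly
app.use(express.static("public"));  // serve the clean, semantic HTML as-is

app.listen(8080);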
Compressing HTML should not hurt you.
When you say HTML compressor I assume you mean a tool that removes whitespace etc. from your pages to make them smaller, right? This doesn't impact how a crawler sees your HTML, as it likely strips the same things when it grabs the page from your site. The 'semantic' structure of the HTML exists whether compressed or not.
You might also want to look at:
Compressing pages with GZIP compression in the web server
Reducing the size of images, CSS, JavaScript, etc.
Considering how the browser's layout engine loads your pages.
By jumbled HTML, this SEO person probably means the use of tables for layout and the re-purposing of built-in HTML elements (e.g. <p class="headerOne">Header 1</p> - see the small sketch after this list). This increases the ratio of HTML tags to page content, or keyword density in SEO terms. It has bigger problems though:
Longer page load times due to the extra content to download; why not just use the H1 tag?
It's difficult for screen readers to understand and affects site accessibility.
Browsers may take longer to render the content depending on how they parse and lay out pages with styles.
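A quick before/after sketch of that point (class name taken from the example above):
<!-- re-purposed element: extra markup, and no semantic weight for crawlers -->
<p class="headerOne">Header 1</p>
<!-- semantic version: less markup, and crawlers and screen readers know it's a heading -->
<h1>Header 1</h1>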
I once retooled a messy tables-for-layout site to XHTML 1.0 Transitional and the page size went from 100kb to 40kb. The images loaded went from 200kb to just 50kb.
The reason I got such large savings was that the site had all the JS embedded in every page. I also retooled all the JS so it was correct for both IE6 and FF2. The images were also compiled down to an image map. All the techniques were well documented on A List Apart and easy to implement.
Use gzip compression to compress the HTML in the transport stage, then just make sure that code validates and that you are using logical tags for everything.
The SEO guy mentioned that semantic, valid HTML gets more points from crawlers than jumbled, messy HTML.
If an SEO guy ever tries to provide a fact about SEO, tell him to provide a source, because to the best of my knowledge that is simply untrue. If the content is there it will be crawled. It is a common urban myth amongst SEO analysts that just isn't true.
However, the use of heading tags is recommended: <h1> for the page title, <h2> for main headings, and then lower levels for lower headings.
I've been working on a real-time HTML compressor that will decrease our page sizes by a pretty good chunk. Will compressing the HTML hurt us in site rankings?
If it can be read on the client side without problems then it is perfectly fine. If you want to look up any of this, I recommend anything referencing Matt Cutts or the following post.
FAQ: Search Engine Optimisation
Using compression does not hurt your page ranking. Matt Cutts talks about this in his article on the Crawl Caching Proxy.
Your page load time can also be greatly improved by resizing your images. While you can use the height and width attributes in the img tag, this does not change the size of the image that is downloaded to the browser. Resizing the images before putting them on your pages can reduce the load time by 50% or more, depending on the number and type of images that you're using.
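To illustrate (file names invented):
<!-- the browser still downloads the full-size original and merely scales it down -->
<img src="/img/product-original.jpg" width="200" height="150" alt="product">
<!-- serving a file that is actually 200x150 saves the extra bytes -->
<img src="/img/product-200x150.jpg" width="200" height="150" alt="product">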
Other things that can improve your page load time are:
Use web standards/CSS for layout instead of tables
If you copy/paste content from MS Word, strip out the extra tags that Word generates
Put CSS and JavaScript in external files, rather than embedded in the page. This helps when users visit more than one page on your site, because browsers typically cache these files.
This Web Page Analyzer will give you a speed report that shows how long different elements of your page take to download.
First, check your code: make sure it validates against W3C standards for HTML and CSS.