SEO Considerations with CQ/AEM Image Component - seo

It's come up recently at my job that the SEO guys for our customer are unhappy with the src attribute values being generated in our img tags on their CQ/AEM-based website. I know next-to-nothing about SEO, so I won't pretend to understand, but it seems they have a point. We're not using the out-of-the-box image component per se, but the behavior is the same.
The src attribute of the img tags gets the path of the image node, with the img selector and some other stuff appended to it. This of course causes the request to go through the image servlet, which is then responsible for drawing the image. If I understand correctly, it's done this way to support things like the crop/resize/etc tools available in the html5smartimage widget. The servlet applies these edits to the image and renders the altered image.
The complaint is that the actual file names of the images are nowhere to be seen in that src attribute. I'm operating on the assumption that this is a valid complaint, but I really don't know if it is. I'm likely going to be asked to jump through hoops to change this behavior so the src attribute references the image by its direct path in the DAM.
Are these valid complaints? If the complaints are valid, why would the image component work this way? Should the title/alt values be considered sufficient for SEO purposes? If my customer is not using the extra features from html5smartimage, is there any other reason why I should not just address the images by their explicit DAM path? I've already worked out what I think is the best solution, but I'd like to be armed with more information before taking that plunge.

The image component as it stands lets you serve server-side modified renditions of the same image (with the usual transformations such as cropping and rotation), customised for each usage. That means different content for each usage: one original image, with different settings in each component.
The drawback, as you mention, is that the src of the image ends up at a rather SEO-unfriendly URL (i.e. where the component content lives).
If you only want ONE version of an image, you should surely refer directly to the DAM image (or whatever image hosting you use).
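The trade-off described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the component's stored properties are modelled as a plain dict, the key names `fileReference`, `imageCrop` and `imageRotate` are assumed for the example, and the `.img.png` suffix is a simplified stand-in for the "img selector and some other stuff" mentioned in the question.

```python
def image_src(component_path, properties):
    """Pick the src value for one usage of an image component.

    `properties` is a plain dict standing in for the component's stored
    properties; the key names are assumptions made for this sketch.
    """
    dam_path = properties.get("fileReference")
    has_edits = any(properties.get(key) for key in ("imageCrop", "imageRotate"))
    if dam_path and not has_edits:
        # No per-usage edits: reference the original file directly, so its
        # real file name appears in the src attribute.
        return dam_path
    # Edited per usage (or no DAM reference at all): the request has to go
    # through the image servlet, addressed by component path plus selector.
    return f"{component_path}.img.png"


if __name__ == "__main__":
    component = "/content/site/page/jcr:content/par/image"  # hypothetical path
    print(image_src(component, {"fileReference": "/content/dam/site/red-bicycle.jpg"}))
    print(image_src(component, {"fileReference": "/content/dam/site/red-bicycle.jpg",
                                "imageCrop": "0,0,100,100"}))
```

The point of the fallback is that the direct DAM path is only safe while nobody uses the crop or rotate tools; once an edit exists, the servlet URL is still needed to render it.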

Related

ImageResizer downloads an image multiple times?

I followed this article: http://imageresizing.net/blog/2013/effortless-responsive-images.
My images are stored on a CDN and after installing all the nuget packages, I got resizing to work, but the problem I ran into was that I had to add style="max-width:100%" to most of the images.
Also, I have a page where the same image appears in multiple spots, and I guess Image Resizer thinks that these spots should contain different sizes of the image, so it downloads 3 different versions, which sort of defeats the purpose. Is this how it is supposed to work?
As an example, I have imageA.png on a page and it might be in the top, middle, and bottom. Image Resizer is downloading a different version for each section.
What is the best way to use ImageResizer with srcset? I can't seem to find anything on it.
If I use the DiskCache plugin, will this serve images to other users that request the same size or is it just for the current user requesting it?
I'll try to break apart your 4-question question.
style="max-width:100%" to most of the images
CSS like img {max-width:100%} can do this globally. This rule is present by default in many themes/frameworks.
If an image appears in multiple spots, and those spots require differently sized/cropped versions of the image, there will be multiple requests. This is how it is supposed to work.
ImageResizer responds to URLs like "image.jpg?width=100". Just use those URLs as you would when using srcset normally. Here's the webkit demo.
DiskCache is not per-user. It is a global cache. It does re-apply authorization rules before serving from the cache.
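For the srcset point above, here is a minimal sketch of generating the attribute value from a list of widths. The only thing taken from the answer is the "image.jpg?width=100" URL form; the function name, the image path and the chosen widths are made up for illustration.

```python
from urllib.parse import urlencode


def build_srcset(image_url, widths):
    """Return a srcset value with one width-descriptor candidate per width."""
    candidates = []
    for width in sorted(set(widths)):
        # "?width=N" is the URL form quoted in the answer above; the trailing
        # "Nw" is the standard srcset width descriptor.
        query = urlencode({"width": width})
        candidates.append(f"{image_url}?{query} {width}w")
    return ", ".join(candidates)


if __name__ == "__main__":
    # Hypothetical image path and widths.
    print(build_srcset("/images/imageA.png", [640, 320, 1280]))
```

Using the same small set of widths everywhere an image appears means every spot draws from the same candidate URLs, so a size fetched for one spot can be reused from the browser cache by another spot that picks the same candidate.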

Dynamic User Interface in iOS

I'm struggling with a thought here. Let's say a user has his own CMS where he can fill in the content for our app. One of his options is to create a view by uploading images and typing text. We'll keep it very simple and imagine he only uploads an image (320 x 20) and some text. So an image on top and some lines of text below.
What would be the best way to let my app know of this layout and download the contents? I was thinking of a downloadable XML file which defines the layout, but I don't really know how to implement this or if it's even the best way.
Oh, and the content and layout must be downloadable for offline use too.
Another option I was thinking of is showing the layout in a webview, but I can't figure out how to download the mobile website for offline use.
A push in the right direction would be appreciated!
We use a custom XML format and it is working well. All text inside 'label tags' is in XHTML.
Remember to:
be specific when defining the XML, to save some effort
write a restrictive XSD, so nothing 'surprising' creeps into the XML
not include everything in ONE XML file, as that would get rather large rather quickly. Choose a scheme for splitting up the XML
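To make the idea concrete, here is a small sketch of what such a layout file and its parser could look like. The element and attribute names (`view`, `image`, `label`, `src`) are invented for this example and are not the answerer's actual format. Python is used only to illustrate the format; an iOS app would do the same with its own XML parser. Rejecting unknown elements is a crude stand-in for validating against an XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout file: one image on top, lines of text below.
SAMPLE_LAYOUT = """\
<view id="welcome">
  <image src="images/header.png" width="320" height="20"/>
  <label><p>First line of text</p></label>
  <label><p>Second line of text</p></label>
</view>
"""


def parse_layout(xml_text):
    """Turn one layout document into (view id, ordered list of elements)."""
    root = ET.fromstring(xml_text)
    elements = []
    for child in root:
        if child.tag == "image":
            elements.append({
                "type": "image",
                "src": child.get("src"),
                "width": int(child.get("width")),
                "height": int(child.get("height")),
            })
        elif child.tag == "label":
            # Labels hold XHTML; this sketch flattens it to plain text.
            elements.append({"type": "label",
                             "text": "".join(child.itertext()).strip()})
        else:
            # Stand-in for schema validation: nothing surprising gets through.
            raise ValueError(f"unexpected element: {child.tag}")
    return root.get("id"), elements


if __name__ == "__main__":
    view_id, elements = parse_layout(SAMPLE_LAYOUT)
    print(view_id, elements)
```

Keeping one file per view, as in this sample, is one possible scheme for splitting up the XML; the files and the images they reference can then be downloaded and stored for offline use.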

There is a question mark in my prestashop window. It wasn't there before

I just opened my store today. Before the hosting firm pointed my domain name to my IP, there was no problem.
When the site was activated, some people wanted to change some links. I did that in a hurry, and maybe that caused an error. Later, when I checked my site, I saw a big question mark on the right. I put all of the files back from my backup, but the problem still exists. Is there anyone who can help me?
Thank you
FERDA
This is a missing image. Whenever an image (product, category, etc.) is missing in PrestaShop, this question mark image is shown. It is 404.gif (or another image format) placed in the img folder of the PrestaShop installation.
There may be two reasons for this issue:
1) In the header section, it seems you have a missing image (I think it is one of the social icons). This is the most likely cause. Please check your images.
2) This may not be the reason, but whenever there is some problem with the image paths, PrestaShop displays the 404 or question mark image. The paths seem fine, though, as other images are shown.
Proposed solution: First inspect that specific element in the F12 tool (don't know what it is called :P), i.e. right-click on the question mark image and select Inspect Element, and the F12 tool will open. Check there and share the HTML code. Please also share some code before and after it, so that it can be checked easily.

Screen Scraping with HTTP Headers Issue - I Think

I've been trying to figure this one out for about a week now and just can't come up with a good solution. So, I figured I would see if anyone could help me out. Here's one of the links that I'm trying to scrape:
http://content.lib.washington.edu/cdm4/item_viewer.php?CISOROOT=/alaskawcanada&CISOPTR=491&CISOBOX=1&REC=4
I right-clicked to copy image location.
This is the link that is copied:
(Can't paste this as a link because I'm new)
http:// content (dot) lib (dot) washington (dot) edu/cgi-bin/getimage.exe?CISOROOT=/alaskawcanada&CISOPTR=491&DMSCALE=100.00000&DMWIDTH=802&DMHEIGHT=657.890625&DMX=0&DMY=0&DMTEXT=%20NA3050%20%09AWC0644%20AWC0388%20AWC0074%20AWC0575&REC=4&DMTHUMB=0&DMROTATE=0
There is no clear image URL being displayed. Obviously that's because the image is hidden behind some type of script. Through trial and error I found that I can put ".jpg" after the "CISOPTR=491" and then the link becomes an image URL. The problem is that this is not the high-resolution version of the image. To get to the high-resolution version I have to change the URL even more. I found a lot of articles on Stackoverflow.com that mention building a script using curl and PHP, and I have even tried a few of them with no luck. "491" is the image number and I can change that number to find other images in the same directory. So, scraping a sequence of numbers should be pretty easy. But I'm still a noob at scraping and this one is kicking my butt. Here's what I've tried.
Get remote image using cURL then resample
also tried this.
http://psung.blogspot.com/2008/06/using-wget-or-curl-to-download-web.html
I also have Outwit Hub and SiteSucker, but they don't recognize the URL as an image file, so they just pass right over it. I ran SiteSucker overnight and it downloaded 40,000 files, of which only 60 were JPEGs, none of them the ones I wanted.
The other thing I keep running into is that, for the files I have been able to download manually, the filename is always either getfile.exe or showfile.exe, and if I manually add ".jpg" as the extension I can view the image locally.
How can I reach the original high-res image file, and automate the download process so that I can scrape a couple hundred of these images?
I right-clicked to copy image location. This is the link that is copied:
You noticed the title has ".exe" in there. Look at the stuff in the query string:
DMSCALE=100.00000
DMWIDTH=802
DMHEIGHT=657.890625
DMX=0
DMY=0
DMTEXT=%20NA3050%20%09AWC0644%20AWC0388%20AWC0074%20AWC0575
REC=4
DMTHUMB=0
DMROTATE=0
This strongly implies that the original source of the image is in a database or something similar and that it is being passed through a server-side filter (not sure if that is what you meant by "some type of script"). I.e., this is dynamically generated content, not static, and the same caveats apply as would to dynamic text content: you have to figure out what instructions to provide the server to get it to cough up what you want. Which you pretty much have in front of you... if SiteSucker or whatever won't deal with it properly, scrape the address yourself using an HTML parser.
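Since the query string is the set of instructions, one way to automate this is to treat the copied URL as a template and rewrite individual parameters. The sketch below does that with Python's standard library. Which parameter values (if any) make the server return the high-resolution image is not known from the question, so the `overrides` argument is left for experimentation, and the function names and output filenames are hypothetical.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit
from urllib.request import urlretrieve

# The "copy image location" URL from the question, used as a template.
COPIED_URL = (
    "http://content.lib.washington.edu/cgi-bin/getimage.exe"
    "?CISOROOT=/alaskawcanada&CISOPTR=491&DMSCALE=100.00000&DMWIDTH=802"
    "&DMHEIGHT=657.890625&DMX=0&DMY=0"
    "&DMTEXT=%20NA3050%20%09AWC0644%20AWC0388%20AWC0074%20AWC0575"
    "&REC=4&DMTHUMB=0&DMROTATE=0"
)


def url_for_item(template_url, item_number, overrides=None):
    """Rebuild the URL for another item number, optionally changing parameters."""
    parts = urlsplit(template_url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["CISOPTR"] = str(item_number)
    params.update(overrides or {})
    # quote_via=quote keeps spaces as %20, and safe="/" keeps the slash in
    # CISOROOT, so untouched parameters are sent exactly as the browser sent them.
    query = urlencode(params, safe="/", quote_via=quote)
    return urlunsplit(parts._replace(query=query))


def download_range(template_url, first, last, overrides=None):
    """Fetch a run of item numbers, saving each with a .jpg extension."""
    for number in range(first, last + 1):
        urlretrieve(url_for_item(template_url, number, overrides), f"{number}.jpg")


if __name__ == "__main__":
    # Only prints URLs; call download_range() yourself once they look right.
    for number in (491, 492):
        print(url_for_item(COPIED_URL, number))
```

Saving with an explicit ".jpg" name sidesteps the getfile.exe/showfile.exe filenames mentioned in the question, because the local filename no longer comes from the script name in the URL.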

Prepare your site images for google image search indexing

I'm trying to understand what I can do to make my site reachable by Google image search spiders.
I like last.fm's solution, and I thought of using a technique like the one their staff use to let Google find artist images on their pages.
When I search for an artist on Google image search, more often than not I find an image from a last.fm artist page. Here is an example:
If I search for the band Pure Reason Revolution, it brings me here, to the artist's image page
http://www.last.fm/music/Pure+Reason+Revolution/+images/4284073
Now if I take a look at the image file, I can see it's named:
http://userserve-ak.last.fm/serve/500/4284073/Pure+Reason+Revolution+4.jpg
so if I try to understand how the service works I can try to say:
http://userserve-ak.last.fm/serve/ the server that serves the images
500/ the selected size for the image
4284073/ the image id for database
Pure+Reason+Revolution+4.jpg the image name
I find it hard to believe that the real filename of the image is Pure+Reason+Revolution+4.jpg, because of overwrite problems when users upload files with the same name. In fact, if I type:
http://userserve-ak.last.fm/serve/500/4284073.jpg
I probably find the real image location and filename.
I see this can be done with the mod_rewrite engine, but with this technique, will the image be easily reachable by search engines and easily archived?
My question is: is there a guide or tutorial on this kind of technique, or something similar?
In my opinion, the best resource for your question is Google itself.
One of its guides targets Google image search and provides some guidelines:
Don't embed text inside images
Tell us as much as you can about the image
Give your images detailed, informative filenames
Create great alt text
Anchor text
Provide good context for your image
Think about the best ways to protect your images
Create a great user experience
Source: Images - Webmaster Tools Help.
As for last.fm, one of the suggestions is:
Give your images detailed, informative filenames
The filename can give Google clues about the subject matter of the image. Try to make your filename a good description of the subject matter of the image. For example, my-new-black-kitten.jpg is a lot more informative than IMG00023.JPG. Descriptive filenames can also be useful to users: If we're unable to find suitable text in the page on which we found the image, we'll use the filename as the image's snippet in our search results.
So yes, last.fm uses mod_rewrite to give its images informative filenames, which Google likes.
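The rewriting idea can be sketched independently of mod_rewrite: the descriptive name in the URL is decoration, and only the size and the numeric id decide which stored file is served. The pattern below follows the URL breakdown from the question; the storage layout returned by `resolve` is purely hypothetical and says nothing about how last.fm actually stores files.

```python
import re

# /serve/<size>/<image id>[/<descriptive name>].jpg, as broken down in the question.
PRETTY_URL = re.compile(r"^/serve/(?P<size>\d+)/(?P<image_id>\d+)(?:/[^/]+)?\.jpg$")


def resolve(path):
    """Map a public image URL path to a (hypothetical) stored file path."""
    match = PRETTY_URL.match(path)
    if match is None:
        return None
    # The descriptive name is ignored, so uploads can never overwrite each
    # other: the numeric id alone identifies the file.
    return f"/storage/{match['size']}/{match['image_id']}.jpg"


if __name__ == "__main__":
    print(resolve("/serve/500/4284073/Pure+Reason+Revolution+4.jpg"))
    print(resolve("/serve/500/4284073.jpg"))
```

Both forms resolve to the same stored file, which matches the observation in the question that the URL without the descriptive name also works.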
There are a few more guides out there. None of them is official, but they can help you anyway:
http://www.tareeinternet.com/forum/seo/236-optimizing-google-image-search.html
http://www.doshdosh.com/how-to-optimize-for-google-images-for-more-traffic/
http://creativebits.org/webdev/optimize_your_site_for_google_image_search
http://www.pearsonified.com/2007/01/get_53_percent_more_searches_with_one_tweak.php
The article pointed out by Tim covers most of it but I'd like to add that the title attribute on <img> tags is important too (but don't abuse it!).
To sum up:
Name your files well. apple.jpg is better SEO-wise than PIC2346.jpg. For spaces in filenames use a dash (-) and not an underscore (_). See Dashes vs. underscores for more info.
Always fill in the alt attribute. Keep in mind that most screen readers for blind people will read this attribute.
Fill in the title attribute when useful. Use a short statement describing the image. Not a whole paragraph!
The context of the image (what is the content around it) is very important too. If the image fits the surrounding contents it will give you more SEO "points".
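The file naming advice above is easy to automate at upload time. Below is a minimal sketch that turns a short description into a dash-separated filename while keeping the original extension; the function name and the idea of deriving the name from a description field are assumptions made for this example.

```python
import re
import unicodedata
from pathlib import PurePosixPath


def seo_filename(original_name, description):
    """Build a descriptive, dash-separated filename from a short description."""
    extension = PurePosixPath(original_name).suffix.lower()
    # Drop accents and other non-ASCII characters before building the slug.
    ascii_text = (unicodedata.normalize("NFKD", description)
                  .encode("ascii", "ignore").decode("ascii"))
    # Every run of anything other than a-z and 0-9 (spaces, underscores,
    # punctuation) becomes a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    if not slug:
        raise ValueError("description contains nothing usable for a filename")
    return f"{slug}{extension}"


if __name__ == "__main__":
    print(seo_filename("IMG00023.JPG", "My new black kitten"))
```

The same description can usually double as a starting point for the alt text, which keeps the filename and the alt attribute consistent with each other.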