Google has all the wrong keywords - seo

I hope stackoverflow is the right part of the trinity to ask this kind of question ...
Google Webmaster Tools shows the keywords it considers important for my blog (blog.schauderhaft.de). But among the top 20% are all the month names (you know, January and so on).
I actually have a two part question about this:
Why does Google think these are important keywords?
How do I fix that?

It might have something to do with the whole list of archives in the head of your page: <link rel='archives' title='January 2008' and so on.
Do you think this will actually be a problem? These people don't seem to think so..

We used to have a similar problem on one of our client websites: country names appeared as the most important keywords. On some pages we were running multiple forms where one could choose a country. Google was finding this all over the place and thus considered it important.
So if you have month names in archives or in the dates of articles, that may well be the cause. Make sure you tag each one properly: if it's a date, you can maybe use HTML5 date markup (the time element) to identify it as a date; for archives, you can load the list using AJAX or generate it with JavaScript.
To drop the country names we had to use a jQuery trick to insert them dynamically after page load (so Google no longer sees the list as important to our website).

Related

Trying to understand Google Results and meta tags

Note: this does NOT regard ranking, I just want the results to look better overall.
I'm working with a "news site" with a lot of articles, some dynamic, some static.
The developers haven't really given much thought about SEO but now want the Google Results to look a bit prettier - which landed on my table.
In the source code there's a few meta-tags, example:
<meta name="twitter:title" content="content">
<meta name="og:title" content="content">
Running it through Google Structured Data Testing Tool shows what I'd expect but it doesn't look like my search result for that specific link has the correct snippet.
Seems like it doesn't want to pick the og:description content all the time. Sometimes it does, and sometimes it also adds the title again in the snippet.
What I don't get: is Google using og:title for results, or is that only for e.g. Facebook sharing? Do I simply need this one below, since that is actually missing from the code?
The description itself would be the same as og:description since they contain the same content.
<meta name="description" content="content">
As far as I understand it can be quite tricky to customize these sorts of things but could it really be that hard to have any sort of consistency throughout the results from our page?
There are two things you can do but both come with a caveat.
Google takes anything from your site as a suggestion. There is no way to program it to perform identically in all situations. If Google's algorithm believes there is a better way to present a result - it will ignore any direction you give it and auto-generate a new presentation for your page.
That said there's two things you can do:
Add meta tags with the exact text you'd like to appear on the SERP. The page title may or may not be appended with your brand/company name. If it already contains the company/brand name, Google is more likely to leave it where it is.
Google takes text from the page based on what it thinks is more important/relevant to the search. For News, using either HTML5 elements (nav, article, aside) or labelling your divs with a class using those key words will help Google understand what the real content is. Asides are less likely to be used while Articles will be focused upon.
I would also recommend having authors write their own custom descriptions and insert them with your CMS. They're likely much better at constructing a good summary than Google or an auto-summary script. Google will experiment with alt descriptions occasionally but once something solidifies itself as popular in terms of click rate, it'll stick.
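A minimal sketch of inserting an author-written summary into the page head, assuming the CMS hands you a title and description as plain strings. The function name render_head_tags is hypothetical. Note that the Open Graph protocol specifies the property attribute for og: tags (not name as in the question's markup), and that Google still treats all of this as a suggestion only.

```python
from html import escape


def render_head_tags(title, description):
    """Build <head> tags from an author-written title and summary."""
    # Escape so quotes or ampersands in the text cannot break the markup.
    safe_title = escape(title, quote=True)
    safe_description = escape(description, quote=True)
    return "\n".join([
        "<title>{}</title>".format(safe_title),
        # The plain description tag is the one search snippets may draw on.
        '<meta name="description" content="{}">'.format(safe_description),
        # Open Graph tags are aimed at social sharing previews.
        '<meta property="og:title" content="{}">'.format(safe_title),
        '<meta property="og:description" content="{}">'.format(safe_description),
    ])


if __name__ == "__main__":
    print(render_head_tags("City council approves budget",
                           "A short summary written by the article's author."))
```

Using one source string for both description tags keeps them from drifting apart.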

How to create SEO-friendly paging for a grid?

I've got this grid (a list of products in an internet shop) for which I've no idea how big it can get. But I suppose a couple hundred items is quite realistic, especially for search results. Maybe even thousands, if we get a big client. :)
Naturally, I should use paging for such a grid. But how to do it so that search engine bots can crawl all the items too? I very much like this idea, but that only has first/last/prev/next links. If a search engine bot has to follow links 200 levels deep to get to the last page, I think it might give up pretty soon, and not enumerate all items.
What is the common(best?) practice for this?
Is it really the grid you want to have indexed by the search engine, or are you after the product detail pages? If it's the latter, you can have a dynamic sitemap (XML) and the search engines will take it from there.
I run a number of price comparison sites and as such I've had the same issue as you before. I don't really have a concrete answer, and I doubt anyone will have one tbh.
The trick is to try and make each page as unique as possible. The more unique pages, the better. Think of it as each page in google is a lottery ticket, the more tickets the more chances you have of winning.
So, back to your question. We tend to display 20 products per page and then have pagination at the bottom. AFAIK Google and other bots will crawl all links on your site. They wouldn't give up. What we have noticed, though, is that if your subsequent pages have the same SEO titles and H tags, and are basically the same page with different result sets, then Google will NOT add the pages to the index.
Likewise, I've looked at the site you suggested and would suggest changing the layout to be text and not images; an example of what I mean is on this site: http://www.shopexplorer.com/lcd-tv/index.html
Another point to remember: the more images etc. on the page, the longer the page will take to load and the worse your UI will be. I've also heard it affects quality in SEO ranking algorithms.
Not sure if I've given you enough to go on, but to recap:
I would limit the results to 20-30 per page
I would use pagination, but with text and not images
I would make sure the paginated pages have distinct enough 'SEO markers' [title, h1 etc.] to each be a unique page.
i.e.
LCD TV results page 2 > is bad
LCD TV results from Sony to Samsung > Better
Hopefully i've helped a little
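The recap above could be sketched roughly like this, assuming products arrive as (brand, name) tuples and pages are ordered by brand. The function name paged_titles and the title wording are illustrative only. One known gap: if a single brand spans several pages, the titles repeat, so you would need an extra distinguishing marker in that case.

```python
def paged_titles(category, products, per_page=20):
    """Split products into pages and give each page a descriptive title.

    products: list of (brand, name) tuples.
    Returns a list of (title, page_items) pairs.
    """
    ordered = sorted(products)  # sort by brand, then by name
    pages = []
    for start in range(0, len(ordered), per_page):
        chunk = ordered[start:start + per_page]
        first_brand, last_brand = chunk[0][0], chunk[-1][0]
        if first_brand == last_brand:
            title = "{} results from {}".format(category, first_brand)
        else:
            title = "{} results from {} to {}".format(
                category, first_brand, last_brand)
        pages.append((title, chunk))
    return pages


if __name__ == "__main__":
    demo = [("Sony", "A1"), ("LG", "B2"), ("Samsung", "C3"), ("Philips", "D4")]
    for title, items in paged_titles("LCD TV", demo, per_page=2):
        print(title, items)
```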
EDIT:
Vlix, I've also seen your question about sitemaps. I wouldn't be concerned, but if you are, split the feed into multiple separate feeds, maybe at category level, brand level, etc. I'm not sure, but I think Google would want as many pages as possible. It will ignore the ones it doesn't like and just add the unique ones.
That at least, is how i understand it.
SEO is a dark art - nobody will be able to tell you exactly what to do and how to do it. However, I do have some general pointers.
Pleun is right - your objective should be to get the robots to your product detail page - that's likely to be the most keyword-rich, so optimize this page as much as you can! Semantic HTML, don't use images to show text, the usual.
Construct meaningful navigation schemes to lead the robots (and your visitors!) to your product detail pages. So, if you have 150K products, let's hope they are grouped into some kind of hierarchy, and that each (sub)category in that hierarchy has a manageable (<50 or so) number of products. If your users have to go through lots and lots of pages in a single category to find the product they're interested in, they're likely to get bored and leave. Make this categorization into a navigation scheme, and make it SEO friendly, e.g. by using friendly URLs.
Create a sitemap - robots will crawl the entire sitemap, though they may not decide to pay much attention to pages that are hard to reach through "normal" navigation, even if they are in the sitemap.xml.
Most robots don't parse more than the first 50-100K of HTML. If your navigation scheme (with a data grid) is too big, the robot won't necessarily pick up or follow links at the end.
Hope this helps!
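As a rough illustration of the sitemap suggestion, here is a sketch that writes product detail URLs into one or more sitemap documents. It assumes you can list the product URLs yourself; the function name build_sitemaps and the example.com URLs are made up. The sitemap protocol limits a single file to 50,000 URLs, which is why the list is split.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit in the sitemap protocol


def build_sitemaps(urls, max_urls=MAX_URLS_PER_FILE):
    """Return a list of sitemap XML strings, each with at most max_urls entries."""
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


if __name__ == "__main__":
    product_urls = ["http://example.com/product/{}".format(i) for i in range(1, 4)]
    for doc in build_sitemaps(product_urls, max_urls=2):
        print(doc)
```

When more than one file is produced, the files are normally tied together with a sitemap index file, which this sketch leaves out.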

How can I present multiple pages with similar content (mostly images) without Google penalizing me?

I have a website that presents Q&As to mathematical problems, mostly for pupils aged approx. 16-18 years old. Due to the difficulties of presenting formulas on webpages, the Q&As (formulas) are presented as images. At the moment, each webpage contains one Q&A, and there are many questions and answers. Thus, with little in the way of text, every page looks almost identical. Therefore, Google might very easily see this as duplicate content. What is my best solution to this problem? Should I try to put the Q&As in a database and present each different one on the same page (dynamically)? Or should I keep things the way they are and prevent Google from seeing most of the Q&As? It is also difficult to make different titles, descriptions etc. as, for each topic, only the question number changes.
Many thanks for your time.
You're basically a ghost to google anyways if there is no text on each page. If you are worried about SEO you need to worry about text.
You should at the very least look into tagging the formulas or creating a title for the question which is relevant and putting that into a header tag above the question image.
Otherwise no one will find you by that content and that's what it's all about.
You said it: you can hide the QandA file/directory in the robots.txt file of your web server.
User-agent: *
Disallow: /QAfolder
or
User-agent: *
Disallow: /Q1.htm
Disallow: /Q2.htm
Disallow: /Q3.htm
or whatnot.
Normally, this would be a bad thing (preventing users from searching for question content) but as you said, they're images anyway.
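If you go this route, you can sanity-check the rules before deploying them. The sketch below uses Python's standard urllib.robotparser; the example.com URLs and the helper name is_allowed are placeholders. It assumes the rules sit under a User-agent: * line, which a Disallow rule needs in order to apply to all crawlers.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /QAfolder
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Check whether a crawler may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "http://example.com/QAfolder/Q1.htm"))
    print(is_allowed(ROBOTS_TXT, "http://example.com/index.html"))
```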
Create descriptive useful page titles and meta descriptions.
Create textual representations of what is in the image using alt attributes.
Use different headers.
This could be a little hard to think about in your context, but you may be able to use the question type, a description, or the name of the chapter it's taken from: basically a text description relevant to the question.
One more thing you can do: if you have empty space on your page, put in some text that describes your website and at the same time uses the right keywords (the ones you are targeting) in the right percentages, higher up in the page. You may write up 2-3 different descriptions and alternate them between pages, if your design permits.

Business Applications: What are the fundamental features of a search form?

In a typical business application it is quite common to have forms that are used for searching.
Some basic features are:
A pane that contains the search criteria
A grid to display the results
Sorting on the grid
A detail page that opens when an item is selected in the results grid
What other features would you expect in a business application's search functionality?
Maybe it's a bit trite but there is some sense in this picture:
removed dead ImageShack link
Do it as shown in the second example, not as in the third one.
There is a well-known extreme programming principle: YAGNI. I think it's applicable to almost any problem. You can always add something new if it's necessary, but it's much more difficult to remove something that already exists, because someone already uses it, even if it's wrong.
How about the ability to save search criteria, in order to easily re-run a search later. Or, the ability to easily, cleanly, print the list of results.
If search refining is allowed (given a search result, limiting future searches to the current results), you may also want to add a breadcrumb system, so that the user can see the sequence of refinements that led to the current result set and, by clicking on a breadcrumb, return to a previous refinement stage.
Faceted search:
(source: msdn.com)
This is displayed in the area in the right ellipse. There are filters and the engine shows the number of results that will remain after applying the filter. This is very useful and can be done without pain in some search engines, such as Apache Solr. Of course, implement this only if filters make sense in your task.
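To make the idea concrete, here is a tiny in-memory sketch of facet counting. It is not how Solr does it internally; it only shows what the numbers next to each filter mean. It assumes results are plain dicts, and the field names are invented for the example.

```python
from collections import Counter


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(row[field] for row in results if field in row)


if __name__ == "__main__":
    results = [
        {"name": "TV A", "brand": "Sony", "size": 40},
        {"name": "TV B", "brand": "Sony", "size": 32},
        {"name": "TV C", "brand": "LG", "size": 40},
    ]
    # Each count is the number of results left if that filter were applied.
    print(facet_counts(results, "brand"))
    print(facet_counts(results, "size"))
```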
Aggregate summary info, like total(s), count(s) or percentages.
One or more menus, like right click context for the grid, a ribbon or menu on top.
Your list of UI elements is pretty good. Export, print (ask them whether it is really necessary to print this), category/tag and language selection are worth considering. So is smart, working pagination (don't forget ordering).
Please do not force a search to open in a new window (or even worse, always in the same window). Links to search results should be copy-pastable (always use GET).
But what really matters is a functional (i.e. a really good) algorithm. Mostly I google company websites, because their own search engine is, cough, awwwwkward. Someone looking for a feature chart, technical spec, pricing etc. is not interested in press releases, and vice versa.
Search engine providers offer integration into company websites.
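On the "always use GET" point: the search criteria end up in the query string, so the result page has a URL that can be bookmarked or pasted into an email. A small sketch, where the base URL, the parameter names and the helper search_url are all assumptions:

```python
from urllib.parse import urlencode


def search_url(base, criteria):
    """Build a shareable search URL from a dict of criteria."""
    # Skip empty fields and sort keys so equal searches give equal URLs.
    params = {key: value for key, value in sorted(criteria.items())
              if value not in ("", None)}
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(search_url("https://example.com/search",
                     {"q": "lcd tv", "page": 2, "brand": ""}))
```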
Use Auto-complete wherever possible on your text input fields.
If using selects or combo boxes with related information try and use chain selects to organise the information.
Where results depend on location try and serve relevant results.
Also remember to keep the search form as simple as possible even down to one text field. To refine the search you can have an alternate form as an "Advanced Search interface".
Printing, export.
A grid to display the results
Watch out not to display results a user is not authorized to see (roles / permissions / access rights).
A detail page that opens when an item is selected in the results grid
In case a user attempts to circumvent the search page links and open a document directly, again, check permissions.
Validation, validation, validation.
It should be very hard, near impossible, for me to run a query that makes no sense, i.e., a start date occurring after an end date.
Export a numerical dataset (even if it only has one numeric column, so just make it available by default) to CSV for import into Excel. People love this function, even if only 1% of users seem to use it with any regularity. Just ask yourself when you last highlighted something for copy-n-paste. Would it have been easier to open a CSV?
Refinable searches (think Google's use of site: -). People who use the search utility a lot will appreciate this. People who don't won't know it's not there.
The ability to choose to display 1 record, 5 records, 100 records, 1000 records, etc. "Paging", I believe, is what we most commonly call it ;).
You mentioned sortable grids. Somebody else mentioned auto-sum or auto-count. Those are good if (once again) you have largely numeric data. But those are almost report-oriented functions.
Hope this helps.
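Two of the points above, rejecting a nonsensical date range and exporting results to CSV, could look roughly like this. It is a sketch that assumes search criteria are Python date objects and results are simple rows; the function names are made up.

```python
import csv
import io
from datetime import date


def validate_date_range(start, end):
    """Reject a query whose start date falls after its end date."""
    if start > end:
        raise ValueError("start date must not be after end date")


def rows_to_csv(header, rows):
    """Serialize a result set to CSV text that Excel can import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    validate_date_range(date(2024, 1, 1), date(2024, 1, 31))
    print(rows_to_csv(["id", "total"], [[1, 9.5], [2, 12.0]]))
```

Running the validation before the query is built means the user gets a clear message instead of an empty result grid.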
One thing you can do is have a drop down of most common searches in plain english. e.g. "High value sales in New York in last 5 days". This is the equivalent of user selecting an amount, the city, date ranges etc. done conveniently for them.
Another thing is to have multiple search criteria tabs based on perspective of the user. Like "sales search", "reporting search", "admin search" etc.
Also consider limiting the number of entries retrieved in the search and allow users to do narrower searches. This depends on the business needs, however.
The most commonly used search option listed first and in a prominent location.
I think your requirements are good. Take a cue from Google. Google got it right. One text box where you type whatever you want, and your engine spits out the answers. Most folks will try this, and if the answers are good enough, then that is what they will use. In the back-end, you'll probably want to flatten all of the data into a big honkin' table and then index it or use a SQL query with "LIKE" in it.
However, you will probably want to allow the user to refine the search. For this, have a link to "Advanced Search" and use a form there to specify filter criteria. This lets the user zero in on the results if basic search is not good enough. For the results on this page, you will certainly want to have sorting on key fields, but do it after you have produced the initial result set.
It depends on the content that you are searching for... make it relevant :) Search always looks easy but can be incredibly difficult to get right.
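A bare-bones sketch of the "one big table plus LIKE" idea, using an in-memory SQLite database. The table name search_index and the record layout are assumptions made for the example, and LIKE scans will not scale to large data sets the way a real full-text index does.

```python
import sqlite3


def build_index(conn, records):
    """Flatten each record's fields into one searchable text column."""
    conn.execute("CREATE TABLE search_index (record_id INTEGER, text TEXT)")
    conn.executemany(
        "INSERT INTO search_index VALUES (?, ?)",
        [(record_id, " ".join(str(value) for value in fields.values()))
         for record_id, fields in records.items()],
    )


def search(conn, term):
    """Return ids of records whose flattened text contains the term."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = (term.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    rows = conn.execute(
        "SELECT record_id FROM search_index "
        "WHERE text LIKE ? ESCAPE '\\' ORDER BY record_id",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_index(connection, {
        1: {"customer": "Acme", "city": "New York"},
        2: {"customer": "Globex", "city": "Boston"},
    })
    print(search(connection, "york"))
```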
Not mentioned yet, but very important I think - a search that actually works. This item is often neglected and makes the rest a bit moot.

Tool or methods for automatically creating contextual links within a large corpus of content?

Here's the basic scenario - I have a corpus of say 100,000 newspaper-like articles. Minimally they will all have a well-defined title, and some amount of body content.
What I want to do is find runs of text in articles that ought to link to other articles.
So, if article Foo has a run of text like "Students in 8th grade are being encouraged to read works by John-Paul Sartre" and article Bar is titled (and about) "The important works of John-Paul Sartre", I'd like to automagically create that HTML link from Foo to Bar within the text of Foo.
You should ask yourself something before adding the links. What benefit for users do you want to achieve by doing this? You probably want to increase the navigability of your site. Maybe it is better to create an easier way to add links to older articles in the form used to submit new ones. Maybe it is possible to add a "one click search for selected text" feature. Maybe you can add wiki-like functionality that lets users propose links for selected text. You probably want to add links to related articles (generated through a tagging system or text mining) below the articles.
Some potential problems with a fully automated link adder:
You may need to implement a good word sense disambiguation algorithm to avoid confusing or even irritating the user with bad automatic links placed by regex (or simple substring matching).
As the number of articles is large, you do not want to generate the HTML for extra links on every request; cache it instead.
You need to decide how to handle duplicate titles and titles that contain another title as a substring (take the longest title, link to the most recent article, or prefer an article from the same category).
TLDR version: find alternative solutions that provide desired functionality to the users.
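For reference, the naive approach these caveats apply to can be sketched in a few lines. This assumes you have already chosen anchor phrases and mapped them to article URLs (the phrase table and the /articles/bar URL below are invented), and that the article body is plain text without existing markup. It resolves overlapping phrases by preferring the longest one and does no word sense disambiguation at all.

```python
import html
import re


def add_links(text, phrase_to_url):
    """Wrap known phrases in text with links, preferring the longest phrase."""
    if not phrase_to_url:
        return text
    # Longest first, so "John-Paul Sartre" wins over a bare "Sartre".
    phrases = sorted(phrase_to_url, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    lookup = {phrase.lower(): url for phrase, url in phrase_to_url.items()}

    def replace(match):
        url = lookup[match.group(0).lower()]
        return '<a href="{}">{}</a>'.format(
            html.escape(url, quote=True), match.group(0))

    # A single pass avoids linking text inside links that were just added.
    return pattern.sub(replace, text)


if __name__ == "__main__":
    body = "Students are being encouraged to read works by John-Paul Sartre"
    print(add_links(body, {"John-Paul Sartre": "/articles/bar"}))
```

Since the output only changes when the article or the phrase table changes, it is a natural candidate for caching.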
What you are looking for are text mining tools. You can find more info and links at http://en.wikipedia.org/wiki/Text_mining. You might also want to check out Lucene and its ports at http://lucene.apache.org. Using these tools, the basic idea would be to find a set of similar articles based on the article (or title) in question. You could search various properties of the article including titles and content or both. A tagging system a la Delicious (or Stackoverflow) might also be helpful. Rather than pre-creating the links between articles, you'd present the relevant articles in an interface much like the Related questions interface on the right-hand side of this page.
If you wanted to find and link specific text in each article, I think you'd need to do some preprocessing to select pertinent phrases to key on. Even then I think it would be very hard not to miss things due to punctuation/misspellings or to not include irrelevant links for the same reasons.