How can I present multiple pages with similar content (mostly images) without Google penalizing me? - seo

I have a website that presents Q&As to mathematical problems, mostly for pupils aged approximately 16-18. Due to the difficulties of presenting formulas on webpages, the Q&As (formulas) are presented as images. At the moment, each webpage contains one Q&A, and there are many questions and answers. Thus, with little in the way of text, every page looks almost identical, and Google might very easily see this as duplicate content. What is my best solution to this problem? Should I try to put the Q&As in a database and present each one dynamically on the same page? Or should I keep things the way they are and prevent Google from seeing most of the Q&As? It is also difficult to write different titles, descriptions, etc., because within each topic only the question number changes.
Many thanks for your time.

You're basically a ghost to Google anyway if there is no text on each page. If you are worried about SEO, you need to worry about text.
You should at the very least look into tagging the formulas or creating a title for the question which is relevant and putting that into a header tag above the question image.
Otherwise no one will find you by that content and that's what it's all about.

You said it: you can hide the Q&A file/directory using the robots.txt file at the root of your web server. The rules go under a User-agent line:
User-agent: *
Disallow: /QAfolder
or
Disallow: /Q1.htm
Disallow: /Q2.htm
Disallow: /Q3.htm
or whatnot.
Normally, this would be a bad thing (preventing users from searching for question content) but as you said, they're images anyway.
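Before deploying rules like these, it can help to check what they actually block. The following is a minimal sketch using Python's standard urllib.robotparser module; the example.com URLs and the is_crawlable helper name are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block everything under /QAfolder for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /QAfolder
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the rules above.
    for url in ("http://www.example.com/QAfolder/Q1.htm",
                "http://www.example.com/index.htm"):
        print(url, is_crawlable(ROBOTS_TXT, url))
```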

Create descriptive useful page titles and meta descriptions.
Create textual representations of what is in the image using alt attributes.
Use different headers.
This can be a little hard to think about in your context, but you could probably use the question type description or the name of the chapter the question is taken from: basically, a text description relevant to the question.
One more thing you can do: if you have empty space on your page, put in some text higher up in the page that describes your website and uses the keywords you are targeting in the right proportions. You could write up two or three different descriptions and alternate them between pages, if your design permits.
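To make the suggestions above concrete, here is a rough Python sketch that derives a distinct title, meta description, heading and alt text for each Q&A page. It assumes you can store a short plain-text summary of every question alongside its image; the page_metadata function, its field names and the sample values are hypothetical, not anything from the original site.

```python
from html import escape


def page_metadata(topic: str, number: int, summary: str, image_url: str) -> dict:
    """Build distinct head and body snippets for one Q&A page.

    `summary` is assumed to be a short plain-text description of the
    question (for example the equation written out as text).
    """
    title = f"{topic}, question {number}: {summary}"
    description = f"Worked solution to {topic.lower()} question {number}: {summary}."
    return {
        "title": f"<title>{escape(title)}</title>",
        "description": (
            f'<meta name="description" content="{escape(description, quote=True)}">'
        ),
        "heading": f"<h1>{escape(title)}</h1>",
        # The alt attribute carries the text that the image itself cannot.
        "image": (
            f'<img src="{escape(image_url, quote=True)}" '
            f'alt="{escape(summary, quote=True)}">'
        ),
    }


if __name__ == "__main__":
    meta = page_metadata("Quadratic equations", 3,
                         "Solve x^2 - 5x + 6 = 0", "/img/q3.png")
    for snippet in meta.values():
        print(snippet)
```

Because the summary differs per question, two pages in the same topic no longer share a title or description even though only the question number changes in the URL.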

Related

How to prevent duplicate content issue when having same user comments in multiple content pages?

I have a question about duplicate content. I have pages with articles: one page = one article. Below each article is a discussion forum / comments box.
These articles sometimes have very similar subjects, so it often happens that a user comments on or asks about something that was already discussed under a similar, older article. That's because the user doesn't know about the older article(s).
So for certain articles on the same subject, I use one comments box for multiple articles.
All the articles are my original content; however, the second part of the page would purposely be duplicate content, because in this case it's good for the user (and that's what Google says webmasters should do: what's good for the user).
So my question is: should I fear this could be treated by search engines as duplicate content? And if so, what steps should I take to preserve this functionality without being penalized by Google and others?
I don't think you have to worry about this. In fact, the duplicated content will be a small part of the whole content (which you say is brand new). Another thing to bear in mind is that even if the topic is the same, it's very unlikely that people will write about it in the same way.
Google covered this a few years ago:
The duplicate content "penalty" isn't to discourage webmasters from referencing the same content via more than one URL, but it is intended to prevent mass spamming of duplicate content from your (or worse, someone else's) site. You should be fine.

Does apparent filename affect SEO?

If I name my HTML file "Banks.html" located at www.example.com/Banks.html, but all the content is about Cats and all my other SEO tags are about Cats on the page, will it affect my page's SEO?
Can you name your files whatever you want, as long as you have the page title, description, and the rest of the SEO done properly?
Page names are often not very representative of the page content (I've seen pages named 7d57As09). Therefore search engines are not going to be particularly upset if the page names appear misleading. However, it's likely that the page name is one of many factors a search engine considers.
If there's no disadvantage in naming a page about cats, "cats.html", then do so! If it doesn't help your SEO, it will make it easier for your visitors!
If you want to rank higher when someone searches for 'banks', then yes, it can help you. But unless you are creating pages about cats in banks, I'm sure that this won't help you very much :)
It shouldn't affect your search engine ranking, but it may influence people who, having completed a search on Google (or some of the other great search engines, like um...uh...), are now scanning the results to decide where to click first. Someone faced with a URL like www.dummy.com/banks.html would be more likely to click than someone faced with www.dummy.com/default.php?p_id=1&sessid=876492u942fgspw24z, because most people haven't a clue what the last part means. A nicely written URL is also more memorable and gives people greater faith in getting back to the same site. No one who isn't Dustin Hoffman can remember the second URL without some intense memory training, while everyone can remember banks.html. Just make sure your URL generation is consistent and your rewriting is solid, so you don't end up with loads of page-not-found errors, which can hurt search engine ranking.
Ideally, your page name should be relevant to the content of the page - so your ranking may improve if you call the page "cats.html", as that is effectively another occurrence of the keyword in the page.
Generally, this is fairly minor compared to the benefits of decent keywords, titles, etc on the page. For more information take a look at articles around Url Strategy, for example:
"I’ve heard that search engines give some weighting to pages which contain keywords users are searching for which are contained within the page URL?"
Naming your pages something meaningful is a good idea and does improve SEO. It's another hint to the search engines what the page is about, in addition to the title and content. You would be surprised if you opened a file on your computer called "Letter to Grandma.doc" and it was actually your tax return!
In general, the best URLs are those that simply give a page name and hierarchical structure, without extensions or ID numbers. Keep it lowercase and separate words with dashes, like this:
example.com/my-cats
or
example.com/cats/mittens
In your case you will probably want to keep the .html extension to avoid complexities with URL rewriting.
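The lowercase-with-dashes convention described above is easy to automate. This is a minimal sketch, assuming page names arrive as free text; slugify and page_url are illustrative helper names, and dropping non-ASCII characters is a simplification you may not want for non-English sites.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Lowercase a page name and join its words with dashes."""
    # Strip accents and drop anything that cannot be represented in ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one dash.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def page_url(domain: str, *segments: str) -> str:
    """Join slugified segments into a hierarchical URL path."""
    return "/".join([domain, *(slugify(s) for s in segments)])


if __name__ == "__main__":
    print(page_url("example.com", "My Cats"))
    print(page_url("example.com", "Cats", "Mittens"))
```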
Under some circumstances this can be considered a black-hat SEO technique. Watch out not to be caught or reported by curious users.
Google's ranking algorithm has hundreds, thousands or even millions of variables and factors. From this point of view, you can be fairly sure that the names of the files on your website will have some effect on your ranking and/or your keyword targeting. Think about it.
There are few on-page elements that have significance. The URL, while it can be /234989782, is going to be more beneficial if it's named relevantly.
From any point of view, Google and all search engines like to see coherence between everything: if you have a page named XYZ, then Google will like it better if the text, meta tags, images, URL, documents, etc. on the page have XYZ in them. The stronger this synchronisation between the different elements on a page, the more the search engine sees how focused the content of that page is, resulting in more hits for you when someone looks up that focused search term.
If you have an image for example, you're better off having the same:
caption
description
name
alt text
(wordpress users will recognize that these are the four parameters that can be set for images on wordpress).
The same goes for all files you have on your website. Any parameter that can be seen by a search engine is better off optimized for the content that goes with it, in sync with all the other parameters of the same item.
The question of how useful this all is arises afterwards. Will I really rank lower if my alt text is different than the name of my image? Probably not by a lot. But either way, taking advantage of small subtleties like these can take you a long way in SEO. There are so many things we can't control in SEO (or that we shouldn't be able to control, like backlinks), that we have to use what we can control in the best way possible, to compensate.
It's also hard to tell whether all of this is still useful after Google's Panda and Penguin updates. It definitely has had less of an impact since those changes (before them, this kind of thing was crucial); the question is simply how much of an impact it still has. But all in all, as I said, whenever possible, name your files according to your content.
Today's algorithm is totally different from what it was when SEO was first introduced. SEO today is about content and its quality. A page must attract good readers and followers, so filenames and descriptions are no longer important.
The page name doesn't matter much in terms of SEO, but it is still one of Google's 200 ranking signals.
Naming a URL relevantly will probably reduce your bounce rate a little, because users who arrive at your site through organic search results otherwise don't know what the page contains.
Search engines also like it when a page name is relevant to the topic of the page.

Hiding or Promoting specific content within a page to search engines

A bit of an SEO question here.
I've got a site with a ton of pages of content. I know lots of the content is the same on each page.
I thought that Search Engines keyed off of the differences in page content so that they could promote the correct data, but when I look at the summary in google and bing, the summary shows my 'feedback' block (which is where I just ask for feedback).
Yahoo (and the summary in Facebook) shows my search options menu.
These aren't really things that are going to make a person want to click on the page.
So I'm wondering what the best way is to either hide this content from search engines, or improve the visibility of the other content that should get indexed.
The page structure is pretty consistent, so I thought it would have been easy for the search robots to pick this stuff out, but apparently not.
You may want to try using a meta tag like this.
<meta name="description" content="Here is a short summary of the page">
Search engines also prefer title and header tags over regular text.
Meta is the best way to do that.
However, beware that the structure of your page is also important: search engines prefer to use the meta tag, but they also weigh the structure, keywords, headers and things like that.
I ran into this trouble a couple of months ago. I found that Google showed the price and download text rather than the meta description. I solved that by rewriting the meta description (more accurate and shorter, 177 characters) and removing the tags around the price and download elements. I also made some slight adjustments to the structure. Now the Google summary is what I want.
Hope this helps you!
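If the pages are generated from templates, the description tag can be produced per page rather than written by hand. Below is a hedged Python sketch that trims a summary at a word boundary and escapes it for use in the tag. The meta_description name is made up, and the 160-character default is only an assumption for illustration: search engines do not publish a fixed limit, and the answer above settled on 177 characters.

```python
from html import escape


def meta_description(summary: str, max_len: int = 160) -> str:
    """Return a description meta tag, trimmed at a word boundary."""
    text = " ".join(summary.split())  # collapse newlines and repeated spaces
    if len(text) > max_len:
        # Leave room for the ellipsis and avoid cutting a word in half.
        text = text[: max_len - 3].rsplit(" ", 1)[0].rstrip(",.;:") + "..."
    return f'<meta name="description" content="{escape(text, quote=True)}">'


if __name__ == "__main__":
    print(meta_description("Here is a short summary of the page"))
```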

Tool or methods for automatically creating contextual links within a large corpus of content?

Here's the basic scenario - I have a corpus of say 100,000 newspaper-like articles. Minimally they will all have a well-defined title, and some amount of body content.
What I want to do is find runs of text in articles that ought to link to other articles.
So, if article Foo has a run of text like "Students in 8th grade are being encouraged to read works by Jean-Paul Sartre" and article Bar is titled (and about) "The important works of Jean-Paul Sartre", I'd like to automagically create that HTML link from Foo to Bar within the text of Foo.
You should ask yourself something before adding the links: what benefit for users do you want to achieve by doing this? You probably want to increase the navigability of your site. Maybe it is better to create an easier way to add links to older articles in the form used to submit new ones. Maybe it is possible to add a "one click search for selected text" feature. Maybe you can add wiki-like functionality that lets users propose a link for selected text. You probably want to add links to related articles (generated through a tagging system or text mining) below the articles.
Some potential problems with fully automated link adder:
You may need to implement a good word sense disambiguation algorithm to avoid confusing or even irritating the user by placing bad automatic links with regex (or simple substring matching).
As the number of articles is large, you do not want to generate the HTML for the extra links on every request; cache it instead.
You need to decide how to handle duplicate titles, or titles that contain another title as a substring (either take the longest title, link to the most recent article, or prefer an article from the same category).
TLDR version: find alternative solutions that provide desired functionality to the users.
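If you do go ahead with automatic linking, the title-collision decision above can be sketched in a few lines of Python. This is only an illustration under several assumptions: article bodies are plain text without existing markup, titles start and end with word characters, the longest title wins, and only the first occurrence of each title is linked. The build_linker name, titles and URLs are all invented, and there is no word sense disambiguation here.

```python
import re
from html import escape


def build_linker(titles_to_urls: dict):
    """Return a function that links the first mention of each known title."""
    if not titles_to_urls:
        return lambda text: text
    # Longest titles first, so "Apache Lucene" wins over "Lucene".
    ordered = sorted(titles_to_urls, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {title.lower(): url for title, url in titles_to_urls.items()}

    def add_links(text: str) -> str:
        seen = set()

        def replace(match):
            key = match.group(1).lower()
            if key in seen:  # link only the first occurrence per title
                return match.group(0)
            seen.add(key)
            href = escape(lookup[key], quote=True)
            return f'<a href="{href}">{match.group(0)}</a>'

        return pattern.sub(replace, text)

    return add_links


if __name__ == "__main__":
    link = build_linker({"Lucene": "/a/lucene",
                         "Apache Lucene": "/a/apache-lucene"})
    print(link("We index with Apache Lucene. Lucene is fast, and Lucene is free."))
```

The compiled pattern is built once per set of titles, and the linked HTML for each article can be stored rather than regenerated on every request, in line with the caching point above.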
What you are looking for are text mining tools. You can find more info and links at http://en.wikipedia.org/wiki/Text_mining. You might also want to check out Lucene and its ports at http://lucene.apache.org. Using these tools, the basic idea would be to find a set of similar articles based on the article (or title) in question. You could search various properties of the article, including titles, content, or both. A tagging system a la Delicious (or Stack Overflow) might also be helpful. Rather than pre-creating the links between articles, you'd present the relevant articles in an interface much like the Related questions interface on the right-hand side of this page.
If you wanted to find and link specific text in each article, I think you'd need to do some preprocessing to select pertinent phrases to key on. Even then I think it would be very hard not to miss things due to punctuation/misspellings or to not include irrelevant links for the same reasons.
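To give a feel for the "find similar articles" idea, here is a small pure-Python sketch that ranks articles by TF-IDF cosine similarity. This is a stand-in technique chosen for illustration, not how Lucene works internally, and comparing every pair of articles like this would be far too slow for 100,000 documents, which is where a real index earns its keep. The function names and sample articles are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Map each document id to a {term: weight} TF-IDF vector."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n = len(docs)
    return {
        doc_id: {
            # Smoothed inverse document frequency; rarer terms weigh more.
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, tf in c.items()
        }
        for doc_id, c in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def related(doc_id, vectors, top_n=3):
    """Return up to top_n other document ids, most similar first."""
    scores = [
        (cosine(vectors[doc_id], vec), other)
        for other, vec in vectors.items()
        if other != doc_id
    ]
    return [other for score, other in sorted(scores, reverse=True)[:top_n] if score > 0]


if __name__ == "__main__":
    docs = {
        "foo": "Students in 8th grade are encouraged to read works by Sartre",
        "bar": "The important works of Sartre",
        "baz": "Local football team wins the cup",
    }
    print(related("foo", tfidf_vectors(docs)))
```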

SEO for Ultraseek 5.7

We've got Ultraseek 5.7 indexing the content on our corporate intranet site, and we'd like to make sure our web pages are being optimized for it.
Which SEO techniques are useful for Ultraseek, and where can I find documentation about these features?
Features I've considered implementing:
Make the title and first H1 contain the most valuable information about the page
Implement a sitemap.xml file
Ping the Ultraseek xpa interface when new content is added
Use "SEO-Friendly" URL strings
Add Meta keywords to the HTML pages.
The most important bit of advice anyone can get when optimizing a website for search engines and indeed for tools like Ultraseek is this...
Write your web pages for your human audience first and foremost. Don't do anything odd to try and optimize your website for a search engine. Don't stuff keywords into your URL if it makes the URL less sensible. Think human first.
Having said this, the following techniques usually make things better for both the humans and the machines!
Use headings (h1 through h6) to give your page a structure. Imagine them being arranged in a tree view, with an h1 containing some h2 tags, h2 tags containing h3 tags, and so on. I usually use the h1 tag (there should be only one h1 tag) for the site name and the h2 tag for the page name, with h3 tags as sub-headings where appropriate.
Sitemaps are very useful as they contain a list of your pages; consider this a request for the pages you would like included in any index. They don't normally contain much context though.
Friendly URL strings are great for humans. I'd much rather visit www.website.com/Category/Music/ than www.website.com?x=3489 - it does also mean that you give the machines some more context for your page. It especially helps if the URL matches your h1 and h2 tags. Like this:
www.website.com/Category/Music/
<h1>Website</h1>
<h2>Category: Music</h2>
<p>Welcome to the music category!</p>
Meta keywords (and description) are useful - but as per the above advice, you need to make sure that it all matches up. Use a small but targeted set of keywords that highlight what is specifically different about the page and make sure your description is a good summary of the page content. Imagine that it is used beneath the title in a list of search results (even though it might not be!)
Navigation! Providing clear navigation, as well as back links (such as bread-crumbs) will always help. If someone clicks on a search result, it might not be the exact page they are after, but it may well be very close. By highlighting where people have landed in your navigation and by providing a bread-crumb that tells them where they are, they will be able to traverse your pages easily even if the search hasn't taken them to the perfect location.
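For the sitemap idea discussed above, a sitemap.xml file in the standard sitemaps.org format can be generated with Python's standard library. This is a minimal sketch; build_sitemap is an illustrative name, the URL is the example one from this answer, and whether Ultraseek 5.7 actually consumes this format is not confirmed here, so check its documentation first.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    # The declaration says UTF-8, so write the file with that encoding.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    xml_text = build_sitemap(["http://www.website.com/Category/Music/"])
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write(xml_text)
```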