Is there a way that is more efficient than a sitemap to add/force recrawl/remove your website's index entries in Google?

Pretty much, that is the question. Is there a way that is more efficient than the standard sitemap.xml to [add/force recrawl/remove], i.e. manage, your website's index entries in Google?
I remember reading an article a few years ago by a blogger I don't know, who said that whenever he published news on his website, the URL of the news item would appear in Google's search results almost immediately. I think he mentioned something special; I don't remember exactly what... some automatic re-crawling system offered by Google themselves? However, I'm not sure about it. So I ask: am I fooling myself, and is there NO other way to manage index content besides sitemap.xml? I just need to be sure about this.
Thank you.

I don't think you will find that magical "silver bullet" answer you're looking for, but here's some additional information and tips that may help:
Depth of crawl and rate of crawl are directly influenced by PageRank (one of the few things it does influence). So increasing the back-link count and quality of your site's homepage and internal pages will assist you.
QDF - this Google algorithm factor, "Query Deserves Freshness", does have a real impact and is one of the core reasons behind the Google Caffeine infrastructure project to allow much faster finding of fresh content. This is one of the main reasons that blogs and sites like SE do well - because the content is "fresh" and matches the query.
XML sitemaps do help with indexation, but they won't result in better ranking. Use them to assist search bots to find content that is deep in your architecture.
Pinging services that monitor site changes, like Ping-O-Matic, especially from blogs, can really assist in pushing notification of your new content; this can also ensure the search engines become aware of it almost immediately (see the sketch after these tips).
Crawl Budget - be mindful of wasting a search engine's time on parts of your site that don't change or don't deserve a place in the index - using robots.txt and the robots meta tags can herd the search bots to different parts of your site (use with caution so as to not remove high value content).
Many of these topics are covered online, but there are other intrinsic things like navigational structure, internal linking, site architecture etc that also contribute just as much as any "trick" or "device".
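To illustrate the pinging tip above: many blog platforms already send a weblogUpdates.ping XML-RPC call automatically when you publish, so a manual ping is mainly useful for hand-rolled sites. A minimal sketch in Python, assuming Ping-O-Matic's public XML-RPC endpoint at rpc.pingomatic.com (verify the endpoint before relying on it):

```python
import xmlrpc.client

def ping_site_update(site_name: str, site_url: str) -> dict:
    """Notify Ping-O-Matic that the site has fresh content.

    Uses the standard weblogUpdates.ping method; Ping-O-Matic relays the
    notification to the change-monitoring services it supports.
    """
    server = xmlrpc.client.ServerProxy("http://rpc.pingomatic.com/")
    # Typically returns something like {'flerror': False, 'message': '...'}
    return server.weblogUpdates.ping(site_name, site_url)

if __name__ == "__main__":
    print(ping_site_update("Example Blog", "https://www.example.com/"))
```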

Getting many links, from good sites, to your website will make the Google "spiders" reach your site faster.
Also links from social sites like Twitter can help the crawlers visit your site (although the Twitter links do not pass "link juice" - the spiders still go through them).
One last thing: update your content regularly, and think of content as "Google spider food". If the spiders come to your site and don't find new food, they won't come back again soon; if there is new food each time they come, they will come a lot. Article directories, for example, get indexed several times a day.

Related

Linking together >100K pages without getting SEO penalized

I'm making a site which will have reviews of the privacy policies of hundreds of thousands of other sites on the internet. Its initial content is based on my running through the CommonCrawl 5 billion page web dump and analyzing all the privacy policies with a script, to identify certain characteristics (e.g. "Sells your personal info").
According to the SEO MOZ Beginner's Guide to SEO:
Search engines tend to only crawl about 100 links on any given page. This loose restriction is necessary to keep down on spam and conserve rankings.
I was wondering what would be a smart way to create a web of navigation that leaves no page orphaned, but would still avoid this SEO penalty they speak of. I have a few ideas:
Create alphabetical pages (or Google Sitemap .xml's), like "Sites beginning with Ado*". And it would link "Adobe.com" there for example. This, or any other meaningless split of the pages, seems kind of contrived and I wonder whether Google might not like it.
Using meta keywords or descriptions to categorize
Find some way to apply more interesting categories, such as geographical or content-based. My concern here is I'm not sure how I would be able to apply such categories across the board to so many sites. I suppose if need be I could write another classifier to try and analyze the content of the pages from the crawl. Sounds like a big job in and of itself though.
Use the DMOZ project to help categorize the pages.
Wikipedia and StackOverflow have obviously solved this problem very well by allowing users to categorize or tag all of the pages. In my case I don't have that luxury, but I want to find the best option available.
At the core of this question is how Google responds to different navigation structures. Does it penalize those who create a web of pages in a programmatic/meaningless way? Or does it not care so long as everything is connected via links?
Google PageRank does not penalize you for having >100 links on a page. But each link above a certain threshold decreases in value/importance in the PageRank algorithm.
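To see why extra links dilute value rather than trigger a penalty: in the classic PageRank formulation a page splits its transferable rank evenly across its outlinks, so each of N links passes roughly 1/N of it. A toy power-iteration sketch in Python (my own illustration, not Google's actual implementation, and it ignores refinements like dangling-node handling):

```python
from collections import defaultdict

def pagerank(links: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Toy PageRank: each page divides its rank evenly among its outlinks,
    so a page with 200 outlinks passes half as much per link as one with 100."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        incoming = defaultdict(float)
        for page, targets in links.items():
            if targets:
                share = rank[page] / len(targets)  # value passed per individual link
                for t in targets:
                    incoming[t] += share
        rank = {p: (1 - damping) / len(pages) + damping * incoming[p] for p in pages}
    return rank

if __name__ == "__main__":
    # A hub linking to 200 pages: each target receives only 1/200 of the hub's vote.
    graph = {"hub": [f"item{i}" for i in range(200)]}
    print(pagerank(graph)["item0"])
```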
Quoting SEOMOZ and Matt Cutts:
Could You Be Penalized?
Before we dig in too deep, I want to make it clear that the 100-link limit has never been a penalty situation. In an August 2007 interview, Rand quotes Matt Cutts as saying:
The "keep the number of links to under 100" is in the technical guideline section, not the quality guidelines section. That means we're not going to remove a page if you have 101 or 102 links on the page. Think of this more as a rule of thumb.
At the time, it's likely that Google started ignoring links after a certain point, but at worst this kept those post-100 links from passing PageRank. The page itself wasn't going to be de-indexed or penalized.
So the question really is how to get Google to take all your links seriously. You accomplish this by generating an XML sitemap for Google to crawl (you can either have a static sitemap.xml file, or its content can be dynamically generated). You will want to read up on the About Sitemaps section of the Google Webmaster Tools help documents.
Just like having too many links on a page is an issue, having too many links in an XML sitemap file is also an issue. What you need to do is paginate your XML sitemap (a sketch of this follows below). Jeff Atwood talks about how StackOverflow implements this: The Importance of Sitemaps. Jeff also discusses the same issue on StackOverflow podcast #24.
Also, this concept applies to Bing as well.
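As a concrete illustration of paginating a large sitemap: the sitemaps.org protocol allows at most 50,000 URLs per sitemap file and provides a sitemap index file to tie the pieces together. A minimal sketch in Python, with example.com and the file names as hypothetical placeholders:

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file limit from the sitemaps.org protocol

def write_paginated_sitemaps(urls, base_url, out_dir="."):
    """Split `urls` into sitemap files of at most 50,000 entries each and
    write a sitemap index that references every chunk."""
    chunks = [urls[i:i + MAX_URLS_PER_FILE] for i in range(0, len(urls), MAX_URLS_PER_FILE)]
    names = []
    for n, chunk in enumerate(chunks, start=1):
        name = f"sitemap-{n}.xml"
        names.append(name)
        entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in chunk)
        with open(f"{out_dir}/{name}", "w", encoding="utf-8") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
                    f"{entries}\n</urlset>\n")
    # The index file is the single URL you submit via Webmaster Tools or robots.txt.
    index = "\n".join(f"  <sitemap><loc>{escape(base_url + name)}</loc></sitemap>" for name in names)
    with open(f"{out_dir}/sitemap-index.xml", "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
                f"{index}\n</sitemapindex>\n")

if __name__ == "__main__":
    # Hypothetical example: 120,000 review pages -> three sitemap files plus one index.
    pages = [f"https://www.example.com/reviews/site-{i}" for i in range(120_000)]
    write_paginated_sitemaps(pages, "https://www.example.com/")
```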

Is listing all products on the homepage's footer making a real difference SEO-wise?

I'm working on a website on which I am asked to add to the homepage's footer a list of all the products that are sold on the website along with a link to the products' detail pages.
The problem is that there are about 900 items to display.
Not only does that not look good, but it also makes the page render a lot slower.
I've been told that such a technique would improve the website's visibility in search engines.
I've also heard that such techniques could lead to the opposite effect: Google seeing it as "spam".
My question is: is listing a website's products on its homepage really effective when it comes to becoming more visible in search engines?
That technique is called keyword stuffing and Google says that it's not a good idea:
"Keyword stuffing" refers to the practice of loading a webpage with keywords in an attempt to manipulate a site's ranking in Google's search results. Filling pages with keywords results in a negative user experience, and can harm your site's ranking. Focus on creating useful, information-rich content that uses keywords appropriately and in context.
Now you might want to ask: does their crawler really realize that the list at the bottom of the page is just keyword stuffing? Well, that's a question that only Google could answer (and I'm pretty sure they don't want to). In any case: even if you could make a keyword-stuffing block that is not recognized, they will probably improve their algorithm and -- sooner or later -- discover the truth. My recommendation: don't do it.
If you want to optimize your search engine page ranking, do it "the right way" and read the Search Engine Optimization Guide published by Google.
Google is likely to see a huge list of keywords at the bottom of each page as spam. I'd highly recommend not doing this.
When is it ever a good idea to show 900 items to a user? Good practice dictates that large lists are usually paginated to avoid giving the user a huge blob of stuff to look through at once.
That's a good rule of thumb: if you're doing it to help the user, then it's probably good... if you're doing it purely to help a machine (i.e. Google/Bing), then it might be a bad idea.
You can return different HTML to genuine users and to Google by inspecting the user agent of the web request.
That way you can provide the Googlebot with a lot more text than you'd give a human user.
Update: People have pointed out that you shouldn't do this. I'm leaving this answer up though so that people know it's possible but bad.
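For reference, the user-agent check described above looks something like the following minimal Flask sketch (my own illustration). Be aware that serving crawlers different content than humans is "cloaking", which Google's guidelines prohibit and which can get a site penalized or removed from the index:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/products")
def products():
    # Cloaking: serving crawlers different content than human visitors.
    # Shown only to illustrate the technique the answer above describes
    # (and advises against); it violates search engine guidelines.
    user_agent = request.headers.get("User-Agent", "").lower()
    if "googlebot" in user_agent or "bingbot" in user_agent:
        return "Full text dump of all 900 product descriptions..."
    return "Paginated product listing for human visitors."

if __name__ == "__main__":
    app.run()
```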

How to Develop a Successful Sitemap

I have been browsing around on the internet and researching effective sitemap web pages. I have encountered these two sitemaps and am questioning their effectiveness.
http://www.webanswers.com/sitemap/
http://www.answerbag.com/sitemap/
Are these sitemaps effective?
Jeff Atwood (one of the guys who made this site) wrote a great article on the importance of sitemaps:
I'm a little aggravated that we have to set up this special file for the Googlebot to do its job properly; it seems to me that web crawlers should be able to spider down our simple paging URL scheme without me giving them an explicit assist.
The good news is that since we set up our sitemaps.xml, every question on Stack Overflow is eminently findable. But when 50% of your traffic comes from one source, perhaps it's best not to ask these kinds of questions.
So yeah, effective for people, or effective for Google?
I would have thought an HTML sitemap should be useful to a human, whereas these two aren't. If you're trying to target a search engine, then a sitemap.xml file that conforms to sitemaps.org would be a better approach. While the HTML approach would work, it's easier to generate an XML file and have your robots.txt file point at it.

What are the common sense SEO practices that aren't dodgy or crap? [closed]

In SEO there are a few techniques that have been flagged as needing to be avoided at all costs. These are all techniques that used to be perfectly acceptable but are now taboo.
Number 1: Spammy guest blogging: blowing up a page with guest comments is no longer a benefit.
Number 2: Optimized anchors: these have become counterproductive; use safe anchors instead.
Number 3: Low-quality links: often sites will be flooded with hyperlinks that take you to low-quality Q&A sites; don't do this.
Number 4: Keyword-heavy content: try to avoid too much of this; use longer, well-written sections more liberally.
Number 5: Link-back overuse: back links can be a great way to redirect to your site, but over-saturation will make people feel trapped.
Content, Content, CONTENT! Create worthwhile content that other people will want to link to from their sites.
Google has the best tools for webmasters, but remember that they aren't the only search engine around. You should also look into Bing and Yahoo!'s webmaster tool offerings (here are the tools for Bing; here for Yahoo). Both of them also accept sitemap.xml files, so if you're going to make one for Google, then you may as well submit it elsewhere as well.
Google Analytics is very useful for helping you tweak this sort of thing. It makes it easy to see the effect that your changes are having.
Google and Bing both have very useful SEO blogs. Here is Google's. Here is Bing's. Read through them--they have a lot of useful information.
Meta keywords and meta descriptions may or may not be useful these days. I don't see the harm in including them if they are applicable.
If your page might be reached by more than one URL (i.e., www.mysite.com/default.aspx versus mysite.com/default.aspx versus www.mysite.com/), then be aware that that sort of thing sometimes confuses search engines, and they may penalize you for what they perceive as duplicated content. Use the link rel="canonical" element to help avoid this problem (see the sketch after this list).
Adjust your site's layout so that the main content comes as early as possible in the HTML source.
Understand and utilize your robots.txt and meta robots tags.
When you register your domain name, go ahead and claim it for as long of a period of time as you can. If your domain name registration is set to expire ten years from now rather than one year from now, search engines will take you more seriously.
As you probably know already, having other reputable sites that link to your site is a good thing (as long as those links are legitimate).
I'm sure there are many more tips as well. Good luck!
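On the duplicate-URL point above: besides the rel="canonical" element, a common complement is a server-side 301 redirect that collapses every host variant onto one canonical hostname. A minimal sketch in Python with Flask, where www.example.com stands in for your canonical host (an assumption for illustration):

```python
from flask import Flask, redirect, request

app = Flask(__name__)
CANONICAL_HOST = "www.example.com"  # hypothetical canonical hostname

@app.before_request
def enforce_canonical_host():
    """301-redirect any non-canonical host to the canonical one so search
    engines see a single URL per page instead of duplicates."""
    if request.host != CANONICAL_HOST:
        target = f"https://{CANONICAL_HOST}{request.full_path.rstrip('?')}"
        return redirect(target, code=301)

@app.route("/")
def home():
    return "Homepage"

if __name__ == "__main__":
    app.run()
```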
In addition to having quality content, content should be added/updated regularly. I believe that Google (and likely others) will have some bias toward the general "freshness" of content on your site.
Also, try to make sure that the content that the crawler sees is as close as possible to what the user will see (this can be tricky for localized pages). If you're careless, your site may be blacklisted for "bait-and-switch" tactics.
Don't implement important text-based sections in Flash - Google will probably not see them, and if it does, it'll screw it up.
Google can index Flash. I don't know how well, but it can. :)
A well organized, easy to navigate, hierarchical site.
There are many SEO practices that all work and that people should take into consideration. But fundamentally, I think it's important to remember that Google doesn't necessarily want people to be using SEO. More and more, Google is striving to create a search engine that is capable of ranking websites based on how good the content is, and solely on that. It wants to be able to see what good content is in ways in which we can't trick it. Think about it: at the very beginning of search engines, a site which had the same keyword repeated 200 times on the same webpage was sure to rank for that keyword, just like a site with any number of backlinks, regardless of the quality or PR of the sites they came from, was assured Google popularity. We're past that now, but SEO is still, in a certain way, tricking a search engine into believing that your site has good content, because you buy backlinks, or comments, or such things.
I'm not saying that SEO is a bad practice, far from it. But Google is taking more and more measures to make its search results independent of the regular SEO practices we use today. That is why I can't stress this enough: write good content. Content, content, content. Make it unique, make it new, add it as often as you can. A lot of it. That's what matters. Google will always rank a site if it sees that there is a lot of new content, and even more so if it sees content coming onto the site in other ways, especially through commenting.
Common sense is uncommon. Things that appear obvious to me or you wouldn't be so obvious to someone else.
SEO is the process of effectively creating and promoting valuable content or tools, ensuring either is totally accessible to people and robots (search engine robots).
The SEO process includes and is far from being limited to such uncommon sense principles as:
Improving page load time (through minification, including a trailing slash in URLs, eliminating unnecessary code or db calls, etc.)
Canonicalization and redirection of broken links (organizing information and ensuring people/robots find what they're looking for)
Coherent, semantic use of language (from inclusion and emphasis of targeted keywords where they semantically make sense [and earn a rankings boost from SE's] all the way through semantic permalink architecture)
Mining search data to determine what people are going to be searching for before they do, and preparing awesome tools/content to serve their needs
SEO matters when you want your content to be found/accessed by people -- especially for topics/industries where many players compete for attention.
SEO does not matter if you do not want your content to be found/accessed, and there are times when SEO is inappropriate. Motives for not wanting your content found -- the only instances when SEO doesn't matter -- might vary, and include:
Privacy
When you want to hide content from the general public for some reason, you have no incentive to optimize a site for search engines.
Exclusivity
If you're offering something you don't want the general public to have, you need not necessarily optimize that.
Security
For example, say, you're an SEO looking to improve your domain's page load time, so you serve static content through a cookieless domain. Although the cookieless domain is used to improve the SEO of another domain, the cookieless domain need not be optimized itself for search engines.
Testing In Isolation
Let's say you want to measure how many people link to a site within a year which is completely promoted with AdWords, and through no other medium.
When One's Business Doesn't Rely On The Web For Traffic, Nor Would They Want To
Many local businesses or businesses which rely on point-of-sale or earning their traffic through some other mechanism than digital marketing may not want to even consider optimizing their site for search engines because they've already optimized it for some other system, perhaps like people walking down a street after emptying out of bars or an amusement park.
When Competing Differently In A Saturated Market
Let's say you want to market entirely through social media, or internet cred & reputation here on SE. In such instances, you don't have to worry much about SEO.
Keep it real and build for users, not for robots, and you will reach success!
Thanks!

Getting Good Google PageRank [closed]

In SEO people talk a lot about Google PageRank. It's kind of a catch-22, because until your site is actually big (at which point you don't really need search engines as much), it's unlikely that big sites will link to you and increase your PageRank!
I've been told that it's easiest to simply get a couple of high-quality links pointing to a site to raise its PageRank. I've also been told that there are certain open directories, like dmoz.org, that Google pays special attention to (since they're human-managed links). Can anyone speak to the validity of this or suggest another site/technique to increase a site's PageRank?
Have great content
Nothing helps your google rank more than having content or offering a service people are interested in. If your web site is better than the competition and solves a real need you will naturally generate more traffic and inbound links.
Keep your content fresh
Use friendly URLs that contain keywords
Good: http://cars.com/products/cars/ford/focus/
Bad: http://cars.com/p?id=1232
Make sure the page title is relevant and well constructed
For example: Buy A House In France :. Property Purchasing in France
Use a domain name that describes your site
Good: http://cars.com/
Bad: http://somerandomunrelateddomainname.com/
Example
Type car into Google: out of the top 5 links, 4 have car in the domain: http://www.google.co.uk/search?q=car
Make it accessible
Make sure people can read your content. This includes a variety of different audiences
People with disabilities: Sight, motor, cognitive disabilities etc..
Search bots
In particular make sure search bots can read every single relevant page on your site. Quite often search bots get blocked by the use of javascript to link between pages or the use of frames / flash / silverlight. One easy way to do this is have a site map page that gives access to the whole site, dividing it into categories / sub categories etc..
Down level browsers
Submit your site map automatically
Most search engines allow you to submit a list of pages on your site including when they were last updated.
Google: https://www.google.com/webmasters/tools/docs/en/about.html
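Search engines have also historically accepted a simple HTTP "ping" telling them the sitemap has changed, which is easy to hook into a publish script. A rough sketch in Python; the endpoints shown were the documented Google and Bing ping URLs at the time, so treat them as assumptions to verify before use:

```python
import urllib.parse
import urllib.request

# Historically documented ping endpoints; confirm they are still supported.
PING_ENDPOINTS = [
    "https://www.google.com/ping?sitemap={}",
    "https://www.bing.com/ping?sitemap={}",
]

def ping_sitemap(sitemap_url: str) -> None:
    """Notify search engines that the sitemap at `sitemap_url` was updated."""
    encoded = urllib.parse.quote(sitemap_url, safe="")
    for endpoint in PING_ENDPOINTS:
        with urllib.request.urlopen(endpoint.format(encoded)) as response:
            print(endpoint.format(encoded), "->", response.status)

if __name__ == "__main__":
    ping_sitemap("https://www.example.com/sitemap-index.xml")
```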
Inbound links
Generate as much buzz about your website as possible to increase the likelihood of people linking to you. Blog/podcast about your website if appropriate. List it in online directories (if appropriate).
References
Google Search Engine Ranking Factors, by an SEO company
Creating a Google-friendly site: Best practices
Wikipedia - Search engine optimization
Good content.
Update it often.
Read and digest everything at Creating a Google-friendly site: Best practices.
Be active on the web. Comment in blogs, correspond genuinely with people, in email, im, twitter.
I'm not too sure about the domain name. Wikipedia? What does that mean? Mozilla? What word is that? Google? Was a typo. Yahoo? Sounds like that chocolate drink Yoohoo.
Trying to keyword the domain name shoehorns you anyway. And it can be construed as an SEO technique in the future (if it isn't already!)
Answer all email. Answer blog comments. Be nice and helpful.
Go watch garyvee's Better Than Zero. That'll motivate you.
If it's appropriate, having a blog is a good way of keeping content fresh, especially if you post often. A CMS would be handy too, as it reduces the friction of updating. The best way would be user-generated content, as other people make your site bigger and updated, and they may well link to their content from their other sites.
Google doesn't want you to have to engineer your site specifically to get a good PageRank. Having popular content and a well designed website should naturally get you the results you want.
An easy trick is to use:
Google Webmaster Tools: https://www.google.com/webmasters/tools
And you can generate a sitemap using http://www.xml-sitemaps.com/
Then, don't forget to use www.google.com/analytics/
And be careful: most SEO guides are not correct, and playing fair is not always the best approach. For example, everyone says that spamming .edu sites is bad and ineffective, but it is effective.