One of my client having website which is entirely based on API Content i.e. content coming from 3rd party website. He wants to do some seo on the data. I wonder if it is possible as there is data not available in his database and i think google crawler redirect to 3rd party website while crawling on such pages. We already asked for permission from that website owner to let us store API data on our end in order to do some SEO but he refused our request.
It will be highly appericited if you can suggest any other way that should not be against policies and guidelines.
Thank You
Vikas S.
Yes - with a huge BUT:
Google explains how parameters can be set within their Search Console (Google Webmaster) and how these can effect the crawler's behaviour.
#Nadeem Haddadeen is right with the canonical links between duplicates. There's also an issue if you don't have consistent content when calling up the same parameters. This essentially makes your page un-indexable as it's dynamic content. If you are dealing with dynamic content then you need to optimise a host page based around popular queries rather than trying to have your content rate itself.
It's not recommended to take the same content and post it on your website, its duplicate and Google will give you penalty.
If you still want to post it on your website, you have to make some changes on the original text and then post it on your website to look like its original.
Also if you want to keep it without any changes and to avoid any penalties from Google, you you have to add a link for the original article from your website or add a cross domain canonical link like the below example:
<link rel="canonical" href="https://example.com/original-article-url" />
Related
I am making a social network application where user will come and share the posts like facebook. But now I have some doubts like lets say a user is just shared a content by coping it from another site and same with the case of images. So does google crawler consider it as a duplicate content or not?
If yes then how I can tell to the google crawler that "don't consider it as a spam, its a social networking site and the content is shared by the user not by the me". Is there any way or any kind of technique that help me.
Google might consider it to be duplicate content, in which case the search algorithm will choose 1 version, which it believes to be the original or more important one and drop the other.
This isn't a bad thing per se - unless you see that most of your site's content is becoming duplicated.
You can use canonical URL declarations to do what you are saying, but i wouldn't advise it.
If your website belongs to one of these types - forum or e-commerce, it will not be punished for duplicate content issue. I think "social platform" is one type of forum.
If your pages are too similar, the result is that the two or more similar pages will scatter the click rate, flow etc, so the rank in SERPs may not look well.
I suggest do not use "canonical" because this instruction tell the crawlers do not crawl/count this page. If you use it, in the webmaster tool, you will see the indexed pages decrease a lot.
Do not too worry about the duplicate content issue. You can see this article: Google’s Matt Cutts: Duplicate Content Won’t Hurt You, Unless It Is Spammy
I've developed a service that allows users to search for stores on www.mysite.com.
I also have partners that uses my service. To the user, it looks like they are on my partners web site, when in fact they are on my site. I have only replaced my own header and footer, with my partners header and footer.
For the user, it looks like they are on mysite.partner.com when in reality they are on partner.mysite.com.
If you understood what I tried to explain, my question is:
Will Google and other search engines consider this duplicate content?
Update - canonical page
If I understand canonical pages correctly, www.mysite.com is my canonical page.
So when my partner uses mysite.partner.com?store=wallmart&id=123 which "redirects" (CNAME) to partner.mysite.com?store=wallmart&id=123, my server recognize my sub-domain.
So what I need to do, is to dynamically add the following in my <HEAD> section:
<link rel="canonical" href="mysite.com?store=wallmart&id=123">
Is this correct?
It's duplicate content but there is no penalty as such.
The problem is, for a specific search Google will pick one version of a page and filter out the others from the results. If your partner is targeting the same region then you are in direct competition.
The canonical tag is a way to tell Google which is the official version. If you use it then only the canonical page will show up in search results. So if you canonicalise back to your domain then your partners will be excluded from search results. Only your domains pages will ever show up. Not good for your partners.
There is no win. The only way your partners will do well is if they have their own content or target a different region and you don't do the canonical tag.
So your partners have a chance, I would not add the canonical. Then it's down to the Google gods to decide which of your duplicate pages gets shown.
Definitely. You'll want to use canonical tagging to stop this happening.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
Yes. It will be considered as duplicate content by Google. Cause you have replaced only footer and header. By recent Google algorithm, content should be unique for website or even blog. If content is not unique, your website will be penalized by Google.
I have a website which when you first go to the website it will just display the normal domain so /. When they use the form they will get forwarded to lets say /question/DYNAMIC(question id).
So google has no way to see these links.
Is there a way to tell google about all of these links without manually putting these in and without having to keep this up-to-date as some question might be removed at a later date?
Submit an XML sitemap
I have a website and in my website I have, for example, a list of Audi models. I saw, using google webmaster tools, that my website appears in the google search by the word audi, but the target page was the 22nd page from my result set, not the first. I need my first page to appead, not my last (or middle), but I cannot tell google that this is a parameter, because my URLs are rewritten using mod rewrite. Any ideas?
BTW, I have read in a SEO forum, that it's a bad idea to use a cannonical tag. So is it really a bad idea in my case?
You can't force Google to do anything, however, they have made it easier to deal with pagination issues with a recent post on rel="next" and rel="prev".
But the primary problem you face is signalling to Google that your first (main) page is the starting point - this is achieved using internal link and back-link "juice" focussed on that page. You need to ensure that the first page of results is linked to properly from higher-value pages (like the home-page).
Google recently announced that you can use View All which will allow them to find and index entire articles that are normally broken up using pagination and display them all as one result.
I am using JBoss Seam on a Jetty web server and am having some issues with the query parameters breaking links when they appear in google searches.
The first parameter is one JBoss Seam uses to track conversations, cid or conversationId. This is a minor issue as Google is complaining I am submitting different urls with the same information.
Secondly, would it make sense to publish/remove urls via the Google Webmaster API instead of publishing/removing via the sitemap?
Walter
Hey Walter, I would recommend that you use the rel=canonical tag to tell the search engines to ignore certain parameters in your URL strings. The canonical tag is a common standard that Google, Yahoo and Microsoft have committed to supporting.
For example, if JBoss is creating URLs that look like this: mysite.com?cid=FOO&conversationId=BAR, then you can create a canonical tag in the section of your website like this:
<html>
<head>
<link rel="canonical" href="http://mysite.com" />
</head>
</html>
The search engines will use this information to normalize the URLs on your website to the canonical (or shortest & most authoritative) version. Specifically, they will treat this as a 301 redirect from the URL of the HTTP request to the URL specified in the canonical tag (as long as you haven't done anything silly, like make it an infinite loop, or pointed to a URL that doesn't exist).
While the canonical tag is pretty fricken cool, it is only a 90% solution, in that you can still run into issues with metrics tracking with all the extra parameters on your website. The best solution would be to update your infrastructure to trap these tracking parameters, create a cookie, and then use a 301 redirect to redirect the URL to the canonical version. However, this can be a prohibitive amount of work for that extra 10% gain, so many people prefer to start with the canonical tag.
As for your second question, generally you don't want to remove these URLs from Google if people are linking to them. By using the canonical tag, you achieve the same goal, but don't loose any value of the inbound links to your website.
For more information about the canonical tag, and the specific issues & solutions, check out this article I wrote on it here: http://janeandrobot.com/library/url-referrer-tracking.
Google Webmaster Tools will tell you about duplicate titles and other issues that Google see that are being caused by "duplicates" that are really the same page being served up with two different URL versions. I suggest trying to make sure the number of errors listed in Webmaster Tools account under duplicate titles is as close to zero as possible.