Cross-domain content duplication and SEO

We have a family of sites (about games) with shared content. Each site has its own top level domain, and most content has a "home" domain, but all content is accessible on each domain. This allows a user who is logged in on, for example, the board game site, to page through their new subscribed content and see pages about RPGs or video games (content that is based in another of our domains) without having to jump to another domain.
I am concerned that this duplicate content will be used to penalize us in search engine rankings. My understanding is that canonical links do not work across domains. Google recommends using 301 redirects to force all users to a single domain for a particular page, but we do not want to do that because we don't want to force users off their preferred domain. In addition, we have other content that genuinely belongs to multiple domains: lists that might include games from several of our domains, for example.
How can we continue to show our content in this way, without being penalized for having duplicate content across domains?

Have a read of this article: Google does support the cross-domain canonical tag, so just point it at the single source of truth.
http://searchengineland.com/google-supports-cross-domain-canonical-tag-32044
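As a rough illustration for the original setup (content served on every domain, but with a single "home" domain), here is a minimal sketch assuming a Python/Flask front end; the domain names and the HOME_DOMAINS lookup table are made up. Each page is rendered on whichever domain the visitor is browsing, but declares the home-domain URL as canonical.

```python
# Hedged sketch: emit a cross-domain canonical tag that points at each
# item's "home" domain. Domains and the lookup table are hypothetical.
from flask import Flask, request

app = Flask(__name__)

# Assumed mapping from content type to its home domain.
HOME_DOMAINS = {
    "boardgame": "https://www.example-boardgames.com",
    "rpg": "https://www.example-rpgs.com",
    "videogame": "https://www.example-videogames.com",
}

@app.route("/<content_type>/<slug>")
def show_item(content_type, slug):
    # Serve the page on whatever domain the visitor is using, but tell
    # search engines which URL is the single source of truth.
    home = HOME_DOMAINS.get(content_type, request.url_root.rstrip("/"))
    canonical = f"{home}/{content_type}/{slug}"
    return (
        "<html><head>"
        f'<link rel="canonical" href="{canonical}">'
        "</head><body>"
        f"<h1>{slug}</h1><p>Same content, available on every domain.</p>"
        "</body></html>"
    )
```

Google treats the cross-domain canonical as a strong hint rather than a directive, so the duplicate copies should stay substantially identical to the page they point at.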

Related

301 Redirect to site without .htaccess (myshopify.com) & SEO rank issue

History/Context: I have a site, 'www.oldexample.com', built in 1998 in HTML 4.01 Transitional on an Apache/cPanel server. Until last fall our main keywords got us into the top 10. After the mobile changes, Panda, and so on, we dropped to page 2 or 3 for all but one very specific keyword. The old site, 'www.oldexample.com', has many good backlinks and a long history in Google and all the main directories. I am now rebuilding a test site on 'mycompany.myshopify.com', as it addresses all the Google error issues on the old site. I have set up 'www.newexample.com' to redirect to the Shopify site, which is served under 'www.newexample.com'. The myshopify.com URL does not show up at all.
Question: If I were to do a cPanel 301 redirect of the whole of 'oldexample.com' to 'newexample.com', would I still benefit from the many links and history of the old site?
When you say that the shopify URL doesn't show at all, do you mean it's not showing when you search for those keywords, or it's not indexed at all? If it's the latter, prompt Google to index it using Google Search Console. If it's the former, there are a number of things that could have affected this:
the authority of the new site - if you've just launched it, it naturally won't have the authority of the previous site and therefore is less likely to get visibility
you are correct that the backlinks would have played a major part in this. What you need to do is redirect the old domain to the one you want to appear in Google. For example, if you want to actually take people to newsite.shopify.com, redirect the old domain directly to that one. If you redirect the old domain to newdomain.com, which you then redirect to newsite.shopify.com, the result won't be the same: some link value is lost with each redirect hop (a short script for checking the redirect chain follows at the end of this answer). Ideally, you should also get in touch with as many third-party websites linking to your old domain as you can and ask them to update their links to point to newsite.shopify.com.
Even if you do that you might still not see those rankings because of various other factors. If you fancy posting the actual URLs and keywords in question, I can spare a few minutes to have a look.
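If it helps, you can check the redirect chain yourself with a short script; this assumes Python with the requests library installed, and the URL is a placeholder for your real old domain.

```python
# Inspect the redirect chain from the old domain to its final destination.
import requests

resp = requests.get("http://www.oldexample.com/", allow_redirects=True, timeout=10)

# resp.history holds every intermediate redirect response, in order.
for hop in resp.history:
    print(hop.status_code, hop.url, "->", hop.headers.get("Location"))
print("Final:", resp.status_code, resp.url)

# Ideally you want a single 301 hop from the old URL straight to the page
# you want indexed, not a chain through an intermediate domain.
```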

Is serving the same website to multiple domains bad for SEO?

I'm working on a website which currently has two different domains pointing at it:
example1.com
example2.com
I have read that serving identical content to multiple domains can harm rankings.
The website being served is largely the same with the exception of item listings (think of an e-commerce site) and a few other minor tweaks (title, description, keywords, etc). Depending on the domain used it will adapt to serve different items.
Does this resolve the issue of serving duplicate content across multiple domains, and thus avoid harming the rankings?
Or would I be better to 301 redirect to a single domain and go from there?
If both your URLs show the same product listings with the same styling, it can certainly affect your search results. Give the two websites a different look: change how products are displayed or vary the navigation menu, use slightly different images, and write different descriptions for your products.
If you run a website with the same content and design on two different domains, even with modified titles, descriptions, and keywords, it is bad SEO practice and your websites are likely to be filtered or penalized by search engines.
The best option would be to build a new design with original content for the second domain and optimize it separately. Otherwise, you can set up a 301 redirect pointing domain 2 at domain 1 (a rough sketch of that option follows below); this will neither harm nor help you much.
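For completeness, here is a minimal sketch of the 301 option, assuming (hypothetically) that both domains point at the same Python/Flask application; the host names are placeholders.

```python
# Hypothetical sketch: permanently redirect every request that arrives on
# the secondary domain to the same path on the primary domain.
from flask import Flask, redirect, request

app = Flask(__name__)

PRIMARY_HOST = "example1.com"  # placeholder for your preferred domain

@app.before_request
def force_primary_domain():
    if request.host != PRIMARY_HOST:
        target = f"https://{PRIMARY_HOST}{request.full_path.rstrip('?')}"
        return redirect(target, code=301)

@app.route("/")
def home():
    return "Served from the single canonical domain."
```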
I have also seen multiple domains carrying the same website, content, title, and description, and to my surprise the domain was ranking well. Crazy search engines!

SEO: Allowing crawler to index all pages when only few are visible at a time

I'm working on improving the site for the SEO purposes and hit an interesting issue. The site, among other things, includes a large directory of individual items (it doesn't really matter what these are). Each item has its own details page, which is accessed via
http://www.mysite.com/item.php?id=item_id
or
http://www.mysite.com/item.php/id/title
The directory is large, with about 100,000 items in it. Naturally, only a few items are listed on any given page. For example, the main site homepage links to about 5 or 6 items, some other page links to about a dozen different items, etc.
When real users visit the site, they can use the search form to find items by keyword or location, which produces a list matching their search criteria. However, when the Google crawler visits the site, for example, it won't attempt to put text into the keyword search field and submit the form. Thus, as far as the bot is concerned, after indexing the entire site it has covered only a few dozen items at best. Naturally, I want it to index each individual item separately. What are my options here?
One thing I considered is to check the user agent and IP ranges, and if the requester is a bot (as best I can tell), add a div to the end of the most relevant page with links to each individual item. Yes, this would be a huge page to load, and I'm not sure how Googlebot would react to it.
Any other things I can do? What are best practices here?
Thanks in advance.
One thing I considered is to check the user agent and IP ranges, and if the requester is a bot (as best I can tell), add a div to the end of the most relevant page with links to each individual item. Yes, this would be a huge page to load, and I'm not sure how Googlebot would react to it.
That would be a very bad thing to do. Serving up different content to the search engines specifically for their benefit is called cloaking and is a great way to get your site banned. Don't even consider it.
Whenever a webmaster is concerned about getting their pages indexed, an XML sitemap is an easy way to ensure the search engines are aware of your site's content. They're very easy to create and update, too, if your site is database driven. The XML file does not have to be static, so you can dynamically produce it whenever the search engines request it (Google, Yahoo, and Bing all support XML sitemaps). You can find out more about XML sitemaps at sitemaps.org.
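As a rough illustration of producing the sitemap dynamically, here is a sketch in Python; the SQLite database, table, and column names are assumptions, so adapt them to your own schema and URL format.

```python
# Sketch: build sitemap XML from the item database on demand.
import sqlite3
from xml.sax.saxutils import escape

def build_sitemap(db_path="items.db", base_url="http://www.mysite.com"):
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT id, title FROM items").fetchall()  # assumed schema
    conn.close()

    entries = []
    for item_id, title in rows:
        slug = "-".join(title.lower().split())
        loc = f"{base_url}/item.php/{item_id}/{slug}"
        entries.append(f"  <url><loc>{escape(loc)}</loc></url>")

    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>"
    )
```

Keep in mind that a single sitemap file is limited to 50,000 URLs, so with around 100,000 items you would split the output into multiple files and list them in a sitemap index file.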
If you want to make your content available to search engines and benefit from semantic markup (i.e. HTML), you should also make sure all of your content can be reached through hyperlinks (in other words, not only through form submissions or JavaScript). The reason for this is twofold:
The anchor text in the links to your items will contain the keywords you want to rank well for. This is one of the more heavily weighted ranking factors.
Links count as "votes", especially to Google. Links from external websites, especially related websites, are what you'll hear people recommend the most and for good reason. They're valuable to have. But internal links carry weight, too, and can be a great way to prop up your internal item pages.
(Bonus) Google's PageRank used to be a huge part of its ranking algorithm but plays only a small part now. It still has value, though, and links "pass" PageRank to each page they point to, increasing that page's PageRank. When you have as many pages as you do, that's a lot of potential PageRank to pass around; with a well-built site, internal linking alone can noticeably lift your home page.
Having an HTML sitemap that links to all of your products is a great way to ensure that search engines, and users, can easily find all of them. It is also recommended that you structure your site so that the more important pages are closer to the root of the website (the home page), branching out to sub-pages (categories) and then to specific items. This gives search engines an idea of which pages are important and helps them organize them (which helps them rank them). It also helps them follow those links from top to bottom and find all of your content.
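If you go the HTML sitemap route, one simple way to cover that many items is to paginate it. Here is a rough sketch with made-up item data, URL format, and file names; a real version would read from your database, and the generated pages would be linked from the footer or a main sitemap page.

```python
# Sketch: write paginated HTML sitemap pages so every item is reachable
# through plain links. The item list is a small stand-in for the database.
ITEMS = [(i, f"Item {i}") for i in range(1, 1001)]
PER_PAGE = 200

for page, start in enumerate(range(0, len(ITEMS), PER_PAGE), start=1):
    chunk = ITEMS[start:start + PER_PAGE]
    links = "\n".join(
        f'<li><a href="/item.php/{item_id}/{title.lower().replace(" ", "-")}">{title}</a></li>'
        for item_id, title in chunk
    )
    with open(f"sitemap-{page}.html", "w", encoding="utf-8") as f:
        f.write(
            f"<html><body><h1>All items, page {page}</h1>\n<ul>\n{links}\n</ul>\n</body></html>"
        )
```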
Each item has its own details page, which is accessed via
http://www.mysite.com/item.php?id=item_id
or
http://www.mysite.com/item.php/id/title
This is also bad for SEO. When you can pull up the same page using two different URLs you have duplicate content on your website. Google is on a crusade to increase the quality of their index and they consider duplicate content to be low quality. Their infamous Panda Algorithm is partially out to find and penalize sites with low quality content. Considering how many products you have it is only a matter of time before you are penalized for this. Fortunately the solution is easy. You just need to specify a canonical URL for your product pages. I recommend the second format as it is more search engine friendly.
Read my answer to an SEO question at the Pro Webmaster's site for even more information on SEO.
I would suggest, for starters, having an XML sitemap. Generate a list of all your pages and submit it to Google via Webmaster Tools. It wouldn't hurt to have a "friendly" sitemap either, linked to from the front page, which lists all these pages, preferably by category too.
If you're concerned with SEO, then having links to your pages is hugely important. Google could see your page, think "wow, awesome!", and give you lots of authority; this authority (some like to call it "link juice") is then passed down to the pages linked from it. You ought to build a hierarchy of pages, with the more important ones closer to the top, and make it wide rather than deep.
Also, showing different content to the Google crawler than to a "normal" visitor can be harmful in some cases, if Google thinks you're trying to con it.
Sorry, a little Google bias here, but the other engines are similar.

2 sites each in a different country with 1 set of content (cloaking)

I have a question re: cloaking.
I have a friend who has a business in Canada and the UK.
Currently the .ca site is hosted on GoDaddy. The .co.uk domain is registered with Domainmonster (with a UK IP address) and uses a cloaked/framed redirect to the .ca site.
As a result (my assumption), the .ca site is indexed fine by Google, while the .co.uk site is not.
The content is generic for both sites. How do I point the .co.uk site directly to the content independently (preferably without duplicating the content hosting in the UK), so that, for instance, if the .ca domain were taken away altogether, the .co.uk domain would remain an entity in itself from Google's point of view?
Does Google index a generic set of content and then associate different country domains with that content?
I hope I have explained this ok.
Thanks,
Greg
What exactly do you mean by a cloaked/framed redirect? Implementations of this vary, and how it is implemented determines how search engines end up seeing your site.
The best way to see how Google has indexed your site is to run a site:yourdomain.co.uk query and see what results are returned (check the cached versions, etc.). Also make sure to create a Google Webmaster Tools account and look through the information there.
If only one of your sites is indexed, I suggest first creating two different accounts in Webmaster Tools, specifying different geo-targeting for each, and removing the redirect, so that each site returns a 200 response code and doesn't do any kind of cloaking or redirecting.
If one of the sites still fails to get indexed, put a link to it from the other one and do a bit of simple link building (submit it to DMOZ or the Yahoo Directory, for instance). Also make sure you submit a separate sitemap for each site (again via Google Webmaster Tools).
Hope this answers your question.

How do I convince the Googlebot that two formerly aliased sites are now separate?

This will require a little setup. Trust me that this is for a good cause.
The Background
A friend of mine has run a non-profit public interest website for two years. The site is designed to counteract misinformation about a certain public person. Of course, over the last two years those of us who support what he is doing have relentlessly linked to the site in order to boost it in Google so that it appears very highly when you search for this public person's name. (In fact it is the #2 result, right below the public person's own site). He does not have the support of this public person, but what he is doing is in the public interest and good.
The friend had a stroke recently. Coincidentally, the domain name came up for renewal right when he was in the hospital and his wife missed the email about it. A domain squatter snapped up the domain, and put up content diametrically opposed to his intent. This squatter is now benefitting from his Google placement and page rank.
Fortunately there were other domains he owned which were aliased to point to this domain, i.e. they used a DNS mapping or HTTP 301 redirect (I'm not sure which) to send people to the right site. We reconfigured one of the alias domains to point directly to the original content.
We have publicized this new name for the site and the community has now created thousands of links to the new domain, and is fixing all the old links. We can see from the cache that Google has in fact crawled the original site at the new address, and has re-crawled the imposter site.
The Problem
Even though Google has crawled both sites, you can't get the site to appear in relevant searches under the new URL!
It appears to me that Google remembers the old redirect between the two names (probably because someone linked to the new domain back when it was an alias). It is treating the two sites as if they are the same site in all results. The results for the site name, and using the "link:" operator to find sites that link to this site, are entirely consistent with Google being convinced they are the same site.
Keep in mind that we do not have control of the content of the old domain, and we do not have the cooperation of the person that these sites relate to.
How can we convince the Googlebot that domain "a" and domain "b" are now two different sites and should be treated as such in results?
EDIT: The forward was probably DNS-based, not HTTP-based.
Google will detect the decrease in links to the old domain and that will hurt it.
Include some new interesting content on the new domain. This will encourage Google to crawl this domain.
The 301 redirects will be forgotten, in time. Perhaps several months. Note that they redirected one set of URLs to another set, not from one domain to another. Get some links to some new pages within the site, not just the homepage, as these URLs will not be in the old redirected set.
Set up Google Webmaster Tools and submit an XML sitemap. Thoroughly check everything in Webmaster Tools about once per week.
Good luck.
Time heals all wounds...
Losing control of the domain is a big blow, and it will take time to recover. It sounds like you're following all the correct procedures (getting people to change links, using 301s, etc.)
Has the content of the original site changed since being put up again? If not, you should probably make some changes. If Google re-crawls the page and finds it substantially identical to the one previously indexed, it might consider it a copy and that's why it's using the original URL.
Also, I believe that Google has a resolution process for just such situations. I'm not sure what the form to fill out is or who to contact, but surely some other SO citizens could help.
Good luck!