Duplicate page errors with subdomains that have the same content? - SEO

A website I'm working on has subdomains with duplicate content on them.
Example:
florida.example.com/info
california.example.com/info
All of them have the same content, and Moz and Webmaster Tools are throwing crawl errors.
This is a WordPress site. Any ideas on how to stop the crawl errors?
Thanks!
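One common approach to location subdomains that share a page like this (a suggestion, not something given in the thread) is a rel="canonical" link that points every copy at a single preferred URL, for example:

    <!-- Placed in the <head> of california.example.com/info (and any other copy),
         assuming florida.example.com is chosen as the preferred version -->
    <link rel="canonical" href="https://florida.example.com/info">

On WordPress this is usually handled by an SEO plugin, or the extra subdomains can 301-redirect to the preferred one if they don't need to rank separately.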

Related

Google soft 404 error on index page that is working fine

A friend of mine has been having trouble getting her site indexed by Google and asked me to have a look, but this isn't something I really know much about, so I was hoping for some assistance.
Looking at her Search Console, the Google crawl report shows a soft 404 error on the index page. I have marked this as fixed a few times, because the site looks fine to me, but it keeps coming back.
If I fetch the site as Google it seems to work fine, although it shows the mobile version instead of the desktop one.
It also keeps reporting a recurring 404 for the page http://www.smeyan.com/new-page, which doesn't exist anywhere I can see, including the server files and sitemaps.
Here is what I know about this site:
It used to be a Wix site and was moved to a HostGator shared server 2-3 months ago.
It uses JavaScript/jQuery .load() to pull page content in from outside the index.html template.
It has two sitemaps, one for the URLs and one for both URLs and images:
http://www.smeyan.com/sitemap_url.xml http://www.smeyan.com/sitemap.xml
It has been about two months since it was submitted for indexing, and Google has not indexed any of the content; searching for site:www.smeyan.com still shows some old pages from the Wix server, although Search Console says 172 images are indexed.
It has www. set as the preferred domain in Search Console.
Has anyone experienced this, and does anyone have a direction for a fix?
How long a max-age was set in this site's Cache-Control header? If it is long, you should use Google's Removals tool for the obsolete snippets and cached copies. I simulated a Google visit to your webpage: the 404 return code is correct and the headers are correct. So report the "not found" pages via Google Removals, request a Googlebot visit, and then keep calm and wait for a reaction.
BTW: for permanently removed content, return 410 Gone for Google, or report it via Removals.
https://support.google.com/webmasters/answer/1663419?hl=en
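As an illustration of the 410 Gone suggestion (the path below is a placeholder, not one from the thread), an Apache .htaccess rule for a permanently removed page could look like this:

    # Hypothetical example: tell crawlers this page is gone for good (410), not just missing (404)
    Redirect gone /old-removed-page.html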
The only download error that I saw while using Chrome's Inspect function pertains to a SCRIPT tag with a Facebook URL as the source (src) file.
This is the error as reported by Inspect.
This is the SCRIPT tag that caused the error.
I am not sure that this is the cause of the recurring 404 error, but it is an issue that needs attention on this website.
I checked your site with Tor Browser, which has scripts DISABLED. You should also provide the content of your site inside a <noscript> tag. It doesn't have to be beautiful, but it should be visible to bots: <a href...></a>, <img/> etc. and... TEXT. Without it, the site is NOT OPTIMIZED for search bots. Read about SEO. Content listed in the sitemap may never be indexed if it is never linked.
Your webpage probably also doesn't meet the requirements for screen readers (for blind people).
Note: the image with the "SMEYAN" caption is visible on the webpage and is indexed.
The second image on the webpage (in the source), <img class="gallery-full-image" src="./galleries/home_gallery/smeyan_home-1.jpg" />, is also indexed.
The menu, however, doesn't work without scripts; I would have thought that step would be implemented properly.
Please use the <noscript> element and implement a version for blind people (without scripts, with alt attributes on images) and for no-script browsers. You can test it by disabling scripts, or via the NoScript extension for Firefox.
BTW, you should build with HTML and CSS (including animations) and use JS only where it is actually needed, or fall back to the <noscript> method.
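A minimal sketch of the <noscript> fallback being suggested (the image path is the one quoted above; the link targets are placeholders):

    <noscript>
      <!-- Plain HTML fallback so crawlers and script-less browsers still see content -->
      <h1>SMEYAN</h1>
      <img class="gallery-full-image"
           src="./galleries/home_gallery/smeyan_home-1.jpg"
           alt="Smeyan home gallery image" />
      <nav>
        <a href="/gallery.html">Gallery</a>
        <a href="/about.html">About</a>
      </nav>
    </noscript>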
Googlebot currently uses a web rendering service (WRS) that is based on the old Chrome 41 (M41), so it may fail where modern browsers succeed.
To learn how Googlebot works, read this.
Add this code to the page to see the real error.
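As a rough sketch of that idea (hypothetical code, not the snippet the answer originally referred to), one could log uncaught JavaScript errors onto the page itself so failures in an old renderer become visible:

    <script>
      // Hypothetical error logger: append any uncaught JS error to the page
      // so it shows up even where DevTools aren't available (e.g. in a render test).
      window.addEventListener('error', function (event) {
        var box = document.createElement('pre');
        box.textContent = 'JS error: ' + event.message +
          ' (' + event.filename + ':' + event.lineno + ')';
        document.body.appendChild(box);
      });
    </script>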
You can also see the error using the live URL Inspection tool in Google Search Console; it will show under the "More info" tab.
Note: if the bot gets a 301 code, or if the page has too little significant content, it will return a soft 404 error and won't render a preview or show any other error.

we were unable to access your site's robots.txt file

I verified my site using Google Webmaster Tools. I made my website in WordPress and I also added a robots.txt file. Now Google is showing a green tick mark on DNS and Server Connectivity, but a yellow warning mark on the robots.txt fetch.
My robots.txt file looks like this:
robots file
Also, when I run the robots.txt test in Webmaster Tools it gives an "allowed" result, yet my site still doesn't show up in Google searches.
When I first submitted my site in Webmaster Tools it was not showing this error, but now it is.
Please help me solve this problem.
If you made your website with WordPress, it will automatically generate a robots.txt file for you.
Why did you not use that one?
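For reference, the virtual robots.txt that WordPress serves on its own typically looks something like this (the exact contents vary with the WordPress version and settings):

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php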

Non-secure items on a secure page

I have an e-commerce site with an SSL certificate installed. I have made sure all links are https:// to avoid the browser error saying there are non-secure items on the page.
However, I have a news and press feed in my footer which links to other websites that are not secure and do not have https:// available. Is there any way to fix this, or are there any tricks to making it work?
Links to HTTP URLs should not cause browser errors about non-secure items on a page. Check your browser console (F12) to see which specific items are triggering the non-secure warnings.
Your page is loading three images from https://192.99.37.125/, which is the wrong URL because it does not match the certificate in use. That is what is causing the error.
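To illustrate the distinction (all hostnames below except the IP address are placeholders):

    <!-- A plain link to an HTTP site does not trigger mixed-content or certificate warnings -->
    <a href="http://news-partner.example.com/latest">Press feed</a>

    <!-- Loading a resource from a host that does not match its certificate causes the reported error -->
    <img src="https://192.99.37.125/banner.jpg" alt="Banner" />

    <!-- Fix: serve the resource from the hostname the certificate was issued for -->
    <img src="https://www.your-shop.example.com/banner.jpg" alt="Banner" />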

Has Google changed crawlers in a way that could lead to the 404 growth?

Since yesterday I have been seeing a growing number of 404 errors on our website. It is very strange, because we don't have the pages that are being reported as missing, and we didn't release any code changes that day.
Google Webmaster Tools is reporting these errors, but when I look at the pages that supposedly link to the missing URLs, there are no such links. Could this be a Google crawler issue?
404 URL:
http://www.justanswer.co.uk/boat/home-improvement/homework/writing
Linked from:
http://www.justanswer.co.uk/boat/home-improvement/homework
http://www.justanswer.co.uk/boat/home-improvement/hvac
It seems that you have CORS issues doing cross-domain JavaScript.
https://www.facebook.com/connect/ping?client_id=172525162793917&domain=www.justanswer.co.uk&origin=1&redirect_uri=http%3A%2F%2Fstaticxx.facebook.com%2Fconnect%2Fxd_arbiter.php%3Fversion%3D42%23cb%3Df316e5bca883b5%26domain%3Dwww.justanswer.co.uk%26origin%3Dhttp%253A%252F%252Fwww.justanswer.co.uk%252Ff50e0366c05c14%26relation%3Dparent&response_type=token%2Csigned_request%2Ccode&sdk=joey
is returning the following error:
Given URL is not allowed by the Application configuration: One or more of the given URLs is not allowed by the App's settings. It must match the Website URL or Canvas URL, or the domain must be a subdomain of one of the App's domains.

Website redirecting improperly

I have the following issue: I'm building a website and it's deployed inside a subdirectory of my server; the website is a referral site for sales services. On the homepage I have two links that reference the two main categories of the products sold on it. The first link's href is /es/sports/; if I click on it I get a 404 error, but if I copy and paste the URL into the browser the page is shown correctly.
Note: when I click the link it is redirected to /sports/ instead of /es/sports/ as it should be.
Maybe some .htaccess configuration in the root of the public folder?
It's a Laravel-powered website.
The website URL is the following: http://entrenamiento.com/es/ and the links are the ones on the left sidebar.
First of all, thanks for all the comments and help. The situation was as follows: the parent website's .htaccess file had no support for trailing slashes after the URL. Once the trailing-slash rule was enabled, the website worked as expected. Thanks again.
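The actual rule isn't shown in the thread; a common .htaccess sketch that appends a trailing slash to extension-less URLs (assuming mod_rewrite is enabled and the file sits in the document root) looks roughly like this:

    # Hypothetical trailing-slash rule; the thread's real configuration is not shown.
    RewriteEngine On

    # If the request is not an existing file and doesn't already end in a slash,
    # redirect it to the same path with a trailing slash appended.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !/$
    RewriteRule ^(.*)$ /$1/ [L,R=301]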