Google bot visits and site languages - SEO

My site serves multiple languages from the same page: if you select your language, the same file shows you the content in that language. Based on your browser language, my site loads the content in that language, and in the HTML tag I add (for the Spanish version, for example):
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="es" xml:lang="es">
so that Googlebot can detect the language. But I have a problem. For example, on Google.es (Google Spain), Googlebot is indexing the English content; I think that is because Googlebot's browser or configuration is in English.
I want to know how to tell Googlebot that in Spain it should index the Spanish content, in Brazil the Portuguese content, and so on.
Thanks!

You need a separate URL for each language version.
For users: you should offer a language switcher (a link to every translation on each page), using the hreflang attribute on those links.
For bots: you should link the translations of each page together with <link rel="alternate" href="…" hreflang="es" /> (and so on).
If your site is example.com, and (nearly) all pages are translated, you could use the following URL design:
example.com/en/… (for English)
example.com/es/… (for Spanish)
…
If someone visits example.com/, you could redirect to the language version configured in their browser (but if someone visits a specific language version, don't redirect to another language!).
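Tying this together, the head of each translated page could carry a reciprocal set of alternate links. This is a sketch assuming the example.com URL design above; the page paths and the set of language codes are placeholders:

```html
<!-- Every translation of a page lists all versions, including itself -->
<link rel="alternate" href="https://example.com/en/page" hreflang="en" />
<link rel="alternate" href="https://example.com/es/page" hreflang="es" />
<link rel="alternate" href="https://example.com/pt/page" hreflang="pt" />
<!-- Optional: where to send users whose language has no translation -->
<link rel="alternate" href="https://example.com/en/page" hreflang="x-default" />
```

Note that each language version must link back to all the others; if the annotations are not reciprocal, Google may ignore them.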

Related

How to optimize my site for Google if I host completely English content on a .de URL?

In a hosting package I have a free URL, but it specifically has a .de (German) ending. I now want to build a website under this URL with completely English (US) content. Only one language: English.
Is there a way to avoid ranking damage because of the "wrong" URL ending?
What can I do in my web page headers?
Is:
<meta name="Content-Language" content="EN-US">
sufficient, or can I do anything else?
Thanks for the help!
Why are you using a German domain? In my opinion, you should use another domain; it will help you for SEO purposes. A German site obviously has German keywords, and you should have those words in your URL if you want an optimized site. If you can, the best thing is to use a .com.

Company name search yields many related results but no home page

I'm trying to resolve an "issue": if I google the keyword that is my website's domain (just the company name), I get a lot of results related to the company, but not the home page... Here is the output I see in the first 10 pages (I didn't look further):
- Crunchbase page
- Linkedin page
- Facebook page
- Our zendesk help center page
- Some external blog references on VentureBeat, pocketgamer, Forbes, etc
The site is accessible with or without the www. prefix, and over both http and https. It is a WordPress-powered website.
Here is the Google URL of the query
I then tried googling 'site:trophit.com', per Google's recommendation for when your website doesn't show up in searches. Interestingly enough, I get results for some posts on my website, but NOT PAGES. Here is that query as well
I browsed some of the pages that are missing from the search results and found that they contain the following metadata:
<meta name="robots" content="noindex,nofollow"/>
I checked the WordPress → Settings → Reading section; "Discourage search engines from indexing this site" is unchecked. I am unaware of any other similar settings in WP.
Any idea why all pages have 'noindex'?

Why is my old URL still active?

I have an ecommerce site with hundreds of products. I recently changed my permalinks and their base. Using WordPress and the WooCommerce plugin, I removed /shop/%product-category% from the URL. However, my old URLs are still active. Check out the following example:
greenenvysupply.com/shop/accessories/gro1-1-3mp-usb-led-digital-microscope-10x-300x/
greenenvysupply.com/gro1-1-3mp-usb-led-digital-microscope-10x-300x/
The first URL is old. Why does it still work? Shouldn't I get a 404 page?
Here is code from page source related to the canonical:
href="https://www.greenenvysupply.com/shop/feed/" />
<link rel='canonical' href='https://www.greenenvysupply.com/gro1-1-3mp-usb-led-digital-microscope-10x-300x/' />
<meta name="description" content="The 1.3 Mega-Pixel USB LED Digital Microscope is great for identifying pests and diseases on your plants so you can accurately resolve the problem."/>
<link rel="canonical" href="https://www.greenenvysupply.com/gro1-1-3mp-usb-led-digital-microscope-10x-300x/" />
Because the old URL is still active and not redirecting, my entire website is being seen as having duplicate content. Google's crawlers are not being redirected. Why is the URL with /shop/ in it still active even though I changed the permalink? There has got to be an easy fix for this.
A canonical URL or other metadata in your response is not the same as a redirect. To accomplish a redirect, your server needs to return a 3xx status code (typically a 301 or 308 for a permanent move, as you have here, or a 302 or 307 for a temporary move) along with a "Location" header that indicates the URL to redirect to. How exactly you make your server do this depends on the type of server or server framework you happen to be using for your website.
How to accomplish a redirect is somewhat independent of your implicit SEO question about whether to prefer a redirect over a canonical URL, which I'm afraid I cannot answer. Regardless of the approach you use, though, you should be aware that search engines, Google or otherwise, may not reflect changes to your website immediately, so don't panic if the search results don't change right away.
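For a WordPress site on Apache, for instance, the old /shop/ URLs could be 301-redirected with a rewrite rule in .htaccess. This is only a sketch: the pattern is an assumption based on the two example URLs above, and a WordPress redirect plugin can achieve the same thing without editing the file by hand.

```apache
# Redirect /shop/<category>/<product>/ permanently to /<product>/
RewriteEngine On
RewriteRule ^shop/[^/]+/([^/]+)/?$ /$1/ [R=301,L]
```

Any such rule should be placed before the standard WordPress rewrite block so it is evaluated first.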

How to prevent Google from crawling UserDir URLs (but not the real domain)?

We have clients who build their site on a UserDir URL before their real domain goes live. The UserDir URL is always in the format:
http://1.2.3.4/~johndoe
Sometimes, Google crawls these UserDir URLs and the temporary site will show up in results even after the site is live on http://johndoe.com
So, once a client is live on http://johndoe.com, how can I prevent Google from crawling the UserDir address?
(of course, I need Google to crawl the real domain because SEO is important to our clients)
I use the canonical tag for this purpose. If you put the canonical tag in the index.html file like so:
<link rel="canonical" href="http://johndoe.com/" />
then when Googlebot finds the page at http://1.2.3.4/~johndoe, it will know that it is a duplicate of http://johndoe.com/, and Google will index the correct one. Googlebot will see the same tag when it crawls the real site and will have no problem with the self-referential canonical.
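If editing every client page is impractical, a related sketch is to send the canonical as an HTTP Link header, but only when the site is reached via the UserDir host. This assumes Apache with mod_setenvif and mod_headers enabled; the IP and domain are the placeholders from the question:

```apache
# Flag requests that arrive via the bare-IP UserDir host
SetEnvIf Host ^1\.2\.3\.4$ USERDIR_HOST
# On those requests, point search engines at the real domain
Header set Link "<http://johndoe.com/>; rel=\"canonical\"" env=USERDIR_HOST
```

Google honors rel="canonical" delivered in a Link header just as it does the in-page tag, so this keeps the live site untouched.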

How can I get a site's favicon using Selenium?

I need to get a website's favicon.
How can I do that?
You won't be able to get the favicon with Selenium; you would have to use another program to grab it. The only way you would be able to get it is if your website renders the favicon as a link, such as
<link rel="shortcut icon"
href="http://example.com/myicon.ico" />
However, typically websites just store favicon.ico in the root directory, and on page request the browser retrieves it and drops it in the address bar, tab, or wherever favicons are used. If this is how your favicon is served, then there will be no markup for Selenium to search for.
Also, while the above code does work, it has somewhat buggy support in IE7.
You don't need Selenium.
Just request the site's home page and use an HTML parser to find a <link rel="shortcut icon" href="..."> tag.
If you don't find any such tag, try /favicon.ico.
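That answer can be sketched in plain Python using only the standard library. This is a minimal sketch: the parsing is deliberately simple (it takes the first icon-ish <link> it finds), and the helper names and example URL are made up for illustration:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect href values of <link> tags whose rel mentions "icon"."""

    def __init__(self):
        super().__init__()
        self.icon_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        # Matches "icon", "shortcut icon", "apple-touch-icon", etc.
        if "icon" in rel and attrs.get("href"):
            self.icon_hrefs.append(attrs["href"])


def find_favicon(html, base_url):
    """Return the favicon URL declared in html, or the /favicon.ico fallback."""
    parser = IconLinkParser()
    parser.feed(html)
    if parser.icon_hrefs:
        return urljoin(base_url, parser.icon_hrefs[0])
    return urljoin(base_url, "/favicon.ico")
```

Fetching the home page body (e.g. with urllib.request.urlopen) and passing it to find_favicon covers both cases described above: the explicit <link> tag and the root-directory convention.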
Here is a slightly crazy but working solution:
- Get the favicon image opened in a web page (and hence reachable with Selenium) with the help of http://www.google.com/s2/favicons. There are other services that provide similar functionality; see: Get website's favicon with JS.
- Use the needle package to compare the resulting favicon with a pre-saved one.
Needle is a tool for testing your CSS with Selenium and nose.
It checks that CSS renders correctly by taking screenshots of portions
of a website and comparing them against known good screenshots. It
also provides tools for testing calculated CSS values and the position
of HTML elements.
In other words, we are going to compare favicon images.
Example implementation (Python):
from needle.cases import NeedleTestCase

class FavIconTestCase(NeedleTestCase):
    def test_so(self):
        self.driver.get('http://www.google.com/s2/favicons?domain=www.stackoverflow.com')
        self.assertScreenshot('img', 'so-favicon')