What are the most used Accept-Language values in the HTTP header? - http-headers

I am making a website and want to use the Accept-Language HTTP header to help visitors find their language. However, I am having a hard time finding statistics about the use of Accept-Language.
Will most visitors have something set as their Accept-Language? In some places I read things like "most modern browsers support Accept-Language", but does anyone have an overview of which specific browser versions support it? And will the browser language usually be set as Accept-Language by default if users don't actively change their own Accept-Language settings? I guess most people don't change these settings, but that doesn't mean that Accept-Language is left blank?
Does anyone have statistics for the most used language codes set in Accept-Language? I can build a mapping system to map them to my site's languages, but I am also having trouble finding good statistics about the most used codes. An overview would help a lot in making this work better!
Thanks in advance!

Browsers send an Accept-Language header field out of the box. By default, they request the same language that is used for the browser's user interface.

As Oswald said, by default browsers set this to the language used by the browser UI. So no, the default setting isn't blank, it's something like "en-US,en".
The only figures that I have found are on https://panopticlick.eff.org/results?#fingerprintTable . That page tests for the amount of information contained in HTTP requests. On the test result page, after clicking on "Show full results for fingerprinting", for various pieces of information it shows their frequency in column "one in x browsers have this value".
In the row "HTTP_ACCEPT Headers" it shows the frequency of a combination of some Accept header values given by the browser. For example, it says that one in 5.25 browsers send the value "text/html, */*; q=0.01 gzip, deflate, br en-US,en;q=0.5". Unfortunately, that value seems to be the concatenation of the values of the headers "Accept" (somewhat stripped), "Accept-Encoding" (the "br" is its Brotli token), and "Accept-Language".
As I wrote, when I experimented with Panopticlick, it showed "one in 5.25 requests" for "en-US,en". This value is one of the more common ones, if not the most common one. One in 295.2 requests had just "en-US", one in 547.18 requests had just "en", and one in 1076.94 requests had "en,en-US" (which should have the same effect as "en", so it does not really make sense to use it).
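To compare these figures more easily, each "one in x" ratio can be converted into an approximate share. This is only a small sketch using the numbers quoted above; the names `one_in_x` and `share_percent` are made up for illustration.

```python
# "One in x" figures quoted above, keyed by the Accept-Language value.
one_in_x = {
    "en-US,en": 5.25,
    "en-US": 295.2,
    "en": 547.18,
    "en,en-US": 1076.94,
}


def share_percent(x):
    """Convert a "one in x" ratio into a percentage share."""
    return 100.0 / x


for value, x in one_in_x.items():
    print(f"{value:10s} {share_percent(x):6.2f} %")
```

By this arithmetic "en-US,en" comes out at roughly 19 % of the sample, while each of the other three values stays well below 1 %. Keep in mind that these are shares of the Panopticlick sample only, not of all web traffic.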
By varying only the configuration of accepted languages, one can infer the frequency of the languages as seen by Panopticlick. A more direct way would of course be to simply write to them and ask for a table. I'm sure that the Panopticlick sample is not representative of the entire internet, but at least it's a start.

Related

How do I specify the language of my content when I POST to a web api? [duplicate]

I have seen the HTTP headers Content-Language and Accept-Language. Could someone explain what these are for and the difference between them? I have a multilingual site and am wondering whether I should set both to the language the user has currently selected on the site.
Content-Language, an entity header, is used to describe the language(s) intended for the audience, so that it allows a user to differentiate according to the users' own preferred language. Entity headers are used in both HTTP requests and responses. [1]
Accept-Language, a request HTTP header, advertises which languages the client is able to understand, and which locale variant is preferred. [2] There can be multiple languages, each with an optional weight or 'quality' value. For example:
Accept-Language: da, en-GB;q=0.8, en;q=0.7
(The default weight is 1, so this is equivalent to da;q=1, en-GB;q=0.8, en;q=0.7).
You will have to parse the values and weights to see whether an appropriate translation is available, and then serve the translation for the highest-weighted language that you support.
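The parsing step could look roughly like the following sketch. It is not a complete implementation of the matching rules in the HTTP specification: the function names are made up, and falling back from a regional tag such as "en-GB" to "en" is a simplifying assumption.

```python
def parse_accept_language(header):
    """Return (language_tag, weight) pairs sorted by descending weight."""
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        weight = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    weight = float(value)
                except ValueError:
                    weight = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((tag, weight, position))
    # Higher weight first; ties keep the order in which the client listed them.
    entries.sort(key=lambda e: (-e[1], e[2]))
    return [(tag, weight) for tag, weight, _ in entries]


def best_language(header, available, default):
    """Pick the most preferred language for which a translation exists."""
    available = [a.lower() for a in available]
    for tag, weight in parse_accept_language(header):
        if weight <= 0:
            continue  # q=0 means the client explicitly rejects this language
        if tag == "*":
            return default
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # assumption: "en-GB" may fall back to "en"
        if primary in available:
            return primary
    return default
```

With the header from the example above, `best_language("da, en-GB;q=0.8, en;q=0.7", ["en", "de"], "de")` returns `"en"`: Danish is not available, and "en-GB" falls back to the available "en".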
It is recommended that you give users an alternative, such as a value stored in a cookie, to force a certain language for your site. This is because some users may want to see your site in a certain language without changing their browser's language preferences.
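A minimal sketch of such an override, assuming a hypothetical cookie named `site_lang` and a hypothetical list of supported languages; `negotiated` stands for whatever language was derived from Accept-Language:

```python
from http.cookies import SimpleCookie

SUPPORTED = ("en", "fr", "de")   # hypothetical list of site languages
COOKIE_NAME = "site_lang"        # hypothetical cookie name


def language_from_cookie(cookie_header):
    """Return the language forced via cookie, or None if absent or unsupported."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in SUPPORTED:
        return morsel.value
    return None


def choose_language(cookie_header, negotiated):
    """Prefer the user's explicit choice over the Accept-Language result."""
    return language_from_cookie(cookie_header) or negotiated
```

The cookie value is checked against the supported list so that a stale or tampered cookie simply falls through to the negotiated language.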
Content-Language is the language of the page you're serving.
Accept-Language is a list of languages the client PREFERS to accept.
Content-Language describes the language that a particular piece of content is intended for. Accept-Language is the list of languages that a user agent wants content in. The best way to think of this is that Content-Language describes content and Accept-Language conveys a preference.
The Content-Language entity-header field describes the natural language(s) of the intended audience for the enclosed entity. Note that this might not be equivalent to all the languages used within the entity-body.
The Accept-Language request-header field restricts the set of natural languages that are preferred as a response to the request
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
The Content-Language entity header is used to describe the language(s) intended for the audience, so that it allows a user to differentiate according to the users' own preferred language.
Header type: Entity header
Forbidden header name: no
CORS-safelisted response-header: yes
CORS-safelisted request-header: yes
— MDN Web Reference - HTTP Headers - Content-Language
The Accept-Language request HTTP header advertises which languages the client is able to understand, and which locale variant is preferred. (By languages, we mean natural languages, such as English, and not programming languages.)
Header type: Request header
Forbidden header name: no
CORS-safelisted request-header: yes
— MDN Web Reference - HTTP Headers - Accept-Language
The Accept-Language request HTTP header indicates the natural language and locale that the client prefers. The server uses content negotiation to select one of the proposals and informs the client of the choice with the Content-Language response header. Browsers set required values for this header according to their active user interface language.
In Accept-Language, the client tells the server in which language it wants to get the response.
Content-Language contains information about the language (culture) of the response.
The service tries to produce the response in one of the languages given in the Accept-Language header. However, the server may not be able to answer in any of them. In that case, the response can be in the default language. The language actually used is reported in the Content-Language header.
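The behaviour described above can be sketched as a minimal WSGI application. The `PAGES` table and `DEFAULT_LANGUAGE` are hypothetical, and the language selection is deliberately simplified (it ignores q weights and just walks the header in order).

```python
PAGES = {  # hypothetical translations held by the service
    "en": "Hello",
    "de": "Hallo",
}
DEFAULT_LANGUAGE = "en"


def pick_language(accept_language):
    """Return the first requested language that has a translation.

    Simplified on purpose: q weights are ignored and header order is used.
    """
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]
        if tag in PAGES:
            return tag
        if primary in PAGES:
            return primary
    return DEFAULT_LANGUAGE


def app(environ, start_response):
    """Minimal WSGI application that reports the language it answered in."""
    language = pick_language(environ.get("HTTP_ACCEPT_LANGUAGE", ""))
    body = PAGES[language].encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Language", language),   # language actually used
        ("Vary", "Accept-Language"),      # response depends on this request header
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

A request with `Accept-Language: fr, de;q=0.8` is answered in German with `Content-Language: de`, while a request for a language the service does not have, such as `ja`, gets the default with `Content-Language: en`.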

How to add content-language for a single page in http header

My index page is tri-lingual... in this scenario, W3 informs us that the original 'ID solution' was dropped, without a replacement......
W3 does suggest the use of HTTP headers, but fails to explain how this is accomplished.
Can stackoverflow solve this problem?
Background
W3 suggests that this code is not good/should not be used:
<meta http-equiv="Content-Language" content="de, fr, en">
However, they then say that there is nothing to replace it:
One implication of HTML5 dropping the meta element for declaring
language is that there is now no obvious way to provide metadata about
the document inside the document itself.
That's a painful statement, but... they then go on to suggest that "content-language" should be specified in a HTTP header.
This information is associated with a particular page by settings on
the server or by server-side scripting.
Fantastic... they even show a typical example... great!
HTTP/1.1·200·OK
Date:·Sat,·23·Jul·2011·07:28:50·GMT
Server:·Apache/2
Content-Location:·qa-http-and-lang.en.php
Vary:·negotiate,accept-language,Accept-Encoding
TCN:·choice
P3P:·policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Connection:·close
Transfer-Encoding:·chunked
Content-Type:·text/html; charset=utf-8
Content-Language:·en
But where is this file... and why is this character "·" used?
Why not use comma separated en, fr, de ?
Rant (after hours of researching):
If website programmers are advised not to use in-doc programming, it would be better if we were told exactly how to edit the HTTP header for any given page.
Therefore the question is simple:
Using cPanel, or FileZilla (and perhaps Notepad++)... how do I modify the HTTP header for index.html to show that it contains English, French, and German?
Note: I am currently using the bad code PLUS 'lang' attributes, e.g.:
<li lang="fr">
I'm trying to do what is right, but after looking on 'HTTP header help-sites', I never once found a statement re:
Exact file location
Filename and extension
Can anybody help solve this mystery?
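In case you didn't manage to find this: HTTP headers are what you are after. Content-Language describes the language(s) you expect your target audience to use, and it can list several, for example "en, fr, de". HTTP headers are set on your web server; a directive placed at the site level applies to every page unless you scope it to a single file.
If you are using Apache, find the .htaccess file and add something along the lines of the following (this needs mod_headers to be enabled; to limit it to index.html, the line can be wrapped in a <Files "index.html"> ... </Files> section):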
Header set Content-Language "en"
If you are using IIS, then:
Navigate to your website in the IIS GUI.
Double-click 'HTTP Response Headers'.
Click 'Add...'.
Set the name to Content-Language and the value to the language you want to use, for example en for any kind of English; use commas to separate multiple languages.
Click OK.
I got most of my info from here:
https://www.w3.org/International/questions/qa-html-language-declarations#metadata
Here is a list of the subtags you can use:
http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
thickguru supplied the .htaccess solution above, many thanks, his answer is here:
Language not declared Ideally

How to detect whether a web page is multilingual

Given a URL, I need to detect whether the target is available only in one language, or many languages (just that - true/false, no need to detect which languages). Is there any smarter method than sending two requests with different Accept-Language fields?
I've thought about this for quite some time and searched for any useful info, to no avail. I assume there is no information on language in the URL itself. A perfect solution would use just one HTTP request, since needing multiple HTTP requests is what bothers me about the only solution I came up with. The detection feature is to be used in an AJAX call, so each additional HTTP request is a big hit on performance.
Any additional tips, e.g. how to choose which language to use in Accept-Language, are welcome.
Thanks!
You can specify more than one language in Accept-Language:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
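As a small illustration of that point, here is a sketch of a helper that formats several languages into one Accept-Language value. The function name and the step of 0.1 between weights are arbitrary choices for this example. Note that sending such a header does not by itself reveal whether a page is multilingual; that still depends on what the server reports back, for instance in Content-Language.

```python
def build_accept_language(languages):
    """Format languages, most preferred first, as an Accept-Language value."""
    parts = []
    for index, tag in enumerate(languages):
        # Arbitrary scheme: lower the weight by 0.1 per position, floor at 0.1.
        weight = max(1.0 - 0.1 * index, 0.1)
        # The first entry carries no q parameter, so it gets the default weight 1.
        parts.append(tag if index == 0 else f"{tag};q={weight:.1f}")
    return ", ".join(parts)
```

For example, `build_accept_language(["de", "fr", "en"])` gives `"de, fr;q=0.9, en;q=0.8"`.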

Is using Content-Language header appropriate to localize a side effect of an HTTP POST?

I'm designing a REST API where content in the form of HTML is being posted to an endpoint. I'm using the lang attribute in the HTML to specify the language of the document or sections thereof. That is working nicely.
However, the content can be posted to a 'default' pseudo-resource, whose user-visible name is automatically generated, and thus needs to be localized. I need a way to specify which language to use when creating this name on the fly as a side effect of a first POST to the default section. Unfortunately, I'm not able to derive my user's preferred language from their login profile.
Does it seem reasonable to use the Content-Language header to specify this? There could be a clash with the language(s) of the actual HTML content, and it is not strictly the language of the entity being POSTed.
Would it even make sense to treat the side effect as a type of 'response' and thus use Accept-Language instead?
Content-Language is subject to content negotiation, just as Content-Type is. Browsers automatically generate Accept-Language values from the user's settings, e.g.
Accept-Language: en-US,en;q=0.8,cs;q=0.6
so you will only get the user's language preference, and that's all.
You can also use Content-Language (note that multiple languages are supported, e.g. Content-Language: en, de) to denote the language(s) of the content.
I would discourage you from giving users the ability to affect final URLs, but I guess that you are doing it because of SEO, right? If so, common practice is to use something like mod_rewrite to strip the dynamically generated 'nice URL', e.g.
http://example.org/some really nice name to be indexed/232323
to
http://example.org/?id=232323
As for your question: there can probably only be a clash with browsers' built-in translators, as I'm not aware of any other component that uses Content-Language.
