How to detect whether a web page is multilingual - http-headers

Given a URL, I need to detect whether the target is available only in one language, or many languages (just that - true/false, no need to detect which languages). Is there any smarter method than sending two requests with different Accept-Language fields?
I've thought about this for quite some time and searched for useful info, to no avail. I assume there is no information on language in the URL itself. A perfect solution would use just one HTTP request, as that's what bothers me about the only solution I came up with: multiple HTTP requests. The detection feature is to be used in an AJAX call, so each additional HTTP request is a big hit on performance.
Any additional tips, e.g. how to choose which language to use in Accept-Language, are welcome.
Thanks!

You can specify more than one language in Accept-Language:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
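As a rough illustration of the header syntax described in that RFC (section 14.4), the sketch below joins several language tags into one Accept-Language value with descending q-values. The function name and the 0.1 step between weights are assumptions made for this example, not something the RFC prescribes.

```python
def build_accept_language(languages):
    """Join language tags into one Accept-Language value, most preferred first.

    The first tag is sent without a q-value (implicitly q=1.0); each later
    tag gets a weight 0.1 lower than the previous one, never below 0.1.
    """
    parts = []
    for index, tag in enumerate(languages):
        q = max(1.0 - 0.1 * index, 0.1)
        parts.append(tag if index == 0 else f"{tag};q={q:.1f}")
    return ", ".join(parts)


print(build_accept_language(["de", "en-GB", "en"]))
```

Sending several languages in one request only tells the server your order of preference; whether the response reveals that other translations exist still depends on the server.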

Related

Best practice for pagination in API REST

I'm new to API development and I wanted to know what the best way to implement pagination is:
GET request with query params (sort, limit, etc.)
POST request with params in the body (sort, limit, etc.)
I was leaning towards GET but my coworkers think POST is a better choice, so I just wanted your opinion.
GET would be the usual choice.
General purpose components will understand that the semantics of GET are safe, which means they are also idempotent. If a GET request receives no response, you can automatically retry it without any concerns about loss of property.
Furthermore, if all of the information you need to identify the resource is included in the URI, then you can bookmark the URI, or paste it into an email, or link to it in a document, and it will all "just work".
Also, using GET -- with all of the relevant details encoded into the resource identifier -- means that the response can be cached and re-used. Caches key on the request URI rather than the request body, and POST responses are cacheable only under narrow conditions, so parameters carried in a POST body give a cache nothing to distinguish one page from another.
At some point in the future, HTTPWG will register a new HTTP method to cover the safe method with a body case, which may change some of the answers.
In the meantime, it is okay to use GET.
GET is the recommended way to do this, because the answer can be cached and the goal is reading not writing. You can use the query string or range headers for pagination. The exact solution depends on your needs. There are a few standard query languages for this, like OData, but they are overkill for a simple API. Building a custom solution on top of URI templates might be a better choice, or there are non-standard query languages too like RQL, which can be completely or partially implemented in your solution.
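A minimal sketch of query-string pagination over GET, assuming hypothetical parameter names (sort, offset, limit) and an example host. Because every paging detail lives in the URI, each page is a bookmarkable, cacheable resource.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def page_url(base, offset, limit, sort):
    """Build the URI of one page; all paging state is in the query string."""
    query = urlencode({"sort": sort, "offset": offset, "limit": limit})
    return f"{base}?{query}"


def next_page_url(url):
    """Derive the URI of the following page by advancing offset by limit."""
    parts = urlsplit(url)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    params["offset"] = int(params["offset"]) + int(params["limit"])
    return f"{parts.scheme}://{parts.netloc}{parts.path}?{urlencode(params)}"


first = page_url("https://api.example.com/users", 0, 10, "name")
print(first)
print(next_page_url(first))
```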

Single API endpoint pros and cons

I am creating an API and trying to figure out whether my planned approach is any good.
The API is not public; it will be used by an SPA and a mobile app that I build. So I am thinking of a GraphQL-like design, but without posting JSON and with regular HTTP methods.
Something like this for GET methods:
Example 1 - get users with specific fields (_join indicates sql table join), ordering and limit: api.com?table=users&displayFields=id,name,email,address,tel,country_join&orderBy=asc&orderColumn=name&offset=0&limit=10
Example 2 - get users based on search parameters with all fields, ordering and limit: api.com?table=users&search=John&searchFields=name,email&orderBy=asc&orderColumn=name&offset=0&limit=10
I assume this is bad since REST is standard, otherwise I would see much more examples of this approach.
But why is this bad? For me it seems easier to develop and more flexible to use.
Is a proper REST API for the examples I provided easier to implement, safer, or easier to use or cache?
The main difference I see between putting the variables in the url vs the request body are:
the length of the data as the url length is limited while the request body is not
special characters to be escaped in the url which can lead to long and unclear url
These are 2 pros in favor of data in the request body, but I agree that data in the url is much simpler to test and use, as you don't need an http client tool like curl or postman to validate your endpoints.
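To make the escaping point concrete, here is a small sketch using Python's standard urllib.parse. The parameter names come from the question's examples; the search value containing an ampersand is made up for illustration.

```python
from urllib.parse import urlencode, parse_qs


def encode_query(params):
    """Percent-encode parameters so special characters survive in a URL."""
    return urlencode(params)


def decode_query(query):
    """Reverse the encoding; each key maps to a list of values."""
    return parse_qs(query)


query = encode_query(
    {"table": "users", "search": "John & Jane", "searchFields": "name,email"}
)
print(query)
print(decode_query(query))
```

The encoded form is longer and harder to read than the raw values, which is the trade-off described above.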
REST however has stricter conventions if you want to fully implement it:
use the right HTTP methods (GET, POST, PATCH, DELETE and PUT) to implement CRUD operations on one single endpoint
return the right HTTP status code as a result
use a standard data format for input and output (JSON or XML)
For better interoperability between systems it's advised to comply with REST and microservices design patterns.
For small applications we can take some shortcuts and not comply fully. I have integrated several services so far, and I can tell you that none of them implements standard REST :-)

REST API: Is it a really bad practice to create custom HTTP response codes?

Is it a bad practice when writing a RESTful API to use custom HTTP response codes like:
417 - Password not provided
418 - Database error
I see there is a list of standard HTTP response codes. However, from looking at Twitter's API, it appears Twitter tries to return standard HTTP response codes when available but their own error codes when they cannot align the error with a standard HTTP response (correct me if I am wrong).
What is the best practice for response codes (especially for errors) while creating a RESTful API? Any comments on the practice which Twitter chose to use?
Yes, yes it is bad practice... mostly.
One of the tenets of REST is that you work with the underlying protocols, and HTTP has already defined a good set of response codes.
However, not every situation is catered for perfectly. Take Twitter's 'Enhance Your Calm': that response code is used when the request was valid but is simply not being handled because too many requests are being made.
I don't see another response code that quite matches that. The other two options are to either lie, and tell the user the request failed for some other reason, or give a generic 400 'you did something bad' (then in the body give a more detailed explanation).
I would favour using the generic X00 codes, and use headers or the body to add more detail about what actually went wrong. This at least conforms better to standards and is less brittle.
Note though, it is terrible to take an existing error code, and repurpose it. 404 should always be used only for 'not found' errors. Don't start using it because the user can't make that request at this time of day.
The problems in using your own codes are:
The code you choose may get officially assigned to something completely different, and that could break your API in the future. (e.g. compare a 306 with a 301)
Intermediaries don't know what your code means, so cannot optimise anything. The internet works so well because it is a distributed system, not an end-to-end system.
There are generic responses for each category, x00, which should be used if nothing better exists.
You can send your own more specific error code in either the response body or (not as good) a response header. There should be no need to make up your own codes. If you have truly found something that would benefit the rest of the internet and no-one else has thought of until now, you can always submit an Internet Draft to the IETF (this is fairly easy to do).
I would not hold up Twitter as a shining example of good internet practice, though. :)
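The advice in both answers (a generic status code plus a specific error in the body) could look roughly like the sketch below. The JSON layout and the application-level code names are hypothetical; mapping the question's two cases to 400 and 500 is an assumption of this example.

```python
import json


def error_response(status, app_code, message):
    """Return (status, headers, body) with a standard HTTP status code and
    an application-specific error code carried in the JSON body."""
    body = json.dumps({"error": {"code": app_code, "message": message}})
    headers = {"Content-Type": "application/json"}
    return status, headers, body


# Client mistake: generic 400, detail in the body.
print(error_response(400, "PASSWORD_MISSING", "Password not provided"))
# Server-side failure: generic 500, detail in the body.
print(error_response(500, "DATABASE_ERROR", "Database error"))
```

Intermediaries only need to understand 400 and 500, while clients of the API can branch on the code in the body.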

What are the most used Accept-Language in HTTP header?

I am building a website and want to use the Accept-Language HTTP header to help visitors find their language. However, I have a hard time finding statistics about the use of Accept-Language.
Will most visitors have something set as their Accept-Language? Some places say things like "most modern browsers support Accept-Language", but does anyone have an overview of which specific browser versions support it? And will the browser language usually be set as Accept-Language by default if the user doesn't actively change their own Accept-Language settings? I guess most people don't change these settings, but that doesn't mean that Accept-Language is left blank?
Does anyone have statistics for the most used language codes set inside Accept-Language? I can make a mapping system to map them to my site languages, but I also have trouble finding good statistics about the most used codes. It would help a lot to get an overview of how to make this work better!
Thanks in advance!
Browsers send an Accept-Language header field out of the box. By default, they request the same language that is used for the browser's user interface.
As Oswald said, by default browsers set this to the language used by the browser UI. So no, the default setting isn't blank, it's something like "en-US,en".
The only figures that I have found are on https://panopticlick.eff.org/results?#fingerprintTable . That page tests for the amount of information contained in HTTP requests. On the test result page, after clicking on "Show full results for fingerprinting", for various pieces of information it shows their frequency in column "one in x browsers have this value".
In row "HTTP_ACCEPT Headers" it shows the frequency of a combination of some Accept header values given by the browser. For example, it says that one in 5.25 browsers send the value "text/html, */*; q=0.01 gzip, deflate, br en-US,en;q=0.5". Unfortunately, that value seems to be the concatenation of the values of headers "Accept" (somewhat stripped), "Accept-Encoding", and "Accept-Language", with a "br" thrown in for good measure.
As I wrote, when I experimented with panopticlick, it showed "one in 5.25 requests" for "en-US,en". This value is one of the more common ones, if not the most common one. One in 295.2 requests had just "en-US", one in 547.18 requests had just "en" and one in 1076.94 requests had "en,en-US" (which should have the same effect as "en", so it does not really make sense to use it).
Varying only the configuration of accepted languages, we can infer the frequency of the languages as seen by panopticlick. A more direct way would of course be to simply write to them and ask them for a table. I'm sure that the sample set of panopticlick is not representative for the entire internet, but at least it's a start.
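For the mapping step mentioned in the question, a simplified sketch of parsing an Accept-Language value and picking a site language might look like this. The function names are made up, the supported set is assumed to hold lowercase tags, and the matching rule (exact tag first, then primary subtag) is a simplification rather than a full implementation of the standard's language matching.

```python
def parse_accept_language(header):
    """Return (tag, q) pairs sorted by descending q, ties in header order."""
    entries = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag without an explicit weight counts as q=1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((tag.strip().lower(), q, position))
    entries.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(tag, q) for tag, q, _ in entries]


def choose_language(header, supported, default):
    """Pick the first acceptable tag the site supports, else the default."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "en-us" falls back to "en"
        if primary in supported:
            return primary
    return default


print(choose_language("en-US,en;q=0.5", {"en", "de"}, "de"))
```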

RESTful - Different URI of same resource to get different forms of same resource

I'm pushed into a peculiar situation where I'm not able to decide what is wrong and what is right.
I've a resource called Invoice. To get a JSON or XML representation I use below URI
somedomain.com/inovices/{InvoiceNumber} - Invoice number is numeric
Accept: application/xml
When I want a PDF of same resource I use below URI:
somedomain.com/inovices/{InvoiceNumber} - Invoice number is numeric
Accept: application/pdf
Both of the above URLs are served for authenticated requests. We also want to support the same resource using a GUID for unauthenticated requests, hence we want to use the URL below
somedomain.com/inovices/{GUID}
Accept: application/pdf
The above URL is like a permanent URL and anybody can access it any number of times. My confusion is whether providing a URL like the one above is RESTful or not, because in one URL I'm using the invoice number, which is numeric, and for the permanent URL I'm replacing it with a GUID.
The reason I felt this is wrong is that the same resource is identified by two different URIs (number and GUID) even though they return the same resource. Or is it just my assumption that it is wrong? What I'm not able to understand is whether it violates any REST constraint.
There's no problem at all with different URIs pointing at the same resource. It's not only ok, sometimes it's also recommended, if that would add value to the user.
Think of these examples:
GET /api/users/543
GET /api/users/bob-marley
or, as SO does:
GET /questions/16637720/restful-different-uri-of-same-resource-to-get-different-forms-of-same-resource
GET /q/16637720/1118323
There are similar examples everywhere. You may want to add this "unnecessary" information if it's going to help users, or SEO, and still keep the short version available. Or imagine scenarios where you would want to add more ways of accessing resources without breaking existing ones. It sounds quite common to me and you don't break any rules by having more than one URI for the same resource.
If you are worried that a user might think that the two resources are different since she used different URI, you can have one URI redirect to the other, to make it explicit that it's exactly the same resource (as SO does when you hit the short link, or the link without the thread heading).
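The redirect idea can be sketched as a tiny resolver, reusing the users example above. The in-memory data and the choice of the numeric URI as the canonical one are assumptions made for illustration.

```python
# Hypothetical in-memory data: one user reachable by numeric id or by slug.
USERS = {543: {"slug": "bob-marley", "name": "Bob Marley"}}
SLUG_TO_ID = {user["slug"]: user_id for user_id, user in USERS.items()}

PREFIX = "/api/users/"


def resolve(path):
    """Return (status, headers): 200 for the canonical numeric URI,
    301 pointing at it for the slug URI, 404 for anything else."""
    if not path.startswith(PREFIX):
        return 404, {}
    key = path[len(PREFIX):]
    if key.isdigit() and int(key) in USERS:
        return 200, {}
    if key in SLUG_TO_ID:
        return 301, {"Location": f"{PREFIX}{SLUG_TO_ID[key]}"}
    return 404, {}


print(resolve("/api/users/543"))
print(resolve("/api/users/bob-marley"))
```

Both URIs stay valid, and the redirect makes it explicit to clients that they identify the same resource.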