REST Handling of Removed Backwards-compatible Functionality

How does Semantic Versioning define the version change required when functionality is removed but the client will not necessarily break?
For example, if I have a resource that accepts a sort param:
/person?sort=name
If I removed the ability to sort, the existing clients could still consume the service (sort just would not be honored). Does SemVer consider this a backwards incompatible change? If not, what rule specifically addresses this situation?

From my perspective existing clients will break: they expect the result set to be returned in sorted order, and it won't be. That means that, quite likely, some web page or app screen somewhere will display data unsorted even though the user clicked a button asking for it to be sorted by name. The client's user experience changes negatively and unexpectedly, even though the application doesn't fall over. As such, you're talking about a breaking change, and so a major version increase.
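To illustrate with a hypothetical client (nothing from the question itself), a check that pins the expected behaviour keeps succeeding at the HTTP level but breaks on the expectation the moment sort stops being honoured:
[code]
// Hypothetical client for /person?sort=... - the call still succeeds after
// the change, but the behavioural expectation silently breaks.
def fetchPeople(sort: String): List[String] = ??? // assumed HTTP client stub

val names = fetchPeople(sort = "name")
assert(names == names.sorted, "server no longer honours ?sort=name") // fails once sort is ignored
[/code]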
If you could be absolutely sure that no client used the API element then it might be a different thing - e.g. if you work in a closed world where you can inspect and verify all your clients, then you might choose to pretend that the API element never existed and consider the change only a minor or patch change. But 99% of the time it should be a major version increase, and even in the closed-world situation I'd think that would be an extreme approach, justified only in certain circumstances.

Related

How to address multi-vendor ATM support in a Windows application

After reading the CEN/XFS programming reference I thought it would be "easy" to write ATM software that would be supported on all ATMs. At first glance, the whole standard seems reasonable to me in terms of portability.
However, to my great surprise, I have had access to some ATMs from well known vendors that do not have even the Microsoft XFS manager (msxfs.dll, etc.) installed. I thought this would be a very rare case.
I have been told that some vendors have their own XFS manager. Is it true? I thought JXFS or a vendor specific layer would depend on the CEN/XFS manager under the hood.
If so, do I have to be aware of all the vendor-dependent APIs? I refuse to believe this industry works like this.
The sad truth is that generic software doesn't work that well on any of the ATMs out there.
Generally speaking, I believe every vendor creates their own XFS manager. The XFS manager in use is pretty generic, though, so whoever provides it is not that big a deal. The actual device and service provider implementations are where the real differences lie.
So you could write your software against a common subset of the features, and you could even get a decent level of operability using that approach. Well, until you need to start handling the error cases, that is. At that point the limitations create situations that just make such generic software useless in practice.
The reason is simply that the devices are so different at the implementation level, and thus can do different things during and after error conditions.
So even though the CEN/XFS error codes might be the same for two vendors, the required operations can be quite different: their responses may indicate different severity, or the error condition might even be self-clearing on one machine but require operator intervention on another.
Naturally you want all the available benefits from the hardware you have, so at that point you start to need configuration options that are simply outside the scope of CEN/XFS. Once you go that way you get the benefits of the hardware, but it also means higher complexity in your software. Oh, and you'll need lots and lots of testing, as sadly you can't really trust vendor documentation either...
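To make the configuration point concrete, here is a hedged sketch (the vendor names, error code, and policy entries are invented for illustration, not real CEN/XFS data) of the kind of per-vendor error policy that ends up living outside the standard:
[code]
// Sketch: the same CEN/XFS-style error code can demand different handling
// per vendor, so the mapping has to live in configuration.
sealed trait Recovery
case object SelfClearing extends Recovery          // wait and retry
case object OperatorIntervention extends Recovery  // stop and raise an alarm

object ErrorPolicy {
  // keyed by (vendor, errorCode); the entries here are made up
  val policies: Map[(String, Int), Recovery] = Map(
    ("VendorA", -14) -> SelfClearing,
    ("VendorB", -14) -> OperatorIntervention
  )

  // default to the safe option when a combination is unknown
  def recoveryFor(vendor: String, errorCode: Int): Recovery =
    policies.getOrElse((vendor, errorCode), OperatorIntervention)
}
[/code]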

In the Diode library for scalajs, what is the distinction between an Action, AsyncAction, and PotAction, and which is appropriate for authentication?

In the Scala and Scala.js library Diode, I have used but not entirely understood the PotAction class, and only recently discovered the AsyncAction class, both of which seem to be favored in situations involving, well, asynchronous requests. While I understand that, I don't entirely understand the design decisions and the naming choices, which seem to suggest a narrower use case.
Specifically, both AsyncAction and PotAction require an initialModel and a next, as though both are modeling an asynchronous request for some kind of refreshable, updateable content rather than a command in the sense of CQRS. I have a somewhat-related question open regarding synchronous actions on form inputs by the way.
I have a few specific use cases in mind. I'd like to see a sketch (I'm not asking for an implementation, just the concept) of how you would use something like PotAction in conjunction with any of:
Username/password authentication in a conventional flow
OAuth-style authentication with a third party involved and a redirect
Token or cookie authentication behind the scenes
Server-side validation of form inputs
Submission of a command for a remote shell
All of these seem to be a bit different in nature from what I've seen using PotAction, but I really want to use it because it has already been helpful when I am, say, rendering something based on the current state of the Pot.
Historically speaking, PotAction came first and AsyncAction was later generalized out of it (to support PotMap and PotVector), which may explain their relationship a bit. Both provide abstraction and state handling for processing async actions that retrieve remote data, so they were created for a very specific (and common) use case.
I wouldn't, however, use them for authentication, as that is typically something you do even before your application is loaded or any data is requested from the server.
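For the authentication flows in the question, a plain Action plus an Effect is usually enough. Here is a rough sketch against Diode's core API (written from memory, so treat the names and signatures as approximate):
[code]
import diode._
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical actions for a conventional username/password flow
case class LogIn(user: String, pass: String) extends Action
case class LoggedIn(token: String) extends Action
case class LoginFailed(reason: String) extends Action

// assumed client function hitting your auth endpoint
def authCall(user: String, pass: String): Future[String] = ???

// zooms into an Option[String] holding the session token
class AuthHandler[M](modelRW: ModelRW[M, Option[String]]) extends ActionHandler(modelRW) {
  override def handle = {
    case LogIn(u, p) =>
      // fire the request and map success/failure back into plain actions
      effectOnly(Effect(authCall(u, p).map(t => LoggedIn(t): Action)
        .recover { case e => LoginFailed(e.getMessage) }))
    case LoggedIn(token) => updated(Some(token)) // store the session token
    case LoginFailed(_)  => updated(None)
  }
}
[/code]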
Form validation is usually a synchronous thing; you don't do it in the background while the user is doing something else, so again Async/PotAction are not a very good match, nor do they provide much added value.
Finally, for the remote command use case, PotAction might be a good fit, assuming you want to show the results of the command to the user when they are ready. Perhaps PotStream would be even better, depending on whether the command produces a steady stream of data or just a single message.
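For that remote-command case, a hedged sketch of what the action shape might look like (the potResult/next members are what PotAction requires, but the exact handler wiring varies between Diode versions):
[code]
import diode.data.{Empty, Pot, PotAction}

// Sketch only: a command whose result is tracked as a Pot, so the UI can
// render Pending/Ready/Failed states while waiting for the remote shell.
case class RunCommand(cmd: String, potResult: Pot[String] = Empty)
    extends PotAction[String, RunCommand] {
  def next(newResult: Pot[String]) = copy(potResult = newResult)
}
[/code]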
In most cases you should use the various Pot structures for what they were meant for, that is, fetching and updating remote data, and maybe apply some of the ideas or internal models (such as the retry mechanism) to other request types.
All the Pot stuff was separated from Diode core into its own module to emphasize that they are just convenient helpers for working with Diode. Developers should feel free to create their own helpers (and contribute back to Diode!) for new use cases.

What is the benefit of versioning a REST api by date as Twilio does?

Basically, I think it's a good idea to version your REST API. That's common sense. Usually you meet two approaches to doing this:
Either, you have a version identifier in your url, such as /api/v1/foo/bar,
or, you use a header, such as Accept: vnd.myco+v1.
So far, so good. This is what almost all big companies do. Both approaches have their pros and cons, and lots of this stuff is discussed here.
Now I have seen an entirely different approach, at Twilio, as described here. They use a date:
At compilation time, the developer includes the timestamp of the application when the code was compiled. That timestamp goes in all the HTTP requests.
When the request comes into Twilio, they do a look up. Based on the timestamp they identify the API that was valid when this code was created and route accordingly.
It's a very clever and interesting approach, although I think it is a bit complex. For example, it can be confusing whether the timestamp refers to compilation time or to when the API version was released.
Now while I somehow find this quite clever as well, I wonder what the real benefits of this approach are. Of course, it means that you only have to document one version of your API (the current one), but on the other hand it makes traceability of what has changed more difficult.
Does anyone know what the advantages of this approach are, so why Twilio decided to do so?
Please note that I am aware that this question sounds as if the answer(s) are primarily opinion-based, but I guess that Twilio had a good technical reason to do so. So please do not close this question as primarily opinion-based, as I hope that the answer is not.
Interesting question, +1, but from what I see they only have two versions: 2008-08-01 and 2010-04-01. So from my point of view that's just another way to spell v1 and v2, and I don't think there was a technical reason, just a preference.
This is all I could find on their decision: https://news.ycombinator.com/item?id=2857407
EDIT: make sure you read the comments below, where @kelnos and @andes mention an advantage of using such an approach to version the API.
There's another thing I can think of that makes this an interesting approach: being the developer of such an API.
You have 20 methods, and you need to introduce a breaking change in one of them.
Using SemVer (v1, v2, v3, etc.) you need a v2 API.
All 20 of your methods now need to respond under v2, but in reality 19 of them haven't changed at all and aren't new.
Using dates, you can keep your unchanged methods as they are, and when a request comes in, it just picks the best match.
I don't know how this is implemented; any information on that would be really welcome.
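One plausible implementation (a sketch with invented names, not anything Twilio has published) keeps every dated variant of an endpoint and serves the newest one that is not newer than the requested date:
[code]
import java.time.LocalDate

object DateRouter {
  // hypothetical handler type: request params in, response body out
  type Handler = Map[String, String] => String

  // every dated implementation per endpoint; a date marks a breaking change
  val versions: Map[String, List[(LocalDate, Handler)]] = Map(
    "/person" -> List(
      LocalDate.parse("2008-08-01") -> (_ => "2008 behaviour"),
      LocalDate.parse("2010-04-01") -> (_ => "2010 behaviour")
    )
  )

  // best match = newest implementation with date <= the requested date
  def route(path: String, requested: LocalDate): Option[Handler] =
    versions.get(path).flatMap { impls =>
      impls.filter { case (d, _) => !d.isAfter(requested) }
           .sortBy(_._1.toEpochDay)
           .lastOption
           .map(_._2)
    }
}
[/code]
Unchanged endpoints simply keep their single dated entry, so only methods with breaking changes grow a new implementation.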
I used to work for a company that used date versioning (as in, each API call had a param with the desired API date: ?v=20200630) and loved it.
It lets you be less strict than with traditional versioning (v1, v2, v3), as client developers don't even need to care about the version number and can just use the current build time. Everything else is pretty much the same as with traditional versioning, plus a small benefit from seeing date checks in the server code: you can easily see how old this or that code path is.
I believe the situation would have been different if we had to support a number of external clients and, for example, fix a bug in ?v=20200630 - there is no elegant way to specify something like ?v=20200630.1. As you can see from Twilio's experience, they were just changing what API version 2010-04-01 was - thus a client couldn't be sure exactly which version it was seeing.
So my takeaway from this:
Date-based versioning seems easier and more flexible when you are a typical startup or a small company with a few apps (e.g. frontend, iOS, Android) and no or few 3rd-party clients. It makes it a bit easier for client developers to "just write code", and since you control all the code, most of the time you can fix old API bugs by simply releasing a new version and asking clients to switch to it.
Once you have a real need to maintain old API versions (i.e. when you have a number of important clients who are not likely to update quickly), explicit v1/v2-style versioning becomes more reliable.

Ruminations on highly-scalable and modular distributed server side architectures

Mine is not really a question; it's more of a call for opinions - and perhaps this isn't even the right place to post it. Nevertheless, the community here is very informed, and there's no harm in trying...
I was thinking about ways to create a highly scalable and, above all, highly modular back-end architecture - for example, an entire back-end ecosystem for a large site with the potential for future-proof evolution into a massive one.
This would entail a very high degree of separation of concerns, to the extent that not only could (say) the underlying DB be replaced (i.e. from Oracle to MySQL) but the actual type of database could be replaced (e.g. SQL to KV, or vice versa).
I envision a situation where each sub-system exposes its own API within the back-end ecosystem. In this way, the API could remain constant, whilst the implementation could change (even radically) over time.
The system must be heterogeneous in that it's not tied to a specific language. It must be able to accommodate modules or entire sub-systems using different languages.
It then occurred to me that what I was imagining was simply the architecture of the web itself.
So here is my discussion point: apart from the overhead of using (mainly) text-based protocols, is there any overriding reason why a complex back-end architecture should not be implemented in the manner I describe, or is there some strong rationale I'm missing for using communication protocols such as Twisted, AMQP, Thrift, etc.?
UPDATE: Following a comment from @meagar, I should perhaps reformulate the question: are the clear advantages of using a very simple, flexible and well-understood architecture (i.e. all functionality exposed as a series of RESTful APIs) enough to compensate for the obvious performance hit incurred when using this architecture in a back-end context?
[code]the actual type of database could be replaced (e.g. SQL to KV, or vice versa).[/code]
And anyone who wrote a join between two tables will be sad. If you want the "ability" to switch to KV, then you should not expose an API richer than what KV can support.
The answer to your question depends on what it is you're trying to accomplish. You want to keep each module within reasonable bounds. Use proper physical layering of code, defined interfaces with side-effect contracts, and test cases for each success and failure case of each interface. That way you can depend on things like "when a user enters the blah page, a user-blah fact is generated so that all registered fact listeners will be invoked". This allows you to extend the system without having direct calls from point A to point B, while still having some kind of control over widely disparate dependencies. (I hate code bases where you can't find-all to locate every possible reference to a symbol!)
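A minimal sketch of that fact-listener idea (the names are mine, purely illustrative):
[code]
// Modules publish "facts" instead of calling each other directly; every
// registered listener for a fact is invoked when it is published.
trait FactListener {
  def onFact(fact: String, payload: Map[String, Any]): Unit
}

object FactBus {
  private var listeners = Map.empty[String, List[FactListener]].withDefaultValue(Nil)

  def register(fact: String, l: FactListener): Unit =
    listeners = listeners.updated(fact, l :: listeners(fact))

  def publish(fact: String, payload: Map[String, Any]): Unit =
    listeners(fact).foreach(_.onFact(fact, payload))
}
[/code]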
However, the reason we put lots of code and classes into a single system is that calling between systems is often very, very expensive. You want to think in terms of code modules making requests of each other where you can. The difference in timing between a function call and a REST call is something like one to a million (maybe you can get it as low as one to ten thousand if you only count cycles, not wall-clock time - but I'm not so sure). Also, anything that goes over a wire in a datacenter may potentially suffer from packet loss, because there is no such thing as a 100% loss-free data center, no matter how hard you try. Packet loss means random latency spikes in your application's response time.

Mocks... and Verifiers?

Currently, I am looking deeper into testing techniques, though I am not sure whether I still reside in unit-test land or have already left it for the land of integration tests.
Let me elaborate a bit. Given two components A and B, where A uses B, we have a certain "upwards-contract" for B and a certain "downwards-contract" for A. Basically this means: if A uses B correctly and B behaves correctly, then both contracts will be fulfilled and things will work correctly.
I think mocks are a way to guarantee a subset of an upwards-contract that is required for a given testcase. For example, a database connection might have the upwards contract to retrieve data records if they have been inserted earlier. A database connection mock guarantees to return certain records, without requiring their insertion into the database.
However, I am currently wondering if there is a way to verify the downwards-contract as well. Given the example of the database connection, the downwards-contract might be: you must connect to the database, ensure the connection exists and works, and issue correct SQL queries.
Does anyone do something like this? Is it worth the work for more complicated contracts? (For example, the database connection might require an SQL parser in order to completely verify calls to the database layer.)
Greetings, tetha
This is really the difference between mocks and stubs - mocks verify exactly that (or at least can do so - you can use mocks as stubs with most frameworks). Essentially, mocks allow you to do protocol testing rather than just "if you call X I'll give you Y". Every mocking framework I've used allows you to easily verify things like "all these calls were made" and "these calls happened in a particular order".
The more protocol you enforce between components, the more brittle the tests will be - sometimes that's entirely appropriate (e.g. "you must authenticate before you perform any other operations") but it's easy to end up with tests which have to be changed every time you change the implementation, even in a reasonable way.
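As a concrete illustration, here is a hand-rolled sketch (the Db interface is hypothetical) of a mock that records calls so a test can verify both which calls were made and their order:
[code]
trait Db {
  def connect(user: String, pass: String): Unit
  def query(sql: String): List[String]
}

// Records every call so the test can assert on the protocol afterwards.
class DbMock extends Db {
  val calls = scala.collection.mutable.Buffer.empty[String]
  def connect(user: String, pass: String): Unit = calls += s"connect($user)"
  def query(sql: String): List[String] = { calls += s"query($sql)"; Nil }
}

// In a test: exercise component A against the mock, then verify the
// downwards-contract, e.g. "connect must happen before any query".
val db = new DbMock
// ... run the code under test against db ...
assert(db.calls.headOption.forall(_.startsWith("connect")), "must connect first")
[/code]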
Does anyone do this?
Yes, I sometimes use mocks to verify "downwards-contracts". For example, you can use the DB mock to check whether the correct credentials were used for the login. Especially if you have interfaces to other subsystems, you can mock them up and let the mock-up check for usage violations. For instance, if a subsystem requires an initialization call or some kind of registration, then your mock-up of the subsystem interface could enforce this, too.
Is it worth the work?
It depends on how deep you want your tests to be; let me give you some examples of different depths:
If you want to check the proper sequence of calls to an interface, then a simple state machine may be sufficient (see the sketch after this list).
If you want to verify the proper usage of an interface language (SQL in your example), you have to use a parser.
If you want to verify that it actually works with the real subsystem, then write an integration test (this cannot be done with mock-ups).
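For the first level, a minimal sketch (with a hypothetical Subsystem interface) of a state-machine mock that rejects out-of-order calls:
[code]
trait Subsystem {
  def init(): Unit
  def register(id: String): Unit
  def use(): Unit
}

// The mock's internal state machine enforces init -> register -> use;
// any out-of-order call fails the test immediately.
class SequenceCheckingMock extends Subsystem {
  private var state = "new"
  private def expect(allowed: String, next: String): Unit = {
    require(state == allowed, s"illegal call in state '$state'")
    state = next
  }
  def init(): Unit               = expect("new", "initialized")
  def register(id: String): Unit = expect("initialized", "registered")
  def use(): Unit                = expect("registered", "registered")
}
[/code]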
Conclusion:
If appropriate, it should be done. However, you cannot expect a mock-up to find each and every wrong use of the interface. For example, a database mock-up will hardly detect whether a deadlock will be caused by two concurrent transactions.