I came across the term "Opinionated API" while reading about the ssl.create_default_context() function introduced in Python 3.4. What does it mean? What is the style of such an API? Why do we call it an "opinionated" one?
It means that the creator of the API makes some choices for you that are, in her opinion, the best.
For example, a web application framework could choose to work best with (or even bundle or work exclusively with) a selection of lower-level libraries (for stuff like logging, database access, session management) instead of letting you choose (and then have to configure) your own.
In the case of ssl.create_default_context(), security experts have thought about reasonably secure defaults for configuring SSL connections. In particular, it limits the available algorithms to those that are still considered secure, at the expense of complete compatibility with legacy systems, a trade-off that is beneficial in their (and my) opinion.
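For example, in Python (a minimal sketch; the manual settings shown are illustrative, not an exhaustive secure configuration):

    import ssl
    import urllib.request

    # Opinionated: one call picks protocol versions, cipher suites,
    # certificate verification, and hostname checking for you.
    context = ssl.create_default_context()

    # Do-it-yourself alternative: every security-relevant knob is
    # yours to set, and yours to get wrong.
    manual = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    manual.minimum_version = ssl.TLSVersion.TLSv1_2
    manual.load_default_certs()

    with urllib.request.urlopen("https://example.org/", context=context) as resp:
        print(resp.status)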
Essentially they are saying "we have a lot of experience in this domain, and we really think you should do things in the following way".
I suppose this is a response to "enterprise" APIs that claim to work with every implementation of as many standard interfaces as possible (at the expense of complexity in configuration and combination, requiring costly consultants to set everything up).
Or a natural extension of "Convention over Configuration".
Things should work very well out of the box, so that you only have to twiddle with expert settings in special cases (and by then you should know what you are doing), as opposed to even a beginner having to make informed decisions about every aspect of the application (which can end in disaster).
After reading the CEN/XFS programming reference, I thought it would be "easy" to write ATM software that would be supported on all ATMs. At first glance, the whole standard seems reasonable to me in terms of portability.
However, to my great surprise, I have had access to some ATMs from well-known vendors that do not even have the Microsoft XFS manager (msxfs.dll, etc.) installed. I thought this would be a very rare case.
I have been told that some vendors have their own XFS manager. Is that true? I thought JXFS or a vendor-specific layer would depend on the CEN/XFS manager under the hood.
If so, do I have to be aware of all the vendor-dependent APIs? I refuse to believe this industry works like this.
The sad truth is that generic software doesn't work that well on any of the ATMs out there.
Generally speaking, I believe every vendor creates their own XFS manager. The XFS manager itself is pretty generic, though, so who provides it is not that big a deal. The actual device and service provider implementations are where the real differences lie.
So you could write your software against a common subset of the features, and you could even get a decent level of operability with that approach. Well, until you need to start handling the error cases, that is. At that point the limitations create situations that make such generic software useless in practice.
The reason is simply that all the devices are so different at the implementation level, and thus can do different things during and after error conditions.
So even though the CEN/XFS error codes might be the same for two vendors, the required operations can be quite different: their responses may indicate different severity, and an error condition might even be self-clearing on one machine but require operator intervention on another.
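As a rough illustration, generic software ends up needing a per-vendor recovery table; everything here except the WFS_ERR_HARDWARE_ERROR code name is hypothetical:

    from enum import Enum, auto

    class Action(Enum):
        RETRY = auto()            # error clears itself, just try again
        RESET_DEVICE = auto()     # device needs a reset command first
        OPERATOR_NEEDED = auto()  # take the unit out of service

    # The same CEN/XFS error code, but different recovery per vendor.
    RECOVERY = {
        ("vendor_a", "WFS_ERR_HARDWARE_ERROR"): Action.RETRY,
        ("vendor_b", "WFS_ERR_HARDWARE_ERROR"): Action.OPERATOR_NEEDED,
    }

    def recover(vendor, xfs_error):
        # Fall back to the safest interpretation for unknown combinations.
        return RECOVERY.get((vendor, xfs_error), Action.OPERATOR_NEEDED)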
You naturally want all the available benefits from the hardware you have, so at that point you start to need configuration options that are simply outside the scope of CEN/XFS. Once you go that way you do get the benefits of the hardware, but it also means higher complexity in your software. Oh, and you'll need lots and lots of testing, as sadly you can't really trust vendor documentation either...
I am from a network and OS operations background, not a development one. I have some experience writing Python and PHP code, and I studied software development in college.
As a hobby project (for now), I am planning to build a small website, which will have a component that stores PII and sensitive information. I have to give security first preference and performance second (mainly encryption/decryption performance).
My target is to have everything encrypted wherever possible, and also to have code which, by default, gives as little room as possible for exploitation. The site will be hosted on a Linux system.
The whole idea of the project is to learn a language in depth (as much as possible), and I feel I will be much more focused if I pick an idea that I like. That idea involves handling PII and other sensitive information. And if the end product turns out well, I will open it up, hence my wanting to make a good choice of language to write the code in.
I have done some reading and saw people mentioning that C/C++ would be good for the backend, as it gives good performance and flexibility, but security is not easy. The next best choice would be Ada 2012, as it gives more security than C/C++ and also does not compromise on performance. Java can also provide security, but can be slightly slower. And then Python/Ruby.
I am thinking that Ada 2012 may be a good choice, but I don't want to get into a position wherein I learn it to some extent and then realize that I would have been better off with Python or Java or some other language.
I want to hear from the experts on these three specific questions. Which language will be ideal to develop this site, so that:
1. the best available encryption/decryption libraries can be used?
2. the features of the language can be leveraged to write inherently secure code?
3. the more performance that can be gained, the better?
Please advise. Also, if someone has done website development (especially for sites handling PII) using Ada, please share your experience.
I know each and every language has advantages and disadvantages. The intent behind my query is to learn from the experience of those who have spent many years as website developers and have used multiple languages and frameworks to develop websites handling sensitive data. If the mods think the question could start a good-versus-bad language war, I apologize, as that is not the intent, and I will close the question.
The features of the language can be leveraged to write inherently secure code? Ada's type system supports writing code that validates data before usage. It's a feature of the language that helps with IT security. But of course there is much more to IT security than that: configuring firewalls; using systemd to specify, for example, how many processes of an executable are allowed to run simultaneously by the OS, how much memory each process is allowed to allocate, and which directories the different applications have access to and with which permissions; and so on. I am sure there is a lot I don't mention or cover in this short response.
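To make the validate-before-usage idea concrete without writing Ada, here is a rough Python analogue of an Ada constrained type (a sketch only; Ada enforces this at the type-system level, Python only at construction time):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Port:
        """Rough analogue of Ada's "type Port is range 1 .. 65535;"."""
        value: int

        def __post_init__(self):
            # Validation happens at construction, so any Port that
            # exists is already known to be in range.
            if not 1 <= self.value <= 65535:
                raise ValueError(f"port out of range: {self.value}")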
The best available encryption/decryption libraries can be used? The best library for cryptography, to my knowledge, is the Ada-Crypto-Library: https://github.com/cforler/Ada-Crypto-Library.
But what is asked for is making a safe web application. For the TLS ("Secure Socket Layer", https) side, the Ada-Crypto-Library is not used in any HTTP server implementation that I am aware of. If one wants to develop a web application in Ada, there are three options that I see: AWS (Ada Web Server) from AdaCore, which is included in the Community Edition of the GNAT compiler (www.adacore.com); the HTTP server implementation in Dmitry Kazakov's Simple Components (http://www.dmitry-kazakov.de/ada/components.htm); or GNOGA (www.gnoga.com), which is implemented on top of Dmitry Kazakov's Simple Components. Oh wait, Matreshka may also be used, but I haven't used it yet so I cannot comment (http://forge.ada-ru.org/matreshka).
According to the documentation of AWS it can be compiled to use either OpenSSL, LibreSSL or GNUTLS (http://docs.adacore.com/live/wave/aws/html/aws_ug/building_aws.html#requirements).
With Simple Components and GNOGA the Secure Socket Layer implementation is provided by GNUTLS.
Another option for providing SSL to a web application is to use the Apache web server as a proxy that handles the encryption (I have never done such a setup, only heard of the existence of this possibility).
Also, the more performance can be gained, the better? I like performance, and how to get the best performance is a vast subject. On the whole I think Ada is a good programming language choice for those who like performance. Off the top of my head, to maximise performance using Ada one should:
1) When using the standard containers with the GNAT compiler, one may use "pragma Suppress (Tampering_Checks);" to increase the performance of one's application. Not everyone agrees with the approach of having one debug build with the tampering checks turned on and one release build with the checks off, since one trades safety for performance, but it has a noticeable impact on performance. As an alternative to the standard containers, one may use the Ada trait-based containers (https://github.com/AdaCore/ada-traits-containers). They may be the world's best-designed containers for the Ada programming language.
2) Avoid usage of Unbounded_String in the standard library. One may instead use the XString unbounded-string implementation in the GNATColl library, which may give a 10x performance boost. Also consider allocating ordinary Strings inside memory pools (or subpools) if possible (I've done that in the Xml_Parser application in this repository: https://github.com/joakim-strandberg/wayland_ada_binding).
EDIT: I deliberately avoid arguing whether or not Ada, Java or Python is better and instead focus on, if you would do it in Ada, what would you need to do and consider.
Short answer: no, such a system is never possible. PII is less sensitive than a nuclear program.
Long answer:
1. The best available encryption/decryption libraries can be used?
As your question notes, encryption comes with decryption. The SHA-1 hash is broken now, so check the alternatives (https://www.forbes.com/sites/forbestechcouncil/2017/04/13/sha-1-encryption-has-been-broken-now-what/#35e33f317ee7), and if you want to dig deeper, it is not about the libraries, it is about the algorithm used for the job. Any encryption can be broken sooner or later.
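To illustrate in Python (a minimal sketch using only the standard library; the inputs and the PBKDF2 iteration count are illustrative):

    import hashlib
    import os

    # SHA-256 instead of the broken SHA-1 for integrity checks.
    digest = hashlib.sha256(b"payload").hexdigest()

    # For passwords, use a deliberately slow key derivation (PBKDF2
    # here; argon2 or scrypt are also worth a look).
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", b"secret passphrase", salt, 600_000)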
2. The features of the language can be leveraged to write inherently secure code?
There is no such thing as a secure language, and no language feature will save you. There are a few frameworks based on sound security principles; just follow a set of practices to make your code secure.
If you follow them you should be reasonably safe; if you don't, there could be trouble, and there are around 5,000 free tools (unofficial number) that can be run against a website to break it. Are you willing to test your system against that many tools?
3. Also, the more performance can be gained, the better?
The stronger the encryption and security, the more performance you lose. It is always a trade-off, so choose your treadmill.
Security is a very vague and broad term, and everyone gets hacked, even the likes of Yahoo and Symantec (https://gizmodo.com/researchers-made-a-clever-tool-to-detect-hacks-companie-1821293404).
Still not convinced? Here is the state of the art: https://en.wikipedia.org/wiki/Stuxnet. And even this is years old now, and just 500 kilobytes of threat.
My 2 cents: since we deal in 0s and 1s, define clear goals in terms of security and performance, then build a PoC (proof of concept) and run some benchmark tests.
I work in a small organization that has built an enterprise SaaS solution. Up until this point our workflows have had no programmatic interface. We're moving to a model that will allow for an end user to do anything programmatically that can be done in the UI. I'm looking for suggestions in terms of the language/framework that you would use to build that programmatic layer.
From an organizational perspective I would like the current UI team to also have ownership of the API. That team is familiar with PHP, Rails, and JavaScript. Our current back-end code is written in Scala. I'm leaning toward not doing the APIs in Scala because it doesn't seem like the right tool for the job, and because of the lack of subject-matter expertise around it on the UI team.
From a functionality perspective, most of the APIs will be fairly simple database operations (CRUD) with perhaps some simplistic business logic applied on top (search, for example).
I'm a bit intrigued by using Node.js for this, as everyone on the team is really strong with JavaScript. That being said, I don't just want to hop on the semi-new-technology bandwagon. Because it is enterprise software, unit-testing frameworks, reusability, and extensibility are all important considerations as well.
Any suggestions?
I realize this question was about technology options, but there's a fundamental concern that seems really important to call out:
From an organizational perspective I would like the current UI team to also have ownership of the API.
While this sounds like a logical approach, it may not work out well unless your UI team is made up of really solid engineers. SaaS API development is arguably one of the most challenging aspects of modern software design. A great API will make everyone's lives easier, while a poor API will bring your system to its knees and leave you completely clueless as to why.
As a quick example, if you don't solve the end user's needs in the right way, you're likely to force a number of n+1 problems on them (and thus on you).
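A sketch of the n+1 problem from the client's point of view, with hypothetical endpoints and the requests library:

    import requests

    BASE = "https://api.example.com"  # hypothetical service

    # The n+1 trap: one request for the list, then one more per item.
    order_ids = requests.get(f"{BASE}/orders").json()
    orders = [requests.get(f"{BASE}/orders/{oid}").json() for oid in order_ids]

    # A batch-friendly design lets the client do it in one round trip.
    orders = requests.get(f"{BASE}/orders", params={"expand": "details"}).json()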
There is a bunch of great material out there about how to design great APIs and even more about the pitfalls of designing a bad one. Generally speaking, most of the UI devs I've worked with, particularly ones that are only familiar with scripting languages, are not people I would entrust to API design. Instead I would utilize them as customers (in a Scrum sense) who guide the design by describing end-user needs.
I faced something like this on a previous project, where we ended up going with a combination of Esper and our own DSL written using ANTLR 3.0. Our biggest concern with using a fully functional runtime was sandboxing the user's code.
That said, I think Node.js would be one of the easier ones to sandbox, and it fits your needs. Maybe use something like this: http://gf3.github.com/sandbox/, or look into Cloud9's code to see how they keep things safe. I also like that with Node.js you could give your users a pretty nifty editor using Ace.
Also check out this post: How to run user-submitted scripts securely in a node.js sandbox?
I have been having a debate with a friend about a library I maintain (it's Python, but I didn't include that as a tag since the question applies to any language) that has a few dependencies. The debate is whether to provide a default environment in the initialization or to force the user of the code to explicitly set one.
My opinion is to force the user, as it's explicit, avoids confusion, and makes it clear what they are pointing to.
My friend thinks it is safer and more convenient to default to an environment and let the user override it if he wants to.
Thoughts? Are there any good references, examples, or patterns in popular libraries that support either of our arguments? Also, are there any popular blogs or articles that discuss this API design point?
I don't have any references, but here are my thoughts as a potential user of said library.
I think it's good to have a default configuration available to allow developers to quickly evaluate the library. I don't want to have to go through a bunch of configuration just to see if the library will do what I need. Once I'm happy that the library will do what I need it to do, then I'm happy to configure it the way I want.
A good example is Microsoft's ASP.NET MVC framework. When you create a new MVC project it hooks in a default authentication and membership provider, which allows the developer to get a functioning application up and running very quickly. It is also easy to configure different providers if the default ones don't meet the requirements of the application in question.
As a slightly different example, Atlassian Confluence is wiki software which supports many different back-end databases. Atlassian could have chosen to ship with no default DB configuration, but instead Confluence ships with a default, simple, file-based database to allow users to evaluate the software. For production installations you can then hook up Oracle, SQL Server, MySQL or whatever else you like.
There may be instances where a default configuration for a library doesn't really make sense, but I think that would be a special case rather than a general rule.
It depends. If you can provide sensible defaults, you might want to do that: it will make life easier for the occasional user of the library, as they can set only the relevant settings, as opposed to the whole environment (with possibly some settings whose implications they don't fully understand yet). You are correct that in some situations this leads to frustration and confusion, as the defaulted settings might cause behavior the (inexperienced) user doesn't expect. You have to weigh the reduced friction of convenience against the price of not-understood defaults for each of these possible-to-default settings, and the choice for one might affect the choice for other, related settings as well.
On the other hand, if there is no sensible default (e.g., DB credentials or a remote address), you should require the user to provide those settings.
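A minimal sketch of that split, with illustrative names: default what is safe to default, require what has no sensible default.

    class Client:
        def __init__(self, db_credentials, remote_address,
                     timeout=10.0, retries=3, log_level="INFO"):
            # No sensible default exists for these, so the caller
            # must provide them explicitly.
            if not db_credentials or not remote_address:
                raise ValueError("db_credentials and remote_address are required")
            self.db_credentials = db_credentials
            self.remote_address = remote_address
            # Safe, documented defaults that the user may override.
            self.timeout = timeout
            self.retries = retries
            self.log_level = log_level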
The key in both cases is to provide enough information, in the documentation of the library and in the error messages (for either missing or conflicting settings), that the user can figure out what those settings actually mean and control without having to read through the source code of the library. This part is hard because 1) it is usually tedious from the point of view of the library developer (so it is often skimped on), and 2) the documentation has to be written from the mindset of a newcomer to the library, which is often different from the library developer's mindset: the latter knows the implicit connections and implications, while the former has to be told about them in an understandable way.
Although not exactly identical in terms of problem domain, this strikes me as the Convention over Configuration argument.
There has been quite a lot of momentum behind CoC in recent years, and in my mind, it makes a whole lot of sense. As long as flexibility is not lost, you have everything to gain. Lower-friction development is what we are all after, and if I've got to configure every aspect of your API in order to get it working, I'm less inclined to use it over another API of equal functionality.
I happen to like Hanselman's podcasts, so if you want a little light listening, check out this podcast.
I think your question needs some clarification. For starters, I don't think a library should have any runtime configuration. In terms of dependencies, library dependencies should be handled in a manner appropriate to the environment they are being written for. In Python, those dependencies should be declared in the setup.py file (under install_requires), and ultimately that file should meet the requirements of whatever service you plan on making the library available through (e.g., PyPI for Python).
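A minimal sketch of such a setup.py (package names and version pins are illustrative):

    from setuptools import setup, find_packages

    setup(
        name="mylib",                # illustrative package name
        version="0.1.0",
        packages=find_packages(),
        install_requires=[
            # Runtime dependencies belong here, not in ad-hoc
            # runtime configuration.
            "requests>=2.0",
            "sqlalchemy>=1.4",
        ],
    )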
For applications, it is completely okay to require runtime configuration, but you should try to have sensible defaults. If your application depends on libraries, that dependency should be handled the same way a library dependency would be, even though that information may be redundant in the context of an installer (if one is needed). For the most part, first-run scripts and their ilk should be a part of the installer/RPM.
For Web Frameworks, it is typical that your app would carry configuration with it, and likely that it would need to be installed in a different way than traditional applications. Here, about the only thing you can do is try to follow the conventions of whatever framework you are writing in.
Mine is not really a question, it's more of a call for opinions - and perhaps this isn't even the right place to post it. Nevertheless, the community here is very informed, and there's no harm in trying...
I was thinking about ways to create a highly scalable and, above all, highly modular back-end architecture. For example, an entire back-end ecosystem for a large site that had the potential for future-proof evolution into a massive site.
This would entail a very high degree of separation of concerns, to the extent that not only could (say) the underlying DB be replaced (i.e., from Oracle to MySQL), but the actual type of database could be replaced (e.g., SQL to KV, or vice versa).
I envision a situation where each sub-system exposes its own API within the back-end ecosystem. In this way, the API could remain constant, whilst the implementation could change (even radically) over time.
The system must be heterogeneous in that it's not tied to a specific language. It must be able to accommodate modules or entire sub-systems using different languages.
It then occurred to me that what I was imagining was simply the architecture of the web itself.
So here is my discussion point: apart from the overhead of using (mainly) text-based protocols, is there any overriding reason why a complex back-end architecture should not be implemented in the manner I describe, or is there some strong rationale I'm missing for using communication frameworks and protocols such as Twisted, AMQP, Thrift, etc.?
UPDATE: Following a comment from @meagar, I should perhaps reformulate the question: are the clear advantages of using a very simple, flexible and well-understood architecture (i.e., all functionality exposed as a series of RESTful APIs) enough to compensate for the obvious performance hit incurred when using this architecture in a back-end context?
"the actual type of database could be replaced (e.g., SQL to KV, or vice versa)"
And anyone who wrote a join between two tables will be sad. If you want the "ability" to switch to KV, then you should not expose an API richer than what KV can support.
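Concretely, that means keeping the storage interface down at KV semantics; a hypothetical sketch in Python:

    from abc import ABC, abstractmethod

    class UserStore(ABC):
        """Deliberately no join() or query(): anything richer than
        get/put/delete semantics would block a later SQL-to-KV move."""

        @abstractmethod
        def get(self, user_id):
            ...

        @abstractmethod
        def put(self, user_id, record):
            ...

        @abstractmethod
        def delete(self, user_id):
            ...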
The answer to your question depends on what it is you're trying to accomplish. You want to keep each module on a reasonably tight rein. Use proper physical layering of code, use defined interfaces with side-effect contracts, and use test cases for each success and failure case of each interface. That way, you can depend on things like "when the user enters the blah page, a user-blah fact is generated, so that all registered fact listeners will be invoked." This allows you to extend the system without having direct calls from point A to point B, while still having some kind of control over widely disparate dependencies. (I hate code bases where you can't find-all to find all possible references to a symbol!)
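The "registered fact listeners" idea might look like this minimal, hypothetical sketch:

    _listeners = []

    def on_fact(fn):
        """Register a callback to be invoked for every published fact."""
        _listeners.append(fn)
        return fn

    def publish(fact):
        # Point A never calls point B directly; it only publishes
        # facts, and every registered listener is invoked.
        for listener in _listeners:
            listener(fact)

    @on_fact
    def audit(fact):
        print("audit:", fact)

    publish({"type": "user-blah", "page": "blah"})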
However, the reason we put lots of code and classes into a single system is that calling between systems is often very, very expensive. You want to think in terms of code modules making requests of each other where you can. The difference in timing between a function call and a REST call is something like one to a million (maybe you can get it as low as one to ten thousand if you only count cycles, not wall-clock time, but I'm not so sure). Also, anything that goes over a wire in a datacenter may potentially suffer from packet loss, because there is no such thing as a 100% loss-free data center, no matter how hard you try. Packet loss means random latency spikes in the response time of your application.