What's the best way to monitor your REST API? [closed] - api

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I've created an API based on the RESTful pattern and I was wondering what's the best way to monitor it? Can I somehow gather statistics on each request and how deep could I monitor the requests?
Also, could it be done by using open source software (maybe building my own monitoring service) or do I need to buy third party software?
If it could be achieved by using open source software where do I start?

Start with identifying the core needs that you think monitoring will solve. Try to answer the two questions "What do I want to know?" and "How do I want to act on that information?".
Examples of "What do I want to know?"
Performance over time
Largest API users
Most commonly used API features
Error occurrence in the API
Examples of "How do I want to act on that information?"
Review a dashboard of known measurements
Be alerted when something changes beyond expected bounds
Trace execution that led to that state
Review measurements for the entire lifetime of the system
If you can answer those questions, you can either find the right third party solution that captures the metrics that you're interested in, or inject monitoring probes into the right section of your API that will tell you what you need to do know. I noticed that you're primarily a Laravel user, so it's likely that many of the metrics you want to know can be captured by adding before ( Registering Before Filters On a Controller ) and after ( Registering an After Application Filter ) filters with your application, to measure time for response and successful completion of response. This is where the answers to the first set of questions are most important ( "What do I want to know?" ), as it will guide where and what you measure in your app.
Once you know where you can capture the data, selecting the right tool becomes a matter of choosing between (roughly) two classes of monitoring applications: highly specialized monitoring apps that are tightly bound to the operation of your application, and generalized monitoring software that is more akin to a time series database.
There are no popular (to my knowledge) examples of the highly specialized case that are open source. Many commercial solutions do exist however: NewRelic, Ruxit, DynaTrace, etc. etc. etc. Their function could easily be described to be similar to a remote profiler, with many other functions besides. (Also, don't forget that a more traditional profiler may be useful for collecting some of the information you need - while it definitely will not supplant monitoring your application, there's a lot of valuable information that can be gleaned from profiling even before you go to production.)
On the general side of things, there are many more open source options that I'm personally aware of. The longest lived is Graphite (a great intro to which may be read here: Measure Anything, Measure Everything), which is in pretty common use amongst many. Graphite is by far from the only option however, and you can find many other options like Kibana and InfluxDB should you wish to host yourself.
Many of these open source options also have hosted options available from several providers. Additionally, you'll find that there are many entirely commercial options available in this camp (I'm founder of one, in fact :) - Instrumental ).
Most of these commercial options exist because application owners have found it pretty onerous to run their own monitoring infrastructure on top of running their actual application; maintaining availability of yet another distributed system is not high on many ops personnel's wishlists. :)

(I'm clearly biased for answering this since I co-founded Runscope which I believe is the leader in API Monitoring, so you can take this all with a grain of salt or trust my years of experience working with 1000s of customers specifically on this problem :)
I don't know of any OSS tools specific to REST(ful) API monitoring. General purpose OSS metrics monitoring tools (like Graphite) can definitely help keep tabs on pieces of your API stack, but don't have any API-specific features.
Commercial metrics monitoring tools (like Datadog) or Application Performance Monitoring (APM) tools like (New Relic or AppDynamics) have a few more features specific to API use cases, but none are centered on it. These are a useful part of what we call a "layered monitoring approach": start with high-level API monitoring, and use these other tools (exception trackers, APM, raw logs) to dive into issues when they arise.
So, what API-specific features should you be looking for in an API monitoring tool? We categorize them based on the three factors that you're generally monitoring for: uptime/availability, performance/speed and correctness/data validation.
Uptime Monitoring
At a base level you'll want to know if you're APIs are even available to the clients that need to reach them. For "public" (meaning, available on the public internet, not necessarily publicized...a mobile backend API is public but not necessarily publicized) APIs you'll want to simulate the clients that are calling them as much as possible. If you have a mobile app, it's likely the API needs to be available around the world. So at a bare minimum, your API monitoring tool should allow you to run tests from multiple locations. If your API can't be reached from a location, you'll want notifications via email, Slack, etc.
If your API is on a private network (corporate firewall, staging environment, local machine, etc.) you'll want to be able to "see" it as well. There are a variety of approaches for this (agents, VPNs, etc.) just make sure you use one your IT department signs off on.
Global distribution of testing agents is an expensive setup if you're self-hosting, building in-house or using an OSS tool. You need to make sure each remote location you set up (preferably outside your main cluster) is highly-available and fully-monitored as well. This can get expensive and time-consuming very quickly.
Performance Monitoring
Once you've verified your APIs are accessible, then you'll want to start measuring how fast they are performing to make sure they're not slowing down the apps that consume them. Raw response times is the bare minimum metric you should be tracking, but not always the most useful. Consider cases where multiple API calls are aggregated into a view for the user, or actions by the user generate dynamic or rarely called data that may not be present in a caching layer yet. These multi-step tasks or workflows are can be difficult to monitor with APM or metrics-based tools as they don't have the capabilities to understand the content of the API calls, only their existence.
Externally monitoring for speed is also important to get the most accurate representation of performance. If the monitoring agent sits inside your code or on the same server, it's unlikely it's taking into account all the factors that an actual client experiences when making a call. Things like DNS resolution, SSL negotiation, load balancing, caching, etc.
Correctness and Data Validation
What good is an API that's up and fast if it's returning the wrong data? This scenario is very common and is ultimately a far worse user experience. People understand "down"...they don't understand why an app is showing them the wrong data. A good API monitoring tool will allow you to do deep inspection of the message payloads going back and forth. JSON and XML parsing, complex assertions, schema validation, data extractions, dynamic variables, multi-step monitors and more are required to fully validate the data being sent back and forth is Correct.
It's also important to validate how clients authenticate with your API. Good API-specific monitoring tools will understand OAuth, mutual authentication with client certificates, token authentication, etc.
Hopefully this gives you a sense of why API monitoring is different from "traditional" metrics, APM and logging tools, and how they can all play together to get a complete picture of your application is performing.

I am using runscope.com for my company. If you want something free apicombo.com also can do.
Basically you can create a test for your API endpoint to validate the payload, response time, status code, etc. Then you can schedule the test to run. They also provide some basic statistics.

I've tried several applications and methods to do that, and the best (for my company and our related projects) is to log key=value pairs (atomic entries with all the information associated with this operation like IP source, operation result, elapsed time, etc... on specific log files for each node/server) and then monitorize with Splunk. With your REST and json data maybe your aproach will be different, but it's also well supported.
It's pretty easy to install and setup. You can monitor (almost) real time data (responses times, operation results), send notifications on events and do some DWH (and many other things, there are lots of plugins).
It's not open source but you can try it for free if you use less than 50MB logs per day (that's how it worked some time ago, since now I'm on enterprise license I'm not 100% sure).
Here is a little tutorial explaining how to achieve what you are looking for: http://blogs.splunk.com/2013/06/18/getting-data-from-your-rest-apis-into-splunk/

Related

API calls in Microservices Architecture

not sure if my question is explainable enough but I will try to explain it here as much as I can. I am currently exploring and playing with a microservices architecture, to see how it works and learn more. Mostly I understand how things work, what is the role of API Gateway in this architecture, etc...
So I have one more theoretical question. For example, imagine there are 2 services, ie. event (which manage possible events) and service ticket which manages tickets related to a specific event (there could be many tickets). So these 2 services really depend on each other but they have a separated database, are completely isolated and loosely coupled, just as it should be in "ideal" microservices environment.
Now imagine I want to fetch event and all tickets related to that event and display it in a mobile application or web spa application or whatever. Is calling multiple services / URLs to fetch data and output to UI completely okay in this scenario? Or is there a better way to fetch and aggregate this data.
From my reading different sources calling one service from another service is adding latency, making services depend on each other, future changes in one service will break another one, etc so it's not a great idea at all.
I'm sorry if I am repeating a question and it was already asked somewhere (althought I could not find it), but I need an opinion from someone that met this question earlier and can explain the flow here in a better way.
Is calling multiple services / URLs to fetch data and output to UI
completely okay in this scenario? Or is there a better way to fetch
and aggregate this data.
Yes it is ok to call multiple services from your UI and aggregate the data in your Fronted code for your needs. Actually in this case you would call 2 Rest API's to get the data from ticket micro-service and event micro-service.
Another option is that you have some Views/Read optimized micro-service which would aggregate data from both micro-services and serve as a Read-only micro-service. This of course involves some latency considerations and other things. For example this approach can be used if you have requirement like a View which consists of multiple of models(something like a Denormalized view) which will be accessed a lot and/or have some complex filter options as well. In this approach you would have a Third micro-service which would be aggregated from the data of your 2 micro-services(tickets and events). This micro-services would be optimized for reading and could also if needed use a different storage type(Document db or similar). For your case if you would decide to do this you could have only one API call to this micro-service which will provide you all your data.
Calling One micro-service from another. In some cases you can not really avoid this. Even though there are some sources online which would tell you not to do it sometimes it is inevitable. For your example I would not recommend this approach as it would produce coupling and unnecessary latency which can be avoided with other approaches.
Background info:
You can read this answer where the topic is about if it is ok to call one micro-service from another micro-service. For your particular case it is not a good option but for some cases it might be. So read it for some clarification.
Summary:
I have worked with system where we where doing all those 3 things. It really depends on your business scenario and needs of your application. Which approach to pick will depend on a couple of criteria like: usability from UI, scaling(if you have high demand on the micro-services you could consider adding a Third micro-service which could aggregate data from tickets and events micro-service), domain coupling.
For your case I would suggest option 1 or option 2 (if you have a high demanding UI) from user prospective. For some cases option 1 is enough and having a third micro-service would be an overkill, but sometimes this is an option as well.
In my experience with cloud based services, primarily Microsoft Azure, the latency of one service calling another does indeed exist, but can be relied upon to be minimal. This is especially true when compared to the unknown latency involved with the users device making the call over whichever internet plan they happen to have.
There will always be a consuming client that is dependent on a service and its defined interface, whether it is the SPA app or a another service. So in the scenario which you described, something has to aggregate the payloads from both services.
Based on this I have seen improved performance by using a service which handles client requests, aggregates results from n services and responds accordingly. This does indeed result in dependencies, but as your services evolve, it is possible to have multiple versions of your services active simultaneously allowing you to depreciate older versions at a time that is appropriate.
I hope that helps.
Optional Advice
You can maintain the read table (denormalize ) inside any of the services , which suited best. Why? - Because the CQRS apply where needed , CQRS best suited for the big and complex application. It introduce the complexity in your system , and you gain less profit and more headache.

Bugtracker - agregation and automated workflow

Intro:
I'm working for a contractor company. We're making SW for different corporate clients, each with their own rules, SW standards etc.
Problem:
The result is, that we are using several bug-tracking systems. The amount of tickets flow is relatively big and the SLA are deadly sometimes. The main problem is, that we are keeping track of these tickets in our own BT (currently Mantis) but we're also communicating with clients in theirs BT. But as it is, two many channels of communication are making too much information noise.
Solution, progress:
Actual solution is an employee having responsibility for synchronizing the streams and keeping track of the SLA and many other things. It's consuming quite a large part of his time (cca 70%) that can be spend on something more valuable. The other thing is, that he is not fast enough and sometimes the sync is not really synced. Some parts of the comments are left only on one system, some are lost completely. (And don't start me at holidays or sickness, that's where the fun begins)
Question:
How to automate this process: aggregating tasks, watching SLA, notifying the right people etc. partially or all together?
Thank you, for your answers.
You need something like Zapier. It can map different applications and synchronize data between them. It works simply:
You create zap (for example between redmine and teamwork).
You configure mapping (how items/attributes in redmine maps to items/attributes in teamwork)
You generate access tokens in both systems and write them to zap.
Zapier makes regular synchronization between redmine and teamwork.
But mantis is not yet supported by Zapier. If all/most of your clients BT are in Zapier's apps list, you may move your own BT to another platform or make a request to Zapier for mantis support.
Another way is develop your own synchronization service that will connect to all client's BTs as each employee using login/password/token and download updates to your own BT. It is hard way and this solution requires continious development to support actual virsions of client's BTs.
You can have a look a Slack : https://slack.com/
It's a great tools for group conversations
Talk, share, and make decisions in open channels across your team, in
private groups for sensitive matters, or use direct messages
one-to-one.
you can have a lot of integrations tools, and you can use Zapier https://zapier.com/ with it to programm triggers.
With differents channels you can notifying the right people partially or all together in group conversation :)
The obvious answer is to create integrations between all of the various BT's. Without knowing what those are, it's hard to say if that's entirely possible. Most modern BTs have an API and support integrations. Some, especially more desktop based ones, don't. For those you probably have to monitor a database directly.
Zapier, as someone already suggested, is a great tool for creating integrations and may already have some of the ones yo need available. I love Slack and it has an API, but messages are basically just text and unless you want to do some kind of delimiting when you post messages to its API, it probably isn't going to work.
I'm not sure what budget is, but it will cost resources to create the integrations. I'd suggest that you hire someone to simply manage these. Someone who's sole responsibility is to cross-populate the internal and the external bug tracking system and track the progress in each. All you really need is someone with good attention to detail for this, they don't have to be a developer. This should be more cost effective than using developer resources on this.
The other alternative is simply to stop. If your requirements dictate that you use your clients' bug tracking software for projects you do for them, just use their software and stop duplicating the effort. If you need some kind of central repository or something for managing work maybe just a simple table somewhere or spreadsheet with the client, the project, the issue number, the status and if possible a link to the issue in the client's BT. I understand the need and desire for centralizing this, but if it's stifling productivity, then the opportunity costs are too high IMO.
If you create an integration tool foe this, you will indeed have a very viable product. This is actually a pretty common problem.

Clarification of API benefits

While not a code-based question, I feel this question is relevant to the developer community in pursuit of a deeper understanding of API's and their role in business and the IoT at large.
Can someone please expand on the statement below? Other than in-house dev time, how exactly do API's save businesses money and foster agility?
"...APIs save businesses money and provide new levels of business agility through reusability and consistency."
Additionally, while we all know that API's are cool and can be used to build amazing things, I'm seeking to understand this from the perspective of risk vs. reward for a business.
APIs benefit larger organizations or distributed organizations with separate business units or functional units. In that scenario it allows the different functional units to deploy independently assuming you do API versioning. This has a very substantial work queuing benefit in a larger organization.
In a small organization their benefits are questionable and APIs should probably be extracted from systems as duplication arises or new problems could benefit from old solutions. Having gone through this transition I can say it's unwise to build APIs without existing applications.
In the context of IoT APIs make a lot of sense because you have largely dumb devices (supercomputers by 1980's standards) that connect back to smart infrastructure. If that is done in a bespoke or ad-hoc way it's going to be an enormous headache to change things as you release new devices. With versioned APIs separating the devices and the smart infrastructure you have a greater chance of introducing change without disabling your customers' legacy devices.
In IoT Space, APIs offer the following benefits:
New device types (e.g. from different vendors) can be easily added to the IoT platform. This saves money, because you as business owner can select from multiple devices and choose the best for your purpose, i.e. the most cost efficient one. (This relates to the API between the device and the platform).
New applications or new features can be added easily. E.g. in case you need an additional feature, it can be added on top. Even better, you can ask your internal IT or an external system integrator to do the work, again giving you the choice to select the best offer. (This relates to the API between the platform and other application).
From risk viewpoint, APIs do need to be protected as any other endpoint that you expose to the Internet (or Intranet). As minimum, you need authentication (username, password or other means), authentication (access to a subset of the data) and encryption (i.e. use TLS). Depending on your scope, you might need additional governance and API protection (e.g. throttling).

SQL installation on Amazon Web Services

Folks, I have question this morning that hopefully one of you techies can answer – during past few months, I have been heavily involved in preparing several SQL certifications study guides as it’s my desire to secure Microsoft Certified Solutions Associate (MCSA) or associate level. While I have previous experiences within this skill set and wanted to sharpen it by obtaining further experiences and hopefully securing this certification, it has been quite challenging setting up a home lab that allows me to create environment similar to what the big dogs use nowadays – windows server/several sql instances/virtualization and all that – due to lack of proper hardware or cost. In any case, my question today is to seek your advices and guidance on other possible options, particularly if this task can be accomplished using Amazons AWS – I understand they offer some level of space that can be used as playground or if one want to extend the capacity, subscription is an option. So, if I was to subscribe the paid version of it, is it possible to install all software needed to practice and experiment all needed technologies to complete and or master contents on the training kit. Again, I’m already using my small home network and have all proper software, but just feel that it’s not enough as some areas require higher computing power to properly test or rung specific areas..
Short: Yes
You can create a micro instance for free and install whatever you want on it. If your not familiar with using the CLI, it can be a bit daunting but there are plenty of guides online.
They also offer an RDS service where, they will allow you to set up a database instance and will maintain it for you but it's not free.
Edit
Link to there MS Server Page
http://aws.amazon.com/windows/
Azure is the windows cloud service, I think the comment was have you considered looking at azure instead of AWS

Etherpad style synchronisation in Meteor?

Looking into Meteor to create a collaborative document editing app, because it’s great that Meteor synchronizes data between multiple clients by default.
But when using a text-area, like in Sameer Kalburgi’s example
http://www.skalb.com/2012/04/16/creating-a-document-sharing-site-with-meteor-js/
http://docshare-tutorial.meteor.com/
the experience is sub-optimal.
I tried to type at the same time with a colleague and my changes would be overwritten when she typed and vice versa. So in the conflict resolution there is no merge algorithm yet, I think?
Is this planned for the feature? Are there ways to implement this currently? Etherpad seems to handle this problem rather well. Having this in Meteor would make creating collaborative document editing apps way more accessible.
So I looked into it some more, the algorithm used in Etherpad is known as Operational Transformation:
The solution is Operational Transformation (OT). If you haven’t heard of it, OT is a class of algorithms that do multi-site realtime concurrency. OT is like realtime git. It works with any amount of lag (from zero to an extended holiday). It lets users make live, concurrent edits with low bandwidth. OT gives you eventual consistency between multiple users without retries, without errors and without any data being overwritten.
Unfortunately, implementing OT sucks. There's a million algorithms with different tradeoffs, mostly trapped in academic papers. The algorithms are really hard and time consuming to implement correctly. We need some good libraries, so any project can just plug in OT if they need it.
Thats’s from the site of sharejs. A node.js based ot server-client that you can hook into your existing client.
OT is also implemented in the Racer model synchronization engine for Node.js, that forms the underpinnings for Derby. At the moment, derby.js doesn’t transparently provide it yet, but they plan too, from the Derby docs:
Currently, Racer defaults to applying all transactions in the order received, i.e. last-writer-wins. (…) Racer [also] supports conflict resolution via a combination of Software Transactional Memory (STM), Operational Transformation (OT), and Diff-match-patch techniques.
These features are not fully implemented yet, but the Racer demos show preliminary examples of STM and OT.
Coincidentally, both the sharejs and derbyjs teams have an ex Google-waver on board. Meteor has an ex etherpad/Google Waver in their core team. Since Etherpad is one of the best known implementations of OT I was imagining Meteor would surely want to support it at some point as well…
I've created a Meteor smart package that integrates ShareJS:
https://github.com/mizzao/meteor-sharejs
It's quite preliminary right now, but you can import it into your app, drop in textareas, and it "just works". Please try it out and submit some new features via pull requests :)
Check out a demo here:
http://documents.meteor.com
What you describe seems out of Meteors scope for me. Its not a tool to set up collaboration possibilities!
What it provides is a way to transparently work against a subset of a servers database. But the implementation of use-case specific merging functionality is the job of the application, not the framework.