How to view request performance in Application Insights regardless of URL path parameters - asp.net-core

I am using Application Insights to monitor my APIs. When viewing the performance of my endpoints, AI seems to split requests based on their path, including any path parameters. This is not helpful to me; I would like to investigate performance (and get performance alerts) for e.g. all PATCH /observations/{id} operations when viewed together, not when viewed individually for each {id}.
Put another way, in the screenshot below, I would like the underlined operations to all be in the same row, e.g. PATCH /observations/{id}.
Is this possible without a lot of boilerplate code, and without ruining other monitoring features (which editing the telemetry operation names might, though I may be wrong)?
I am using ASP.NET Core endpoint routing, so the framework has all necessary information about the routing (tokens etc.).
Grouping requests this way for performance monitoring seems so obvious that I am surprised that it is so hard to figure this out, or that there are seemingly no "canonical" and simple way to accomplish this. Please let me know if I am going about this the wrong way.

Related

MarkLogic : Design question on Query Options Vs Transform

Our team creates a bunch of custom REST APIs (v1/resources/...) and expose them as enterprise services to other stakeholders, who do not need to know anything about MarkLogic. However, our team is responsible for creating, enhancing and maintaining the server-side scripting (we use JavaScript) within MarkLogic.
While creating custom REST APIs, our current design to meet the search requirements is to start with a Query Options, incorporate as many requirements in Query options, and for any requirements that could not be met by Query Options (for example, sorting within a document, complex XPath, merge with other documents etc.), code within the Java Script extension program (technically not a transform but conceptually similar to a transform).
With the limitations in Query options, increasingly, most of our logic is going into the Javascript extension programs and the query options seem to be just a maintenance overhead. Do we really need to maintain a query options file for every REST extension, while the transforms offer much powerful functionality? Can I get rid of Query options and just use server side Java Script code (conceptually, similar to transforms)? Initially, our thought was that Query options is configuration based and hence changing query options is not exactly a code change, however, based on our experience, we realized that changing query options also involves deployment, regression testing and all other activities. Hence I do not see any specific advantage of query options, in our case (creating custom REST APIs).
Design gurus, please suggest!
These questions are tough to answer without properly knowing the whole situation. You talk about custom REST endpoints, but are using REST extensions on the built-in REST api for that purpose. Are you using anything else of the built-in REST api? You use search options, but don't use them for /v1/search. Could you have done the same with /v1/search with those options, but an extra REST Transform on top? And you speak of a lot of data mangling happening at sub-document level. Have you considered doing part of that processing upfront, or organize your data differently so that handling at request-time becomes much simpler?
A lot of questions, and a lot of possibilities. It is hard to give one straight answer on such an open question. I hope I gave some food for thought nonetheless.
HTH!

In the Diode library for scalajs, what is the distinction between an Action, AsyncAction, and PotAction, and which is appropriate for authentication?

In the scala and scalajs library Diode, I have used but not entirely understood the PotAction class and only recently discovered the AsyncAction class, both of which seem to be favored in situations involving, well, asynchronous requests. While I understand that, I don't entirely understand the design decisions and the naming choices, which seem to suggest a more narrow use case.
Specifically, both AsyncAction and PotAction require an initialModel and a next, as though both are modeling an asynchronous request for some kind of refreshable, updateable content rather than a command in the sense of CQRS. I have a somewhat-related question open regarding synchronous actions on form inputs by the way.
I have a few specific use cases in mind. I'd like to know a sketch (not asking for implementation, just the concept) of how you use something like PotAction in conjunction with any of:
Username/password authentication in a conventional flow
OpenAuth-style authentication with a third-party involved and a redirect
Token or cookie authentication behind the scenes
Server-side validation of form inputs
Submission of a command for a remote shell
All of these seem to be a bit different in nature to what I've seen using PotAction but I really want to use it because it has already been helpful when I am, say, rendering something based on the current state of the Pot.
Historically speaking, PotAction came first and then at a later time AsyncAction was generalized out of it (to support PotMap and PotVector), which may explain their relationship a bit. Both provide abstraction and state handling for processing async actions that retrieve remote data. So they were created for a very specific (and common) use case.
I wouldn't, however, use them for authentication as that is typically something you do even before your application is loaded, or any data requested from the server.
Form validation is usually a synchronous thing, you don't do it in the background while user is doing something else, so again Async/PotAction are not a very good match nor provide much added value.
Finally for the remote command use case PotAction might be a good fit, assuming you want to show the results of the command to the user when they are ready. Perhaps PotStream would be even better, depending on whether the command is producing a steady stream of data or just a single message.
In most cases you should use the various Pot structures for what they were meant for, that is, fetching and updating remote data, and maybe apply some of the ideas or internal models (such as the retry mechanism) to other request types.
All the Pot stuff was separated from Diode core into its own module to emphasize that they are just convenient helpers for working with Diode. Developers should feel free to create their own helpers (and contribute back to Diode!) for new use cases.

Spring Data Rest Without HATEOAS

I really like all the boilerplate code Spring Data Rest writes for you, but I'd rather have just a 'regular?' REST server without all the HATEOAS stuff. The main reason is that I use Dojo Toolkit on the client side, and all of its widgets and stores are set up such that the json returned is just a straight array of items, without all the links and things like that. Does anyone know how to configure this with java config so that I get all the mvc code written for me, but without all the HATEOAS stuff?
After reading Oliver's comment (which I agree with) and you still want to remove HATEOAS from spring boot.
Add this above the declaration of the class containing your main method:
#SpringBootApplication(exclude = RepositoryRestMvcAutoConfiguration.class)
As pointed out by Zack in the comments, you also need to create a controller which exposes the required REST methods (findAll, save, findById, etc).
So you want REST without the things that make up REST? :) I think trying to alter (read: dumb down) a RESTful server to satisfy a poorly designed client library is a bad start to begin with. But here's the rationale for why hypermedia elements are necessary for this kind of tooling (besides the probably familiar general rationale).
Exposing domain objects to the web has always been seen critically by most of the REST community. Mostly for the reason that the boundaries of a domain object are not necessarily the boundaries you want to give your resources. However, frameworks providing scaffolding functionality (Rails, Grails etc.) have become hugely popular in the last couple of years. So Spring Data REST is trying to address that space but at the same time be a good citizen in terms of restfulness.
So if you start with a plain data model in the first place (objects without to many relationships), only want to read them, there's in fact no need for something like Spring Data REST. The Spring controller you need to write is roughly 10 lines of code on top of a Spring Data repository. When things get more challenging the story gets becomes more intersting:
How do you write a client without hard coding URIs (if it did, it wasn't particularly restful)?
How do you handle relationships between resources? How do you let clients create them, update them etc.?
How does the client discover which query resources are available? How does it find out about the parameters to pass etc.?
If your answers to these questions is: "My client doesn't need that / is not capable of doing that.", then Spring Data REST is probably the wrong library to begin with. What you're basically building is JSON over HTTP, but nothing really restful then. This is totally fine if it serves your purpose, but shoehorning a library with clear design constraints into something arbitrary different (albeit apparently similar) that effectively wants to ignore exactly these design aspects is the wrong approach in the first place.

HTTP requests and Apache modules: Creative attack vectors

Slightly unorthodox question here:
I'm currently trying to break an Apache with a handful of custom modules.
What spawned the testing is that Apache internally forwards requests that it considers too large (e.g. 1 MB trash) to modules hooked in appropriately, forcing them to deal with the garbage data - and lack of handling in the custom modules caused Apache in its entirety to go up in flames. Ouch, ouch, ouch.
That particular issue was fortunately fixed, but the question's arisen whether or not there may be other similar vulnerabilities.
Right now I have a tool at my disposal that lets me send a raw HTTP request to the server (or rather, raw data through an established TCP connection that could be interpreted as an HTTP request if it followed the form of one, e.g. "GET ...") and I'm trying to come up with other ideas. (TCP-level attacks like Slowloris and Nkiller2 are not my focus at the moment.)
Does anyone have a few nice ideas how to confuse the server's custom modules to the point of server-self-immolation?
Broken UTF-8? (Though I doubt Apache cares about encoding - I imagine it just juggles raw bytes.)
Stuff that is only barely too long, followed by a 0-byte, followed by junk?
et cetera
I don't consider myself a very good tester (I'm doing this by necessity and lack of manpower; I unfortunately don't even have a more than basic grasp of Apache internals that would help me along), which is why I'm hoping for an insightful response or two or three. Maybe some of you have done some similar testing for your own projects?
(If stackoverflow is not the right place for this question, I apologise. Not sure where else to put it.)
Apache is one of the most hardened software projects on the face of the planet. Finding a vulnerability in Apache's HTTPD would be no small feat and I recommend cutting your teeth on some easier prey. By comparison it is more common to see vulnerabilities in other HTTPDs such as this one in Nginx that I saw today (no joke). There have been other source code disclosure vulnerablites that are very similar, I would look at this and here is another. lhttpd has been abandoned on sf.net for almost a decade and there are known buffer overflows that affect it, which makes it a fun application to test.
When attacking a project you should look at what kind of vulnerabilities have been found in the past. Its likely that programmers will make the same mistakes again and again and often there are patterns that emerge. By following these patterns you can find more flaws. You should try searching vulnerablites databases such as Nist's search for CVEs. One thing that you will see is that apache modules are most commonly compromised.
A project like Apache has been heavily fuzzed. There are fuzzing frameworks such as Peach. Peach helps with fuzzing in many ways, one way it can help you is by giving you some nasty test data to work with. Fuzzing is not a very good approach for mature projects, if you go this route I would target apache modules with as few downloads as possible. (Warning projects with really low downloads might be broken or difficult to install.)
When a company is worried about secuirty often they pay a lot of money for an automated source analysis tool such as Coverity. The Department Of Homeland Security gave Coverity a ton of money to test open source projects and Apache is one of them. I can tell you first hand that I have found a buffer overflow with fuzzing that Coverity didn't pick up. Coverity and other source code analysis tools like the open source Rats will produce a lot of false positives and false negatives, but they do help narrow down the problems that affect a code base.
(When i first ran RATS on the Linux kernel I nearly fell out of my chair because my screen listed thousands of calls to strcpy() and strcat(), but when i dug into the code all of the calls where working with static text, which is safe.)
Vulnerability resarch an exploit development is a lot of fun. I recommend exploiting PHP/MySQL applications and exploring The Whitebox. This project is important because it shows that there are some real world vulnerabilities that cannot be found unless you read though the code line by line manually. It also has real world applications (a blog and a shop) that are very vulnerable to attack. In fact both of these applications where abandoned due to security problems. A web application fuzzer like Wapiti or acuentix will rape these applications and ones like it. There is a trick with the blog. A fresh install isn't vulnerable to much. You have to use the application a bit, try logging in as an admin, create a blog entry and then scan it. When testing a web application application for sql injection make sure that error reporting is turned on. In php you can set display_errors=On in your php.ini.
Good Luck!
Depending on what other modules you have hooked in, and what else activates them (or is it only too-large requests?), you might want to try some of the following:
Bad encodings - e.g. overlong utf-8 like you mentioned, there are scenarios where the modules depend on that, for example certain parameters.
parameter manipulation - again, depending on what the modules do, certain parameters may mess with them, either by changing values, removing expected parameters, or adding unexpected ones.
contrary to your other suggestion, I would look at data that is just barely short enough, i.e. one or two bytes shorter than the maximum, but in different combinations - different parameters, headers, request body, etc.
Look into HTTP Request Smuggling (also here and here) - bad request headers or invalid combinations, such as multiple Content-Length, or invalid terminators, might cause the module to misinterpret the command from Apache.
Also consider gzip, chunked encoding, etc. It is likely that the custom module implements the length check and the decoding, out of order.
What about partial request? e.g requests that cause a 100-Continue response, or range-requests?
The fuzzing tool, Peach, recommended by #TheRook, is also a good direction, but don't expect great ROI first time using it.
If you have access to source code, a focused security code review is a great idea. Or, even an automated code scan, with a tool like Coverity (as #TheRook mentioned), or a better one...
Even if you don't have source code access, consider a security penetration test, either by experienced consultant/pentester, or at least with an automated tool (there are many out there) - e.g. appscan, webinspect, netsparker, acunetix, etc etc.

What is the best way of pulling json data in terms of performance?

Currently I am using HttpWebRequest to pull json data from an external site, and the performance was not good. Is wcf much better?
I need expert advice on this..
Probably not, but that's not the right question.
To answer it: WCF, which certainly supports JSON, is ultimately going to use HttpWebRequest at the bottom level, and it will certainly have the same network latency. Even more importantly, it will use the same server to get the JSON. WCF has a lot of advantages in building, maintaining, and configuring web services and clients, but it's not magically faster. It's possible that your method of deserializing JSON is really slow compared to what WCF would use by default, but I doubt it.
And that brings up the really important point: find out why the performance is bad. Changing frameworks is only an intelligible optimization option if you know what's slow, and, by extension, how doing something different would make it less slow. Is it the server? Is it deserialization? Is it network? Is it authentication or some other request overhead detail? And so on.
So the real answer is: profile! Once you know what the performance issue really is, you can make an informed decision about whether a framework like WCF would help.
The short answer is: no.
The longer answer is that WCF is an API which doesn't specify a communication method, but supports multiple methods. However, those methods are normally over SOAP which is going to involve more overheard than a JSON, and it would seem the world has decided to move on from SOAP.
What sort of performance are you looking for and what are you getting? It may be that you are simply facing physical limitations of network locations, in which case you might look towards making your interface feel more responsive, even if the data is sluggish.
It'd be worth it to see if most of the latency is just in reaching the remote site (e.g. response times are comparable to ping times). Or, perhaps, the problem is the time it takes for the remote site to generate and serve the page. If so, some intermediate caching might be best.
+1 on what Isaac said, but one thing I'd add is, if you do use WCF here, it'll internally use the HttpWebRequest in most places, so you're definitely not gaining performance at all. One way you may unintentionally gain in performance -- however -- is in how WCF recycles, reuses, pools, and caches most transport objects internally. So it ultimately goes back to Isaac's advice on profiling.