MarkLogic: Design question on Query Options vs Transform - marklogic-9

Our team creates a bunch of custom REST APIs (v1/resources/...) and exposes them as enterprise services to other stakeholders, who do not need to know anything about MarkLogic. However, our team is responsible for creating, enhancing, and maintaining the server-side scripting (we use JavaScript) within MarkLogic.
While creating custom REST APIs, our current design for meeting search requirements is to start with query options, incorporate as many requirements into them as possible, and handle anything the query options cannot express (for example, sorting within a document, complex XPath, merging with other documents) in the JavaScript extension program (technically not a transform, but conceptually similar to one).
Given the limitations of query options, most of our logic increasingly ends up in the JavaScript extension programs, and the query options seem to be pure maintenance overhead. Do we really need to maintain a query options file for every REST extension when the transforms offer much more powerful functionality? Can I get rid of query options and just use server-side JavaScript code (conceptually similar to transforms)? Initially, our thinking was that query options are configuration-based, so changing them is not exactly a code change; based on our experience, however, we realized that changing query options also involves deployment, regression testing, and all the other release activities. Hence I do not see any specific advantage of query options in our case (creating custom REST APIs).
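To illustrate what we mean, here is a minimal sketch of such a resource extension; the collection, parameter, and field names are made up, and it assumes JSON documents:

    // /ext/order-search.sjs -- illustrative resource extension, not our real code
    function get(context, params) {
      // The part query options could express: a constrained search
      const results = cts.search(cts.andQuery([
        cts.collectionQuery(params.collection || 'orders'),
        cts.wordQuery(params.q || '')
      ]));
      // The part they cannot: per-document post-processing
      const out = [];
      for (const doc of fn.subsequence(results, 1, 10)) {
        const obj = doc.toObject();
        // e.g. sort an array *within* each document
        obj.lineItems = (obj.lineItems || []).sort((a, b) => a.seq - b.seq);
        out.push(obj);
      }
      context.outputTypes = ['application/json'];
      return out;
    }
    exports.GET = get;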
Design gurus, please suggest!

These questions are tough to answer without properly knowing the whole situation. You talk about custom REST endpoints, but are using REST extensions on the built-in REST API for that purpose. Are you using anything else of the built-in REST API? You use search options, but don't use them for /v1/search. Could you have done the same with /v1/search with those options, plus an extra REST transform on top? And you speak of a lot of data mangling happening at sub-document level. Have you considered doing part of that processing up front, or organizing your data differently, so that handling at request time becomes much simpler?
A lot of questions, and a lot of possibilities. It is hard to give one straight answer on such an open question. I hope I gave some food for thought nonetheless.
HTH!

Related

NestJS Schema First GraphQL Serialization

I've done some research into the subject of response serialization for NestJS/GraphQL. There's some helpful information to be found here, but the documentation seems to be completely focused on a code-first approach. My project happens to take a schema-first approach, and from what I've read across a few sources, the option available for a schema-first project would be to implement interceptors for the resolvers and carry out the serialization there.
Before I run off and start writing these interceptors, my question is this: are there any better options provided by NestJS to implement serialization for a schema-first approach?
If it's just transformation of values, then an interceptor is a great tool for that. Everything shown for "code-first" should work for "schema-first" in terms of the high-level ideas of the framework (interceptors, pipes, filters, etc.). In fact, once the server is running, there shouldn't be a distinguishable difference between the two approaches and how they operate. The big thing you'd need to be concerned with is that you won't easily be able to take advantage of class-transformer and class-validator, because the original class definitions are created via the gql-codegen, but you can still extend those types and add on the decorators necessary if you choose.
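As a minimal sketch of that idea (the interceptor name and the stripped field are invented for illustration; it assumes resolvers return plain objects):

    // serialize.interceptor.ts -- reshapes every resolver result
    import {
      CallHandler, ExecutionContext, Injectable, NestInterceptor,
    } from '@nestjs/common';
    import { Observable } from 'rxjs';
    import { map } from 'rxjs/operators';

    @Injectable()
    export class SerializeInterceptor implements NestInterceptor {
      intercept(context: ExecutionContext, next: CallHandler): Observable<any> {
        // next.handle() emits the resolver's return value; map() transforms it
        return next.handle().pipe(
          map((data) => ({ ...data, password: undefined })), // e.g. drop a sensitive field
        );
      }
    }

You would then apply it with @UseInterceptors(SerializeInterceptor) on a resolver class or method, exactly as the code-first examples show.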

Dynamically adding data to divs with SQL?

A little background first: I recently began coding, and I decided to take the "learn as you go" approach, as this is just a personal project. I have a pretty good handle on HTML and CSS, I have an understanding of jQuery, and I haven't even begun to look at other languages.
So basically I'm making a pseudo-e-commerce site, and I'm trying to create a page layout made up of several divs stacked together (think standard catalog page). I'm creating the modules and everything static with HTML and CSS, but I want to add the content, consisting of a banner and some text blocks, dynamically from a database. Now, I'm pretty sure that I will have to use SQL and reference each entry from the HTML, but I have no idea how to do that or where to even start. So I'm asking if someone could point me in the right direction with some reading material, or some examples would be awesome.
You need to use a database (MySQL, MSSQL, etc.) to store the data. To show data from the database, you need to use a server-side programming language. For a start, I would suggest that you try PHP.
W3Schools is a good starting point for you.
This is very simplified and I hope not condescending. Consider separating how you collect your data from how you present it (the 'view layer'). SQL will help you pull and organize your data, and you could just use string functions to add formatting (e.g. divs) to it, but you are better off investigating HTML templating. What happens when you want to put this data into a ul list or something? You'd have to rewrite your perfectly good SQL. Again, very broadly: pull data (with SQL, PHP, or a combination, or get it from a URL with JavaScript) into a data structure, then, within a loop in your template, add the divs for each element, as in the sketch below.
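For example, a bare-bones version of that loop in PHP with PDO; the credentials, table, and column names below are placeholders:

    <?php
    // Pull the rows once...
    $pdo  = new PDO('mysql:host=localhost;dbname=shop', 'user', 'password');
    $rows = $pdo->query('SELECT banner_url, title, body FROM products');
    ?>
    <?php foreach ($rows as $row): ?>
      <!-- ...then emit one div per row -->
      <div class="product">
        <img src="<?= htmlspecialchars($row['banner_url']) ?>" alt="">
        <h2><?= htmlspecialchars($row['title']) ?></h2>
        <p><?= htmlspecialchars($row['body']) ?></p>
      </div>
    <?php endforeach; ?>

If you later decide to render a ul instead of divs, only the markup inside the loop changes; the query and the data structure stay the same.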
Good reading really depends on which platform you'll be developing this in. There are a bazillion alternatives, including many in JavaScript, PHP, Ruby, Python, Go, and ASP. Since you mention SQL, you must have some data somewhere (rather than a data service), so you'll need a server-side language, and since you are a beginner, you may want to look into PHP, which I think is approachable. Within that there are several PHP frameworks for data, several for templating, and several that do both. Many of the full frameworks (in any language) are geared toward experienced web devs. That said, I like the Twig templating language for PHP.
This, I think, is a good place to start: http://www.phptherightway.com, along with the super popular but basic W3Schools. The first link, I think, organizes the concepts a little better.
You can install the stuff you need on your laptop for the standard (and old-school) 'LAMP stack', or use one of the many hosting companies, nearly all of which provide everything you need. Good luck learning!

Spring Data Rest Without HATEOAS

I really like all the boilerplate code Spring Data REST writes for you, but I'd rather have just a 'regular' REST server without all the HATEOAS stuff. The main reason is that I use Dojo Toolkit on the client side, and all of its widgets and stores are set up such that the JSON returned is just a straight array of items, without all the links and things like that. Does anyone know how to configure this with Java config so that I get all the MVC code written for me, but without all the HATEOAS stuff?
If, after reading Oliver's comment (which I agree with), you still want to remove HATEOAS from Spring Boot, add this above the declaration of the class containing your main method:

    @SpringBootApplication(exclude = RepositoryRestMvcAutoConfiguration.class)
As pointed out by Zack in the comments, you also need to create a controller which exposes the required REST methods (findAll, save, findById, etc.).
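A minimal sketch of such a controller; Item and ItemRepository here are stand-ins for your own entity and Spring Data repository:

    // ItemController.java -- hand-written replacement for the generated endpoints;
    // assumes ItemRepository is a Spring Data JpaRepository<Item, Long>
    import java.util.List;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/items")
    public class ItemController {

        private final ItemRepository repository;

        public ItemController(ItemRepository repository) {
            this.repository = repository;
        }

        @GetMapping               // GET /items -> plain JSON array, no _links
        public List<Item> findAll() {
            return repository.findAll();
        }

        @GetMapping("/{id}")      // GET /items/42
        public Item findById(@PathVariable Long id) {
            return repository.findById(id).orElseThrow();
        }

        @PostMapping              // POST /items
        public Item save(@RequestBody Item item) {
            return repository.save(item);
        }
    }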
So you want REST without the things that make up REST? :) I think trying to alter (read: dumb down) a RESTful server to satisfy a poorly designed client library is a bad start to begin with. But here's the rationale for why hypermedia elements are necessary for this kind of tooling (besides the probably familiar general rationale).
Exposing domain objects to the web has always been viewed critically by most of the REST community, mostly because the boundaries of a domain object are not necessarily the boundaries you want to give your resources. However, frameworks providing scaffolding functionality (Rails, Grails, etc.) have become hugely popular in the last couple of years. So Spring Data REST is trying to address that space while at the same time being a good citizen in terms of RESTfulness.
So if you start with a plain data model in the first place (objects without too many relationships) and only want to read them, there's in fact no need for something like Spring Data REST. The Spring controller you need to write is roughly 10 lines of code on top of a Spring Data repository. When things get more challenging, the story becomes more interesting:
How do you write a client without hard-coding URIs (if it does, it isn't particularly RESTful)?
How do you handle relationships between resources? How do you let clients create them, update them etc.?
How does the client discover which query resources are available? How does it find out about the parameters to pass etc.?
If your answer to these questions is "My client doesn't need that / is not capable of doing that", then Spring Data REST is probably the wrong library to begin with. What you're basically building is JSON over HTTP, but nothing really RESTful. That is totally fine if it serves your purpose, but shoehorning a library with clear design constraints into something arbitrarily different (albeit apparently similar) that effectively wants to ignore exactly those design aspects is the wrong approach in the first place.

Automation & piping of diverse tasks

I am looking for recommendations for a very generic automation/task-execution tool. The scope sits somewhere between a script, a build system like make, and orchestration tools like Ansible or Puppet. The best I can do is describe my rather vague 'requirements' and hope for clues as to how others have solved these problems. Sorry for the long description; I guess I don't really know exactly what I want the solution to do. I profit from programming answers on SO all the time, but I am not entirely sure if my open-ended question is acceptable here.
--
We work as data analysts/system validators in a corporate setting. We perform a range of diverse tasks and interact with lots of ever-changing systems. Each little step we do is arguably mundane/easy, but the bigger picture only forms when lots of iterations with slightly different inputs or combinations are repeated. It is a bit like looking for a needle in a haystack, except the concrete problem is slightly different every time. This makes it hard to use a normal script or automation tool, which requires more structure to work. But doing things semi-manually without a big team does not allow us to cover all the analyses/cases we want/need.
To give an applied example: a typical task could involve setting up a big calculation in a vendor system, extracting its ASCII output from a web server, and parsing it. Then we would pull raw input data from a set of configuration files and databases. This is piped into some of our home-grown replication tools/models living in C++. Then both the system's results and our replication are scanned for interesting outliers (e.g. regression-tested), and only this subset is uploaded for human analysts to investigate, nicely presented in an Excel sheet.
We can do all these things easily by hand as a one-off, or maybe with ad-hoc tools/scripts. We just can't do it repeatedly for ever-so-slightly different settings. We seem to need a library of 'common tasks' that are specialized by just a few inputs (e.g. a task to download a time series and scan for outliers, with parameters for DB access/login and maybe parameters defining what an outlier is in that context). And then I need to chain these tasks together to make complex tasks repeatable and simple to build up from atomic steps.
I have not found anything that really does something like this. There seem to be specialist scripting tools for each niche, but nothing combining all the different tasks I need to perform.
I have so far been toying on and off with a minimalist sqlite database which controls a set of Python 'scripts'/wrappers. These scripts take input parameters from the database, and they are chained/piped based on the database. The scripts write their results back to the database, mostly as plain text and floats/ints. This kind of DB interface is very error-prone and complicated for humans; the idea is to have (template) scripts writing (concrete/parametrized) scripts to the DB for execution, like rolling itself out before executing. Not sure if this is a smart idea, but the DB drives the scripts, without much interaction among these building-block scripts, rather than the conventional bunch of scripts calling each other and dumping some data into a DB as an afterthought (a sketch of the idea follows below). So far we have lots of separate wrappers (scripts) to talk to all the systems and do the work; what is really missing is something tying it all together and controlling it.
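To make that concrete, here is a stripped-down sketch of such a DB-driven runner; the table layout and the run(params) convention are just one way I imagine it, not a finished design:

    # run_pending.py -- executes tasks queued in a sqlite table, in order
    import importlib
    import sqlite3

    def run_pending(db_path="tasks.db"):
        con = sqlite3.connect(db_path)
        con.row_factory = sqlite3.Row
        # Each row names a task module and its parameters
        pending = con.execute(
            "SELECT id, module, params FROM tasks "
            "WHERE status = 'pending' ORDER BY id"
        ).fetchall()
        for task in pending:
            mod = importlib.import_module(task["module"])  # e.g. tasks.download_series
            result = mod.run(task["params"])  # every wrapper exposes run(params) -> str
            con.execute(
                "UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
                (result, task["id"]),
            )
            con.commit()

    if __name__ == "__main__":
        run_pending()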
I am interested (obviously) more in data/flow transparency, repeatability, and chaining mini-programs together into bigger units than in speed or scaling to larger data sets. All the heavier lifting is either done in the systems we interact with or delegated to C++ called from these Python scripts. This is not a production system with stability requirements and fixed goals, but rather a flexible analysis/investigation helper.
I really hope someone here has previously run into exactly this problem that is severely limiting our productivity, and we can just piggyback off your solution or ideas.
I would suggest that you consider STAF (Software Testing Automation Framework). It's open source, distributed, and cross-platform. It will run just about any task on just about any platform. It has a variety of plugin "services" available for specific purposes, or you can create your own custom service. You can also extend the functionality through scripting (Jython). It's also well documented and reasonably well supported through user forums by IBM.

Consuming web services from Oracle PL/SQL

Our application is interfacing with a lot of web services these days. We have our own package that someone wrote a few years back using UTL_HTTP, and it generally works, but it needs some hard-coding of the SOAP envelope to work with certain systems. I would like to make it more generic, but I lack the experience to know how many scenarios I would have to deal with. The variations are in what namespaces need to be declared and in the format of the elements. We have to handle both simple calls with a few parameters and calls that pass a large amount of data in an encoded string.
I know that 10g has UTL_DBWS, but there are not a huge number of use cases online. Is it stable and flexible enough for general use? (Documentation)
I have used UTL_HTTP, which is simple and works. If you face a challenge with your own package, you can probably find a solution in one of the many wrapper packages around UTL_HTTP on the net (Google "consuming web services from pl/sql", which leads you to e.g. http://www.oracle-base.com/articles/9i/ConsumingWebServices9i.php).
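For reference, the raw UTL_HTTP pattern is roughly the following; the endpoint, operation, and envelope here are invented placeholders:

    -- Anonymous block posting a hand-built SOAP envelope
    DECLARE
      l_envelope  VARCHAR2(32767) :=
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        || '<soap:Body><getQuote><symbol>ORCL</symbol></getQuote></soap:Body>'
        || '</soap:Envelope>';
      l_request   UTL_HTTP.req;
      l_response  UTL_HTTP.resp;
      l_buffer    VARCHAR2(32767);
    BEGIN
      l_request := UTL_HTTP.begin_request('http://example.com/service', 'POST', 'HTTP/1.1');
      UTL_HTTP.set_header(l_request, 'Content-Type', 'text/xml; charset=utf-8');
      UTL_HTTP.set_header(l_request, 'Content-Length', LENGTH(l_envelope));
      UTL_HTTP.set_header(l_request, 'SOAPAction', '');
      UTL_HTTP.write_text(l_request, l_envelope);
      l_response := UTL_HTTP.get_response(l_request);
      UTL_HTTP.read_text(l_response, l_buffer);  -- one read; raises end_of_body when exhausted
      DBMS_OUTPUT.put_line(l_buffer);
      UTL_HTTP.end_response(l_response);
    END;
    /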
The reason nobody is using UTL_DBWS is that it is not functional in a default installed database. You need to load a ton of Java classes into the database, but the standard instructions seem to be defective - the process spews Java errors right and left and ultimately fails. It seems very few people have been willing to take the time to track down the package dependencies in order to make this approach work.
I had this challenge, and I found and installed the 'SOAP API' package that Sten suggests on Oracle-Base. It provides some good envelope-creation functionality on top of UTL_HTTP.
However, there were some limitations that pertain to your question. SOAP_API assumes all requests are simple XML, i.e. only a one-layer tag hierarchy.
I extended the SOAP_API package to allow the client code to arbitrarily insert an extra tag. So you can insert a sub-level tag, continue to build the request, and remember to insert the closing tag.
The namespace issue was a bear for the project: different levels of the XML had different namespaces.
A nice debugging tool that I used is TCP Trace from Pocket Soap (www.pocketsoap.com/tcptrace/). You set it up like a proxy and watch the HTTP request and response objects between client and server code.
Having said all that, we really like having a SOAP client in the database: we have full access to all data and existing PL/SQL code, can easily loop through cursors, and can call the external app via SOAP when needed. It was a lot quicker and easier than deploying a middle tier with lots of custom Java or .NET code. Good luck, and let me know if you'd like to see my enhanced SOAP API code.
We have also used UTL_HTTP in a manner similar to what you have described. I don't have any direct experience with UTL_DBWS, so I hope you can follow up with any information/experience you can gather.
@kogus, no, it's quite a good design for many applications. PL/SQL is a full-fledged programming language that has been used for many big applications.
Check out this older post. I have to agree with that post's #1 answer; it's hard to imagine a scenario where this could be a good design.
Can't you write a service, or standalone application, which would talk to a table in your database? Then you could implement whatever you want as a trigger on that table.