Apache Ignite JSON field

Perhaps I am missing this in the documentation, but is it possible to store and query JSON data in Apache Ignite? For example, let's say I have a "table" called "cars" with the following fields:
model
blueprint
The "blueprint" field is actually a json field that may contain data such as:
{
horsepower: 200,
mpg: 30
}
Those are not the only fields the "blueprint" field may hold; it may contain many more fields, or fewer. Is it possible to run a query such as:
SELECT model FROM cars WHERE blueprint.horsepower < 300 AND blueprint.mpg > 20
The fields of the "blueprint" object are not known in advance, so creating indexes for them ahead of time is not an option.
Note: This is not a conversation about whether this is the logically optimal way to store this information, or whether the "blueprint" field should be stored in a separate table. This question is meant to establish whether querying against a JSON field is trivially possible in Apache Ignite.

This is not supported out of the box as of now. However, you can create conversion logic between JSON and the Ignite binary format and save BinaryObjects in caches. To create a BinaryObject without a Java class, you can use the binary object builder: https://apacheignite.readme.io/docs/binary-marshaller#modifying-binary-objects-using-binaryobjectbuilder
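For illustration, here is a minimal sketch of that conversion, assuming Jackson for the JSON parsing; the cache name, key, and binary type name are made up:

import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.binary.BinaryObject;
import org.apache.ignite.binary.BinaryObjectBuilder;

import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonToBinary {
    public static void main(String[] args) throws Exception {
        Ignite ignite = Ignition.start();

        // Work with the cache in binary form so no Java class is needed.
        IgniteCache<String, BinaryObject> cars =
            ignite.<String, BinaryObject>getOrCreateCache("cars").withKeepBinary();

        String json = "{\"horsepower\": 200, \"mpg\": 30}";

        // Parse the JSON into a generic map of field names to values.
        @SuppressWarnings("unchecked")
        Map<String, Object> fields = new ObjectMapper().readValue(json, Map.class);

        // Copy the fields into a BinaryObject via the builder.
        BinaryObjectBuilder builder = ignite.binary().builder("Blueprint");
        for (Map.Entry<String, Object> field : fields.entrySet())
            builder.setField(field.getKey(), field.getValue());

        cars.put("some-model", builder.build());
    }
}

Note that running SQL over such fields still requires declaring them in the cache's query configuration, so the ad-hoc query from the question remains out of reach.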

Standard deep nested data type?

I took the nice example clientPrintDescription.py and created an HTML form from the description which matches the input data types for the particular RFC function.
In SAP, data types can contain data types, which can in turn contain further data types, and I want to test my HTML form generator with a very deeply nested data type.
Of course I could create my own custom data type, but it would be more re-usable if I used an existing (RFC-capable) data type.
Which data type in SAP contains a lot of nested data types? And maybe a lot of different data types?
I cannot tell which structure is best for your case, but you could filter the view DD03VV (now that is a meaningful name) using the transaction se16h. If you GROUP BY the column TABNAME and filter with WHERE TABCLASS = 'INTTAB', the number of entries is an indicator for the size of the structure.
You could also aggregate and, in a next step, filter on the maximum DEPTH value (like a SQL HAVING, which afaik does not exist in SAP R/3). On my system the maximum depth is 12.
Edit: If you cannot access se16h, here's a workaround: call se37 and execute SE16N_START with I_HANA = 'X'. If you cannot access se37, use sa38 and call RSFUNCTIONBUILDER (the report behind se37).
PS: The requests on DD03VV are awfully slow, probably due to missing optimization for complex requests on ABAP dictionary views.
If I had to give only one DDIC structure, I would give this one:
FDT_TEST_DDIC_BIND_DEEP_S
It contains many elements of miscellaneous types, including nested ones, and it exists in any ABAP-based system (it belongs to the "BASIS" layer).
As it contains some data and object references in sub-levels which are invalid in RFC, you'll have to copy it and remove those reference fields.
There are also these structures (column "TABNAME") with fields of some interest:
TABNAME FIELDNAME Description
-------------------- ------------- ------------------------------------------------
SFW_BF FROM_RELEASE elementary built-in type
SAUNIT_S_ALERT WHEN data element
SAUNIT_S_ALERT HEADER structure
SAUNIT_S_ALERT TEXT_INFOS table type
SAUNIT_PROG_INFO .INCLUDE include structure SAUNIT_S_TADIR_KEY
SKWF_IOFLD .INCLU-FLD include structure SKWF_IO
SWFEXPSTRU2 .INCLU--AP append structure SWFEXPSTRU3
APPEND_BAPI0002_2_2 .APPEND_DU append structure recursive (append of BAPI0002_2) (unique component of APPEND_BAPI0002_2_2)
SOADDRESS Structure with nested structures on 2 levels
Some structures may not be valid in some ABAP releases. They used to exist in ABAP basis 7.02 and 7.52.
Try the function module RFC_METADATA_TEST...
It has some deeply nested parameters.
In SE80 under the Enterprise Services Browser, you will find examples of proxy structures that are complex DDIC structures with many different types.
Example: edo_tw_a0401request
Just browse around, you will find something you like.
I found STFC_STRUCTURE in the docs of test_datatypes of PyRFC.
It works fine for testing, since it is already available in my SAP system, so I don't need a dummy RFC for testing. Nice.

How do I programmatically build Eloquent queries based on a custom data structure saved in a persistent storage?

I'm building a reporting tool for my Laravel app that will allow users to create reports and save them for later use.
Users will be able to select from a pre-defined list to modify the query, then run the report and save it.
Having never done this before, I was just wondering whether it is OK to save the query in the database. This would allow the user to select a saved report and execute its query.
One approach that would be easier and more robust than saving queries to the database would be to build a Controller that constructs the queries based on user input.
You could validate server-side that the query parameters match the predefined list of options and then use Eloquent's query builder to programmatically build the queries.
Actual code examples are hard to provide here, however, as your question is very broad and doesn't contain any specific examples.
You essentially need to build a converter between your storage mechanism and your data model in PHP. A code example would not add much value because you need to build it based on your needs.
You need to define a data structure (ideally JSON in this case, since it is powerful enough for the job) that describes all the query elements in a way that your business logic can read and convert into Eloquent queries.
I have done something similar in the past but for some simple scenarios, like defining variables for queries, instead of actual query elements.
This is how I'd do it, for example:
{
    "table": "users",
    "type": "SELECT",
    "fields": ["firstname AS fName", "lastname AS lName"],
    "wheres": {
        "is_admin": false,
        "is_registered": true
    }
}
converts to:
DB::table('users')
->where('is_admin', false)
->where('is_registered', true)
->get(['firstname AS fName', 'lastname AS lName']);
which converts to:
SELECT firstname AS fName, lastname AS lName FROM users WHERE is_admin = false AND is_registered = true
Here's an answer about saving report parameters to a database, but from a SQL Server Reporting Services (SSRS) angle.
It's still a generic enough EAV structure to work for any parameter datatype (strings, ints, dates, etc.).
You might want to skip Eloquent and use MySQL stored procedures. Then you only need to save the list of parameters you'd pass to each, along with settings like the preferred output type (e.g. .pdf, .xlsx, .html), who to email it to, and who has permission to run it.

Azure Stream Analytics -> how much control over path prefix do I really have?

I'd like to set the prefix based on some of the data coming from event hub.
My data is something like:
{"id":"1234",...}
I'd like to write a blob prefix that is something like:
foo/{id}/guid....
Ultimately I'd like to have one blob for each id. This will help with how the data gets consumed downstream by a couple of things.
What I don't see is a way to create prefixes that aren't related to date and time. In theory I can write another job to pull from blobs and break it up after the stream analytics step. However, it feels like SA should allow me to break it up immediately.
Any ideas?
{date}, {time} and {partition} are the only tokens supported in the blob output path prefix. {partition} is a number.
Using a column value in the blob prefix is currently not supported.
If you have a limited number of such {id}s, you could work around this by writing multiple "select ..." statements with different filters writing to different outputs, and hardcode the prefix in each output. Otherwise it is not possible with just ASA.
It should be noted that this is now actually possible. I'm not sure when it was implemented, but you can now use a single property from your message as a custom partition key, and the syntax is exactly what the OP asked for: foo/{id}/something/else
More details are documented here: https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-custom-path-patterns-blob-storage-output
Key points:
Only one custom property allowed
Must be a direct reference to an existing message property (i.e. no concatenations like {prop1+prop2})
If the custom property results in too many partitions (more than 8,000), then an arbitrary number of blobs may be created for the same partition

Designing the filesystem and database for JSON data files

I currently have an API which accepts JSON files (JSON-serialised objects containing some user transaction data) and stores them on the server. Every such JSON file has a unique global id and is associated with a unique user. The user should then be able to query all JSON files associated with him and produce aggregated results calculated on top of those files.
Edit:
A typical JSON file that needs to be stored looks something like:
[{"sequenceNumber":125435,"currencyCode":"INR","vatRegistrationNumber":"10868758650","receiptNumber":{"value":"1E466GDX5X2C"},"retailTransaction":[{"otherAttributes":{},"lineItem":[{"sequenceNumber":1000,"otherAttributes":{},"sale":{"otherAttributes":{},"description":"Samsung galaxy S3","unitCostPrice":{"quantity":1,"value":35000},"discountAmount":{"value":2500,"currency":"INR"},"itemSubType":"SmartPhone"}},{"sequenceNumber":1000,"otherAttributes":{},"customerOrderForPickup":{"otherAttributes":{},"description":"iPhone5","unitCostPrice":{"quantity":1,"value":55000},"discountAmount":{"value":5000,"currency":"INR"},"itemSubType":"SmartPhone"}}],"total":[{"value":35000,"type":"TransactionGrossAmount","otherAttributes":{}}],"grandTotal":90000.0,"reason":"Delivery"},null]}]
The above JSON is the serialised version of a complex object whose attributes are single objects, or arrays of objects, of other classes. The 'receiptNumber' is the universal id of the JSON file.
To answer Sammaye's question: I would need to query things like the quantity and value of the customerOrderForPickup, or the grandTotal of the transaction, as an aggregate over various such transaction JSONs.
I would like some suggestions on how to go about:
1) storing these JSON files on the server, i.e. the file system
2) choosing a database suited to querying through JSON files with such a complex structure
My research has resulted in a couple of possibilities:
1) Use a MongoDB database to store the JSON representations of the objects and query the database. How would the JSON files be stored? What would be the best way to store the transaction JSONs in the MongoDB database?
2) Couple an SQL database (containing the unique global id, the user id and the address of the JSON file on the server) with aggregating code over those files. I doubt this would scale.
Would be glad if someone has any insights on the problem. Thanks.
I can see 2 options:
Store in MongoDB, as you mentioned: just create a collection and add each JSON file directly as a document. You may need to change the layout of the JSON a bit to improve queryability (see the sketch after this list).
Store in HDFS and layer Hive on top. There is a JSON SerDe (Serializer/Deserializer) for Hive. This would also scale well.
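To make the first option concrete, here is a minimal sketch using the MongoDB Java driver; the connection string, database and collection names are assumptions, and the inserted document is a trimmed stand-in for the full transaction JSON:

import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;

import static com.mongodb.client.model.Filters.eq;

public class TransactionStore {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> txns =
                client.getDatabase("receipts").getCollection("transactions");

            // Each uploaded JSON file becomes one document. Note that a MongoDB
            // document must be a JSON object, so the top-level array in the
            // example file would need to be unwrapped or wrapped in an object.
            String json = "{\"userId\": \"u42\", \"receiptNumber\": {\"value\": \"1E466GDX5X2C\"}, \"grandTotal\": 90000.0}";
            txns.insertOne(Document.parse(json));

            // Nested fields are addressed with dot notation.
            Document found = txns.find(eq("receiptNumber.value", "1E466GDX5X2C")).first();
            System.out.println(found.toJson());
        }
    }
}

The aggregated results (e.g. sums of grandTotal per user) would then map naturally onto MongoDB's aggregation pipeline.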

Converting SQL Result Sets to XML

I am looking for a tool that can serialize and/or transform SQL result sets into XML. Dumbed-down XML generation from SQL result sets is simple and trivial, but that's not what I need.
The solution has to be database-neutral and accept only regular SQL query results (no DB XML support used). A particular challenge for this tool is to produce nested XML matching an arbitrary schema from row-based results. Intermediate steps are too slow and wasteful; this needs to happen in one single step: no RS->object->XML, and preferably no RS->XML->XSLT->XML. It must support streaming, due to large result sets and big XML.
Anything out there for this?
With SQL Server you really should consider using the FOR XML construct in the query.
If you're using .NET, just use a DataAdapter to fill a DataSet. Once it's in a DataSet, just use its WriteXml() method. That breaks your DB->object->XML rule, but it's really how things are done. You might be able to work something out with a DataReader, but I doubt it.
Not that I know of. I would just roll my own. It's not that hard to do, maybe something like this:
#!/usr/bin/env jruby
require 'java'

java_import java.sql.DriverManager

# TODO some magic to load the driver

conn = DriverManager.get_connection(ARGV[0], ARGV[1], ARGV[2])
# A Connection has no executeQuery; go through a Statement.
res = conn.create_statement.execute_query(ARGV[3])
meta = res.meta_data

puts "<result>"
while res.next
  puts "<row>"
  (1..meta.column_count).each do |n|
    column = meta.get_column_name(n)
    # NOTE: values are not XML-escaped here.
    puts "<#{column}>#{res.get_string(n)}</#{column}>"
  end
  puts "</row>"
end
puts "</result>"
Disclaimer: I just made all of that up, I'm not even bothering to pretend that it works. :-)
In .NET you can fill a DataSet from any source, and then it can write that out to disk for you as XML, with or without the schema. I can't say what performance for large sets would be like. Simple :)
Another option, depending on how many schemas you need to output, and/or how dynamic this solution is supposed to be, would be to actually write the XML directly from the SQL statement, as in the following simple example...
SELECT
'<Record>' ||
'<name>' || name || '</name>' ||
'<address>' || address || '</address>' ||
'</Record>'
FROM
contacts
You would have to prepend and append the document element, but I think this example is easy enough to understand.
DbUnit (www.dbunit.org) does go from SQL to XML and vice versa; you might be able to modify it for your needs.
Technically, converting a result set to an XML file is straightforward and doesn't need any tool unless you have a requirement to convert the data structure to fit a specific export schema. In general, the result set becomes the top-level element of the XML file; then you produce a number of record elements containing attributes, which effectively are the fields of a record.
When it comes to Java, for example, you just need an appropriate JDBC driver for interfacing with the DBMS of your choice (addressing the database-independence requirement; drivers are usually provided by the DBMS vendor) and a few lines of code to read the result set and print out an XML string per record, per field. Not a difficult task for an average Java developer, in my opinion.
Anyway, the more concrete a purpose you state, the more concrete an answer you get.
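As a sketch of that approach, assuming nothing beyond the JDK: the following streams a result set to flat row/field XML with JDBC and StAX, escaping values via the writer and never materialising the whole result set. The nested-schema mapping the question asks for would still have to be layered on top.

import java.io.OutputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class ResultSetToXml {

    // Streams one <row> element per record; values are escaped by the StAX writer.
    // Caveat: column labels must be valid XML element names.
    static void write(ResultSet rs, OutputStream out) throws Exception {
        XMLStreamWriter w = XMLOutputFactory.newFactory().createXMLStreamWriter(out, "UTF-8");
        ResultSetMetaData meta = rs.getMetaData();
        w.writeStartDocument("UTF-8", "1.0");
        w.writeStartElement("result");
        while (rs.next()) {
            w.writeStartElement("row");
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                w.writeStartElement(meta.getColumnLabel(i));
                String value = rs.getString(i);
                if (value != null)
                    w.writeCharacters(value);
                w.writeEndElement();
            }
            w.writeEndElement();
        }
        w.writeEndElement();
        w.writeEndDocument();
        w.close();
    }

    // Usage: java ResultSetToXml <jdbc-url> <user> <password> <query>
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2]);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(args[3])) {
            write(rs, System.out);
        }
    }
}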
In Java, you may just fill an object with the data (like an entity bean) and then use XMLEncoder to turn it into XML. From there you may use XSLT for further conversion, or XMLDecoder to bring it back to an object.
Greetz, GHad
PS: See http://ghads.wordpress.com/2008/09/16/java-to-xml-to-java/ for an example of the object-to-XML part... From DB to object, several more ways are possible: JDBC, Groovy DataSets or GORM. Apache Commons BeanUtils may help to fill up JavaBeans via reflection-like methods.
I created a solution to this problem using the equivalent of a mail merge, with the result set as the source and a template through which it was merged to produce the desired XML.
The template was standard XML with a Header element, a Footer element and a Body element. A CDATA block in the Body element allowed me to include a complete XML structure that acted as the template for each row. To include fields from the result set in the template, I used markers that looked like this: <[FieldName]>. The template was then pre-parsed to isolate the markers, such that in operation the template requests each of the fields from the result set as the Body is being produced.
The Header and Footer elements are output only once, at the beginning and end of the output set. The Body can be any XML or text structure desired. In your case, it sounds like you might have several templates, one for each of your desired schemas.
All of the above was encapsulated in a Template class, such that after loading the Template I merely called merge() on it, passing the result set in as a parameter.
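The Template class itself isn't shown here, but purely as an illustration of the marker idea (all names are hypothetical, and the pre-parsing optimisation is omitted), the merge could look like this:

import java.sql.ResultSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of the merge: header and footer are emitted once, and the
// body template is re-emitted per row with <[FieldName]> markers replaced
// by the corresponding column values.
public class RowTemplate {
    private static final Pattern MARKER = Pattern.compile("<\\[(\\w+)\\]>");

    private final String header, body, footer;

    public RowTemplate(String header, String body, String footer) {
        this.header = header;
        this.body = body;
        this.footer = footer;
    }

    public void merge(ResultSet rs, Appendable out) throws Exception {
        out.append(header);
        while (rs.next()) {
            Matcher m = MARKER.matcher(body);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                String value = rs.getString(m.group(1)); // marker name = column name
                m.appendReplacement(sb, Matcher.quoteReplacement(value == null ? "" : value));
            }
            m.appendTail(sb);
            out.append(sb);
        }
        out.append(footer);
    }
}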