Avro Schema for unstructured data with random names - serialization

I need to save nested data objects with unpredictable names in an Avro schema. For example:
{
  "foo": "bar",
  "baz": {
    "randomName1": 0.23,
    ...
  }
}
Because creating recursive maps is only possible with records, but records must have a field name, I think I need to transform the object into something else.
I thought about one of these three options:
(1) Array of nested key/value pairs
Example:
[
  {
    "key": "foo",
    "value": "bar"
  },
  {
    "key": "baz",
    "value": [
      {
        "name": "randomName1",
        "value": 0.23
      },
      ...
    ]
  }
]
(2) Flat map with dot-syntaxed key/value pairs
Example:
{
  "foo": "bar",
  "baz.randomName1": 0.23,
  ...
}
(3) Array of flattened key/value objects
Example:
[
  {
    "name": "foo",
    "value": "bar"
  },
  {
    "name": "baz.randomName1",
    "value": 0.23
  },
  ...
]
All three approaches translate well to Avro, but I am unsure of the implications of each approach, for example when trying to query those values via KSQL.
Any hint towards potential gotchas further down the road is highly appreciated.
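To see the shape option (3) produces, here is a minimal sketch in Python (illustrative only; the flatten helper is a hypothetical name, not part of any Avro library):

```python
# Sketch: flattening a nested object into option (3)'s array of
# {"name", "value"} records, which maps to a simple Avro schema of
# the form {"type": "array", "items": {"type": "record", ...}}.
def flatten(obj, prefix=""):
    """Turn nested dicts into dot-syntaxed name/value pairs."""
    pairs = []
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            pairs.extend(flatten(value, name))  # recurse into nested objects
        else:
            pairs.append({"name": name, "value": value})
    return pairs

data = {"foo": "bar", "baz": {"randomName1": 0.23}}
flattened = flatten(data)
# [{'name': 'foo', 'value': 'bar'}, {'name': 'baz.randomName1', 'value': 0.23}]
```

Note that because the values mix strings and doubles, the "value" field in the Avro record would need a union type such as ["string", "double"]; how unions surface in downstream consumers like KSQL is one of the things worth testing early.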


Faunadb create index on child item field

I'm trying to get an index on a field of a child item in my document. The data is this:
[
  {
    "ref": Ref(Collection("ColA"), "111111111111111111"),
    "ts": 1659241462840000,
    "data": {
      "name": "Test a",
      "members": [
        {
          "userId": "1",
          "roles": ["admin"]
        }
      ]
    }
  },
  {
    "ref": Ref(Collection("ColA"), "111111111111111112"),
    "ts": 1659241462840000,
    "data": {
      "name": "Test b",
      "members": [
        {
          "userId": "1",
          "roles": ["admin"]
        },
        {
          "userId": "2",
          "roles": ["read-only"]
        }
      ]
    }
  },
  {
    "ref": Ref(Collection("ColA"), "111111111111111113"),
    "ts": 1659241462840000,
    "data": {
      "name": "Test c",
      "members": [
        {
          "userId": "2",
          "roles": ["admin"]
        }
      ]
    }
  }
]
I'm trying to use data.members.userId as a term in the index. This only gives back one result when I use the index with the filter value '1'.
Then I tried to create the index as follows:
CreateIndex({
  name: 'spaces_member_ids',
  source: {
    collection: Collection("ColA"),
    fields: {
      members: Query(
        Lambda(
          "ColA",
          Select(["data", "members", "userId"], Var("ColA"), '')
        )
      ),
    },
  },
  terms: [
    { binding: "members" },
  ],
  values: [
    { field: "data.name" },
    { field: "ref" },
  ]
})
But that gives no results when I use the index with the filter value '1'. Both times I expect to get two items back (Test a and Test b).
Does anyone know how to create an index that gives back all the data of ColA filtered on the 'userId' field in the 'members' array?
The problem is that there is no userId field as a direct descendant of the members array.
For background, Fauna index entries can only contain scalar values. Objects are not indexed at all. For arrays, one index entry is created per scalar value in the array. If you attempt to index multiple array fields, the number of index entries produced is the Cartesian product of the items in all indexed arrays.
If you create your index like so:
CreateIndex({
  name: 'spaces_member_ids',
  source: Collection("ColA"),
  terms: [
    { field: ["data", "members", 0, "userId"] },
  ],
  values: [
    { field: ["data", "name"] },
    { field: "ref" },
  ]
})
Then you'll be able to search for userId values that appear in the first item in the members array.
If you need to create index entries for all userId values from each ColA document, then your binding approach is close, but it needs to provide an array.
CreateIndex({
  name: "spaces_member_ids",
  source: {
    collection: Collection("ColA"),
    fields: {
      members: Query(
        Lambda(
          "ColA",
          Map(
            Select(["data", "members"], Var("ColA"), []),
            Lambda(
              "member",
              Select(["userId"], Var("member"), "")
            )
          )
        )
      ),
    },
  },
  terms: [
    { binding: "members" },
  ],
  values: [
    { field: ["data", "name"] },
    { field: "ref" },
  ]
})
The notable changes that I made are:
Within the binding, Map is used to iterate over the members field in the document. Simply returning the userId field value within the Map is sufficient to produce an array of userId values.
Corrected the syntax in the values definition: Fauna indexes don't process dot notation.
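To illustrate what the corrected binding computes per document, here is a Python sketch (names like binding_members are hypothetical; Fauna evaluates the real binding server-side):

```python
# Illustration only: what the corrected binding computes for each document.
def binding_members(doc):
    """Mirror of: Map(Select(["data", "members"], doc, []),
    Lambda("member", Select(["userId"], Var("member"), "")))."""
    members = doc.get("data", {}).get("members", [])
    return [m.get("userId", "") for m in members]

docs = [
    {"ref": "111111111111111111", "data": {"name": "Test a", "members": [{"userId": "1"}]}},
    {"ref": "111111111111111112", "data": {"name": "Test b", "members": [{"userId": "1"}, {"userId": "2"}]}},
    {"ref": "111111111111111113", "data": {"name": "Test c", "members": [{"userId": "2"}]}},
]

# Fauna creates one index entry per scalar in the returned array, so
# filtering the index on "1" matches the first two documents:
matches = [d["data"]["name"] for d in docs if "1" in binding_members(d)]
# ['Test a', 'Test b']
```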

How to map certain keys of an array to make another one in mule 4

I have an array -
[
  {
    "Number": "12345",
    "abc": {
      "group": "abc",
      "operation": "Create"
    },
    "def": {
      "group": "def",
      "operation": "Create"
    }
  },
  {
    "Number": "45678",
    "xyz": {
      "group": "xyz",
      "operation": "Update"
    },
    "sdf": {
      "group": "sfd",
      "operation": "Delete"
    }
  }
]
and need to convert it into this form -
[
  {
    "Number": "12345",
    "group": "abc",
    "operation": "Create"
  },
  {
    "Number": "12345",
    "group": "def",
    "operation": "Create"
  },
  {
    "Number": "45678",
    "group": "xyz",
    "operation": "Update"
  },
  {
    "Number": "45678",
    "group": "sfd",
    "operation": "Delete"
  }
]
I'm trying to write a DataWeave expression for this. The issue is that abc, def, xyz and so on are objects which may or may not be present and can have different values.
Another way to handle this:
%dw 2.0
output application/json
---
payload flatMap ((item, index) ->
  (item - "Number") pluck {
    "Number": item.Number,
    ($)
  }
)
The approach is mostly the same, but here is the explanation: we iterate with flatMap instead of map since we know we will be returning multiple items from each instance. The first thing we do is remove the key Number from the item, since we only want to build a new object for each key that isn't Number. Then we can pluck, which gives us access to each key and value; from there we build a new object containing our item's Number value and expand the entire plucked object into it. When using an anonymous function like this, $, $$, $$$, etc. represent the function's parameters - in pluck's case: value, key, index. The parentheses around $ mean to expand the entire object into our new object; in JavaScript this is similar to { ...props, anotherKey: 'value' }. This means we don't really need to know or care about the structure of that object, which is useful if we have a potentially flexible schema.
You need to map each element, filter out the attribute Number from each object, and use pluck to convert the remaining keys into an array. I used flatMap to concatenate the resulting arrays from each pluck into the response.
%dw 2.0
output application/json
---
payload flatMap ((item, index) ->
  item
    filterObject ((value, key, index) -> !(key ~= "Number"))
    pluck ((value, key, index) -> {Number: item.Number, group: value.group, operation: value.operation})
)
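For readers outside Mule, the same drop-the-key-then-spread logic can be sketched in Python (illustrative only; spread_groups is a hypothetical name, not a DataWeave or Mule API):

```python
# Sketch: drop "Number" from each record, then emit one flat object per
# remaining key, carrying the Number along (the pluck + ($) pattern).
def spread_groups(records):
    out = []
    for item in records:
        number = item["Number"]
        for key, value in item.items():
            if key == "Number":
                continue
            # {"Number": number, **value} mirrors {"Number": item.Number, ($)}
            out.append({"Number": number, **value})
    return out

payload = [
    {"Number": "12345",
     "abc": {"group": "abc", "operation": "Create"},
     "def": {"group": "def", "operation": "Create"}},
]
spread_groups(payload)
# [{'Number': '12345', 'group': 'abc', 'operation': 'Create'},
#  {'Number': '12345', 'group': 'def', 'operation': 'Create'}]
```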

Validating that a property value exists within the keys of an object

Wise crowd,
I already have a working JSON Schema (v0.7) to validate my data. This is an example of valid JSON:
{
  "people": [
    { "id": 1, "name": "bob" },
    ...
  ]
}
Now I need to add a bunch of strings to it:
{
  "people": [
    { "id": 1, "name": "bob", "appears_in": "long_string_id_1" },
    { "id": 2, "name": "ann", "appears_in": "long_string_id_1" },
    ...
  ],
  "long_strings": {
    "long_string_id_1": "blah blah blah.....",
    ...
  }
}
What I need is:
a value for key appears_in MUST be a key of the long_strings object
(optional) a key of the long_strings object MUST be used as the value of one of the appears_in keys
Property dependencies are nice, but don't seem to address my needs.
Any idea?
And this question is not a duplicate, because I do not know the values in advance.
Sorry, you cannot do this in JSON Schema: a schema cannot reference the instance data it is validating.
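Since the schema cannot express this, one option is a small application-level check run after schema validation. A Python sketch (check_appears_in is a hypothetical helper, not part of any JSON Schema library):

```python
# Sketch: enforce both constraints outside the schema.
def check_appears_in(doc):
    long_strings = doc.get("long_strings", {})
    used = {p["appears_in"] for p in doc["people"] if "appears_in" in p}
    # 1) every appears_in value must be a key of long_strings
    dangling = used - long_strings.keys()
    # 2) (optional) every long_strings key must be used at least once
    unused = long_strings.keys() - used
    return dangling, unused

doc = {
    "people": [
        {"id": 1, "name": "bob", "appears_in": "long_string_id_1"},
        {"id": 2, "name": "ann", "appears_in": "long_string_id_1"},
    ],
    "long_strings": {"long_string_id_1": "blah blah blah....."},
}
check_appears_in(doc)  # (set(), set()) -> both constraints satisfied
```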

express-graphql: How to remove external "data" object layer.

I am replacing an existing REST endpoint with GraphQL.
In our existing REST endpoint, we return a JSON array.
[
  { "id": "ABC" },
  { "id": "123" },
  { "id": "xyz" },
  { "id": "789" }
]
GraphQL seems to be wrapping the array in two additional object layers. Is there any way to remove the "data" and "Client" layers?
Response data:
{
  "data": {
    "Client": [
      { "id": "ABC" },
      { "id": "123" },
      { "id": "xyz" },
      { "id": "789" }
    ]
  }
}
My query:
{
  Client(accountId: "5417727750494381532d735a") {
    id
  }
}
No. That is the whole point of GraphQL: a single endpoint that lets users fetch data of different types and granularity by specifying the input as a query, as opposed to REST APIs, and map it onto the returned JSON output.
'data' acts as a parent/root-level container for the different entities that you have queried. Without these keys in the returned JSON, there would be no way to segregate the corresponding data. E.g.:
Your above query can be modified to include another entity like Owner,
{
  Client(accountId: "5417727750494381532d735a") {
    id
  }
  Owner {
    id
  }
}
In which case, the output will be something like
{
  "data": {
    "Client": [
      ...
    ],
    "Owner": [
      ...
    ]
  }
}
Without the 'Client' and 'Owner' keys in the JSON output, there is no way to separate the corresponding array values.
In your case, you can get only the array by reading data.Client from the returned output.
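To illustrate that last point, unwrapping the envelope is a one-line read on the client side; a Python sketch of the returned structure:

```python
# Sketch: the "data"/"Client" layers are part of the GraphQL response
# contract, so peel them off after receiving the response.
response = {
    "data": {
        "Client": [{"id": "ABC"}, {"id": "123"}, {"id": "xyz"}, {"id": "789"}]
    }
}
clients = response["data"]["Client"]  # the plain array the REST endpoint returned
```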

Is it possible to have inlined models in associations or in fields?

I have a complex data structure and would like to have it modeled.
But I would like to avoid creating the many sub-models that the data requires, as they have no independent existence outside the scope of this data structure.
My data structure:
[
  {
    "name": "General",
    "contexts": [
      {
        "name": "User Profile",
        "settings": [
          {
            "valuesList": [
              {
                "name": "Name1",
                "key": "key1"
              },
              ....
            ],
            "required": true
          },
          ........
        ],
        ......
      }
    ]
  }
]
Can this be modeled in a single Ext.data.Model?
Thanks!
Seemingly, a single-model implementation is not possible. You should refer to these resources:
Extjs reading complex JSON data into store
One-to-many relationship between two models