Convert triple table to object table in SQLite JSON1 - sql

I have an SQLite table content that contains triples in { id, rel, tgt } format [1]. I would like to create a view that exposes this triple-format data as "object format", which is more easily consumed by applications reading from this database. In theory SQLite's JSON1 extension would allow me to construct such objects, but I'm struggling.
My current query
select distinct json_object(
    'id', id,
    rel, json_group_array(distinct tgt)
) as entity from content
group by src, rel, tgt
order by src, rel, tgt
does not work correctly. It produces objects like
{ id: 'a', 'is': ['b'] }
{ id: 'a', 'is': ['c'] }
Rather than
{ id: 'a', 'is': ['b', 'c'] }
It also produces duplicate keys like
{ id: 'a', id: ['a'] }
Edit
This is closer, but does not handle IDs correctly. It constructs an array for the id, not a string:
create view if not exists entity as
select distinct json_group_object(
    rel, json_array(distinct tgt)
) as entity from content
group by src
I think iif might help.
Question:
Can you help me adjust my query to produce the correct output (see below)? Please comment if anything needs disambiguation or clarification.
Desired Output
Input:
Triple Format:
id    | rel   | tgt
------+-------+-----------
Bob   | id    | Bob
Bob   | is    | Person
Bob   | age   | 20
Bob   | likes | cake
Bob   | likes | chocolate
Alice | id    | Alice
Alice | is    | Person
Alice | hates | chocolate
Output:
Object Format [2]:
{
id: Bob,
is: [ Person ],
age: [ 20 ],
likes: [ cake, chocolate ]
}
{
id: Alice,
is: [ Person ],
hates: [ chocolate ]
}
Details
[1] This dataset has unpredictable structure; I can assume no prior knowledge of what 'rel' keys exist beyond id. A triple <src> id <src> will exist for every src parameter.
[2] The objects should have the following format. id must not be overwritten.
{
id: <id>
<distinct rel>: [
< tgt >
]
}
Relevant Information
https://www.sqlite.org/json1.html

CREATE TABLE content (
id VARCHAR(32),
rel VARCHAR(32),
tgt VARCHAR(32)
);
INSERT INTO
content
VALUES
('Bob' , 'id' , 'Bob'),
('Bob' , 'is' , 'Person'),
('Bob' , 'age' , '20'),
('Bob' , 'likes', 'cake'),
('Bob' , 'likes', 'chocolate'),
('Alice', 'id' , 'Alice'),
('Alice', 'is' , 'Person'),
('Alice', 'hates', 'chocolate');
-- Step 1 (id_rel): collapse each (id, rel) pair into a single JSON array of targets.
-- Step 2: fold those arrays into one object per id; the 'id' key is unwrapped back
-- to a scalar with ->0 (the -> operator requires SQLite 3.38+).
WITH id_rel AS
(
    SELECT id, rel, JSON_GROUP_ARRAY(tgt) AS tgt
    FROM content
    GROUP BY id, rel
)
SELECT
    JSON_GROUP_OBJECT(
        rel,
        CASE WHEN rel = 'id'
             THEN JSON(tgt)->0
             ELSE JSON(tgt)
        END
    ) AS entity
FROM id_rel
GROUP BY id
ORDER BY id;
entity
{"hates":["chocolate"],"id":"Alice","is":["Person"]}
{"age":["20"],"id":"Bob","is":["Person"],"likes":["cake","chocolate"]}
You must aggregate in two steps, as your edited code doesn't combine cake and chocolate into a single array of two elements...
https://dbfiddle.uk/9ptyuhuj
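Since the original goal was a view, the same two-step aggregation can be wrapped directly in one (a minimal sketch, assuming SQLite 3.38+ for the -> operator):
CREATE VIEW IF NOT EXISTS entity AS
WITH id_rel AS
(
    -- collapse each (id, rel) pair into one JSON array of targets
    SELECT id, rel, JSON_GROUP_ARRAY(tgt) AS tgt
    FROM content
    GROUP BY id, rel
)
SELECT
    JSON_GROUP_OBJECT(
        rel,
        CASE WHEN rel = 'id' THEN JSON(tgt)->0 ELSE JSON(tgt) END
    ) AS entity
FROM id_rel
GROUP BY id;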

Related

Left join add matched rows in child table to an array in parent row (as JSON format)

I have the following two tables:
+----------------------------+
| Parent Table |
+----------------------------+
| uuid (PK) |
| caseId |
| param |
+----------------------------+
+----------------------------+
| Child Table |
+----------------------------+
| uuid (PK) |
| parentUuid (FK) |
+----------------------------+
My goal is to do a (left?) join and get all matching rows from the child table (based on the FK) as an array on the parent row, rather than merged into the parent row itself on matching column names (see the desired output further down).
Examples of values in tables:
Parent table:
1. uuid: "10dd617-083-e5b5-044b-d427de84651", caseId: 1, param: "test1"
2. uuid: "5481da7-8b7-22db-d326-b6a0a858ae2f", caseId: 1, param: "test1"
3. uuid: "857dec3-aa3-1141-b8bf-d3a8a3ad28a7", caseId: 2, param: "test1"
Child table:
1. uuid: 7eafab9f-5265-4ba6-bb69-90300149a87d, parentUuid: 10dd617-083-e5b5-044b-d427de84651
2. uuid: f1afb366-2a6b-4cfc-917b-0794af7ade85, parentUuid: 10dd617-083-e5b5-044b-d427de84651
I am trying to achieve this with something like the following query (pseudo-ish SQL):
SELECT *
FROM Parent_table
LEFT JOIN Child_table ON Child_table.parentUuid = Parent_table.uuid
WHERE caseId = '1'
Desired output (in JSON)
[
{
"uuid": "10dd617-083-e5b5-044b-d427de84651",
"caseId": "1",
// DESIRED FORMAT HERE
"childRows": [
{
"uuid": "7eafab9f-5265-4ba6-bb69-90300149a87d",
"parentUuid": "10dd617-083-e5b5-044b-d427de84651"
},
{
"uuid": "f1afb366-2a6b-4cfc-917b-0794af7ade85",
"parentUuid": "10dd617-083-e5b5-044b-d427de84651"
}
]
},
{
"uuid": "5481da7-8b7-22db-d326-b6a0a858ae2f",
"caseId": "1"
}
]
You can use nested FOR JSON clauses to achieve this.
SELECT
p.uuid,
p.caseId,
childRows = (
SELECT
c.uuid,
c.parentUuid
FROM Child_table c
WHERE c.parentUuid = p.uuid
FOR JSON PATH
)
FROM Parent_table p
WHERE p.caseId = '1'
FOR JSON PATH;
SQL does not support rows inside rows the way you actually want; instead you have to return the entire result set (either as a join or as two separate result sets) from SQL Server and then create the objects in your backend. If you are using .NET and EF/LINQ, this is as simple as getting all the parents with an Include that also loads the children. Other backends will do this in other ways.
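As a rough illustration of the flat result set that the backend would then group into objects (a sketch reusing the question's table and column names):
SELECT
    p.uuid,
    p.caseId,
    p.param,
    c.uuid AS childUuid,
    c.parentUuid
FROM Parent_table p
LEFT JOIN Child_table c ON c.parentUuid = p.uuid
WHERE p.caseId = '1';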

Postgres get multiple rows into a single json object

I have a users table with columns like id, name, email, etc. I want to retrieve information of some users in the following format in a single json object:
{
"userId_1" : {"name" : "A", "email": "A#gmail.com"},
"userId_2" : {"name" : "B", "email": "B#gmail.com"}
}
Wherein the user's unique id is the key and a JSON object containing their information is its corresponding value.
I am able to get this information in two separate rows using json_build_object, but I want to get it in a single row in the form of one single JSON object.
You can use json aggregation functions:
select jsonb_object_agg(id, to_jsonb(t) - 'id') res
from mytable t
jsonb_object_agg() aggregates key/value pairs into a single object. The key is the id of each row, and the value is a jsonb object made of all columns of the table except id.
Demo on DB Fiddle
Sample data:
id | name | email
:------- | :--- | :----------
userid_1 | A | A#gmail.com
userid_2 | B | B#gmail.com
Results:
| res |
| :----------------------------------------------------------------------------------------------------- |
| {"userid_1": {"name": "A", "email": "A#gmail.com"}, "userid_2": {"name": "B", "email": "B#gmail.com"}} |
Try:
select row_to_json(col) from T
This link might help: https://hashrocket.com/blog/posts/faster-json-generation-with-postgresql
Try this:
SELECT json_object(array_agg(id), array_agg(json::text)) FROM (
    SELECT id, json_build_object('name', name, 'email', email) AS json
    FROM users_table
) some_alias_name;
If your id is not of text type then you have to cast it to text too.
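For example, if id were an integer column, the cast could look like this (a sketch, assuming the same users_table as above):
SELECT json_object(array_agg(id::text), array_agg(json::text)) FROM (
    SELECT id, json_build_object('name', name, 'email', email) AS json
    FROM users_table
) some_alias_name;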

Using different time periods in one Azure log query

So I have an Azure log query (KQL) that takes in a date parameter, e.g. check log entries for the last X days. In this query I look up values from two different logs, and I would like the two lookups to use different dates for the respective logs. To get an idea of what I'm looking for, see the query below, which is almost what I have now, with a bit of pseudo code where I can't quite figure out how to structure it.
let usernames = LogNumberOne
| where TimeGenerated > {timeperiod:start} and TimeGenerated < {timeperiod:end}
| bla bla bla lots of stuff
let computernames = LogNumberTwo
| where TimeGenerated > {timeperiod:start} - 2d
| where bla bla bla lots of stuff
usernames
| join kind=innerunique (computernames) on session_id
| some logic to display table
So from LogNumberOne I want the values within the specified time period, but from LogNumberTwo I want the values from the specified time period plus another 2 days before that. Is this possible or do I need another parameter? I have tried with the query above, so {timeperiod:start} - 2d, but that doesn't seem to work, it just uses the timeperiod:start value without subtracting 2 days.
See the next variant, which uses a join with the time filter applied afterwards.
let usernames = datatable(col1:string, session_id:string, Timestamp:datetime )
[
'user1', '1', datetime(2020-05-14 16:00:00),
'user2', '2', datetime(2020-05-14 16:05:30),
];
let computernames =
datatable(session_id:string, ComputerName:string, Timestamp:datetime )
[
'1', 'Computer1', datetime(2020-05-14 16:00:30),
'2', 'Computer2', datetime(2020-05-14 16:06:20),
];
usernames
| join kind=inner (
computernames
| project-rename ComputerTime = Timestamp
) on session_id
| where Timestamp between(ComputerTime .. (-2d))
In case large sets of join are involved - use technique described in the following article:
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/join-timewindow
let window = 2d;
let usernames = datatable(col1:string, session_id:string, Timestamp:datetime )
[
'user1', '1', datetime(2020-05-13 16:00:00),
'user2', '2', datetime(2020-05-12 16:05:30),
];
let computernames =
datatable(session_id:string, ComputerName:string, Timestamp:datetime )
[
'1', 'Computer1', datetime(2020-05-14 16:00:30),
'2', 'Computer2', datetime(2020-05-14 16:06:20),
];
usernames
| extend _timeKey = range(bin(Timestamp, 1d), bin(Timestamp, 1d)+window, 1d)
| mv-expand _timeKey to typeof(datetime)
| join kind=inner (
computernames
| project-rename ComputerTime = Timestamp
| extend _timeKey = bin(ComputerTime, 1d)
) on session_id, _timeKey
| where Timestamp between(ComputerTime .. (-window))

Can PostgreSQL JOIN on jsonb array objects?

I am considering switching to PostgreSQL because of the JSON support. However, I am wondering if the following would be possible with a single query:
Let's say there are two tables:
Table 1) organisations:
ID (INT) | members (JSONB) |
------------+---------------------------------------------------------|
1 | [{ id: 23, role: "admin" }, { id: 24, role: "default" }]|
2 | [{ id: 23, role: "user" }]
Table 2) users:
ID (INT) | name TEXT | email TEXT |
------------+-----------+---------------|
23 | Max | max#gmail.com |
24 | Joe | joe#gmail.com |
Now I want to get a result like this (all I have is the ID of the organisation [1]):
ID (INT)    | members (JSONB)
------------+-------------------------------------------------------------------
1           | [{ id: 23, name: "Max", email: "max#gmail.com", role: "admin" },
            |  { id: 24, name: "Joe", email: "joe#gmail.com", role: "default" }]
(1 row)
I know this is not what JSONB is intended for and that there is a better solution for storing this data in SQL, but I am just curious if it would be possible.
Thanks!
Yes it is possible to meet this requirement with Postgres. Here is a solution for 9.6 or higher.
SELECT o.id, JSON_AGG(
    JSON_BUILD_OBJECT(
        'id'   , u.id,
        'name' , u.name,
        'email', u.email,
        'role' , e.usr->'role'
    )
) AS members
FROM organisations o
CROSS JOIN LATERAL JSONB_ARRAY_ELEMENTS(o.members) AS e(usr)
INNER JOIN users u ON (e.usr->'id')::text::int = u.id
GROUP BY o.id;
See this db fiddle.
Explanation:
the JSONB_ARRAY_ELEMENTS function splits the organisation JSON array into rows (one per user); it is usually used in combination with JOIN LATERAL
to join the users table, we access the content of the id field using the -> operator
for each user, JSON_BUILD_OBJECT is used to create a new object by passing a list of key/value pairs; most values come from the users table, except the role, which is taken from the organisation JSON element
the query aggregates by organisation id, using JSON_AGG to combine the above objects into a JSON array
For more information, you may also have a look at Postgres JSON Functions documentation.
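Since the question states that only the organisation ID is known, the same query can simply be restricted before the aggregation (a minimal sketch, filtering on organisation id 1 from the question):
SELECT o.id, JSON_AGG(
    JSON_BUILD_OBJECT(
        'id'   , u.id,
        'name' , u.name,
        'email', u.email,
        'role' , e.usr->'role'
    )
) AS members
FROM organisations o
CROSS JOIN LATERAL JSONB_ARRAY_ELEMENTS(o.members) AS e(usr)
INNER JOIN users u ON (e.usr->'id')::text::int = u.id
WHERE o.id = 1
GROUP BY o.id;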
There might be more ways to do that. One way would be to use jsonb_to_recordset() to transform the JSON into a record set you can join. Then create a JSON from the result using jsonb_build_object() for the individual members and jsonb_agg() to aggregate them into a JSON array.
SELECT jsonb_agg(jsonb_build_object('id', "u"."id",
'name', "u"."name",
'email', "u"."email",
'role', "m"."role"))
FROM "organisations" "o"
CROSS JOIN LATERAL jsonb_to_recordset(o."members") "m" ("id" integer,
"role" text)
INNER JOIN "users" "u"
ON "u"."id" = "m"."id";
db<>fiddle
Exactly which functions are available depends on the version. But since you said you are considering switching, assuming a more recent version should be fair.

Postgresql: Find row via text in json array of objects

I am trying to find rows in my PostgreSQL database where a JSON column contains a given text.
row schema:
id | name | subitems
-----------------------------------------------------------------
1 | "item 1" | [{name: 'Subitem A'}, {name: 'Subitem B'}]
2 | "item 2" | [{name: 'Subitem C'}, {name: 'Subitem D'}]
My wanted result for query 'Subitem B'
id | name | subitems
-----------------------------------------------------------------
1 | "item 1" | [{name: 'Subitem A'}, {name: 'Subitem B'}]
I can search for the first subitem like this:
WHERE lower(subitems->0->>'name') LIKE '%subitem a%'
But obviously I can't find any other subitem but the first one this way.
I can get all the names of my subitems:
SELECT lower(json_array_elements(subitems)->>'name') FROM ...
But it gives me 2 rows containing the names:
lower
----------------------------------------------------------------
"subitem a"
"subitem b"
What I actually need is 1 row containing the item.
Can anyone tell me how to do that?
You're almost there. Your query:
SELECT lower(json_array_elements(subitems)->>'name') FROM foo;
That gets you what you want to filter against. If you plop that into a subquery, you get the results you're looking for:
SELECT *
FROM foo f1
WHERE 'subitem a' IN
(SELECT lower(json_array_elements(subitems)->>'name')
FROM foo f2 WHERE f1.id = f2.id
);
id | name | subitems
----+--------+------------------------------------------------
1 | item 1 | [{"name": "Subitem A"}, {"name": "Subitem B"}]
(1 row)
Edited to add
Okay, to support LIKE-style matching, you'll have to go a bit deeper, putting a subquery into your subquery. Since that's a bit hard to read, I'm switching to using common table expressions.
WITH all_subitems AS (
SELECT id, json_array_elements(subitems)->>'name' AS subitem
FROM foo),
matching_items AS (
SELECT id
FROM all_subitems
WHERE
lower(subitem) LIKE '%subitem a%')
SELECT *
FROM foo
WHERE
id IN (SELECT id from matching_items);
That should get you what you need. Note that I moved the call to lower up a level, so it's alongside the LIKE. That means the filtering condition is in one spot, so you can switch to a regular expression match, or whatever, more easily.
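For instance, swapping that condition for a case-insensitive regular expression is a one-line change in the CTE version (a sketch using Postgres's ~* operator):
WITH all_subitems AS (
    SELECT id, json_array_elements(subitems)->>'name' AS subitem
    FROM foo),
matching_items AS (
    SELECT id
    FROM all_subitems
    WHERE subitem ~* 'subitem [ab]')   -- regex match instead of lower() + LIKE
SELECT *
FROM foo
WHERE id IN (SELECT id FROM matching_items);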