Postgresql join on array and transform to json - sql

I would like to make a join on array containing ids and transform the result of this subselect into json (json array).
I have the following model:

The lnam_refs column contains identifiers that refer to values in the lnam column.
I would like to transform the lnam_refs column into something like [row_to_json(), row_to_json()] or [] or [row_to_json()] or …
I tried several methods but I cannot get a clean result…
To try to be clearer :
Table in input:
id | label | lnam | lnam_refs
--------+----------------------+----------+-----------------------
1 | 'master1' | 11111111 | {33333333}
2 | 'master2' | 22222222 | {44444444,55555555}
3 | 'slave1' | 33333333 | {}
4 | 'slave2' | 44444444 | {}
5 | 'slave3' | 55555555 | {}
6 | 'master3' | 66666666 | {}
Results Expected:
id | label | lnam | lnam_refs | slaves
--------+----------------------+----------+-----------------------+---------------------------------
1 | 'master1' | 11111111 | {33333333} | [ {id: 3, label: 'slave1', lnam: 33333333, lnam_refs: []} ]
2 | 'master2' | 22222222 | {44444444,55555555} | [ {id: 4, label: 'slave2', lnam: 44444444, lnam_refs: []}, {id: 5, label: 'slave3', lnam: 55555555, lnam_refs: []} ]
6 | 'master3' | 66666666 | {} | []
Thanks for your help !

Here's one way to do it. (I created a table called t with the data you supplied.)
SELECT *, (SELECT JSON_AGG(ROW_TO_JSON(t2)) FROM t t2 WHERE label LIKE 'slave%' AND lnam = ANY(t1.lnam_refs)) AS slaves
FROM t t1
WHERE label LIKE 'master%'
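I use the label field in the WHERE clause as I don't know how else you're determining which records are masters and which are slaves. Note that JSON_AGG returns NULL when no rows match, which is why the last column is empty for master3; wrap the subquery in COALESCE(..., '[]'::json) if you need [] there.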
Result:
1;master1;11111111;{33333333};[{"id":3,"label":"slave1","lnam":33333333,"lnam_refs":[]}]
2;master2;22222222;{44444444,55555555};[{"id":4,"label":"slave2","lnam":44444444,"lnam_refs":[]}, {"id":5,"label":"slave3","lnam":55555555,"lnam_refs":[]}]
6;master3;66666666;{};
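For readers who want to check the expected shape outside the database, the logic of the query above (match lnam against each master's lnam_refs, then aggregate the matches into an array) can be sketched in plain Python. This is only an illustration on the question's sample rows, not PostgreSQL code; the function name attach_slaves is made up for the example, and lnam_refs is modelled as a Python list. Unlike JSON_AGG over zero rows, the sketch yields an empty list for master3, which is what the question asked for.

```python
import json

# Sample rows copied from the question; lnam_refs is modelled as a Python list.
rows = [
    {"id": 1, "label": "master1", "lnam": 11111111, "lnam_refs": [33333333]},
    {"id": 2, "label": "master2", "lnam": 22222222, "lnam_refs": [44444444, 55555555]},
    {"id": 3, "label": "slave1", "lnam": 33333333, "lnam_refs": []},
    {"id": 4, "label": "slave2", "lnam": 44444444, "lnam_refs": []},
    {"id": 5, "label": "slave3", "lnam": 55555555, "lnam_refs": []},
    {"id": 6, "label": "master3", "lnam": 66666666, "lnam_refs": []},
]


def attach_slaves(rows):
    """Return the master rows, each with a 'slaves' list of the rows it references."""
    # Index the candidate slave rows by lnam (same filter as label LIKE 'slave%').
    slaves_by_lnam = {r["lnam"]: r for r in rows if r["label"].startswith("slave")}
    result = []
    for row in rows:
        if not row["label"].startswith("master"):
            continue  # same filter as WHERE label LIKE 'master%'
        # Equivalent of: lnam = ANY(t1.lnam_refs)
        slaves = [slaves_by_lnam[ref] for ref in row["lnam_refs"] if ref in slaves_by_lnam]
        result.append({**row, "slaves": slaves})
    return result


masters = attach_slaves(rows)
for master in masters:
    print(master["id"], master["label"], json.dumps(master["slaves"]))
```

The slaves here come out in the order of lnam_refs; the SQL subquery makes no such ordering promise unless an ORDER BY is added inside JSON_AGG.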

Related

Snowflake - using json_parse and select distinct to un-nested column and compare with another column

I have 2 columns: one is a nested column named custom_field and the other is sales_id. I want to compare the sales_id_2 values in custom_field with the sales_id column.
I've tried this but it didn't work:
select distinct parse_json(custom_fields) as CUSTOM_FIELDS
from my_table where custom_fields:sales_id_2 = sales_id;
but I get the error:
SQL compilation error: error line 1 at position 111 Invalid argument
types for function 'GET': (VARCHAR(16777216), VARCHAR(2)).
+-----------------------------------------------------+
| custom_field | sales_id |
|-----------------------------------------------------|
| | |
| { | 235324115 |
| "sales_id_2": 235324115, | 1234351 |
| "g": 12, | |
| "r": 255 | |
| } | |
| { | 678322341 |
| "sales_id_2": 1234351, | 5648561 |
| "g": 13, | |
| "r": 254 | |
| } | |
I'm hoping to see empty results, because I believe sales_id_2 is the same as sales_id
Note that :: is the cast operator, whereas a single : extracts a field, and you are applying that JSON operation to a VARCHAR column. Parse the column first, like this:
select distinct parse_json(custom_fields) as CUSTOM_FIELDS from my_table where parse_json(custom_fields):sales_id_2 = sales_id;
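The underlying point, that JSON stored as a string has to be parsed before a field can be read from it, can be illustrated outside Snowflake with Python's json module. The two sample rows below are hypothetical pairings loosely based on the question's table, and matching_rows is a name made up for this sketch.

```python
import json

# Hypothetical rows: custom_fields holds JSON as plain text, like the VARCHAR column.
rows = [
    {"custom_fields": '{"sales_id_2": 235324115, "g": 12, "r": 255}', "sales_id": 235324115},
    {"custom_fields": '{"sales_id_2": 1234351, "g": 13, "r": 254}', "sales_id": 5648561},
]


def matching_rows(rows):
    """Keep the parsed custom_fields of rows whose sales_id_2 equals sales_id."""
    matches = []
    for row in rows:
        # A plain string has no fields; parse it first (analogous to parse_json).
        parsed = json.loads(row["custom_fields"])
        if parsed.get("sales_id_2") == row["sales_id"]:
            matches.append(parsed)
    return matches


matches = matching_rows(rows)
print(matches)
```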

Postgresql order by out of order

I have a database from which I need to retrieve the data in the same order as it was inserted into the table. The table is named bible. When I type table bible; in psql, it prints the data in the order it was inserted, but when I query it, some rows are always out of order, as in the example below:
table bible
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 1
day | 1
book | Genesis
chapter | 1
verse | 1
text | In the beginning God created the heavens and the earth.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Genesis1.1&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 2
day | 1
book | John
chapter | 1
verse | 1
text | In the beginning was the Word, and the Word was with God, and the Word was God.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=John1.1&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 3
day | 1
book | John
chapter | 1
verse | 2
text | The same was in the beginning with God.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=John1.2&key=dc5e2d416f46150bf6ceb21d884b644f
Everything is in order there, but when I query the same data, for example with select * from bible where day='1', select * from bible where day='1' order by day, or select * from bible where day='1' order by day, id;, I always get some rows out of order, either in the selected day (here 1) or in any other day.
I have been using Django to interface with the Postgres database, but since I found this problem I have tried querying with plain SQL, and I still get rows out of order, although they all have unique ids, which I verified with select count(distinct id), count(id) from bible;
-[ RECORD 1 ]------------------------------------------------------------------------------------------------------
id | 1
day | 1
book | Genesis
chapter | 1
verse | 1
text | In the beginning God created the heavens and the earth.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Genesis1.1&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 10
day | 1
book | Colossians
chapter | 1
verse | 18
text | And he is the head of the body, the church: who is the beginning, the firstborn from the dead; that in all things he might have the preeminence.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Colossians1.18&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 11
day | 1
book | Genesis
chapter | 1
verse | 2
text | And the earth was waste and void; and darkness was upon the face of the deep: and the Spirit of God moved upon the face of the waters.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Genesis1.2&key=dc5e2d416f46150bf6ceb21d884b644f
As you can see above, the ids come back out of order: 1, 10, 11.
my table
Table "public.bible";
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------+------+-----------+----------+---------+----------+--------------+-------------
id | text | | | | extended | |
day | text | | | | extended | |
book | text | | | | extended | |
chapter | text | | | | extended | |
verse | text | | | | extended | |
text | text | | | | extended | |
link | text | | | | extended | |
Access method: heap
The id field is of type text because I used pandas's to_sql() method to populate the bible table. I tried dropping the id column and adding it again as a primary key with ALTER TABLE bible ADD COLUMN id SERIAL PRIMARY KEY; but the data is still returned out of order.
Is there any way I can retrieve the data ordered by id, without some of the rows being out of order? Thank you in advance!
Thou shalt cast thy id to integer to order it as number.
SELECT * FROM bible ORDER BY cast(id AS integer);
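The difference between text ordering and numeric ordering is easy to reproduce. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example runs without a server); the table is a cut-down, hypothetical version of bible that holds only a text id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bible (id TEXT)")
conn.executemany(
    "INSERT INTO bible (id) VALUES (?)",
    [("1",), ("2",), ("3",), ("10",), ("11",)],
)

# Text values are compared character by character, so '10' sorts before '2'.
text_order = [r[0] for r in conn.execute("SELECT id FROM bible ORDER BY id")]

# Casting to integer in the ORDER BY restores numeric order.
numeric_order = [
    r[0]
    for r in conn.execute(
        "SELECT CAST(id AS INTEGER) FROM bible ORDER BY CAST(id AS INTEGER)"
    )
]

print(text_order)
print(numeric_order)
conn.close()
```

The cast has to be evaluated for every row on each query, which is one reason to store the column as an integer in the first place.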
While @jordanvrtanoski is correct, the way to do this in Django is:
>>> Bible.objects.extra(select={'id': 'CAST(id AS INTEGER)'}).order_by('id').values('id')
<QuerySet [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 10}, {'id': 20}]>
Side note: If you want to filter on day as an example, you can do this:
>>> Bible.objects.extra(select={
'id': 'CAST(id AS INTEGER)',
'day': 'CAST(day AS INTEGER)'}
).order_by('id').values('id', 'day').filter(day=2)
<QuerySet [{'id': 2, 'day': 2}, {'id': 10, 'day': 2}, {'id': 11, 'day': 2}, {'id': 20, 'day': 2}]>
Otherwise you get this issue: (notice 1 is followed by 10 and not 2)
>>> Bible.objects.order_by('id').values('id')
<QuerySet [{'id': '1'}, {'id': '10'}, {'id': '2'}, {'id': '20'}, {'id': '3'}]>
I HIGHLY suggest you DO NOT do any of this, and set your tables correctly (have the correct column types and not have everything as text), or your query performance is going to suck.. BIG TIME
Building on the answers of @jordanvrtanoski and @Javier Buzzi, and some searching online: the issue is that the ids are of type TEXT (the same happens with VARCHAR), so you need to convert the id column to type INTEGER, as follows:
ALTER TABLE bible ALTER COLUMN id TYPE integer USING (id::integer);
Now here is my table
Table "public.bible"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------+---------+-----------+----------+-----------------------------------------+----------+--------------+-------------
id | integer | | | nextval('bible_id_seq'::regclass) | plain | |
day | text | | | | extended | |
book | text | | | | extended | |
chapter | text | | | | extended | |
verse | text | | | | extended | |
text | text | | | | extended | |
link | text | | | | extended | |
Indexes:
"lesson_unique_id" UNIQUE CONSTRAINT, btree (id)
Referenced by:
TABLE "notes_note" CONSTRAINT "notes_note_verse_id_5586a4bf_fk" FOREIGN KEY (verse_id) REFERENCES days_lesson(id) DEFERRABLE INITIALLY DEFERRED
Access method: heap
Hope this helps other people, and thank you everyone!

SELECT 1 ID and all belonging elements

I am trying to write a select query (JSON or otherwise) that returns the result in the following form:
one row per main_message_id, containing all the messages that belong to it. (Like the bottom image.) JSON is not a requirement; if it works some other way, that is fine too.
I store the data like this:
+-----------------+---------+----------------+
| main_message_id | message | sub_message_id |
+-----------------+---------+----------------+
| 1 | test 1 | 1 |
| 1 | test 2 | 2 |
| 1 | test 3 | 3 |
| 2 | test 4 | 4 |
| 2 | test 5 | 5 |
| 3 | test 6 | 6 |
+-----------------+---------+----------------+
I would like to create a query which gives me back the data like this:
+-----------------+-----------------------+--+
| main_message_id | message | |
+-----------------+-----------------------+--+
| 1 | {test1}{test2}{test3} | |
| 2 | {test4}{test5}{test6} | |
| 3 | {test7}{test8}{test9} | |
+-----------------+-----------------------+--+
You can use json_agg() for that:
select main_message_id, json_agg(message) as messages
from the_table
group by main_message_id;
Note that {test1}{test2}{test3} is invalid JSON; the above will return a valid JSON array, e.g. ["test 1", "test 2", "test 3"]
If you just want a comma-separated list, use string_agg():
select main_message_id, string_agg(message, ', ') as messages
from the_table
group by main_message_id;
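If the grouping has to happen in application code instead, the same two results can be sketched in Python. The rows are the sample data from the question; group_messages is a hypothetical helper, and json.dumps and ', '.join merely play the roles of json_agg() and string_agg() here.

```python
import json
from collections import defaultdict

# (main_message_id, message, sub_message_id) rows from the question.
rows = [
    (1, "test 1", 1),
    (1, "test 2", 2),
    (1, "test 3", 3),
    (2, "test 4", 4),
    (2, "test 5", 5),
    (3, "test 6", 6),
]


def group_messages(rows):
    """Collect the messages of each main_message_id, keeping input order."""
    grouped = defaultdict(list)
    for main_message_id, message, _sub_message_id in rows:
        grouped[main_message_id].append(message)
    return dict(grouped)


grouped = group_messages(rows)
as_json = {key: json.dumps(msgs) for key, msgs in grouped.items()}  # like json_agg
as_csv = {key: ", ".join(msgs) for key, msgs in grouped.items()}    # like string_agg

for key in grouped:
    print(key, as_json[key], "|", as_csv[key])
```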

Transform DataFrame to list of dictionaries where column name is a value of key:value pair

I have a pandas DataFrame as follows:
|---------------------|------------------|------------------|
| A | B | C |
|---------------------|------------------|------------------|
| abc | 34 | 8 |
|---------------------|------------------|------------------|
| abc | | 12 |
|---------------------|------------------|------------------|
| abc | 6 | 321 |
|---------------------|------------------|------------------|
I would like to convert it to a list of dictionaries like this:
[
{
name: "A",
value: "abc"
},
{
name: "B",
value: 34
},
{
name: "C",
value: 8
}
]
There are several ways to do it with a lot of data manipulation, but I am looking for a straightforward one, if it exists.
Thank you for your help.
[[{'name':k, 'value':v} for k,v in x.items()] for x in df.to_dict(orient='records')]
This would probably work, not sure it is straightforward though.
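To see what that one-liner produces without installing pandas, the sketch below starts from the list that df.to_dict(orient='records') would return for the first and third rows of the table. The records are written out by hand here, which is an assumption; with the missing value in B, pandas would normally hold that column as floats. The result is one list of name/value dictionaries per row, so the list shown in the question corresponds to the first element.

```python
# Hand-written stand-in for df.to_dict(orient='records') on rows 1 and 3.
records = [
    {"A": "abc", "B": 34, "C": 8},
    {"A": "abc", "B": 6, "C": 321},
]

# Same comprehension as in the answer: one inner list per DataFrame row.
per_row = [[{"name": k, "value": v} for k, v in rec.items()] for rec in records]

# The structure asked for in the question is the list for a single row.
first_row = per_row[0]
print(first_row)
```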

How to create hierarchical json object using ltree query results? (postgresql)

I'm trying to create a storage system for custom categories using postgres.
After looking around for potential solutions I settled on trying to use ltree;
Here is an example of raw data below;
+----+---------+---------------------------------+-----------+
| id | user_id | path | name |
+----+---------+---------------------------------+-----------+
| 1 | 1 | root.test | test |
| 2 | 1 | root.test.inbox | inbox |
| 3 | 1 | root.personal | personal |
| 4 | 1 | root.project | project |
| 5 | 1 | root.project.idea | idea |
| 6 | 1 | root.personal.events | events |
| 7 | 1 | root.personal.events.january | january |
| 8 | 1 | root.project.objective | objective |
| 9 | 1 | root.personal.events.february | february |
| 10 | 1 | root.project.objective.january | january |
| 11 | 1 | root.project.objective.february | february |
+----+---------+---------------------------------+-----------+
I thought that it might be easier to first order the results and remove the top level from the returned path. Using;
select id, name, subpath(path, 1) as path, nlevel(subpath(path, 1)) as level from testLtree order by level, path
I get;
+----+-----------+----------------------------+-------+
| id | name | path | level |
+----+-----------+----------------------------+-------+
| 3 | personal | personal | 1 |
| 4 | project | project | 1 |
| 1 | test | test | 1 |
| 6 | events | personal.events | 2 |
| 5 | idea | project.idea | 2 |
| 8 | objective | project.objective | 2 |
| 2 | inbox | test.inbox | 2 |
| 9 | february | personal.events.february | 3 |
| 7 | january | personal.events.january | 3 |
| 11 | february | project.objective.february | 3 |
| 10 | january | project.objective.january | 3 |
+----+-----------+----------------------------+-------+
I'm hoping to be able to transform this result into a set of JSON data somehow. I would like an output similar to this;
personal: {
id: 3,
name: 'personal',
children: {
events: {
id: 6,
name: 'events',
children: {
january: {
id: 7,
name: 'january',
children: null
},
february: {
id: 9,
name: 'february',
children: null
}
}
}
}
},
project: {
id: 4,
name: 'project',
children: {
idea: {
id: 5,
name: 'idea',
children: null
},
objective: {
id: 8,
name: 'objective',
children: {
january: {
id: 10,
name: 'january',
children: null
},
february: {
id: 11,
name: 'february',
children: null
}
}
}
}
},
test: {
id: 1,
name: 'test',
children: {
inbox: {
id: 2,
name: 'inbox',
children: null
}
}
}
I've been looking around for the best way to do this but haven't come across any solutions that make sense to me. However, as I am new to postgres and SQL in general, this is to be expected.
I think I may have to use a recursive query? I'm a bit confused over what the best method/execution of this would be. Any help/advice is much appreciated! and any further questions please ask.
I've put everything into a sqlfiddle below;
http://sqlfiddle.com/#!17/1713e/5
I ran into the same problem as you. I had a large struggle with this in PostgreSQL and it became overly complex to solve. Since I'm using Django (Python framework), I decided to solve it using Python. In case it can help anyone in my same situation, I would like to share the code:
https://gist.github.com/eherrerosj/4685e3dc843e94f3ef8645d31dbe490c
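For reference, one way to do this kind of post-processing in Python is sketched below. It is not the code from the gist above; it is a minimal illustration that assumes the rows arrive as (id, name, path) tuples with the leading root label already stripped, as in the second query of the question, and that every parent path is present as a row of its own. build_tree is a hypothetical name.

```python
import json

# (id, name, path) rows as returned by the question's subpath() query.
rows = [
    (3, "personal", "personal"),
    (4, "project", "project"),
    (1, "test", "test"),
    (6, "events", "personal.events"),
    (5, "idea", "project.idea"),
    (8, "objective", "project.objective"),
    (2, "inbox", "test.inbox"),
    (9, "february", "personal.events.february"),
    (7, "january", "personal.events.january"),
    (11, "february", "project.objective.february"),
    (10, "january", "project.objective.january"),
]


def build_tree(rows):
    """Nest rows by their dotted path; nodes without children keep children = None."""
    tree = {}
    # Process shorter paths first, so every parent exists before its children.
    for row_id, name, path in sorted(rows, key=lambda r: r[2].count(".")):
        labels = path.split(".")
        level = tree
        for label in labels[:-1]:
            parent = level[label]  # raises KeyError if a parent row is missing
            if parent["children"] is None:
                parent["children"] = {}
            level = parent["children"]
        level[labels[-1]] = {"id": row_id, "name": name, "children": None}
    return tree


tree = build_tree(rows)
print(json.dumps(tree, indent=2))
```

Keying each level by the last path label mirrors the output shape requested in the question; a list of children would work just as well if the labels are not needed as keys.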