PostgreSQL update field for each element in array - sql

I wish to do the following task in SQL:
I have a table with columns:
uuid (uuid), word (text), wordList (text[]), uuidList (uuid[])
I have the wordList array, uuid and word columns populated. I wish to update and populate the uuidList like this:
foreach element in wordList
var x = select uuid where word = element;
uuidList.append(x);
Example:
I have a table like this:
+---------+-------+--------------------+----------+
| uuid | word | wordList | uuidList |
+---------+-------+--------------------+----------+
| aaaa... | hello | NULL | NULL |
| bbbb... | world | NULL | NULL |
| cccc... | blah | {'hello', 'world'} | NULL |
+---------+-------+--------------------+----------+
I want it to become like this:
+---------+-------+--------------------+--------------------+
| uuid | word | wordList | uuidList |
+---------+-------+--------------------+--------------------+
| aaaa... | hello | NULL | NULL |
| bbbb... | world | NULL | NULL |
| cccc... | blah | {'hello', 'world'} | {aaaa..., bbbb...} |
+---------+-------+--------------------+--------------------+
I'm quite new to SQL and am confused about how to do it. I don't think I can join a table to itself. I don't know if I should be storing information in a temporary table to achieve this (some related questions I read proposed that)...
Thanks!

You can aggregate all the needed UUIDs in a single statement:
select w1.uid, array_agg(w2.uid order by wl.idx) as uuidlist
from words w1
cross join lateral unnest(w1.wordlist) with ordinality as wl(word,idx)
join words w2 on w2.word = wl.word
where w1.wordlist is not null
and w1.uuidlist is null -- optional
group by w1.uid;
The option with ordinality returns an additional column that indicates the position of the element in the original array. This is needed to aggregate the UUIDs in the correct order.
This returns the following result with your sample data:
uid | uuidlist
-----+------------
cccc | {aaaa,bbbb}
This can be used as the source of an update statement (assuming the column uid is unique):
update words
set uuidlist = t.uuidlist
from (
select w1.uid, array_agg(w2.uid order by wl.idx) as uuidlist
from words w1
cross join lateral unnest(w1.wordlist) with ordinality as wl(word,idx)
join words w2 on w2.word = wl.word
where w1.wordlist is not null
and w1.uuidlist is null -- optional
group by w1.uid
) t
where t.uid = words.uid;
Online example: https://rextester.com/LZUYC57184
(note that the display of arrays is a bit weird in that example)
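If it helps to see what the statement above does step by step, here is a minimal sketch of the same logic in plain Python. It is only a model, not a replacement for the SQL: the rows are dicts, the short ids stand in for real UUIDs, and it assumes word is unique. The function name fill_uuid_lists is made up for the illustration.

```python
# Rows modelled as dicts; the short ids stand in for real UUIDs (hypothetical data).
table = [
    {"uuid": "aaaa", "word": "hello", "wordList": None, "uuidList": None},
    {"uuid": "bbbb", "word": "world", "wordList": None, "uuidList": None},
    {"uuid": "cccc", "word": "blah", "wordList": ["hello", "world"], "uuidList": None},
]


def fill_uuid_lists(rows):
    """Populate uuidList from wordList, keeping the order of wordList."""
    # Plays the role of the join on w2.word = wl.word (assumes word is unique).
    uuid_by_word = {row["word"]: row["uuid"] for row in rows}
    for row in rows:
        if row["wordList"] is None:
            continue  # same as "where w1.wordlist is not null"
        # Walking the list in order is what "with ordinality" plus
        # "order by wl.idx" guarantees in the SQL version. Words without
        # a match are dropped, as they would be by the inner join.
        matched = [uuid_by_word[w] for w in row["wordList"] if w in uuid_by_word]
        if matched:
            row["uuidList"] = matched
    return rows


fill_uuid_lists(table)
```

The point of the model is the ordering: without the position of each element, an aggregate is free to return the UUIDs in any order.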

Related

how to loop an array in string in a where clause

I have an information table with a column of an array in string format. The length is unknown starting from 0. How can I put it in a where clause of PostgreSQL?
* hospital_information_table
| ID | main_name | alternative_name |
| --- | ---------- | ----------------- |
| 111 | 'abc' | 'abe, abx' |
| 222 | 'bbc' | '' |
| 333 | 'cbc' | 'cbe,cbd,cbf,cbg' |

* record
| ID | name | hospital_id |
| --- | ------- | ------------ |
| 1 | 'abc-1' | |
| 2 | 'bbe+2' | |
| 3 | 'cbf*3' | |
For example, this column holds alternative names of hospitals, say 'abc,abd,abe,abf' as the names and '111' as the ID. I have a record with the hospital name 'cbf*3' ('3' is the department name) and I would like to look up its ID. How can I check all names one by one in 'cbe,cbd,cbf,cbg' and get its ID '333'?
--update--
In the example's record table I used '-', '*' and '+' to show that I can't split the name in the record table by a fixed pattern. What I can rely on is that one of the alternative names appears in the record name as a substring, e.g. 'cbf' in 'cbf*3'. I would like to check all names: is 'abe' in 'cbf*3'? No. Is 'abx' in 'cbf*3'? No. Then on to the next row, and so on.
--update--
Thanks for the answers! They are great!
For more details: the original dataset is not in an alphabetic language, and the text in the record name is not separable. It is really hard to find one separator, or even a set of separators. Therefore, the solutions that rely on a regex like '[-*+]' do not work here.
Thanks in advance!
You could use regexp_split_to_array to convert the comma-delimited string to a proper array, and then use the any operator to search inside it:
SELECT r.*, h.id
FROM record r
JOIN hospital_information h ON
SPLIT_PART(r.name, '-', 1) = ANY(REGEXP_SPLIT_TO_ARRAY(h.name, ','))
SQLFiddle demo
Substring can be used with a regular expression to get the hospital name from the record's name.
And String_to_array can transform a CSV string to an array.
SELECT
r.id as record_id
, r.name as record_name
, h.id as hospital_id
FROM record r
LEFT JOIN hospital_information h
ON SUBSTRING(r.name from '^(.*)[+*\-]\w+$') = ANY(STRING_TO_ARRAY(h.alternative_name,',')||h.main_name)
WHERE r.hospital_id IS NULL;
| record_id | record_name | hospital_id |
| --- | --- | --- |
| 1 | abc-1 | 111 |
| 2 | bbe+2 | 222 |
| 3 | cbf*3 | 333 |
Demo on db<>fiddle here
Btw, text[] can be used as the data type of a table column.
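Since the updates to the question say the record name can't be split on a separator, the check really is "does any known name occur as a substring of the record name". Here is a rough sketch of that check in plain Python, using the sample data from the question. The helper names candidate_names and find_hospital_id are made up for the illustration, and returning the first matching hospital is an assumption about what should happen when several hospitals match.

```python
# Sample data copied from the question's hospital_information_table.
hospitals = [
    {"id": 111, "main_name": "abc", "alternative_name": "abe, abx"},
    {"id": 222, "main_name": "bbc", "alternative_name": ""},
    {"id": 333, "main_name": "cbc", "alternative_name": "cbe,cbd,cbf,cbg"},
]


def candidate_names(hospital):
    """Main name plus every non-empty, trimmed alternative name."""
    names = [hospital["main_name"]]
    for part in hospital["alternative_name"].split(","):
        part = part.strip()  # 'abe, abx' has a space after the comma
        if part:  # an empty string would be a substring of everything
            names.append(part)
    return names


def find_hospital_id(record_name, hospitals):
    """Return the id of the first hospital with a name inside record_name."""
    for hospital in hospitals:
        if any(name in record_name for name in candidate_names(hospital)):
            return hospital["id"]
    return None  # no known name occurs in the record name


print(find_hospital_id("cbf*3", hospitals))
```

With the question's sample data, 'bbe+2' matches nothing, because neither 'bbc' nor any alternative name occurs in it. Skipping empty names matters: an empty alternative_name would otherwise match every record.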

How to get a value inside of a JSON that is inside a column in a table in Oracle sql?

Suppose that I have a table named agents_timesheet that has a structure like this:
ID | name | health_check_record | date | clock_in | clock_out
---------------------------------------------------------------------------------------------------------
1 | AAA | {"mental":{"stress":"no", "depression":"no"}, | 6-Dec-2021 | 08:25:07 |
| | "physical":{"other_symptoms":"headache", "flu":"no"}} | | |
---------------------------------------------------------------------------------------------------------
2 | BBB | {"mental":{"stress":"no", "depression":"no"}, | 6-Dec-2021 | 08:26:12 |
| | "physical":{"other_symptoms":"no", "flu":"yes"}} | | |
---------------------------------------------------------------------------------------------------------
3 | CCC | {"mental":{"stress":"no", "depression":"severe"}, | 6-Dec-2021 | 08:27:12 |
| | "physical":{"other_symptoms":"cancer", "flu":"yes"}} | | |
Now I need to get all agents having flu on that day. As for getting the flu value from a single JSON document in Oracle SQL, I can already do that with this statement:
SELECT * FROM JSON_TABLE(
'{"mental":{"stress":"no", "depression":"no"}, "physical":{"fever":"no", "flu":"yes"}}', '$'
COLUMNS (flu VARCHAR2(3) PATH '$.physical.flu')
);
As for getting the values from the column health_check_record, I can do that with a plain SELECT statement.
But how do I get the values of flu from the JSON in the health_check_record column of that table?
Additional question
Based on the table, how can I retrieve full list of other_symptoms, then it will get me this kind of output:
ID | name | other_symptoms
-------------------------------
1 | AAA | headache
2 | BBB | no
3 | CCC | cancer
You can use JSON_EXISTS() function.
SELECT *
FROM agents_timesheet
WHERE JSON_EXISTS(health_check_record, '$.physical?(@.flu == "yes")');
There is also the "plain old way": skip JSON parsing and treat the column like a standard VARCHAR one. This will not work in 100% of cases, but if your data always looks the way you described, it might be sufficient.
SELECT *
FROM agents_timesheet
WHERE health_check_record LIKE '%"flu":"yes"%';
How to get the values of flu in the JSON in the health_check_record of that table?
From Oracle 12, to get the values you can use JSON_TABLE with a correlated CROSS JOIN to the table:
SELECT a.id,
a.name,
j.*,
a."DATE",
a.clock_in,
a.clock_out
FROM agents_timesheet a
CROSS JOIN JSON_TABLE(
a.health_check_record,
'$'
COLUMNS (
mental_stress VARCHAR2(3) PATH '$.mental.stress',
mental_depression VARCHAR2(3) PATH '$.mental.depression',
physical_fever VARCHAR2(3) PATH '$.physical.fever',
physical_flu VARCHAR2(3) PATH '$.physical.flu'
)
) j
WHERE physical_flu = 'yes';
db<>fiddle here
You can use "dot notation" to access data from a JSON column. Like this:
select "DATE", id, name
from agents_timesheet t
where t.health_check_record.physical.flu = 'yes'
;
DATE ID NAME
----------- --- ----
06-DEC-2021 2 BBB
Note that this approach requires that you use an alias for the table name (so you can use it in accessing the JSON data).
For testing I used the data posted by MT0 on dbfiddle. I am not a big fan of double-quoted column names; use something else for "DATE", such as dt or date_.
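To make the path logic concrete (including the additional question about other_symptoms), here is a small sketch in plain Python rather than Oracle SQL. It only models what the path $.physical.<key> selects; the tuple layout and the helper name physical_value are assumptions made for the illustration, and the rows are the ones from the question's table.

```python
import json

# (id, name, health_check_record) rows copied from the question's table.
timesheet = [
    (1, "AAA", '{"mental":{"stress":"no","depression":"no"},'
               '"physical":{"other_symptoms":"headache","flu":"no"}}'),
    (2, "BBB", '{"mental":{"stress":"no","depression":"no"},'
               '"physical":{"other_symptoms":"no","flu":"yes"}}'),
    (3, "CCC", '{"mental":{"stress":"no","depression":"severe"},'
               '"physical":{"other_symptoms":"cancer","flu":"yes"}}'),
]


def physical_value(record, key):
    """Model of the JSON path $.physical.<key>; None when the key is absent."""
    return json.loads(record).get("physical", {}).get(key)


# Agents whose $.physical.flu is "yes".
agents_with_flu = [
    (agent_id, name)
    for agent_id, name, record in timesheet
    if physical_value(record, "flu") == "yes"
]

# One row per agent with the value of $.physical.other_symptoms.
other_symptoms = [
    (agent_id, name, physical_value(record, "other_symptoms"))
    for agent_id, name, record in timesheet
]
```

In SQL terms, the second list corresponds to selecting the other_symptoms path instead of filtering on the flu path, for example by adding a column with PATH '$.physical.other_symptoms' to the JSON_TABLE query.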

Postgres jsonb. Heterogenous json fields

If I have a table with a single jsonb column and the table has data like this:
[{"body": {"project-id": "111"}},
{"body": {"my-org.project-id": "222"}},
{"body": {"other-org.project-id": "333"}}]
Basically it stores project-id differently for different rows.
Now I need a query where the value under data->'body', whatever its key is, gets coalesced across rows into a single field 'project-id'. How can I do that?
e.g.: if I do something like this:
select data->'body'->'project-id' projectid from mytable
it will return something like:
| projectid |
| 111 |
But I also want project-id's in other rows too, but I don't want additional columns in the results. i.e, I want this:
| projectid |
| 111 |
| 222 |
| 333 |
I understand that each of your rows contains a json object, with a nested object whose key varies over rows, and whose value you want to acquire.
Assuming the 'body' always has a single key, you could do:
select jsonb_extract_path_text(t.js -> 'body', x.k) projectid
from t
cross join lateral jsonb_object_keys(t.js -> 'body') as x(k)
The lateral join on jsonb_object_keys() extracts all keys in the object as rows. Then we use jsonb_extract_path_text() to get the corresponding value.
Demo on DB Fiddle:
with t as (
select '{"body": {"project-id": "111"}}'::jsonb js
union all select '{"body": {"my-org.project-id": "222"}}'::jsonb
union all select '{"body": {"other-org.project-id": "333"}}'::jsonb
)
select jsonb_extract_path_text(t.js -> 'body', x.k) projectid
from t
cross join lateral jsonb_object_keys(t.js -> 'body') as x(k)
| projectid |
| :--------- |
| 111 |
| 222 |
| 333 |
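For readers who think more easily in procedural terms, the lateral join can be sketched in plain Python like this. It is only a model of the query: the list documents stands in for the jsonb column and the function name project_ids is made up.

```python
import json

# Stand-in for the jsonb column; one JSON document per row.
documents = [
    '{"body": {"project-id": "111"}}',
    '{"body": {"my-org.project-id": "222"}}',
    '{"body": {"other-org.project-id": "333"}}',
]


def project_ids(rows):
    """Collect the value stored under each key of "body", whatever the key is."""
    ids = []
    for raw in rows:
        body = json.loads(raw)["body"]
        # Looping over the keys plays the role of jsonb_object_keys();
        # the lookup body[key] plays the role of jsonb_extract_path_text().
        for key in body:
            ids.append(body[key])
    return ids


print(project_ids(documents))
```

As in the SQL version, a body with more than one key would produce one output value per key.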

Check if a string is composed by several substrings from a whitelist Table

Is there any way to select the expected records with Access query or SQL?
Environment
Access 2010
Table "words"
actual number of records: on the order of tens of thousands
id | word |
---|---------------|
1 | green |
2 | light |
3 | greenlight |
4 | redlight |
5 | greenLEDlight |
6 | reddiamond |
Table "whitelist"
actual number of records: on the order of thousands
listword |
-------- |
green |
light |
Expected result
1) Select the following, excluding any "word" that consists only of "listword" values, including concatenations of them(*)
id | word |
---|---------------|
4 | redlight |
5 | greenLEDlight |
6 | reddiamond |
2) Or, select only the "word" values that consist only of "listword" values, including concatenations of them(*)
id | word |
---|---------------|
1 | green |
2 | light |
3 | greenlight |
(*) "green" or "light" or "greenlight" or "lightgreen"
What I tried
SELECT words.id, words.word
FROM words, whitelist
WHERE not exists (
SELECT listword
FROM whitelist
WHERE word Like "*" & [listword] & "*"
)
GROUP BY words.id, words.word;
Result
id | word |
---|---------------|
6 | reddiamond |
Do these two queries return what you are looking for?
1)
SELECT id, word
FROM words
WHERE not exists (
SELECT *
FROM whitelist
WHERE listword = word
)
and not exists (
SELECT *
FROM whitelist w1, whitelist w2
WHERE w1.listword & w2.listword = word
)
2)
SELECT id, word
FROM words
WHERE exists (
SELECT *
FROM whitelist
WHERE listword = word
)
or exists (
SELECT *
from whitelist w1, whitelist w2
WHERE w1.listword & w2.listword = word
)
This code checks for single words on the whitelist and for "exact" pairs like "greenlight", but it will fail if you also need to check triplets like "greenlightgreen". You can add more subqueries that cross the whitelist table three or more times, but with thousands of records that will be awfully slow.
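If concatenations of any length have to be recognised, one option outside of Access SQL is the classic "word break" dynamic programming check. The sketch below is in plain Python and is only an illustration under some assumptions: the function name is made up, the comparison is case-sensitive (Access text comparison usually is not), and the data is the sample from the question plus the "greenlightgreen" triplet.

```python
def is_whitelist_concatenation(word, whitelist):
    """True when word is one or more whitelist entries glued together."""
    # reachable[i] is True when word[:i] can be built from whitelist entries.
    reachable = [True] + [False] * len(word)
    for end in range(1, len(word) + 1):
        for listword in whitelist:
            start = end - len(listword)
            if start >= 0 and reachable[start] and word[start:end] == listword:
                reachable[end] = True
                break
    return reachable[len(word)]


whitelist = ["green", "light"]
words = ["green", "light", "greenlight", "redlight",
         "greenLEDlight", "reddiamond", "greenlightgreen"]

# Result 2) from the question: words made only of whitelist entries.
only_listwords = [w for w in words if is_whitelist_concatenation(w, whitelist)]
# Result 1) from the question: everything else.
others = [w for w in words if not is_whitelist_concatenation(w, whitelist)]
```

Each word costs roughly (length of the word) x (size of the whitelist) comparisons, no matter how many whitelist entries are concatenated, so there is no need for one extra self-join per additional entry.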

postgres - pivot query with array values

Suppose I have this table:
Content
+----+---------+
| id | title |
+----+---------+
| 1 | lorem |
+----+---------+
And this one:
Fields
+----+------------+----------+-----------+
| id | id_content | name | value |
+----+------------+----------+-----------+
| 1 | 1 | subtitle | ipsum |
| 2 | 1 | tags | tag1 |
| 3 | 1 | tags | tag2 |
| 4 | 1 | tags | tag3 |
+----+------------+----------+-----------+
The thing is: I want to query the content, transforming all the rows from "Fields" into columns, to get something like:
+----+-------+----------+---------------------+
| id | title | subtitle | tags |
+----+-------+----------+---------------------+
| 1 | lorem | ipsum | [tag1,tag2,tag3] |
+----+-------+----------+---------------------+
Also, subtitle and tags are just examples. I can have as many fields as I want, and each of them may or may not be an array.
But I haven't found a way to convert the repeated "name" values into an array, let alone without turning "subtitle" into an array as well. If that's not possible, "subtitle" could also become an array and I could change it later in the code, but I need at least to group everything somehow. Any ideas?
You can use array_agg, e.g.
SELECT id_content, array_agg(value)
FROM fields
WHERE name = 'tags'
GROUP BY id_content
If you need the subtitle, too, use a self-join. I have a subselect to cope with subtitles that don't have any tags without returning arrays filled with NULLs, i.e. {NULL}.
SELECT f1.id_content, f1.value, f2.value
FROM fields f1
LEFT JOIN (
SELECT id_content, array_agg(value) AS value
FROM fields
WHERE name = 'tags'
GROUP BY id_content
) f2 ON (f1.id_content = f2.id_content)
WHERE f1.name = 'subtitle';
See http://www.postgresql.org/docs/9.3/static/functions-aggregate.html for details.
If you have access to the tablefunc module, another option is to use crosstab as pointed out by Houari. You can make it return arrays and non-arrays with something like this:
SELECT id_content, unnest(subtitle), tags
FROM crosstab('
SELECT id_content, name, array_agg(value)
FROM fields
GROUP BY id_content, name
ORDER BY 1, 2
') AS ct(id_content integer, subtitle text[], tags text[]);
However, crosstab requires that the values always appear in the same order. For instance, if the first group (with the same id_content) doesn't have a subtitle and only has tags, the tags will be unnested and will appear in the same column with the subtitles.
See also http://www.postgresql.org/docs/9.3/static/tablefunc.html
If the subtitle value is the only "constant" that you want to separate, you can do:
SELECT * FROM crosstab
(
'SELECT content.id,name,array_to_string(array_agg(value),'','')::character varying FROM content inner join
(
select * from fields where fields.name = ''subtitle''
union all
select * from fields where fields.name <> ''subtitle''
) fields_ordered
on fields_ordered.id_content = content.id group by content.id,name'
)
AS
(
id integer,
content_name character varying,
tags character varying
);
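If the pivot ends up being done in application code instead (the question mentions changing things "later in the code"), the grouping itself is short. Here is a rough sketch in plain Python using the sample data from the question. The function name pivot is made up, and the rule "one value stays a scalar, several values become a list" is an assumption; a tags field that happens to have a single tag would come out as a scalar under this rule.

```python
from collections import defaultdict

# Sample data copied from the question's Content and Fields tables.
content = [{"id": 1, "title": "lorem"}]
fields = [
    {"id": 1, "id_content": 1, "name": "subtitle", "value": "ipsum"},
    {"id": 2, "id_content": 1, "name": "tags", "value": "tag1"},
    {"id": 3, "id_content": 1, "name": "tags", "value": "tag2"},
    {"id": 4, "id_content": 1, "name": "tags", "value": "tag3"},
]


def pivot(content_rows, field_rows):
    """Turn the field rows into extra columns on each content row."""
    # grouped[id_content][name] collects every value for that field name,
    # which is what array_agg(value) ... GROUP BY id_content, name does.
    grouped = defaultdict(lambda: defaultdict(list))
    for field in field_rows:
        grouped[field["id_content"]][field["name"]].append(field["value"])

    result = []
    for row in content_rows:
        pivoted = dict(row)
        for name, values in grouped[row["id"]].items():
            # Assumption: a single value stays scalar, repeats become a list.
            pivoted[name] = values[0] if len(values) == 1 else values
        result.append(pivoted)
    return result


print(pivot(content, fields))
```

Unlike crosstab, this does not need the field names to be known in advance or to arrive in a fixed order.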