How to loop over an array stored in a string in a WHERE clause - SQL

I have an information table with a column that stores an array as a comma-separated string. The number of elements is unknown and may be zero. How can I use this column in a WHERE clause in PostgreSQL?
* hospital_information_table
| ID | main_name | alternative_name |
| --- | ---------- | ----------------- |
| 111 | 'abc' | 'abe, abx' |
| 222 | 'bbc' | '' |
| 333 | 'cbc' | 'cbe,cbd,cbf,cbg' |
​
​
* record
| ID | name | hospital_id |
| --- | ------- | ------------ |
| 1 | 'abc-1' | |
| 2 | 'bbe+2' | |
| 3 | 'cbf*3' | |
​
For example, this column holds alternative names of hospitals: say 'abc,abd,abe,abf' in the alternative_name column with ID '111'. Now I have a record with the hospital name 'cbf*3' ('3' is the department name) and I would like to find its hospital ID. How can I check the names in 'cbe,cbd,cbf,cbg' one by one and get the ID '333'?
--update--
In the example record table I used '-', '*', and '+' to show that I cannot split the record name on any fixed pattern. What I can guarantee is that one of the alternative names may appear in the record name (as a substring), e.g. 'cbf' in 'cbf*3'. I would like to check all names in turn: is 'abe' in 'cbf*3'? No. Is 'abx' in 'cbf*3'? No. Then move on to the next row, and so on.
--update--
Thanks for the answers! They are great!
For more details: the original dataset is not in an alphabetic language, and the text in the record name is not cleanly separable; it is really hard to find one separator or a set of separators. Therefore the solutions that use a regex like '[-*+]' will not work here.
Thanks in advance!

You could use regexp_split_to_array to convert the comma-delimited string to a proper array, and then use the ANY operator to search inside it:
SELECT r.*, h.id
FROM record r
JOIN hospital_information h ON
SPLIT_PART(r.name, '-', 1) = ANY(REGEXP_SPLIT_TO_ARRAY(h.name, ','))
SQLFiddle demo

Substring can be used with a regular expression to get the hospital name from the record's name.
And String_to_array can transform a CSV string to an array.
SELECT
r.id as record_id
, r.name as record_name
, h.id as hospital_id
FROM record r
LEFT JOIN hospital_information h
ON SUBSTRING(r.name from '^(.*)[+*\-]\w+$') = ANY(STRING_TO_ARRAY(h.alternative_name,',')||h.main_name)
WHERE r.hospital_id IS NULL;
| record_id | record_name | hospital_id |
| --------- | ----------- | ----------- |
| 1 | abc-1 | 111 |
| 2 | bbe+2 | 222 |
| 3 | cbf*3 | 333 |
Demo on db<>fiddle here
Btw, text[] can be used as a column datatype in a table.
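Given the update that the record name cannot be split on a fixed separator, an alternative is to test each alternative name (plus the main name) as a substring of the record name. This is only a minimal sketch, assuming the table and column names from the question:

-- Hedged sketch: match a record when any alternative name (or the main name)
-- appears anywhere inside the record's name, instead of splitting the record name.
-- btrim handles stray spaces around the comma-separated names (e.g. 'abe, abx').
SELECT r.id   AS record_id,
       r.name AS record_name,
       h.id   AS hospital_id
FROM record r
JOIN hospital_information_table h
  ON EXISTS (
       SELECT 1
       FROM unnest(STRING_TO_ARRAY(h.alternative_name, ',') || h.main_name) AS alt(nm)
       WHERE btrim(alt.nm) <> ''
         AND r.name LIKE '%' || btrim(alt.nm) || '%'
     );

A plain substring match like this can produce false positives if one hospital's name happens to occur inside another record's name, so it is worth checking the data for such collisions.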

Related

How to get a value inside of a JSON that is inside a column in a table in Oracle sql?

Suppose that I have a table named agents_timesheet that has a structure like this:
ID | name | health_check_record | date | clock_in | clock_out
---------------------------------------------------------------------------------------------------------
1 | AAA | {"mental":{"stress":"no", "depression":"no"}, | 6-Dec-2021 | 08:25:07 |
| | "physical":{"other_symptoms":"headache", "flu":"no"}} | | |
---------------------------------------------------------------------------------------------------------
2 | BBB | {"mental":{"stress":"no", "depression":"no"}, | 6-Dec-2021 | 08:26:12 |
| | "physical":{"other_symptoms":"no", "flu":"yes"}} | | |
---------------------------------------------------------------------------------------------------------
3 | CCC | {"mental":{"stress":"no", "depression":"severe"}, | 6-Dec-2021 | 08:27:12 |
| | "physical":{"other_symptoms":"cancer", "flu":"yes"}} | | |
Now I need to get all agents having flu on that day. As for getting the flu value from a single JSON in Oracle SQL, I can already do it with this SQL statement:
SELECT * FROM JSON_TABLE(
  '{"mental":{"stress":"no", "depression":"no"}, "physical":{"fever":"no", "flu":"yes"}}', '$'
  COLUMNS (flu VARCHAR2(3) PATH '$.physical.flu')
);
As for getting the value of the health_check_record column itself, I can get it with a plain SELECT statement.
But how do I get the values of flu from the JSON in the health_check_record column of that table?
Additional question
Based on the table, how can I retrieve the full list of other_symptoms, so that I get this kind of output:
ID | name | other_symptoms
-------------------------------
1 | AAA | headache
2 | BBB | no
3 | CCC | cancer
You can use the JSON_EXISTS() function.
SELECT *
FROM agents_timesheet
WHERE JSON_EXISTS(health_check_record, '$.physical.flu == "yes"');
There is also "plain old way" without JSON parsing only treting column like a standard VARCHAR one. This way will not work in 100% of cases, but if you have the data in the same way like you described it might be sufficient.
SELECT *
FROM agents_timesheet
WHERE health_check_record LIKE '%"flu":"yes"%';
How to get the values of flu in the JSON in the health_check_record of that table?
From Oracle 12, to get the values you can use JSON_TABLE with a correlated CROSS JOIN to the table:
SELECT a.id,
       a.name,
       j.*,
       a."DATE",
       a.clock_in,
       a.clock_out
FROM agents_timesheet a
CROSS JOIN JSON_TABLE(
  a.health_check_record,
  '$'
  COLUMNS (
    mental_stress     VARCHAR2(3) PATH '$.mental.stress',
    mental_depression VARCHAR2(3) PATH '$.mental.depression',
    physical_fever    VARCHAR2(3) PATH '$.physical.fever',
    physical_flu      VARCHAR2(3) PATH '$.physical.flu'
  )
) j
WHERE physical_flu = 'yes';
db<>fiddle here
You can use "dot notation" to access data from a JSON column. Like this:
select "DATE", id, name
from agents_timesheet t
where t.health_check_record.physical.flu = 'yes'
;
DATE ID NAME
----------- --- ----
06-DEC-2021 2 BBB
Note that this approach requires that you use an alias for the table name (so you can use it in accessing the JSON data).
For testing I used the data posted by MT0 on dbfiddle. I am not a big fan of double-quoted column names; use something else for "DATE", such as dt or date_.
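For the additional question about listing other_symptoms, the same JSON access works per agent. A minimal sketch using JSON_VALUE (available from Oracle 12c), assuming the same agents_timesheet table:

-- One row per agent with other_symptoms extracted from the JSON column.
SELECT t.id,
       t.name,
       JSON_VALUE(t.health_check_record, '$.physical.other_symptoms') AS other_symptoms
FROM agents_timesheet t;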

Count string occurrences within a list column - Snowflake/SQL

I have a table with a column that contains a list of strings like below:
EXAMPLE:
STRING User_ID [...]
"[""null"",""personal"",""Other""]" 2122213 ....
"[""Other"",""to_dos_and_thing""]" 2132214 ....
"[""getting_things_done"",""TO_dos_and_thing"",""Work!!!!!""]" 4342323 ....
QUESTION:
I want to get a count of the number of times each unique string appears (the strings within the STRING column are separated by commas), but I only know how to do the following:
SELECT u.STRING, count(u.USERID) as cnt
FROM table u
group by u.STRING
order by cnt desc;
However the above method doesn't work as it only counts the number of user ids that use a specific grouping of strings.
The ideal output using the example above would look like this:
DESIRED OUTPUT:
STRING COUNT_Instances
"null" 1223
"personal" 543
"Other" 324
"to_dos_and_thing" 221
"getting_things_done" 146
"Work!!!!!" 22
Based on your description, here is my sample table:
create table u (user_id number, string varchar);
insert into u values
(2122213, '"[""null"",""personal"",""Other""]"'),
(2132214, '"[""Other"",""to_dos_and_thing""]"'),
(2132215, '"[""getting_things_done"",""TO_dos_and_thing"",""Work!!!!!""]"' );
I used SPLIT_TO_TABLE to split each string as a row, and then REGEXP_SUBSTR to clean the data. So here's the query and output:
select REGEXP_SUBSTR( s.VALUE, '""(.*)""', 1, 1, 'i', 1 ) extracted, count(*) from u,
lateral SPLIT_TO_TABLE( string , ',' ) s
GROUP BY extracted
order by count(*) DESC;
+---------------------+----------+
| EXTRACTED | COUNT(*) |
+---------------------+----------+
| Other | 2 |
| null | 1 |
| personal | 1 |
| to_dos_and_thing | 1 |
| getting_things_done | 1 |
| TO_dos_and_thing | 1 |
| Work!!!!! | 1 |
+---------------------+----------+
SPLIT_TO_TABLE https://docs.snowflake.com/en/sql-reference/functions/split_to_table.html
REGEXP_SUBSTR https://docs.snowflake.com/en/sql-reference/functions/regexp_substr.html
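If case variants such as 'to_dos_and_thing' and 'TO_dos_and_thing' should be counted as the same string, one option is to normalise the extracted value before grouping, for example with LOWER(). A sketch based on the query above:

-- Same query, but lower-casing the extracted string so case variants fall into one group.
select LOWER(REGEXP_SUBSTR( s.VALUE, '""(.*)""', 1, 1, 'i', 1 )) extracted, count(*)
from u,
lateral SPLIT_TO_TABLE( string , ',' ) s
GROUP BY extracted
order by count(*) DESC;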

Pull NULL if column not present in table while UNION SQL Server

I am currently building a dynamic SQL query. The tables and columns are sent as parameters. So the columns may not be present in the table. Is there a way to pull NULL data in the result set when the column is not present in the table?
ex:
SELECT * FROM Table1
Output:
created date | Name | Salary | Married
-------------+-------+--------+----------
25-Jan-2016 | Chris | 2500 | Y
27-Jan-2016 | John | 4576 | N
30-Jan-2016 | June | 3401 | N
So when I run the query below
SELECT Created_date, Name, Age, Married
FROM Table1
I need to get
created date | Name | AGE | Married
-------------+-------+--------+----------
25-Jan-2016 | Chris | NULL | Y
27-Jan-2016 | John | NULL | N
30-Jan-2016 | June | NULL | N
Does anything like IF NOT EXISTS or ISNULL work in this?
I can't use extensive T-SQL in this segment, and it needs to stay simple since I am creating a UNION query across more than 50 tables (requirement :| ). Any advice would be of great help.
I can't think of an easy solution. Since you're using dynamic SQL, instead of
(previous dynamic string part)+' fieldname '+(next dynamic string part)
you could use
(previous dynamic string part)
+ CASE WHEN EXISTS (
      SELECT 1
      FROM sys.tables t
      INNER JOIN sys.columns c ON t.object_id = c.object_id
      WHERE c.name = your_field_name AND t.name = your_table_name
  ) THEN ' fieldname ' ELSE ' NULL ' END
+ (next dynamic string part)
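Another lightweight option is COL_LENGTH(), which returns NULL when a column does not exist, so each column reference can be swapped for NULL while the dynamic string is being built. A minimal sketch for a single table, assuming the table and column names from the question (the variable names are only illustrative):

-- Build the SELECT list at runtime, substituting NULL for the missing Age column.
DECLARE @age_part nvarchar(128) =
    CASE WHEN COL_LENGTH(N'dbo.Table1', N'Age') IS NOT NULL
         THEN N'Age'
         ELSE N'NULL AS Age' END;

DECLARE @sql nvarchar(max) =
    N'SELECT Created_date, Name, ' + @age_part + N', Married FROM dbo.Table1';

EXEC sp_executesql @sql;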

Splitting a string column in BigQuery

Let's say I have a table in BigQuery containing 2 columns. The first column represents a name, and the second is a delimited list of values, of arbitrary length. Example:
Name | Scores
-----+-------
Bob |10;20;20
Sue |14;12;19;90
Joe |30;15
I want to transform into columns where the first is the name, and the second is a single score value, like so:
Name,Score
Bob,10
Bob,20
Bob,20
Sue,14
Sue,12
Sue,19
Sue,90
Joe,30
Joe,15
Can this be done in BigQuery alone?
Good news everyone! BigQuery can now SPLIT()!
Look at "find all two word phrases that appear in more than one row in a dataset".
There is no current way to split() a value in BigQuery to generate multiple rows from a string, but you could use a regular expression to look for the commas and find the first value. Then run a similar query to find the 2nd value, and so on. They can all be merged into only one query, using the pattern presented in the above example (UNION through commas).
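For illustration, the per-position pattern described above looks roughly like this when written in today's standard SQL with REGEXP_EXTRACT (scores_table is a placeholder for the real source table; the original workaround was in legacy SQL, so treat this as a sketch of the idea rather than the exact historical syntax):

-- One UNION ALL branch per position; positions past the end of the list return NULL
-- and are filtered out.
SELECT Name, REGEXP_EXTRACT(Scores, r'^([^;]+)') AS Score
FROM scores_table
WHERE REGEXP_EXTRACT(Scores, r'^([^;]+)') IS NOT NULL
UNION ALL
SELECT Name, REGEXP_EXTRACT(Scores, r'^[^;]+;([^;]+)') AS Score
FROM scores_table
WHERE REGEXP_EXTRACT(Scores, r'^[^;]+;([^;]+)') IS NOT NULL;
-- ...and so on, adding one branch per position.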
Trying to rewrite Elad Ben Akoune's answer in standard SQL, the query becomes this:
WITH name_score AS (
  SELECT Name, SPLIT(Scores, ';') AS Score
  FROM (
    (SELECT * FROM (SELECT 'Bob' AS Name, '10;20;20' AS Scores))
    UNION ALL
    (SELECT * FROM (SELECT 'Sue' AS Name, '14;12;19;90' AS Scores))
    UNION ALL
    (SELECT * FROM (SELECT 'Joe' AS Name, '30;15' AS Scores))
  )
)
SELECT name, score
FROM name_score
CROSS JOIN UNNEST(name_score.score) AS score;
And this outputs:
+------+-------+
| name | score |
+------+-------+
| Bob | 10 |
| Bob | 20 |
| Bob | 20 |
| Sue | 14 |
| Sue | 12 |
| Sue | 19 |
| Sue | 90 |
| Joe | 30 |
| Joe | 15 |
+------+-------+
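The same result can also be produced without the WITH block by unnesting the split array directly against the source table. A minimal sketch in standard SQL (scores_table is a placeholder for the real source table):

-- Split and flatten in one pass; each element of the split array becomes its own row.
SELECT Name, score
FROM scores_table,
     UNNEST(SPLIT(Scores, ';')) AS score;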
If someone is still looking for an answer
select Name,split(Scores,';') as Score
from (
# replace the inner custom select with your source table
select *
from
(select 'Bob' as Name ,'10;20;20' as Scores),
(select 'Sue' as Name ,'14;12;19;90' as Scores),
(select 'Joe' as Name ,'30;15' as Scores)
);

SQLite, selecting values having same criteria (throughout all table)

I have an sqlite database table similar to the one given below
Name | Surname | AddrType | Age | Phone
John | Kruger | Home | 23 | 12345
Sarah | Kats | Home | 33 | 12345
Bill | Kruger | Work | 15 | 12345
Lars | Kats | Home | 54 | 12345
Javier | Roux | Work | 45 | 12345
Ryne | Hutt | Home | 36 | 12345
I would like to select, for each row in the table, all Name values that share that row's Surname value.
For example, for the first line the query would be "select Name from myTable where Surname='Kruger'", whereas for the second line it would be "select Name from myTable where Surname='Kats'", and so on.
Is it possible to traverse through the whole table and select all values like that?
PS: I will use this method in a C++ application; the alternative is to use sqlite3_exec() and process each row one by one. I just want to know if there is another possible way to achieve the same result.
I'd do:
sqlite> SELECT group_concat(Name, '|') Names FROM People GROUP BY Surname;
Names
----------
Ryne
Sarah|Lars
John|Bill
Javier
Then split each value of "Names" in C++ using the "|" separator (or any other you choose in group_concat function.
Basically you just want to exclude any records that don't have a buddy.
Something simple like joining the table against itself should work:
SELECT a.Name
FROM tab AS a
JOIN tab AS b
  ON a.Surname = b.Surname
 AND a.rowid <> b.rowid;  -- exclude the row's match with itself
Just returning the full sorted table and doing the duplicate check yourself may be faster if the incidence of shared surnames is high, but assuming that it is always high for every set of data would be a pretty strong assumption.
SELECT Name
FROM tab
ORDER BY Surname;