Splitting a row containing a JSONB string array into two different rows (PostgreSQL)

Given a row that looks like this (PostgreSQL 10 and 11):
CREATE TABLE examples (
"id" varchar NOT NULL,
"type" varchar NOT NULL,
"relation_id" varchar NOT NULL,
"things" jsonb,
PRIMARY KEY ("id")
);
INSERT INTO examples(id, type, relation_id, things) values
('7287b283-f2d8-4940-94ae-c8253599d479', 'letter-number', 'relation-id-1', '["A", "B", "1", "2", "C"]');
INSERT INTO examples(id, type, relation_id, things) values
('7287b283-f2d8-4940-94ae-c8253599d480', 'letter-number', 'relation-id-2', '["A", "2", "C"]');
INSERT INTO examples(id, type, relation_id, things) values
('7287b283-f2d8-4940-94ae-c8253599d481', 'letter-number', 'relation-id-3', '[]');
How would you go ahead and split those rows into:
'7287b283-f2d8-4940-94ae-c8253599d480', 'relation-id-1', 'number', '["2"]'
'7287b283-f2d8-4940-94ae-c8253599d482', 'relation-id-1', 'letter', '["A", "C"]'.
Essentially split the "type" field and conditionally divide the jsonb array.
Assumptions:
The JSONB value is always present (it can be an empty array), but it is always structured like that.
There are other types in the same table.
Keeping the relation_id is crucial.
The values that go inside the JSONB array are known; there are only a couple of them (let's say five or six), so we can hardcode them in the query.
Changes have to be persisted. The original row can be removed/updated.
https://www.db-fiddle.com/f/4VV1tZD3pBYMiCmtTFtQnj/5 <- DB-fiddle.
I tried messing around with json_array_elements and INSERT ... SELECT with a subquery but that got me nowhere for now.

demo: db<>fiddle
SELECT
    id,
    relation_id,
    CASE WHEN elems ~ '[0-9]' THEN 'number' ELSE 'letter' END AS type, -- 2
    jsonb_agg(elems) -- 3
FROM
    examples,
    jsonb_array_elements_text(things) elems -- 1
GROUP BY
    1, 2, 3 -- 3
1. Expand the array into one row per element.
2. Check whether the element is a digit using a regular expression. If so, assign the new type number, otherwise letter.
3. Group by this new type and re-aggregate the elements.
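For the sample data above, this query should yield something like the following (element order inside jsonb_agg follows the source arrays but is not strictly guaranteed; note that the empty-array row for relation-id-3 disappears here, because jsonb_array_elements_text() produces no rows for it):

id                                   | relation_id   | type   | jsonb_agg
-------------------------------------+---------------+--------+-----------------
7287b283-f2d8-4940-94ae-c8253599d479 | relation-id-1 | letter | ["A", "B", "C"]
7287b283-f2d8-4940-94ae-c8253599d479 | relation-id-1 | number | ["1", "2"]
7287b283-f2d8-4940-94ae-c8253599d480 | relation-id-2 | letter | ["A", "C"]
7287b283-f2d8-4940-94ae-c8253599d480 | relation-id-2 | number | ["2"]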
With extensions:
Update the table with the split rows (remove old, insert new)
Empty arrays generate rows with both types
demo: db<>fiddle
WITH del AS (
    DELETE FROM examples
    RETURNING id, relation_id, type, things
)
INSERT INTO examples
SELECT
    id || '_' || type,
    relation_id,
    type,
    COALESCE(jsonb_agg(elems) FILTER (WHERE elems IS NOT NULL), '[]')
FROM (
    SELECT
        id,
        relation_id,
        CASE WHEN elems ~ '[0-9]' THEN 'number' ELSE 'letter' END AS type,
        elems
    FROM
        del,
        jsonb_array_elements_text(things) elems
    UNION ALL
    -- Empty arrays yield no elements, so emit one row per type with a NULL
    -- element instead; the FILTER/COALESCE above turns that into '[]'.
    SELECT
        id,
        relation_id,
        t,
        null
    FROM
        examples,
        unnest(array['number', 'letter']) AS t
    WHERE things = '[]'
) s
GROUP BY 1, 2, 3;
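Note that the UNION ALL branch can still read the empty-array rows FROM examples even though the CTE deletes everything: the main statement and the data-modifying CTE work on the same snapshot taken at statement start. To sanity-check the result afterwards, a query like this (a sketch; the expectation assumes the sample data above) should show a letter and a number row per relation_id, with '[]' for relation-id-3:

SELECT id, relation_id, type, things
FROM examples
ORDER BY relation_id, type;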

Related

Compare two arrays in PostgreSQL

I have a table in postgres with a value column that contains string arrays. My objective is to find all arrays that contain any of the following strings: {'cat', 'dog'}
id | value
---+-----------------------------
 1 | {'dog', 'cat', 'fish'}
 2 | {'elephant', 'mouse'}
 3 | {'lizard', 'dog', 'parrot'}
 4 | {'bear', 'bird', 'cat'}
The following query uses ANY() to check if 'dog' is equal to any of the items in each array and will correctly return rows 1 and 3:
select * from mytable where 'dog'=ANY(value);
I am trying to find a way to search value for any match in an array of strings. For example:
select * from mytable where ANY({'dog', 'cat'})=ANY(value);
Should return rows 1, 3, and 4. However, the above code throws an error. Is there a way to use the ANY() clause on the left side of this equation? If not, what would be the workaround to check if any of the strings in an array are in value?
You can use the && operator to find out whether the two arrays overlap. It returns true if the arrays have at least one element in common.
Schema and insert statements:
create table mytable (id int, value text[]);
insert into mytable values (1,'{"dog", "cat", "fish"}');
insert into mytable values (2,'{"elephant", "mouse"}');
insert into mytable values (3,'{"lizard", "dog", "parrot"}');
insert into mytable values (4,'{"bear", "bird", "cat"}');
Query:
select * from mytable where array['dog', 'cat'] && (value);
Output:
id | value
---+---------------------
 1 | {dog,cat,fish}
 3 | {lizard,dog,parrot}
 4 | {bear,bird,cat}
db<>fiddle here
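If the table is large, this predicate can be made index-assisted: the array overlap operator && is supported by GIN indexes on array columns. A minimal sketch (the index name is arbitrary):

create index mytable_value_gin on mytable using gin (value);

With the index in place, the planner can answer value && array['dog', 'cat'] with an index scan instead of checking every row.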

SELECT on JSON operations of Postgres array column?

I have a column of type jsonb[] (a Postgres array of jsonb objects) and I'd like to perform a SELECT on rows where a criterion is met on at least one of the objects. Something like:
-- Schema would be something like
mytable (
id UUID PRIMARY KEY,
col2 jsonb[] NOT NULL
);
-- Query I'd like to run
SELECT
id,
x->>'field1' AS field1
FROM
mytable
WHERE
x->>'field2' = 'user' -- for any x in the array stored in col2
I've looked around at ANY and UNNEST but it's not totally clear how to achieve this, since you can't run unnest in a WHERE clause. I also don't know how I'd specify that I want the field1 from the matching object.
Do I need a WITH table with the values expanded to join against? And how would I achieve that and keep the id from the other column?
Thanks!
You need to unnest the array; then you can access each json value:
SELECT t.id,
       c.x ->> 'field1' AS field1
FROM mytable t
CROSS JOIN unnest(col2) AS c(x)
WHERE c.x ->> 'field2' = 'user'
This will return one row for each matching json value in the array.
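For reference, a minimal setup to run this against (hypothetical sample data; note that col2 must be a genuine jsonb[] array, not a single jsonb value holding an array):

CREATE TABLE mytable (
    id   uuid PRIMARY KEY,
    col2 jsonb[] NOT NULL
);

INSERT INTO mytable VALUES
    ('8d4a4f0e-0000-0000-0000-000000000001',
     ARRAY['{"field1": "alice", "field2": "user"}'::jsonb,
           '{"field1": "bob", "field2": "admin"}'::jsonb]);

The query then returns a single row, the id plus field1 = 'alice', since only that array element has field2 = 'user'.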

Query column names for a single row where value is null?

I'm attempting to return all of the column names for a single row where the value is null. I can parse the entire row afterward, but I'm curious whether there's a function I can leverage.
I can return a JSON object containing key/value pairs where the value is not null using row_to_json() and json_strip_nulls, where the conditional references a single unique row:
SELECT json_strip_nulls(row_to_json(t))
FROM table t where t.id = 123
Is there a function or simple way to accomplish the inverse of this, returning all of the keys (column names) with null values?
You need a primary key or unique column(s). In the example id is unique:
with my_table(id, col1, col2, col3) as (
    values
        (1, 'a', 'b', 'c'),
        (2, 'a', null, null),
        (3, null, 'b', 'c')
)
select id, array_agg(key) as null_columns
from my_table t
cross join jsonb_each_text(to_jsonb(t))
where value is null
group by id
id | null_columns
----+--------------
2 | {col2,col3}
3 | {col1}
(2 rows)
key and value are default columns returned by the function jsonb_each_text(). See JSON Functions and Operators in the documentation.
Actually, the JSON approach might work. First transform the rows to a JSON object with row_to_json(). Then expand the JSON objects back to a set using json_each_text(). You can now filter for NULL values and use aggregation to get the columns that contain NULL.
I don't know what output format you want. json_object_agg() is the "complement" to your json_strip_nulls()/row_to_json() approach. But you may also want a JSON array (json_agg()), just an array (array_agg()) or a comma-separated string list (string_agg()).
SELECT json_object_agg(jet.k, jet.v),
json_agg(jet.k),
array_agg(jet.k),
string_agg(jet.k, ',')
FROM elbat t
CROSS JOIN LATERAL row_to_json(t) rtj(j)
CROSS JOIN LATERAL json_each_text(rtj.j) jet(k, v)
WHERE jet.v IS NULL
GROUP BY rtj.j::text;
db<>fiddle
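To target just the single row from the question, the same aggregation can be restricted with an extra WHERE condition (a sketch, assuming the table has the unique id column from the question):

SELECT json_object_agg(jet.k, jet.v)
FROM elbat t
CROSS JOIN LATERAL row_to_json(t) rtj(j)
CROSS JOIN LATERAL json_each_text(rtj.j) jet(k, v)
WHERE jet.v IS NULL
  AND t.id = 123;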

Is there a SQL SELECT to rename one column preserving column order? [duplicate]

Here is what I'm trying to do: I have a table with lots of columns and want to create a view with one of the columns reassigned based on a certain combination of values in other columns, e.g.
Name, Age, Band, Alive, ... <too many other fields>
And I want a query that will reassign one of the fields, e.g.
Select *, Age =
CASE When "Name" = 'BRYAN ADAMS' AND "Alive" = 1 THEN 18
ELSE "Age"
END
FROM Table
However, the schema that I now have is Name, Age, Band, Alive, ... <too many>, Age.
I could use 'AS' in my select statement to make it
Name, Age, Band, Alive, ... <too many>, Age_Computed.
However, I want to reach the original schema of
Name, Age, Band, Alive, ..., where Age is actually the computed age.
Is there a selective rename where I can do SELECT * and A_1 as A, B_1 as b? (and then A_1 completely disappears)
or a selective * where I can select all but certain columns? (which would also solve the question asked in the previous statement)
I know the hacky way where I enumerate all columns and create an appropriate query, but I'm still hopeful there is a 'simpler' way to do this.
Sorry, no, there is not a way to replace an existing column name using a SELECT * construct as you desire.
It is always better to define columns explicitly, especially for views, and never use SELECT *. Just use the table's DDL as a model when you create the view. That way you can alter any column definition you want (as in your question) and eliminate columns inappropriate for the view. We use this technique to mask or eliminate columns containing sensitive data like social security numbers and passwords. The link provided by marc_s in the comments is a good read.
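For instance, rather than SELECT * in the view definition, the columns from the question would be listed explicitly (a sketch; the view name, Tab, and the trailing columns stand in for the real schema):

CREATE VIEW people AS
SELECT Name,
       CASE WHEN Name = 'BRYAN ADAMS' AND Alive = 1 THEN 18 ELSE Age END AS Age,
       Band,
       Alive
       -- ... remaining columns, spelled out one by one
FROM Tab;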
Google BigQuery supports SELECT * REPLACE:
A SELECT * REPLACE statement specifies one or more expression AS identifier clauses. Each identifier must match a column name from the SELECT * statement.
In the output column list, the column that matches the identifier in a REPLACE clause is replaced by the expression in that REPLACE clause.
A SELECT * REPLACE statement does not change the names or order of columns. However, it can change the value and the value type.
Select *, Age = CASE When "Name" = 'BRYAN ADAMS' AND "Alive" = 1 THEN 18
                     ELSE "Age"
                END
FROM tab

=>

SELECT * REPLACE(CASE WHEN Name = 'BRYAN ADAMS' AND Alive = 1 THEN 18
                      ELSE Age END AS Age)
FROM Tab
Actually, there is a way to do this in MySQL. You need to use a hack to select all but one column as posted here, then add it separately in the AS statement.
Here is an example:
-- Set-up some example data
DROP TABLE IF EXISTS test;
CREATE TABLE `test` (`ID` int(2), `date` datetime, `val0` varchar(1), val1 INT(1), val2 INT(4), PRIMARY KEY(ID, `date`));
INSERT INTO `test` (`ID`, `date`, `val0`, `val1`, `val2`) VALUES
(1, '2016-03-07 12:20:00', 'a', 1, 1001),
(1, '2016-04-02 12:20:00', 'b', 2, 1004),
(1, '2016-03-01 10:09:00', 'c', 3, 1009),
(1, '2015-04-12 10:09:00', 'd', 4, 1016),
(1, '2016-03-03 12:20:00', 'e', 5, 1025);
-- Select all columns, renaming 'val0' as 'yabadabadoo':
SET @s = CONCAT('SELECT ', (SELECT REPLACE(GROUP_CONCAT(COLUMN_NAME), 'val0,', '')
    FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'test' AND TABLE_SCHEMA =
    '<database_name>'), ', val0 AS `yabadabadoo` FROM test');
PREPARE stmt1 FROM @s;
EXECUTE stmt1;
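A similar trick should be possible in PostgreSQL by generating the statement text from information_schema (a hypothetical sketch; the produced SELECT still has to be executed separately, e.g. with psql's \gexec):

SELECT 'SELECT '
       || string_agg(quote_ident(column_name), ', ' ORDER BY ordinal_position)
       || ', val0 AS yabadabadoo FROM test'
FROM information_schema.columns
WHERE table_name = 'test'
  AND column_name <> 'val0';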

PostgreSQL: Check if each item in array is contained by a larger string

I have an array of strings in PostgreSQL:
SELECT ARRAY['dog', 'cat', 'mouse'];
And I have a large paragraph:
Dogs and cats have a range of interactions. The natural instincts of each species lead towards antagonistic interactions, though individual animals can have non-aggressive relationships with each other, particularly under conditions where humans have socialized non-aggressive behaviors.
The generally aggressive interactions between the species have been noted in cultural expressions.
For each item in the array, I want to check if it appears in my large paragraph string. I know for any one string, I could do the following:
SELECT paragraph_text ILIKE '%dog%';
But is there a way to simultaneously check every string in the array (for an arbitrary number of array elements) without resorting to plpgsql?
I believe you want something like this (assuming paragraph_text is a column of a table named table1):
SELECT
paragraph_text,
sub.word,
paragraph_text ILIKE '%' || sub.word || '%' as is_word_in_text
FROM
table1 CROSS JOIN (
SELECT unnest(ARRAY['dog', 'cat', 'mouse']) as word
) as sub;
The function unnest(array) creates a table of records from the array's values. Then you can do a CROSS JOIN, which means all rows from table1 are combined with all rows from that unnest table.
If paragraph_text is some kind of static value (not from a table), you can supply it as a literal instead:
SELECT
    paragraph_text,
    sub.word,
    paragraph_text ILIKE '%' || sub.word || '%' AS is_word_in_text
FROM
    (VALUES ('Dogs and cats have a range of interactions. ...')) AS v(paragraph_text)
    CROSS JOIN (
        SELECT unnest(ARRAY['dog', 'cat', 'mouse']) AS word
    ) AS sub;
This solution will work only for Postgres 8.4 and above, as unnest is not available in earlier versions.
drop table if exists t;
create temp table t (col1 text, search_terms text[] );
insert into t values
('postgress is awesome', array['postgres', 'is', 'bad']),
('i like open source', array['open', 'code', 'i']),
('sql is easy', array['mysql']);
drop table if exists t1;
select *, unnest(search_terms) as search_term into temp t1 from t;
-- Depending on how you like to do pattern matching:
-- this will look for the term anywhere, not just as a whole word.
select *, position(search_term in col1) from t1;
-- This will match only whole words (note the array containment operator @>).
select *, string_to_array(col1, E' ') @> string_to_array(search_term, E' ') from t1;
Basically, you need to flatten the array of search_terms into one column and then match the long string against each search term row-wise.
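An alternative that avoids the intermediate table, shown here as a sketch against the same t from above: an EXISTS over the unnested terms yields one boolean per source row.

select col1,
       exists (
           select 1
           from unnest(search_terms) as term
           where col1 ilike '%' || term || '%'
       ) as any_term_found
from t;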