hierarchy with parent and nested child id in bigquery - sql

I have a table in bigquery with the following schema
Name STRING NULLABLE
Parent_id STRING NULLABLE
Child_ids STRING REPEATED
The table is filled with the following rows:
Name Parent_id Child_ids
A 1 [2,3]
B 2 [4]
C 3 null
D 4 null
I would like to make a query which could return not only child_ids but also their name, i.e:
Name Parent_id Child_info
A 1 [(2,B),(3,C)]
B 2 [(4,D)]
C 3 null
D 4 null
Do you have any idea?

Consider below approach
select * except(Child_ids),
array(
select as struct id, Name
from t.Child_ids id
join your_table
on id = Parent_id
) Child_info
from your_table t;
if applied to sample data in your question - output is
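As a rough illustration of what the correlated array subquery computes, the following plain-Python sketch performs the same lookup on the sample rows. It is not BigQuery code; the names `rows`, `name_by_id` and `child_info` are assumptions made for this example, and the empty lists stand in for the null Child_ids values.

```python
# Sample rows from the question: (Name, Parent_id, Child_ids).
# Empty lists stand in for the rows whose Child_ids is null.
rows = [
    ("A", "1", ["2", "3"]),
    ("B", "2", ["4"]),
    ("C", "3", []),
    ("D", "4", []),
]

# Lookup table: Parent_id -> Name (the role of "join your_table on id = Parent_id").
name_by_id = {parent_id: name for name, parent_id, _ in rows}

# For every row, pair each child id with the name of the row that owns that id.
# Ids without a matching row are dropped, like an inner join would do.
child_info = [
    (name, parent_id, [(cid, name_by_id[cid]) for cid in child_ids if cid in name_by_id])
    for name, parent_id, child_ids in rows
]

for row in child_info:
    print(row)
```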

Related

How do I add a boolean column to an existing table to indicate if there's a matching id value for each row in another table

Suppose I have a table, foo
id
1
2
3
4
and another table bar
id product
1 abc
1 def
4 ghi
4 abc
I want to add a boolean field, has_product to foo that indicates whether it has at least one record in bar with a matching id. In this example,
id has_product
1 true
2 false
3 false
4 true
How can I do this?
Reproducible Example
CREATE OR REPLACE TABLE test.foo AS
(
SELECT 1 AS id
UNION ALL SELECT 2 AS id
UNION ALL SELECT 3 AS id
UNION ALL SELECT 4 AS id
);
CREATE OR REPLACE TABLE test.bar AS
(
SELECT 1 AS id, "abc" as product
UNION ALL SELECT 1 AS id, "def" as product
UNION ALL SELECT 4 AS id, "ghi" as product
UNION ALL SELECT 4 AS id, "abc" as product
);
What I've tried
I suspect there's a combination of ADD COLUMN and UPDATE that will do the trick. For example, the below code inserts a column with all true values.
alter table test.foo add column has_product bool;
update test.foo
set has_product = True
where true
result
id has_product
1 true
2 true
3 true
4 true
(But obviously this is not my desired result.)
I am not sure whether the statements below are efficient enough for your real data, but they return what you expect for the small toy data. I hope this helps.
alter table test.foo add column has_product boolean;
update test.foo
set has_product = EXISTS (SELECT 1 FROM test.bar WHERE bar.id = foo.id)
where true;
select * from test.foo;
If you just want to view this particular output without modifying the table, you can use a left join (a row of foo with no match in bar gets a NULL b.id):
SELECT DISTINCT f.id, b.id IS NOT NULL AS has_product
FROM foo f
LEFT JOIN bar b
ON b.id = f.id
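Both approaches above can be tried on the sample data with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module instead of BigQuery, so the dataset prefix `test.` is dropped and booleans come back as 0/1; those substitutions are assumptions of this example, not part of the original answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER);
    INSERT INTO foo VALUES (1), (2), (3), (4);

    CREATE TABLE bar (id INTEGER, product TEXT);
    INSERT INTO bar VALUES (1, 'abc'), (1, 'def'), (4, 'ghi'), (4, 'abc');

    -- Approach 1: add the column, then fill it with a correlated EXISTS.
    ALTER TABLE foo ADD COLUMN has_product BOOLEAN;
    UPDATE foo
    SET has_product = EXISTS (SELECT 1 FROM bar WHERE bar.id = foo.id);
""")

updated = conn.execute(
    "SELECT id, has_product FROM foo ORDER BY id"
).fetchall()

# Approach 2: compute the flag on the fly with a LEFT JOIN.
# DISTINCT collapses the duplicate rows produced by ids with several products.
joined = conn.execute("""
    SELECT DISTINCT f.id, b.id IS NOT NULL AS has_product
    FROM foo f
    LEFT JOIN bar b ON b.id = f.id
    ORDER BY f.id
""").fetchall()

print(updated)
print(joined)
```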

Liquibase/Oracle: Copy value from one table to another in a one-to-many relationship

I have two tables parent and child with a one-to-many relation. I have a value in the child table that is always the same for a given parent ID. Therefore, I want to copy the value to a newly created column in the parent table.
PARENT:
PARENT_ID VALUE
1 null
2 null
CHILD:
CHILD_ID PARENT_ID VALUE
1 1 VALUE_1
2 1 VALUE_1
3 1 VALUE_1
4 2 VALUE_2
5 2 VALUE_2
I am using Liquibase and looking for a solution that works with Oracle and H2 (in Oracle mode).
This gives me a syntax error but the nested SELECT works:
UPDATE (SELECT DISTINCT PARENT.PARENT_ID, CHILD.VALUE as OLD_COLUMN, PARENT.VALUE as NEW_COLUMN
FROM PARENT
LEFT JOIN CHILD
ON PARENT.PARENT_ID = CHILD.PARENT_ID) t
SET t.NEW_COLUMN = t.OLD_COLUMN;
Output of nested SELECT:
PARENT_ID OLD_COLUMN NEW_COLUMN
1 VALUE_1 null
2 VALUE_2 null
It doesn't have to be a SQL solution, a Liquibase Update would be even better.
I think this SQL will achieve what you are looking for:
update parent p
set value = (select distinct value
from child c
where c.parent_id = p.parent_id);
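A minimal sketch of the same correlated update on the sample data, run with Python's built-in sqlite3 module rather than Oracle or H2 (an assumption of this example). It swaps `select distinct value` for `MAX(value)`: the result is the same when all children of a parent share one value, but the aggregate always yields a single row, whereas in Oracle a scalar subquery that returns more than one row raises an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (parent_id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO parent (parent_id) VALUES (1), (2);

    CREATE TABLE child (child_id INTEGER PRIMARY KEY, parent_id INTEGER, value TEXT);
    INSERT INTO child VALUES
        (1, 1, 'VALUE_1'), (2, 1, 'VALUE_1'), (3, 1, 'VALUE_1'),
        (4, 2, 'VALUE_2'), (5, 2, 'VALUE_2');

    -- Copy the (single) child value into the new parent column.
    UPDATE parent
    SET value = (SELECT MAX(c.value)
                 FROM child c
                 WHERE c.parent_id = parent.parent_id);
""")

parents = conn.execute(
    "SELECT parent_id, value FROM parent ORDER BY parent_id"
).fetchall()
print(parents)
```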

SQL grouping by distinct values in a multi-value string column

I want to perform a group-by based on the distinct values in a string column that has multiple values.
The column holds a list of strings in a standard format, separated by commas. The only possible values are a, b, c and d.
For example the column collection (type: String) contains:
Row 1: ["a","b"]
Row 2: ["b","c"]
Row 3: ["b","c","a"]
Row 4: ["d"]
The expected output is a count of unique values:
collection | count
a | 2
b | 3
c | 2
d | 1
For all of the queries below I used this table:
create table tmp (
id INT auto_increment,
test VARCHAR(255),
PRIMARY KEY (id)
);
insert into tmp (test) values
("a,b"),
("b,c"),
("b,c,a"),
("d")
;
If the possible values are only a, b, c and d, you can try one of these.
Note that this only works if no value is a substring of another value (for example test and test_new); otherwise test would also match every test_new row and the counts would be wrong.
select collection, COUNT(*) as count from tmp JOIN (
select CONCAT("%", tb.collection, "%") as like_collection, collection from (
select "a" COLLATE utf8_general_ci as collection
union select "b" COLLATE utf8_general_ci as collection
union select "c" COLLATE utf8_general_ci as collection
union select "d" COLLATE utf8_general_ci as collection
) tb
) tb1
ON tmp.test LIKE tb1.like_collection
GROUP BY tb1.collection;
Which will give you the result you want
collection | count
a | 2
b | 3
c | 2
d | 1
or you can try this one
SELECT
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%a%') as a_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%b%') as b_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%c%') as c_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%d%') as d_count
;
The result would be like this
a_count | b_count | c_count | d_count
2 | 3 | 2 | 1
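The substring caveat mentioned above can be avoided by wrapping both the column and the searched value in delimiters, so that only whole items match. The following is a minimal sketch using Python's built-in sqlite3 module instead of MySQL (an assumption of this example; in MySQL the `||` concatenation would typically be written with CONCAT). The extra row 'ab' is hypothetical and is added only to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tmp (id INTEGER PRIMARY KEY AUTOINCREMENT, test TEXT);
    -- 'ab' is a hypothetical extra row that contains 'a' and 'b' only as substrings.
    INSERT INTO tmp (test) VALUES ('a,b'), ('b,c'), ('b,c,a'), ('d'), ('ab');
""")

# Naive substring match: also counts the 'ab' row.
naive_a = conn.execute(
    "SELECT COUNT(*) FROM tmp WHERE test LIKE '%a%'"
).fetchone()[0]

# Delimiter-wrapped match: ',a,b,' contains ',a,' but ',ab,' does not.
counts = conn.execute("""
    SELECT v.collection, COUNT(*) AS n
    FROM tmp
    JOIN (SELECT 'a' AS collection
          UNION SELECT 'b'
          UNION SELECT 'c'
          UNION SELECT 'd') v
      ON ',' || tmp.test || ',' LIKE '%,' || v.collection || ',%'
    GROUP BY v.collection
    ORDER BY v.collection
""").fetchall()

print(naive_a)
print(counts)
```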
What you need to do is first explode the collection column into separate rows (like a flatMap operation). In Redshift the only way to generate new rows is a JOIN, so let's CROSS JOIN your input table with a static table of consecutive numbers and keep only the numbers that are less than or equal to the number of elements in the collection. Then we use the split_part function to read the item at the corresponding index. Once we have the exploded table, a simple GROUP BY does the rest.
If your items are stored as JSON array strings ('["a", "b", "c"]') then you can use JSON_ARRAY_LENGTH and JSON_EXTRACT_ARRAY_ELEMENT_TEXT instead of REGEXP_COUNT and SPLIT_PART respectively.
with
index as (
select 1 as i
union all select 2
union all select 3
union all select 4 -- could be substituted with 'select row_number() over () as i from arbitrary_table limit 4'
),
agg as (
select 'a,b' as collection
union all select 'b,c'
union all select 'b,c,a'
union all select 'd'
)
select
split_part(collection, ',', i) as item,
count(*)
from index,agg
where regexp_count(agg.collection, ',') + 1 >= index.i -- only get rows where number of items matches
group by 1

Using recursion with a self Foreign key

I'm using PostgreSQL 10, and I have the following structure:
A table Type with a foreign key to itself.
id name parent_id
1 namea Null
2 nameb Null
3 namea1 1
4 namea11 3
5 namea111 4
6 nameb1 2
7 nameb2 2
A table Item_Type for a Many to Many relation
id type_id item_id
1 1 1
2 3 2
3 5 3
4 7 4
Table Item which has M2M relation to Type.
id name
1 item1
2 item2
3 item3
4 item4
At the moment, I'm using an additional path field, which I recalculate every time I perform a CRUD operation on Type.
I'm wondering whether it would be faster and easier to use PostgreSQL recursion instead.
I checked the documentation but didn't understand it very well: I get an error, and I don't understand why.
WITH RECURSIVE descendants AS (
SELECT id, name FROM Type WHERE id = 1
UNION
SELECT t.id, t.name, t.parent_id FROM Type AS t
INNER JOIN descendants AS d ON d.id = t.parent_id
) SELECT * FROM descendants;
ERROR: each UNION query must have the same number of columns
What I need - Giving a Type name:
1) Get all names/ids for the requested Type and its descendants
2) Get all Items for the requested Type and its descendants, and the number of Items per Type and descendants
For example:
If the requested Type name is 'namea1', I should get for Type ids 1,3,4,5 and
for Item ids 1,2,3
The error says it all. Your union is divided between:
SELECT <2 fields> from Type ...
SELECT <3 fields> from Type JOIN Descendant ...
Simply select 3 fields on both halves:
WITH RECURSIVE descendants AS (
SELECT id, name, parent_id FROM Type WHERE id = 1
UNION
SELECT t.id, t.name, t.parent_id FROM Type AS t
INNER JOIN descendants AS d ON d.id = t.parent_id
) SELECT * FROM descendants;
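The corrected CTE can also be joined to Item_Type to collect the items of a type and its descendants, which is the second part of what the question asks for. This is a minimal sketch run with Python's built-in sqlite3 module rather than PostgreSQL 10 (an assumption of this example); the helper name `DESCENDANTS` and the lookup by name instead of id are also choices made here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Type (id INTEGER PRIMARY KEY, name TEXT,
                       parent_id INTEGER REFERENCES Type(id));
    INSERT INTO Type VALUES
        (1, 'namea', NULL), (2, 'nameb', NULL), (3, 'namea1', 1),
        (4, 'namea11', 3), (5, 'namea111', 4), (6, 'nameb1', 2), (7, 'nameb2', 2);

    CREATE TABLE Item_Type (id INTEGER PRIMARY KEY, type_id INTEGER, item_id INTEGER);
    INSERT INTO Item_Type VALUES (1, 1, 1), (2, 3, 2), (3, 5, 3), (4, 7, 4);
""")

# Recursive CTE: start at the requested type, then repeatedly add its children.
DESCENDANTS = """
    WITH RECURSIVE descendants AS (
        SELECT id, name, parent_id FROM Type WHERE name = ?
        UNION
        SELECT t.id, t.name, t.parent_id FROM Type AS t
        INNER JOIN descendants AS d ON d.id = t.parent_id
    )
"""

def type_ids(name):
    sql = DESCENDANTS + "SELECT id FROM descendants ORDER BY id"
    return [row[0] for row in conn.execute(sql, (name,))]

def item_ids(name):
    sql = DESCENDANTS + """
        SELECT it.item_id
        FROM descendants d
        JOIN Item_Type it ON it.type_id = d.id
        ORDER BY it.item_id
    """
    return [row[0] for row in conn.execute(sql, (name,))]

print(type_ids("namea"), item_ids("namea"))
print(type_ids("namea1"), item_ids("namea1"))
```

Replacing the final SELECT with a COUNT(*) grouped by type id would give the number of items per type in the subtree.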

Delete all level child item using sql query

I have a table where I have menus listed where I can insert and delete.
Structure goes like:-
ID Name ParentId
1 1. Home 0
2 2. Products 0
3 a. SubProduct1 2
4 b. SubProduct2 2
5 i. Subsub 4
6 ii. ...... 4
7 3. About 0
Top-level menu ParentId is always 0 as displayed in 1, 2 and 7.
Child-level items have the ID of their parent as ParentId; for example, SubProduct1 has 2 as its ParentId.
When I delete a menu item, all of its child items should be deleted as well, irrespective of their level, using a SQL query.
There can be any number of levels (sub, subsub, subsubsub, and so on).
How about this query:
DECLARE @DelID INT
SET @DelID=1
;WITH T(xParent, xChild)AS
(
SELECT ParentID, ChildId FROM Table WHERE ParentID=@DelID
UNION ALL
SELECT ParentID, ChildId FROM TABLE INNER JOIN T ON ParentID=xChild
)
DELETE FROM TABLE WHERE ParentID IN (SELECT xParent FROM T)
You can use a common table expression to collect all the hierarchy items from the item you want to delete down to the end of the tree, and then delete them:
;WITH ParentChildsTree
AS
(
-- anchor: the item to delete
SELECT ID, Name, ParentId
FROM MenuItems
WHERE ID = @itemToDelete
UNION ALL
-- recursive step: children of the rows collected so far
SELECT t.ID, t.Name, t.ParentId
FROM ParentChildsTree c
INNER JOIN MenuItems t ON t.ParentId = c.ID
)
DELETE FROM MenuItems
WHERE ID IN (SELECT ID FROM ParentChildsTree);
For example, if you pass the parameter @itemToDelete = 4 to the query, the items with ids 4, 5 and 6 will be deleted.
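A minimal sketch of deleting an item together with all of its descendants, run against the sample menu with Python's built-in sqlite3 module rather than SQL Server (an assumption of this example; SQLite needs the RECURSIVE keyword and uses `?` placeholders instead of T-SQL variables). The CTE name `subtree` and the variable `item_to_delete` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MenuItems (ID INTEGER PRIMARY KEY, Name TEXT, ParentId INTEGER);
    INSERT INTO MenuItems VALUES
        (1, '1. Home', 0), (2, '2. Products', 0),
        (3, 'a. SubProduct1', 2), (4, 'b. SubProduct2', 2),
        (5, 'i. Subsub', 4), (6, 'ii. ......', 4),
        (7, '3. About', 0);
""")

item_to_delete = 2  # '2. Products' and everything below it

conn.execute("""
    WITH RECURSIVE subtree(ID) AS (
        -- anchor: the item itself
        SELECT ID FROM MenuItems WHERE ID = ?
        UNION ALL
        -- recursive step: rows whose parent is already in the subtree
        SELECT m.ID FROM MenuItems m JOIN subtree s ON m.ParentId = s.ID
    )
    DELETE FROM MenuItems WHERE ID IN (SELECT ID FROM subtree)
""", (item_to_delete,))

remaining = [row[0] for row in conn.execute("SELECT ID FROM MenuItems ORDER BY ID")]
print(remaining)
```

Because the join condition is child.ParentId = collected.ID, the traversal walks down the tree to any depth; reversing it would walk up to the ancestors instead.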