SQL LEFT JOIN WITH SPLIT - sql

I want to do a left join between two tables whose join columns are not in the same format. I use REPLACE to remove the "[ ]", but I'm having trouble splitting one of the rows into two rows so that I can complete the join.
emp_tbl
+--------+-------+
| emp    | state |
+--------+-------+
| Steve  | [1]   |
| Greg   | [2|3] |
| Steve  | [4]   |
+--------+-------+
state_tbl
+------+------+
| id   | name |
+------+------+
| 1    | AL   |
| 2    | NV   |
| 3    | AZ   |
| 4    | NH   |
+------+------+
Desired output:
+--------+------+
| Steve | AL |
| Greg | NV |
| Greg | AZ |
| Steve | NH |
+--------+------+
SELECT emp_tbl.emp, state_tbl.name
FROM emp_tbl
LEFT JOIN state_tbl on state_tbl.id = REPLACE(REPLACE(emp_tbl.state, '[', ''), ']', '')
With this query I can remove the "[ ]" and do the join, but the row with two "states" obviously does not work.

Your query will never produce 4 rows because the left table only has 3 rows. You need to flatten the rows that contain multiple state ids before the join.
Prepare the table and data:
create or replace table emp_tbl (emp varchar, state string);
create or replace table state_tbl (id varchar, name varchar);
insert into emp_tbl values
('Steve', '[1]'), ('Greg', '[2|3]'), ('Steve', '[4]');
insert into state_tbl values
(1, 'AL'), (2, 'NV'), (3, 'AZ'), (4, 'NH');
Then the query below should give you the data you want:
with emp_tbl_tmp as (
select emp, parse_json(replace(state, '|', ',')) as states from emp_tbl
),
flattened_tbl as (
select emp, value as state_id from emp_tbl_tmp, table(flatten(input => states))
)
select emp, name from flattened_tbl emp
left join state_tbl state on (emp.state_id = state.id);
Or if you want to save one step:
with flattened_emp_tbl as (
select emp, value as state_id
from emp_tbl,
table(flatten(
input => parse_json(replace(state, '|', ','))
))
)
select emp, name from flattened_emp_tbl emp
left join state_tbl state
on (emp.state_id = state.id);
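The flatten-then-join idea is independent of Snowflake. As a minimal sketch in plain Python (the list and dict below are hypothetical stand-ins for emp_tbl and state_tbl, not part of the original answer), each bracketed value is expanded to one row per id before the lookup:

```python
# Sample rows copied from the question; the containers are illustrative
# stand-ins for the two tables.
emp_tbl = [("Steve", "[1]"), ("Greg", "[2|3]"), ("Steve", "[4]")]
state_tbl = {"1": "AL", "2": "NV", "3": "AZ", "4": "NH"}


def flatten_states(rows):
    """Yield one (emp, state_id) pair per id inside the bracketed list."""
    for emp, state in rows:
        for state_id in state.strip("[]").split("|"):
            yield emp, state_id


# Left join: keep every flattened row, with None when no state matches.
result = [(emp, state_tbl.get(state_id)) for emp, state_id in flatten_states(emp_tbl)]

for row in result:
    print(row)
```

The row for Greg becomes two rows before the lookup, which is why the result can have more rows than emp_tbl.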

Here is how you can do it:
select tw.emp, state_tbl.name
from emp_tbl tw,
lateral flatten(input => split(replace(replace(tw.state, '[', ''), ']', ''), '|')) s
left join state_tbl on s.value::string = state_tbl.id

Related

Update MyTable with values from AnotherTable (with self join)

I'm relatively new to SQL and am working through some practical tasks to gain experience. I'm struggling to update my custom overview table with values from another table when the lookup requires a join.
I have an overview table MyTable with column EmployeeID. AnotherTable contains data of employees with EmployeeID and their ManagerID.
I am able to retrieve ManagerName using different join methods, including:
SELECT m.first_name
FROM AnotherTable.employees e LEFT JOIN
AnotherTable.employees m
on m.EmployeeID = e.ManagerID
But I am getting stuck updating MyTable, as I usually receive errors such as "single row query returns more than one row" or "SQL command not properly ended". I've read that Oracle doesn't support joins when updating tables. How can I overcome this issue? Sample data would be:
MyTable
------------------------------
EmployeeID | SomeOtherColumns| ..
1 | SomeData |
2 | SomeData |
3 | SomeData |
4 | SomeData |
5 | SomeData |
------------------------------
OtherTable
-------------------------------------
EmployeeID | Name | ManagerID |
1 | Steve | - |
2 | John | 1 |
3 | Peter | 1 |
4 | Bob | 2 |
5 | Patrick | 3 |
6 | Connor | 1 |
-------------------------------------
And the result would be then:
MyTable
-------------------------------------------
EmployeeID | SomeOtherColumns |ManagerName|
1 | SomeData | - |
2 | SomeData | Steve |
3 | SomeData | Steve |
4 | SomeData | John |
5 | SomeData | Peter |
6 | SomeData | Steve |
-------------------------------------------
One of the options I tried is:
update MyTable
set MyTable.ManagerName = (
SELECT
(m.name) ManagerName
FROM
OtherTable.employees e
LEFT JOIN OtherTable.employees m ON
m.EmployeeID = e.ManagerID
)
But there I get "single row query returns more than one row" error. How is it possible to solve this?
You can use a hierarchical query:
UPDATE mytable m
SET managername = (SELECT name
FROM othertable
WHERE LEVEL = 2
START WITH employeeid = m.employeeid
CONNECT BY PRIOR managerid = employeeid);
or a self-join:
UPDATE mytable m
SET managername = (SELECT om.name
FROM othertable o
INNER JOIN othertable om
ON (o.managerid = om.employeeid)
WHERE o.employeeid = m.employeeid);
Which, for the sample data:
CREATE TABLE MyTable (EmployeeID, SomeOtherColumns, ManagerName) AS
SELECT LEVEL, 'SomeData', CAST(NULL AS VARCHAR2(20))
FROM DUAL
CONNECT BY LEVEL <= 5;
CREATE TABLE OtherTable(EmployeeID, Name, ManagerID) AS
SELECT 1, 'Alice', NULL FROM DUAL UNION ALL
SELECT 2, 'Beryl', 1 FROM DUAL UNION ALL
SELECT 3, 'Carol', 1 FROM DUAL UNION ALL
SELECT 4, 'Debra', 2 FROM DUAL UNION ALL
SELECT 5, 'Emily', 3 FROM DUAL UNION ALL
SELECT 6, 'Fiona', 1 FROM DUAL;
Then after either update, MyTable contains:
EMPLOYEEID | SOMEOTHERCOLUMNS | MANAGERNAME
1 | SomeData | null
2 | SomeData | Alice
3 | SomeData | Alice
4 | SomeData | Beryl
5 | SomeData | Carol
Note: Keeping this data violates third normal form; instead, you should keep the employee name in the table with the other employee data and then, when you want to display the manager's name, use SELECT ... FROM ... LEFT OUTER JOIN or a hierarchical query to include the result. What you do not want to do is duplicate the data, as it then has the potential to become out of sync when something changes.
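The self-join update can also be tried without an Oracle instance. The following is a rough sketch using Python's built-in sqlite3 module, which is an assumption made for portability (SQLite has no CONNECT BY, so only the correlated self-join variant is shown, and the column types are guesses):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (employeeid INTEGER, someothercolumns TEXT, managername TEXT);
    CREATE TABLE othertable (employeeid INTEGER, name TEXT, managerid INTEGER);
    INSERT INTO mytable (employeeid, someothercolumns) VALUES
        (1, 'SomeData'), (2, 'SomeData'), (3, 'SomeData'),
        (4, 'SomeData'), (5, 'SomeData');
    INSERT INTO othertable VALUES
        (1, 'Alice', NULL), (2, 'Beryl', 1), (3, 'Carol', 1),
        (4, 'Debra', 2), (5, 'Emily', 3), (6, 'Fiona', 1);
""")

# Correlated subquery: the WHERE clause ties the subquery to the row being
# updated, so it returns at most one manager name per employee.
conn.execute("""
    UPDATE mytable
    SET managername = (SELECT om.name
                       FROM othertable o
                       INNER JOIN othertable om ON o.managerid = om.employeeid
                       WHERE o.employeeid = mytable.employeeid)
""")

managers = conn.execute(
    "SELECT employeeid, managername FROM mytable ORDER BY employeeid"
).fetchall()
print(managers)
```

The question's attempt failed because its subquery had no such correlation, so it returned a manager for every employee at once.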

How to handle duplicate row with join

I have a table bic_table.
-------------------------
KeyInstn | SwiftBICCode
100369 | BOFAUSV1
100369 | MLCOUS33
keyInstn_table
-------------------------
KeyInstn | country
100369 | USA
100370 | India
I am trying to join keyInstn_table with bic_table.
I want the SwiftBICCode values for each KeyInstn combined into a single comma-separated value.
How to get the result as
-------------------------
KeyInstn | country | SwiftBICCode
100369 | USA | BOFAUSV1,MLCOUS33
100370 | India | BOFH76HG
-------------------------
If your database version is SQL Server 2017+, then you can use the following:
SELECT a.keyInstn, country,STRING_AGG(SwiftBICCode, ', ') AS SwiftBICCode
FROM tablename a inner join keyInstn_table b on a.keyInstn=b.KeyInstn
GROUP BY a.keyInstn,country
Alternatively, you can use stuff() for lower versions of SQL Server
select u.keyInstn, country,
stuff(( select concat( ',', SwiftBICCode) from tablename y
where y.keyInstn= u.keyInstn for xml path('')),1,1, '')
from tablename u inner join keyInstn_table b on u.keyInstn=b.KeyInstn
group by u.keyInstn,country
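For a quick local check of the aggregation idea, here is a small sketch using Python's sqlite3 module. SQLite's group_concat stands in for STRING_AGG, and a LEFT JOIN is used instead of the inner join above so that institutions without any BIC row still appear; both choices are assumptions of this sketch, not part of the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bic_table (keyinstn INTEGER, swiftbiccode TEXT);
    CREATE TABLE keyinstn_table (keyinstn INTEGER, country TEXT);
    INSERT INTO bic_table VALUES (100369, 'BOFAUSV1'), (100369, 'MLCOUS33');
    INSERT INTO keyinstn_table VALUES (100369, 'USA'), (100370, 'India');
""")

# One output row per institution; the codes are collapsed into one string.
rows = conn.execute("""
    SELECT k.keyinstn, k.country, group_concat(b.swiftbiccode, ',') AS swiftbiccode
    FROM keyinstn_table k
    LEFT JOIN bic_table b ON b.keyinstn = k.keyinstn
    GROUP BY k.keyinstn, k.country
    ORDER BY k.keyinstn
""").fetchall()

for row in rows:
    print(row)
```

The order of the codes inside the concatenated string is not guaranteed here, so code that depends on it should sort explicitly.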

Multiple select from CTE with different number of rows in a StoredProcedure

How can I run two SELECTs with joins against the same CTEs so that each SELECT returns its own result set, with its own columns?
I tried doing union but that appends to the same list and there is no way to differentiate for further use.
WITH campus AS
(SELECT DISTINCT CampusName, DistrictName
FROM dbo.file
),creditAcceptance AS
(SELECT CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal, COUNT(id) AS N
FROM dbo.file
WHERE (EligibilityStatusFinal LIKE 'Eligible%') AND (CollegeCreditEarnedFinal = 'Yes') AND (CollegeCreditAcceptedFinal = 'Yes')
GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal
),eligibility AS
(SELECT CampusName, EligibilityStatusFinal, COUNT(id) AS N, CollegeCreditAcceptedFinal
FROM dbo.file
WHERE (EligibilityStatusFinal LIKE 'Eligible%')
GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal
)
SELECT a.CampusName, c.[EligibilityStatusFinal], SUM(c.N) AS creditacceptCount
FROM campus as a FULL OUTER JOIN creditAcceptance as c ON a.CampusName=c.CampusName
WHERE (a.DistrictName = 'xy')
group by a.CampusName ,c.EligibilityStatusFinal
Union ALL
SELECT a.CampusName , b.[EligibilityStatusFinal], SUM(b.N) AS eligible
From Campus as a FULL OUTER JOIN eligibility as b ON a.CampusName = b.CampusName
WHERE (a.DistrictName = 'xy')
group by a.CampusName,b.EligibilityStatusFinal
Expected output:
+------------+------------------------+--------------------+
| CampusName | EligibilityStatusFinal | creditacceptCount |
+------------+------------------------+--------------------+
| M | G | 1 |
| E | NULL | NULL |
| A | G | 4 |
| B | G | 8 |
+------------+------------------------+--------------------+
+------------+------------------------+----------+
| CampusName | EligibilityStatusFinal | eligible |
+------------+------------------------+----------+
| A | G | 8 |
| C | G | 9 |
| A | T | 9 |
+------------+------------------------+----------+
A CTE can be used in a single statement only, so you can't get the expected output with CTEs.
Here is an excerpt from Microsoft docs:
A CTE must be followed by a single SELECT, INSERT, UPDATE, or DELETE
statement that references some or all the CTE columns. A CTE can also
be specified in a CREATE VIEW statement as part of the defining SELECT
statement of the view.
You can use table variables (declare @campus table(...)) or temp tables (create table #campus (...)) instead.
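The temp-table approach can be illustrated with a small sketch. It uses Python's sqlite3 module rather than SQL Server, and the student_file table, its columns, and its rows are hypothetical simplifications of dbo.file:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for dbo.file.
    CREATE TABLE student_file (id INTEGER, campusname TEXT, districtname TEXT,
                               eligible INTEGER, accepted INTEGER);
    INSERT INTO student_file VALUES
        (1, 'A', 'xy', 1, 1), (2, 'A', 'xy', 1, 0),
        (3, 'B', 'xy', 1, 1), (4, 'C', 'zz', 1, 1);

    -- Materialize the shared intermediate result once; unlike a CTE it
    -- stays available to every later statement on this connection.
    CREATE TEMP TABLE campus AS
        SELECT DISTINCT campusname FROM student_file WHERE districtname = 'xy';
""")

# First result set: accepted credits per campus.
accepted = conn.execute("""
    SELECT c.campusname, COUNT(f.id)
    FROM campus c
    LEFT JOIN student_file f
        ON f.campusname = c.campusname AND f.eligible = 1 AND f.accepted = 1
    GROUP BY c.campusname ORDER BY c.campusname
""").fetchall()

# Second result set: eligible students per campus, from the same temp table.
eligible = conn.execute("""
    SELECT c.campusname, COUNT(f.id)
    FROM campus c
    LEFT JOIN student_file f
        ON f.campusname = c.campusname AND f.eligible = 1
    GROUP BY c.campusname ORDER BY c.campusname
""").fetchall()

print(accepted, eligible)
```

Each SELECT returns its own result set, so the caller can tell them apart without a UNION.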

Redshift create all the combinations of any length for the values in one column

How can we create all the combinations of any length for the values in one column and return the distinct count of another column for that combination?
Table:
+------+--------+
| Type | Name |
+------+--------+
| A | Tom |
| A | Ben |
| B | Ben |
| B | Justin |
| C | Ben |
+------+--------+
Output Table:
+-------------+-------+
| Combination | Count |
+-------------+-------+
| A | 2 |
| B | 2 |
| C | 1 |
| AB | 3 |
| BC | 2 |
| AC | 2 |
| ABC | 3 |
+-------------+-------+
When the combination is only A, there are Tom and Ben so it's 2.
When the combination is only B, 2 distinct names so it's 2.
When the combination is A and B, 3 distinct names: Tom, Ben, Justin so it's 3.
I'm working in Amazon Redshift. Thank you!
NOTE: This answers the original version of the question which was tagged Postgres.
You can generate all combinations with this code
with recursive td as (
select distinct type
from t
),
cte as (
select td.type, td.type as lasttype, 1 as len
from td
union all
select cte.type || td.type, td.type as lasttype, cte.len + 1
from cte join
td
on td.type > cte.lasttype
)
select type
from cte;
You can then use this in a join:
with recursive td as (
select distinct type
from t
),
cte as (
select td.type, td.type as lasttype, 1 as len
from td
union all
select cte.type || td.type, td.type as lasttype, cte.len + 1
from cte join
td
on td.type > cte.lasttype
)
select type, count(*)
-- the subquery yields one row per (combination, name), so count(*) counts distinct names
from (select t.name, cte.type
from cte join
t
on cte.type like '%' || t.type || '%'
group by t.name, cte.type
) x
group by type
order by type;
There is no way to generate all possible combinations (A, B, C, AB, AC, BC, etc) in Amazon Redshift.
(Well, you could select each unique value, smoosh them into one string, send it to a User-Defined Function, extract the result into multiple rows and then join it against a big query, but that really isn't something you'd like to attempt.)
One approach would be to create a table containing all possible combinations; you'd need to write a little program to do that (e.g. using itertools in Python). Then you could join the data against that table reasonably easily to get the desired result (e.g. IF 'ABC' CONTAINS '%A%').
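As a rough sketch of the "little program" idea, the following uses itertools.combinations on the sample rows from the question. It computes the distinct-name counts directly in Python instead of loading a combinations table into Redshift, which is a simplification made here for illustration:

```python
from itertools import combinations

# Sample rows from the question: (type, name).
rows = [("A", "Tom"), ("A", "Ben"), ("B", "Ben"), ("B", "Justin"), ("C", "Ben")]

# Group the distinct names under each type.
names_by_type = {}
for type_, name in rows:
    names_by_type.setdefault(type_, set()).add(name)

types = sorted(names_by_type)
counts = {}
for size in range(1, len(types) + 1):
    for combo in combinations(types, size):
        # Distinct names across every type in the combination.
        names = set().union(*(names_by_type[t] for t in combo))
        counts["".join(combo)] = len(names)

for combo, count in counts.items():
    print(combo, count)
```

The number of combinations doubles with every additional type, so this only stays practical for a small number of distinct types.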

Keep null relations on WHERE IN() or with SELECT and LEFT JOIN

I have a table like this:
table:
| id | fkey | label | amount |
|----|------|-------|--------|
| 1 | 1 | aaa | 10 |
| 2 | 1 | bbb | 15 |
| 3 | 1 | fff | 99 |
| 4 | 1 | jjj | 33 |
| 5 | 2 | fff | 10 |
fkey is a foreign key to other table.
Now I need to query all amounts associated with some labels ('bbb', 'eee', 'fff') and with a specified fkey, but I need to keep all non-existent labels with NULL.
With a simple query using WHERE IN ('bbb', 'eee', 'fff') I get, of course, only two rows:
SELECT label, amount FROM table WHERE label IN ('bbb', 'eee', 'fff') AND fkey = 1;
| label | amount |
|-------|--------|
| bbb | 15 |
| fff | 99 |
but the expected result should be:
| label | amount |
|-------|--------|
| bbb | 15 |
| eee | NULL |
| fff | 99 |
I also tried SELECT label UNION ALL label (...) LEFT JOIN, which should work on MySQL (Keep all records in "WHERE IN()" clause, even if they are not found):
SELECT T.label, T.amount FROM (
SELECT 'bbb' AS "lbl"
UNION ALL SELECT 'eee' AS "lbl"
UNION ALL SELECT 'fff' AS "lbl"
) LABELS
LEFT OUTER JOIN "table" T ON (LABELS."lbl" = T."label")
WHERE T.fkey = 1;
and also with a WITH statement:
WITH LABELS AS (
SELECT 'bbb' AS "lbl"
UNION ALL SELECT 'eee' AS "lbl"
UNION ALL SELECT 'fff' AS "lbl"
)
SELECT T.label, T.amount FROM LABELS
LEFT OUTER JOIN "table" T ON (LABELS."lbl" = T."label")
WHERE T.fkey = 1;
but this LEFT JOIN always gives me 2 rows instead of 3.
Creating a temporary table doesn't work at all (I get 0 rows from it, so I cannot use it in a join):
CREATE TEMPORARY TABLE __ids (
id VARCHAR(9) PRIMARY KEY
) ON COMMIT DELETE ROWS;
INSERT INTO __ids (id) VALUES
('bbb'),
('eee'),
('fff');
SELECT
*
FROM __ids
Any idea how to force Postgres to keep the empty relation? Or any other idea to get label 'eee' with a NULL amount if there is no row for it in the table?
List of labels can be different on every request.
This case online: http://rextester.com/CRQY46630
------ EDIT -----
I extended this question with a WHERE filter, because the answer from a_horse_with_no_name is great but does not cover my whole case (I assumed the WHERE did not matter here).
Your approach with the outer join does work. You just need to take the label value from the "outer" joined table, not from the "table":
with labels (lbl) as (
values ('bbb'), ('eee'), ('fff')
)
select l.lbl, --<< this is different to your query
t.amount
from labels l
left outer join "table" t on l.lbl = t.label;
Online example: http://rextester.com/LESK82163
Edit after the scope of the question was extended.
If you want to filter on the base table, you need to move the condition into the JOIN condition, not the WHERE clause:
with labels (lbl) as (
values ('bbb'), ('eee'), ('fff')
)
select l.lbl, --<< this is different to your query
t.amount
from labels l
left outer join "table" t
on l.lbl = t.label
and t.fkey = 1; --<<
Online example: http://rextester.com/XDO76971
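The difference between filtering in ON and filtering in WHERE can be reproduced locally. This is a minimal sketch using Python's sqlite3 module instead of Postgres, with the table renamed to a hypothetical items to avoid the reserved word:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, fkey INTEGER, label TEXT, amount INTEGER);
    INSERT INTO items VALUES
        (1, 1, 'aaa', 10), (2, 1, 'bbb', 15), (3, 1, 'fff', 99),
        (4, 1, 'jjj', 33), (5, 2, 'fff', 10);
""")

labels_cte = "WITH labels(lbl) AS (VALUES ('bbb'), ('eee'), ('fff'))"

# Filter in the ON clause: unmatched labels survive with a NULL amount.
on_rows = conn.execute(labels_cte + """
    SELECT l.lbl, t.amount
    FROM labels l
    LEFT JOIN items t ON l.lbl = t.label AND t.fkey = 1
    ORDER BY l.lbl
""").fetchall()

# Filter in the WHERE clause: rows where t.fkey is NULL are discarded,
# which effectively turns the outer join into an inner join.
where_rows = conn.execute(labels_cte + """
    SELECT l.lbl, t.amount
    FROM labels l
    LEFT JOIN items t ON l.lbl = t.label
    WHERE t.fkey = 1
    ORDER BY l.lbl
""").fetchall()

print(on_rows)
print(where_rows)
```

The WHERE version drops 'eee' because its joined row has a NULL fkey, and NULL = 1 is never true.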