Lookup table from another table and create a new column - sql

I would like to do something in SQL Server similar to a lookup table in Excel.
I cannot normalize the data because the original table has over 8 million rows; my laptop crashed when I tried.
How can I put the data into the relevant columns, and create a new column for data that is not found?
For example:
TableA
Type1 Type2
---------------------
A F
B G
C H
D I
E NULL
TableB
ID Country AllTypes
---------------------------------
1 Italy A, B, C
2 USA D, E, A, F
4 Japan I, O, Z
5 UK NULL
Using these two tables, I would like to get output such as:
ID Country AllTypes Type1 Type2 UnCaptured
----------------------------------------------------------------------
1 Italy A, B, C A, B, C NULL NULL
2 USA D, E, G, F D, E G, F NULL
4 Japan I, O, Z NULL I O, Z
5 UK NULL NULL NULL NULL
Edit: thanks to the answers posted before I edited my question, this is what I have so far:
with TableA as (
select 'A' as Type1, 'F' as Type2 union all
select 'B', 'G' union all
select 'C', 'H' union all
select 'D', 'I' union all
select 'E', NULL
),
TableB as (
select 1 as ID, 'Italy' as Country, 'A, B, C' as Alltypes union all
select 2, 'USA', 'D, E, A, F' union all
select 4, 'Japan', 'I, O, Z' union all
select 5, 'UK', NULL
)
select b.Id, b.Country, b.Alltypes,
String_Agg(v.type1,',') Type1,
String_Agg(v.type2,',') Type2,
String_Agg(v.Type3,',') Uncaptured
from tableb b
outer apply (
select Trim(value) t,
case when exists
(select * from tablea a where a.type1=Trim(value))
then Trim(value) end type1,
case when exists
(select * from tablea a where a.type2=Trim(value))
then Trim(value) end Type2,
Case when not exists
( (select * from tablea a where a.type1=Trim(value))
and
(select * from tablea a where a.type2=Trim(value))
) then Trim(value) end Type3
from String_Split(alltypes, ',')
)v
group by Id, Country, AllTypes
But it raises an error. I also tried using ELSE, but that did not work either.
Could you help me please?

Having multiple delimited values in a single column is always going to be problematic. One way is to use a combination of string_split and string_agg, if you are using SQL Server 2017+:
select b.Id, b.Country, b.Alltypes,
String_Agg(v.type1,',') Type1,
String_Agg(v.type2,',') Type2
from tableb b
outer apply (
select Trim(value) t,
case when exists
(select * from tablea a where a.type1=Trim(value))
then Trim(value) end type1,
case when exists
(select * from tablea a where a.type2=Trim(value))
then Trim(value) end type2
from String_Split(alltypes, ',')
)v
group by Id, Country, AllTypes

Too long for a comment. Any chance you can refactor the DB?
I would suggest
tableA
col1 Type
A 1
B 1
..
F 2
G 2
Country
id Name
1 Italy
2 USA
Country_TableA
countryId aId
1 A
1 B
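To make the suggested Country_TableA junction table concrete, here is a minimal Python sketch of how its rows could be derived from the delimited AllTypes column. The function name `junction_rows` and the in-memory sample list are illustrative assumptions, not part of the original schema; with 8 million rows the generator would presumably be fed from a cursor and written out in batches rather than collected into a list.

```python
def junction_rows(table_b):
    """Yield (country_id, type_code) pairs, one per value in AllTypes."""
    for country_id, _country, all_types in table_b:
        if all_types is None:
            continue  # no types recorded for this country
        for token in all_types.split(","):
            token = token.strip()
            if token:  # skip empty fragments such as a trailing comma
                yield country_id, token


# Sample data copied from TableB in the question (hypothetical in-memory form).
table_b = [
    (1, "Italy", "A, B, C"),
    (2, "USA", "D, E, A, F"),
    (4, "Japan", "I, O, Z"),
    (5, "UK", None),
]

rows = list(junction_rows(table_b))
print(rows)
```

Each yielded pair corresponds to one row of Country_TableA (countryId, aId), so the lookup against tableA becomes an ordinary join instead of a string split.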

As commented previously, combining string_split and string_agg allows you to reach what you want. This is my (shorter) version:
select
b.ID, b.Country, b.Alltypes,
string_agg(a.Type1, ',') as Type1,
string_agg(aa.Type2, ',') as Type2
from TableB b
outer apply string_split(b.Alltypes, ',')
left join TableA a
on a.Type1 = ltrim(rtrim(value))
left join TableA aa
on aa.Type2 = ltrim(rtrim(value))
group by b.ID, b.Country, b.Alltypes
You can test on this db<>fiddle

Related

Adding values to a column in SQL(Snowflake) from another table

I have two tables A,B
Table A:
uid category
1 a
1 b
1 c
2 b
2 d
Table B:
category
d
e
Table A contains the user id and category.
Table B contains the top 2 categories most often selected by users.
How can I add the categories from table B to the category column in table A, but only the distinct values?
Final result
uid category
1 a
1 b
1 c
1 d
1 e
2 b
2 d
2 e
It is possible to generate the missing rows by performing a CROSS JOIN of the distinct UIDs from tableA and the categories from tableB:
WITH cte AS (
SELECT A.UID, B.CATEGORY
FROM (SELECT DISTINCT UID FROM tableA) AS A
CROSS JOIN tableB AS B
)
SELECT A.UID, A.CATEGORY
FROM tableA AS A
UNION ALL
SELECT C.UID, C.CATEGORY
FROM cte AS c
WHERE (c.UID, c.category) NOT IN (SELECT A.UID, A.CATEGORY
FROM tableA AS A)
ORDER BY 1,2;
Sample input:
CREATE OR REPLACE TABLE tableA(uid INT, category TEXT)
AS
SELECT 1,'a' UNION ALL
SELECT 1,'b' UNION ALL
SELECT 1,'c' UNION ALL
SELECT 2,'b' UNION ALL
SELECT 2,'d';
CREATE OR REPLACE TABLE tableB(category TEXT)
AS
SELECT 'd' UNION ALL SELECT 'e';
Let UNION take care of the duplicates:
select uid, category
from t1
union
select uid, category
from (select distinct uid from t1) t1 cross join t2
order by uid, category
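As a quick sanity check of the UNION approach without a Snowflake account, the sketch below runs an equivalent query against an in-memory SQLite database from Python. It assumes SQLite behaves the same as Snowflake for this simple query; the aliases `u` and `b` and the variable names are illustrative, and the table names follow the sample input (tableA, tableB) rather than t1 and t2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA(uid INT, category TEXT);
    INSERT INTO tableA VALUES (1,'a'),(1,'b'),(1,'c'),(2,'b'),(2,'d');
    CREATE TABLE tableB(category TEXT);
    INSERT INTO tableB VALUES ('d'),('e');
""")

# UNION (without ALL) removes pairs that already exist in tableA,
# such as (2, 'd'), so no separate NOT IN filter is needed.
result = conn.execute("""
    SELECT uid, category FROM tableA
    UNION
    SELECT u.uid, b.category
    FROM (SELECT DISTINCT uid FROM tableA) AS u
    CROSS JOIN tableB AS b
    ORDER BY uid, category
""").fetchall()

for row in result:
    print(row)
```

One difference from the UNION ALL version: plain UNION also collapses any duplicate rows already present in tableA, which may or may not be what you want.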

Split comma separated values based on another table

I would like to split comma-separated values based on another table.
I cannot normalize the data because the original table has over 8 million rows; my laptop crashed when I tried.
How can I put the data into the relevant columns, and create a new column for data that is not found?
For example:
TableA
Type1 Type2
---------------------
A F
B G
C H
D I
E NULL
TableB
ID Country AllTypes
---------------------------------
1 Italy A, B, C
2 USA D, E, A, F
4 Japan I, O, Z
5 UK NULL
Using these two tables, I would like to get output such as:
ID Country AllTypes Type1 Type2 UnCaptured
----------------------------------------------------------------------
1 Italy A, B, C A, B, C NULL NULL
2 USA D, E, G, F D, E G, F NULL
4 Japan I, O, Z NULL I O, Z
5 UK NULL NULL NULL NULL
This is what I have done so far:
with TableA as (
select 'A' as Type1, 'F' as Type2 union all
select 'B', 'G' union all
select 'C', 'H' union all
select 'D', 'I' union all
select 'E', NULL
),
TableB as (
select 1 as ID, 'Italy' as Country, 'A, B, C' as Alltypes union all
select 2, 'USA', 'D, E, A, F' union all
select 4, 'Japan', 'I, O, Z' union all
select 5, 'UK', NULL
)
select b.Id, b.Country, b.Alltypes,
String_Agg(v.type1,',') Type1,
String_Agg(v.type2,',') Type2
String_Agg(v.Type3,',') Uncaptured ------- This query
from tableb b
outer apply (
select Trim(value) t,
case when exists
(select * from tablea a where a.type1=Trim(value))
then Trim(value) end type1,
case when exists
(select * from tablea a where a.type2=Trim(value))
then Trim(value) end Type2,
Case when not exists ------------This query
( (select * from tablea a where a.type1=Trim(value)) -------
and ------
(select * from tablea a where a.type2=Trim(value))------
) then Trim(value) end Type3 -------------
from String_Split(alltypes, ',')
)v
group by Id, Country, AllTypes
Without the highlighted lines (marked with -----), which create the new Uncaptured column, the query works and returns the following:
Id Country Alltypes Type1 Type2
1 Italy A, B, C A,B,C NULL
2 USA D, E, A, F D,E,A F
4 Japan I, O, Z I NULL
5 UK NULL NULL NULL
But when I add the highlighted lines, it raises an error. I also tried using ELSE, but that did not work either.
Could someone help me please?
----------------------- DDL+DML: Should have been provided by the OP !
DROP TABLE IF EXISTS TableA,TableB
GO
create table TableA(Type1 CHAR(1), Type2 char(1))
GO
INSERT TableA (Type1,Type2) VALUES
('A', 'F' ),
('B', 'G' ),
('C', 'H' ),
('D', 'I' ),
('E', NULL )
GO
CREATE TABLE TableB (ID INT, Country NVARCHAR(100), AllTypes NVARCHAR(100))
GO
INSERT TableB (ID,Country,AllTypes)VALUES
(1, 'Italy','A, B, C' ),
(2, 'USA ','D, E, G, F' ),
(4, 'Japan','I, O, Z' ),
(5, 'UK ',NULL )
GO
----------------------- Solution
;WITH MyCTE AS (
SELECT ID,Country,AllTypes, MyType = TRIM([value])
FROM TableB
OUTER APPLY string_split(AllTypes,',') -- OUTER APPLY keeps rows whose AllTypes is NULL
)
,MyCTE02 as (
SELECT ID,Country,AllTypes, MyType,a1.Type1,a2.Type2,
UnCaptured = CASE WHEN a1.Type1 IS NULL and a2.Type2 IS NULL THEN MyType END
FROM MyCTE c
LEFT JOIN TableA a1 ON c.MyType = a1.Type1
LEFT JOIN TableA a2 ON c.MyType = a2.Type2
)
SELECT ID,Country,AllTypes--,MyType
,Type1 = STRING_AGG(Type1,','),Type2 = STRING_AGG(Type2,','),UnCaptured = STRING_AGG(UnCaptured,',')
FROM MyCTE02
GROUP BY ID,Country,AllTypes
GO
How about
outer apply (
select Trim(value) t, a1.type1, a2.type2,
CASE WHEN COALESCE(a1.type1, a2.type2) IS NULL THEN Trim(s.value) END unCaptured
from String_Split(alltypes, ',') s
left join tablea a1 on a1.type1=Trim(s.value)
left join tablea a2 on a2.type2=Trim(s.value)
)v
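The rule both answers apply is "a value is uncaptured when it matches neither Type1 nor Type2". The question's attempt wraps two subqueries joined by AND inside a single NOT EXISTS, which T-SQL does not accept; the condition needs to be expressed as two separate checks (or as the NULL test on the two left joins shown above). A minimal Python sketch of the same per-row logic follows. The function name `classify` and the set variables are illustrative assumptions, and the sample values come from TableA and TableB in the answer's DDL.

```python
def classify(all_types, type1_values, type2_values):
    """Split a delimited string and bucket each value by which lookup it matches."""
    if all_types is None:
        return None, None, None
    buckets = {"type1": [], "type2": [], "uncaptured": []}
    for token in all_types.split(","):
        token = token.strip()  # same role as TRIM(value)
        matched = False
        if token in type1_values:
            buckets["type1"].append(token)
            matched = True
        if token in type2_values:
            buckets["type2"].append(token)
            matched = True
        if not matched:  # neither lookup matched: the UnCaptured column
            buckets["uncaptured"].append(token)
    # Like STRING_AGG, an empty bucket becomes NULL (None) rather than ''.
    return tuple(",".join(values) if values else None for values in buckets.values())


type1_values = {"A", "B", "C", "D", "E"}  # TableA.Type1
type2_values = {"F", "G", "H", "I"}       # TableA.Type2 (NULL omitted)

for all_types in ["A, B, C", "D, E, G, F", "I, O, Z", None]:
    print(all_types, "->", classify(all_types, type1_values, type2_values))
```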

Count distinct by link on sql?

I'm trying to count distinct by the link between two columns.
Here is the example.
rownum type id
1 A a
2 A b
3 B b
4 B c
5 C c
6 C d
If I count distinct values of the type column, the result is 3. However, I would like rownum 2 and 3, and rownum 4 and 5, to be treated as not distinct because they share the same value in the id column.
To rephrase:
type array of id
A a, b
B b, c
C c, d
Since A and B share b, and B and C share c in their arrays, the result would be 1.
I have no idea where to start. I would appreciate any hint.
Consider the query below, which uses STRING_AGG to collect the ids for each type:
WITH TMP_TBL AS
(
SELECT 1 AS ROWNUM, 'A' AS TYPE, 'a' AS ID UNION ALL
SELECT 2,'A','b' UNION ALL
SELECT 3,'B','b' UNION ALL
SELECT 4,'B','c' UNION ALL
SELECT 5,'C','c' UNION ALL
SELECT 6,'C','d'
)
SELECT DISTINCT TYPE,N_ID
FROM
(
SELECT TYPE,STRING_AGG(ID)OVER(PARTITION BY TYPE) AS N_ID FROM TMP_TBL
)
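The query above only lists the ids per type; it does not count the linked groups. Counting types that are chained together through shared ids is a connected-components problem, so one possible starting point is a union-find structure. The Python sketch below is an illustration of that idea rather than a SQL solution, and the function name `count_linked_groups` is an assumption made for the example.

```python
def count_linked_groups(rows):
    """Count groups of types that are connected through shared ids."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Tag nodes so a type named 'a' never collides with an id named 'a'.
    for type_, id_ in rows:
        union(("type", type_), ("id", id_))

    return len({find(("type", type_)) for type_, _ in rows})


rows = [("A", "a"), ("A", "b"), ("B", "b"), ("B", "c"), ("C", "c"), ("C", "d")]
print(count_linked_groups(rows))  # 1
```

In SQL the same result would typically need a recursive CTE or an iterative update, depending on what the database supports.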

Get Distinct values without null

I have a table like this:
--Table_Name--
A | B | C
-----------------
A1 NULL NULL
A1 NULL NULL
A2 NULL NULL
NULL B1 NULL
NULL B2 NULL
NULL B3 NULL
NULL NULL C1
I want to get a result like this:
--Table_Name--
A | B | C
-----------------
A1 B1 C1
A2 B2 NULL
NULL B3 NULL
How should I do that?
Here's one option:
The sample data is on lines #1 - 9.
The following CTEs (lines #11 - 13) fetch ranked distinct non-null values from each column.
The final query (line #15 onward) returns the desired result by outer joining the previous CTEs on the ranked value.
SQL> with test (a, b, c) as
2 (select 'A1', null, null from dual union all
3 select 'A1', null, null from dual union all
4 select 'A2', null, null from dual union all
5 select null, 'B1', null from dual union all
6 select null, 'B2', null from dual union all
7 select null, 'B3', null from dual union all
8 select null, null, 'C1' from dual
9 ),
10 --
11 ta as (select distinct a, dense_rank() over (order by a) rn from test where a is not null),
12 tb as (select distinct b, dense_rank() over (order by b) rn from test where b is not null),
13 tc as (select distinct c, dense_rank() over (order by c) rn from test where c is not null)
14 --
15 select ta.a, tb.b, tc.c
16 from ta full outer join tb on ta.rn = tb.rn
17 full outer join tc on ta.rn = tc.rn
18 order by a, b, c
19 /
A B C
-- -- --
A1 B1 C1
A2 B2
B3
SQL>
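The idea behind the query, ranking each column's distinct non-null values and then lining the columns up by rank, can be sketched in a few lines of Python. This is only an illustration of the logic under the sample data above; the function name `align_distinct` is an assumption.

```python
from itertools import zip_longest


def align_distinct(rows):
    """Rank each column's distinct non-null values, then align columns by rank."""
    columns = zip(*rows)  # transpose rows into columns
    ranked = [sorted({v for v in col if v is not None}) for col in columns]
    # zip_longest pads shorter columns with None, like the full outer joins do.
    return list(zip_longest(*ranked))


rows = [
    ("A1", None, None),
    ("A1", None, None),
    ("A2", None, None),
    (None, "B1", None),
    (None, "B2", None),
    (None, "B3", None),
    (None, None, "C1"),
]

result = align_distinct(rows)
for row in result:
    print(row)
```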
If each row has a value in only one column, then I think a simpler solution is to enumerate the values and aggregate:
select max(a) as a, max(b) as b, max(c) as c
from (select t.*,
dense_rank() over (partition by (case when a is null then 1 else 2 end),
(case when b is null then 1 else 2 end),
(case when c is null then 1 else 2 end)
order by a, b, c
) as seqnum
from t
) t
group by seqnum;
This only "aggregates" once and only uses one window function, so I think it should have better performance than handling each column individually.
Another approach is to use lateral joins which are available in Oracle 12C -- but this assumes that the types are compatible:
select max(case when which = 'a' then val end) as a,
max(case when which = 'b' then val end) as b,
max(case when which = 'c' then val end) as c
from (select which, val,
dense_rank() over (partition by which order by val) as seqnum
from t cross join lateral
(select 'a' as which, a as val from dual union all
select 'b', b from dual union all
select 'c', c from dual
) x
where val is not null
) t
group by seqnum;
The performance may be comparable, because the subquery removes so many rows.

SQL Query with group by clause, but counting two distinct values as if they were the same

I have a simple table with two columns, like the one below:
Id | Name
0 | A
1 | A
2 | B
3 | B
4 | C
5 | D
6 | E
7 | E
I want to make a SQL query which will count how many times each "Name" appears on the table. However, I need a few of these values to count as if they were the same. For example, a normal group by query would be:
select Name, count(*)
from table
group by Name
The above query would produce the result:
Name | Count
A | 2
B | 2
C | 1
D | 1
E | 2
but I need the query to count "A" and "B" as if they were only "A", and to count "D" and "E" as if they were only "D", so that the result would be like:
Name | Count
A | 4 // (2 "A"s + 2 "B"s)
C | 1
D | 3 // (1 "D" + 2 "E"s)
How can I make this kind of query?
You can do the translation with CASE. You can also use a subquery or CTE so you don't have to repeat yourself:
with cte as (
select
case Name
when 'B' then 'A'
when 'E' then 'D'
else Name
end as Name
from table
)
select Name, count(*)
from cte
group by Name
or with an inline translation table:
select
isnull(R.B, t.Name), count(*)
from table as t
left outer join (
select 'B', 'A' union all
select 'E', 'D'
) as R(A, B) on R.A = t.Name
group by isnull(R.B, t.Name)
If you need A and B, D and E, to count the same, you can build a query like this:
SELECT
CASE Name WHEN 'B' THEN 'A' WHEN 'E' THEN 'D' ELSE Name END as Name
, COUNT(*)
FROM table
GROUP BY CASE Name WHEN 'B' THEN 'A' WHEN 'E' THEN 'D' ELSE Name END
Demo on sqlfiddle.
With a layer of abstraction and a CASE (SQL Fiddle example):
;WITH x AS
(
SELECT CASE Name WHEN 'B' THEN 'A'
WHEN 'E' THEN 'D'
ELSE Name
END AS Name
FROM Table1
)
SELECT Name, COUNT(1)
FROM x
GROUP BY Name
With a translation table (SQL Fiddle):
CREATE TABLE Translate(FromName char(1), ToName char(1));
INSERT INTO Translate VALUES ('B', 'A'), ('E', 'D');
SELECT COALESCE(t.ToName, a.Name) Name, COUNT(1)
FROM Table1 a
LEFT OUTER JOIN Translate t ON a.Name = t.FromName
GROUP BY COALESCE(t.ToName, a.Name)
FWIW, you can also do this with a VALUES derived table instead of a real table (SQL Fiddle):
SELECT COALESCE(t.ToName, a.Name) Name, COUNT(1)
FROM Table1 a
LEFT OUTER JOIN
(
VALUES ('B', 'A'),
('E', 'D')
) t(FromName, ToName) ON a.Name = t.FromName
GROUP BY COALESCE(t.ToName, a.Name)
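To check the translation-table approach against the sample data from the question, the sketch below runs it on an in-memory SQLite database from Python. It assumes SQLite handles COALESCE and LEFT OUTER JOIN the same way SQL Server does for this query; the Table1 contents are copied from the question, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1(Id INT, Name TEXT);
    INSERT INTO Table1 VALUES
        (0,'A'),(1,'A'),(2,'B'),(3,'B'),(4,'C'),(5,'D'),(6,'E'),(7,'E');
    CREATE TABLE Translate(FromName TEXT, ToName TEXT);
    INSERT INTO Translate VALUES ('B','A'),('E','D');
""")

# Names without a Translate row fall through COALESCE unchanged.
counts = conn.execute("""
    SELECT COALESCE(t.ToName, a.Name) AS Name, COUNT(*) AS Cnt
    FROM Table1 AS a
    LEFT OUTER JOIN Translate AS t ON a.Name = t.FromName
    GROUP BY COALESCE(t.ToName, a.Name)
    ORDER BY 1
""").fetchall()

print(counts)  # [('A', 4), ('C', 1), ('D', 3)]
```

Keeping the mapping in a table means new aliases can be added with an INSERT instead of editing a CASE expression in every query.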
This works:
select t.a,count(t.id) from (
select case name when 'A' then 'A' when 'B' then 'A'
when 'C' then 'C'
when 'D' then 'D' when 'E' then 'D' end as A,id
from test) as t
group by A;