Group data based on different conditions - sql

Let's assume "column1" and "column2" are the only columns in the data table. I'm trying to group the data based on column2. For example, say the table is as below:
APL Apple
APL Apple
APL Apple
ORG Orange
ORG Apple
GVA Apple
APL is Apple and ORG is Orange, so those rows will be grouped into 'A' and 'B' respectively. ORG Apple and GVA Apple will be grouped into 'Others'. How do I do that? Do I need a lookup table? If yes, how do I merge the lookup table into the data table?

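SELECT Table1.col1, Table1.col2, ifnull(LkpTbl.grp, 'Others') grp
FROM Table1
left outer join LkpTbl
on Table1.col1 = LkpTbl.col1
and Table1.col2 = LkpTbl.col2
Put only the pairs that need their own group (APL/Apple and ORG/Orange) in your lookup table; every other row falls through to 'Others'.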

Do you just want a case expression?
select (case when column1 = 'APL' and column2 = 'Apple' then 'A'
when column1 = 'ORG' and column2 = 'Orange' then 'B'
else 'Others'
end) as column3
from t;
If you want this as a lookup table, it could be something like this:
select t.*, coalesce(v.grp, 'Others')
from t left join
(values ('APL', 'Apple', 'A'), ('ORG', 'Orange', 'B')
) v(column1, column2, grp)
on t.column1 = v.column1 and t.column2 = v.column2;
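To check that the CASE expression and the lookup join classify the sample rows the same way, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for whatever database you use, and the table name lkp is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (column1 TEXT, column2 TEXT);
    INSERT INTO t VALUES
        ('APL', 'Apple'), ('APL', 'Apple'), ('APL', 'Apple'),
        ('ORG', 'Orange'), ('ORG', 'Apple'), ('GVA', 'Apple');
    -- Hypothetical lookup table: only the pairs that get their own group.
    CREATE TABLE lkp (column1 TEXT, column2 TEXT, grp TEXT);
    INSERT INTO lkp VALUES ('APL', 'Apple', 'A'), ('ORG', 'Orange', 'B');
""")

# Variant 1: classify with a CASE expression.
case_rows = conn.execute("""
    SELECT column1, column2,
           CASE WHEN column1 = 'APL' AND column2 = 'Apple'  THEN 'A'
                WHEN column1 = 'ORG' AND column2 = 'Orange' THEN 'B'
                ELSE 'Others'
           END AS grp
    FROM t
    ORDER BY column1, column2
""").fetchall()

# Variant 2: classify with a LEFT JOIN against the lookup table;
# rows without a match get 'Others' through COALESCE.
lookup_rows = conn.execute("""
    SELECT t.column1, t.column2, COALESCE(l.grp, 'Others') AS grp
    FROM t
    LEFT JOIN lkp AS l
           ON t.column1 = l.column1 AND t.column2 = l.column2
    ORDER BY t.column1, t.column2
""").fetchall()

for row in lookup_rows:
    print(row)
```

The lookup table is easier to extend later, since adding a group means inserting a row rather than editing the query.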

Related

SQL CASE statement returns duplicate values

Here is how my data looks
title value
------------
t1 v1
t2 v2
t3 v3
Now I want t1 and t2 to be inferred as the same value t12. So, I do:
SELECT
CASE
WHEN title = 't1' OR title = 't2'
THEN 't12'
ELSE title
END AS inferred_title,
COUNT(value)
FROM
my_table
GROUP BY
inferred_title;
I expected the output to be:
inferred title values
-----------------------
t12 2
t3 1
But what I end up getting is:
inferred title values
--------------------------
t12 1
t12 1
t3 1
How do I make it behave the way I want it to? I don't want the duplicated rows.
The problem is scoping. You must have a column named inferred_title in the table, so the GROUP BY resolves to that column rather than to your alias. Either use a different column alias or repeat the expression:
SELECT (CASE WHEN title IN ('t1', 't2') THEN 't12'
ELSE title
END) AS inferred_title,
COUNT(value)
FROM my_table
GROUP BY (CASE WHEN title IN ('t1', 't2') THEN 't12'
ELSE title
END);
Do the "merge" case in a derived table (sub-query), group by its result:
SELECT inferred_title, COUNT(value)
FROM
(
SELECT CASE WHEN title = 't1' OR title = 't2' THEN 't12'
ELSE title
END AS inferred_title,
value
FROM my_table
) dt
GROUP BY inferred_title;
This saves you some typing, is less error prone and easier to maintain, and is ANSI SQL compliant!
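A minimal sketch of the derived-table approach on the question's sample data, assuming SQLite via Python's sqlite3 module rather than the asker's actual database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (title TEXT, value TEXT);
    INSERT INTO my_table VALUES ('t1', 'v1'), ('t2', 'v2'), ('t3', 'v3');
""")

# The CASE runs inside the derived table dt, so the outer query
# groups on the already-merged inferred_title values.
counts = conn.execute("""
    SELECT inferred_title, COUNT(value) AS n
    FROM (
        SELECT CASE WHEN title IN ('t1', 't2') THEN 't12' ELSE title END
                   AS inferred_title,
               value
        FROM my_table
    ) AS dt
    GROUP BY inferred_title
    ORDER BY inferred_title
""").fetchall()

print(counts)
```

Because the outer query only ever sees the derived column, there is no ambiguity about what inferred_title refers to.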
Select Title, COUNT(Title) AS Totals
From my_table
Group By Title
Having COUNT(Title)>1
Order By 2 desc

Using case on return value of subquery

I'd like to write a CASE expression that takes its input from an inner query. Please let me describe this in more detail.
Say I have a table:
create table food
( fruit varchar2(50),
chips varchar2(50)
);
with values
INSERT INTO food
(fruit, chips)
VALUES ('Apple', 'Paprika');
INSERT INTO food
(fruit, chips)
VALUES ('Orange', 'Salt');
I would like to write a query that will show:
fruit, chips and 1 if fruit is 'Apple' and 0 otherwise
which would give a result (example)
'Apple', 'Paprika', 1
'Orange', 'Salt', 0
I do not want to use joins for this. It has to be a subquery. That's a requirement I must comply with.
I've come up with the following query:
select f.fruit,
((case (select ff.fruit from food ff)
when ff.fruit = 'Apple' then 1 else 0 end ) as is_apple) from food f;
However, I get the following error ORA-00905: missing keyword
You don't need a subquery for this:
select fruit, chips,
case when fruit = 'Apple'
then 1
else 0
end as is_apple
from food
If the value must be the result of a subquery, you may use:
select fruit, chips,
(select case when f2.fruit = 'Apple'
then 1
else 0
end
from food f2
where f.rowid = f2.rowid
) as is_apple
from food f
If it has to be a subquery, then use dual:
select fruit, chips,
(select case food.fruit when 'Apple' then 1 else 0 end from dual) is_apple
from food;
or subselect from food using primary key (if your table contains it), or rowid:
select fruit, chips,
case (select fruit from food ff where ff.rowid = food.rowid)
when 'Apple' then 1
else 0
end is_apple
from food;
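The rowid-correlated variant can be tried outside Oracle as well. The sketch below assumes SQLite via Python's sqlite3 module; SQLite tables also expose a rowid, but it has no dual table, so only the correlated form is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE food (fruit TEXT, chips TEXT);
    INSERT INTO food VALUES ('Apple', 'Paprika'), ('Orange', 'Salt');
""")

# The scalar subquery is correlated on rowid, so it returns exactly
# one value for each outer row.
rows = conn.execute("""
    SELECT f.fruit, f.chips,
           (SELECT CASE WHEN f2.fruit = 'Apple' THEN 1 ELSE 0 END
            FROM food AS f2
            WHERE f2.rowid = f.rowid) AS is_apple
    FROM food AS f
    ORDER BY f.fruit
""").fetchall()

print(rows)
```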

Comparing two tables that don't have a unique key

I need to compare the data in two tables and check which attributes are mismatching. The tables have the same definition, but the problem is that I don't have a unique key to compare on. I tried to use
CONCAT(CONCAT(CONCAT(table1.A, Table1.B))
=CONCAT(CONCAT(CONCAT(table2.A, Table2.B))
but I am still getting duplicate rows. I also tried NVL on a few columns, but that didn't work:
SELECT
UT.cat,
PD.cat
FROM
EM UT, EM_63 PD
WHERE
NVL(UT.cat, 1) = NVL(PD.cat, 1) AND
NVL(UT.AT_NUMBER, 1) = NVL(PD.AT_NUMBER, 1) AND
NVL(UT.OFFSET, 1) = NVL(PD.OFFSET, 1) AND
NVL(UT.PROD, 1) = NVL(PD.PROD, 1)
;
There are 34k records in one table and 35k records in the other, but if I run the above query, it returns 3 million rows.
Columns in table:
COUNTRY
CATEGORY
TYPE
DESCRIPTION
Sample data :
Table 1 :
COUNTRY CATEGORY TYPE DESCRIPTION
US C T1 In
IN A T2 OUT
B C T2 IN
Y C T1 INOUT
Table 2:
COUNTRY CATEGORY TYPE DESCRIPTION
US C T2 In
IN B T2 Out
Q C T2 IN
Expected output:
column Matched unmatched
COUNTRY 2 1
CATEGORY 2 1
TYPE 2 1
DESCRIPTION 3 0
In the most general case (when you may have duplicate rows, and you want to see which rows exist in one table but not in the other, and ALSO which rows may exist in both tables, but the row exists 3 times in the first table but 5 times in the other):
This is a very common problem with a settled "best solution" which for some reason it seems most people are still not aware of, even though it was developed on AskTom many years ago and has been presented numerous times.
You do NOT need a join, you do not need a unique key of any kind, and you don't need to read either table more than once. The idea is to add two columns to show from which table each row comes, do a UNION ALL, then GROUP BY all the columns except the "source" columns and show the count for each table. Something like this:
select count(t_1) as count_table_1, count(t_2) as count_table_2, col1, col2, ...
from (
select 'x' as t_1, null as t_2, col1, col2, ...
from table_1
union all
select null as t_1, 'x' as t_2, col1, col2, ...
from table_2
)
group by col1, col2, ...
having count(t_1) != count(t_2)
;
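Here is a small runnable sketch of that UNION ALL plus GROUP BY comparison. It assumes SQLite via Python's sqlite3 module, and the two-column tables and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: ('a','x') appears twice in table_1.
    CREATE TABLE table_1 (col1 TEXT, col2 TEXT);
    CREATE TABLE table_2 (col1 TEXT, col2 TEXT);
    INSERT INTO table_1 VALUES ('a','x'), ('a','x'), ('b','y'), ('c','z');
    INSERT INTO table_2 VALUES ('a','x'), ('b','y'), ('d','w');
""")

# COUNT(t_1) counts only rows that came from table_1 (t_1 is NULL for
# rows from table_2), and vice versa. Rows with equal counts are dropped.
diff = conn.execute("""
    SELECT COUNT(t_1) AS count_table_1,
           COUNT(t_2) AS count_table_2,
           col1, col2
    FROM (
        SELECT 'x' AS t_1, NULL AS t_2, col1, col2 FROM table_1
        UNION ALL
        SELECT NULL AS t_1, 'x' AS t_2, col1, col2 FROM table_2
    ) AS u
    GROUP BY col1, col2
    HAVING COUNT(t_1) != COUNT(t_2)
    ORDER BY col1, col2
""").fetchall()

for row in diff:
    print(row)
```

The row ('b','y') does not appear in the result because it occurs once in each table.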
Start with this query to check if these 4 columns form a key.
select occ_total,occ_ut,occ_pd
,count(*) as records
from (select count (*) as occ_total
,count (case tab when 'UT' then 1 end) as occ_ut
,count (case tab when 'PD' then 1 end) as occ_pd
from (select 'UT' as tab,cat,AT_NUMBER,OFFSET,PROD from EM
union all select 'PD' ,cat,AT_NUMBER,OFFSET,PROD from EM_63 PD
) t
group by cat,AT_NUMBER,OFFSET,PROD
) t
group by occ_total,occ_ut,occ_pd
order by records desc
;
After you have chosen your "key", you can use the following query to see the attributes' values:
select count (*) as occ_total
,count (case tab when 'UT' then 1 end) as occ_ut
,count (case tab when 'PD' then 1 end) as occ_pd
,count (distinct att1) as cnt_dst_att1
,count (distinct att2) as cnt_dst_att2
,count (distinct att3) as cnt_dst_att3
,...
,listagg (case tab when 'UT' then att1 end) within group (order by att1) as att1_vals_ut
,listagg (case tab when 'PD' then att1 end) within group (order by att1) as att1_vals_pd
,listagg (case tab when 'UT' then att2 end) within group (order by att2) as att2_vals_ut
,listagg (case tab when 'PD' then att2 end) within group (order by att2) as att2_vals_pd
,listagg (case tab when 'UT' then att3 end) within group (order by att3) as att3_vals_ut
,listagg (case tab when 'PD' then att3 end) within group (order by att3) as att3_vals_pd
,...
from (select 'UT' as tab,cat,AT_NUMBER,OFFSET,PROD,att1,att2,att3,... from EM
union all select 'PD' ,cat,AT_NUMBER,OFFSET,PROD,att1,att2,att3,... from EM_63 PD
) t
group by cat,AT_NUMBER,OFFSET,PROD
;
The problem with CONCAT is that you could get invalid matches if your data looks similar to this:
table1.A = '123'
table1.B = '456'
concatenates to: '123456'
table2.A = '12'
table2.B = '3456'
concatenates also to: '123456'
You have to compare the fields individually: table1.A = table2.A AND table1.B = table2.B
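The collision is easy to reproduce. This sketch assumes SQLite via Python's sqlite3 module and uses SQLite's || operator in place of CONCAT:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First column: compare the concatenated strings.
# Second column: compare the two fields individually.
concat_match, fieldwise_match = conn.execute("""
    SELECT ('123' || '456') = ('12' || '3456'),
           ('123' = '12') AND ('456' = '3456')
""").fetchone()

print(concat_match, fieldwise_match)
```

The concatenated comparison reports a match even though neither field is equal, which is why the field-by-field comparison is the safe one.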

Simple SQL Query 1

I'm using PL/SQL if that matters.
Table = Stuff
ID: FRUIT:
100 Apple
100 Grape
200 Apple
200 Orange
550 Apple
700 Orange
800 Orange
900 Grape
... ...
I want to list all of the Apples and their IDs that do NOT share the same ID as Orange. How do I go about doing this?
The output should be:
100 Apple
550 Apple
You can do this with a subquery: the subquery picks out all of the IDs for Oranges, and the outer query then picks the rows that are Apples and whose IDs aren't in the subquery. Something like this:
SELECT *
FROM stuff
WHERE fruit = 'Apple'
AND ID NOT IN (SELECT ID FROM stuff WHERE fruit = 'Orange')
You can read the table only once by using a CASE expression with GROUP BY and a HAVING clause, like this (the second HAVING condition keeps only IDs that actually have an Apple):
SELECT t.id,
MAX(CASE WHEN t.FRUIT = 'Apple' THEN t.FRUIT end) as fruit
FROM stuff t
GROUP BY t.id
HAVING MAX(CASE WHEN t.FRUIT = 'Orange' THEN 1 ELSE 0 END) = 0
AND MAX(CASE WHEN t.FRUIT = 'Apple' THEN 1 ELSE 0 END) = 1
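To compare the NOT IN subquery with the single-pass GROUP BY approach on the question's data, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The GROUP BY version here also requires that the ID has an Apple, so an ID with neither fruit (such as 900) is not returned with an empty fruit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stuff (id INTEGER, fruit TEXT);
    INSERT INTO stuff VALUES
        (100, 'Apple'), (100, 'Grape'), (200, 'Apple'), (200, 'Orange'),
        (550, 'Apple'), (700, 'Orange'), (800, 'Orange'), (900, 'Grape');
""")

# Variant 1: Apples whose ID is not among the Orange IDs.
not_in_rows = conn.execute("""
    SELECT id, fruit
    FROM stuff
    WHERE fruit = 'Apple'
      AND id NOT IN (SELECT id FROM stuff WHERE fruit = 'Orange')
    ORDER BY id
""").fetchall()

# Variant 2: one pass over the table with conditional aggregation.
having_rows = conn.execute("""
    SELECT id, MAX(CASE WHEN fruit = 'Apple' THEN fruit END) AS fruit
    FROM stuff
    GROUP BY id
    HAVING MAX(CASE WHEN fruit = 'Orange' THEN 1 ELSE 0 END) = 0
       AND MAX(CASE WHEN fruit = 'Apple'  THEN 1 ELSE 0 END) = 1
    ORDER BY id
""").fetchall()

print(not_in_rows)
print(having_rows)
```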
You are subtracting one set of records from another and a subquery will do the job.
Edited for your new data set
select *
from stuff
where fruit = 'Apple'
and id not in (
select ID from stuff where fruit != 'Apple'
);
Or you could use a MINUS query as well.
There is no need to build up the full list of Orange IDs; just use NOT EXISTS:
select *
from
stuff apple_list
where
fruit = 'Apple'
and
not exists (
select null
from stuff orange_instance
where orange_instance.id = apple_list.id
and orange_instance.fruit = 'Orange'
)
Or do the same thing with an outer join:
select
id, fruit
from (
select
apple_list.id,
apple_list.fruit,
nvl2(orange_instance.id, 'orange_here', 'no_orange') orange_flag
from
stuff apple_list,
stuff orange_instance
where
apple_list.fruit = 'Apple'
and
orange_instance.id (+) = apple_list.id
and
orange_instance.fruit (+) = 'Orange'
)
where
orange_flag = 'no_orange'
The second variant needs DISTINCT in the select if there is a possibility of having two Oranges with the same id.
Or you could do it using the MINUS set operator:
SELECT a.ID, a.FRUIT
FROM STUFF a
WHERE a.FRUIT = 'Apple'
MINUS
SELECT b.ID, 'Apple' AS FRUIT
FROM STUFF b
WHERE b.FRUIT = 'Orange'
Best of luck.

How can I access the output from the first select statement

I have a table like this
Col1 | Col2
-----------
a | d
b | e
c | a
Now I want to create a statement to get output like this:
First| Second
-------------------
a | Amsterdam
b | Berlin
c | Canada
...
So far I have this construct, which is not working:
SELECT *
FROM(
SELECT DISTINCT
CASE
when Col1 IS NULL then 'NA'
else Col1
END
FROM Table1
UNION
SELECT DISTINCT
CASE
when Col2 IS NULL then 'NA'
else Col2
END
FROM Table1
) AS First
,
(
SELECT DISTINCT
when First= 'a' then 'Amsterdam'
when First= 'b' then 'Berlin'
when First= 'c' then 'Canada'
) AS Second
;
Can you help me with that?
Sorry I have to edit my question to be more specific.
Not as familiar with DB2... I'll lookup if it has a concat function in a sec... and it does.
SELECT First, case when first = 'a' then
concat('This is a ',first)
when first = 'b' then
concat('To Be or not to ',first)
else
concat('This is a ',first) end as Second
FROM (
SELECT coalesce(col1, 'NA') as First
FROM Table
UNION
SELECT coalesce(col2, 'NA')
FROM table) SRC
WHERE first <> 'NA'
What this does is generate a single inline view called src with a column called first. If col1 or col2 of table are null then it substitutes NA for that value. It then concatenates first and the desired text excluding records with a first value of 'NA'
Or if you just create an inline table with the desired values and join in...
SELECT First, x.b as Second
FROM (
SELECT coalesce(col1, 'NA') as First
FROM Table
UNION
SELECT coalesce(col2, 'NA')
FROM table) SRC
INNER JOIN (select a,b
from (values ('a', 'This is a'),
('b', 'To B or not to' ),
('c', 'I like cat whose name starts with')) as x(a,b)) X
on X.a = src.first
WHERE first <> 'NA'
Personally I find the 2nd option easier to read. Though if a, b and c have a meaning, I would think you'd want that stored in a table somewhere so other queries can use it. Code seems like a bad place to store data like this that could change.
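A runnable sketch of the inline-table join, assuming SQLite via Python's sqlite3 module instead of DB2. The CTE names src and x, the column name first_col, and the city values (taken from the question's expected output) are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Col1 TEXT, Col2 TEXT);
    INSERT INTO Table1 VALUES ('a', 'd'), ('b', 'e'), ('c', 'a');
""")

# src collects the distinct values of both columns (UNION removes
# duplicates); x is the inline mapping table built with VALUES.
rows = conn.execute("""
    WITH src(first_col) AS (
        SELECT COALESCE(Col1, 'NA') FROM Table1
        UNION
        SELECT COALESCE(Col2, 'NA') FROM Table1
    ),
    x(a, b) AS (
        VALUES ('a', 'Amsterdam'), ('b', 'Berlin'), ('c', 'Canada')
    )
    SELECT src.first_col, x.b AS second_col
    FROM src
    INNER JOIN x ON x.a = src.first_col
    WHERE src.first_col <> 'NA'
    ORDER BY src.first_col
""").fetchall()

for row in rows:
    print(row)
```

The inner join drops d and e because they have no entry in the mapping; a LEFT JOIN with COALESCE would keep them with a default value instead.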
Assuming you want
a this is a a
b this is a b
c this is a c
d this is a d
e this is a e
Thanks to xQbert, I could solve this problem like this:
SELECT FirstRow, concat
(
CASE FirstRow
WHEN 'AN' then 'amerstdam'
WHEN 'G' then 'berlin'
ELSE 'NA'
END, ''
) AS SecondRow
FROM(
Select coalesce (Col1, 'NA') as FirstRow
FROM Table1
UNION
Select coalesce (Col2, 'NA')
FROM Table1) SRC
WHERE FirstRow <> 'NA'
;