unexpected results of a left join - sql

I have a table named test_table_1 and a view named temp_test_view_1.
The goal is to find the new_id_category for every id in test_table_1. The mapping between id and new_id_category is defined in the view temp_test_view_1.
I use a left join for that purpose, with test_table_1 as the base table.
In my opinion, the left join returns some unexpected results.
Here is all the data contained in table test_table_1:
select * from test_table_1;
id name
1 'a'
null 'd'
3 'd'
2 'c'
2 'b'
Here is the view script:
create view temp_test_view_1 as
select
id,
id_description,
case when id_category = 'phone_id' then 'phone' else 'other' end as new_id_category
from (
select
1 as id,
'first id' as id_description,
null as id_category
from dummy
union all
select
2 as id,
'second id' as id_description,
'phone_id' as id_category
from dummy
) x
;
Here is the query I use to left join the view to a table and then project the results to see the corresponding new_id_category for every ID in test_table_1:
select
t1.id,
t1.name,
t2.id,
t2.id_description,
t2.new_id_category
from test_table_1 t1
left join temp_test_view_1 t2 on t1.id = t2.id
The output is:
id name id id_description new_id_category
null 'd' null null 'other'
1 'a' 1 'first id' 'other'
2 'b' 2 'second id' 'phone'
2 'c' 2 'second id' 'phone'
3 'd' null null 'other'
The desired output is:
id name id id_description new_id_category
null 'd' null null null
1 'a' 1 'first id' 'other'
2 'b' 2 'second id' 'phone'
2 'c' 2 'second id' 'phone'
3 'd' null null null
Can someone explain whether the result the query produces is correct and, if so, why? I expect this left join to return null in the columns retrieved from the view, as you can see from my desired output.
I did not test the query in DB systems offered by other vendors.
EDIT:
I tested it on sqlfiddle MS SQL, and it produces the desired output. Here is the link:
sqlfiddle.com/#!18/1c788/2
If someone needs the code to try this in MS SQL (where it does not reproduce the problem and instead returns the desired output):
create view temp_test_view_1 as
select
id,
id_description,
case when id_category = 'phone_id' then 'phone' else 'other' end as new_id_category
from (
select
1 as id,
'first id' as id_description,
null as id_category
union all
select
2 as id,
'second id' as id_description,
'phone_id' as id_category
) x
;
select *
into test_table_1
from (
select
1 as id,
'a' as name
union all
select
2 as id,
'b' as name
union all
select
2 as id,
'c' as name
union all
select
3 as id,
'd' as name
union all
select
null as id,
'd' as name
) x;
select
t1.id,
t1.name,
t2.id,
t2.id_description,
t2.new_id_category
from test_table_1 t1
left join temp_test_view_1 t2 on t1.id = t2.id
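For anyone who wants to check the expected left join behaviour without a database server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption on my part, it is not the system from the question), and the script rebuilds the table and view from the MS SQL version above. It does not explain why the original system fills in 'other'; it only shows what another engine does with the same view and query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same data and view definition as in the question, minus "from dummy".
conn.executescript("""
create table test_table_1 (id integer, name text);
insert into test_table_1 values (1, 'a'), (null, 'd'), (3, 'd'), (2, 'c'), (2, 'b');

create view temp_test_view_1 as
select id,
       id_description,
       case when id_category = 'phone_id' then 'phone' else 'other' end as new_id_category
from (
    select 1 as id, 'first id' as id_description, null as id_category
    union all
    select 2, 'second id', 'phone_id'
) x;
""")

rows = conn.execute("""
select t1.id, t1.name, t2.id, t2.id_description, t2.new_id_category
from test_table_1 t1
left join temp_test_view_1 t2 on t1.id = t2.id
order by t1.id, t1.name
""").fetchall()

for row in rows:
    print(row)
```

In this sketch the rows with no match in the view (id null and id 3) come back with null in all three view columns, which matches the desired output and the MS SQL fiddle.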

Related

how to avoid duplicates in hive query

I have two tables:
table1
the_date | my_id |
02/03/2021,123
02/03/2021, 1234
02/03/2021, 12345
table2
the_date | my_id |seq | txt
02/03/2021, 1234, 1 , 'OK'
02/03/2021, 12345, 1, 'OK'
02/03/2021, 12345, 2, 'HELLO HI THERE'
02/03/2021, 123456, 1, 'Ok'
Here is my code:
WITH AB AS (
SELECT A1.my_id
FROM DB1.table1 A1 , DB1.MSG_REC A2 WHERE
A1.my_id=A2.my_id
),
BC AS (
SELECT AB.the_date,
COUNT ( DISTINCT (CASE WHEN (TXT like '%OK%') THEN AB.my_id ELSE NULL END )) AS
CASE1 ,
COUNT ( DISTINCT (CASE WHEN (TXT like '%HELLO HI THERE%') THEN AB.my_id ELSE NULL END )) AS
CASE2
FROM AB left JOIN DB1.my_id BC ON AB.my_id =BC.my_id
The issue is that the value '12345' is counted twice because it satisfies both CASE expressions.
That causes duplicates in the count metrics. Is there a way to evaluate the first case and then the second, while excluding from the second any my_id records already matched by the first?
So for example, when it is time to run the above script and the first case executes, it will pick up the below records and the count would be 3
02/03/2021, 1234, 1 , 'OK'
02/03/2021, 12345, 1, 'OK'
02/03/2021, 123456, 1, 'Ok'
The second case should only be looping through the below records and the count would be only 1
02/03/2021, 12345, 2, 'HELLO HI THERE'
CASE1 would be 4 and CASE2 would be 2 if I don't create a condition to circumvent this issue. Any tips or suggestions?
Assign a case to each ID before the DISTINCT aggregation, then do the distinct aggregation. That way the same ID is never counted under different cases. See the comments in the code:
select --do final distinct aggregation
count(distinct (case when assigned_case='CASE1' then my_id else null end ) ) as CASE1,
count(distinct (case when assigned_case='CASE2' then my_id else null end ) ) as CASE2
from
(
select my_id,
--assign single CASE to all rows with the same id based on some logic:
case when case1_flag = 1 then 'CASE1'
when case2_flag = 1 then 'CASE2'
else NULL
end as assigned_case
from
(--calculate all CASE flags for each ID
select AB.my_id,
max(CASE WHEN (TXT like '%OK%') THEN 1 ELSE NULL END) over (partition by AB.my_id) as case1_flag,
max(CASE WHEN (TXT like '%HELLO HI THERE%') THEN 1 ELSE NULL END) over (partition by AB.my_id) as case2_flag
from ...
) s
) s
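Here is a rough, runnable sketch of the same idea, using Python's sqlite3 module in place of Hive (an assumption, the window-function syntax happens to be the same). The table name msgs and the extra id 777 are made up for illustration; the other rows are modeled on table2 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table "msgs" modeled on table2; id 777 is an invented row
# that only has the 'HELLO HI THERE' text.
conn.executescript("""
create table msgs (the_date text, my_id integer, seq integer, txt text);
insert into msgs values
  ('02/03/2021', 1234,   1, 'OK'),
  ('02/03/2021', 12345,  1, 'OK'),
  ('02/03/2021', 12345,  2, 'HELLO HI THERE'),
  ('02/03/2021', 123456, 1, 'OK'),
  ('02/03/2021', 777,    1, 'HELLO HI THERE');
""")

case1, case2 = conn.execute("""
select count(distinct case when assigned_case = 'CASE1' then my_id end),
       count(distinct case when assigned_case = 'CASE2' then my_id end)
from (
    -- one case per id: CASE1 wins whenever both flags are set
    select my_id,
           case when case1_flag = 1 then 'CASE1'
                when case2_flag = 1 then 'CASE2'
           end as assigned_case
    from (
        -- flags are computed per id, so every row of an id carries both
        select my_id,
               max(case when txt like '%OK%' then 1 end)
                   over (partition by my_id) as case1_flag,
               max(case when txt like '%HELLO HI THERE%' then 1 end)
                   over (partition by my_id) as case2_flag
        from msgs
    ) flags
) assigned
""").fetchone()

print(case1, case2)
```

With this priority order, id 12345 is counted only under CASE1 even though one of its rows matches the second pattern, so no id shows up in both counts.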

Want to Return Empty Rows With a Case When Statement

Let's say I'm using 2 case when statements to group my data, like in the below example:
select case
when group1 = 'A' then 'Large'
when group1 = 'B' then 'Medium'
else 'Small'
end as 'Order Size'
,case
when method = 'Delivery' then 'Delivery'
else 'Pick-up'
end as 'Distribution Method'
,count(distinct(OrderIDs))
from OrderTable
GROUP BY
case
when group1 = 'A' then 'Large'
when group1 = 'B' then 'Medium'
else 'Small'
end
,case
when method = 'Delivery' then 'Delivery'
else 'Pick-up'
end
Let's also say that there were no 'Large' orders that were 'Pick-up'. Currently, this query will not return a row for the Large, Pick-up combination.
Is there a way to have a row returned with 0’s if there is nothing that meets the multiple case when criteria?
Use a cross join to generate the rows and left join to bring in the data:
select os.OrderSize, coalesce(d.DistributionMethod, 'Pick-Up') as DistributionMethod,
count(distinct ot.OrderIDs) as NumOrders
from (select 'Large' as OrderSize union all
select 'Medium' as OrderSize union all
select 'Small' as OrderSize
) os cross join
(select 'Delivery' as DistributionMethod union all
select 'Pick-Up' as DistributionMethod
) d left join
OrderTable ot
on ( (ot.group1 = 'A' and os.OrderSize = 'Large') or
(ot.group1 = 'B' and os.OrderSize = 'Medium') or
(ot.group1 not in ('A', 'B') and os.OrderSize = 'Small')
) and
ot.method = d.DistributionMethod
group by os.OrderSize, coalesce(d.DistributionMethod, 'Pick-Up');
Not all databases support the creation of a table of constants using this syntax, but there is generally some syntax that does this.
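To see the zero rows appear, here is a small sketch that runs the cross join plus left join pattern through Python's sqlite3 module. SQLite and the four sample orders are assumptions for illustration only; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented sample data: no Large pick-ups and no Small deliveries.
conn.executescript("""
create table OrderTable (OrderIDs integer, group1 text, method text);
insert into OrderTable values
  (1, 'A', 'Delivery'),
  (2, 'B', 'Delivery'),
  (3, 'B', 'Pick-up'),
  (4, 'C', 'Pick-up');
""")

rows = conn.execute("""
select os.OrderSize, d.DistributionMethod, count(distinct ot.OrderIDs) as n
from (select 'Large' as OrderSize union all
      select 'Medium' union all
      select 'Small') os
cross join
     (select 'Delivery' as DistributionMethod union all
      select 'Pick-up') d
left join OrderTable ot
  on ((ot.group1 = 'A' and os.OrderSize = 'Large') or
      (ot.group1 = 'B' and os.OrderSize = 'Medium') or
      (ot.group1 not in ('A', 'B') and os.OrderSize = 'Small'))
 and ot.method = d.DistributionMethod
group by os.OrderSize, d.DistributionMethod
order by os.OrderSize, d.DistributionMethod
""").fetchall()

for row in rows:
    print(row)
```

Counting a column from the left-joined table rather than count(*) matters here: an empty combination still produces one joined row of nulls, and count(*) would report it as 1 instead of 0.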
You could select a recordset that contains the required values and then left join your grouped recordset from there. Following is an example for SQL Server where you would join your results to [Groupings].[OrderSize] and [Groupings].[DistributionMethod]:
SELECT *
FROM (
SELECT *
FROM (
SELECT 'Large' AS [OrderSize]
UNION
SELECT 'Medium' AS [OrderSize]
UNION
SELECT 'Small' AS [OrderSize]
) AS [OrderSizes]
CROSS JOIN (
SELECT 'Delivery' AS [DistributionMethod]
UNION
SELECT 'Pick-up' AS [DistributionMethod]
) AS [DistributionMethods]
) AS [Groupings]
LEFT JOIN ...

Pulling data while pivoting at the same time

ID | Type | Code
1 Purchase A1
1 Return B1
1 Exchange C1
2 Purchase D1
2 Return NULL
2 Exchange F1
3 Purchase G1
3 Return H1
3 Exchange I1
4 Purchase J1
4 Exchange K1
Above is sample data. What I want to return is:
ID | Type | Code
1 Purchase A1
1 Return B1
1 Exchange C1
3 Purchase G1
3 Return H1
3 Exchange I1
So if a field is null in code or the values of Purchase, Return and Exchange are not all present for that ID, ignore that ID completely. However there is one last step. I want this data to then be pivoted this way:
ID | Purchase | Return | Exchange
1 A1 B1 C1
3 G1 H1 I1
I asked this yesterday without the pivot portion which you can see here:
SQL query to return data only if ALL necessary columns are present and not NULL
However, I forgot to mention the last part. I tried to play around with Excel but had no luck. I tried to make a temp table, but the data is too large for that, so I was wondering whether this could all be done in one SQL statement?
I personally used this query with success:
select t.*
from t
where 3 = (select count(distinct t2.type)
from t t2
where t2.id = t.id and
t2.type in ('Purchase', 'Exchange', 'Return') and
t2.Code is not null
);
So how can we adjust that to include the pivot part? Is that possible?
Quite easily. Just use conditional aggregation:
select t.id,
max(case when type = 'Purchase' then code end) as Purchase,
max(case when type = 'Exchange' then code end) as Exchange,
max(case when type = 'Return' then code end) as Return
from t
where 3 = (select count(distinct t2.type)
from t t2
where t2.id = t.id and
t2.type in ('Purchase', 'Exchange', 'Return') and
t2.Code is not null
)
group by t.id;
This is actually simpler to express (in my opinion) using having without the subquery:
select t.id,
max(case when type = 'Purchase' then code end) as Purchase,
max(case when type = 'Exchange' then code end) as Exchange,
max(case when type = 'Return' then code end) as Return
from t
group by t.id
having max(case when type = 'Purchase' then code end) is not null and
max(case when type = 'Exchange' then code end) is not null and
max(case when type = 'Return' then code end) is not null;
Many databases would allow:
having Purchase is not null and Exchange is not null and Return is not null
But Oracle doesn't allow the use of column aliases in the having clause.
UPDATE - Based on discussion in the question comments, my previous query had a faulty assumption (which I carried over from what I thought I saw in the original query in the question); I've eliminated the bad assumption.
select id
, max(case when type='Purchase' then Code end) Purchase
, max(case when type='Return' then Code end) Return
, max(case when type='Exchange' then Code end) Exchange
from t
where code is not null
and type in ('Purchase', 'Return', 'Exchange')
group by id
having count(distinct type) = 3
I will point out again (as I did in your other thread) that analytic functions will do the job much faster - they need the base table to be read just once, and there are no explicit or implicit joins.
with
test_data ( id, type, code ) as (
select 1, 'Purchase', 'A1' from dual union all
select 1, 'Return' , 'B1' from dual union all
select 1, 'Exchange', 'C1' from dual union all
select 2, 'Purchase', 'D1' from dual union all
select 2, 'Return' , null from dual union all
select 2, 'Exchange', 'F1' from dual union all
select 3, 'Purchase', 'G1' from dual union all
select 3, 'Return' , 'H1' from dual union all
select 3, 'Exchange', 'I1' from dual union all
select 4, 'Purchase', 'J1' from dual union all
select 4, 'Exchange', 'K1' from dual
)
-- end of test data; actual solution (SQL query) begins below this line
select id, purchase, return, exchange
from ( select id, type, code
from ( select id, type, code,
count( distinct case when type in ('Purchase', 'Return', 'Exchange')
then type end
) over (partition by id) as ct_type,
count( case when code is null then 1 end
) over (partition by id) as ct_code
from test_data
)
where ct_type = 3 and ct_code = 0
)
pivot ( min(code) for type in ('Purchase' as purchase, 'Return' as return,
'Exchange' as exchange)
)
;
Output:
ID PURCHASE RETURN EXCHANGE
--- -------- -------- --------
1 A1 B1 C1
3 G1 H1 I1
2 rows selected.
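For readers without an Oracle instance, the filter-then-pivot query from the update above can be tried with Python's sqlite3 module. This is only a sketch: SQLite stands in for Oracle (an assumption), and the aliases purchase_code, return_code and exchange_code are renamed by me so they cannot collide with any keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data copied from the question.
conn.executescript("""
create table t (id integer, type text, code text);
insert into t values
  (1, 'Purchase', 'A1'), (1, 'Return', 'B1'), (1, 'Exchange', 'C1'),
  (2, 'Purchase', 'D1'), (2, 'Return', null), (2, 'Exchange', 'F1'),
  (3, 'Purchase', 'G1'), (3, 'Return', 'H1'), (3, 'Exchange', 'I1'),
  (4, 'Purchase', 'J1'), (4, 'Exchange', 'K1');
""")

rows = conn.execute("""
select id,
       max(case when type = 'Purchase' then code end) as purchase_code,
       max(case when type = 'Return'   then code end) as return_code,
       max(case when type = 'Exchange' then code end) as exchange_code
from t
where code is not null
  and type in ('Purchase', 'Return', 'Exchange')
group by id
having count(distinct type) = 3   -- all three types present with a code
order by id
""").fetchall()

for row in rows:
    print(row)
```

Id 2 drops out because its Return row has a null code, and id 4 because it has no Return row at all, so only ids 1 and 3 remain.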

Using CASE to Mark No If No Results From SELECT Statement

Is it possible to print "No" if no result is found?
SELECT mobileno,
CASE
WHEN region = '1234'
THEN 'Yes'
ELSE 'NO'
END
FROM subscriber
WHERE region = '1234'
and status = 1
and mobileno in (77777,88888)
Currently it prints only one row, like
77777,yes
but I want the following:
77777,yes
88888,no
Update: A mobileno such as 77777 may belong to two regions, so if we remove the region condition, 77777 will be printed in two rows, one with NO and one with YES.
Sample Data
sr.No, Name, mobileno, region, status
1, abc, 77777, 1234, 1
2, xyz, 88888, 1222, 1
3, tyu, 22342, 9898, 1
4, abc, 77777, 8787, 1
Sample OutPut
77777, Yes
88888, No
You can 'create' a table by selecting from dual, and left joining :
SELECT t.dummy_num,
CASE WHEN s.mobileno is null then 'No' else 'Yes' end
FROM (SELECT 77777 as dummy_num from dual
UNION select 88888 from dual) t
LEFT JOIN subscriber s
ON(t.dummy_num = s.mobileno and s.region = '1234' and s.status = 1 )
Edit: you can also do it dynamically like this:
SELECT t.mobileno,
CASE WHEN s.mobileno is null then 'No' else 'Yes' end
FROM (select distinct mobileno from subscriber) t
LEFT JOIN subscriber s
ON(t.mobileno= s.mobileno and s.region = '1234' and s.status = 1 )
WHERE t.mobileno IN(77777,88888,.....)
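A quick way to try the first variant against the sample data is the sketch below, which uses Python's sqlite3 module. SQLite has no dual table, so the list of wanted numbers is built with plain selects; that substitution, the column name sr_no and the alias in_region are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data from the question; region is stored as text to match region = '1234'.
conn.executescript("""
create table subscriber (sr_no integer, name text, mobileno integer,
                         region text, status integer);
insert into subscriber values
  (1, 'abc', 77777, '1234', 1),
  (2, 'xyz', 88888, '1222', 1),
  (3, 'tyu', 22342, '9898', 1),
  (4, 'abc', 77777, '8787', 1);
""")

rows = conn.execute("""
select w.mobileno,
       case when s.mobileno is null then 'No' else 'Yes' end as in_region
from (select 77777 as mobileno union all select 88888) w
left join subscriber s
  on s.mobileno = w.mobileno and s.region = '1234' and s.status = 1
order by w.mobileno
""").fetchall()

for row in rows:
    print(row)
```

Because the region and status filters sit in the ON clause rather than in WHERE, 88888 survives the join and is reported as 'No', and 77777 matches only its region 1234 row, so it appears once.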

Multiple rows to one row Oracle SQL

I have the following table
P_ID, PROGR, DATA
1 , 1 , 'DATO A'
1 , 2 , 'DATO B'
1 , 3 , 'DATO C'
2 , 1 , 'DATO D'
2 , 2 , 'DATO E'
3 , 1 , 'DATO G'
and I want to get this result
P_ID, DATA , DATA_1 , DATA_2
1 , 'DATO A', 'DATO B', 'DATO C'
2 , 'DATO D', 'DATO E', NULL
3 , 'DATO G', NULL , NULL
this can be done with a left join with the same table, something like this (not the exact result, but as an example)
select * from
(select * from MYTABLE where PROGR = 1) a
left join
(select * from MYTABLE where PROGR = 2) b
on a.P_ID = b.P_ID
left join
(select * from MYTABLE where PROGR = 3) c
on a.P_ID = c.P_ID;
The problem is that this query is fixed and needs to be rewritten if some P_ID gets PROGR = 4. I think I need to write a procedure, but I have been trying without success.
Thanks in advance.
You can use conditional aggregation:
select t.p_id,
max(case when t.progr = 1 then t.data end) as data_1,
max(case when t.progr = 2 then t.data end) as data_2,
max(case when t.progr = 3 then t.data end) as data_3
from mytable t
group by t.p_id;
To handle a variable number of columns, I can think of three solutions:
Put in enough columns to handle your data (some reasonable maximum).
Use dynamic SQL (execute immediate in PL/SQL).
Or, combine them into a single column.
Here is the last approach:
select t.p_id, listagg(t.data, ', ') within group (order by t.progr)
from mytable t
group by t.p_id;
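The dynamic SQL option can be sketched outside the database as well. The script below is only an illustration under assumptions: it uses Python with sqlite3 instead of PL/SQL and execute immediate, and it generates one max(case ...) column per PROGR value found in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data from the question.
conn.executescript("""
create table mytable (p_id integer, progr integer, data text);
insert into mytable values
  (1, 1, 'DATO A'), (1, 2, 'DATO B'), (1, 3, 'DATO C'),
  (2, 1, 'DATO D'), (2, 2, 'DATO E'),
  (3, 1, 'DATO G');
""")

# Find how many columns are needed, then build the select list.
# max_progr is an integer read from the table, so formatting it into
# the statement text does not open the query to injection.
max_progr = conn.execute("select max(progr) from mytable").fetchone()[0]
columns = ",\n       ".join(
    f"max(case when progr = {n} then data end) as data_{n}"
    for n in range(1, max_progr + 1)
)
sql = f"select p_id,\n       {columns}\nfrom mytable\ngroup by p_id\norder by p_id"

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

If a P_ID later gets PROGR = 4, the generated statement gains a data_4 column without anyone editing the query by hand.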
Use the query below.
select p_id,max(data_1) as data_1,max(data_2)as data_2,max(data_3) as data_3
from
(select P_ID,
case when progr=1 then
data
end data_1,
case when progr=2 then
data
end data_2,
case when progr=3 then
data
end data_3
from mytable)
group by p_id