SQL query and joins - sql

Please see my query below:
select I.OID_CUSTOMER_DIM, I.segment as PISTACHIO_SEGMENT,
MAX(CASE WHEN S.SUBSCRIPTION_TYPE = '5' THEN 'Y' ELSE 'N' END ) PB_SUBS,
max(case when S.SUBSCRIPTION_TYPE ='12' then 'Y' else 'N' end) DAILY_TASTE,
MAX(CASE WHEN S.SUBSCRIPTION_TYPE ='8' THEN 'Y' ELSE 'N' END) COOKING_FOR_TWO
FROM WITH_MAIL_ID i JOIN CUSTOMER_SUBSCRIPTION_FCT S
ON I.IDENTITY_ID = S.IDENTITY_ID
WHERE S.SITE_CODE = 'PB' AND S.SUBSCRIPTION_END_DATE IS NULL
group by I.oid_customer_dim, I.segment
This query returns 654105 rows, which is fewer than the 706795 rows in with_mail_id, one of the joined tables.
For QC purposes, my manager is wondering why my final table does not contain all of those rows. I tried removing all the filters, but the row counts still do not match. What am I doing wrong?
I am not very good at SQL yet and this is really confusing me.

You're doing an inner join on the two tables, so only rows from WITH_MAIL_ID that have a match in CUSTOMER_SUBSCRIPTION_FCT will be returned. Additionally, you have a GROUP BY clause.
First, the join. If you want to return all rows regardless of the join condition, you can use a left join, but in that case all the S. columns will be NULL for rows without a match, and you'll have to deal with that.
If you run this, you might see that the count equals the difference:
select count(*) from WITH_MAIL_ID i
left join CUSTOMER_SUBSCRIPTION_FCT S
on I.IDENTITY_ID = S.IDENTITY_ID
where s.IDENTITY_ID is NULL
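If the goal is one output row per WITH_MAIL_ID row, a possible sketch is the left-join version of the original query. The filters on S move from WHERE into the ON clause; left in WHERE, they would discard the NULL-extended rows and turn the join back into an inner join. The two miniature tables and their rows are hypothetical stand-ins for the real ones, written in generic SQL.

```sql
-- Hypothetical miniature versions of the two tables from the question.
CREATE TABLE WITH_MAIL_ID (OID_CUSTOMER_DIM INT, SEGMENT VARCHAR(10), IDENTITY_ID INT);
CREATE TABLE CUSTOMER_SUBSCRIPTION_FCT (IDENTITY_ID INT, SITE_CODE VARCHAR(5),
                                        SUBSCRIPTION_TYPE VARCHAR(5), SUBSCRIPTION_END_DATE DATE);
INSERT INTO WITH_MAIL_ID VALUES (1, 'A', 100), (2, 'B', 200);
INSERT INTO CUSTOMER_SUBSCRIPTION_FCT VALUES (100, 'PB', '5', NULL);

SELECT I.OID_CUSTOMER_DIM,
       I.SEGMENT AS PISTACHIO_SEGMENT,
       -- 'Y' sorts after 'N', so MAX yields 'Y' if any matching subscription exists
       MAX(CASE WHEN S.SUBSCRIPTION_TYPE = '5'  THEN 'Y' ELSE 'N' END) AS PB_SUBS,
       MAX(CASE WHEN S.SUBSCRIPTION_TYPE = '12' THEN 'Y' ELSE 'N' END) AS DAILY_TASTE,
       MAX(CASE WHEN S.SUBSCRIPTION_TYPE = '8'  THEN 'Y' ELSE 'N' END) AS COOKING_FOR_TWO
FROM WITH_MAIL_ID I
LEFT JOIN CUSTOMER_SUBSCRIPTION_FCT S
       ON I.IDENTITY_ID = S.IDENTITY_ID
      AND S.SITE_CODE = 'PB'                 -- filters on S belong in ON, not WHERE
      AND S.SUBSCRIPTION_END_DATE IS NULL
GROUP BY I.OID_CUSTOMER_DIM, I.SEGMENT
ORDER BY I.OID_CUSTOMER_DIM;
-- 1 | A | Y | N | N
-- 2 | B | N | N | N   (no subscription, but the customer row is kept)
```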
The most likely thing, however, is that it's just the grouping. If you group on two columns and select the max of various other columns within each group, you would expect fewer rows than in the original table; otherwise, why bother grouping?
If I have data like this:
groupkey1 value
1 2
1 10
2 1
2 1
Then if I group by groupkey1 and select MAX(value), I would get 2 rows, [1,10] and [2,1], not 4 rows.
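The same four rows can be checked directly. This is a minimal sketch in generic SQL; the table name t is hypothetical.

```sql
CREATE TABLE t (groupkey1 INT, value INT);
INSERT INTO t VALUES (1, 2), (1, 10), (2, 1), (2, 1);

-- Four input rows collapse to one row per distinct groupkey1.
SELECT groupkey1, MAX(value) AS max_value
FROM t
GROUP BY groupkey1
ORDER BY groupkey1;
-- 1 | 10
-- 2 | 1
```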

Related

Counting Booleans for Distinct and Non Distinct ID numbers

I have a simple table (shown in the screenshot) that results from the following join:
SELECT *
FROM tableA A
JOIN tableB B ON B.Main_SPACE_ID = A.Main_SPACE_ID
Table A contains Guest_ON and User_Controls (last 2 columns) and Table B contains Trigger_ON and DOCX_ON.
Issue:
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
The problem is that each subspace_ID from table B lives within a main_space_id from table A, so the join repeats every main space row once per subspace and I end up double counting.
I only want to count the Trues once per distinct Main_SPACE_ID.
Current Data Model
Desired Output:
From the above screenshot, I am trying to get a count of true values without double counting in the case for tableA_MAIN_SPACE_ID.
As you can see, for table B every row is counted for true values as it relates to the subspace_ID, for totals of 12 and 8 (1 if True, 0 if False). For table A, I am only counting distinct values, so the Trues for a single Main_SPACE_ID are counted once and not recounted.
If someone can advise on how to get this output from my current data model that would be very helpful!
My attempt below double counts the Trues for the Main_SPACE_ID columns:
SELECT
count(CASE WHEN B.TRIGGER_ON THEN 1 END) as TRIGGER_ON,
count(CASE WHEN B.DOCX_ON THEN 1 END) as DOCX_ON,
count(CASE WHEN A.GUEST_ON THEN 1 END) as GUEST_ON,
count(CASE WHEN A.USER_CONTROLS THEN 1 END) as USER_CONTROLS
FROM DataModel
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
You can use conditional aggregation. In Snowflake, you can use the convenient COUNT_IF() for the first two columns. For the last two, however, you need COUNT(DISTINCT) with conditional logic:
SELECT COUNT_IF( B.Trigger_on ) as Trigger_On,
COUNT_IF( B.DOCX_ON ) as DOCX_ON,
COUNT(DISTINCT CASE WHEN A.GUEST_ON THEN A.Main_SPACE_ID END) as GUEST_ON,
COUNT(DISTINCT CASE WHEN A.USER_CONTROLS THEN A.Main_SPACE_ID END) as USER_CONTROLS
FROM tableA A JOIN
tableB B
ON B.Main_SPACE_ID = A.Main_SPACE_ID;
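To see why the DISTINCT matters, here is a small sketch on made-up rows (the sample data is an assumption; only the column names come from the question). It uses plain COUNT(CASE ...) instead of the Snowflake-specific COUNT_IF so that it runs in most engines that have a BOOLEAN type.

```sql
-- Hypothetical sample: main space 1 has two subspaces, main space 2 has one.
CREATE TABLE tableA (Main_SPACE_ID INT, GUEST_ON BOOLEAN, USER_CONTROLS BOOLEAN);
CREATE TABLE tableB (Subspace_ID INT, Main_SPACE_ID INT, TRIGGER_ON BOOLEAN, DOCX_ON BOOLEAN);
INSERT INTO tableA VALUES (1, TRUE, FALSE), (2, TRUE, TRUE);
INSERT INTO tableB VALUES (10, 1, TRUE, FALSE), (11, 1, TRUE, TRUE), (20, 2, FALSE, TRUE);

SELECT
    -- counts joined rows, so main space 1 is counted once per subspace
    COUNT(CASE WHEN A.GUEST_ON THEN 1 END) AS guest_on_per_row,
    -- counts each main space at most once
    COUNT(DISTINCT CASE WHEN A.GUEST_ON THEN A.Main_SPACE_ID END) AS guest_on_distinct
FROM tableA A
JOIN tableB B ON B.Main_SPACE_ID = A.Main_SPACE_ID;
-- guest_on_per_row = 3, guest_on_distinct = 2
```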
Maybe:
SELECT
COUNT(CASE WHEN B.TRIGGER_ON THEN 1 END) AS TRIGGER_ON,
COUNT(CASE WHEN B.DOCX_ON THEN 1 END) AS DOCX_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT A.MAIN_SPACE_ID, A.GUEST_ON FROM DataModel A WHERE A.GUEST_ON = TRUE) G) AS GUEST_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT A.MAIN_SPACE_ID, A.USER_CONTROLS FROM DataModel A WHERE A.USER_CONTROLS = TRUE) U) AS USER_CONTROLS
FROM DataModel B

Partition table based on joined table

We have 2 Tables Lead and Task.
One lead can have multiple Tasks.
We want to determine whether a Lead has a Task whose description contains the string 'x'.
If the Lead has such a Task, it should belong to group1; if it doesn't, it should belong to group2.
Then we want to count the leads per group and week.
The problem we have is that if a Lead has several tasks and one of them has string 'x' in its description and the others don't it is counted in both groups.
We would need something that resembles a break; statement in the IFF clause of the subquery, so that if the first condition = Contain string x is satisfied the other tasks are not counted anymore.
How would we achieve that?
So far we have the following statement:
--SQL:
SELECT LeadDate, GROUP, COUNT(LEAD_ID_T1)
FROM LEAD Lead INNER JOIN
(SELECT DISTINCT LEAD.ID AS LEAD_ID_T1,
IFF(CONTAINS(Task.DESCRIPTION,
'x'),
'GROUP1',
'GROUP2') AS GROUP
FROM TASK Task
RIGHT JOIN LEAD ON TASK.WHO_ID = LEAD.ID
) T1 ON T1.LEAD_ID_T1 = LEAD.ID
GROUP BY LeadDate,GROUP;
The code breaks because it cannot aggregate the measures.
Really thankful for any input. This has been bothering me for a few days now.
I am thinking EXISTS with a CASE expression:
select l.*,
(case when exists (select 1
from task t
where t.who_id = l.id and
t.description like '%x%'
)
then 'GROUP1' else 'GROUP2'
end) as the_group
from lead l;
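Building on that, the counts per group could be obtained by wrapping the EXISTS query in a derived table and grouping on it. This is only a sketch: the sample rows are invented, and it groups on LeadDate as-is, since the function for truncating a date to a week differs between databases.

```sql
-- Hypothetical sample data; only the column names come from the question.
CREATE TABLE lead (id INT, LeadDate DATE);
CREATE TABLE task (who_id INT, description VARCHAR(100));
INSERT INTO lead VALUES (1, '2020-01-06'), (2, '2020-01-06'), (3, '2020-01-06');
INSERT INTO task VALUES (1, 'has x'), (1, 'other'), (2, 'other');   -- lead 3 has no tasks

SELECT g.LeadDate, g.the_group, COUNT(*) AS leads
FROM (
    SELECT l.id, l.LeadDate,
           -- EXISTS yields one answer per lead, however many tasks it has
           CASE WHEN EXISTS (SELECT 1
                             FROM task t
                             WHERE t.who_id = l.id
                               AND t.description LIKE '%x%')
                THEN 'GROUP1' ELSE 'GROUP2'
           END AS the_group
    FROM lead l
) g
GROUP BY g.LeadDate, g.the_group
ORDER BY g.LeadDate, g.the_group;
-- 2020-01-06 | GROUP1 | 1   (lead 1)
-- 2020-01-06 | GROUP2 | 2   (leads 2 and 3)
```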
You can also try something like this: use CASE to produce 1 and 0, then take the SUM.
SELECT LeadDate,
sum(CASE WHEN t.description like '%x%' THEN 1 ELSE 0 END) as Group1,
sum(CASE WHEN t.description like '%x%' THEN 0 ELSE 1 END) as Group2
FROM TASK t
RIGHT JOIN LEAD l ON t.WHO_ID = l.ID
GROUP BY LeadDate;
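One caveat with summing over the joined rows is that every task row contributes to one of the sums, so a lead with several tasks is still counted several times. A possible variation reduces each lead to a single flag with MAX before summing. The sample rows are hypothetical.

```sql
-- Hypothetical sample data; lead 1 has one task with 'x' and one without.
CREATE TABLE LEAD (ID INT, LeadDate DATE);
CREATE TABLE TASK (WHO_ID INT, DESCRIPTION VARCHAR(100));
INSERT INTO LEAD VALUES (1, '2020-01-06'), (2, '2020-01-06');
INSERT INTO TASK VALUES (1, 'has x'), (1, 'other'), (2, 'other');

SELECT f.LeadDate,
       SUM(f.has_x)     AS Group1,
       SUM(1 - f.has_x) AS Group2
FROM (
    -- one row per lead: has_x is 1 if any of its tasks contains 'x', else 0
    SELECT l.ID, l.LeadDate,
           MAX(CASE WHEN t.DESCRIPTION LIKE '%x%' THEN 1 ELSE 0 END) AS has_x
    FROM LEAD l
    LEFT JOIN TASK t ON t.WHO_ID = l.ID
    GROUP BY l.ID, l.LeadDate
) f
GROUP BY f.LeadDate;
-- 2020-01-06 | 1 | 1   (lead 1 counts only in Group1)
```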

Why does this not return 0

I have a query like:
select nvl(nvl(sum(a.quantity),0)-nvl(cc.quantityCor,0),0)
from RCV_TRANSACTIONS a
LEFT JOIN (select c.shipment_line_id,c.oe_order_line_id,nvl(sum(c.quantity),0) quantityCor
from RCV_TRANSACTIONS c
where c.TRANSACTION_TYPE='CORRECT'
group by c.shipment_line_id,c.oe_order_line_id) cc on (a.shipment_line_id=cc.shipment_line_id and a.shipment_line_id=7085740)
where a.transaction_type='DELIVER'
and a.shipment_line_id=7085740
group by nvl(cc.quantityCor,0);
The query runs OK, but returns no value. I want it to return 0 if there is no quantity found. Where have I gone wrong?
An aggregation query with a GROUP BY returns no rows if all rows are filtered out.
An aggregation query with no GROUP BY always returns one row, even if all rows are filtered out.
So, just remove the GROUP BY. And change the SELECT to:
select coalesce(sum(a.quantity), 0) - coalesce(max(cc.quantityCor), 0)
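The difference between the two cases can be seen on an empty table. This is a minimal sketch; rcv_empty_demo is a hypothetical stand-in, not the real RCV_TRANSACTIONS.

```sql
CREATE TABLE rcv_empty_demo (shipment_line_id INT, quantity INT);   -- deliberately left empty

-- With GROUP BY: no input rows means no groups, so zero rows come back.
SELECT COALESCE(SUM(quantity), 0) AS total
FROM rcv_empty_demo
WHERE shipment_line_id = 7085740
GROUP BY shipment_line_id;

-- Without GROUP BY: exactly one row, and COALESCE turns the NULL sum into 0.
SELECT COALESCE(SUM(quantity), 0) AS total
FROM rcv_empty_demo
WHERE shipment_line_id = 7085740;
-- total = 0
```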
I may be wrong, but it seems you merely want to subtract CORRECT quantity from DELIVER quantity for shipment 7085740. You don't need a complicated query for that. Especially your GROUP BY clauses make no sense if that is what you are after.
One way to write this query would be:
select
sum(case when transaction_type = 'DELIVER' then quantity else 0 end) -
sum(case when transaction_type = 'CORRECT' then quantity else 0 end) as diff
from rcv_transactions
where shipment_line_id = 7085740;
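As a sketch of how this behaves, the query can be tried on two made-up rows. Because SUM over zero rows is NULL, the version below also wraps the difference in COALESCE, on the assumption that 0 is wanted when the shipment has no rows at all. The table rcv_diff_demo is hypothetical.

```sql
CREATE TABLE rcv_diff_demo (shipment_line_id INT, transaction_type VARCHAR(10), quantity INT);
INSERT INTO rcv_diff_demo VALUES
    (7085740, 'DELIVER', 10),
    (7085740, 'CORRECT', 3);

SELECT COALESCE(
           SUM(CASE WHEN transaction_type = 'DELIVER' THEN quantity ELSE 0 END) -
           SUM(CASE WHEN transaction_type = 'CORRECT' THEN quantity ELSE 0 END),
           0) AS diff
FROM rcv_diff_demo
WHERE shipment_line_id = 7085740;
-- diff = 7 for the sample rows; 0 rather than NULL for a shipment with no rows
```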
I had a query like this and was trying to return 'X' when the item is not valid.
SELECT case when segment1 is not null then segment1 else 'X' end
--INTO v_orgValidItem
FROM mtl_system_items_b
WHERE segment1='1676001000'--'Jul-00'--l_item
and organization_id=168;
...but it was returning NULL.
I changed it to use aggregation with no GROUP BY, and now it returns 'X' when the item is not valid.
SELECT case when max(segment1) is not null then max(segment1) else 'X' end valid
--INTO v_orgValidItem
FROM mtl_system_items_b
WHERE segment1='1676001000'--'Jul-00'--l_item
and organization_id=168;--l_ship_to_organization_id_pb;
Here is another example, proving the order of operations really matters.
When there is no match for this quote number, this query returns NULL:
SELECT MAX(NVL(QUOTE_VENDOR_QUOTE_NUMBER,0))
FROM PO_HEADERS_ALL
WHERE QUOTE_VENDOR_QUOTE_NUMBER='foo.bar';
Reversing the order of MAX and NVL makes all the difference. This query returns 0, the NVL replacement value:
SELECT NVL(MAX(QUOTE_VENDOR_QUOTE_NUMBER),0)
FROM PO_HEADERS_ALL
WHERE QUOTE_VENDOR_QUOTE_NUMBER='foo.bar';
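Both orderings can be compared side by side on an empty table. The sketch uses the standard COALESCE in place of Oracle's NVL so that it runs elsewhere too; po_demo and its column are hypothetical.

```sql
CREATE TABLE po_demo (quote_number VARCHAR(20));   -- deliberately left empty

SELECT
    -- no rows reach the inner COALESCE, so MAX has nothing to aggregate and yields NULL
    MAX(COALESCE(quote_number, '0')) AS default_inside,
    -- MAX yields NULL first, then the outer COALESCE replaces it with '0'
    COALESCE(MAX(quote_number), '0') AS default_outside
FROM po_demo
WHERE quote_number = 'foo.bar';
-- default_inside = NULL, default_outside = '0'
```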

SQL query syntax in CASE WHEN ELSE END to count

I am writing a query to find the number of ED visits that were discharged from non-ED units.
The dep.ADT_UNIT_TYPE_C column stores 1 if the unit was an ED unit.
Assume NULL values are non-ED units for the purpose of this query.
Which of the following produces this number?
I am thinking it is A because, in my mind, that looks like the correct syntax:
COUNT(CASE WHEN ... THEN ... ELSE ... END) is the standard format.
A has that.
B doesn't have the THEN, so is it incorrect syntax?
Please help me understand the nuances between these choices.
A.)
COUNT( CASE WHEN dep.ADT_UNIT_TYPE_C is NULL OR dep.ADT_UNIT_TYPE_C <> 1 THEN NULL
ELSE 1
END )
B.)
COUNT( CASE WHEN dep.ADT_UNIT_TYPE_C is NULL or dep.ADT_UNIT_TYPE_C <> 1
ELSE NULL
END)
C.)
CASE WHEN dep.ADT_UNIT_TYPE_C Is NULL or dep.ADT_UNIT_TYPE_C <> 1 THEN COUNT (NULL)
ELSE COUNT (1)
END
D.)
CASE WHEN dep.ADT_UNIT_TYPE_C is NULL or dep.ADT_UNIT_TYPE_C <> 1 THEN COUNT(1)
ELSE COUNT(NULL)
END
You can count the returned records with COUNT(*) and put the condition in the WHERE clause.
If you are using Oracle, you can use NVL.
The sample below is for Oracle; in SQL Server you can use ISNULL and in MySQL IFNULL instead, and COALESCE works in all three.
SELECT COUNT(*) FROM dep WHERE NVL(ADT_UNIT_TYPE_C, 0) != 1
It looks, however, like you are joining this to another table, probably a visit table, so you want to count visits. The visit table probably stores some kind of department id or another way to join it to departments.
Something like this:
SELECT COUNT(*) FROM visit v, departments d WHERE v.dep_id = d.dep_id AND NVL(d.ADT_UNIT_TYPE_C, 0) != 1
If you want the entire list as shown above, use a GROUP BY. This will show you the number of visits for each department type.
SELECT d.ADT_UNIT_TYPE_C, COUNT(*) FROM visit v, departments d WHERE v.dep_id = d.dep_id GROUP BY d.ADT_UNIT_TYPE_C
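For the COUNT(CASE ...) form from the question, the key fact is that COUNT(expression) ignores NULLs, so only rows where the CASE produces a non-NULL value are counted. The sketch below compares the variants on invented rows (dep_demo is hypothetical). Note that the expression in option A returns NULL for the non-ED units, so it counts the ED units instead.

```sql
-- Hypothetical sample: three ED units (1), one other unit type (2), one NULL.
CREATE TABLE dep_demo (dep_id INT, ADT_UNIT_TYPE_C INT);
INSERT INTO dep_demo VALUES (1, 1), (2, 2), (3, NULL), (4, 1), (5, 1);

SELECT
    COUNT(*) AS all_rows,
    -- 1 for non-ED units, NULL (ignored by COUNT) otherwise
    COUNT(CASE WHEN ADT_UNIT_TYPE_C IS NULL OR ADT_UNIT_TYPE_C <> 1
               THEN 1 END) AS non_ed_rows,
    -- option A's expression: NULL for non-ED units, 1 otherwise
    COUNT(CASE WHEN ADT_UNIT_TYPE_C IS NULL OR ADT_UNIT_TYPE_C <> 1
               THEN NULL ELSE 1 END) AS ed_rows
FROM dep_demo;
-- all_rows = 5, non_ed_rows = 2, ed_rows = 3
```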

Merge two rows replacing nulls in pivot

Here's my SQL:
SELECT a."incomeNumber"
, (CASE WHEN b."traitName" = 'sometrait1' THEN b."traitValue" END) AS "numberResult"
, (CASE WHEN b."traitName" = 'sometrait2' THEN b."traitValue" END) AS "dateResult"
FROM "request" a
JOIN "traits" b ON a.id=b."requestId"
WHERE b."traitName" = 'sometrait1'
OR b."traitName" = 'sometrait2'
GROUP BY a."incomeNumber"
, b."traitName"
, b."traitValue"
Result
But I want to get one row (99, 1, 01.03.2018) per request. I can't come up with a solution for dealing with the traits table, since sometrait1 and sometrait2 are two different rows.
I'm using Postgres 9.6 and I want this solution to be plain SQL if possible.
OK, I solved my problem. I just needed to remove traitName and traitValue from the GROUP BY clause and add MAX around the CASE expressions.
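Written out, the fix described above might look like the following sketch. The two sample rows are invented to match the values mentioned in the question, and the column types are assumptions.

```sql
-- Hypothetical sample data matching the shape described in the question.
CREATE TABLE "request" (id INT, "incomeNumber" INT);
CREATE TABLE "traits" ("requestId" INT, "traitName" VARCHAR(50), "traitValue" VARCHAR(50));
INSERT INTO "request" VALUES (1, 99);
INSERT INTO "traits" VALUES (1, 'sometrait1', '1'), (1, 'sometrait2', '01.03.2018');

SELECT a."incomeNumber",
       -- MAX skips the NULLs produced for the other trait's row
       MAX(CASE WHEN b."traitName" = 'sometrait1' THEN b."traitValue" END) AS "numberResult",
       MAX(CASE WHEN b."traitName" = 'sometrait2' THEN b."traitValue" END) AS "dateResult"
FROM "request" a
JOIN "traits" b ON a.id = b."requestId"
WHERE b."traitName" IN ('sometrait1', 'sometrait2')
GROUP BY a."incomeNumber";
-- 99 | 1 | 01.03.2018
```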