Partition table based on joined table - sql

We have 2 tables, Lead and Task.
One Lead can have multiple Tasks.
We want to determine whether a Lead has a Task whose description contains the string 'x'.
If it does, the Lead should belong to group1; if it doesn't, to group2.
Then we want to count the leads per group and week.
The problem is that if a Lead has several Tasks and only one of them has the string 'x' in its description, the Lead is counted in both groups.
We would need something that resembles a break; statement in the IFF clause of the subquery, so that once the first condition (the description contains the string 'x') is satisfied, the other Tasks are not counted anymore.
How would we achieve that?
So far we have the following statement:
--SQL:
SELECT LeadDate, GROUP, COUNT(LEAD_ID_T1)
FROM LEAD Lead INNER JOIN
(SELECT DISTINCT LEAD.ID AS LEAD_ID_T1,
IFF(CONTAINS(Task.DESCRIPTION,
'x'),
'GROUP1',
'GROUP2') AS GROUP
FROM TASK Task
RIGHT JOIN LEAD ON TASK.WHO_ID = LEAD.ID
) T1 ON T1.LEAD_ID_T1 = LEAD.ID
GROUP BY LeadDate,GROUP;
The code breaks because it cannot aggregate the measures.
Really thankful for any input. This has been bothering me for a few days now.

I am thinking EXISTS with a CASE expression:
select l.*,
(case when exists (select 1
from task t
where t.who_id = l.id and
t.description like '%x%'
)
then 'GROUP1' else 'GROUP2'
end) as the_group
from lead l;
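To get from the per-lead flag to the counts per group and date that the question asks for, the EXISTS query can be wrapped in a derived table and grouped. Here is a minimal sketch using Python's built-in sqlite3 module. The table names leads and tasks, the column lead_date, and the sample rows are made up for illustration, and SQLite stands in for the original database, so the syntax may need adjusting.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative stand-ins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (id INTEGER PRIMARY KEY, lead_date TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, who_id INTEGER, description TEXT);
    INSERT INTO leads VALUES (1, '2021-01-04'), (2, '2021-01-04'), (3, '2021-01-11');
    -- Lead 1 has a matching and a non-matching task, lead 2 only a
    -- non-matching one, and lead 3 has no tasks at all.
    INSERT INTO tasks VALUES
        (10, 1, 'call about x'),
        (11, 1, 'send mail'),
        (12, 2, 'send mail');
""")

# Classify each lead exactly once with EXISTS, then count leads per date and group.
group_counts = conn.execute("""
    SELECT lead_date, the_group, COUNT(*) AS lead_count
    FROM (SELECT l.id, l.lead_date,
                 CASE WHEN EXISTS (SELECT 1 FROM tasks t
                                   WHERE t.who_id = l.id
                                     AND t.description LIKE '%x%')
                      THEN 'GROUP1' ELSE 'GROUP2'
                 END AS the_group
          FROM leads l) AS per_lead
    GROUP BY lead_date, the_group
    ORDER BY lead_date, the_group
""").fetchall()

for row in group_counts:
    print(row)
```

Because the classification happens once per lead before any grouping, lead 1 lands only in GROUP1 even though one of its tasks does not match.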

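You can also try something like this: a CASE expression that returns 1 or 0, then take the SUM. Note that this counts tasks rather than leads, so a lead with several tasks still contributes once per task.
SELECT LeadDate,
sum(CASE WHEN t.description like '%x%' then 1 else 0 end) as Group1,
sum(CASE WHEN t.description like '%x%' then 0 else 1 end) as Group2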
FROM TASK t
RIGHT JOIN LEAD l ON t.WHO_ID = l.ID
GROUP BY LeadDate;

Related

Counting Booleans for Distinct and Non Distinct ID numbers

I have a simple table that looks like the following PNG file from the following join:
SELECT *
FROM tableA A
JOIN tableB B ON B.Main_SPACE_ID = A.Main_SPACE_ID
Table A contains Guest_ON and User_Controls (last 2 columns) and Table B contains Trigger_ON and DOCX_ON.
Issue:
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
The problem is that subspace_ID from table B lives within the main_space_id from table A and therefore creates a situation where I am double counting.
I only want to count the trues for a distinct Main_space ID
Current Data Model
Desired Output:
From the above screenshot, I am trying to get a count of true values without double counting in the case for tableA_MAIN_SPACE_ID.
As you can see, each row is counted for true values as it relates to the subspace_ID (table B) for totals of 12 and 8 (1 if True, 0 if False) and for tableA, I am only counting distinct values so we only count Trues for a single MainspaceID and avoid recounting them.
If someone can advise on how to get this output from my current data model that would be very helpful!
My attempt below double counts the trues for the Main space ID columns:
SELECT
count(CASE WHEN B.TRIGGER_ON THEN 1 END) as TRIGGER_ON,
count(CASE WHEN B.DOCX_ON THEN 1 END) as DOCX_ON,
count(CASE WHEN A.GUEST_ON THEN 1 END) as GUEST_ON,
count(CASE WHEN A.USER_CONTROLS THEN 1 END) as USER_CONTROLS
FROM DataModel
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
You can use conditional aggregation. In Snowflake, you can use the convenient COUNT_IF() for the first two columns. However, for the second two, you need COUNT(DISTINCT) with conditional logic:
SELECT COUNT_IF( B.Trigger_on ) as Trigger_On,
COUNT_IF( B.DOCX_ON ) as DOCX_ON,
COUNT(DISTINCT CASE WHEN A.GUEST_ON THEN A.Main_SPACE_ID END) as GUEST_ON,
COUNT(DISTINCT CASE WHEN A.USER_CONTROLS THEN A.Main_SPACE_ID END) as USER_CONTROLS
FROM tableA A JOIN
tableB B
ON B.Main_SPACE_ID = A.Main_SPACE_ID;
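To see the difference between the plain conditional count and the COUNT(DISTINCT ...) version, here is a small sketch using Python's built-in sqlite3 module. SQLite has no COUNT_IF, so SUM over a 0/1 flag stands in for it. The table names table_a and table_b and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (main_space_id INTEGER PRIMARY KEY,
                          guest_on INTEGER, user_controls INTEGER);
    CREATE TABLE table_b (subspace_id INTEGER PRIMARY KEY, main_space_id INTEGER,
                          trigger_on INTEGER, docx_on INTEGER);
    -- Main space 1 has two subspaces, so its row repeats after the join.
    INSERT INTO table_a VALUES (1, 1, 0), (2, 0, 1);
    INSERT INTO table_b VALUES (11, 1, 1, 0), (12, 1, 1, 1), (21, 2, 0, 1);
""")

counts = conn.execute("""
    SELECT SUM(b.trigger_on) AS trigger_on,
           SUM(b.docx_on) AS docx_on,
           -- Count each main space at most once, however many subspaces it has.
           COUNT(DISTINCT CASE WHEN a.guest_on = 1 THEN a.main_space_id END) AS guest_on,
           COUNT(DISTINCT CASE WHEN a.user_controls = 1 THEN a.main_space_id END) AS user_controls,
           -- Naive per-row count, shown only for comparison.
           COUNT(CASE WHEN a.guest_on = 1 THEN 1 END) AS guest_on_double_counted
    FROM table_a a
    JOIN table_b b ON b.main_space_id = a.main_space_id
""").fetchone()

print(counts)
```

With this sample data the distinct count of guest_on is 1, while the naive count is 2 because main space 1 is repeated once per subspace.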
Maybe:
SELECT
COUNT(CASE WHEN TRIGGER_ON THEN 1 END) AS TRIGGER_ON,
COUNT(CASE WHEN DOCX_ON THEN 1 END) AS DOCX_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT MAIN_SPACE_ID FROM DataModel WHERE GUEST_ON = TRUE) G) AS GUEST_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT MAIN_SPACE_ID FROM DataModel WHERE USER_CONTROLS = TRUE) U) AS USER_CONTROLS
FROM DataModel

Calculate number of rows with having clause

I have 3 tables, and I'm trying to calculate the exact number of rows for two kinds of queries.
The first one must count the number of accounts that have exactly one row in accounts_extra for a specific service_id.
The second one must count the number of accounts that have exactly one row in accounts_extra and whose trial has not ended, for a specific id.
http://sqlfiddle.com/#!15/313db/3
In the first query I get 0, which is correct, but in the second query I get 1, which is not correct.
I assume the subscription is optional, which is why I get 1 in the second query. What should I do to get 0 in the second query while still taking trial_ends_at into consideration?
Your question is rather hard to follow, but I think this does what you are describing:
SELECT SUM( (cnt = 1)::int ) as count1,
SUM( (cnt = 1 AND cnt2 > 0)::int ) as count2
FROM (SELECT a.id, COUNT(DISTINCT ae.id) AS cnt,
COUNT(ans.id) as cnt2
FROM accounts a JOIN
accounts_extra ae
ON a.id = ae.account_id LEFT JOIN
account_number_subscriptions ans
ON ans.account_id = a.id AND
ans.trial_ends_at > now()
WHERE a.service_id = '101' AND
a.closed = false AND
ae.created_at < '2019-07-01'
GROUP BY a.id
) a;

How to do a COUNT with a WHERE clause?

I have a query joining 3 tables to collect information so I can calculate some KPIs, but my logic is flawed. Here is my current query:
SELECT t.idCustomer, t.nameCustomer,
COUNT(DISTINCT t.idTrip),
SUM(
CASE
WHEN t.tripDone <> 1
THEN 1
ELSE 0
END),
SUM(CASE
WHEN t.codeIncident = 'CANCEL'
THEN 1
ELSE 0
END)
FROM
(SELECT customer.idCustomer, customer.nameCustomer, trip.tripDone, incident.codeIncident
FROM CUSTOMER customer
JOIN TRIP trip ON customer.idCustomer = trip.idCustomer
JOIN INCIDENT incident ON trip.idTrip = incident.idTrip) t
GROUP BY t.idCustomer, t.nameCustomer
So, for each customer, I want to know:
COUNT(DISTINCT t.idTrip) -> the number of trips by this customer
SUM when t.tripDone <> 1 -> the number of trips that are done by this customer (not ongoing)
SUM when t.codeIncident = 'CANCEL' -> the number of trips by this customer where there was a cancellation
The big mistake I made here is that a trip can have multiple codeIncidents (for example, one record for an idTrip with the codeIncident 'CANCEL' and another record for the same idTrip with the codeIncident 'DELAYED'). So when I calculate the SUM when t.tripDone <> 1, I get a result of 2 instead of 1, because there are 2 records in my FROM clause that have t.tripDone <> 1 for the same idTrip.
Do you have any idea how I should write this query so that the SUM when tripDone <> 1 counts each idTrip only once?
Thanks a lot for the help!
If you need more information, I'm available, and sorry for my lack of English skills!
It sounds like you want to do the same count(distinct ...) pattern for the columns you're currently summing, but with some logic. You can use case within a count instead in the same way:
...
COUNT(
DISTINCT CASE
WHEN t.tripDone <> 1
THEN t.idTrip
ELSE null
END),
COUNT(
DISTINCT CASE
WHEN t.codeIncident = 'CANCEL'
THEN t.idTrip
ELSE null
END)
The else null is a bit redundant as that's the default. As count() ignores nulls, if the when isn't matched then that trip ID isn't counted.
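Here is a small sketch of that pattern using Python's built-in sqlite3 module, with one trip that has two incident rows. The sample data is invented, and the inner query is assumed to also select idTrip so that the outer query can reference t.idTrip. The not_done_rows column keeps the original SUM for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (idCustomer INTEGER PRIMARY KEY, nameCustomer TEXT);
    CREATE TABLE trip (idTrip INTEGER PRIMARY KEY, idCustomer INTEGER, tripDone INTEGER);
    CREATE TABLE incident (idTrip INTEGER, codeIncident TEXT);
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO trip VALUES (100, 1, 0), (101, 1, 1);
    -- Trip 100 has two incident rows, so it appears twice after the join.
    INSERT INTO incident VALUES (100, 'CANCEL'), (100, 'DELAYED'), (101, 'DELAYED');
""")

kpis = conn.execute("""
    SELECT t.idCustomer, t.nameCustomer,
           COUNT(DISTINCT t.idTrip) AS trips,
           -- Original approach: counts joined rows, so trip 100 is counted twice.
           SUM(CASE WHEN t.tripDone <> 1 THEN 1 ELSE 0 END) AS not_done_rows,
           -- Distinct approach: each matching trip is counted once.
           COUNT(DISTINCT CASE WHEN t.tripDone <> 1 THEN t.idTrip END) AS not_done_trips,
           COUNT(DISTINCT CASE WHEN t.codeIncident = 'CANCEL' THEN t.idTrip END) AS cancelled_trips
    FROM (SELECT c.idCustomer, c.nameCustomer, tr.idTrip, tr.tripDone, i.codeIncident
          FROM customer c
          JOIN trip tr ON c.idCustomer = tr.idCustomer
          JOIN incident i ON tr.idTrip = i.idTrip) t
    GROUP BY t.idCustomer, t.nameCustomer
""").fetchone()

print(kpis)
```

For this data the row-based SUM gives 2 while the distinct count gives 1, which is the behaviour described in the question.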
First, select the idTrip field in your inner query (the derived table "t"), because the outer query references t.idTrip.

SQL Nested Select statements with COUNT()

I'll try to describe as best I can, but it's hard for me to wrap my whole head around this problem let alone describe it....
I am trying to select multiple results in one query to display the current status of a database. I have the first column as one type of record, and the second column as a sub-category of the first column. The subcategory is then linked to more records underneath that, distinguished by status, forming several more columns. I need to display every main-category/subcategory combination, and then the count of how many of each sub-status there are beneath that subcategory in the subsequent columns. I've got it so that I can display the unique combinations, but I'm not sure how to nest the select statements so that I can select the count of a completely different table from the main query. My problem lies in that to display the main category and sub category, I can pull from one table, but I need to count from a different table. Any ideas on the matter would be greatly appreciated
Here's what I have. The count statements would be replaced with the count of each status:
SELECT wave_num "WAVE NUMBER",
int_tasktype "INT / TaskType",
COUNT (1) total,
COUNT (1) "LOCKED/DISABLED",
COUNT (1) released,
COUNT (1) "PARTIALLY ASSEMBLED",
COUNT (1) assembled
FROM (SELECT DISTINCT
(t.invn_need_type || ' / ' || s.code_desc) int_tasktype,
t.task_genrtn_ref_nbr wave_num
FROM sys_code s, task_hdr t
WHERE t.task_genrtn_ref_nbr IN
(SELECT ship_wave_nbr
FROM ship_wave_parm
WHERE TRUNC (create_date_time) LIKE SYSDATE - 7)
AND s.code_type = '590'
AND s.rec_type = 'S'
AND s.code_id = t.task_type),
ship_wave_parm swp
GROUP BY wave_num, int_tasktype
ORDER BY wave_num
Image here: http://i.imgur.com/JX334.png
I'm guessing a bit, both regarding your problem and Oracle (which I've - unfortunately - never used), but hopefully this will give you some ideas. Sorry for completely changing the way you write SQL; SELECT ... FROM (SELECT ... WHERE ... IN (SELECT ...)) simply confuses me, so I had to restructure:
with tmp(int_tasktype, wave_num) as
(select distinct (t.invn_need_type || ' / ' || s.code_desc), t.task_genrtn_ref_nbr
from sys_code s
join task_hdr t
on s.code_id = t.task_type
where s.code_type = '590'
and s.rec_type = 'S'
and exists(select 1 from ship_wave_parm p
where t.task_genrtn_ref_nbr = p.ship_wave_nbr
and trunc(p.create_date_time) = trunc(sysdate) - 7))
select t.wave_num "WAVE NUMBER", t.int_tasktype "INT / TaskType",
count(*) TOTAL,
sum(case when sst.sub_status = 'LOCKED' then 1 end) "LOCKED/DISABLED",
sum(case when sst.sub_status = 'RELEASED' then 1 end) RELEASED,
sum(case when sst.sub_status = 'PARTIAL' then 1 end) "PARTIALLY ASSEMBLED",
sum(case when sst.sub_status = 'ASSEMBLED' then 1 end) ASSEMBLED
from tmp t
join sub_status_table sst
on t.wave_num = sst.wave_num
group by t.wave_num, t.int_tasktype
order by t.wave_num
As you notice, I don't know anything about the table with the substatuses.
You can use inner joins, grouping, and COUNT to get your result.
Suppose the tables are related as follows:
cat (1)--->(n) subcat (1)----->(n) subcat_detail.
Then the query would be:
select cat.title cat_title, subcat.title subcat_title, count(*) as cnt
from cat
inner join subcat on cat.id = subcat.cat_id
inner join subcat_detail on subcat.id = subcat_detail.subcat_id -- subcat_id is an assumed foreign-key column name
group by cat.title, subcat.title
Generally, when you need different counts, you need to use a CASE expression inside an aggregate:
select count(*) as total
, sum(case when field1 = 'test' then 1 else 0 end) as testcount
, sum(case when field2 = 'yes' then 1 else 0 end) as field2count
FROM table1
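A quick way to check this kind of conditional count is to run it against a few rows. This is a minimal sketch using Python's built-in sqlite3 module, with invented sample data for table1, where each CASE is wrapped in SUM so it aggregates alongside COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 TEXT, field2 TEXT);
    INSERT INTO table1 VALUES
        ('test', 'yes'), ('test', 'no'),
        ('other', 'yes'), ('other', 'yes'), ('other', 'yes');
""")

# Each SUM adds 1 for rows that satisfy its own condition and 0 otherwise.
totals = conn.execute("""
    SELECT COUNT(*) AS total,
           SUM(CASE WHEN field1 = 'test' THEN 1 ELSE 0 END) AS testcount,
           SUM(CASE WHEN field2 = 'yes' THEN 1 ELSE 0 END) AS field2count
    FROM table1
""").fetchone()

print(totals)
```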

SQL query and joins

Please see my query below:
select I.OID_CUSTOMER_DIM, I.segment as PISTACHIO_SEGMENT,
MAX(CASE WHEN S.SUBSCRIPTION_TYPE = '5' THEN 'Y' ELSE 'N' END ) PB_SUBS,
max(case when S.SUBSCRIPTION_TYPE ='12' then 'Y' else 'N' end) DAILY_TASTE,
MAX(CASE WHEN S.SUBSCRIPTION_TYPE ='8' THEN 'Y' ELSE 'N' END) COOKING_FOR_TWO
FROM WITH_MAIL_ID i JOIN CUSTOMER_SUBSCRIPTION_FCT S
ON I.IDENTITY_ID = S.IDENTITY_ID
WHERE S.SITE_CODE = 'PB' and S.SUBSCRIPTION_END_DATE is null
group by I.oid_customer_dim, I.segment
With this query I am getting 654105 rows, which is fewer than one of the joined tables, with_mail_id, which has 706795 rows.
Now, for QC purposes, my manager is wondering why I don't have all the rows in my final table. I tried removing all the filters, but the row counts still don't match. What am I doing wrong?
I am not very good at SQL yet and this is really confusing me.
You're doing an inner join on the two tables, so only rows from WITH_MAIL_ID that can join against CUSTOMER_SUBSCRIPTION_FCT will be returned. Additionally, you have a GROUP BY clause.
First, the join. If you want to return all rows from WITH_MAIL_ID regardless of the join condition, you can use a left join, but then all the S. columns will be NULL for rows without a match, and you'll have to deal with that.
If you run this, you might see that the count equals the difference:
select count(*) from WITH_MAIL_ID i
left join CUSTOMER_SUBSCRIPTION_FCT S
on I.IDENTITY_ID = S.IDENTITY_ID
where s.IDENTITY_ID is NULL
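As a rough illustration of that check, here is a sketch using Python's built-in sqlite3 module with a handful of invented rows. The column lists are trimmed to what the join needs, and the values are made up. It compares the total row count, the number of identities that survive the inner join, and the number that have no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE with_mail_id (identity_id INTEGER PRIMARY KEY, oid_customer_dim INTEGER);
    CREATE TABLE customer_subscription_fct (identity_id INTEGER, subscription_type TEXT);
    INSERT INTO with_mail_id VALUES (1, 501), (2, 502), (3, 503);
    -- Identity 3 has no subscription rows, so an inner join drops it.
    INSERT INTO customer_subscription_fct VALUES (1, '5'), (1, '12'), (2, '8');
""")

total = conn.execute("SELECT COUNT(*) FROM with_mail_id").fetchone()[0]

# Identities that have at least one matching subscription row.
joined = conn.execute("""
    SELECT COUNT(DISTINCT i.identity_id)
    FROM with_mail_id i
    JOIN customer_subscription_fct s ON i.identity_id = s.identity_id
""").fetchone()[0]

# Identities with no matching subscription row (the anti-join from above).
unmatched = conn.execute("""
    SELECT COUNT(*)
    FROM with_mail_id i
    LEFT JOIN customer_subscription_fct s ON i.identity_id = s.identity_id
    WHERE s.identity_id IS NULL
""").fetchone()[0]

print(total, joined, unmatched)
```

Here the unmatched count accounts exactly for the gap between the table's row count and the joined identities. In the real query the WHERE filters on S and the grouping columns can widen that gap further.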
The most likely explanation, however, is simply the grouping. If you are grouping on two columns and selecting the max of various other columns based on that grouping, you would expect the number of rows returned to be lower than in the original table; otherwise, why bother grouping?
If I have data like this:
groupkey1  value
1          2
1          10
2          1
2          1
and I group by groupkey1 and select MAX(value), I would get 2 rows, [1,10] and [2,1], not 4 rows.