Count rows with and combine them with math operator - sql

I have a table with different events for each case. For each case, I want to count how many times event A occurs and subtract the number of times event B occurs.
I started with a query like this, but it does not work:
SELECT
((SELECT
case_id,
Count(*)
from database
where event = "A"
Group by case_id)
-
(SELECT
case_id,
Count(*)
from database
where event = "B"
Group by case_id)) as count,
case-id,
from database

I suspect that you want conditional aggregation:
select case_id,
( sum(case when event = 'A' then 1 else 0 end) -
sum(case when event = 'B' then 1 else 0 end)
) as diff
from database
group by case_id;
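A minimal sketch of how the conditional aggregation above behaves, run through Python's built-in sqlite3 module. The table name `events` and the sample rows are assumptions made up for illustration; they are not from the question.

```python
import sqlite3

# Hypothetical sample data: (case_id, event). Table name "events" is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (case_id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "A"), (1, "A"), (1, "B"), (2, "B"), (2, "C"), (3, "A")],
)

# One pass over the table: count A rows and B rows per case, then subtract.
diff_rows = conn.execute(
    """
    SELECT case_id,
           SUM(CASE WHEN event = 'A' THEN 1 ELSE 0 END)
         - SUM(CASE WHEN event = 'B' THEN 1 ELSE 0 END) AS diff
    FROM events
    GROUP BY case_id
    ORDER BY case_id
    """
).fetchall()

print(diff_rows)  # [(1, 1), (2, -1), (3, 1)]
```

A case with only B events gets a negative difference, and a case with neither event would get 0.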

Related

Is there a way to collect the data and inspect in one pass using groupby function

Sample Data of table_1
I have this query:
select
customer,
SUM(CASE WHEN activity IN ( 'a','b')
THEN 1
ELSE 0 END) AS num_activity_a_or_b
from table_1
group by customer
Results:
I want to extend this to return one more column: for a given code, say X1, if the customer has both activity "a" and activity "c", return num_of_a_and_c_activity.
I am a bit stuck on how to collect and inspect the code and activities in one pass.
Can we combine this with a window function to achieve it?
Please advise.
UPDATE:
Based on the updated results, maybe the query below is what you need.
I assume you need both a and c to be present, together with code x1.
So I count the distinct activities that are a or c and then do an integer division by 2. If only a is present, the distinct count is 1, and 1/2 = 0 in integer division.
The result is 1 only when both a and c are present.
select
customer,
SUM(CASE WHEN activity IN ( 'a','b')
THEN 1
ELSE 0
END) AS num_activity_a_or_b,
COUNT(DISTINCT CASE WHEN code IN ('x1') AND activity IN ( 'a','c')
THEN activity
ELSE NULL
END)/2 AS num_activity_a_and_c
from table_1
group by customer
If you instead want to count the rows with code x1 whose activity is either a or c, the query can be:
select
customer,
SUM(CASE WHEN activity IN ( 'a','b')
THEN 1
ELSE 0
END) AS num_activity_a_or_b,
SUM(CASE WHEN code IN ('x1') AND activity IN ( 'a','c')
THEN 1
ELSE 0
END) AS num_activity_a_or_c
from table_1
group by customer
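To make the difference between the two queries above concrete, here is a small sketch using Python's sqlite3 module. The sample rows and the customer names `c1` to `c3` are invented for illustration, and the integer-division behaviour shown is SQLite's; other databases may divide differently.

```python
import sqlite3

# Hypothetical rows for table_1: (customer, code, activity).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (customer TEXT, code TEXT, activity TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [
        ("c1", "x1", "a"), ("c1", "x1", "c"), ("c1", "x1", "b"),
        ("c2", "x1", "a"), ("c2", "x1", "a"), ("c2", "x2", "c"),
        ("c3", "x2", "b"),
    ],
)

activity_rows = conn.execute(
    """
    SELECT customer,
           SUM(CASE WHEN activity IN ('a', 'b') THEN 1 ELSE 0 END)
               AS num_activity_a_or_b,
           -- 1 only when both 'a' and 'c' occur with code x1 (2 / 2 = 1, 1 / 2 = 0)
           COUNT(DISTINCT CASE WHEN code IN ('x1') AND activity IN ('a', 'c')
                               THEN activity END) / 2
               AS num_activity_a_and_c,
           -- plain row count of x1 rows whose activity is 'a' or 'c'
           SUM(CASE WHEN code IN ('x1') AND activity IN ('a', 'c') THEN 1 ELSE 0 END)
               AS num_activity_a_or_c
    FROM table_1
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

print(activity_rows)  # [('c1', 2, 1, 2), ('c2', 2, 0, 2), ('c3', 1, 0, 0)]
```

Customer `c2` has two x1 rows with activity a but no x1 row with activity c, so the "and" column is 0 while the "or" column is 2.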

Group data by one column and creating a new column based on a rows in each group

I have 'Task' and 'Start time' columns in my data. For each Task, there may be one or more Start times. I want to categorize each task as an 'X' task if all its Start times are equal, and as a 'Y' task if its Start times are not all equal.
This is how the table should look:
This can be achieved by using a group by and counting the distinct start time and using a case to return X or Y.
Select task,
(case when count(distinct start_time) = 1 then 'X' else 'Y' end)
from tasks
group by task;
Or use this if you want it to look exactly like the picture:
Select tasks.task, tasks.start_time, new.new
from tasks, (Select task,
(case when count(distinct start_time) = 1 then 'X' else 'Y' end) as new
from tasks
group by task) as new
where tasks.task = new.task;
You can view my solution here https://paiza.io/projects/Zu7IBFc-5tFBK8xDuf3hPg?language=mysql P.S. I just use Integer instead of date because I didn't feel like dealing with dates lol.
With group by task get the value of new_column for each task and then join to the table tasks:
select t.id, t.task, g.new_column
from tasks t inner join (
select
task,
(case when count(distinct starttime) = 1 then 'X' else 'Y' end) new_column
from tasks
group by task
) g on g.task = t.task
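The group-then-join approach can be tried out with the following sketch, using Python's sqlite3 module. The column names `id`, `task`, and `start_time`, the integer start times, and the sample rows are all assumptions for illustration.

```python
import sqlite3

# Hypothetical tasks table; start_time is an integer here to keep the sample small.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER, task TEXT, start_time INTEGER)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [(1, "t1", 10), (2, "t1", 10), (3, "t2", 10), (4, "t2", 20), (5, "t3", 5)],
)

# Classify each task once in a grouped subquery, then join back to every row.
labelled = conn.execute(
    """
    SELECT t.id, t.task, g.new_column
    FROM tasks t
    INNER JOIN (
        SELECT task,
               CASE WHEN COUNT(DISTINCT start_time) = 1 THEN 'X' ELSE 'Y' END
                   AS new_column
        FROM tasks
        GROUP BY task
    ) g ON g.task = t.task
    ORDER BY t.id
    """
).fetchall()

print(labelled)
# [(1, 't1', 'X'), (2, 't1', 'X'), (3, 't2', 'Y'), (4, 't2', 'Y'), (5, 't3', 'X')]
```

A task with a single row, like `t3` here, has exactly one distinct start time and is therefore labelled 'X'.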

Sum distinct records in a table with duplicates in Teradata

I have a table that has some duplicates. I can count the distinct records to get the Total Volume. But when I try to sum the rows where the CompTia code is B92 and add distinct, it still counts the dupes.
Here is the query:
select
a.repair_week_period,
count(distinct a.notif_id) as Total_Volume,
sum(distinct case when a.header_comptia_cd = 'B92' then 1 else 0 end) as B92_Sum
FROM artemis_biz_app.aca_service_event a
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
group by 1
order by 1
;
Is there a way to SUM only the distinct records for B92?
I also tried inner joining the table to itself by selecting the distinct notification ids and joining on the notification id, but I still get wrong sums.
Thanks!
Your B92_Sum currently returns either 0 or 1, because it sums the distinct values of an expression that can only be 0 or 1. This is definitely not a sum.
To sum distinct values you need something like
sum(distinct case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end)
If this column_to_sum is actually the notif_id you get a conditional count but not a sum.
Otherwise the distinct might remove too many values, and then you probably need a Derived Table where you remove duplicates before aggregation:
select
repair_week_period,
-- no distinct needed any more
count(a.notif_id) as Total_Volume,
sum(case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end) as B92_Sum
FROM
(
select repair_week_period,
notif_id,
header_comptia_cd,
column_to_sum
from artemis_biz_app.aca_service_event
where Sales_Org_Cd = '8210'
and notif_creation_dt >= current_date - 180
-- only one row per notif_id
qualify row_number() over (partition by notif_id order by ???) = 1
) a
group by 1
order by 1
;
@dnoeth It seems the solution to my problem was not to SUM the data, but to count the distinct values.
This is how I resolved my problem:
count(distinct case when a.header_comptia_cd = 'B92' then a.notif_id else NULL end) as B92_Sum
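The difference between the conditional sum and the conditional distinct count can be seen in the sketch below. It runs on SQLite through Python's sqlite3 module rather than on Teradata, and the table contents are invented for illustration, so treat it as a demonstration of the idea only.

```python
import sqlite3

# Hypothetical rows: (repair_week_period, notif_id, header_comptia_cd).
# notif_id 100 and 200 are each stored twice to simulate the duplicates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aca_service_event "
    "(repair_week_period INTEGER, notif_id INTEGER, header_comptia_cd TEXT)"
)
conn.executemany(
    "INSERT INTO aca_service_event VALUES (?, ?, ?)",
    [
        (1, 100, "B92"), (1, 100, "B92"), (1, 101, "B92"), (1, 102, "A01"),
        (2, 200, "A01"), (2, 200, "A01"),
    ],
)

volume_rows = conn.execute(
    """
    SELECT repair_week_period,
           COUNT(DISTINCT notif_id) AS total_volume,
           -- counts every B92 row, duplicates included
           SUM(CASE WHEN header_comptia_cd = 'B92' THEN 1 ELSE 0 END) AS b92_rows,
           -- counts each B92 notification once; non-matching rows yield NULL and are ignored
           COUNT(DISTINCT CASE WHEN header_comptia_cd = 'B92' THEN notif_id END)
               AS b92_distinct
    FROM aca_service_event
    GROUP BY repair_week_period
    ORDER BY repair_week_period
    """
).fetchall()

print(volume_rows)  # [(1, 3, 3, 2), (2, 1, 0, 0)]
```

In week 1 the plain conditional sum reports 3 because notification 100 appears twice, while the distinct count reports the 2 notifications that actually carry code B92.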

SQL Multiple Rows to Single Row Multiple Columns

I am including a SQLFiddle as an example of where I am currently at. In the example image you can see that with simple grouping you get up to two lines per user, depending on which statuses they have.
http://sqlfiddle.com/#!3/9aa649/2
I want the result to look like the image below: a single line per user with two total columns, one for Fail Total and one for Pass Total. I have been able to come close, but since BOB only has Fails and no Passes, this query leaves BOB out of the results. I want BOB to be shown as well, with his 6 Fail and 0 Pass.
select a.PersonID,a.Name,a.Totals as FailTotal,b.Totals as PassTotals from (
select PersonID,Name,Status, COUNT(*) as Totals from UserReport
where Status = 'Fail'
group by PersonID,Name,Status) a
join
(
select PersonID,Name,Status, COUNT(*) as Totals from UserReport
where Status = 'Pass'
group by PersonID,Name,Status) b
on a.PersonID=b.PersonID
The below picture is what I want it to look like. Here is another SQL Fiddle that shows the above query in action
http://sqlfiddle.com/#!3/9aa649/13
Use conditional aggregation if the number of values for status column is fixed.
select PersonID,Name,
sum(case when "status" = 'Fail' then 1 else 0 end) as failedtotal,
sum(case when "status" = 'Pass' then 1 else 0 end) as passedtotals
from UserReport
group by PersonID,Name
Use conditional aggregation:
select PersonID, Name,
sum(case when Status = 'Fail' then 1 else 0 end) as FailedTotal,
sum(case when Status = 'Pass' then 1 else 0 end) as PassedTotal
from UserReport
group by PersonID, Name;
With conditional aggregation:
select PersonID,
Name,
sum(case when Status = 'Fail' then 1 else 0 end) as Failed,
sum(case when Status = 'Pass' then 1 else 0 end) as Passed
from UserReport
group by PersonID, Name
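As a sketch of why the inner join from the question drops BOB while conditional aggregation keeps him, here are both queries run with Python's sqlite3 module. The sample rows, including the second user AMY, are assumptions for illustration.

```python
import sqlite3

# Hypothetical UserReport rows: BOB has 6 Fails and no Pass; AMY has 1 Fail and 2 Passes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserReport (PersonID INTEGER, Name TEXT, Status TEXT)")
sample = [(1, "BOB", "Fail")] * 6 + [(2, "AMY", "Fail"), (2, "AMY", "Pass"), (2, "AMY", "Pass")]
conn.executemany("INSERT INTO UserReport VALUES (?, ?, ?)", sample)

# The question's approach: inner join of Fail totals and Pass totals.
joined = conn.execute(
    """
    SELECT a.PersonID, a.Name, a.Totals AS FailTotal, b.Totals AS PassTotals
    FROM (SELECT PersonID, Name, Status, COUNT(*) AS Totals FROM UserReport
          WHERE Status = 'Fail' GROUP BY PersonID, Name, Status) a
    JOIN (SELECT PersonID, Name, Status, COUNT(*) AS Totals FROM UserReport
          WHERE Status = 'Pass' GROUP BY PersonID, Name, Status) b
      ON a.PersonID = b.PersonID
    ORDER BY a.PersonID
    """
).fetchall()

# Conditional aggregation: one row per user, zeros where a status never occurs.
pivoted = conn.execute(
    """
    SELECT PersonID, Name,
           SUM(CASE WHEN Status = 'Fail' THEN 1 ELSE 0 END) AS FailTotal,
           SUM(CASE WHEN Status = 'Pass' THEN 1 ELSE 0 END) AS PassTotal
    FROM UserReport
    GROUP BY PersonID, Name
    ORDER BY PersonID
    """
).fetchall()

print(joined)   # [(2, 'AMY', 1, 2)]
print(pivoted)  # [(1, 'BOB', 6, 0), (2, 'AMY', 1, 2)]
```

The `ELSE 0` branch is what turns BOB's missing Pass rows into 0; without it, the sum over no matching rows would be NULL.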

SQLite3: Return a NULL if no records exist in SUM()

I would like to SUM() while also using a WHERE but when there are no records found for a certain ID I would like it to return NULL instead of just not returning anything.
Initial Code:
SELECT
ID,
SUM(CASE WHEN EVENTS = 3 THEN 1 ELSE 0 END)
FROM Events_ID
WHERE
YEAR = 2012
GROUP BY ID
This would not return an ID if there were no events for it in 2012.
I then changed it to the following that appears to work but is around 100x slower!
SELECT
ID,
(SELECT
SUM(CASE WHEN EVENTS = 3 THEN 1 ELSE 0 END)
FROM EVENTS_ID r WHERE r.ID = t.ID AND r.YEAR = 2012)
FROM (SELECT * FROM Events_ID GROUP BY ID) as t;
Is there any way to get the speed of the second query closer to that of the first?
Is this what you want?
SELECT ID,
SUM(CASE WHEN EVENTS = 3 AND YEAR = 2012 THEN 1 END)
FROM Events_ID
GROUP BY ID;
This will return all ids, with a NULL as a second value if no events match both conditions.
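A small sketch comparing the filtered query with the one suggested above, using Python's sqlite3 module. The sample rows are invented for illustration. Note that in this sketch an ID with 2012 rows but no matching event also comes back as NULL, because the CASE has no ELSE branch.

```python
import sqlite3

# Hypothetical Events_ID rows: (ID, EVENTS, YEAR).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events_ID (ID INTEGER, EVENTS INTEGER, YEAR INTEGER)")
conn.executemany(
    "INSERT INTO Events_ID VALUES (?, ?, ?)",
    [(1, 3, 2012), (1, 3, 2012), (1, 2, 2012), (2, 3, 2011), (3, 1, 2012)],
)

# Filtering in WHERE removes ID 2 entirely, since it has no 2012 rows.
filtered = conn.execute(
    """
    SELECT ID, SUM(CASE WHEN EVENTS = 3 THEN 1 ELSE 0 END)
    FROM Events_ID
    WHERE YEAR = 2012
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

# Moving the year test into the CASE keeps every ID; SUM over only NULLs is NULL.
all_ids = conn.execute(
    """
    SELECT ID, SUM(CASE WHEN EVENTS = 3 AND YEAR = 2012 THEN 1 END)
    FROM Events_ID
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

print(filtered)  # [(1, 2), (3, 0)]
print(all_ids)   # [(1, 2), (2, None), (3, None)]
```

Python's `None` is how sqlite3 represents SQL NULL in the fetched rows.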