Sum distinct records in a table with duplicates in Teradata

Sum distinct records in a table with duplicates in Teradata - sql

I have a table that has some duplicates. I can count the distinct records to get the Total Volume. When I try to Sum when the CompTia Code is B92 and run distinct is still counts the dupes.
Here is the query:
select
a.repair_week_period,
count(distinct a.notif_id) as Total_Volume,
sum(distinct case when a.header_comptia_cd = 'B92' then 1 else 0 end) as B92_Sum
FROM artemis_biz_app.aca_service_event a
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
group by 1
order by 1
;
Is There a way to only SUM the distinct records for B92?
I also tried inner joining the table on itself by selecting the distinct notification id and joining on that notification id, but still getting wrong sum counts.
Thanks!

Your B92_Sum currently returns either NULL, 1 or 2, this is definitely no sum.
To sum distinct values you need something like
sum(distinct case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end)
If this column_to_sum is actually the notif_id you get a conditional count but not a sum.
Otherwise the distinct might remove too many vales and then you probably need a Derived Table where you remove duplicates before aggregation:
select
repair_week_period,
--no more distinct needed
count(a.notif_id) as Total_Volume,
sum(case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end) as B92_Sum
FROM
(
select repair_week_period,
notif_id
header_comptia_cd,
column_to_sum
from artemis_biz_app.aca_service_event
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
-- only onw row per notif_id
qualify row_number() over (partition by notif_id order by ???) = 1
) a
group by 1
order by 1
;

#dnoeth It seems the solution to my problem was not to SUM the data, but to count distinct it.
This is how I resolved my problem:
count(distinct case when a.header_comptia_cd = 'B92' then a.notif_id else NULL end) as B92_Sum

Related

Count the occurrences of a given list of values in a column using a single SQL query

I would like to get the count of occurrences of a given list of values in a column using a single SQL query. The operations must be optimised for performance.
Please refer the example given below,
Sample Table name - history
code_list
5lysgj
627czl
1lqnd8
627czl
dtrtvp
627czl
esdop9
esdop9
3by104
1lqnd8
Expected Output
Need to get the count of occurrences for these given list of codes 627czl, 1lqnd8, esdop9, aol4m6 in the format given below.
code
count
627czl
3
esdop9
2
1lqnd8
2
aol4m6
0
Method I tried in show below but the count of each input is shown as a new column using this query,
SELECT
sum(case when h.code_list = 'esdop9' then 1 else 0 end) AS count_esdop9,
sum(case when h.code_list = '627czl' then 1 else 0 end) AS count_627czl,
sum(case when h.code_list = '1lqnd8' then 1 else 0 end) AS count_1lqnd8,
sum(case when h.code_list = 'aol4m6' then 1 else 0 end) AS count_aol4m6
FROM history h;
Note - The number inputs need to be given in the query in 10 also the real table has millions of records.

If i properly understand you need to get the count of occurrences for the following codes: 627czl, 1lqnd8, esdop9.
In this case you can try this one:
SELECT code_list, count(*) as count_
FROM history
WHERE code_list in ('627czl','1lqnd8','esdop9')
GROUP BY code_list
ORDER BY count_ DESC;
dbfiddle
If you need to get the count of occurrences for all codes you can run the following query:
SELECT code_list, count(*) as count_
FROM history
GROUP BY code_list
ORDER BY count_ DESC;

you can try to use GROUP BY
Something like this
SELECT code_list, COUNT(1) as 'total' ROM h GROUP by code_list order by 'total' ;

check and compare the count from two tables without relation

I have below tables
Table1: "Demo"
Columns: SSN, sales, Create_DT,Update_Dt
Table2: "Agent"
Columns: SSN,sales, Agent_Name, Create_Dt, Update_DT
Scenario 1 and desired result set:
I want output as 0 if the count of SSN in Demo table is matched with the count of SSN in Agent table
if the count is not matched then I want result as 1
Scenario 2 and desired result set:
I want output as 0 if the sum of sales in Demo table is matched with the sum of sales in Agent table
if the sum is not matched then I want result as 1
Please help on this query part
Thanks

You can write two queries separately to take counts within the result query
SELECT (SELECT count(Demo.SSN) as SSN1 from Demo)!=(SELECT count(Agent.SSN) as SSN2 from Agent) AS Result;
Basically what the inner queries does is it checked whether the counts are equal or not and outputs 1 if it is true and 0 if it is false. Since you have asked to output 1 if it is false I used '!=' sign.
You can try the same procedure in scenario 2 also

For scenario 1
select (Case when (select count(ssn) from Demo)=(select count(ssn) from Agent) then 0 else 1 end) as desired_result
If you want to count unique ssn then:
select (Case when (select count(distinct ssn) from Demo)=(select count(distinct ssn) from Agent) then 0 else 1 end) as desired_result
For scenario 2:
select (Case when (select sum(sales) from Demo)=(select sum(sales) from Agent) then 0 else 1 end) as desired_result

I would suggest one query with both sets of information:
select (d.num_ssn <> a.num_ssn) as have_different_ssn_count,
(d.sales <> a.sales) as have_different_sales
from (select count(distinct ssn) as num_ssn,
coalesce(sum(sales), 0) as sales
from demo
) d cross join
(select count(distinct ssn) as num_ssn,
coalesce(sum(sales), 0) as sales
from agent
) a;
Note: This returns boolean values -- true/false rather than 1/0. If you really want 0/1, then use case:
select (case when d.num_ssn <> a.num_ssn then 1 else 0 end) as have_different_ssn_count,
(case when d.sales <> a.sales then 1 else 0 end) as have_different_sales
It would not surprise me if you were not only interested in the total counts but also that the agent/sales combinations are the same in both tables. If that is the case, please ask a new question with a clear explanation. Sample data and desired results help.

Counting Booleans for Distinct and Non Distinct ID numbers

I have a simple table that looks like the following PNG file from the following join:
SELECT *
FROM tableA A
JOIN tableB B ON B.Main_SPACE_ID = A.Main_SPACE_ID
Table A contains Guest_ON and User_Controls (last 2 columns) and Table B contains Trigger_ON and DOCX_ON.
Issue:
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
The problem is that subspace_ID from table B lives within the main_space_id from table A and therefore creates a situation where I am double counting.
I only want to count the trues for a distinct Main_space ID
Current Data Model
Desired Output:
From the above screenshot, I am trying to get a count of true values without double counting in the case for tableA_MAIN_SPACE_ID.
As you can see, each row is counted for true values as it relates to the subspace_ID (table B) for totals of 12 and 8 (1 if True, 0 if False) and for tableA, I am only counting distinct values so we only count Trues for a single MainspaceID and avoid recounting them.
If someone can advise on how to get this output from my current data model that would be very helpful!
My attempt as follows double counts trues for the Main space ID column..
SELECT
count(CASE WHEN B.TRIGGER_ON THEN 1 END) as TRIGGER_ON,
count(CASE WHEN B.DOCX_ON THEN 1 END) as DOCX_ON,
count(CASE WHEN A.GUEST_ON THEN 1 END) as SPRINTS,
count(CASE WHEN A.USER_CONTROLS THEN 1 END) as SPRINTS
FROM DataModel

What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
You can use conditional aggregation. In Snowflake, you can use the convenient COUNT_IF() for the first two columns. However, for the second two, you need COUNT(DISTINCT) with conditional logic:
SELECT COUNT_IF( B.Trigger_on ) as Trigger_On,
COUNT_IF( B. DOCX_ON ) as DOCX_ON,
COUNT(DISTINCT CASE WHEN A.GUEST_ON THEN A.Main_SPACE_ID END) as GUEST_ON,
COUNT(DISTINCT CASE WHEN A. USER_CONTROLS THEN A.Main_SPACE_ID END) as USER_CONTROLS
FROM tableA A JOIN
tableB B
ON B.Main_SPACE_ID = A.Main_SPACE_ID;

Mabye:
SELECT
COUNT(CASE WHEN B.TRIGGER_ON THEN 1 END) AS TRIGGER_ON,
COUNT(CASE WHEN B.DOCX_ON THEN 1 END) AS DOCX_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT A.MAIN_SPACE_ID, A.GUEST_ON FROM DataModel WHERE A.GUEST_ON = TRUE) A) AS GUEST_ON
(SELECT COUNT(*) FROM (SELECT DISTINCT A.USER_CONTROLS, A.GUEST_ON FROM DataModel WHERE A.USER_CONTROLS = TRUE) A) AS USER_CONTROLS
FROM DataModel

Proportion request sql

There is a table of accidents and output the share of accidents number 2 to all accidents I wrote this code, but I can not make it work:
select ((select count("ID") from "DTP" where "REASON"=2)/count("REASON"))
from "DTP"
group by "ID"

Something like this (not tested):
select id, count(case reason when 2 then 1 end)/count(*) as proportion
from your_table
-- where ... (if you need to filter, for example by date)
group by id
;
count(*) counts all the rows in a group (that is, all the rows for each separate id). The case expression returns 1 when the reason is 2 and it returns null otherwise; count counts only non-null values, so it will count the rows where the reason is 2.

You can use avg():
select id,
avg(case when reason = 2 then 1.0 else 0 end)
from "DTP"
group by "ID"
This produces the ratio for each id -- based on your sample query. If you only want one row for all the data, then:
select avg(case when reason = 2 then 1.0 else 0 end)
from "DTP";

SQL Nested Select statements with COUNT()

I'll try to describe as best I can, but it's hard for me to wrap my whole head around this problem let alone describe it....
I am trying to select multiple results in one query to display the current status of a database. I have the first column as one type of record, and the second column as a sub-category of the first column. The subcategory is then linked to more records underneath that, distinguished by status, forming several more columns. I need to display every main-category/subcategory combination, and then the count of how many of each sub-status there are beneath that subcategory in the subsequent columns. I've got it so that I can display the unique combinations, but I'm not sure how to nest the select statements so that I can select the count of a completely different table from the main query. My problem lies in that to display the main category and sub category, I can pull from one table, but I need to count from a different table. Any ideas on the matter would be greatly appreciated
Here's what I have. The count statements would be replaced with the count of each status:
SELECT wave_num "WAVE NUMBER",
int_tasktype "INT / TaskType",
COUNT (1) total,
COUNT (1) "LOCKED/DISABLED",
COUNT (1) released,
COUNT (1) "PARTIALLY ASSEMBLED",
COUNT (1) assembled
FROM (SELECT DISTINCT
(t.invn_need_type || ' / ' || s.code_desc) int_tasktype,
t.task_genrtn_ref_nbr wave_num
FROM sys_code s, task_hdr t
WHERE t.task_genrtn_ref_nbr IN
(SELECT ship_wave_nbr
FROM ship_wave_parm
WHERE TRUNC (create_date_time) LIKE SYSDATE - 7)
AND s.code_type = '590'
AND s.rec_type = 'S'
AND s.code_id = t.task_type),
ship_wave_parm swp
GROUP BY wave_num, int_tasktype
ORDER BY wave_num
Image here: http://i.imgur.com/JX334.png

Guessing a bit,both regarding your problem and Oracle (which I've - unfortunately - never used), hopefully it will give you some ideas. Sorry for completely messing up the way you write SQL, SELECT ... FROM (SELECT ... WHERE ... IN (SELECT ...)) simply confuses me, so I have to restructure:
with tmp(int_tasktype, wave_num) as
(select distinct (t.invn_need_type || ' / ' || s.code_desc), t.task_genrtn_ref_nbr
from sys_code s
join task_hdr t
on s.code_id = t.task_type
where s.code_type = '590'
and s.rec_type = 'S'
and exists(select 1 from ship_wave_parm p
where t.task_genrtn_ref_nbr = p.ship_wave_nbr
and trunc(p.create_date_time) = sysdate - 7))
select t.wave_num "WAVE NUMBER", t.int_tasktype "INT / TaskType",
count(*) TOTAL,
sum(case when sst.sub_status = 'LOCKED' then 1 end) "LOCKED/DISABLED",
sum(case when sst.sub_status = 'RELEASED' then 1 end) RELEASED,
sum(case when sst.sub_status = 'PARTIAL' then 1 end) "PARTIALLY ASSEMBLED",
sum(case when sst.sub_status = 'ASSEMBLED' then 1 end) ASSEMBLED
from tmp t
join sub_status_table sst
on t.wave_num = sst.wave_num
group by t.wave_num, t.int_tasktype
order by t.wave_num
As you notice, I don't know anything about the table with the substatuses.

You can use inner join, grouping and count to get your result:
suppose tables are as follow :
cat (1)--->(n) subcat (1)----->(n) subcat_detail.
so the query would be :
select cat.title cat_title ,subcat.title subcat_title ,count(*) as cnt from
cat inner join sub_cat on cat.id=subcat.cat_id
inner join subcat_detail on subcat.ID=am.subcat_detail_id
group by cat.title,subcat.title

Generally when you need different counts, you need to use the CASE statment.
select count(*) as total
, case when field1 = "test' then 1 else 0 end as testcount
, case when field2 = 'yes' then 1 else 0 endas field2count
FROM table1

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Sum distinct records in a table with duplicates in Teradata - sql

#dnoeth It seems the solution to my problem was not to SUM the data, but to count distinct it. This is how I resolved my problem: count(distinct case when a.header_comptia_cd = 'B92' then a.notif_id else NULL end) as B92_Sum

Related

Count the occurrences of a given list of values in a column using a single SQL query

check and compare the count from two tables without relation

Counting Booleans for Distinct and Non Distinct ID numbers

Proportion request sql

SQL Nested Select statements with COUNT()

Categories

Resources