SQL Count Distinct returning one extra count - sql

How is it possible that these two methods return different results?
Method 1 (returns correct count):
SELECT COUNT(DISTINCT contact_id)
FROM Traffic_Action
WHERE action_type IN ('Schedule a Tour', 'Schedule Follow-up', 'Lost')
Method 2 (returns one extra count):
SELECT COUNT(DISTINCT CASE WHEN action_type IN ('Schedule a Tour', 'Schedule Follow-up', 'Lost') THEN contact_id ELSE 0 END)
FROM Traffic_Action

Remove the ELSE part, because 0 is counted as a distinct value too:
SELECT COUNT(DISTINCT CASE WHEN
action_type in ('Schedule a Tour','Schedule Follow-up','Lost') THEN contact_id END)
FROM Traffic_Action

No wonder you are getting two different results.
First query:
It gives you the distinct count of contact_id values among the rows where action_type is Schedule a Tour, Schedule Follow-up or Lost:
SELECT COUNT(DISTINCT contact_id) FROM Traffic_Action WHERE action_type in
('Schedule a Tour','Schedule Follow-up','Lost')
Second query:
In this query every row whose action_type is not Schedule a Tour, Schedule Follow-up or Lost yields 0, and taking the distinct values adds that 0 as one extra value:
SELECT COUNT(DISTINCT CASE WHEN action_type in ('Schedule a Tour','Schedule Follow-up','Lost') THEN contact_id ELSE 0 END) FROM Traffic_Action
In simple words:
In the first query you filter on only three values.
In the second query you have no filter, just a CASE expression on the three values with an ELSE branch that returns 0 for non-matching rows.
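The difference between the three forms can be reproduced on a tiny table. The sketch below is only an illustration: it uses Python's sqlite3 module as a stand-in database, and the four sample rows are invented (contact 3 only has a non-matching action type).

```python
import sqlite3

# Hypothetical sample data: contact 3 only has a non-matching action type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Traffic_Action (contact_id INTEGER, action_type TEXT)")
conn.executemany(
    "INSERT INTO Traffic_Action VALUES (?, ?)",
    [(1, "Schedule a Tour"), (1, "Lost"), (2, "Schedule Follow-up"), (3, "Email")],
)

in_list = "action_type IN ('Schedule a Tour', 'Schedule Follow-up', 'Lost')"

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Method 1: filter first, then count distinct contacts.
with_where = scalar(
    f"SELECT COUNT(DISTINCT contact_id) FROM Traffic_Action WHERE {in_list}"
)
# Method 2: no filter; non-matching rows become 0, which is a distinct value.
with_else_0 = scalar(
    f"SELECT COUNT(DISTINCT CASE WHEN {in_list} THEN contact_id ELSE 0 END) "
    "FROM Traffic_Action"
)
# Fix: no ELSE, so non-matching rows become NULL and are ignored by COUNT.
without_else = scalar(
    f"SELECT COUNT(DISTINCT CASE WHEN {in_list} THEN contact_id END) "
    "FROM Traffic_Action"
)

print(with_where, with_else_0, without_else)  # 2 3 2
```

The extra count only shows up when at least one row fails the IN test and no real contact_id is 0.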

COUNT() ignores NULL values, but it does not ignore 0. Your second query turns every row with a non-matching action_type into 0 via the "ELSE" branch, and that 0 is counted as one extra distinct value. Without the ELSE, the CASE yields NULL for those rows and they drop out of the count.
You can quickly see how COUNT treats NULL in this example. It returns 2 although there are 3 rows:
select count(distinct a.col1)
from (
select 1 as Col1
union select 2
union select NULL
) a

Related

Proportion request sql

I have a table of accidents and need to output the share of accidents with reason 2 among all accidents. I wrote this code, but I cannot make it work:
select ((select count("ID") from "DTP" where "REASON"=2)/count("REASON"))
from "DTP"
group by "ID"
Something like this (not tested):
select id, count(case reason when 2 then 1 end)/count(*) as proportion
from your_table
-- where ... (if you need to filter, for example by date)
group by id
;
count(*) counts all the rows in a group (that is, all the rows for each separate id). The case expression returns 1 when the reason is 2 and it returns null otherwise; count counts only non-null values, so it will count the rows where the reason is 2.
You can use avg():
select "ID",
avg(case when "REASON" = 2 then 1.0 else 0 end)
from "DTP"
group by "ID"
This produces the ratio for each id -- based on your sample query. If you only want one row for all the data, then:
select avg(case when "REASON" = 2 then 1.0 else 0 end)
from "DTP";
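Both approaches (the AVG over a 1.0/0 flag and the ratio of two counts) can be compared on a small table. This is a sketch with invented rows, run through Python's sqlite3 module as a stand-in engine. The `* 1.0` in the count version is my addition: SQLite, like several other engines, does integer division when both operands are integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "DTP" ("ID" INTEGER, "REASON" INTEGER)')
# Hypothetical data: two of the four accidents have reason 2.
conn.executemany('INSERT INTO "DTP" VALUES (?, ?)', [(1, 2), (2, 2), (3, 1), (4, 3)])

# AVG over a flag that is 1.0 for reason 2 and 0 otherwise.
share_avg = conn.execute(
    'SELECT AVG(CASE WHEN "REASON" = 2 THEN 1.0 ELSE 0 END) FROM "DTP"'
).fetchone()[0]

# Conditional count divided by the total row count.
share_count = conn.execute(
    'SELECT COUNT(CASE "REASON" WHEN 2 THEN 1 END) * 1.0 / COUNT(*) FROM "DTP"'
).fetchone()[0]

print(share_avg, share_count)  # 0.5 0.5
```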

How do I count distinct to exclude a value?

Below are the different scales in a POS system. I am trying to count the number of distinct scales that are not 'MANUAL WT'.
This is what I have, but it returns 2 and not 6.
count (distinct (case when d.SCALE_IN_ID != 'MANUAL WT' then 1 else 0 end)) as Num_Scale
Consider:
select count(distinct case when scale_in_id <> 'MANUAL WT' then scale_in_id end) cnt
from mytable
The problem with your original query is that the case expression turns every value into either 0 or 1, and the aggregate function then counts how many distinct values there are. Since all values are 0s or 1s, there are only two distinct values (or one in edge cases), hence the result you are getting.
A simple WHERE clause will do:
select count(distinct scale_in_id) Num_Scale
from tablename
where scale_in_id <> 'MANUAL WT'
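A small reproduction makes the three variants easy to compare. The scale names below are made up, and sqlite3 from the Python standard library stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (scale_in_id TEXT)")
# Hypothetical data: three real scales (one used twice) plus manual weights.
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("SCALE 1",), ("SCALE 2",), ("SCALE 2",), ("SCALE 3",),
     ("MANUAL WT",), ("MANUAL WT",)],
)

def scalar(sql):
    return conn.execute(sql).fetchone()[0]

# Original attempt: counts the distinct flags 0 and 1, not the scales.
flag_count = scalar(
    "SELECT COUNT(DISTINCT CASE WHEN scale_in_id != 'MANUAL WT' THEN 1 ELSE 0 END) "
    "FROM mytable"
)
# CASE returns the scale itself, or NULL for 'MANUAL WT' (ignored by COUNT).
case_count = scalar(
    "SELECT COUNT(DISTINCT CASE WHEN scale_in_id <> 'MANUAL WT' THEN scale_in_id END) "
    "FROM mytable"
)
# Plain WHERE filter.
where_count = scalar(
    "SELECT COUNT(DISTINCT scale_in_id) FROM mytable WHERE scale_in_id <> 'MANUAL WT'"
)

print(flag_count, case_count, where_count)  # 2 3 3
```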

Why does this not return 0

I have a query like:
select nvl(nvl(sum(a.quantity),0)-nvl(cc.quantityCor,0),0)
from RCV_TRANSACTIONS a
LEFT JOIN (select c.shipment_line_id,c.oe_order_line_id,nvl(sum(c.quantity),0) quantityCor
from RCV_TRANSACTIONS c
where c.TRANSACTION_TYPE='CORRECT'
group by c.shipment_line_id,c.oe_order_line_id) cc on (a.shipment_line_id=cc.shipment_line_id and a.shipment_line_id=7085740)
where a.transaction_type='DELIVER'
and a.shipment_line_id=7085740
group by nvl(cc.quantityCor,0);
The query runs OK, but returns no value. I want it to return 0 if there is no quantity found. Where have I gone wrong?
An aggregation query with a GROUP BY returns no rows if all rows are filtered out.
An aggregation query with no GROUP BY always returns one row, even if all rows are filtered out.
So, just remove the GROUP BY. And change the SELECT to:
select coalesce(sum(a.quantity), 0) - coalesce(max(cc.quantityCor), 0)
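The GROUP BY rule can be checked on an empty result set. The sketch below uses a deliberately reduced, hypothetical table (`rcv`, two columns) in SQLite via Python; it is not the real RCV_TRANSACTIONS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced stand-in for RCV_TRANSACTIONS; it has no matching rows.
conn.execute("CREATE TABLE rcv (shipment_line_id INTEGER, quantity INTEGER)")
conn.execute("INSERT INTO rcv VALUES (1, 10)")

# With GROUP BY: all rows are filtered out, so there are no groups and no rows.
grouped = conn.execute(
    "SELECT COALESCE(SUM(quantity), 0) FROM rcv "
    "WHERE shipment_line_id = 7085740 GROUP BY shipment_line_id"
).fetchall()

# Without GROUP BY: the aggregate always produces exactly one row.
ungrouped = conn.execute(
    "SELECT COALESCE(SUM(quantity), 0) FROM rcv WHERE shipment_line_id = 7085740"
).fetchall()

print(grouped)    # []
print(ungrouped)  # [(0,)]
```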
I may be wrong, but it seems you merely want to subtract CORRECT quantity from DELIVER quantity for shipment 7085740. You don't need a complicated query for that. Especially your GROUP BY clauses make no sense if that is what you are after.
One way to write this query would be:
select
sum(case when transaction_type = 'DELIVER' then quantity else 0 end) -
sum(case when transaction_type = 'CORRECT' then quantity else 0 end) as diff
from rcv_transactions
where shipment_line_id = 7085740;
I had a query like this and was trying to return 'X' when the item is not valid.
SELECT case when segment1 is not null then segment1 else 'X' end
--INTO v_orgValidItem
FROM mtl_system_items_b
WHERE segment1='1676001000'--'Jul-00'--l_item
and organization_id=168;
..but it was returning NULL.
Changed to use aggregation with no group by and now it returns 'X' when the item is not valid.
SELECT case when max(segment1) is not null then max(segment1) else 'X' end valid
--INTO v_orgValidItem
FROM mtl_system_items_b
WHERE segment1='1676001000'--'Jul-00'--l_item
and organization_id=168;--l_ship_to_organization_id_pb;
Here is another example, proving the order of operations really matters.
When there is no match for this quote number, this query returns NULL:
SELECT MAX(NVL(QUOTE_VENDOR_QUOTE_NUMBER,0))
FROM PO_HEADERS_ALL
WHERE QUOTE_VENDOR_QUOTE_NUMBER='foo.bar';
..reversing the order of MAX and NVL makes all the difference. This query returns the NVL replacement value, 0:
SELECT NVL(MAX(QUOTE_VENDOR_QUOTE_NUMBER),0)
FROM PO_HEADERS_ALL
WHERE QUOTE_VENDOR_QUOTE_NUMBER='foo.bar';
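The same ordering effect can be shown outside Oracle. In this sketch COALESCE takes the place of NVL (SQLite has no NVL), and the table and its single row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for PO_HEADERS_ALL with one non-matching row.
conn.execute("CREATE TABLE po_headers (quote_number TEXT)")
conn.execute("INSERT INTO po_headers VALUES ('Q-1')")

# COALESCE inside MAX: it is applied per row, but no rows survive the WHERE,
# so MAX over the empty set is still NULL.
coalesce_inside = conn.execute(
    "SELECT MAX(COALESCE(quote_number, '0')) FROM po_headers "
    "WHERE quote_number = 'foo.bar'"
).fetchone()[0]

# COALESCE outside MAX: it is applied to the NULL that MAX returns.
coalesce_outside = conn.execute(
    "SELECT COALESCE(MAX(quote_number), '0') FROM po_headers "
    "WHERE quote_number = 'foo.bar'"
).fetchone()[0]

print(coalesce_inside, coalesce_outside)  # None 0
```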

Sum distinct records in a table with duplicates in Teradata

I have a table that has some duplicates. I can count the distinct records to get the Total Volume. When I try to sum the rows where the CompTia Code is B92 and add DISTINCT, it still counts the dupes.
Here is the query:
select
a.repair_week_period,
count(distinct a.notif_id) as Total_Volume,
sum(distinct case when a.header_comptia_cd = 'B92' then 1 else 0 end) as B92_Sum
FROM artemis_biz_app.aca_service_event a
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
group by 1
order by 1
;
Is There a way to only SUM the distinct records for B92?
I also tried inner joining the table on itself by selecting the distinct notification id and joining on that notification id, but still getting wrong sum counts.
Thanks!
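Your B92_Sum currently returns either 0 or 1 for each week, because the CASE only yields 0 and 1 and DISTINCT collapses the duplicates before summing; this is definitely not a sum.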
To sum distinct values you need something like
sum(distinct case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end)
If this column_to_sum is actually the notif_id, you get a conditional count, not a sum.
Otherwise the DISTINCT might remove too many values, and then you probably need a derived table where you remove duplicates before aggregation:
select
repair_week_period,
--no more distinct needed
count(a.notif_id) as Total_Volume,
sum(case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end) as B92_Sum
FROM
(
select repair_week_period,
notif_id,
header_comptia_cd,
column_to_sum
from artemis_biz_app.aca_service_event
where Sales_Org_Cd = '8210'
and notif_creation_dt >= current_date - 180
-- only one row per notif_id
qualify row_number() over (partition by notif_id order by ???) = 1
) a
group by 1
order by 1
;
@dnoeth It seems the solution to my problem was not to SUM the data, but to COUNT DISTINCT it.
This is how I resolved my problem:
count(distinct case when a.header_comptia_cd = 'B92' then a.notif_id else NULL end) as B92_Sum
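To see why the SUM(DISTINCT ...) over a 1/0 flag cannot work while the COUNT(DISTINCT ...) over notif_id does, here is a sketch on invented rows. SQLite (through Python) replaces Teradata, and the table is reduced to the two relevant columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced stand-in for aca_service_event with duplicated rows.
conn.execute("CREATE TABLE service_event (notif_id INTEGER, header_comptia_cd TEXT)")
conn.executemany(
    "INSERT INTO service_event VALUES (?, ?)",
    [(100, "B92"), (100, "B92"), (101, "B92"), (102, "A10"), (102, "A10")],
)

total_volume, sum_distinct_flag, count_distinct_b92 = conn.execute(
    """
    SELECT
      COUNT(DISTINCT notif_id),
      -- the flag only takes the values 1 and 0, so the distinct sum is 1
      SUM(DISTINCT CASE WHEN header_comptia_cd = 'B92' THEN 1 ELSE 0 END),
      -- distinct notifications with code B92; other rows become NULL
      COUNT(DISTINCT CASE WHEN header_comptia_cd = 'B92' THEN notif_id END)
    FROM service_event
    """
).fetchone()

print(total_volume, sum_distinct_flag, count_distinct_b92)  # 3 1 2
```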

Counting null and non-null values in a single query

I have a table
create table us
(
a number
);
Now I have data like:
a
1
2
3
4
null
null
null
8
9
Now I need a single query to count null and not null values in column a
This works for Oracle and SQL Server (you might be able to get it to work on another RDBMS):
select sum(case when a is null then 1 else 0 end) count_nulls
, count(a) count_not_nulls
from us;
Or:
select count(*) - count(a), count(a) from us;
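Both single-query forms can be tried against the sample data from the question. The sketch below runs them in SQLite through Python's sqlite3 module, which is only a convenient stand-in for Oracle or SQL Server here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us (a INTEGER)")
# The sample data from the question: six values and three NULLs.
conn.executemany(
    "INSERT INTO us VALUES (?)",
    [(1,), (2,), (3,), (4,), (None,), (None,), (None,), (8,), (9,)],
)

# SUM over a flag for the NULLs, COUNT(a) for the non-NULLs.
case_nulls, case_not_nulls = conn.execute(
    "SELECT SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END), COUNT(a) FROM us"
).fetchone()

# COUNT(*) counts rows, COUNT(a) skips NULLs; the difference is the NULL count.
diff_nulls, diff_not_nulls = conn.execute(
    "SELECT COUNT(*) - COUNT(a), COUNT(a) FROM us"
).fetchone()

print(case_nulls, case_not_nulls)  # 3 6
print(diff_nulls, diff_not_nulls)  # 3 6
```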
If I understood correctly you want to count all NULL and all NOT NULL in a column...
If that is correct:
SELECT count(*) FROM us WHERE a IS NULL
UNION ALL
SELECT count(*) FROM us WHERE a IS NOT NULL
Edited to have the full query, after reading the comments :]
SELECT COUNT(*), 'null_tally' AS narrative
FROM us
WHERE a IS NULL
UNION
SELECT COUNT(*), 'not_null_tally' AS narrative
FROM us
WHERE a IS NOT NULL;
Here is a quick and dirty version that works on Oracle:
select sum(case when a is null then 1 else 0 end) "Null values",
sum(case when a is null then 0 else 1 end) "Non-null values"
from us
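One pitfall with this kind of query: the simple form `CASE a WHEN NULL` compares with `=`, and `a = NULL` is never true, so the NULL branch is never taken. The searched form `CASE WHEN a IS NULL` is the one that works. A sketch on the question's data, using SQLite via Python as a stand-in engine:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us (a INTEGER)")
conn.executemany(
    "INSERT INTO us VALUES (?)",
    [(1,), (2,), (3,), (4,), (None,), (None,), (None,), (8,), (9,)],
)

# Simple CASE: "a = NULL" is never true, so every row falls through to ELSE.
simple_case = conn.execute(
    "SELECT SUM(CASE a WHEN NULL THEN 1 ELSE 0 END) FROM us"
).fetchone()[0]

# Searched CASE with IS NULL: matches the three NULL rows.
searched_case = conn.execute(
    "SELECT SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) FROM us"
).fetchone()[0]

print(simple_case, searched_case)  # 0 3
```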
for non nulls
select count(a)
from us
for nulls
select count(*)
from us
minus
select count(a)
from us
Hence
SELECT COUNT(A) NOT_NULLS
FROM US
UNION ALL
SELECT COUNT(*) - COUNT(A) NULLS
FROM US
ought to do the job
Better in that the column titles come out correct.
SELECT COUNT(A) NOT_NULL, COUNT(*) - COUNT(A) NULLS
FROM US
In some testing on my system, it costs a full table scan.
As I understand your question, you can just run this query to get the total NULL and total NOT NULL rows:
select count(*) - count(a) as 'Null', count(a) as 'Not Null' from us;
I usually use this trick:
select sum(case when a is null then 0 else 1 end) as count_notnull,
sum(case when a is null then 1 else 0 end) as count_null
from tab
Just to provide yet another alternative, Postgres 9.4+ allows applying a FILTER to aggregates:
SELECT
COUNT(*) FILTER (WHERE a IS NULL) count_nulls,
COUNT(*) FILTER (WHERE a IS NOT NULL) count_not_nulls
FROM us;
SQLFiddle: http://sqlfiddle.com/#!17/80a24/5
This is a little tricky. Even if the table has just one column, COUNT(1) and COUNT(column) can give different values.
set nocount on
declare @table1 table (empid int)
insert @table1 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(NULL),(11),(12),(NULL),(13),(14);
select * from @table1
select COUNT(1) as "COUNT(1)" from @table1
select COUNT(empid) "Count(empid)" from @table1
The first result shows that the table has 16 rows, two of which are NULL. COUNT(1), like COUNT(*), counts rows, so the result is 16. COUNT(empid) counts only the non-NULL values in the column empid, so the result is 14.
So whenever we use COUNT(column), we need to take care of NULL values, as shown below.
select COUNT(isnull(empid,1)) from @table1
will count both NULL and non-NULL values.
Note: Same thing applies even when the table is made up of more than one column. Count(1) will give total number of rows irrespective of NULL/Non-NULL values. Only when the column values are counted using Count(Column) we need to take care of NULL values.
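The three counts can be reproduced with the same 16 values. This is a sketch in SQLite through Python rather than T-SQL, so COALESCE is used where the answer uses ISNULL, and a regular table named `table1` replaces the table variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (empid INTEGER)")
# Same 16 values as in the answer: 14 numbers and two NULLs.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, None, 13, 14]
conn.executemany("INSERT INTO table1 VALUES (?)", [(v,) for v in values])

def scalar(sql):
    return conn.execute(sql).fetchone()[0]

count_rows = scalar("SELECT COUNT(1) FROM table1")          # counts every row
count_col = scalar("SELECT COUNT(empid) FROM table1")       # skips NULLs
# COALESCE stands in for T-SQL's ISNULL: NULLs become 1 and are counted again.
count_filled = scalar("SELECT COUNT(COALESCE(empid, 1)) FROM table1")

print(count_rows, count_col, count_filled)  # 16 14 16
```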
I had a similar issue: counting all distinct values, with NULL counted as one value too. A simple COUNT(DISTINCT ...) doesn't work in this case, as it does not take NULL values into account.
Here's a snippet that works in SQL and does not involve substituting new values.
Basically, after performing the DISTINCT, also return the row number in a new column (n) using the row_number() function, then perform a count on that column:
SELECT COUNT(n)
FROM (
SELECT *, row_number() OVER (ORDER BY [MyColumn] ASC) n
FROM (
SELECT DISTINCT [MyColumn]
FROM [MyTable]
) items
) distinctItems
Try this..
SELECT CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END a,
Count(1)
FROM us
GROUP BY CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END
Here are two solutions:
Select count(columnname) as countofNotNulls, count(isnull(columnname,1))-count(columnname) AS Countofnulls from tablename
OR
Select count(columnname) as countofNotNulls, count(*)-count(columnname) AS Countofnulls from tablename
Try
SELECT
SUM(ISNULL(a)) AS all_null,
SUM(!ISNULL(a)) AS all_not_null
FROM us;
Simple!
If you're using MS Sql Server...
SELECT COUNT(0) AS 'Null_ColumnA_Records',
(
SELECT COUNT(0)
FROM your_table
WHERE ColumnA IS NOT NULL
) AS 'NOT_Null_ColumnA_Records'
FROM your_table
WHERE ColumnA IS NULL;
I don't recommend doing this... but here you have it (in the same table as result)
Use the built-in ISNULL function.
All the answers are either wrong or extremely out of date.
The simple and correct way of doing this query is using the COUNT_IF function, where your database supports it (for example Snowflake or Trino; it is not standard SQL).
SELECT
COUNT_IF(a IS NULL) AS nulls,
COUNT_IF(a IS NOT NULL) AS not_nulls
FROM
us
SELECT SUM(NULLs) AS 'NULLS', SUM(NOTNULLs) AS 'NOTNULLs' FROM
(select count(*) AS 'NULLs', 0 as 'NOTNULLs' FROM us WHERE a is null
UNION select 0 as 'NULLs', count(*) AS 'NOTNULLs' FROM us WHERE a is not null) AS x
It's fugly, but it will return a single record with 2 cols indicating the count of nulls vs non nulls.
This works in T-SQL. If you're just counting the number of something and you want to include the nulls, use COALESCE instead of case.
IF OBJECT_ID('tempdb..#us') IS NOT NULL
DROP TABLE #us
CREATE TABLE #us
(
a INT NULL
);
INSERT INTO #us VALUES (1),(2),(3),(4),(NULL),(NULL),(NULL),(8),(9)
SELECT * FROM #us
SELECT CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END AS 'NULL?',
COUNT(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END) AS 'Count'
FROM #us
GROUP BY CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END
SELECT COALESCE(CAST(a AS NVARCHAR),'NULL') AS a,
COUNT(COALESCE(CAST(a AS NVARCHAR),'NULL')) AS 'Count'
FROM #us
GROUP BY COALESCE(CAST(a AS NVARCHAR),'NULL')
Building off of Alberto, I added the rollup.
SELECT [Narrative] = CASE
WHEN [Narrative] IS NULL THEN 'count_total' ELSE [Narrative] END
,[Count]=SUM([Count]) FROM (SELECT COUNT(*) [Count], 'count_nulls' AS [Narrative]
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NULL
UNION
SELECT COUNT(*), 'count_not_nulls ' AS narrative
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NOT NULL) S
GROUP BY [Narrative] WITH CUBE;
SELECT
ALL_VALUES
,COUNT(ALL_VALUES)
FROM(
SELECT
NVL2(A,'NOT NULL','NULL') AS ALL_VALUES
,NVL(A,0)
FROM US
)
GROUP BY ALL_VALUES
select count(isnull(NullableColumn,-1))
If it's MySQL, you can try something like this (NULL has to be tested with IS NULL; comparing with = 'null' only matches the string 'null'):
select
(select count(*) from TABLENAME WHERE a IS NULL) as total_null,
(select count(*) from TABLENAME WHERE a IS NOT NULL) as total_not_null
Just in case you wanted it in a single record:
select
(select count(*) from tbl where colName is null) Nulls,
(select count(*) from tbl where colName is not null) NonNulls
;-)
for counting not null values
select count(*) from us where a is not null;
for counting null values
select count(*) from us where a is null;
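I created the table in Postgres 10 and both of the following ran, but note that both return the total number of rows: a is null evaluates to true or false, never NULL, so count(a is null) counts every row.
select count(*) from us
and
select count(a is null) from us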
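A quick check shows what these two queries actually count. The expression `a IS NULL` is itself never NULL, so COUNT over it counts every row; the number of NULLs has to come from something like `COUNT(*) - COUNT(a)`. The sketch uses SQLite via Python instead of Postgres, with the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us (a INTEGER)")
conn.executemany(
    "INSERT INTO us VALUES (?)",
    [(1,), (2,), (3,), (4,), (None,), (None,), (None,), (8,), (9,)],
)

def scalar(sql):
    return conn.execute(sql).fetchone()[0]

count_star = scalar("SELECT COUNT(*) FROM us")
# "a IS NULL" is true or false for every row, never NULL, so nothing is skipped.
count_is_null = scalar("SELECT COUNT(a IS NULL) FROM us")
# Rows minus non-NULL values gives the real number of NULLs.
null_count = scalar("SELECT COUNT(*) - COUNT(a) FROM us")

print(count_star, count_is_null, null_count)  # 9 9 3
```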
In my case I wanted the "null distribution" amongst multiple columns:
SELECT
(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS a_null,
(CASE WHEN b IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS b_null,
(CASE WHEN c IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS c_null,
...
count(*)
FROM us
GROUP BY 1, 2, 3,...
ORDER BY 1, 2, 3,...
As per the '...' it is easily extendable to more columns, as many as needed
Number of elements where a is null (count(*) is needed here, because count(a) skips NULLs and would always return 0):
select count(*) from us where a is null;
Number of elements where a is not null:
select count(a) from us where a is not null;