Is there a way to avoid columns from GROUP BY - sql

My table has columns such as ID, Perdium and Location. I want to calculate the total perdium given to each employee and the share of that perdium given in NY. The problem is that SQL Server throws an error stating that the Location column isn't present in the GROUP BY clause, and grouping by it is not what my use case needs. If I include Location in the GROUP BY clause, I always get NYPerdiumShare as 1, which is not what I expect. Is there a workaround?
WITH CTE_Employee AS
(
SELECT ID,
SUM(Perdium) AS TotalPerdium,
CASE WHEN Location='NY' THEN SUM(Perdium) ELSE NULL END AS NYPerdium FROM EmployeePerdium
GROUP BY ID
)
SELECT ID,
TotalPerdium,
NYPerdium/TotalPerdium AS NYPerdiumShare
FROM CTE_Employee

You can eliminate the need to group by on anything other than ID by rewriting your query as follows to hide CASE inside an aggregate function:
WITH CTE_Employee AS (
SELECT
ID
, SUM(Perdium) AS TotalPerdium
, SUM(CASE WHEN Location='NY' THEN Perdium ELSE 0 END) AS NYPerdium
FROM EmployeePerdium
GROUP BY ID
)
SELECT
ID
, TotalPerdium
, NYPerdium/TotalPerdium AS NYPerdiumShare
FROM CTE_Employee
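
A minimal sketch of the conditional-aggregation rewrite above, run from Python against an in-memory SQLite database rather than SQL Server. The sample rows are invented for illustration, and Perdium is assumed to be a REAL column so that the division is not truncated to an integer.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeePerdium (ID INTEGER, Perdium REAL, Location TEXT)")
conn.executemany(
    "INSERT INTO EmployeePerdium VALUES (?, ?, ?)",
    [(1, 100.0, "NY"), (1, 300.0, "LA"), (2, 50.0, "NY"), (2, 50.0, "NY")],
)

# The CASE sits inside SUM, so Location never has to appear in GROUP BY.
rows = conn.execute(
    """
    SELECT ID,
           SUM(Perdium) AS TotalPerdium,
           SUM(CASE WHEN Location = 'NY' THEN Perdium ELSE 0 END) / SUM(Perdium)
               AS NYPerdiumShare
    FROM EmployeePerdium
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()
print(rows)
```

Employee 1 received 100 of 400 in NY, employee 2 received everything in NY, so the shares differ per ID instead of always being 1.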

You don't need a CTE here. Just use the SUM window function, partitioned by ID so that both sums are per employee:
SELECT DISTINCT
ID,
SUM(Perdium) OVER(PARTITION BY ID) AS TotalPerdium,
SUM(CASE WHEN Location='NY' THEN 1.0*Perdium ELSE 0 END) OVER(PARTITION BY ID)
/SUM(Perdium) OVER(PARTITION BY ID) AS NYPerdiumShare
FROM EmployeePerdium

Related

find duplicate row in the same table and mark them in sql

I have a table 'workadress' that contains 6 columns:
work_ref,work_street ,work_zip,workTN,...
I want to find duplicate rows in the same table as follows:
If (work_street, work_zip) is duplicated as a pair, look at workTN. If workTN is the same, put the value 'ok'; if workTN is not the same, put 'not ok'. How can I do this with SQL?
You can use window functions:
select t.*,
(case when min(workTn) over (partition by work_street, work_zip) =
max(workTn) over (partition by work_street, work_zip)
then 'ok' else 'not ok'
end) as result
from t;
I think just a simple group by and count should be enough to do the job like so:
select
t.*,
case when dups.dups = 1 then 'OK' else 'not OK' end
from my_table t
join (
select work_street, work_zip, count(distinct workTN) dups
from my_table
group by work_street, work_zip
) dups on dups.work_street = t.work_street and dups.work_zip = t.work_zip
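
As a rough check of the GROUP BY / COUNT(DISTINCT ...) approach, here is a small sketch using Python's sqlite3 module. The table name my_table follows the answer above; the sample addresses, the tn_count alias and the result alias are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (work_ref INTEGER, work_street TEXT, work_zip TEXT, workTN TEXT)"
)
# Made-up rows: the first address pair shares one workTN, the second has two.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "Main St", "10001", "555-1"),
        (2, "Main St", "10001", "555-1"),
        (3, "Oak Ave", "20002", "555-2"),
        (4, "Oak Ave", "20002", "555-3"),
    ],
)

# Count distinct workTN values per (street, zip) and join that back to each row.
rows = conn.execute(
    """
    SELECT t.work_ref,
           CASE WHEN d.tn_count = 1 THEN 'ok' ELSE 'not ok' END AS result
    FROM my_table t
    JOIN (
        SELECT work_street, work_zip, COUNT(DISTINCT workTN) AS tn_count
        FROM my_table
        GROUP BY work_street, work_zip
    ) d ON d.work_street = t.work_street AND d.work_zip = t.work_zip
    ORDER BY t.work_ref
    """
).fetchall()
print(rows)
```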

Sum distinct records in a table with duplicates in Teradata

I have a table that has some duplicates. I can count the distinct records to get the Total Volume. But when I try to sum the rows where the CompTia Code is B92 using DISTINCT, it still counts the dupes.
Here is the query:
select
a.repair_week_period,
count(distinct a.notif_id) as Total_Volume,
sum(distinct case when a.header_comptia_cd = 'B92' then 1 else 0 end) as B92_Sum
FROM artemis_biz_app.aca_service_event a
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
group by 1
order by 1
;
Is there a way to SUM only the distinct records for B92?
I also tried inner joining the table to itself by selecting the distinct notification id and joining on that notification id, but I am still getting wrong sums.
Thanks!
Your B92_Sum currently returns either 0 or 1 (the CASE yields only 0 and 1, so the distinct values add up to at most 1); this is definitely not a sum.
To sum distinct values you need something like
sum(distinct case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end)
If this column_to_sum is actually the notif_id, you get a conditional count but not a sum.
Otherwise the distinct might remove too many values, and then you probably need a Derived Table where you remove duplicates before aggregation:
select
repair_week_period,
--no more distinct needed
count(a.notif_id) as Total_Volume,
sum(case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end) as B92_Sum
FROM
(
select repair_week_period,
notif_id,
header_comptia_cd,
column_to_sum
from artemis_biz_app.aca_service_event
where Sales_Org_Cd = '8210'
and notif_creation_dt >= current_date - 180
-- only one row per notif_id
qualify row_number() over (partition by notif_id order by ???) = 1
) a
group by 1
order by 1
;
@dnoeth It seems the solution to my problem was not to SUM the data, but to count distinct it.
This is how I resolved my problem:
count(distinct case when a.header_comptia_cd = 'B92' then a.notif_id else NULL end) as B92_Sum
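
To illustrate the difference between the two expressions, here is a small sketch run with Python's sqlite3 module instead of Teradata. The table name service_event and the rows are invented for the example; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service_event (notif_id INTEGER, header_comptia_cd TEXT)")
# Notifications 1 and 3 are B92 and each appears twice (duplicates).
conn.executemany(
    "INSERT INTO service_event VALUES (?, ?)",
    [(1, "B92"), (1, "B92"), (2, "A10"), (3, "B92"), (3, "B92")],
)

row = conn.execute(
    """
    SELECT COUNT(DISTINCT notif_id) AS total_volume,
           -- distinct B92 notifications: the CASE returns NULL for other codes
           COUNT(DISTINCT CASE WHEN header_comptia_cd = 'B92' THEN notif_id END) AS b92_count,
           -- the original attempt: distinct values of 0/1 can only add up to 0 or 1
           SUM(DISTINCT CASE WHEN header_comptia_cd = 'B92' THEN 1 ELSE 0 END) AS b92_sum_distinct
    FROM service_event
    """
).fetchone()
print(row)
```

The conditional COUNT(DISTINCT ...) reports two B92 notifications, while the SUM(DISTINCT ...) form collapses to 1.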

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) --originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data the ID has 2 separate rows, each with its own source ID that separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID, even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
I have all my possible detailCode values listed in the IN clause, so I get NULL returned where there is no value, but I want it all summarised in 1 row.
If those are all the options of detailCode , you can use conditional aggregation with CASE EXPRESSION instead of Pivot:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
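
A minimal sketch of the conditional-aggregation pivot, using Python's sqlite3 module. The table name src and its columns follow the question; the two sample rows and their values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src (srcID INTEGER, PersonID INTEGER, detailCode TEXT, detailValue TEXT)"
)
# One person, two source rows, each carrying a different detailCode.
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?, ?)",
    [(10, 1, "cas", "A"), (11, 1, "buy", "B")],
)

# MAX ignores the NULLs produced by the non-matching CASE branches,
# so both values collapse into a single row per PersonID.
rows = conn.execute(
    """
    SELECT PersonID,
           MAX(CASE WHEN detailCode = 'cas' THEN detailValue END) AS cas,
           MAX(CASE WHEN detailCode = 'buy' THEN detailValue END) AS buy
    FROM src
    GROUP BY PersonID
    """
).fetchall()
print(rows)
```

Because srcID never appears in the SELECT list or the GROUP BY, it cannot split the person into two output rows.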

Constructing A Query In BigQuery With CASE Statements

So I'm trying to construct a query in BigQuery that I'm struggling with for a final part.
As of now I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The result of this query is exactly what I'm looking for up to this point.
Now here is what I'm stuck on. There is another column I need to add called SalesRepName. The resulting field will either be null or not null. If it's null, it means the subscription was sold online. If it's not null, it means it was sold via telephone. What I want to do is create two additional columns that say how many were sold via telesales and how many online. The sum of the two columns will always equal the SubsPurchased total.
Can anyone help?
You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
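
The two forms can be compared side by side in a small sketch, here run with Python's sqlite3 module rather than BigQuery. The table name charges and the sample names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (SalesRepName TEXT)")
# Three online sales (NULL rep) and two telesales (rep present).
conn.executemany(
    "INSERT INTO charges VALUES (?)",
    [("Ann",), (None,), (None,), ("Bob",), (None,)],
)

row = conn.execute(
    """
    SELECT SUM(CASE WHEN SalesRepName IS NULL THEN 1 ELSE 0 END)     AS online_sum,
           COUNT(CASE WHEN SalesRepName IS NULL THEN 1 END)          AS online_count,
           SUM(CASE WHEN SalesRepName IS NOT NULL THEN 1 ELSE 0 END) AS telesales,
           COUNT(*)                                                  AS total
    FROM charges
    """
).fetchone()
print(row)
```

Both online counts agree, and online plus telesales adds up to the total row count.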
Try the query below.
It assumes your SalesRepName field is in the [sample_internal_data.charge0209] table,
and it uses a "tiny version" of SUM(CASE WHEN ... END), which works when you need 0 or 1 as the result to be summed:
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC

Counting null and non-null values in a single query

I have a table
create table us
(
a number
);
Now I have data like:
a
1
2
3
4
null
null
null
8
9
Now I need a single query to count null and not null values in column a
This works for Oracle and SQL Server (you might be able to get it to work on another RDBMS):
select sum(case when a is null then 1 else 0 end) count_nulls
, count(a) count_not_nulls
from us;
Or:
select count(*) - count(a), count(a) from us;
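
Both variants can be checked against the sample data from the question with a short sketch using Python's sqlite3 module (SQLite instead of Oracle or SQL Server, with the column declared as INTEGER, which is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us (a INTEGER)")
# The nine rows from the question: six values and three NULLs.
conn.executemany(
    "INSERT INTO us VALUES (?)",
    [(1,), (2,), (3,), (4,), (None,), (None,), (None,), (8,), (9,)],
)

row = conn.execute(
    """
    SELECT SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) AS count_nulls_case,
           COUNT(*) - COUNT(a)                        AS count_nulls_diff,
           COUNT(a)                                   AS count_not_nulls
    FROM us
    """
).fetchone()
print(row)
```

COUNT(a) skips NULLs while COUNT(*) counts every row, which is why the difference gives the number of NULLs.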
If I understood correctly you want to count all NULL and all NOT NULL in a column...
If that is correct:
SELECT count(*) FROM us WHERE a IS NULL
UNION ALL
SELECT count(*) FROM us WHERE a IS NOT NULL
Edited to have the full query, after reading the comments :]
SELECT COUNT(*), 'null_tally' AS narrative
FROM us
WHERE a IS NULL
UNION
SELECT COUNT(*), 'not_null_tally' AS narrative
FROM us
WHERE a IS NOT NULL;
Here is a quick and dirty version that works on Oracle:
select sum(case when a is null then 1 else 0 end) "Null values",
sum(case when a is null then 0 else 1 end) "Non-null values"
from us
for non nulls
select count(a)
from us
for nulls
select count(*)
from us
minus
select count(a)
from us
Hence
SELECT COUNT(A) NOT_NULLS
FROM US
UNION
SELECT COUNT(*) - COUNT(A) NULLS
FROM US
ought to do the job
Better in that the column titles come out correct.
SELECT COUNT(A) NOT_NULL, COUNT(*) - COUNT(A) NULLS
FROM US
In some testing on my system, it costs a full table scan.
As I understand your question, you can just run this query to get the total NULL and total not-NULL rows:
select count(*) - count(a) as 'Null', count(a) as 'Not Null' from us;
I usually use this trick (without a GROUP BY, so that a single row is returned):
select sum(case when a is null then 0 else 1 end) as count_notnull,
sum(case when a is null then 1 else 0 end) as count_null
from tab
Just to provide yet another alternative, Postgres 9.4+ allows applying a FILTER to aggregates:
SELECT
COUNT(*) FILTER (WHERE a IS NULL) count_nulls,
COUNT(*) FILTER (WHERE a IS NOT NULL) count_not_nulls
FROM us;
SQLFiddle: http://sqlfiddle.com/#!17/80a24/5
This is a little tricky. Assume the table has just one column; then COUNT(1) (or COUNT(*)) and COUNT(column) will give different values when the column contains NULLs.
set nocount on
declare @table1 table (empid int)
insert @table1 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(NULL),(11),(12),(NULL),(13),(14);
select * from @table1
select COUNT(1) as "COUNT(1)" from @table1
select COUNT(empid) "Count(empid)" from @table1
The first result shows that the table has 16 rows, two of which are NULL. COUNT(1), like COUNT(*), counts the number of rows, so it returns 16. COUNT(empid) counts only the non-NULL values in the column empid, so it returns 14.
So whenever we use COUNT(Column), we have to take care of NULL values as shown below.
select COUNT(isnull(empid,1)) from @table1
will count both NULL and non-NULL values.
Note: The same applies when the table is made up of more than one column. COUNT(1) gives the total number of rows irrespective of NULL/non-NULL values. Only when column values are counted using COUNT(Column) do we need to take care of NULL values.
I had a similar issue: counting all distinct values, with NULL also counted as one value. A simple count doesn't work in this case, as it does not take NULL values into account.
Here's a snippet that works on SQL and does not involve selection of new values.
Basically, after performing the DISTINCT, also return the row number in a new column (n) using the row_number() function, then perform a count on that column:
SELECT COUNT(n)
FROM (
SELECT *, row_number() OVER (ORDER BY [MyColumn] ASC) n
FROM (
SELECT DISTINCT [MyColumn]
FROM [MyTable]
) items
) distinctItems
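
The same idea can be sketched without row_number(): count the rows of the DISTINCT subquery with COUNT(*). This is a simplified variant of the snippet above, not the snippet itself, run with Python's sqlite3 module; the table and column names follow the answer and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (MyColumn TEXT)")
# Two distinct non-NULL values plus NULL, each repeated.
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [("a",), ("a",), (None,), ("b",), (None,)],
)

row = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM (SELECT DISTINCT MyColumn FROM MyTable))
               AS distinct_including_null,
           (SELECT COUNT(DISTINCT MyColumn) FROM MyTable)
               AS distinct_excluding_null
    """
).fetchone()
print(row)
```

DISTINCT keeps a single NULL row, and COUNT(*) counts it, whereas COUNT(DISTINCT MyColumn) ignores NULL entirely.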
Try this..
SELECT CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END a,
Count(1)
FROM us
GROUP BY CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END
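
A quick sketch of this grouped form, run with Python's sqlite3 module against the sample data from the question. The label alias and the ORDER BY are additions so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us (a INTEGER)")
conn.executemany(
    "INSERT INTO us VALUES (?)",
    [(1,), (2,), (3,), (4,), (None,), (None,), (None,), (8,), (9,)],
)

# One output row per label instead of one row with two columns.
rows = conn.execute(
    """
    SELECT CASE WHEN a IS NULL THEN 'Null' ELSE 'Not Null' END AS label,
           COUNT(1)
    FROM us
    GROUP BY CASE WHEN a IS NULL THEN 'Null' ELSE 'Not Null' END
    ORDER BY 1
    """
).fetchall()
print(rows)
```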
Here are two solutions:
Select count(columnname) as countofNotNulls, count(isnull(columnname,1))-count(columnname) AS Countofnulls from table name
OR
Select count(columnname) as countofNotNulls, count(*)-count(columnname) AS Countofnulls from table name
Try (this relies on MySQL's single-argument ISNULL() and the ! operator):
SELECT
SUM(ISNULL(a)) AS all_null,
SUM(!ISNULL(a)) AS all_not_null
FROM us;
Simple!
If you're using MS Sql Server...
SELECT COUNT(0) AS 'Null_ColumnA_Records',
(
SELECT COUNT(0)
FROM your_table
WHERE ColumnA IS NOT NULL
) AS 'NOT_Null_ColumnA_Records'
FROM your_table
WHERE ColumnA IS NULL;
I don't recommend doing this, but here it is (both counts in the same result row).
use ISNULL embedded function.
All the answers are either wrong or extremely out of date.
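The simple and correct way of doing this query is to use the COUNT_IF function, where your database supports it (for example Snowflake or Presto/Trino; it is not standard SQL).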
SELECT
COUNT_IF(a IS NULL) AS nulls,
COUNT_IF(a IS NOT NULL) AS not_nulls
FROM
us
SELECT SUM(NULLs) AS 'NULLS', SUM(NOTNULLs) AS 'NOTNULLs' FROM
(select count(*) AS 'NULLs', 0 as 'NOTNULLs' FROM us WHERE a is null
UNION select 0 as 'NULLs', count(*) AS 'NOTNULLs' FROM us WHERE a is not null) AS x
It's fugly, but it will return a single record with 2 cols indicating the count of nulls vs non nulls.
This works in T-SQL. If you're just counting the number of something and you want to include the nulls, use COALESCE instead of case.
IF OBJECT_ID('tempdb..#us') IS NOT NULL
DROP TABLE #us
CREATE TABLE #us
(
a INT NULL
);
INSERT INTO #us VALUES (1),(2),(3),(4),(NULL),(NULL),(NULL),(8),(9)
SELECT * FROM #us
SELECT CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END AS 'NULL?',
COUNT(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END) AS 'Count'
FROM #us
GROUP BY CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END
SELECT COALESCE(CAST(a AS NVARCHAR),'NULL') AS a,
COUNT(COALESCE(CAST(a AS NVARCHAR),'NULL')) AS 'Count'
FROM #us
GROUP BY COALESCE(CAST(a AS NVARCHAR),'NULL')
Building off of Alberto, I added the rollup.
SELECT [Narrative] = CASE
WHEN [Narrative] IS NULL THEN 'count_total' ELSE [Narrative] END
,[Count]=SUM([Count]) FROM (SELECT COUNT(*) [Count], 'count_nulls' AS [Narrative]
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NULL
UNION
SELECT COUNT(*), 'count_not_nulls ' AS narrative
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NOT NULL) S
GROUP BY [Narrative] WITH CUBE;
SELECT
ALL_VALUES
,COUNT(ALL_VALUES)
FROM(
SELECT
NVL2(A,'NOT NULL','NULL') AS ALL_VALUES
,NVL(A,0)
FROM US
)
GROUP BY ALL_VALUES
select count(isnull(NullableColumn,-1))
If it's MySQL, you can try something like this:
select
(select count(*) from TABLENAME where a is null) as total_null,
(select count(*) from TABLENAME where a is not null) as total_not_null
Just in case you wanted it in a single record:
select
(select count(*) from tbl where colName is null) Nulls,
(select count(*) from tbl where colName is not null) NonNulls
;-)
for counting not null values
select count(*) from us where a is not null;
for counting null values
select count(*) from us where a is null;
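I created the table in Postgres 10 and both of the following ran:
select count(*) from us
and
select count(a is null) from us
Note that both return the total number of rows: a is null evaluates to true or false, never NULL, so count() counts every row.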
In my case I wanted the "null distribution" amongst multiple columns:
SELECT
(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS a_null,
(CASE WHEN b IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS b_null,
(CASE WHEN c IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS c_null,
...
count(*)
FROM us
GROUP BY 1, 2, 3,...
ORDER BY 1, 2, 3,...
As per the '...' it is easily extendable to more columns, as many as needed
Number of elements where a is null:
select count(*) from us where a is null;
Number of elements where a is not null:
select count(a) from us where a is not null;