find duplicate row in the same table and mark them in sql

find duplicate row in the same table and mark them in sql - sql

I have table 'workadress' and it contain 6 columns:
work_ref,work_street ,work_zip,workTN,...
I want to find duplicate rows in the same table depending on:
If (work_street, work_zip) are duplicate together, then you should look at workTN. If it is the same then put value ' ok ', but if workTN is not the same, put 'not ok'. How can I do it with SQL?
Result like:

You can use window functions:
select t.*,
(case when min(workTn) over (partition by work_street, work_zip) =
max(workTn) over (partition by work_street, work_zip)
then 'ok' else 'not ok'
end) as result
from t;

I think just a simple group by and count should be enough to do the job like so:
select
t.*,
case when dups.dups = 1 then 'OK' else 'not OK' end
from my_table t
join (
select work_street, work_zip, count(distinct workTN) dups
from my_table
group by work_street, work_zip
) dups on dups.work_street = t.work_street amd dups.work_zip = t.work_zip

Related

How to check unique values in SQL

I have a table named Bank that contains a Bank_Values column. I need a calculated Bank_Value_Unique column to shows whether each Bank_Value exists somewhere else in the table (i.e. whether its count is greater than 1).
I prepared this query, but it does not work. Could anyone help me with this and/or modify this query?
SELECT
CASE
WHEN NULLIF(LTRIM(RTRIM(Bank_Value)), '') =
(SELECT Bank_Value
FROM [Bank]
GROUP BY Bank_Value
HAVING COUNT(*) = 1)
THEN '0' ELSE '1'
END AS Bank_Key_Unique
FROM [Bank]

A windowed count should work:
SELECT
*,
CASE
COUNT(*) OVER (PARTITION BY Bank_Value)
WHEN 1 THEN 1 ELSE 0
END AS Bank_Value_Unique
FROM
Bank
;

It works also, but I found solution also:
select CASE WHEN NULLIF(LTRIM(RTRIM(Bank_Value)),'') =
(select Bank_Value
from Bank
group by Bank_Value
having (count(distinct Bank_Value) > 2 )) THEN '1' ELSE '0' END AS
Bank_Value_Uniquness
from Bank
It was missing "distinct" in having part.

SQL Select a specific value in the group

I have this following table
Dept---------- Sub_Dept---- Dept Type
Sales.............Advertising........A
Sales.............Marketing......... B
Sales.............Analytics.......... C
Operations.....IT..................... C
Operations.....Settlement........C
And the result should be if a department got a department type as A then change all record of that department to A, else keep it same
Dept---------- Sub_Dept---- Dept Type
Sales.............Advertising........A
Sales.............Marketing......... A
Sales.............Analytics.......... A
Operations.....IT..................... C
Operations.....Settlement........C
Anybody can give a suggestion on this? I thought of using the GROUP BY but have to output the Sub Department as well
Thanks a lot

I would do:
update t
set depttype = 'a'
where exists (select 1 from t t2 where t2.dept = t.dept and t2.dept = 'a') and
t.dept <> 'a';
If you just want a select, then do:
select t.*,
(case when sum(case when depttype = 'a' then 1 else 0 end) over (partition by dept) > 1
then 'a'
else depttype
end) as new_depttype
from t;

Use below query
select a11.dept, a12.Sub_Dept, (case when a12.min_dep_type='A' then 'A' else a11.dep_type) as dep_type
from tab a11
JOIN (select dept, min(dep_type) min_dep_type from tab group by dept) a12
on a11.dept = a12.dept

Try this:
update table
set depttype= case when dept in (select dept from table where depttype='a') then 'a' else depttype end

This should work:
select a.dept, a.sub_dept,
case when b.dept is not null then 'A' else dept_type end as dept_type
from aTable a
left join(
select distinct Dept from aTable where dept_type = 'A'
)
b on b.dept = a.dept

You could use analytic functions to check whether exists the specific value in the group.
Try below query:
SELECT t.Dept,
t.Sub_Dept,
NVL(MIN(CASE WHEN t.Dept_Type = 'A'
THEN Dept_Type END) OVER (PARTITION BY t.Dept), t.Dept_Type) AS Dept_Type
FROM table_1 t
Using the analytic function MIN(), you can search for the value of 'A' (if it does exist inside the group). MIN works for non-null values only, so if you don't have any 'A' in the group, the result will be NULL.
At this point, you can use NVL to choose whether to print the value found in the group or the actual dept_type of the row.

Is there a way to avoid columns from GROUP BY

My table has columns such as ID,Perdium and Location so I want to calculate all the perdiums given to an employee and the perdium share given in NY. The issue which I am facing is that SQL Server engine is throwing as error stating that location column isnt present in the GROUP BY clause(as needed in my use-case).If I include the location in the Group By clause I always get NYPerdiumShare as 1 which is not what I am expecting. Is there any workaround to this?
WITH CTE_Employee AS
(
SELECT ID,
SUM(Perdium) AS TotalPerdium,
CASE WHEN Location='NY' THEN SUM(Perdium) ELSE NULL END AS NYPerdium FROM EmployeePerdium
GROUP BY ID
)
SELECT ID,
TotalPerdium,
NYPerdium/TotalPerdium AS NYPerdiumShare
FROM CTE_Employee

You can eliminate the need to group by on anything other than ID by rewriting your query as follows to hide CASE inside an aggregate function:
WITH CTE_Employee AS (
SELECT
ID
, SUM(Perdium) AS TotalPerdium
, SUM(CASE WHEN Location='NY' THEN Perdium ELSE 0 END) AS NYPerdium
FROM EmployeePerdium
GROUP BY ID
)
SELECT
ID
, TotalPerdium
, NYPerdium/TotalPerdium AS NYPerdiumShare
FROM CTE_Employee

You don't need a cte here. Just use the sum window function.
SELECT DISTINCT
ID,
SUM(Perdium) OVER() as TotalPerdium
SUM(CASE WHEN Location='NY' THEN 1.0*Perdium ELSE 0 END) OVER(PARTITION BY ID)
/SUM(Perdium) OVER() AS NYPerdium
FROM EmployeePerdium

Sum distinct records in a table with duplicates in Teradata

I have a table that has some duplicates. I can count the distinct records to get the Total Volume. When I try to Sum when the CompTia Code is B92 and run distinct is still counts the dupes.
Here is the query:
select
a.repair_week_period,
count(distinct a.notif_id) as Total_Volume,
sum(distinct case when a.header_comptia_cd = 'B92' then 1 else 0 end) as B92_Sum
FROM artemis_biz_app.aca_service_event a
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
group by 1
order by 1
;
Is There a way to only SUM the distinct records for B92?
I also tried inner joining the table on itself by selecting the distinct notification id and joining on that notification id, but still getting wrong sum counts.
Thanks!

Your B92_Sum currently returns either NULL, 1 or 2, this is definitely no sum.
To sum distinct values you need something like
sum(distinct case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end)
If this column_to_sum is actually the notif_id you get a conditional count but not a sum.
Otherwise the distinct might remove too many vales and then you probably need a Derived Table where you remove duplicates before aggregation:
select
repair_week_period,
--no more distinct needed
count(a.notif_id) as Total_Volume,
sum(case when a.header_comptia_cd = 'B92' then column_to_sum else 0 end) as B92_Sum
FROM
(
select repair_week_period,
notif_id
header_comptia_cd,
column_to_sum
from artemis_biz_app.aca_service_event
where a.Sales_Org_Cd = '8210'
and a.notif_creation_dt >= current_date - 180
-- only onw row per notif_id
qualify row_number() over (partition by notif_id order by ???) = 1
) a
group by 1
order by 1
;

#dnoeth It seems the solution to my problem was not to SUM the data, but to count distinct it.
This is how I resolved my problem:
count(distinct case when a.header_comptia_cd = 'B92' then a.notif_id else NULL end) as B92_Sum

Counting null and non-null values in a single query

I have a table
create table us
(
a number
);
Now I have data like:
a
1
2
3
4
null
null
null
8
9
Now I need a single query to count null and not null values in column a

This works for Oracle and SQL Server (you might be able to get it to work on another RDBMS):
select sum(case when a is null then 1 else 0 end) count_nulls
, count(a) count_not_nulls
from us;
Or:
select count(*) - count(a), count(a) from us;

If I understood correctly you want to count all NULL and all NOT NULL in a column...
If that is correct:
SELECT count(*) FROM us WHERE a IS NULL
UNION ALL
SELECT count(*) FROM us WHERE a IS NOT NULL
Edited to have the full query, after reading the comments :]
SELECT COUNT(*), 'null_tally' AS narrative
FROM us
WHERE a IS NULL
UNION
SELECT COUNT(*), 'not_null_tally' AS narrative
FROM us
WHERE a IS NOT NULL;

Here is a quick and dirty version that works on Oracle :
select sum(case a when null then 1 else 0) "Null values",
sum(case a when null then 0 else 1) "Non-null values"
from us

for non nulls
select count(a)
from us
for nulls
select count(*)
from us
minus
select count(a)
from us
Hence
SELECT COUNT(A) NOT_NULLS
FROM US
UNION
SELECT COUNT(*) - COUNT(A) NULLS
FROM US
ought to do the job
Better in that the column titles come out correct.
SELECT COUNT(A) NOT_NULL, COUNT(*) - COUNT(A) NULLS
FROM US
In some testing on my system, it costs a full table scan.

As i understood your query, You just run this script and get Total Null,Total NotNull rows,
select count(*) - count(a) as 'Null', count(a) as 'Not Null' from us;

usually i use this trick
select sum(case when a is null then 0 else 1 end) as count_notnull,
sum(case when a is null then 1 else 0 end) as count_null
from tab
group by a

Just to provide yet another alternative, Postgres 9.4+ allows applying a FILTER to aggregates:
SELECT
COUNT(*) FILTER (WHERE a IS NULL) count_nulls,
COUNT(*) FILTER (WHERE a IS NOT NULL) count_not_nulls
FROM us;
SQLFiddle: http://sqlfiddle.com/#!17/80a24/5

This is little tricky. Assume the table has just one column, then the Count(1) and Count(*) will give different values.
set nocount on
declare #table1 table (empid int)
insert #table1 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(NULL),(11),(12),(NULL),(13),(14);
select * from #table1
select COUNT(1) as "COUNT(1)" from #table1
select COUNT(empid) "Count(empid)" from #table1
Query Results
As you can see in the image, The first result shows the table has 16 rows. out of which two rows are NULL. So when we use Count(*) the query engine counts the number of rows, So we got count result as 16. But in case of Count(empid) it counted the non-NULL-values in the column empid. So we got the result as 14.
so whenever we are using COUNT(Column) make sure we take care of NULL values as shown below.
select COUNT(isnull(empid,1)) from #table1
will count both NULL and Non-NULL values.
Note: Same thing applies even when the table is made up of more than one column. Count(1) will give total number of rows irrespective of NULL/Non-NULL values. Only when the column values are counted using Count(Column) we need to take care of NULL values.

I had a similar issue: to count all distinct values, counting null values as 1, too. A simple count doesn't work in this case, as it does not take null values into account.
Here's a snippet that works on SQL and does not involve selection of new values.
Basically, once performed the distinct, also return the row number in a new column (n) using the row_number() function, then perform a count on that column:
SELECT COUNT(n)
FROM (
SELECT *, row_number() OVER (ORDER BY [MyColumn] ASC) n
FROM (
SELECT DISTINCT [MyColumn]
FROM [MyTable]
) items
) distinctItems

Try this..
SELECT CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END a,
Count(1)
FROM us
GROUP BY CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END

Here are two solutions:
Select count(columnname) as countofNotNulls, count(isnull(columnname,1))-count(columnname) AS Countofnulls from table name
OR
Select count(columnname) as countofNotNulls, count(*)-count(columnname) AS Countofnulls from table name

Try
SELECT
SUM(ISNULL(a)) AS all_null,
SUM(!ISNULL(a)) AS all_not_null
FROM us;
Simple!

If you're using MS Sql Server...
SELECT COUNT(0) AS 'Null_ColumnA_Records',
(
SELECT COUNT(0)
FROM your_table
WHERE ColumnA IS NOT NULL
) AS 'NOT_Null_ColumnA_Records'
FROM your_table
WHERE ColumnA IS NULL;
I don't recomend you doing this... but here you have it (in the same table as result)

use ISNULL embedded function.

All the answers are either wrong or extremely out of date.
The simple and correct way of doing this query is using COUNT_IF function.
SELECT
COUNT_IF(a IS NULL) AS nulls,
COUNT_IF(a IS NOT NULL) AS not_nulls
FROM
us

SELECT SUM(NULLs) AS 'NULLS', SUM(NOTNULLs) AS 'NOTNULLs' FROM
(select count(*) AS 'NULLs', 0 as 'NOTNULLs' FROM us WHERE a is null
UNION select 0 as 'NULLs', count(*) AS 'NOTNULLs' FROM us WHERE a is not null) AS x
It's fugly, but it will return a single record with 2 cols indicating the count of nulls vs non nulls.

This works in T-SQL. If you're just counting the number of something and you want to include the nulls, use COALESCE instead of case.
IF OBJECT_ID('tempdb..#us') IS NOT NULL
DROP TABLE #us
CREATE TABLE #us
(
a INT NULL
);
INSERT INTO #us VALUES (1),(2),(3),(4),(NULL),(NULL),(NULL),(8),(9)
SELECT * FROM #us
SELECT CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END AS 'NULL?',
COUNT(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END) AS 'Count'
FROM #us
GROUP BY CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END
SELECT COALESCE(CAST(a AS NVARCHAR),'NULL') AS a,
COUNT(COALESCE(CAST(a AS NVARCHAR),'NULL')) AS 'Count'
FROM #us
GROUP BY COALESCE(CAST(a AS NVARCHAR),'NULL')

Building off of Alberto, I added the rollup.
SELECT [Narrative] = CASE
WHEN [Narrative] IS NULL THEN 'count_total' ELSE [Narrative] END
,[Count]=SUM([Count]) FROM (SELECT COUNT(*) [Count], 'count_nulls' AS [Narrative]
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NULL
UNION
SELECT COUNT(*), 'count_not_nulls ' AS narrative
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NOT NULL) S
GROUP BY [Narrative] WITH CUBE;

SELECT
ALL_VALUES
,COUNT(ALL_VALUES)
FROM(
SELECT
NVL2(A,'NOT NULL','NULL') AS ALL_VALUES
,NVL(A,0)
FROM US
)
GROUP BY ALL_VALUES

select count(isnull(NullableColumn,-1))

if its mysql, you can try something like this.
select
(select count(*) from TABLENAME WHERE a = 'null') as total_null,
(select count(*) from TABLENAME WHERE a != 'null') as total_not_null
FROM TABLENAME

Just in case you wanted it in a single record:
select
(select count(*) from tbl where colName is null) Nulls,
(select count(*) from tbl where colName is not null) NonNulls
;-)

for counting not null values
select count(*) from us where a is not null;
for counting null values
select count(*) from us where a is null;

I created the table in postgres 10 and both of the following worked:
select count(*) from us
and
select count(a is null) from us

In my case I wanted the "null distribution" amongst multiple columns:
SELECT
(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS a_null,
(CASE WHEN b IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS b_null,
(CASE WHEN c IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS c_null,
...
count(*)
FROM us
GROUP BY 1, 2, 3,...
ORDER BY 1, 2, 3,...
As per the '...' it is easily extendable to more columns, as many as needed

Number of elements where a is null:
select count(a) from us where a is null;
Number of elements where a is not null:
select count(a) from us where a is not null;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

find duplicate row in the same table and mark them in sql - sql

You can use window functions: select t.*, (case when min(workTn) over (partition by work_street, work_zip) = max(workTn) over (partition by work_street, work_zip) then 'ok' else 'not ok' end) as result from t;

Related

How to check unique values in SQL

SQL Select a specific value in the group

Is there a way to avoid columns from GROUP BY

Sum distinct records in a table with duplicates in Teradata

Counting null and non-null values in a single query

Categories

Resources