Duplicate Counts - TSQL


I want to get all records that have duplicate values in some of the fields (i.e. the key columns).
My code:
CREATE TABLE #TEMP (ID int, Descp varchar(5), Extra varchar(6))
INSERT INTO #Temp
SELECT 1,'One','Extra1'
UNION ALL
SELECT 2,'Two','Extra2'
UNION ALL
SELECT 3,'Three','Extra3'
UNION ALL
SELECT 1,'One','Extra4'
SELECT ID, Descp, Extra FROM #TEMP
;WITH Temp_CTE AS
(SELECT *
, ROW_NUMBER() OVER (PARTITION BY ID, Descp ORDER BY (SELECT 0))
AS DuplicateRowNumber
FROM #TEMP
)
SELECT * FROM Temp_cte
DROP TABLE #TEMP
The last column numbers the rows that share the same ID and Descp values.
I want that column, but I also need another column* that indicates that the rows with ID = 1 and Descp = 'One' show up more than once.
So an extra column* (i.e. MultipleOccurrences (bool)) that is 1 for the two rows with ID = 1 and Descp = 'One' and 0 for the other rows, as they only show up once.
How can I achieve that? (I want to avoid using Count(1) > 1 or something similar if possible.)
Edit:
Desired output:
ID Descp Extra DuplicateRowNumber IsMultiple
1 One Extra1 1 1
1 One Extra4 2 1
2 Two Extra2 1 0
3 Three Extra3 1 0

You say "I want to avoid using Count", but it is probably the best way. It reuses the partitioning you already have for ROW_NUMBER:
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID, Descp
ORDER BY (SELECT 0)) AS DuplicateRowNumber,
CASE
WHEN COUNT(*) OVER (PARTITION BY ID, Descp) > 1 THEN 1
ELSE 0
END AS IsMultiple
FROM #Temp
And the execution plan just shows a single sort
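For a quick check, the query can be run against the question's sample data. The sketch below is self-contained T-SQL. The table name #DupDemo is made up for the example, and the ORDER BY inside ROW_NUMBER is changed from (SELECT 0) to Extra (an assumption, not part of the answer) so that the numbering of the two duplicate rows is deterministic.

```sql
-- Hypothetical temp table holding the question's sample rows
-- (multi-row VALUES needs SQL Server 2008 or later)
CREATE TABLE #DupDemo (ID int, Descp varchar(5), Extra varchar(6));

INSERT INTO #DupDemo (ID, Descp, Extra)
VALUES (1, 'One',   'Extra1'),
       (2, 'Two',   'Extra2'),
       (3, 'Three', 'Extra3'),
       (1, 'One',   'Extra4');

SELECT ID, Descp, Extra,
       -- position of the row within its (ID, Descp) group
       ROW_NUMBER() OVER (PARTITION BY ID, Descp ORDER BY Extra) AS DuplicateRowNumber,
       -- 1 when the (ID, Descp) group holds more than one row, else 0
       CASE WHEN COUNT(*) OVER (PARTITION BY ID, Descp) > 1 THEN 1 ELSE 0 END AS IsMultiple
FROM #DupDemo
ORDER BY ID, Extra;

DROP TABLE #DupDemo;
```

With the four sample rows this should match the desired output in the question: both ID = 1 rows get IsMultiple = 1, and the rows for ID 2 and ID 3 get 0.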

Well, I have this solution, but using a Count...
SELECT T1.*,
ROW_NUMBER() OVER (PARTITION BY T1.ID, T1.Descp ORDER BY (SELECT 0)) AS DuplicateRowNumber,
CASE WHEN T2.C = 1 THEN 0 ELSE 1 END MultipleOcurrences FROM #temp T1
INNER JOIN
(SELECT ID, Descp, COUNT(1) C FROM #TEMP GROUP BY ID, Descp) T2
ON T1.ID = T2.ID AND T1.Descp = T2.Descp

Related

how to find all column records are same or not in group by column in SQL

How can I find out whether all values of a column are the same within each group of rows in a table?
CREATE TABLE #Temp (ID int,Value char(1))
insert into #Temp (ID ,Value ) ( Select 1 ,'A' union all Select 1 ,'W' union all Select 1 ,'I' union all Select 2 ,'I' union all Select 2 ,'I' union all Select 3 ,'A' union all Select 3 ,'B' union all Select 3 ,'1' )
select * from #Temp
Sample Table:
How can I check whether all values of the 'Value' column are the same when grouping by the 'ID' column?
Ex: select ID from #Temp group by ID
For ID 1 - Value column records are A, W, I - Not Same
For ID 2 - Value column records are I, I - Same
For ID 3 - Value column records are A, B, 1 - Not Same
I want a query that returns this Same / Not Same result for each ID.

When all items in the group are the same, COUNT(DISTINCT Value) would be 1:
SELECT Id
, CASE WHEN COUNT(DISTINCT Value)=1 THEN 'Same' ELSE 'Not Same' END AS Result
FROM MyTable
GROUP BY Id

If you're using T-SQL, perhaps this will work for you:
SELECT t.ID,
CASE WHEN MAX(t.RN) = 1 THEN 'Same' ELSE 'Not Same' END AS GroupResults
FROM(
-- DENSE_RANK numbers the distinct values within each ID, so a maximum of 1 means a single distinct value
SELECT *, DENSE_RANK() OVER(PARTITION BY ID ORDER BY Value) RN
FROM #Temp
) t
GROUP BY t.ID

Usually that's rather easy: aggregate per ID and count distinct values, or compare the minimum and maximum value.
However, neither COUNT(DISTINCT value) nor MIN(value) nor MAX(value) takes nulls into consideration. So for an ID having value 'A' and null, these would report the values as the same. Maybe this is what you want, or nulls don't even occur in your data.
But if you want nulls to count as a value, then select distinct values first (where null gets a row too) and then count:
select id, case when count(*) = 1 then 'same' else 'not same' end as result
from (select distinct id, value from #temp) dist
group by id
order by id;
Rextester demo: http://rextester.com/KCZD88697
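To make the null behaviour concrete, here is a small T-SQL sketch that puts the MIN/MAX comparison next to the distinct-then-count query. The table #NullDemo and its rows are invented for illustration. ID 4 deliberately mixes a real value with a null.

```sql
-- Hypothetical sample data: ID 2 has identical values, ID 4 has 'A' and NULL
CREATE TABLE #NullDemo (ID int, Value char(1));

INSERT INTO #NullDemo (ID, Value)
VALUES (2, 'I'), (2, 'I'), (4, 'A'), (4, NULL);

SELECT a.ID, a.IgnoringNulls, b.CountingNulls
FROM (
    -- MIN and MAX skip NULLs, so only non-null values are compared
    SELECT ID,
           CASE WHEN MIN(Value) = MAX(Value) THEN 'same' ELSE 'not same' END AS IgnoringNulls
    FROM #NullDemo
    GROUP BY ID
) AS a
JOIN (
    -- SELECT DISTINCT keeps NULL as a row of its own, so it is counted
    SELECT ID,
           CASE WHEN COUNT(*) = 1 THEN 'same' ELSE 'not same' END AS CountingNulls
    FROM (SELECT DISTINCT ID, Value FROM #NullDemo) AS dist
    GROUP BY ID
) AS b ON a.ID = b.ID
ORDER BY a.ID;

DROP TABLE #NullDemo;
```

For ID 2 both columns should read 'same'. For ID 4 the first column should read 'same' and the second 'not same', which is the difference described above.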

SQL Get rows based on conditions

I'm currently having trouble writing the business logic to get rows from a table with id's and a flag which I have appended to it.
For example,
id: id seq num: flag: Date:
A 1 N ..
A 2 N ..
A 3 N
A 4 Y
B 1 N
B 2 Y
B 3 N
C 1 N
C 2 N
The end result I'm trying to achieve is that:
For each unique ID I just want to retrieve one row with the condition for that row being that
If the flag was a "Y" then return that row.
Else return the last "N" row.
Another thing to note is that the 'Y' row is not necessarily the last one.
I've been trying to get a case condition using a partition like
OVER (PARTITION BY A."ID" ORDER BY A."Seq num") but so far no luck.
-- EDIT:
From the table, the sample result would be:
id: id seq num: flag: date:
A 4 Y ..
B 2 Y ..
C 2 N ..

Using a window clause is the right idea. You should partition the results by the ID (as you've done), and order them so the Y flag rows come first, then all the N flag rows in descending date order, and pick the first for each id:
SELECT id, id_seq_num, flag, date
FROM (SELECT id, id_seq_num, flag, date,
ROW_NUMBER() OVER (PARTITION BY id
ORDER BY CASE flag WHEN 'Y' THEN 0
ELSE 1
END ASC,
date DESC) AS rk
FROM mytable) t
WHERE rk = 1
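As a sanity check, the same ordering idea can be tried on the question's sample rows. This is only a sketch under assumptions: it is written in T-SQL, the table name #Flags is made up, and because the question elides the date values it orders the N rows by id_seq_num descending instead of by date.

```sql
-- Hypothetical temp table with the question's rows (date column omitted)
CREATE TABLE #Flags (id char(1), id_seq_num int, flag char(1));

INSERT INTO #Flags (id, id_seq_num, flag)
VALUES ('A', 1, 'N'), ('A', 2, 'N'), ('A', 3, 'N'), ('A', 4, 'Y'),
       ('B', 1, 'N'), ('B', 2, 'Y'), ('B', 3, 'N'),
       ('C', 1, 'N'), ('C', 2, 'N');

SELECT id, id_seq_num, flag
FROM (
    SELECT id, id_seq_num, flag,
           ROW_NUMBER() OVER (
               PARTITION BY id
               -- 'Y' rows sort first; within the same flag, the highest sequence number wins
               ORDER BY CASE flag WHEN 'Y' THEN 0 ELSE 1 END,
                        id_seq_num DESC
           ) AS rk
    FROM #Flags
) AS t
WHERE rk = 1
ORDER BY id;

DROP TABLE #Flags;
```

This should return (A, 4, Y), (B, 2, Y) and (C, 2, N), the sample result given in the question.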

My approach is to take a UNION of two queries. The first query simply selects all Yes records, assuming that Yes only appears once per ID group. The second query targets only those ID having no Yes anywhere. For those records, we use the row number to select the most recent No record.
WITH cte1 AS (
SELECT id
FROM yourTable
GROUP BY id
HAVING SUM(CASE WHEN flag = 'Y' THEN 1 ELSE 0 END) = 0
),
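cte2 AS (
SELECT t1.*,
ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t1."id seq" DESC) rn
FROM yourTable t1
INNER JOIN cte1 t2
ON t1.id = t2.id
)
-- List the columns explicitly: cte2 also carries rn, so SELECT * would not line up across the UNION ALL.
-- Add any other columns you need to both branches.
SELECT id, "id seq", flag
FROM yourTable
WHERE flag = 'Y'
UNION ALL
SELECT id, "id seq", flag
FROM cte2 t2
WHERE t2.rn = 1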

Here's one way (with quite generic SQL):
select t1.*
from Table1 as t1
where t1.id_seq_num = COALESCE(
(select max(id_seq_num) from Table1 as T2 where t1.id = t2.id and t2.flag = 'Y') ,
(select max(id_seq_num) from Table1 as T3 where t1.id = t3.id and t3.flag = 'N') )
Available in a fiddle here: http://sqlfiddle.com/#!9/5f7f9/6

SELECT DISTINCT id, flag
FROM yourTable

Select first missing id above 0

I have these values in my ID column:
5
6
9
I want to select the first missing ID above 0. My desired result is 1 (if 1 exists, then it should select 2, and so on).
I'm using this code:
SELECT MIN(id) As MinMissingId FROM table1 where id>=0
But my result is the first existing ID, not the first missing one.

This will return the next unused id starting with 1, works in all cases, e.g. table is empty or there's no gap:
WITH cte AS
(
SELECT id FROM tab
UNION ALL
SELECT 0
)
SELECT MIN(id) + 1
FROM cte
WHERE NOT EXISTS
(
SELECT *
FROM tab
WHERE tab.id = cte.id + 1
)
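A small worked example may help to see why the extra 0 row matters. The sketch below is T-SQL with a made-up temp table #Ids. The values 1, 2 and 5 are chosen (as an assumption, they are not the question's data) so that the first gap is not at 1.

```sql
-- Hypothetical table: ids 1, 2 and 5 exist, so the first missing id above 0 is 3
CREATE TABLE #Ids (id int);

INSERT INTO #Ids (id) VALUES (1), (2), (5);

WITH cte AS
(
    SELECT id FROM #Ids
    UNION ALL
    SELECT 0          -- sentinel so that a missing id 1 is detected as well
)
SELECT MIN(id) + 1 AS FirstMissingId
FROM cte
WHERE NOT EXISTS
(
    -- keep only the ids whose successor is not in the table
    SELECT *
    FROM #Ids AS t
    WHERE t.id = cte.id + 1
);

DROP TABLE #Ids;
```

The candidates without a successor are 2 and 5, so the query should return 3. With the question's values 5, 6 and 9, the sentinel 0 has no successor either, and the result would be 1.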

You can do this by building on the answer in your related question.
select (case when min(id) > 0 then 1 else min(id) + 1 end)
from table1 t
where not exists (select 1
from table1 t2
where t2.id = t.id + 1
);
The idea is to find the first id that is missing. If it is bigger

SQL group by if values are close

Class| Value
-------------
A | 1
A | 2
A | 3
A | 10
B | 1
I am not sure whether it is practical to achieve this using SQL.
If the difference between values is less than 5 (or x), then group the rows (within the same Class, of course).
Expected result
Class| ValueMin | ValueMax
---------------------------
A | 1 | 3
A | 10 | 10
B | 1 | 1
For fixed intervals, we can easily use "GROUP BY". But now the grouping is based on nearby row's value. So if the values are consecutive or very close, they will be "chained together".
Thank you very much
Assuming MSSQL

You are trying to group things by gaps between values. The easiest way to do this is to use the lag() function to find the gaps:
select class, min(value) as minvalue, max(value) as maxvalue
from (select class, value,
sum(IsNewGroup) over (partition by class order by value) as GroupId
from (select class, value,
(case when lag(value) over (partition by class order by value) > value - 5
then 0 else 1
end) as IsNewGroup
from t
) t
) t
group by class, groupid;
Note that this assumes SQL Server 2012 for the use of lag() and cumulative sum.
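To see how the grouping key is built, it can help to look at the two inner levels of the query before the final GROUP BY. The following T-SQL sketch uses the question's sample rows in a made-up temp table #CloseValues and needs SQL Server 2012 or later, as noted above.

```sql
-- Hypothetical temp table with the question's sample rows
CREATE TABLE #CloseValues (class char(1), value int);

INSERT INTO #CloseValues (class, value)
VALUES ('A', 1), ('A', 2), ('A', 3), ('A', 10), ('B', 1);

SELECT class, value, IsNewGroup,
       -- running total of the flags: increases by 1 each time a new group starts
       SUM(IsNewGroup) OVER (PARTITION BY class ORDER BY value) AS GroupId
FROM (
    SELECT class, value,
           -- 0 when the previous value is less than 5 away, else 1
           -- (the first row of a class has no previous value, so it gets 1)
           CASE WHEN LAG(value) OVER (PARTITION BY class ORDER BY value) > value - 5
                THEN 0 ELSE 1
           END AS IsNewGroup
    FROM #CloseValues
) AS flagged
ORDER BY class, value;

DROP TABLE #CloseValues;
```

For class A the flags should come out as 1, 0, 0, 1, which gives GroupId 1, 1, 1, 2. Grouping by class and GroupId then yields the ranges 1 to 3 and 10 to 10. Class B has a single row with GroupId 1.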

Update:
*This answer is incorrect*
Assuming the table you gave is called sd_test, the following query will give you the output you are expecting
In short, we need a way to find what was the value on the previous row. This is determined using a join on row ids. Then create a group to see if the difference is less than 5. and then it is just regular 'Group By'.
If your version of SQL Server supports windowing functions with partitioning the code would be much more readable.
SELECT
A.CLASS
,MIN(A.VALUE) AS MIN_VALUE
,MAX(A.VALUE) AS MAX_VALUE
FROM
(SELECT
ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
,CLASS
,VALUE
FROM SD_TEST) AS A
LEFT JOIN
(SELECT
ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
,CLASS
,VALUE
FROM SD_TEST) AS B
ON A.CLASS = B.CLASS AND A.ROW_ID=B.ROW_ID+1
GROUP BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END
ORDER BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END DESC
ps: I think the above is ANSI compliant. So should run in most SQL variants. Someone can correct me if it is not.

These give the correct result, using the fact that you must have the same number of group starts as ends and that they will both be in ascending order.
if object_id('tempdb..#temp') is not null drop table #temp
create table #temp (class char(1),Value int);
insert into #temp values ('A',1);
insert into #temp values ('A',2);
insert into #temp values ('A',3);
insert into #temp values ('A',10);
insert into #temp values ('A',13);
insert into #temp values ('A',14);
insert into #temp values ('b',7);
insert into #temp values ('b',8);
insert into #temp values ('b',9);
insert into #temp values ('b',12);
insert into #temp values ('b',22);
insert into #temp values ('b',26);
insert into #temp values ('b',67);
Method 1 Using CTE and row offsets
with cte as
(select class,value,ROW_NUMBER() over ( partition by class order by value ) as R from (select distinct class,value from #temp) as d),
cte2 as
(
select
c1.class
,c1.value
,c2.R as PreviousRec
,c3.r as NextRec
from
cte c1
left join cte c2 on (c1.class = c2.class and c1.R= c2.R+1 and c1.Value < c2.value + 5)
left join cte c3 on (c1.class = c3.class and c1.R= c3.R-1 and c1.Value > c3.value - 5)
)
select
Starts.Class
,Starts.Value as StartValue
,Ends.Value as EndValue
from
(
select
class
,value
,row_number() over ( partition by class order by value ) as GroupNumber
from cte2
where PreviousRec is null) as Starts join
(
select
class
,value
,row_number() over ( partition by class order by value ) as GroupNumber
from cte2
where NextRec is null) as Ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber
Method 2 Inline views using not exists
select
Starts.Class
,Starts.Value as StartValue
,Ends.Value as EndValue
from
(
select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
from
(select distinct class,value from #temp) as T
where not exists (select 1 from #temp where class=t.class and Value < t.Value and Value > t.Value -5 )
) Starts join
(
select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
from
(select distinct class,value from #temp) as T
where not exists (select 1 from #temp where class=t.class and Value > t.Value and Value < t.Value +5 )
) ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber
In both methods I use a select distinct to begin with, because if you have a duplicate entry at a group start or end, things go awry without it.

Here is one way of getting the information you are after:
SELECT Under5.Class,
(
SELECT MIN(m2.Value)
FROM MyTable AS m2
WHERE m2.Value < 5
AND m2.Class = Under5.Class
) AS ValueMin,
(
SELECT MAX(m3.Value)
FROM MyTable AS m3
WHERE m3.Value < 5
AND m3.Class = Under5.Class
) AS ValueMax
FROM
(
SELECT DISTINCT m1.Class
FROM MyTable AS m1
WHERE m1.Value < 5
) AS Under5
UNION
SELECT Over4.Class,
(
SELECT MIN(m4.Value)
FROM MyTable AS m4
WHERE m4.Value >= 5
AND m4.Class = Over4.Class
) AS ValueMin,
(
SELECT Max(m5.Value)
FROM MyTable AS m5
WHERE m5.Value >= 5
AND m5.Class = Over4.Class
) AS ValueMax
FROM
(
SELECT DISTINCT m6.Class
FROM MyTable AS m6
WHERE m6.Value >= 5
) AS Over4

Using SQL to get the previous rows data

I have a requirement where I need to get data from the previous row to use in a calculation that gives a status to the current row. It's a history table. The previous row will let me know whether the data in a date field has changed.
I've looked up using cursors and it seems a little complicated. Is this the best way to go?
I've also tried to assign a value to a new field...
newField =(Select field1 from Table1 where "previous row") previous row is where I seem to get stuck. I can't figure out how to select the row beneath the current row.
I'm using SQL Server 2005
Thanks in advance.

-- Test data
create table #T (ProjectNumber int, DateChanged datetime, Value int)
insert into #T
select 1, '2001-01-01', 1 union all
select 1, '2001-01-02', 1 union all
select 1, '2001-01-03', 3 union all
select 1, '2001-01-04', 3 union all
select 1, '2001-01-05', 4 union all
select 2, '2001-01-01', 1 union all
select 2, '2001-01-02', 2
-- Get CurrentValue and PreviousValue with a Changed column
;with cte as
(
select *,
row_number() over(partition by ProjectNumber order by DateChanged) as rn
from #T
)
select
C.ProjectNumber,
C.Value as CurrentValue,
P.Value as PreviousValue,
case C.Value when P.Value then 0 else 1 end as Changed
from cte as C
inner join cte as P
on C.ProjectNumber = P.ProjectNumber and
C.rn = P.rn + 1
-- Count the number of changes per project
;with cte as
(
select *,
row_number() over(partition by ProjectNumber order by DateChanged) as rn
from #T
)
select
C.ProjectNumber,
sum(case C.Value when P.Value then 0 else 1 end) as ChangeCount
from cte as C
inner join cte as P
on C.ProjectNumber = P.ProjectNumber and
C.rn = P.rn + 1
group by C.ProjectNumber
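For readers on SQL Server 2012 or later (so not the 2005 instance from the question), LAG can fetch the previous row's value without the self join. This is a sketch under that version assumption, using a made-up temp table #History filled with the same test rows.

```sql
-- Hypothetical temp table with the same test data as above
CREATE TABLE #History (ProjectNumber int, DateChanged datetime, Value int);

INSERT INTO #History (ProjectNumber, DateChanged, Value)
VALUES (1, '2001-01-01', 1), (1, '2001-01-02', 1), (1, '2001-01-03', 3),
       (1, '2001-01-04', 3), (1, '2001-01-05', 4),
       (2, '2001-01-01', 1), (2, '2001-01-02', 2);

SELECT ProjectNumber, DateChanged, CurrentValue, PreviousValue,
       -- a NULL PreviousValue never compares equal, so the first row counts as changed
       CASE WHEN CurrentValue = PreviousValue THEN 0 ELSE 1 END AS Changed
FROM (
    SELECT ProjectNumber, DateChanged, Value AS CurrentValue,
           -- value of the previous row within the same project, by date
           LAG(Value) OVER (PARTITION BY ProjectNumber ORDER BY DateChanged) AS PreviousValue
    FROM #History
) AS h
ORDER BY ProjectNumber, DateChanged;

DROP TABLE #History;
```

Unlike the inner join version, this keeps the first row of each project, with PreviousValue set to NULL. Filter on PreviousValue IS NOT NULL if those rows are not wanted.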

This really depends on what tells you a row is a "previous row". However, a self join should do what you want:
select *
from Table1 this
join Table1 prev on this.incrementalID = prev.incrementalID+1

If you have the following table
CREATE TABLE MyTable (
Id INT NOT NULL,
ChangeDate DATETIME NOT NULL,
.
.
.
)
The following query will return the previous record for any record from MyTable.
SELECT tbl.Id,
tbl.ChangeDate,
hist.Id,
hist.ChangeDate
FROM MyTable tbl
INNER JOIN MyTable hist
ON hist.Id = tbl.Id
AND hist.ChangeDate = (SELECT MAX(ChangeDate)
FROM MyTable sub
WHERE sub.Id = tbl.Id AND sub.ChangeDate < tbl.ChangeDate)
