Get only 1 occurrence for the duplicate records


select DISTINCT a.Schooldistricttown, a.Schooldistrictnum
from [Legacy].[dbo].[MyTables] as a
It returns:
Can you tell me how to get only one occurrence of a.Schooldistricttown?
I have tried DISTINCT and GROUP BY, but neither works.
Note: I need to show both columns.

There are two options. If it doesn't matter which Schooldistrictnum value you get, GROUP BY with MAX() or MIN() will solve this:
SELECT a.Schooldistricttown,MAX(a.Schooldistrictnum)
from [Legacy].[dbo].[MyTables] a
GROUP BY a.Schooldistricttown
If you do care which value is returned, use ROW_NUMBER():
SELECT s.Schooldistricttown,s.Schooldistrictnum
FROM (
SELECT a.Schooldistricttown,a.Schooldistrictnum,
ROW_NUMBER() OVER(PARTITION BY a.Schooldistricttown ORDER BY a.<ORDER_COLUMN>) as rnk
from [Legacy].[dbo].[MyTables] a) s
WHERE s.rnk = 1
Replace <ORDER_COLUMN> with the column that determines which row should be kept for each town.
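Both options can be tried out with a small sketch. It runs the queries against SQLite through Python's sqlite3 module rather than SQL Server; the sample rows and the choice of Schooldistrictnum as the ordering column are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data: one town appears with two district numbers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTables (Schooldistricttown TEXT, Schooldistrictnum INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTables VALUES (?, ?)",
    [("Albany", 10), ("Albany", 12), ("Berne", 7)],
)

# Option 1: any single value per town will do, so aggregate with MAX().
max_rows = conn.execute(
    """
    SELECT Schooldistricttown, MAX(Schooldistrictnum)
    FROM MyTables
    GROUP BY Schooldistricttown
    ORDER BY Schooldistricttown
    """
).fetchall()

# Option 2: choose the row explicitly. Here the lowest number wins (assumption).
# Window functions need SQLite 3.25 or later.
first_rows = conn.execute(
    """
    SELECT s.Schooldistricttown, s.Schooldistrictnum
    FROM (
        SELECT a.Schooldistricttown, a.Schooldistrictnum,
               ROW_NUMBER() OVER (PARTITION BY a.Schooldistricttown
                                  ORDER BY a.Schooldistrictnum) AS rnk
        FROM MyTables a
    ) s
    WHERE s.rnk = 1
    ORDER BY s.Schooldistricttown
    """
).fetchall()

print(max_rows)
print(first_rows)
```

Each query returns one row per town; they differ only in which Schooldistrictnum survives for towns that have more than one.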

If you want all Schooldistrictnum values for each town concatenated into a single column, use this:
SELECT DISTINCT
a.Schooldistricttown,
(
SELECT DISTINCT
ina.Schooldistrictnum + ', ' AS [text()]
FROM
[Legacy].[dbo].[MyTables] as ina
WHERE
ina.Schooldistricttown = a.Schooldistricttown
FOR XML PATH ('')
) AS Schooldistrictnum
FROM [Legacy].[dbo].[MyTables] as a
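For comparison, here is a rough sketch of the same idea outside SQL Server. FOR XML PATH is specific to SQL Server, so this swaps in SQLite's GROUP_CONCAT (run through Python's sqlite3 module); the sample rows are made up. Unlike the query above, GROUP_CONCAT leaves no trailing separator.

```python
import sqlite3

# Hypothetical sample data, including an exact duplicate row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTables (Schooldistricttown TEXT, Schooldistrictnum TEXT)"
)
conn.executemany(
    "INSERT INTO MyTables VALUES (?, ?)",
    [("Albany", "10"), ("Albany", "12"), ("Albany", "10"), ("Berne", "7")],
)

# De-duplicate first, then concatenate the numbers per town.
# SQLite does not promise a particular order inside the concatenated string.
rows = conn.execute(
    """
    SELECT Schooldistricttown, GROUP_CONCAT(Schooldistrictnum, ', ')
    FROM (SELECT DISTINCT Schooldistricttown, Schooldistrictnum FROM MyTables)
    GROUP BY Schooldistricttown
    ORDER BY Schooldistricttown
    """
).fetchall()

for town, nums in rows:
    print(town, "->", nums)
```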

Related

SQL - Using a Group By with Adding More Columns Showing Additional Rows

I have the following SQL statement:
SELECT H1.INCIDENT_NUMBER,
H1.HISTORY_START_DATE,
H1.ASSIGNED_GROUP,
H1.STATUS
FROM INCIDENT_HISTORY_PUBLIC as H1
WHERE H1.INCIDENT_NUMBER IN (
SELECT INCIDENT_NUMBER
FROM INCIDENT_HISTORY_PUBLIC
WHERE ASSIGNED_GROUP LIKE ' DS$_%' ESCAPE '$'
)
ORDER BY H1.INCIDENT_NUMBER
Part of the results are shown as below:
What I'm trying to accomplish from here is for each INCIDENT_NUMBER, grab the MAX(HISTORY_START_DATE). I've tried using the 'Group By' but I need to keep the ASSIGNED_GROUP AND STATUS columns and when I add them back into the 'Group By,' I'm getting multiple rows again for each INCIDENT_NUMBER.
Results I am looking for:
Do I need a subquery or something? What am I missing?

You need to use Row_Number() with Partition by like this:
SELECT *
FROM (
SELECT H1.INCIDENT_NUMBER,
H1.ASSIGNED_GROUP,
H1.HISTORY_START_DATE AS MAX_Date,
H1.STATUS,
Row_Number() over (Partition by H1.INCIDENT_NUMBER order by H1.HISTORY_START_DATE desc) rw
FROM INCIDENT_HISTORY_PUBLIC as H1
WHERE H1.INCIDENT_NUMBER IN (
SELECT INCIDENT_NUMBER
FROM INCIDENT_HISTORY_PUBLIC
WHERE ASSIGNED_GROUP LIKE ' DS$_%' ESCAPE '$'
)) t
where t.rw=1
order by t.INCIDENT_NUMBER
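To see why the GROUP BY attempt returns several rows per incident while ROW_NUMBER() does not, here is a small sketch against SQLite via Python's sqlite3 module. The sample rows are invented, and the ASSIGNED_GROUP filter from the question is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE INCIDENT_HISTORY_PUBLIC (
           INCIDENT_NUMBER TEXT, HISTORY_START_DATE TEXT,
           ASSIGNED_GROUP TEXT, STATUS TEXT)"""
)
# Hypothetical history: INC1 changed group and status once.
conn.executemany(
    "INSERT INTO INCIDENT_HISTORY_PUBLIC VALUES (?, ?, ?, ?)",
    [
        ("INC1", "2016-01-01", "DS_A", "Open"),
        ("INC1", "2016-01-05", "DS_B", "Closed"),
        ("INC2", "2016-02-01", "DS_A", "Open"),
    ],
)

# Grouping by every column keeps one row per distinct combination,
# so INC1 still shows up twice.
grouped = conn.execute(
    """
    SELECT INCIDENT_NUMBER, MAX(HISTORY_START_DATE), ASSIGNED_GROUP, STATUS
    FROM INCIDENT_HISTORY_PUBLIC
    GROUP BY INCIDENT_NUMBER, ASSIGNED_GROUP, STATUS
    """
).fetchall()

# Numbering the rows per incident, newest first, keeps the whole latest row.
# Window functions need SQLite 3.25 or later.
latest = conn.execute(
    """
    SELECT INCIDENT_NUMBER, ASSIGNED_GROUP, HISTORY_START_DATE, STATUS
    FROM (
        SELECT H1.*,
               ROW_NUMBER() OVER (PARTITION BY H1.INCIDENT_NUMBER
                                  ORDER BY H1.HISTORY_START_DATE DESC) AS rw
        FROM INCIDENT_HISTORY_PUBLIC AS H1
    ) t
    WHERE t.rw = 1
    ORDER BY t.INCIDENT_NUMBER
    """
).fetchall()

print(len(grouped), "rows from GROUP BY")
print(latest)
```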

How to replace IN CLAUSE USING EXISTS?

select
TV.ATTRIBUTE
FROM
TABLE_VALUE TV
WHERE
TV.NUMBERS IN (SELECT MAX(TV1.NUMBERS) FROM TABLE_VALUE TV1
WHERE TV.UNIQUE_ID=TV1.UNIQUE_ID GROUP BY UNIQUE_ID )

I'm not sure EXISTS would help here because, as you put it, each unique_id can have many numbers values, and you want to select the attribute for the highest numbers value of that particular unique_id.
EXISTS is useful when you want to check whether something ... well, exists, but that's not the case here.

You do not want EXISTS, instead you can use the RANK or DENSE_RANK analytic functions:
SELECT attribute
FROM (
SELECT attribute,
DENSE_RANK() OVER (PARTITION BY unique_id ORDER BY numbers DESC) AS rnk
FROM table_value
)
WHERE rnk = 1
or use the MAX analytic function:
SELECT attribute
FROM (
SELECT attribute,
numbers,
MAX(numbers) OVER (PARTITION BY unique_id) AS max_numbers
FROM table_value
)
WHERE numbers = max_numbers;
Either option will only read from the table once.
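As a quick check that the two analytic queries agree, including when several rows tie for the highest value, here is a sketch run on SQLite through Python's sqlite3 module instead of Oracle. The sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_value (unique_id INTEGER, numbers INTEGER, attribute TEXT)"
)
# Hypothetical data: unique_id 1 has two rows tied at the maximum.
conn.executemany(
    "INSERT INTO table_value VALUES (?, ?, ?)",
    [(1, 5, "a"), (1, 9, "b"), (1, 9, "c"), (2, 3, "d")],
)

# DENSE_RANK gives rank 1 to every row holding the top value in its partition.
by_rank = conn.execute(
    """
    SELECT attribute
    FROM (
        SELECT attribute,
               DENSE_RANK() OVER (PARTITION BY unique_id
                                  ORDER BY numbers DESC) AS rnk
        FROM table_value
    )
    WHERE rnk = 1
    ORDER BY attribute
    """
).fetchall()

# MAX() as a window function attaches the partition maximum to each row.
by_max = conn.execute(
    """
    SELECT attribute
    FROM (
        SELECT attribute, numbers,
               MAX(numbers) OVER (PARTITION BY unique_id) AS max_numbers
        FROM table_value
    )
    WHERE numbers = max_numbers
    ORDER BY attribute
    """
).fetchall()

print(by_rank)
print(by_max)
```

Both queries return the tied rows "b" and "c" for unique_id 1. If only one row per unique_id is wanted, ROW_NUMBER() with a tie-breaking ORDER BY column would be the usual substitute.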
If you really do want to use EXISTS (or IN), it will be less efficient because you query the same table twice, but you can do it with a HAVING clause:
SELECT tv.attribute
FROM table_value tv
WHERE EXISTS(
SELECT 1
FROM table_value tv1
WHERE tv1.unique_id = tv.unique_id
HAVING MAX(tv1.numbers) = tv.numbers
)

Remove duplicate records except the first record in SQL

I want to remove all duplicate records except the first one.
Like :
NAME
R
R
rajesh
YOGESH
YOGESH
Now in the above I want to remove the second "R" and the second "YOGESH".
I have only one column whose name is "NAME".

Use a CTE (I have several of these in production).
;WITH duplicateRemoval as (
SELECT
[name]
,ROW_NUMBER() OVER(PARTITION BY [name] ORDER BY [name]) ranked
from #myTable
)
DELETE
FROM duplicateRemoval
WHERE ranked > 1;
Explanation: The CTE will grab all of your records and apply a row number for each unique entry. Each additional entry will get an incrementing number. Replace the DELETE with a SELECT * in order to see what it does.
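Deleting through a CTE like this is a SQL Server feature. As a rough sketch of the same numbering idea on an engine without it, the following uses SQLite via Python's sqlite3 module and deletes by the implicit rowid instead; the table name and the use of rowid to define "first" are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (name TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?)",
    [("R",), ("R",), ("rajesh",), ("YOGESH",), ("YOGESH",)],
)

# Number the rows within each name by insertion order (rowid), then delete
# every row numbered above 1. Window functions need SQLite 3.25 or later.
conn.execute(
    """
    DELETE FROM myTable
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY name ORDER BY rowid) AS ranked
            FROM myTable
        )
        WHERE ranked > 1
    )
    """
)

remaining = conn.execute("SELECT name FROM myTable ORDER BY rowid").fetchall()
print(remaining)
```

Ordering the window by rowid makes "the first record" well defined, which ordering by the partition column itself cannot do.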

Seems like a simple distinct modifier would do the trick:
SELECT DISTINCT name
FROM mytable

This is longer, but it leaves the original row out and returns only the duplicate rows:
select majorTable.RowID, majorTable.Name, majorTable.Value
from (
select outerTable.Name, outerTable.Value, outerTable.RowID,
ROW_NUMBER() over (partition by outerTable.Name, outerTable.Value order by outerTable.RowID) as RowNo
from #Your_Table outerTable
inner join (
select Name, Value, COUNT(*) as duplicateRows
from #Your_Table group by Name, Value
having COUNT(*) > 1
) innerTable
on innerTable.Name = outerTable.Name and innerTable.Value = outerTable.Value
) majorTable
where majorTable.RowNo <> 1

solution for writing query

I have a match table with winningteamid and stadiumid as attributes,
I need to retrieve the winningteamid which has won all its games in the same stadium.
I tried this, and I'm getting additional unwanted rows:
select winningteamid
from match
group by winningteamid
having count(winningteamid) in (
select count(*) from match group by (winningteamid,stadiumid)
)

Try this please (please write which RDBMS you are using):
with cte as (
select winningteamid, stadiumid, count(stadiumid) over (partition by winningteamid) as count
from match
group by winningteamid, stadiumid
)
select * from cte where count = 1;

You should use this:
SELECT winningteamid
FROM (
SELECT DISTINCT winningteamid, stadiumid
FROM match
) t
GROUP BY winningteamid
HAVING COUNT(*) = 1;

I suppose it's as easy as using HAVING to verify only one distinct stadiumid:
select winningteamid
from match
group by winningteamid
having count(distinct stadiumid) = 1
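A small sketch of this HAVING approach, run on SQLite through Python's sqlite3 module with made-up rows. The table name is quoted only because MATCH is also an SQLite keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "match" (winningteamid INTEGER, stadiumid INTEGER)')
# Hypothetical results: team 1 always wins at stadium 10, team 2 wins at
# two different stadiums, team 3 has a single win.
conn.executemany(
    'INSERT INTO "match" VALUES (?, ?)',
    [(1, 10), (1, 10), (2, 10), (2, 20), (3, 30)],
)

# Keep only the teams whose wins all happened in one distinct stadium.
single_stadium = conn.execute(
    """
    SELECT winningteamid
    FROM "match"
    GROUP BY winningteamid
    HAVING COUNT(DISTINCT stadiumid) = 1
    ORDER BY winningteamid
    """
).fetchall()

print(single_stadium)
```

Team 2 is dropped because its wins span two stadiums. A team with a single win counts as having won all its games in one stadium.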

How to avoid order by in group by query result [duplicate]

I am trying to display the records in the same order as the values in the WHERE clause.
Example:
select name from table where name in ('Yaksha','Arun','Naveen');
It displays Arun, Naveen, Yaksha (alphabetical order).
I want to display them in the same order as the list, i.e. 'Yaksha', 'Arun', 'Naveen'.
How can I do this?
I am using an Oracle database.

Add this ORDER BY at the query's end:
order by case name when 'Yaksha' then 1
when 'Arun' then 2
when 'Naveen' then 3
end
(There's no other way to get that order. You need an ORDER BY to get a specific result set order.)

It may be a bit clunky, but you can create a custom ordering with a case expression:
SELECT *
FROM my_table
WHERE name IN ('Yaksha', 'Arun','Naveen')
ORDER BY CASE name WHEN 'Yaksha' THEN 1
WHEN 'Arun' THEN 2
WHEN 'Naveen' THEN 3
END ASC
A slightly longer option, but one that avoids duplicating the string literals, is to use a subquery:
SELECT m.*
FROM my_table m
JOIN (SELECT 'Yaksha' AS name, 1 AS name_order FROM dual
UNION ALL
SELECT 'Arun' AS name, 2 AS name_order FROM dual
UNION ALL
SELECT 'Naveen' AS name, 3 AS name_order FROM dual) o
ON o.name = m.name
ORDER BY o.name_order ASC
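Both variants can be sketched as follows, using SQLite through Python's sqlite3 module rather than Oracle. SQLite has no dual table, so the lookup list is built as a VALUES-based CTE from a Python list; that substitution and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("Arun",), ("Naveen",), ("Yaksha",), ("Zed",)],
)

# Variant 1: spell the order out in a CASE expression.
case_order = conn.execute(
    """
    SELECT name
    FROM my_table
    WHERE name IN ('Yaksha', 'Arun', 'Naveen')
    ORDER BY CASE name WHEN 'Yaksha' THEN 1
                       WHEN 'Arun'   THEN 2
                       WHEN 'Naveen' THEN 3
             END
    """
).fetchall()

# Variant 2: join to a (name, position) list so each literal appears once.
wanted = ["Yaksha", "Arun", "Naveen"]  # desired output order
placeholders = ", ".join("(?, ?)" for _ in wanted)
params = [v for pos, name in enumerate(wanted, start=1) for v in (name, pos)]
join_order = conn.execute(
    f"""
    WITH o(name, name_order) AS (VALUES {placeholders})
    SELECT m.name
    FROM my_table m
    JOIN o ON o.name = m.name
    ORDER BY o.name_order
    """,
    params,
).fetchall()

print(case_order)
print(join_order)
```

The join also acts as the filter, so the separate IN list is no longer needed in the second variant.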

You can try with something like the following:
SELECT *
FROM test
WHERE name IN ( 'Yaksha', 'Arun', 'Naveen' )
ORDER BY instr ( q'['Yaksha', 'Arun', 'Naveen']', name ) ASC
This way could be useful if your IN list is somehow dynamic.

If the list of values is dynamic or you just don't want to repeat the values you could use (or abuse, depending on your point of view) a table collection, and join your real table to a table collection expression instead of using IN:
select your_table.name
from table(sys.odcivarchar2list('Yaksha','Arun','Naveen')) t
join your_table on your_table.name = t.column_value;
This will generally work, but without an ORDER BY clause the order is not guaranteed, so you can use an inline view to assign the order:
select your_table.name from (
select row_number() over (order by null) as rn, column_value as name
from table(sys.odcivarchar2list('Yaksha','Arun','Naveen'))
) t
join your_table on your_table.name = t.name
order by t.rn;
This still relies on row_number() over (order by null) using the order of the elements in the collection; which relies on collection unnesting preserving the element order. I don't think that's guaranteed either, so there is still some risk involved.
