Selecting TOP 1 Columns where duplicate exists and selecting all where no duplicate exists


Given a list of Names, Accounts and Positions, I am trying to:
Select the first position where more than one record has the same Name and Account.
Select the record's details as they are when only one record has that Name and Account.
My current query looks like the following:
SELECT *
FROM CTE cte1
JOIN
(
SELECT Name, OppName FROM CTE GROUP BY Name, OppName HAVING COUNT(Name)>1
) as cte2
on cte2.Name = cte1.Name and cte2.OppName = cte1.OppName
ORDER BY cte1.OppName, cte1.Name
I have not posted the rest of the CTE query as it is way too long.
However, this only returns the results where the Name and Account are the same and the Positions are different.
For example, if Oera worked at Christie's as a Sales Analyst and as a Developer, it would select only the record where Oera worked at Christie's as a Developer.
How do I modify this query accordingly?

Are you looking for something like this?
SELECT *
FROM CTE AS cte1
JOIN
(
SELECT DISTINCT Name, OppName, COUNT(Name) OVER (PARTITION BY Name, OppName) AS cnt
FROM CTE
) AS cte2
ON cte2.Name = cte1.Name and cte2.OppName = cte1.OppName
WHERE cnt > 1
ORDER BY cte1.OppName, cte1.Name
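
If the goal is one row per Name and OppName (the first position when duplicates exist, the only row otherwise), a ROW_NUMBER() filter may cover both cases in one pass. The following is a minimal sketch rather than a drop-in answer: it runs the SQL through Python's sqlite3 module (SQLite 3.25 or later for window functions), and the table name people, the sample rows, and the choice to order "first" alphabetically by Position are all assumptions, since the question does not say how positions are ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (Name TEXT, OppName TEXT, Position TEXT);
    INSERT INTO people VALUES
        ('Oera', 'Christie''s', 'Sales Analyst'),
        ('Oera', 'Christie''s', 'Developer'),
        ('Sam',  'Acme',        'Manager');
""")

# Number the rows inside each (Name, OppName) group and keep only row 1.
# Groups with a single row keep that row; duplicated groups keep the first.
query = """
    SELECT Name, OppName, Position
    FROM (
        SELECT Name, OppName, Position,
               ROW_NUMBER() OVER (
                   PARTITION BY Name, OppName
                   ORDER BY Position        -- assumed definition of "first"
               ) AS rn
        FROM people
    ) AS ranked
    WHERE rn = 1
    ORDER BY OppName, Name
"""
first_positions = conn.execute(query).fetchall()
for row in first_positions:
    print(row)
```

Swapping the ORDER BY inside the OVER clause for a date or an id column would change which position counts as the first one.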

Related

Show SQL result row in context of others

I have a query showing how a particular entry ranks:
select launch_rank, partner_info from summary WHERE partner_info LIKE '%Example%'
However, the rank is only useful when shown in context with the entries ranked around it.
How do I show the entry with 10 competitors on either side of it, without resorting to static queries like WHERE launch_rank > 140 and launch_rank < 200?

Assuming you have one row that you are comparing to and the ranks are actually different on each row, you can use:
with onerow as (
select launch_rank
from summary
where partner_info LIKE '%Example%'
)
select s.*
from (select s.*
from summary s
where s.launch_rank <= (select launch_rank from onerow)
order by s.launch_rank desc
limit 11
) s
union all
select s.*
from (select s.*
from summary s
where s.launch_rank > (select launch_rank from onerow)
order by s.launch_rank asc
limit 10
) s

Just join the table to itself:
select A.launch_rank, A.partner_info from summary A INNER JOIN summary B
ON A.launch_rank>=B.launch_rank-10 AND A.launch_rank<=B.launch_rank+10
WHERE B."partner_info" LIKE '%Example%'
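
To see the self-join idea run end to end, here is a small sketch using Python's sqlite3 module. The 30 generated rows, the name "Example Co", and the window of 3 (instead of 10, to keep the output short) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summary (launch_rank INTEGER, partner_info TEXT)")

# Hypothetical data: ranks 1..30, with the entry of interest at rank 15.
rows = [(rank, f"Partner {rank}") for rank in range(1, 31)]
rows[14] = (15, "Example Co")
conn.executemany("INSERT INTO summary VALUES (?, ?)", rows)

window = 3  # the question asks for 10 on either side

# b is the matched entry; a is every row whose rank is within the window.
query = """
    SELECT a.launch_rank, a.partner_info
    FROM summary AS a
    JOIN summary AS b
      ON a.launch_rank BETWEEN b.launch_rank - :w AND b.launch_rank + :w
    WHERE b.partner_info LIKE '%Example%'
    ORDER BY a.launch_rank
"""
neighbours = conn.execute(query, {"w": window}).fetchall()
for row in neighbours:
    print(row)
```

This form compares rank values, so it returns fewer neighbours when ranks have gaps. The UNION ALL version with LIMIT counts rows instead and is not affected by gaps.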

get ROW NUMBER of random records

For a simple SQL like,
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
how can I add row numbers so that they become 1, 2, and 3?
UPDATE:
I thought I could simplify my question as above, but it turns out to be more complicated. So here is a fuller version: I need to give three random picks (from MyTable) to each person, with a pick/row number of 1, 2, and 3, and there is no logical join between person and picks.
SELECT * FROM Person
LEFT JOIN (
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
) D ON 1=1
The problems with the above SQL are:
Obviously, a pick/row number of 1, 2, and 3 should be added.
Less obviously, the above SQL gives every person the same picks, whereas I need different picks for each person.
Here is a working SQL to test it out:
SELECT TOP 15 database_id, create_date, cs.name FROM sys.databases
CROSS apply (
SELECT top 3 Row_number()OVER(ORDER BY (SELECT NULL)) AS RowNo,*
FROM (SELECT top 3 name from sys.all_views ORDER BY NEWID()) T
) cs
So, please help.
NOTE: This is NOT about MySQL but T-SQL. Their syntax is different, so the solution is different as well.

Add ROW_NUMBER to the outer query. Try this:
SELECT Row_number()OVER(ORDER BY (SELECT NULL)),*
FROM (SELECT TOP 3 MyId
FROM MyTable
ORDER BY Newid()) a
Logically, the TOP keyword is processed after the SELECT list. If ROW_NUMBER were generated in the original query, the numbers would be assigned before the 3 random records are pulled, so they would not necessarily be 1, 2, and 3. That is why you should not generate the row number in the original query.
Update
This can be achieved with CROSS APPLY. Replace the column name in the WHERE clause inside the CROSS APPLY with a valid column name from the Person table:
SELECT *
FROM Person p
CROSS apply (SELECT Row_number()OVER(ORDER BY (SELECT NULL)) rn,*
FROM (SELECT TOP 3 MyId
FROM MyTable
WHERE p.some_col = p.some_col -- Replace it with some column from person table
ORDER BY Newid())a) cs
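
For readers without a SQL Server instance at hand, the same outcome (three numbered random picks per person, different for each person) can be sketched with a different technique: a cross join plus ROW_NUMBER() partitioned by person and ordered randomly. This is not the CROSS APPLY approach above. It runs on SQLite through Python's sqlite3 module, and the PersonId and Name columns and the sample sizes are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE MyTable (MyId INTEGER PRIMARY KEY);
    INSERT INTO Person (Name) VALUES ('Ann'), ('Bob'), ('Cy');
""")
conn.executemany("INSERT INTO MyTable VALUES (?)", [(i,) for i in range(1, 21)])

# Pair every person with every candidate row, shuffle the candidates
# independently inside each person's partition, and keep the first three.
query = """
    SELECT PersonId, Name, MyId, rn
    FROM (
        SELECT p.PersonId, p.Name, m.MyId,
               ROW_NUMBER() OVER (
                   PARTITION BY p.PersonId
                   ORDER BY random()
               ) AS rn
        FROM Person AS p
        CROSS JOIN MyTable AS m
    ) AS shuffled
    WHERE rn <= 3
    ORDER BY PersonId, rn
"""
picks = conn.execute(query).fetchall()
for row in picks:
    print(row)
```

In T-SQL the ordering expression would be NEWID() instead of random(). The cross join reads every combination of person and candidate, so it is likely to be a poor fit for large tables.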

Ensuring only distinct records are returned with DISTINCT

Given the following table:
date_field_one date_field_two arbitrary_value
---------------- ---------------- -----------------
1/1/11 1/3/11 cheese
1/1/11 1/4/11 the color orange
2/2/11 2/3/11 1
2/2/11 2/4/11 2
My problem: I'm not sure how to go about structuring a query using a set based approach that yields the following results:
for each distinct date, the record with the earliest
date_field_two value is returned
Any ideas?

Edit for new response! The solution posted by M.Ali may be the best fit for your specific case as it will ensure you only ever get one row result from your base data, even if there exist multiple candidate rows for your answer ( as in, date_field_one, date_field_two combinations are not distinct ). The following will return multiple results per date_field_one, date_field_two combination in the not-distinct scenario:
SELECT t.date_field_one, t.date_field_two, t.arbitrary_value
FROM ( SELECT date_field_one,
date_field_two = MIN( date_field_two )
FROM dbo.[table]
GROUP BY date_field_one ) dl
LEFT JOIN dbo.[table] t
ON dl.date_field_one = t.date_field_one
AND dl.date_field_two = t.date_field_two;

;WITH CTE
AS
(
SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY date_field_one ORDER BY date_field_two ASC)
FROM TableName
)
SELECT * FROM CTE
WHERE rn = 1
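
The difference between the two answers above only shows up when the earliest date_field_two is tied. The sketch below runs both shapes on SQLite through Python's sqlite3 module. The table name events, the ISO date strings (reading 1/1/11 as 1 January 2011), and the extra "crackers" row that creates the tie are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        date_field_one TEXT, date_field_two TEXT, arbitrary_value TEXT
    );
    INSERT INTO events VALUES
        ('2011-01-01', '2011-01-03', 'cheese'),
        ('2011-01-01', '2011-01-03', 'crackers'),  -- hypothetical tie
        ('2011-01-01', '2011-01-04', 'the color orange'),
        ('2011-02-02', '2011-02-03', '1'),
        ('2011-02-02', '2011-02-04', '2');
""")

# Shape 1: join back to the per-group minimum. Tied rows all survive.
min_join = conn.execute("""
    SELECT t.date_field_one, t.date_field_two, t.arbitrary_value
    FROM (SELECT date_field_one, MIN(date_field_two) AS date_field_two
          FROM events
          GROUP BY date_field_one) AS dl
    JOIN events AS t
      ON dl.date_field_one = t.date_field_one
     AND dl.date_field_two = t.date_field_two
    ORDER BY t.date_field_one, t.arbitrary_value
""").fetchall()

# Shape 2: ROW_NUMBER keeps exactly one row per group. The second sort
# key is an assumed tie-breaker so the result is deterministic.
row_number = conn.execute("""
    SELECT date_field_one, date_field_two, arbitrary_value
    FROM (SELECT *, ROW_NUMBER() OVER (
                        PARTITION BY date_field_one
                        ORDER BY date_field_two, arbitrary_value) AS rn
          FROM events) AS ranked
    WHERE rn = 1
    ORDER BY date_field_one
""").fetchall()

print(len(min_join), len(row_number))
```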

Something like this:
select date_field_one, min(date_field_two)
from yourtable
group by date_field_one

select date_field_one, min(date_field_two)
from table
group by date_field_one

Try this for the earliest date_field_two:
select date_field_one ,min(date_field_two) date_field_two
from table group by date_field_one

SQL query to select distinct row with minimum value

I want an SQL statement to get the row with a minimum value.
Consider this table:
id game point
1 x 5
1 z 4
2 y 6
3 x 2
3 y 5
3 z 8
How do I select the ids that have the minimum value in the point column, grouped by game? Like the following:
id game point
1 z 4
2 y 6
3 x 2

Use:
SELECT tbl.*
FROM TableName tbl
INNER JOIN
(
SELECT Id, MIN(Point) MinPoint
FROM TableName
GROUP BY Id
) tbl1
ON tbl1.id = tbl.id
WHERE tbl1.MinPoint = tbl.Point

This is another way of doing the same thing, which would allow you to do interesting things like select the top 5 winning games, etc.
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Point) as RowNum, *
FROM Table
) X
WHERE RowNum = 1
You can now correctly get the actual row that was identified as the one with the lowest score and you can modify the ordering function to use multiple criteria, such as "Show me the earliest game which had the smallest score", etc.

This will work
select * from table
where (id,point) IN (select id,min(point) from table group by id);

As this is tagged with sql only, the following is using ANSI SQL and a window function:
select id, game, point
from (
select id, game, point,
row_number() over (partition by game order by point) as rn
from games
) t
where rn = 1;
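
The answers here disagree on whether to partition by id or by game, and the two readings give different rows. A small sketch on the question's sample data, run on SQLite through Python's sqlite3 module, shows both. The table name games follows the query above, and the helper lowest_per is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (id INTEGER, game TEXT, point INTEGER);
    INSERT INTO games VALUES
        (1, 'x', 5), (1, 'z', 4), (2, 'y', 6),
        (3, 'x', 2), (3, 'y', 5), (3, 'z', 8);
""")


def lowest_per(column):
    """Return the lowest-point row for each value of the given column."""
    # Only the two grouping columns from the question are allowed, so the
    # f-string below never receives untrusted input.
    if column not in ("id", "game"):
        raise ValueError(column)
    query = f"""
        SELECT id, game, point
        FROM (
            SELECT id, game, point,
                   ROW_NUMBER() OVER (
                       PARTITION BY {column} ORDER BY point
                   ) AS rn
            FROM games
        ) AS ranked
        WHERE rn = 1
        ORDER BY {column}
    """
    return conn.execute(query).fetchall()


per_id = lowest_per("id")
per_game = lowest_per("game")
print(per_id)
print(per_game)
```

Partitioning by id yields one row per id, while partitioning by game yields one row per game, so it is worth checking which of the two the expected output actually describes.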

Ken Clark's answer didn't work in my case. It might not work in yours either. If not, try this:
SELECT *
from table T
INNER JOIN
(
select id, MIN(point) MinPoint
from table T
group by id
) NewT on T.id = NewT.id and T.point = NewT.MinPoint
ORDER BY game desc

SELECT DISTINCT
FIRST_VALUE(ID) OVER (Partition by Game ORDER BY Point) AS ID,
Game,
FIRST_VALUE(Point) OVER (Partition by Game ORDER BY Point) AS Point
FROM #T

SELECT * from room
INNER JOIN
(
select DISTINCT hotelNo, MIN(price) MinPrice
from room
Group by hotelNo
) NewT
on room.hotelNo = NewT.hotelNo and room.price = NewT.MinPrice;

This alternative approach uses SQL Server's OUTER APPLY clause. This way, it
creates the distinct list of games, and
fetches and outputs the record with the lowest point number for that game.
The OUTER APPLY clause can be imagined as a LEFT JOIN, but with the advantage that you can use values of the main query as parameters in the subquery (here: game).
SELECT colMinPointID
FROM (
SELECT game
FROM table
GROUP BY game
) As rstOuter
OUTER APPLY (
SELECT TOP 1 id As colMinPointID
FROM table As rstInner
WHERE rstInner.game = rstOuter.game
ORDER BY point
) AS rstMinPoints

This is portable - at least between ORACLE and PostgreSQL:
select t.* from table t
where not exists(select 1 from table ti where ti.attr < t.attr);

Most of the answers use an inner query. I am wondering why the following isn't suggested.
select
*
from
table
order by
point
fetch next 1 row only -- ... or the appropriate syntax for the particular DB
This query is very simple to write with JPAQueryFactory (a Java Query DSL class).
return new JPAQueryFactory(manager).
selectFrom(QTable.table).
setLockMode(LockModeType.OPTIMISTIC).
orderBy(QTable.table.point.asc()).
fetchFirst();

Try:
select id, game, min(point) from t
group by id

Compare SQL groups against each other

How can one filter a grouped resultset for only those groups that meet some criterion compared against the other groups? For example, only those groups that have the maximum number of constituent records?
I had thought that a subquery as follows should do the trick:
SELECT * FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t HAVING Records = MAX(Records);
However the addition of the final HAVING clause results in an empty recordset... what's going on?

In MySQL (which I assume you are using, since you posted SELECT *, COUNT(*) FROM T GROUP BY X, which would fail in every other RDBMS that I know of), you can use:
SELECT T.*
FROM T
INNER JOIN
( SELECT X, COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
) T2
ON T2.X = T.X
This has been tested in MySQL and removes the implicit grouping/aggregation.
If you can use windowed functions and one of TOP/LIMIT with Ties or Common Table expressions it becomes even shorter:
Windowed function + CTE: (MS SQL-Server & PostgreSQL Tested)
WITH CTE AS
( SELECT *, COUNT(*) OVER(PARTITION BY X) AS Records
FROM T
)
SELECT *
FROM CTE
WHERE Records = (SELECT MAX(Records) FROM CTE)
Windowed Function with TOP (MS SQL-Server Tested)
SELECT TOP 1 WITH TIES *
FROM ( SELECT *, COUNT(*) OVER(PARTITION BY X) [Records]
FROM T
) AS t
ORDER BY Records DESC
Lastly, I have never used Oracle, so apologies for not adding a solution that works on Oracle...
EDIT
My solution for MySQL did not take ties into account, and my suggested fix rather steps on the toes of what you said you want to avoid (duplicate subqueries), so I am not sure I can help after all. Just in case it is preferable, here is a version that will work as required on your fiddle:
SELECT T.*
FROM T
INNER JOIN
( SELECT X
FROM T
GROUP BY X
HAVING COUNT(*) =
( SELECT COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
)
) T2
ON T2.X = T.X
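
The effect of ties can be checked on a tiny data set. The sketch below runs the LIMIT 1 join and the window-function variant side by side on SQLite through Python's sqlite3 module. The column Y and the sample rows, in which groups a and b tie for the largest count, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (X TEXT, Y INTEGER);
    INSERT INTO T VALUES
        ('a', 1), ('a', 2),
        ('b', 3), ('b', 4),
        ('c', 5);
""")

# LIMIT 1 keeps a single winning group, so one of the tied groups is lost.
limit_rows = conn.execute("""
    SELECT T.X, T.Y
    FROM T
    JOIN (SELECT X, COUNT(*) AS Records
          FROM T
          GROUP BY X
          ORDER BY Records DESC
          LIMIT 1) AS T2
      ON T2.X = T.X
""").fetchall()

# Comparing each row's group size against the overall maximum keeps ties.
window_rows = conn.execute("""
    WITH counted AS (
        SELECT X, Y, COUNT(*) OVER (PARTITION BY X) AS Records
        FROM T
    )
    SELECT X, Y, Records
    FROM counted
    WHERE Records = (SELECT MAX(Records) FROM counted)
    ORDER BY X, Y
""").fetchall()

print(limit_rows)
print(window_rows)
```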

For the exact question you give, one way to look at it is that you want the group of records where there is no other group that has more records. So if you say
SELECT taxid, COUNT(*) as howMany
FROM flats
GROUP by taxid
You get all counties and their counts.
Then you can treat that expression as a table by making it a subquery and giving it an alias. Below I assign two "copies" of the query the names X and Y and ask for the taxids in X for which no row in Y has a higher count. If two groups share the same highest count, I'd get both. Different databases have proprietary syntax, notably TOP and LIMIT, that makes this kind of query simpler and easier to understand.
SELECT taxid FROM
(select taxid, count(*) as HowMany from flats
GROUP by taxid) as X
WHERE NOT EXISTS
(
SELECT * from
(
SELECT taxid, count(*) as HowMany FROM
flats
GROUP by taxid
) AS Y
WHERE Y.howmany > X.howmany
)

Try this:
SELECT * FROM (
SELECT *, MAX(Records) as max_records FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t
) t2 WHERE Records = max_records
I'm sorry that I can't test the validity of this query right now.
