Deleting all but top 10 rows

I have the following SQL statement which seems to be deleting every row in the selected table. What it should be doing is deleting all but the top 10 where the difficulty level equals one.
DELETE FROM Scores
WHERE Highscore_ID
NOT IN (SELECT TOP 10 highScore FROM Scores WHERE difficulty = 1 ORDER BY HighScore DESC)
Any suggestions as to why this would be deleting all rows? When running the subquery, it selects the proper rows, however, when deleting, it seems to want to delete every row.

You compare Highscore_ID with the column HighScore. Do those columns really hold the same values?
If not, it should be:
DELETE FROM Scores
WHERE Highscore_ID NOT IN
(SELECT TOP 10 HighScore_ID
FROM Scores
WHERE difficulty = 1
ORDER BY HighScore DESC);
EDIT:
Try this:
DELETE Scores
FROM Scores
JOIN (SELECT ROW_NUMBER() OVER (ORDER BY HighScore DESC) AS your_row, HighScore_ID
FROM Scores
WHERE difficulty = 1) AS score2 ON score2.HighScore_ID = Scores.HighScore_ID
WHERE score2.your_row > 10;

Don't know if this is a typo or a real error but your select clause should refer to highscore_id, not highscore.

Try this:
with a as
(
select ROW_NUMBER() over(order by name) as ordinal, * from test
)
delete from a where a.ordinal > 10;
Related: http://www.ienablemuch.com/2012/03/possible-in-sql-server-deleting-any-row.html
Sample data:
CREATE TABLE [beatles]
([name] varchar(14));
INSERT INTO [beatles]
([name])
VALUES
('john'),
('paul'),
('george'),
('ringo'),
('pete'),
('brian'),
('george martin');
Query:
with a as
(
select *, row_number() over(order by name) ordinal
from beatles
)
delete from a
where ordinal > 4;
select * from beatles;
Before deleting:
NAME
brian
george
george martin
john
paul
pete
ringo
After deleting:
NAME
brian
george
george martin
john
Live test: http://www.sqlfiddle.com/#!3/0adcf/6

My first guess would be "SELECT TOP 10 highScore FROM Scores WHERE difficulty = 1 ORDER BY HighScore DESC" could be returning null.
My second guess would be that "highscore_id" is different from "highscore", so there's no overlap between the two sets of values (and NOT IN is true for every row, so everything is deleted).
Definitely double-check your subquery, and make sure it's returning the keys you expect!
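The effect of comparing the key column against score values can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so LIMIT 10 stands in for TOP 10, and the sample rows (IDs 1 to 15, scores in the thousands) are made up for illustration.

```python
import sqlite3


def remaining_ids(subquery_column):
    """Run the DELETE with the given subquery column and return the surviving IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Scores ("
        "Highscore_ID INTEGER PRIMARY KEY, HighScore INTEGER, difficulty INTEGER)"
    )
    # Hypothetical data: IDs 1..15, scores 1010..1150, all at difficulty 1.
    conn.executemany(
        "INSERT INTO Scores VALUES (?, ?, 1)",
        [(i, 1000 + 10 * i) for i in range(1, 16)],
    )
    # The column name is interpolated only because it is a hard-coded literal here.
    conn.execute(
        "DELETE FROM Scores WHERE Highscore_ID NOT IN ("
        f"SELECT {subquery_column} FROM Scores "
        "WHERE difficulty = 1 ORDER BY HighScore DESC LIMIT 10)"
    )
    ids = [row[0] for row in conn.execute(
        "SELECT Highscore_ID FROM Scores ORDER BY Highscore_ID")]
    conn.close()
    return ids


buggy = remaining_ids("HighScore")      # subquery returns scores, not keys
fixed = remaining_ids("Highscore_ID")   # subquery returns the keys to keep
print(len(buggy), len(fixed))
```

With this data no ID ever equals a score, so the mismatched subquery protects nothing and every row is deleted, while the corrected subquery keeps the ten rows with the highest scores.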

Related

How can I delete completely duplicate rows from a query, without having a unique value for it?

I'm having an issue getting information from an MS Access database table. I need a count per name, but duplicated rows must not be counted at all, which means that I first need to remove every row whose code appears more than once.
Here's an example to illustrate what I need:
Code | Name
12 | George
20 | John
12 | George
33 | John
First I need to drop both rows that share the same code, and then I need a count per name over the rest of the table data. For example, this is the result I'm expecting:
Name | Count
John | 2
I already have a query that does that for me, but it takes around 1 hour to return around 5000 rows and I need something more efficient. My query:
select name, count(*) from Table
where name = '" + input_name + "'
and code in (select code from Table group by code
having count(code) = 1)
group by name
order by count(name) desc;
I would appreciate any suggestion.

Rather than using in, I might suggest filtering the original dataset in a subquery, e.g.:
select u.name, count(*)
from (select t.code, t.name from yourtable t group by t.code, t.name having count(*) = 1) u
group by u.name
Here, change yourtable to the name of your table.
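As a quick check of the subquery approach, the following sketch runs the same query against the four sample rows from the question. It uses Python's sqlite3 module instead of MS Access, which is an assumption made only so the example is self-contained; the added ORDER BY mirrors the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (code INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(12, "George"), (20, "John"), (12, "George"), (33, "John")],
)

# Keep only (code, name) pairs that occur exactly once, then count per name.
name_counts = conn.execute(
    """
    SELECT u.name, COUNT(*)
    FROM (SELECT t.code, t.name
          FROM yourtable t
          GROUP BY t.code, t.name
          HAVING COUNT(*) = 1) u
    GROUP BY u.name
    ORDER BY COUNT(*) DESC
    """
).fetchall()
conn.close()

print(name_counts)
```

Both George rows share code 12 and drop out, leaving John with a count of 2.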

Adding same random rows to table in SQL

I have the following base table:
_ID_ _Name_
1 Bart Smit
2 Ahmed Lissabon
3 Medina Aziz
4 Ben Joeson
I would like to assign random titles to the rows of this table. However, the same title should be assigned to the same person every time the query is run. Suppose I have the following table of titles:
_Titles_
Captain
Mr.
Ms.
Prince
King
Queen
Lieutenant
Doctor
Sir
So the output could look like this:
_ID_ _Title_ _Name_
1 Doctor Bart Smit
2 King Ahmed Lissabon
3 Captain Medina Aziz
4 Sir Ben Joeson
But then it should assign those titles to those names every time I run the code. Now I use the NEWID() in combination with a CROSS APPLY and it randomly assigns titles to names every time I run it.
SELECT _ID_, R._Title_, _NAME_
FROM TABLE
CROSS APPLY
(
SELECT TOP 1 Title
FROM #titles
WHERE TABLE.[_ID_]= TABLE.[_ID_]
ORDER BY NEWID()
) R

If you want the same result every time, store the randomly chosen titles in the table once instead of generating them each time the query runs:
WITH cte AS (
SELECT t.*, R.title AS new_title
FROM TABLE t
CROSS APPLY(SELECT TOP 1 Title
FROM #titles
ORDER BY NEWID()) R
WHERE _TITLE_ IS NULL
)
UPDATE cte
SET _Title_ = new_title;

You can't use any random calculation if it has to be repeatable (unless you add a new column to store that random value).
This applies checksum and modulo to get a repeatable value:
select *
from tab
join
( select *
,row_number() over (order by title) -1 as rn
from titles
) as t
-- or simply hardcoded 9
on abs(checksum(tab.Name,tab.id) % (select count(*) from titles)) = t.rn
;
Of course, this will still return different results when the number of titles changes. (As an alternative to row_number, you could add an ID column to titles.)
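The hash-and-modulo idea is not specific to SQL Server's checksum. Here is a minimal sketch of the same technique in Python, assuming zlib.crc32 as the stable hash (an illustrative substitute, not what checksum computes) and a helper name, title_for, invented for this example.

```python
import zlib

# Sorted so the index of each title does not depend on insertion order.
TITLES = sorted(["Captain", "Mr.", "Ms.", "Prince", "King", "Queen",
                 "Lieutenant", "Doctor", "Sir"])

PEOPLE = [(1, "Bart Smit"), (2, "Ahmed Lissabon"),
          (3, "Medina Aziz"), (4, "Ben Joeson")]


def title_for(person_id, name, titles=TITLES):
    """Pick a title from a stable hash of the row, so reruns agree."""
    key = f"{person_id}:{name}".encode("utf-8")
    return titles[zlib.crc32(key) % len(titles)]


assigned = [(pid, title_for(pid, name), name) for pid, name in PEOPLE]
for row in assigned:
    print(row)
```

Python's built-in hash() is deliberately avoided here because string hashes are randomized per process by default, which would defeat the repeatability. As in the SQL version, changing the number of titles changes the modulus and therefore the assignments.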

How to increment a value in SQL based on a unique key

Apologies in advance if some of the trigger solutions already cover this but I can't get them to work for my scenario.
I have a table of over 50,000 rows, all of which have an ID, with roughly 5000 distinct ID values. There could be 100 rows with an instrumentID = 1 and 50 with an instrumentID = 2 within the table etc but they will have slightly different column entries. So I could write a
SELECT * from tbl WHERE instrumentID = 1
and have it return 100 rows (I know this is easy stuff but just to be clear)
What I need to do is form an incrementing value for each time a instrument ID is found, so I've tried stuff like this:
IntIndex INT IDENTITY(1,1),
dDateStart DATE,
IntInstrumentID INT,
IntIndex1 AS IntInstrumentID + IntIndex,
at the table create step.
However, I need IntIndex1 to increment each time a given instrumentID is found, irrespective of where the record sits in the table, so that the last IntIndex1 value for an instrument effectively gives a count of its records. The definition above instead increments across all rows of the table irrespective of the instrumentID, so you would get 5001, 4002, 4003, etc.
An example would be: for intInstruments 5000 and 4000
intInstrumentID | IntIndex1
--------- ------------------
5000 | 5001
5000 | 5002
4000 | 4001
5000 | 5003
4000 | 4002
The reason I need to do this is because I need to join two tables based on these values (a start and end date for each instrumentID). I have tried GROUP BY etc but this can't work in both tables and the JOIN then doesn't work.
Many thanks

I'm not entirely sure I understand your problem, but if you just need IntIndex1 to join to, could you just join to the following query, rather than trying to actually keep the calculated value in the database:
SELECT *,
intInstrumentID + RANK() OVER(PARTITION BY intInstrumentID ORDER BY dDateStart ASC) AS IntIndex1
FROM tbl
Edit: If I understand your comment correctly (which is not certain!), then presumably you know that your end date and start date tables have the exact same number of rows, which gives a one-to-one mapping between them based on their respective dates within each instrument ID?
If that's the case then maybe this join is what you are looking for:
SELECT SD.intInstrumentID, SD.dDateStart, ED.dEndDate
FROM
(
SELECT intInstrumentID,
dDateStart,
RANK() OVER(PARTITION BY intInstrumentID ORDER BY dDateStart ASC) AS IntIndex1
FROM tblStartDate
) SD
JOIN
(
SELECT intInstrumentID,
dEndDate,
RANK() OVER(PARTITION BY intInstrumentID ORDER BY dEndDate ASC) AS IntIndex1
FROM tblEndDate
) ED
ON SD.intInstrumentID = ED.intInstrumentID
AND SD.IntIndex1 = ED.IntIndex1
If not, please will you post some example data for both tables and the expected results?
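To make the pairing idea concrete, here is a sketch that numbers the rows per instrument in each table and joins on that number. It runs on Python's sqlite3 module (window functions need SQLite 3.25 or newer) rather than SQL Server. The table name tblEndDate and all of the dates are assumptions for illustration, and ROW_NUMBER() is used in place of RANK() so that tied dates still get distinct numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblStartDate (intInstrumentID INTEGER, dDateStart TEXT)")
conn.execute("CREATE TABLE tblEndDate (intInstrumentID INTEGER, dEndDate TEXT)")
conn.executemany("INSERT INTO tblStartDate VALUES (?, ?)", [
    (5000, "2020-01-01"), (5000, "2020-03-01"), (4000, "2020-02-01"),
])
conn.executemany("INSERT INTO tblEndDate VALUES (?, ?)", [
    (5000, "2020-02-15"), (5000, "2020-04-15"), (4000, "2020-02-20"),
])

# Number the rows within each instrument, then match the n-th start to the n-th end.
pairs = conn.execute(
    """
    SELECT SD.intInstrumentID, SD.dDateStart, ED.dEndDate
    FROM (SELECT intInstrumentID, dDateStart,
                 ROW_NUMBER() OVER (PARTITION BY intInstrumentID
                                    ORDER BY dDateStart) AS IntIndex1
          FROM tblStartDate) SD
    JOIN (SELECT intInstrumentID, dEndDate,
                 ROW_NUMBER() OVER (PARTITION BY intInstrumentID
                                    ORDER BY dEndDate) AS IntIndex1
          FROM tblEndDate) ED
      ON SD.intInstrumentID = ED.intInstrumentID
     AND SD.IntIndex1 = ED.IntIndex1
    ORDER BY SD.intInstrumentID, SD.dDateStart
    """
).fetchall()
conn.close()

for row in pairs:
    print(row)
```

This only gives sensible pairs if both tables hold the same number of rows per instrument, which is the assumption stated in the answer.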

SQL: How to get the AVG(MIN(number))?

I am looking for the AVERAGE (overall) of the MINIMUM number (grouped by person).
My table looks like this:
Rank Name
1 Amy
2 Amy
3 Amy
2 Bart
1 Charlie
2 David
5 David
1 Ed
2 Frank
4 Frank
5 Frank
I want to know the AVERAGE of the lowest scores. For these people, the lowest scores are:
Rank Name
1 Amy
2 Bart
1 Charlie
2 David
1 Ed
2 Frank
Giving me a final answer of 1.5 - because three people have a MIN(Rank) of 1 and the other three have a MIN(Rank) of 2. That's what I'm looking for - a single number.
My real data has a couple hundred rows, so it's not terribly big. But I can't figure out how to do this in a single, simple statement. Thank you for any help.

Try this:
;WITH MinScores
AS
(
SELECT
"Rank",
Name,
ROW_NUMBER() OVER(PARTITION BY Name ORDER BY "Rank") row_num
FROM Table1
)
SELECT
CAST(SUM("Rank") AS DECIMAL(10, 2)) /
COUNT("Rank")
FROM MinScores
WHERE row_num = 1;

Selecting the set of minimum values is straightforward. The cast() is necessary to avoid integer division later. You could also avoid integer division by casting to float instead of decimal. (But you should be aware that floats are "useful approximations".)
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
Now you can use the minimums as a common table expression, and select from it.
with minimums as (
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
)
select avg(min_rank) avg_min_rank
from minimums
If you happen to need to do the same thing on a platform that doesn't support common table expressions, you can a) create a view of minimums, and select from that view, or b) use the minimums as a derived table.
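The derived-table variant can be checked against the sample data from the question. This sketch uses Python's sqlite3 module as a stand-in for the original database; note that SQLite's avg() already returns a floating-point value, so the cast that avoids integer division elsewhere is not needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 ("Rank" INTEGER, Name TEXT)')
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", [
    (1, "Amy"), (2, "Amy"), (3, "Amy"), (2, "Bart"), (1, "Charlie"),
    (2, "David"), (5, "David"), (1, "Ed"),
    (2, "Frank"), (4, "Frank"), (5, "Frank"),
])

# Inner query: one minimum per person. Outer query: average of those minimums.
avg_min_rank = conn.execute(
    """
    SELECT AVG(min_rank)
    FROM (SELECT Name, MIN("Rank") AS min_rank
          FROM Table1
          GROUP BY Name) AS minimums
    """
).fetchone()[0]
conn.close()

print(avg_min_rank)
```

The six per-person minimums are 1, 2, 1, 2, 1, 2, which average to the 1.5 the question expects.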

You might try using a derived table to get the minimums, then get the average minimum in the outer query, as in:
-- Get the avg min rank as a decimal
select avg(MinRank * 1.0) as AvgRank
from (
-- Get everyone's min rank
select min([Rank]) as MinRank
from MyTable
group by Name
) as a

I think the easiest approach is the following.
For the max:
select name , max_rank = max(rank)
from table
group by name;
For the average:
select name , avg_rank = avg(rank)
from table
group by name;

write a query to identify discrepancy

I have a table with student IDs and student names. There have been issues with assigning unique student IDs to students, and hence I want to find the duplicates.
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display count of duplicate records.
For example, "Duplicate_Count" would display "3" for Student ID = 1, and so on. How can I do this?
Thanks in advance

Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc

Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The derived table gives the count for each StudentId; it is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold DupCount, this query can be used in an update statement to populate that column.
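Here is a small sketch of the join-back query run against the sample table from the question. It uses Python's sqlite3 module, and the table name Students is an assumption, since the original uses the placeholder Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentId INTEGER, StudentName TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?)", [
    (1, "Jack"), (1, "John"), (1, "Bill"), (2, "Amanda"), (2, "Molly"),
    (3, "Ron"), (4, "Matt"), (5, "James"), (6, "Kathy"), (6, "Will"),
])

# Count rows per StudentId, then attach that count to every student record.
rows = conn.execute(
    """
    SELECT aa.StudentId, aa.StudentName, bb.DupCount
    FROM Students AS aa
    JOIN (SELECT StudentId, COUNT(*) AS DupCount
          FROM Students
          GROUP BY StudentId) AS bb
      ON aa.StudentId = bb.StudentId
    ORDER BY aa.StudentId, aa.StudentName
    """
).fetchall()
conn.close()

dup_count_by_name = {name: count for _, name, count in rows}
print(dup_count_by_name)
```

Every student keeps their own row, and IDs that are used only once simply get a count of 1.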

This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by @HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally intended to use it for. I'm leaving the answer in case it helps someone else.
