Delete overflow records per ID - SQL

I need to have 4 records or fewer per ID. Some IDs have more than 4 records. I need to delete the records above the limit so that each ID has at most 4 records.
I tried many things, but the only solution I found deletes all the records when an ID has more than 4. This is my code for getting the number of records per ID:
select count(voorwerpnummer) AS plaatjes, voorwerpnummer
from Illustraties INNER JOIN items
ON Illustraties.itemID = items.ID
INNER JOIN tbl_voorwerp
ON items.ID = tbl_voorwerp.voorwerpnummer
group by voorwerpnummer
order by plaatjes DESC
I have this statement to delete the extra records for a single itemID:
DELETE FROM illustraties
WHERE plaatjefile NOT IN (select top 4 plaatjefile from illustraties where itemID = 110769395358)
AND itemID = 110769395358
Now I need to iterate through all the itemIDs that have more than 4 records.
This is how to get all the itemIDs with more than 4 records:
WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY itemID ORDER BY itemID) AS rn
FROM illustraties
)
SELECT distinct ItemID
FROM cte
WHERE rn > 4
Can anyone write me a function or something similar that goes through all those itemIDs and executes that delete statement?
Or write a query that adds a row number per ID?
For example: one ID has 5 records, so those 5 records get the numbers 1 to 5. The next ID has 8 records, so those 8 records get the numbers 1 to 8.
That way I can delete the records that have a row number of 5 or higher.

It is little known that you can delete from a CTE or derived table:
WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY itemID ORDER BY itemID) AS rn
FROM illustraties
)
DELETE cte
FROM cte
WHERE rn > 4
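I just changed one line. Your ORDER BY should really create a total order. ORDER BY itemID ties for every row inside a partition, so SQL Server has some freedom in choosing which rows get the numbers above 4. The rows it deletes might be arbitrary and not the ones you want to lose.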
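The point about a total order can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and it assumes plaatjefile is unique within an itemID (the question does not confirm this). SQLite cannot delete through a CTE, so the ranked rows are matched back through rowid instead; the sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE illustraties (itemID INTEGER, plaatjefile TEXT)")
# Hypothetical data: item 1 has 6 rows, item 2 has 3 rows.
rows = [(1, f"a{i}.jpg") for i in range(6)] + [(2, f"b{i}.jpg") for i in range(3)]
conn.executemany("INSERT INTO illustraties VALUES (?, ?)", rows)

# Rank the rows per itemID, then delete every row ranked above 4.
# ORDER BY plaatjefile is assumed unique per item, so the ranking is deterministic.
conn.execute("""
    DELETE FROM illustraties
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY itemID
                       ORDER BY plaatjefile
                   ) AS rn
            FROM illustraties
        )
        WHERE rn > 4
    )
""")

counts = dict(conn.execute(
    "SELECT itemID, COUNT(*) FROM illustraties GROUP BY itemID"
))
kept_for_item_1 = [r[0] for r in conn.execute(
    "SELECT plaatjefile FROM illustraties WHERE itemID = 1 ORDER BY plaatjefile"
)]
```

With a tie-breaking column in the ORDER BY, the same four rows per item survive on every run, which is the behaviour the answer recommends aiming for.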

I assume you need to delete records from the Illustraties table. Could you share the table structure, including the unique key fields for that table?

Related

SQL query to combine Select duplicates with count and grouping with delete based on Top but not the top 1 of each duplicate

I am looking to combine these 2 statements into one to run as a stored procedure, if possible.
I have not used temp tables in queries before and may have to with this; I'm not sure, so I'm asking for advice.
I did not write the original queries. I manually run the first one, which returns a table listing IDs with duplicate data and how many records each has. Then each record ID is put into the 2nd query to remove all but the TOP 1 based on additional filtering criteria.
I have looked at using a CTE, as in "SQL select into delete DIRECTLY", but am still at a loss on how to pass each result row's ID value into the delete query.
The queries, edited for public consumption, are:
SELECT id, count(*) FROM [DEV].[dbo].[7dtest] where FileVer = 1 and CALC_DATE > FORMAT(DATEADD(DD,-7,GETDATE()), 'yyyy-MM-dd') group by id having count(*) > 1 order by count(*) desc
returns a table with id and number of duplicate rows
then take the id of each row and put into this delete statement
delete from [DEV].[dbo].[7dtest] where AutoID not in (
SELECT TOP 1 AutoID FROM [DEV].[dbo].[7dtest] where FileVer = 1 and id = '123' and CALC_DATE > FORMAT(DATEADD(DD,-7,GETDATE()), 'yyyy-MM-dd')
order by COMPLETED_DATE_CHECK_3 desc, COMPLETED_DATE_CHECK_2 desc, COMPLETED_DATE_CHECK_1 desc)
and FileVer = 1 and id = '123' and CALC_DATE > FORMAT(DATEADD(DD,-7,GETDATE()), 'yyyy-MM-dd')
Can this be done with CTE or do I need to create a temp table and some looping to get the ID one row at a time? Is there a better way I should be doing this?
TIA
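One possible set-based shape for this, without a loop or temp table, is to rank the rows per id and delete everything ranked below the top. The following is only a sketch in Python's built-in sqlite3 module, not SQL Server: the table name dup_test and the single completed column are hypothetical stand-ins for [7dtest] and the three COMPLETED_DATE_CHECK columns, and the CALC_DATE filter is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dup_test (
        AutoID INTEGER PRIMARY KEY,
        id TEXT,
        FileVer INTEGER,
        completed TEXT
    )
""")
conn.executemany(
    "INSERT INTO dup_test (AutoID, id, FileVer, completed) VALUES (?, ?, ?, ?)",
    [
        (1, "123", 1, "2024-01-01"),
        (2, "123", 1, "2024-01-03"),  # newest row for id 123: kept
        (3, "123", 1, "2024-01-02"),
        (4, "456", 1, "2024-01-05"),  # only row for id 456: kept
        (5, "123", 2, "2024-01-01"),  # FileVer 2 is outside the filter: untouched
    ],
)

# Rank the filtered rows per id, newest first, and delete all but rank 1.
# AutoID is added as a tie-breaker so the ranking is deterministic.
conn.execute("""
    WITH ranked AS (
        SELECT AutoID,
               ROW_NUMBER() OVER (
                   PARTITION BY id
                   ORDER BY completed DESC, AutoID DESC
               ) AS rn
        FROM dup_test
        WHERE FileVer = 1
    )
    DELETE FROM dup_test
    WHERE AutoID IN (SELECT AutoID FROM ranked WHERE rn > 1)
""")

kept = [r[0] for r in conn.execute("SELECT AutoID FROM dup_test ORDER BY AutoID")]
```

Because the filter lives inside the ranking query, ids with a single matching row get rank 1 and are never deleted, so the separate "find the duplicates" query is no longer needed in this sketch.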

How to write a sql microsoft access query that picks 20 random records out of 100 but filter based on record categories?

I need a sql query that will randomly pick 20 records from a table that contains about 100 records. Each record has an associated category that goes from 1 to 15. I want the records that are picked to be completely random. However, I can't have 3 records from the same category being picked.
It seems to me that I could randomly pick 20 records, eliminate the records whose category appears 3 or more times, and then pick again. But all this implies having more than one query, and I don't know how to pass the results of one query to another, and then another, in a Microsoft Access query. The query results are supposed to serve as the control source for a form. What do I do so that just one query gives me results that can then be used as the control source for the form?
I tried the following, and the problem is that the questions from the same category are grouped together, which is not what I want. Here's a sample of what I am trying:
(SELECT TOP 3 MCQuestionsT.QuestionID, MCQuestionsT.QuestionText, MCQuestionsT.CategoryID
FROM MCQuestionsT
WHERE (((MCQuestionsT.CourseCode)="2323") AND MCQuestionsT.CategoryID = 1)
ORDER BY Rnd(MCQuestionsT.QuestionID))
UNION ALL
(SELECT TOP 3 MCQuestionsT.QuestionID, MCQuestionsT.QuestionText, MCQuestionsT.CategoryID
FROM MCQuestionsT
WHERE (((MCQuestionsT.CourseCode)="2323") AND MCQuestionsT.CategoryID = 2)
ORDER BY Rnd(MCQuestionsT.QuestionID))
UNION ALL
(SELECT TOP 3 MCQuestionsT.QuestionID, MCQuestionsT.QuestionText, MCQuestionsT.CategoryID
FROM MCQuestionsT
WHERE (((MCQuestionsT.CourseCode)="2323") AND MCQuestionsT.CategoryID = 3)
ORDER BY Rnd(MCQuestionsT.QuestionID))
-- example using sys.all_objects that returns three random objects of each type
SELECT type_desc, name
FROM (
SELECT type_desc, name, Id = ROW_NUMBER() OVER (PARTITION BY type_desc ORDER BY NEWID())
FROM sys.all_objects
) Q
WHERE Id < 4
-- example using your table
SELECT QuestionID, QuestionText, CategoryID
FROM (
SELECT QuestionID, QuestionText, CategoryID, Id = ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY NEWID())
FROM dbo.MCQuestionsT
WHERE CourseCode = '2323'
) Q
WHERE Id < 4
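The examples above cap each category at three random rows but do not limit the total to 20 or shuffle the result. A rough sketch of the full requirement (20 random questions, at most 2 per category, not grouped by category) follows. It uses Python's built-in sqlite3 module, not Access or SQL Server, so random() and LIMIT stand in for NEWID() and TOP; the data is made up and the CourseCode filter is omitted.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MCQuestionsT (QuestionID INTEGER PRIMARY KEY, CategoryID INTEGER)"
)
# Hypothetical data: 100 questions spread over categories 1 to 15.
conn.executemany(
    "INSERT INTO MCQuestionsT VALUES (?, ?)",
    [(q, q % 15 + 1) for q in range(1, 101)],
)

# Inner query: number the rows of each category in random order.
# Outer query: keep at most 2 per category, shuffle, and take 20.
picked = conn.execute("""
    SELECT QuestionID, CategoryID
    FROM (
        SELECT QuestionID, CategoryID,
               ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY random()) AS rn
        FROM MCQuestionsT
    )
    WHERE rn <= 2
    ORDER BY random()
    LIMIT 20
""").fetchall()

per_category = Counter(category for _, category in picked)
```

The final random ordering is what keeps questions from the same category from appearing next to each other, which was the complaint about the UNION ALL attempt.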

Selecting the newest records for 6 unique columns

I have a table of 6 currency conversions that is updated almost daily. Unfortunately, the software inserts new rows rather than updating the existing ones. My previous SELECT was as follows:
SELECT FROM_CURRENCY_ID, XCHG_RATE
FROM
(
SELECT TOP 6 FROM_CURRENCY_ID, XCHG_RATE
FROM SHARED_CURRENCY_EXCHANGE
WHERE NOT FROM_CURRENCY_ID = 'CAD'
ORDER BY RECORD_CREATED desc
) t
ORDER BY FROM_CURRENCY_ID
The issue now is that some records got updated while others didn't, so my query returns duplicate values for one of the currencies and nothing for another. I need it to output the 6 unique FROM_CURRENCY_IDs and their XCHG_RATE with the newest RECORD_CREATED dates.
I've been trying a GROUP BY to exclude the duplicate rows, with no luck.
with x as
(select row_number() over(partition by from_currency_id order by record_created desc) rn, * from shared_currency_exchange)
select from_currency_id, xchg_rate from x
where rn = 1
Within each from_currency_id, this gives the most recent record row number 1, so filtering the CTE on rn = 1 returns only the newest rate per currency.
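To see the newest-row-per-currency idea run end to end, here is a small sketch using Python's built-in sqlite3 module instead of SQL Server. The rates and dates are invented, and the CAD exclusion and final ORDER BY are carried over from the question's original query rather than from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE shared_currency_exchange (
        from_currency_id TEXT, xchg_rate REAL, record_created TEXT
    )
""")
conn.executemany(
    "INSERT INTO shared_currency_exchange VALUES (?, ?, ?)",
    [
        ("USD", 1.30, "2024-01-01"),
        ("USD", 1.35, "2024-01-03"),  # newest USD row
        ("EUR", 1.45, "2024-01-02"),  # newest EUR row
        ("EUR", 1.50, "2024-01-01"),
        ("CAD", 1.00, "2024-01-03"),  # excluded by the filter
    ],
)

# Number each currency's rows newest first, then keep only row 1.
latest = conn.execute("""
    WITH x AS (
        SELECT from_currency_id, xchg_rate,
               ROW_NUMBER() OVER (
                   PARTITION BY from_currency_id
                   ORDER BY record_created DESC
               ) AS rn
        FROM shared_currency_exchange
        WHERE from_currency_id <> 'CAD'
    )
    SELECT from_currency_id, xchg_rate
    FROM x
    WHERE rn = 1
    ORDER BY from_currency_id
""").fetchall()
```

Unlike TOP 6 over the whole table, this returns one row per currency no matter how unevenly the currencies were updated.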

How to find first duplicate row in a table sql server

I am working on SQL Server. I have a table that contains around 75000 records, several of which are duplicates. I wrote a query to find out how many times each record is repeated:
SELECT [RETAILERNAME],COUNT([RETAILERNAME]) as Repeated FROM [Stores] GROUP BY [RETAILERNAME]
It gives me result like,
---------------------------
RETAILERNAME | Repeated
---------------------------
X | 4
---------------------------
Y | 6
---------------------------
Z | 10
---------------------------
Of the 4 records for X, I need to take only the first one.
So I want to retrieve all fields from the first row of each set of duplicate records. That is, selecting all records where RETAILERNAME='X' returns a number of duplicate records, and I need only the first row among them.
Please guide me.
You could try using ROW_NUMBER.
Something like
;WITH Vals AS (
SELECT [RETAILERNAME], -- add the other columns you need here
ROW_NUMBER() OVER(PARTITION BY [RETAILERNAME] ORDER BY [RETAILERNAME]) RowID
FROM [Stores]
)
SELECT *
FROM Vals
WHERE RowID = 1
You can then also remove the duplicates if need be (but be careful: this is permanent):
;WITH Vals AS (
SELECT [RETAILERNAME],
ROW_NUMBER() OVER(PARTITION BY [RETAILERNAME] ORDER BY [RETAILERNAME]) RowID
FROM Stores
)
DELETE
FROM Vals
WHERE RowID > 1;
You can write the query as follows:
SELECT TOP 1 * FROM [Stores] GROUP BY [RETAILERNAME]
HAVING your condition
WITH cte
AS (SELECT [retailername],
Row_number()
OVER(
partition BY [retailername]
ORDER BY [retailername]) AS RowRank
FROM [Stores])
SELECT *
FROM cte
WHERE RowRank = 1

Update a field for a specific # of records in SQL Server 2005

Say I want 3 records flagged for each product in my table. If some products have only 1 or 2 records flagged, or none at all, how can I randomly flag additional records to bring the total up to 3 per product?
Ex:
1 record gets flagged for Product_A, 2 records get flagged for Product_B and 3 records get flagged for Product_C.
Once script is complete, I need 2 more records flagged for Product_A and 1 more for Product_B.
This can be a loop or a cte or whatever is the most efficient way to do this in sql. Thanks!
Here's one way to do it:
;with SelectedIds as(
select
Id,
row_number() over (
partition by ProductCode -- distinct numbering for each Product Code
order by newid() -- random
) as rowno
from ProductLines
)
update p
set IsFlagged = 1
from ProductLines p
join SelectedIds s
on p.id = s.id and
s.rowno <= 3 -- limit to 3 records / product code
;
Here's a full sample, including some test data: http://www.sqlfiddle.com/#!3/3bee1/6
Use row_number() in a derived table, partitioned by Product and ordered so that the rows that already have flags come first and the rest follow in random order. If randomness is not a requirement, you can simply remove newid() from the query.
Then set the flag on the rows numbered 1 to 3 that are not already flagged.
update T
set Flag = 1
from (
select Flag,
row_number() over(partition by Product
order by Flag desc, newid()) as rn
from YourTable
) as T
where T.rn <= 3 and
T.Flag = 0
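The top-up logic can be checked on a small data set with the sketch below. It uses Python's built-in sqlite3 module, not SQL Server 2005, with random() in place of newid(); the Id column and the sample rows are assumptions. SQLite cannot update through a derived table, so the sketch first selects the Ids to flag and then updates them in a second step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (Id INTEGER PRIMARY KEY, Product TEXT, Flag INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable (Product, Flag) VALUES (?, ?)",
    [("A", 1)] + [("A", 0)] * 4            # Product A: 1 of 5 rows already flagged
    + [("B", 1)] * 2 + [("B", 0)] * 3      # Product B: 2 of 5 rows already flagged
    + [("C", 0)] * 2,                      # Product C: only 2 rows, none flagged
)

# Flagged rows sort first, the rest in random order. Rows numbered 1 to 3
# that are still unflagged are the ones needed to reach 3 per product.
to_flag = conn.execute("""
    SELECT Id FROM (
        SELECT Id, Flag,
               ROW_NUMBER() OVER (
                   PARTITION BY Product
                   ORDER BY Flag DESC, random()
               ) AS rn
        FROM YourTable
    )
    WHERE rn <= 3 AND Flag = 0
""").fetchall()
conn.executemany("UPDATE YourTable SET Flag = 1 WHERE Id = ?", to_flag)

flagged = dict(conn.execute(
    "SELECT Product, SUM(Flag) FROM YourTable GROUP BY Product"
))
```

Product C ends up with only 2 flagged rows because it has only 2 rows in total; products that already have 3 flags would be left unchanged.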