How to delete one random row at a time in Oracle database - sql

I am working on a project where I need to delete a random row from an Oracle database. Which query should I use for that?

Well obviously it's a DELETE statement. It's the random part which is tricky.
Oracle has a PL/SQL package to handle randomness, DBMS_RANDOM. If the table is small or performance is not critical this will work:
delete from your_table
where id = ( select id from
( select id from your_table
order by dbms_random.value)
where rownum = 1)
/
The innermost query sorts the table in a random order. The sub-query selects the top row from that set, and that's the row which gets deleted.
Alternatively, if you know how many records you have...
delete from your_table
where id = ( select round(dbms_random.value(1,10000))
from dual )

Related

Fetch No oF Rows that can be returned by select query

I'm trying to fetch data and showing in a table with pagination. so I use limit and offset for that but I also need to show no of rows that can be fetched from that query. Is there any way to get that.
I tried
resultset.last() and getRow()
select count(*) from(query) myNewTable;
These two cases i'm getting correct answer but is it correct way to do this. Performance is a concern
We can get the limited records using below code,
First, we need to set how many records we want like below,
var limit = 10;
After that sent this limit to the below statement
WITH
Temp AS(
SELECT
ROW_NUMBER() OVER( primayKey DESC ) AS RowNumber,
*
FROM
myNewTable
),
Temp2 AS(
SELECT COUNT(*) AS TotalCount FROM Temp
)
SELECT TOP limit * FROM Temp, Temp2 WHERE RowNumber > :offset order by RowNumber
This is run in both MSSQL and MySQL
There is no easy way of doing this.
1. As you found out, it usually boils down to executing 2 queries:
Executing SELECT with limit and offset in order to fetch the data that you need.
Executing a COUNT(*) in order to count the total number of pages.
This approach might work for tables that don't have a lot of rows, or when you filter the data (int the COUNT and SELECT queries) on a column that is indexed.
2. If your table is large, but the data that you need to show represents smaller percentage of the data from the table and the data shares a common trait (for example, the data in all of your pages is created on a single day) you can use partitioning. Executing COUNT and SELECT on a single partition will be way more faster than executing them on the whole table.
3. You can create another table which will store the value of the COUNT query.
For example, lets say that your big_table table looks like this:
id | user_id | timestamp_column | text_column | another_text_column
Now, your SELECT query looks like this:
SELECT * FROM big_table WHERE user_id = 4 ORDER BY timestamp_column LIMIT 20 OFFSET 20;
And your count query:
SELECT COUNT(*) FROM table WHERE user_id = 4;
You could create a count_table that will have the following format:
user_id | count
Once you fill this table with the current data in the system, you will create a trigger which will update this table on every insert or update of the big_table.
This way, the count query will be really fast, because it will be executed on the count_table, for example:
SELECT count FROM count_table WHERE user_id = 4
The drawback of this approach is that the insert in the big_table will be slower, since the trigger will fire and update the count_table on every insert.
This are the approaches that you can try but in the end it all depends on the size and type of your data.

Delete a specific range of rows from table in sybase database

I need to delete specific range of rows from table in sybase database.
For example , I have 10 rows. but I want to delete 5 to 7 row from my table.
how can we delete .
DELETE FROM someTable
WHERE myPrimaryKey in ( 5, 6, 7 )
In other words: You cannot just tell the DBMS to delete "the 5th record" from the table. You will have to construct a where clause that matches exactly those rows you want deleted.
When you think about it, this makes sense, because the DBMS does not guarantee any internal order of the records in a table, that is, it has no intrinsic concept of what the n-th record is.
By default a table has no predetermined order so you have to define the order to delete 5-7th rows. You can try to use ROW_NUMBER() to define the order in the table to count rows. For instance if you have a primary key ID you can order by this column:
WITH T1 AS
(
SELECT T.*,
ROW_NUMBER() OVER (ORDER BY ID) as RowNum
FROM T
)
DELETE FROM T1 WHERE RowNum between 5 and 7
SQLFiddle demo

How to retrieve the last 2 records from table?

I have a table with n number of records
How can i retrieve the nth record and (n-1)th record from my table in SQL without using derived table ?
I have tried using ROWID as
select * from table where rowid in (select max(rowid) from table);
It is giving the nth record but i want the (n-1)th record also .
And is there any other method other than using max,derived table and pseudo columns
Thanks
You cannot depend on rowid to get you to the last row in the table. You need an auto-incrementing id or creation time to have the proper ordering.
You can use, for instance:
select *
from (select t.*, row_number() over (order by <id> desc) as seqnum
from t
) t
where seqnum <= 2
Although allowed in the syntax, the order by clause in a subquery is ignored (for instance http://docs.oracle.com/javadb/10.8.2.2/ref/rrefsqlj13658.html).
Just to be clear, rowids have nothing to do with the ordering of rows in a table. The Oracle documentation is quite clear that they specify a physical access path for the data (http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#i6732). It is true that in an empty database, inserting records into a newtable will probably create a monotonically increasing sequence of row ids. But you cannot depend on this. The only guarantees with rowids are that they are unique within a table and are the fastest way to access a particular row.
I have to admit that I cannot find good documentation on Oracle handling or not handling order by's in subqueries in its most recent versions. ANSI SQL does not require compliant databases to support order by in subqueries. Oracle syntax allows it, and it seems to work in some cases, at least. My best guess is that it would probably work on a single processor, single threaded instance of Oracle, or if the data access is through an index. Once parallelism is introduced, the results would probably not be ordered. Since I started using Oracle (in the mid-1990s), I have been under the impression that order bys in subqueries are generally ignored. My advice would be to not depend on the functionality, until Oracle clearly states that it is supported.
select * from (select * from my_table order by rowid) where rownum <= 2
and for rows between N and M:
select * from (
select * from (
select * from my_table order by rowid
) where rownum <= M
) where rownum >= N
Try this
select top 2 * from table order by rowid desc
Assuming rowid as column in your table:
SELECT * FROM table ORDER BY rowid DESC LIMIT 2

Deleting Duplicate Records from a Table

I Have a table called Table1 which has 48 records. Out of which only 24 should be there in that table. For some reason I got duplicate records inserted into it. How do I delete the duplicate records from that table.
Here's something you might try if SQL Server version is 2005 or later.
WITH cte AS
(
SELECT {list-of-columns-in-table},
row_number() over (PARTITION BY {list-of-key-columns} ORDER BY {rule-to-determine-row-to-keep}) as sequence
FROM myTable
)
DELETE FROM cte
WHERE sequence > 1
This uses a common table expression (CTE) and adds a sequence column. {list-of-columns-in-table} is just as it states. Not all columns are needed, but I won't explain here.
The {list-of-key-columns] is the columns that you use to define what is a duplicate.
{rule-to-determine-row-to-keep} is a sequence so that the first row is the row to keep. For example, if you want to keep the oldest row, you would use a date column for sequence.
Here's an example of the query with real columns.
WITH cte AS
(
SELECT ID, CourseName, DateAdded,
row_number() over (PARTITION BY CourseName ORDER BY DateAdded) as sequence
FROM Courses
)
DELETE FROM cte
WHERE sequence > 1
This example removes duplicate rows based on the CoursName value and keeps the oldest basesd on the DateAdded value.
http://support.microsoft.com/kb/139444
This section is the key. The primary point you should take away. ;)
This article discusses how to locate
and remove duplicate primary keys from
a table. However, you should closely
examine the process which allowed the
duplicates to happen in order to
prevent a recurrence.
Identify your records by grouping data by your logical keys, since you obviously haven't defined them, and applying a HAVING COUNT(*) > 1 statement at the end. The article goes into this in depth.
This is an easier way
Select * Into #TempTable FROM YourTable
Truncate Table YourTable
Insert into YourTable Select Distinct * from #TempTable
Drop Table #TempTable

Delete all but top n from database table in SQL

What's the best way to delete all rows from a table in sql but to keep n number of rows on the top?
DELETE FROM Table WHERE ID NOT IN (SELECT TOP 10 ID FROM Table)
Edit:
Chris brings up a good performance hit since the TOP 10 query would be run for each row. If this is a one time thing, then it may not be as big of a deal, but if it is a common thing, then I did look closer at it.
I would select ID column(s) the set of rows that you want to keep into a temp table or table variable. Then delete all the rows that do not exist in the temp table. The syntax mentioned by another user:
DELETE FROM Table WHERE ID NOT IN (SELECT TOP 10 ID FROM Table)
Has a potential problem. The "SELECT TOP 10" query will be executed for each row in the table, which could be a huge performance hit. You want to avoid making the same query over and over again.
This syntax should work, based what you listed as your original SQL statement:
create table #nuke(NukeID int)
insert into #nuke(Nuke) select top 1000 id from article
delete article where not exists (select 1 from nuke where Nukeid = id)
drop table #nuke
Future reference for those of use who don't use MS SQL.
In PostgreSQL use ORDER BY and LIMIT instead of TOP.
DELETE FROM table
WHERE id NOT IN (SELECT id FROM table ORDER BY id LIMIT n);
MySQL -- well...
Error -- This version of MySQL does not yet support 'LIMIT &
IN/ALL/ANY/SOME subquery'
Not yet I guess.
Here is how I did it. This method is faster and simpler:
Delete all but top n from database table in MS SQL using OFFSET command
WITH CTE AS
(
SELECT ID
FROM dbo.TableName
ORDER BY ID DESC
OFFSET 11 ROWS
)
DELETE CTE;
Replace ID with column by which you want to sort.
Replace number after OFFSET with number of rows which you want to keep.
Choose DESC or ASC - whatever suits your case.
I think using a virtual table would be much better than an IN-clause or temp table.
DELETE
Product
FROM
Product
LEFT OUTER JOIN
(
SELECT TOP 10
Product.id
FROM
Product
) TopProducts ON Product.id = TopProducts.id
WHERE
TopProducts.id IS NULL
This really is going to be language specific, but I would likely use something like the following for SQL server.
declare #n int
SET #n = SELECT Count(*) FROM dTABLE;
DELETE TOP (#n - 10 ) FROM dTable
if you don't care about the exact number of rows, there is always
DELETE TOP 90 PERCENT FROM dTABLE;
I don't know about other flavors but MySQL DELETE allows LIMIT.
If you could order things so that the n rows you want to keep are at the bottom, then you could do a DELETE FROM table LIMIT tablecount-n.
Edit
Oooo. I think I like Cory Foy's answer better, assuming it works in your case. My way feels a little clunky by comparison.
I would solve it using the technique below. The example expect an article table with an id on each row.
Delete article where id not in (select top 1000 id from article)
Edit: Too slow to answer my own question ...
Refactored?
Delete a From Table a Inner Join (
Select Top (Select Count(tableID) From Table) - 10)
From Table Order By tableID Desc
) b On b.tableID = A.tableID
edit: tried them both in the query analyzer, current answer is fasted (damn order by...)
Better way would be to insert the rows you DO want into another table, drop the original table and then rename the new table so it has the same name as the old table
I've got a trick to avoid executing the TOP expression for every row. We can combine TOP with MAX to get the MaxId we want to keep. Then we just delete everything greater than MaxId.
-- Declare Variable to hold the highest id we want to keep.
DECLARE #MaxId as int = (
SELECT MAX(temp.ID)
FROM (SELECT TOP 10 ID FROM table ORDER BY ID ASC) temp
)
-- Delete anything greater than MaxId. If MaxId is null, there is nothing to delete.
IF #MaxId IS NOT NULL
DELETE FROM table WHERE ID > #MaxId
Note: It is important to use ORDER BY when declaring MaxId to ensure proper results are queried.