How can we delete one of two identical values from a table? - sql

Is it possible to do this? I have a table with two rows and one column. Both rows have the same value, and there is no primary key. Can we delete just one row?

Here's one way to do it with ROW_NUMBER() and a common table expression:
with cte as (
select *,
row_number() over (partition by id order by id) rn
from yourtable)
delete from cte
where rn = 1;
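The ROW_NUMBER() idea above can be tried locally. This is only a sketch, assuming SQLite (3.25 or later, for window functions) through Python's sqlite3 module as a stand-in for SQL Server. SQLite cannot delete through a CTE, so the sketch swaps in the hidden rowid to identify the rows; the function name and the sample value are hypothetical.

```python
import sqlite3

def delete_one_duplicate(conn, table="yourtable", column="id"):
    """Delete the row numbered 1 within each group of equal values."""
    # Assumption: SQLite stand-in. The hidden rowid tells identical rows apart,
    # because SQLite does not allow DELETE against a CTE.
    conn.execute(f"""
        DELETE FROM {table}
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (PARTITION BY {column} ORDER BY {column}) AS rn
                FROM {table}
            )
            WHERE rn = 1
        )
    """)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER)")
conn.executemany("INSERT INTO yourtable VALUES (?)", [(7,), (7,)])

delete_one_duplicate(conn)
remaining = conn.execute("SELECT id FROM yourtable").fetchall()
print(remaining)  # [(7,)]
```

As in the answer, rn = 1 removes the first row of every group, including groups that have only one row. On a larger table, filtering on rn > 1 instead would keep exactly one row per value.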

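You can also use the TOP keyword, for example DELETE TOP (1) FROM yourtable in SQL Server.
Note that RANK() on its own will not work here: identical rows tie and receive the same rank, so ROW_NUMBER() is the function to use.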

You can get fancy and use a CTE to delete one row, but if the rows hold the same value (and the table is as simple as you describe), you can also delete both and add one back. Much simpler.
Surrogate Key anyone?
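The "delete both and add one back" suggestion could look like the following. It is a minimal sketch, assuming SQLite via Python's sqlite3; the column name val, the value 'abc', and the helper collapse_value are made up for illustration. Wrapping both statements in one transaction means the value is never lost if the insert fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (val TEXT)")
conn.executemany("INSERT INTO yourtable VALUES (?)", [("abc",), ("abc",)])

def collapse_value(conn, value):
    """Remove every copy of `value`, then insert a single copy back."""
    # "with conn" commits on success and rolls back if either statement fails.
    with conn:
        conn.execute("DELETE FROM yourtable WHERE val = ?", (value,))
        conn.execute("INSERT INTO yourtable (val) VALUES (?)", (value,))

collapse_value(conn, "abc")
rows = conn.execute("SELECT val FROM yourtable").fetchall()
print(rows)  # [('abc',)]
```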

Related

Handling duplicates in BigQuery (Nested Table)

I think this is a very simple question, but I would like some guidance. I don't want to drop the table and load a new one with the deduplicated records; I'd rather use something like DELETE FROM based on the query below. Is that possible in BigQuery? PS: This is a nested table!
SELECT
*
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY id, date_register) row_number
FROM
dataset.table)
WHERE
row_number = 1
order by id, date_register
To de-duplicate in place, without re-creating the table, use MERGE:
MERGE `temp.many_random` t
USING (
SELECT DISTINCT *
FROM `temp.many_random`
)
ON FALSE
WHEN NOT MATCHED BY SOURCE THEN DELETE
WHEN NOT MATCHED BY TARGET THEN INSERT ROW
It's simpler than the current accepted answer, as it won't ask you to match the current partitioning or clustering - it will just respect it.
Update: please also check Felipe Hoffa's answer which is simpler, and learn more on this post: BigQuery Deduplication.
You need to exclude row_number from output and overwrite your table using CREATE OR REPLACE TABLE:
CREATE OR REPLACE TABLE your_table
PARTITION BY DATE(date_register) AS
SELECT
* EXCEPT(row_number)
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY id, date_register) row_number
FROM your_table)
WHERE
row_number = 1
If you don't have a partition field defined on the source table, I recommend creating a new table with the partition field so that this query works and the process can be automated.
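The overwrite approach (keep only rows numbered 1, drop the helper column, replace the table) can be rehearsed outside BigQuery. This is a rough sketch, assuming SQLite via Python's sqlite3 as a stand-in: SQLite has no CREATE OR REPLACE TABLE, no partitioning, and no * EXCEPT, so the sketch builds a new table, lists the columns explicitly, and renames it. The table events and its payload column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER, date_register TEXT, payload TEXT);
    INSERT INTO events VALUES
        (1, '2020-01-01', 'a'),
        (1, '2020-01-01', 'a'),
        (2, '2020-01-02', 'b');
""")

def rebuild_deduplicated(conn):
    """Replace `events` with a copy holding one row per (id, date_register)."""
    conn.executescript("""
        BEGIN;
        -- Columns are listed by name so the helper column rn is left out.
        CREATE TABLE events_dedup AS
        SELECT id, date_register, payload
        FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY id, date_register) AS rn
            FROM events
        )
        WHERE rn = 1;
        DROP TABLE events;
        ALTER TABLE events_dedup RENAME TO events;
        COMMIT;
    """)

rebuild_deduplicated(conn)
deduplicated = conn.execute(
    "SELECT id, date_register, payload FROM events ORDER BY id"
).fetchall()
```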

Return only the newest rows from a BigQuery table with duplicate items

I have a table with many duplicate items – Many rows with the same id, perhaps with the only difference being a requested_at column.
I'd like to do a select * from the table, but only return one row with the same id – the most recently requested.
I've looked into group by id but then I need to do an aggregate for each column. This is easy with requested_at – max(requested_at) as requested_at – but the others are tough.
How do I make sure I get the value for title, etc that corresponds to that most recently updated row?
I suggest a similar form that avoids a sort in the window function:
SELECT *
FROM (
SELECT
*,
MAX(<timestamp_column>)
OVER (PARTITION BY <id_column>)
AS max_timestamp,
FROM <table>
)
WHERE <timestamp_column> = max_timestamp
Try something like this:
SELECT *
FROM (
SELECT
*,
ROW_NUMBER()
OVER (
PARTITION BY <id_column>
ORDER BY <timestamp_column> DESC)
row_number,
FROM <table>
)
WHERE row_number = 1
Note it will add a row_number column, which you might not want. To fix this, you can select individual columns by name in the outer select statement.
In your case, it sounds like the requested_at column is the one you want to use in the ORDER BY.
And, you will also want to use allow_large_results, set a destination table, and specify no flattening of results (if you have a schema with repeated fields).
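The two forms above differ when several rows share the newest timestamp. The sketch below illustrates that, assuming SQLite via Python's sqlite3 rather than BigQuery; the table requests and its sample rows are invented for the comparison. ROW_NUMBER() returns exactly one row per id (an arbitrary one on a tie), while the MAX() form returns every row that carries the maximum timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (id INTEGER, title TEXT, requested_at TEXT);
    INSERT INTO requests VALUES
        (1, 'old title', '2021-01-01 10:00:00'),
        (1, 'new title', '2021-01-02 10:00:00'),
        (2, 'only one',  '2021-01-01 09:00:00'),
        (3, 'tie a',     '2021-01-03 08:00:00'),
        (3, 'tie b',     '2021-01-03 08:00:00');
""")

# One row per id: the window sorts each partition, newest first.
LATEST_BY_ROW_NUMBER = """
    SELECT id, title, requested_at
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY requested_at DESC) AS rn
        FROM requests
    )
    WHERE rn = 1
    ORDER BY id
"""

# No sort inside the window, but tied rows all survive the filter.
LATEST_BY_MAX = """
    SELECT id, title, requested_at
    FROM (
        SELECT *, MAX(requested_at) OVER (PARTITION BY id) AS max_requested_at
        FROM requests
    )
    WHERE requested_at = max_requested_at
    ORDER BY id, title
"""

by_row_number = conn.execute(LATEST_BY_ROW_NUMBER).fetchall()
by_max = conn.execute(LATEST_BY_MAX).fetchall()
print(len(by_row_number), len(by_max))  # 3 4
```

Listing the columns by name in the outer SELECT also keeps the helper column out of the result, as the answer suggests.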

Update Oracle table column with row number

I want to update a table column with row number.
Each row in empid column should update with related row number.
I tried following query.
UPDATE employee SET empid = row_number();
But this is not working. Any idea?
First, this is not the correct syntax for the row_number() function, since you're missing the over clause (resulting in an ORA-30484 error). Even if it were, this would not work, as you cannot directly use window functions in a set clause (resulting in an ORA-30483 error).
For this use case, however, you could just use the rownum pseudo-column:
UPDATE employee SET empid = ROWNUM;
You could do something like the following. You can change the ORDER BY to control the row order if needed.
UPDATE emp
SET empid = emp.RowNum
FROM (SELECT empid, ROW_NUMBER() OVER (ORDER BY empid) AS rowNum FROM employee) emp
UPDATE employee SET empid = row_number();
Firstly, it is syntactically incorrect.
Secondly, you cannot use ROW_NUMBER() analytic function without the analytic_clause.
As you replied to my comment that the order doesn't matter to you, you could simply use ROWNUM.
UPDATE employee SET empid = ROWNUM;
It will assign the pseudo-column value in whatever order the rows happen to be picked, which is effectively arbitrary. Since you are assigning EMPID, I would suggest you consider ordering.
Usually employee ids are generated using a SEQUENCE object. There are two ways to implement the auto-increment functionality:
Oracle 11g and below - Auto-increment using trigger-sequence approach
Oracle 12c - IDENTITY column autoincrement functionality
You could also do this:
create table your_table_name as
select row_number() over( order by 1) as serial_no, a.* from your_query a
This creates the serial number when the table itself is written. (Note that serial_no is not set as a primary key; add that constraint separately if you want it to act as one.)
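For an ordered numbering rather than whatever order ROWNUM happens to produce, one option is to compute the numbers in a query first and write them back in a second step. The sketch below shows that idea, assuming SQLite via Python's sqlite3 instead of Oracle; it uses SQLite's rowid to address each row, and the name column and the helper renumber are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO employee (empid, name) VALUES (?, ?)",
    [(None, "Carol"), (None, "Alice"), (None, "Bob")],
)

def renumber(conn):
    """Set empid to the row's position when ordered by name."""
    # Step 1: compute (new number, rowid) pairs with a window function.
    numbered = conn.execute(
        "SELECT ROW_NUMBER() OVER (ORDER BY name) AS rn, rowid FROM employee"
    ).fetchall()
    # Step 2: write the numbers back, one UPDATE per row, in one transaction.
    with conn:
        conn.executemany("UPDATE employee SET empid = ? WHERE rowid = ?", numbered)

renumber(conn)
result = conn.execute("SELECT empid, name FROM employee ORDER BY empid").fetchall()
print(result)  # [(1, 'Alice'), (2, 'Bob'), (3, 'Carol')]
```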

Re-indexing a column with either SQL or PL/SQL

I have several tables that use an ID number plus a column called xsequence, which together form the primary key. Currently, I have a bunch of data that looks like this:
ID_NUMBER,XSEQUENCE
001,2
001,5
001,8
002,1
002,6
What I need to end up with is:
ID_NUMBER,XSEQUENCE
001,1
001,2
001,3
002,1
002,2
What is the best way of going about starting this? Every time I try, I just end up spinning my wheels.
Try something like this:
select id_number,
row_number() over (partition by id_number order by xsequence) new_xsequence
from yourtable
That's an analytic function, really handy for this sort of thing. The PARTITION BY clause "resets" the counter at each id_number (so 1, 2, 3, then it starts again at 1, 2, 3, and so on).
(The PARTITION BY clause in analytic functions behaves much like GROUP BY.)
[edit]
To UPDATE the original table, I actually prefer the MERGE statement. It's a bit simpler syntax-wise and seems a bit more intuitive ;)
MERGE INTO yourtable base
USING (
select rowid rid,
id_number,
row_number() over (partition by id_number order by xsequence) new_xsequence,
xsequence old_xsequence
from yourtable
) new
ON ( base.rowid = new.rid )
WHEN MATCHED THEN UPDATE
SET base.xsequence = new.new_xsequence
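The re-indexing can be checked against the sample data from the question. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 rather than Oracle, uses SQLite's rowid where the MERGE uses Oracle's rowid, and leaves out the primary key constraint so that intermediate values cannot collide during the update. The helper name reindex is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id_number TEXT, xsequence INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [("001", 2), ("001", 5), ("001", 8), ("002", 1), ("002", 6)],
)

def reindex(conn):
    """Renumber xsequence as 1, 2, 3, ... within each id_number."""
    # Read the full mapping first so the numbering is based on the old values.
    mapping = conn.execute("""
        SELECT ROW_NUMBER() OVER (PARTITION BY id_number ORDER BY xsequence) AS new_xsequence,
               rowid
        FROM yourtable
    """).fetchall()
    with conn:
        conn.executemany("UPDATE yourtable SET xsequence = ? WHERE rowid = ?", mapping)

reindex(conn)
reindexed = conn.execute(
    "SELECT id_number, xsequence FROM yourtable ORDER BY id_number, xsequence"
).fetchall()
print(reindexed)
# [('001', 1), ('001', 2), ('001', 3), ('002', 1), ('002', 2)]
```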

Deleting Duplicate Records from a Table

I have a table called Table1 which has 48 records, of which only 24 should be there. For some reason, duplicate records got inserted into it. How do I delete the duplicate records from that table?
Here's something you might try if SQL Server version is 2005 or later.
WITH cte AS
(
SELECT {list-of-columns-in-table},
row_number() over (PARTITION BY {list-of-key-columns} ORDER BY {rule-to-determine-row-to-keep}) as sequence
FROM myTable
)
DELETE FROM cte
WHERE sequence > 1
This uses a common table expression (CTE) and adds a sequence column. {list-of-columns-in-table} is just as it states. Not all columns are needed, but I won't go into that here.
{list-of-key-columns} is the set of columns that you use to define what counts as a duplicate.
{rule-to-determine-row-to-keep} is an ordering in which the first row is the row to keep. For example, if you want to keep the oldest row, you would order by a date column.
Here's an example of the query with real columns.
WITH cte AS
(
SELECT ID, CourseName, DateAdded,
row_number() over (PARTITION BY CourseName ORDER BY DateAdded) as sequence
FROM Courses
)
DELETE FROM cte
WHERE sequence > 1
This example removes duplicate rows based on the CourseName value and keeps the oldest based on the DateAdded value.
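The Courses example can be run end to end as follows. It is only a sketch, assuming SQLite via Python's sqlite3 in place of SQL Server: SQLite cannot delete through a CTE, so the rows to remove are identified by rowid instead. The sample rows and the helper delete_newer_duplicates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Courses (ID INTEGER, CourseName TEXT, DateAdded TEXT)")
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [(1, "Math", "2020-01-01"), (2, "Math", "2020-02-01"),
     (3, "Art", "2020-01-15"), (4, "Math", "2020-03-01")],
)

def delete_newer_duplicates(conn):
    """Keep the oldest row per CourseName; return how many rows were deleted."""
    with conn:
        cur = conn.execute("""
            DELETE FROM Courses
            WHERE rowid IN (
                SELECT rid FROM (
                    SELECT rowid AS rid,
                           ROW_NUMBER() OVER (PARTITION BY CourseName
                                              ORDER BY DateAdded) AS sequence
                    FROM Courses
                )
                WHERE sequence > 1
            )
        """)
    return cur.rowcount

deleted = delete_newer_duplicates(conn)
kept = conn.execute("SELECT ID, CourseName FROM Courses ORDER BY ID").fetchall()
print(deleted, kept)  # 2 [(1, 'Math'), (3, 'Art')]
```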
http://support.microsoft.com/kb/139444
This section is the key. The primary point you should take away. ;)
This article discusses how to locate and remove duplicate primary keys from a table. However, you should closely examine the process which allowed the duplicates to happen in order to prevent a recurrence.
Identify your records by grouping data by your logical keys, since you obviously haven't defined them, and applying a HAVING COUNT(*) > 1 statement at the end. The article goes into this in depth.
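Finding the duplicated keys with GROUP BY and HAVING COUNT(*) > 1 might look like this. The sketch assumes SQLite via Python's sqlite3, and treats CourseName in a hypothetical Courses table as the logical key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Courses (ID INTEGER, CourseName TEXT)")
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [(1, "Math"), (2, "Math"), (3, "Art"), (4, "Math")],
)

# Group by the logical key and keep only the groups with more than one row.
FIND_DUPLICATES = """
    SELECT CourseName, COUNT(*) AS copies
    FROM Courses
    GROUP BY CourseName
    HAVING COUNT(*) > 1
"""

duplicates = conn.execute(FIND_DUPLICATES).fetchall()
print(duplicates)  # [('Math', 3)]
```

Running a check like this before and after any cleanup gives a quick confirmation that no duplicated key remains.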
This is an easier way:
Select * Into #TempTable FROM YourTable
Truncate Table YourTable
Insert into YourTable Select Distinct * from #TempTable
Drop Table #TempTable