SQL query for removing non-unique row - sql

I'm using PostgreSQL 9.2.
Say I have the following table:
id      name          definition
serial  varchar(128)  text
1       name1         definition1
...
I need to write a query that removes all rows with a duplicate name, so that every remaining row has a unique name. If two rows have the same name, their definitions are also the same.

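Use the row_number() function partitioned by name and remove all rows that have row_number() > 1.
Here is an example query for deleting duplicates; it uses EXISTS rather than row_number() and keeps the row with the lowest id for each name: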

DELETE FROM mytable dd
WHERE EXISTS (
SELECT *
FROM mytable ex
WHERE ex.name = dd.name
AND ex.id < dd.id
);

Why do you even let client applications add rows with duplicate names in the first place?
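For comparison, here is a minimal sketch of the row_number() variant mentioned above, followed by a unique index so duplicates cannot come back. It runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL 9.2, purely so it can be executed anywhere; the sample rows and the index name mytable_name_uq are made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, definition TEXT)")
# Hypothetical sample data: name1 and name2 each appear twice.
conn.executemany(
    "INSERT INTO mytable (name, definition) VALUES (?, ?)",
    [("name1", "definition1"), ("name2", "definition2"),
     ("name1", "definition1"), ("name2", "definition2"),
     ("name3", "definition3")],
)

# Number the rows within each name by id and delete every row after the first.
conn.execute("""
    DELETE FROM mytable
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
            FROM mytable
        ) AS numbered
        WHERE rn > 1
    )
""")

# With the duplicates gone, a unique index rejects any new duplicate name.
conn.execute("CREATE UNIQUE INDEX mytable_name_uq ON mytable (name)")

remaining = conn.execute("SELECT id, name FROM mytable ORDER BY id").fetchall()
print(remaining)

try:
    conn.execute("INSERT INTO mytable (name, definition) VALUES ('name1', 'definition1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The DELETE itself should carry over to PostgreSQL largely unchanged; there the usual way to enforce uniqueness afterwards would be a UNIQUE constraint on name.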

Related

SQL Query : should return Single Record if Search Condition met, otherwise return Multiple Records

I have a table with billions of records. The table structure is:
ID NUMBER PRIMARY KEY,
MY_SEARCH_COLUMN NUMBER,
MY_SEARCH_COLUMN holds numeric values up to 15 digits long.
What I want is: if a specific value is matched, return only that matching record.
For example, if I enter WHERE MY_SEARCH_COLUMN = 123454321 and the table has the value 123454321, only that row should be returned.
But if there is no exact match, I need the next 10 values from the table.
For example, if I enter WHERE MY_SEARCH_COLUMN = 123454321 and the column does not contain 123454321, the query should return the next 10 values greater than 123454321.
Both cases should be covered by a single SQL query, and I have to keep performance in mind. I have already created an index on MY_SEARCH_COLUMN, so other suggestions to improve performance are welcome.
This could be tricky to do without using a proc or maybe some dynamic SQL, but we can try using ROW_NUMBER here:
WITH cte AS (
SELECT ID, MY_SEARCH_COLUMN,
ROW_NUMBER() OVER (ORDER BY MY_SEARCH_COLUMN) rn
FROM yourTable
WHERE MY_SEARCH_COLUMN >= 123454321
)
SELECT *
FROM cte
WHERE rn <= CASE WHEN EXISTS (SELECT 1 FROM yourTable WHERE MY_SEARCH_COLUMN = 123454321)
THEN 1
ELSE 10 END;
The basic idea of the above query is that we assign a row number to all records matching the target or greater. Then, we query using either a row number of 1, in case of an exact match, or all row numbers up to 10 in case of no match.
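To see both branches in action, here is a small sketch that runs the same CTE against an in-memory SQLite database via Python's sqlite3 module (not the asker's actual database). The sample data, the index name idx_search, the search() helper and the :target parameter are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (ID INTEGER PRIMARY KEY, MY_SEARCH_COLUMN INTEGER)")
conn.execute("CREATE INDEX idx_search ON yourTable (MY_SEARCH_COLUMN)")
# Hypothetical data: multiples of 10 from 10 to 300.
conn.executemany(
    "INSERT INTO yourTable (MY_SEARCH_COLUMN) VALUES (?)",
    [(v,) for v in range(10, 301, 10)],
)

QUERY = """
WITH cte AS (
    SELECT ID, MY_SEARCH_COLUMN,
           ROW_NUMBER() OVER (ORDER BY MY_SEARCH_COLUMN) AS rn
    FROM yourTable
    WHERE MY_SEARCH_COLUMN >= :target
)
SELECT MY_SEARCH_COLUMN
FROM cte
WHERE rn <= CASE WHEN EXISTS (SELECT 1 FROM yourTable WHERE MY_SEARCH_COLUMN = :target)
                 THEN 1 ELSE 10 END
ORDER BY rn
"""

def search(target):
    """Return the exact match if present, otherwise the next 10 larger values."""
    rows = conn.execute(QUERY, {"target": target}).fetchall()
    return [r[0] for r in rows]

exact = search(50)    # 50 exists, so only that value comes back
nearest = search(55)  # 55 is missing, so the next 10 larger values come back
print(exact)
print(nearest)
```

Whether the optimizer stops after 10 rows or numbers every row above the target depends on the database, so on a table with billions of rows the execution plan is worth checking.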
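-- Alternative: if the exact value exists, return only the matching row(s).
-- Otherwise the ELSE branch compares the column to itself, so every non-NULL row
-- matches and the result still has to be limited to the next 10 values.
SELECT *
FROM your_table AS src
WHERE src.MY_SEARCH_COLUMN = CASE WHEN EXISTS (SELECT 1 FROM your_table AS src2 WITH(NOLOCK) WHERE src2.MY_SEARCH_COLUMN = 123454321)
THEN 123454321
ELSE src.MY_SEARCH_COLUMN
END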

Update columns in DB2 using randomly chosen static values provided at runtime

I would like to update rows with values chosen randomly from a set of possible values.
Ideally I would be able to provide these values at runtime, using JdbcTemplate from a Java application.
Example:
In a table, column "name" can contain any name. The goal is to run through the table and change every name to either "Bob" or "Alice".
I know that this can be done by creating an SQL function. I tested it and it was fine, but I wonder if it is possible to just use a simple query?
This does not work; it seems the value is computed once and applied to all rows:
UPDATE test.table
SET first_name =
(SELECT a.name
FROM
(SELECT a.name, RAND() idx
FROM (VALUES('Alice'), ('Bob')) AS a(name) ORDER BY idx FETCH FIRST 1 ROW ONLY) as a)
;
I also tried MERGE INTO, but it won't even run (possible_names is not found in the SET subquery). I have yet to figure out why:
MERGE INTO test.table
USING
(SELECT
names.fname
FROM
(VALUES('Alice'), ('Bob'), ('Rob')) AS names(fname)) AS possible_names
ON ( test.table.first_name IS NOT NULL )
WHEN MATCHED THEN
UPDATE SET
-- select random name
first_name = (SELECT fname FROM possible_names ORDER BY idx FETCH FIRST 1 ROW ONLY)
;
EDIT: If possible, I would like to only focus on fields being updated and not depend on knowing primary keys and such.
Db2 seems to be optimizing away the subselect that returns your supposedly random name, materializing it only once, hence all rows in the target table receive the same value.
To force subselect execution for each row you need to somehow correlate it to the table being updated, for example:
UPDATE test.table
SET first_name =
(SELECT a.name
FROM (VALUES('Alice'), ('Bob')) AS a(name)
ORDER BY RAND(ASCII(SUBSTR(first_name, 1, 1)))
FETCH FIRST 1 ROW ONLY)
or maybe even
UPDATE test.table
SET first_name =
(SELECT a.name
FROM (VALUES('Alice'), ('Bob')) AS a(name)
ORDER BY first_name, RAND()
FETCH FIRST 1 ROW ONLY)
Now that the result of the subselect appears to depend on the value of the corresponding row in the target table, Db2 has no choice but to execute it for each row.
If your table has a primary key, this would work. I've assumed the PK is column id.
UPDATE test.table t
SET first_name =
( SELECT name from
( SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY R) AS RN FROM
( SELECT *, RAND() R
FROM test.table, TABLE(VALUES('Alice'), ('Bob')) AS d(name)
)
)
AS u
WHERE t.id = u.id and rn = 1
)
;
There might be a nicer/more efficient solution, but I'll leave that to others.
FYI I used the following DDL and data to test the above.
create table test.table(id int not null primary key, first_name varchar(32));
insert into test.table values (1,'Flo'),(2,'Fred'),(3,'Sue'),(4,'John'),(5,'Jim');

PL/SQL Increase value of new row, with value of previous

I need to set NEWLOSAL in each row to one more than NEWHISAL of the previous row,
just like the HISAL and LOSAL columns.
In other words, NEWLOSAL needs to be the previous NEWHISAL + 1.
I'm not sure if this is what you want:
update table1 t1
set t1.Newlosal=case when t1.grade=1 then (t1.Newhisal+1) else (select t2.Newhisal+1 from table1 t2 where t2.grade = (t1.grade-1)) end
WHERE EXISTS (
SELECT 1
FROM table1 t2
WHERE t2.grade=(t1.grade-1))
This can efficiently be done using the merge statement and a window function:
merge into table1 tg
using
(
select id, -- I assume this is the PK column
lag(newhisal) over (order by grade) + 1 as new_losal
from table1
) nv on (nv.id = tg.id)
when matched then update
set tg.newlosal = nv.new_losal;
In SQL, rows in a table (or a result) are not ordered, so the concept of a "previous" row only makes sense if you define a sort order. That's what the over (order by grade) does in the window function. From the screenshot I cannot tell which column this should be sorted by.
The screenshot also doesn't reveal the primary key column of your table. I assumed it's named ID. You have to change that to reflect your real PK column name.
I also didn't include a partition by clause in the window function, assuming that the formula should be applied to all rows in the same way. If this is not the case, you need to be more specific with your sample data.
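As a rough illustration of the lag() idea, here is a sketch on an in-memory SQLite database via Python's sqlite3 module. SQLite has no MERGE statement, so the sketch swaps in an UPDATE with a correlated subquery over the same window expression. The column layout, the id primary key and the sample salaries are assumptions, and the COALESCE is an addition here that keeps the existing value for the first grade, where lag() returns NULL because there is no previous row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 ("
    "id INTEGER PRIMARY KEY, grade INTEGER, newlosal INTEGER, newhisal INTEGER)"
)
# Hypothetical sample data, one row per grade.
conn.executemany(
    "INSERT INTO table1 (id, grade, newlosal, newhisal) VALUES (?, ?, ?, ?)",
    [(1, 1, 700, 1200), (2, 2, 1000, 1400), (3, 3, 1300, 2000)],
)

# For each row, look up "previous newhisal + 1" computed by LAG over grade order.
conn.execute("""
    UPDATE table1
    SET newlosal = COALESCE(
        (SELECT nv.new_losal
         FROM (SELECT id,
                      LAG(newhisal) OVER (ORDER BY grade) + 1 AS new_losal
               FROM table1) AS nv
         WHERE nv.id = table1.id),
        newlosal)
""")

rows = conn.execute(
    "SELECT grade, newlosal, newhisal FROM table1 ORDER BY grade"
).fetchall()
print(rows)
```

Without the COALESCE, the first grade's NEWLOSAL would be set to NULL, which is also what the merge statement above does as written.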

SQL Remove non duplicate entries in a table

I have a table with two columns, CountryCode and CountryName. There are duplicate entries in CountryCode, but I want to remove the non-duplicate entries and keep the rows that are duplicates in the CountryCode column. So I am trying to write an SQL statement to do this. I think I have to use HAVING, but I'm not sure exactly how to incorporate it. Thanks
That's a bit odd. I was expecting you to want to remove the duplicate entries, not the other way around. But something like this should work regardless of the database you are using:
delete from TableName
where CountryCode in (select CountryCode
from TableName
group by CountryCode
having count(*) = 1)
So to be clear, the subquery:
select CountryCode
from TableName
group by CountryCode
having count(*) = 1
... returns rows with unique CountryCodes. And then the delete statement:
delete from TableName
where CountryCode in (...)
... deletes those unique rows so that the only rows remaining in your table should be the ones with duplicates.
However, by your comments, it sounds like you just want a query that returns only the duplicates. If that's the case, then just use the subquery inside a select statement, but modify the having clause to return only duplicates:
select *
from TableName
where CountryCode in (select CountryCode
from TableName
group by CountryCode
having count(*) > 1)
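A quick way to check that the two statements agree is to run them on a small sample. The sketch below does that on an in-memory SQLite database through Python's sqlite3 module; the country rows are invented for the test, and the table and column names follow the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (CountryCode TEXT, CountryName TEXT)")
# Hypothetical sample data: US and FR are duplicated, DE occurs once.
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?)",
    [("US", "United States"), ("US", "USA"), ("DE", "Germany"),
     ("FR", "France"), ("FR", "French Republic")],
)

# Rows whose CountryCode occurs more than once (the SELECT variant).
DUPLICATES = """
    SELECT CountryCode, CountryName
    FROM TableName
    WHERE CountryCode IN (SELECT CountryCode FROM TableName
                          GROUP BY CountryCode HAVING COUNT(*) > 1)
    ORDER BY CountryCode, CountryName
"""
before = conn.execute(DUPLICATES).fetchall()

# Delete the rows whose CountryCode occurs exactly once (the DELETE variant).
conn.execute("""
    DELETE FROM TableName
    WHERE CountryCode IN (SELECT CountryCode FROM TableName
                          GROUP BY CountryCode HAVING COUNT(*) = 1)
""")
after = conn.execute(
    "SELECT CountryCode, CountryName FROM TableName ORDER BY CountryCode, CountryName"
).fetchall()

# What is left after the DELETE is exactly what the SELECT reported as duplicates.
print(before == after)
```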
This is a quick solution. It is probably not the fastest with a lot of entries, but it works.
SELECT * FROM [table] AS tbl
WHERE countrycode IN
(SELECT countrycode FROM [table] WHERE tbl.countryname <> countryname)
/* Words in uppercase are SQL syntax */
By giving the outer table an alias (tbl), you can refer to it in the nested query. Note that this only finds rows whose countrycode is shared by a row with a different countryname.

how to update a column for all rows in a table with different values for each row in a table of 100 records

My table contains 100 records and I need to update a column with a new value (a different value for each record) in each of the 100 records. How can I do this? The column to update is not the primary key.
UPDATE tablename
SET columnname = CASE id
WHEN 1 THEN 42
WHEN 2 THEN 666
END
With this query columnname will be updated to 42 for the row with id = 1, and 666 for id = 2
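With 100 rows the CASE expression gets long, so one alternative is to keep the id-to-value mapping in application code and pass it as parameters. The sketch below uses Python's sqlite3 module and executemany rather than the CASE expression; the new_values mapping is hypothetical and stands in for wherever the real values come from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, columnname INTEGER)")
conn.executemany(
    "INSERT INTO tablename (id, columnname) VALUES (?, 0)",
    [(i,) for i in range(1, 101)],
)

# Hypothetical new values, one per id.
new_values = {i: i * i for i in range(1, 101)}

# One parameterized UPDATE, executed once per (value, id) pair in a single transaction.
with conn:
    conn.executemany(
        "UPDATE tablename SET columnname = ? WHERE id = ?",
        [(value, row_id) for row_id, value in new_values.items()],
    )

sample = conn.execute(
    "SELECT id, columnname FROM tablename WHERE id IN (1, 2, 100) ORDER BY id"
).fetchall()
print(sample)
```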
Create a table with an autoincrement id and the columns of the original table.
Then
INSERT INTO new_table (column1, column2,.....) -- list all columns except the autoincrement id
SELECT * FROM old_table
Update the old table by joining it with the new one, assuming there is a key, composite or not, that distinguishes each row.
Set a unique constraint on this column.
ALTER TABLE YourTableName
ADD CONSTRAINT uc_ColumnID UNIQUE (ColumnName)
Now, whenever you try to update it with a duplicate value, SQL Server will not allow it :)
This also covers the long-run scenario.
If you're on SQL Server 2005 or newer (you didn't exactly specify.....), you could easily use a CTE (Common Table Expression) for this - basically, you select your PK value, and a counter counting up from 1, and you set each row's ColumnName column to the value of the counter:
;WITH UpdateData AS
(
SELECT
PKValue,
ROW_NUMBER() OVER(ORDER BY .......) AS 'RowNum'
FROM
dbo.YourTable
)
UPDATE dbo.YourTable
SET ColumnName = u.RowNum
FROM UpdateData u
WHERE dbo.YourTable.PKValue = u.PKValue
With this, you're generating a sequence from 1 through 100 in the RowNum field of the CTE, and you're setting this unique value to your underlying table.
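The same idea can be tried out on a small in-memory SQLite database through Python's sqlite3 module. This is only a sketch: the UPDATE ... FROM form above is SQL Server syntax, so it is replaced here by a correlated subquery against the CTE, and ordering the counter by PKValue is an assumption, since the original leaves the ORDER BY column open. The three sample keys are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (PKValue TEXT PRIMARY KEY, ColumnName INTEGER)")
# Hypothetical sample keys, inserted out of order on purpose.
conn.executemany("INSERT INTO YourTable (PKValue) VALUES (?)", [("c",), ("a",), ("b",)])

# Number the rows in PKValue order and copy each row's counter into ColumnName.
conn.execute("""
    WITH UpdateData AS (
        SELECT PKValue,
               ROW_NUMBER() OVER (ORDER BY PKValue) AS RowNum
        FROM YourTable
    )
    UPDATE YourTable
    SET ColumnName = (SELECT u.RowNum
                      FROM UpdateData AS u
                      WHERE u.PKValue = YourTable.PKValue)
""")

numbered = conn.execute(
    "SELECT PKValue, ColumnName FROM YourTable ORDER BY PKValue"
).fetchall()
print(numbered)
```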
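Load a DataTable, say dt, with the IDs of the rows you want to update, then execute one parameterized UPDATE per row. Here cmd is assumed to be a SqlCommand on an open connection and desired_value the new value for that row:
foreach (DataRow rw in dt.Rows)
{
cmd.CommandText = "UPDATE table_name SET column_name = @value WHERE specific_column = @id";
cmd.Parameters.Clear();
cmd.Parameters.AddWithValue("@value", desired_value);
cmd.Parameters.AddWithValue("@id", rw["specific_column"]);
cmd.ExecuteNonQuery();
}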