Remove duplicate lines from SQL Server table


Someone deployed a SQL table with the schema
ConfigOptions
    name VARCHAR(50)
    value VARCHAR(50)
and the following logic for saving options:
int i = ExecuteNonQuery("UPDATE ConfigOptions SET value=@value WHERE name=@name");
if(i==0) i = ExecuteNonQuery("INSERT INTO ConfigOptions (name,value) VALUES (@name,@value)");
We have now found that this table is littered with duplicates, and we want to fix this.
As far as I can tell, the logic is: whenever the UPDATE affects zero rows, another row is inserted. If I am not mistaken, this can happen when:
a row with the name @name does not exist, or
the row exists, but already contains the value @value
So all rows with the same name should be full duplicates. If not, something is completely wrong (and behaviour may be undefined).
To fix the duplicates problem, I want to add a PK on name. Before I can do this, I have to remove all rows with duplicate names, keeping only one of each.
In the installer (only the installer is allowed to change schema), I only have SQL queries at hand, so I can't do it with C# logic:
Dictionary<string, int> dic = new Dictionary<string, int>();
// Collect every name that occurs more than once, together with its row count.
SqlDataReader sdr = ExecuteReader("SELECT name,COUNT(value) FROM ConfigOptions GROUP BY name HAVING COUNT(value)>1");
while (sdr.Read()) dic.Add(sdr.GetString(0), sdr.GetInt32(1));
sdr.Close();
foreach (var kv in dic) {
    AddParameter("@name", System.Data.SqlDbType.VarChar, 50, kv.Key);
    // Delete all but one row for this name; TOP needs parentheses in DELETE.
    ExecuteNonQuery("DELETE TOP (" + (kv.Value - 1) + ") FROM ConfigOptions WHERE name=@name");
}
ExecuteNonQuery("ALTER TABLE ConfigOptions ADD PRIMARY KEY (name)");
Is there a way to put this into SQL logic?

Using %%physloc%%, the phys(ical) loc(ation) of the row, should do the trick:
DELETE FROM ConfigOptions
WHERE %%physloc%% NOT IN (
SELECT MIN(%%physloc%%)
FROM ConfigOptions
GROUP BY name);
After this cleanup, you can add the primary key to the table.
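NOTE: this leaves exactly one row for every name. If two rows with the same name have different values, only the row with the lowest physical location survives (usually, but not necessarily, the oldest one), so the other value is lost. If you want to keep both, use GROUP BY name, value instead; in that case any name that still has several values must be resolved before the primary key on name can be added.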
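The keep-one-row-per-name idea can be tried out away from the real server. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, SQLite's rowid in place of %%physloc%%, made-up sample rows, and a unique index (the name ux_configoptions_name is hypothetical) because SQLite cannot add a primary key to an existing table.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ConfigOptions (name VARCHAR(50), value VARCHAR(50))")
conn.executemany(
    "INSERT INTO ConfigOptions (name, value) VALUES (?, ?)",
    [("theme", "dark"), ("theme", "dark"), ("lang", "en"), ("theme", "dark")],
)

# Keep the row with the smallest rowid for each name and delete the rest.
conn.execute(
    """
    DELETE FROM ConfigOptions
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM ConfigOptions GROUP BY name)
    """
)

# With the duplicates gone, uniqueness on name can be enforced.
conn.execute("CREATE UNIQUE INDEX ux_configoptions_name ON ConfigOptions (name)")

remaining = conn.execute(
    "SELECT name, value FROM ConfigOptions ORDER BY name"
).fetchall()
print(remaining)
```

If the unique index can be created without an error, the cleanup removed every duplicate name.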

Related

Adding Row in existing table (SQL Server 2005)

I want to add another row in my existing table and I'm a bit hesitant if I'm doing the right thing because it might skew the database. I have my script below and would like to hear your thoughts about it.
I want to add another row for 'Jane' in the table, which will have 'SKATING' in the ACT column.
Table: [Emp_table].[ACT].[LIST_EMP]
My script is:
INSERT INTO [Emp_table].[ACT].[LIST_EMP]
([ENTITY],[TYPE],[EMP_COD],[DATE],[LINE_NO],[ACT],[NAME])
VALUES
('REG','EMP','45233','2016-06-20 00:00:00:00','2','SKATING','JANE')
Will this do the trick?

Your statement looks ok. If the database has a problem with it (for example, due to a foreign key constraint violation), it will reject the statement.
If any of the fields in your table are numeric (and not varchar or char), just remove the quotes around the corresponding value. For example, if emp_cod and line_no are int, insert the following values instead:
('REG','EMP',45233,'2016-06-20 00:00:00:00',2,'SKATING','JANE')

Inserting records into a database has always been the most common reason why I've lost so much of my hair!
SQL is great when it comes to SELECTs or even UPDATEs, but when it comes to INSERTs it's as if someone from another planet walked into the SQL standards committee and got their way of doing it written into the final standard!
If your table does not have a primary key that is generated automatically on every insert, then you have to write the code to avoid duplicates yourself.
Start by writing a normal SELECT to check that the record(s) you're going to add don't already exist. But as Robert implied, your table may not have a primary key because it looks like a LOG table to me. So insert away!
If it does require a unique record every time, then I strongly suggest you create a primary key for the table, either an auto-generated one or a combination of your existing columns.
Assuming the first five columns combined make a unique key, this SELECT will determine whether the data you're inserting already exists:
SELECT COUNT(*) AS FoundRec FROM [Emp_table].[ACT].[LIST_EMP]
WHERE [ENTITY] = wsEntity AND [TYPE] = wsType AND [EMP_COD] = wsEmpCod AND [DATE] = wsDate AND [LINE_NO] = wsLineno
The wsXXX names are placeholders: replace them with direct values or DECLARE them earlier in your script.
If you run this alone and receive a value of 1 or more, then the data already exists in your table, at least in those first five columns. A true duplicate test would require you to compare EVERY column in your table, but it should give you an idea.
To do it all in one statement, note that INSERT ... VALUES cannot take a WHERE clause; use INSERT ... SELECT with NOT EXISTS instead:
INSERT INTO [Emp_table].[ACT].[LIST_EMP]
    ([ENTITY],[TYPE],[EMP_COD],[DATE],[LINE_NO],[ACT],[NAME])
SELECT 'REG','EMP','45233','2016-06-20 00:00:00:00','2','SKATING','JANE'
WHERE NOT EXISTS (
    SELECT 1 FROM [Emp_table].[ACT].[LIST_EMP]
    WHERE [ENTITY] = wsEntity AND [TYPE] = wsType AND
          [EMP_COD] = wsEmpCod AND [DATE] = wsDate AND
          [LINE_NO] = wsLineno)
Just replace the wsXXX placeholders with the values you want to insert.
I hope that made sense.
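For anyone who wants to see the insert-only-if-missing pattern run end to end, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) rather than SQL Server 2005, a simplified all-text LIST_EMP table, and a hypothetical helper named insert_if_missing; the duplicate check uses the first five columns, as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LIST_EMP (ENTITY TEXT, TYPE TEXT, EMP_COD TEXT, "
    "DATE TEXT, LINE_NO TEXT, ACT TEXT, NAME TEXT)"
)


def insert_if_missing(conn, row):
    """Insert row unless a row with the same first five columns exists.

    Returns 1 if a row was inserted and 0 if it was skipped.
    """
    cur = conn.execute(
        """
        INSERT INTO LIST_EMP (ENTITY, TYPE, EMP_COD, DATE, LINE_NO, ACT, NAME)
        SELECT ?, ?, ?, ?, ?, ?, ?
        WHERE NOT EXISTS (
            SELECT 1 FROM LIST_EMP
            WHERE ENTITY = ? AND TYPE = ? AND EMP_COD = ?
              AND DATE = ? AND LINE_NO = ?
        )
        """,
        row + row[:5],  # seven values to insert, then five key values to check
    )
    return cur.rowcount


jane = ("REG", "EMP", "45233", "2016-06-20 00:00:00", "2", "SKATING", "JANE")
first = insert_if_missing(conn, jane)   # table is empty, so this inserts
second = insert_if_missing(conn, jane)  # same key again, so this is skipped
total = conn.execute("SELECT COUNT(*) FROM LIST_EMP").fetchone()[0]
print(first, second, total)
```

Passing the values as parameters also avoids building the SQL string by hand.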

Webmatrix - Creating two tables with the primary key from one

I currently have the following SQL statement which creates an entry in my database, and auto creates the primary key. This works fine:
if (IsPost){
    var sql = "INSERT INTO Property_Info (PropertyName) VALUES (@0)";
    db.Execute(sql, Request["propertyname"]);
}
What I also need to do on the same page is insert a record into a different table, using the primary key created by the above statement. Is this possible, or will I need to do this on a separate page?

Yep, you can get the database ID you created with db.GetLastInsertId() and use that as a parameter in your next query.
http://msdn.microsoft.com/en-us/library/webmatrix.data.database.getlastinsertid(v=vs.111).aspx
I've found that you are best off casting it to an int too, so immediately after your db.Execute() line, try this:
int newId = (int)db.GetLastInsertId();
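The same two-step flow (insert the parent, read back the generated key, insert the child) looks roughly like this outside WebMatrix. This is a sketch with assumptions: it uses Python's sqlite3 module, where cursor.lastrowid plays the role of db.GetLastInsertId(), and the Property_Detail table with its PropertyId and Notes columns is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Property_Info (PropertyId INTEGER PRIMARY KEY, PropertyName TEXT)"
)
# Hypothetical second table that refers to the generated key.
conn.execute(
    "CREATE TABLE Property_Detail (DetailId INTEGER PRIMARY KEY, "
    "PropertyId INTEGER REFERENCES Property_Info (PropertyId), Notes TEXT)"
)

# First insert: the database generates the primary key.
cur = conn.execute(
    "INSERT INTO Property_Info (PropertyName) VALUES (?)", ("Rose Cottage",)
)
new_id = cur.lastrowid  # key generated by the insert above

# Second insert: reuse the generated key as a parameter.
conn.execute(
    "INSERT INTO Property_Detail (PropertyId, Notes) VALUES (?, ?)",
    (new_id, "first note"),
)

linked = conn.execute(
    "SELECT p.PropertyName, d.Notes FROM Property_Info p "
    "JOIN Property_Detail d ON d.PropertyId = p.PropertyId"
).fetchall()
print(new_id, linked)
```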

Update a table and return both the old and new values

I'm writing a VB app that is scrubbing some data inside a DB2 database. In a few tables I want to update entire columns, for example an account number column. I am changing all account numbers to start at 1 and increment as I go down the list. I'd like to be able to return both the old account number and the new one, so I can generate some kind of report to reference and don't lose the original values. I'm updating columns like so:
DECLARE @accntnum INT
SET @accntnum = 0
UPDATE accounts
SET @accntnum = accntnum = @accntnum + 1
GO
Is there a way for me to return both the original accntnum and the new one in one table?

DB2 has a really nifty feature where you can select data from a "data change statement". This was tested on DB2 for Linux/Unix/Windows, but I think that it should also work on at least DB2 for z/OS.
For your numbering, you might consider creating a sequence as well. Then your update would be something like:
CREATE SEQUENCE acct_seq
    START WITH 1
    INCREMENT BY 1
    NO MAXVALUE
    NO CYCLE
    CACHE 24
;
SELECT accntnum AS new_acct, old_acct
FROM FINAL TABLE (
    UPDATE accounts INCLUDE(old_acct INT)
    SET accntnum = NEXT VALUE FOR acct_seq, old_acct = accntnum
)
ORDER BY old_acct;
The INCLUDE part creates a new column in the resulting table with the name and the data type specified, and then you can set the value in the update statement as you would any other field.

A possible solution is to add an additional column (let's call it oldaccntnum) and assign old values to that column as you do your update.
Then drop it when you no longer need it.
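A rough sketch of the extra-column approach, for illustration only. It assumes SQLite via Python's sqlite3 module instead of DB2, made-up sample accounts with an invented owner column, unique original account numbers, and new numbers handed out in ascending order of the old ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (accntnum INTEGER, owner TEXT)")
conn.executemany(
    "INSERT INTO accounts (accntnum, owner) VALUES (?, ?)",
    [(1007, "a"), (1003, "b"), (1050, "c")],
)

# Step 1: add the helper column and copy the current numbers into it.
conn.execute("ALTER TABLE accounts ADD COLUMN oldaccntnum INTEGER")
conn.execute("UPDATE accounts SET oldaccntnum = accntnum")

# Step 2: renumber from 1. The new number of a row is its rank among the
# old numbers, which is only well defined if the old numbers are unique.
conn.execute(
    """
    UPDATE accounts
    SET accntnum = (
        SELECT COUNT(*) FROM accounts AS other
        WHERE other.oldaccntnum <= accounts.oldaccntnum
    )
    """
)

# Step 3: the old/new mapping for the report.
report = conn.execute(
    "SELECT oldaccntnum, accntnum FROM accounts ORDER BY accntnum"
).fetchall()
print(report)
```

Once the report has been saved somewhere safe, the helper column can be dropped.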

Here's what I'd do:
- create a new table to track the changes, with columns identifying a unique key, old value, new value and timestamp
- create a trigger on the accounts table to write the old and new values to the new table.
But, not knowing all the conditions, it may not be worth the trouble.

Doing UPSERT when row is referenced by a FK

Let's say that I have a table of items, and for each item, there can be additional information stored for it, which goes into a second table. The additional information is referenced by a FK in the first table, which can be NULL (if the item doesn't have additional info).
TABLE item (
    ...
    item_addtl_info_id INTEGER,
    CONSTRAINT fk_item_addtl_info FOREIGN KEY (item_addtl_info_id)
        REFERENCES addtl_info (addtl_info_id)
)
TABLE addtl_info (
    addtl_info_id INTEGER NOT NULL
        GENERATED BY DEFAULT
        AS IDENTITY (
            INCREMENT BY 1
            NO CACHE
        ),
    addtl_info_text VARCHAR(100),
    ...
    CONSTRAINT pk_addtl_info PRIMARY KEY (addtl_info_id)
)
What is the "best practice" to update an item's additional info (in IBM DB2 SQL, preferably)?
It should be an UPSERT operation, meaning that if additional info does not yet exist then a new record is created in the second table, but if it does, then it is only updated, and the FK in the first table does not change.
So imperatively, this is the logic:
UPSERT(item, item_info):
    CASE WHEN item.item_addtl_info_id IS NULL THEN
        INSERT INTO addtl_info (item_info)
        UPDATE item.item_addtl_info_id (addtl_info.addtl_info_id)
                                                   ^^^^^^^^^^^^^
    ELSE
        UPDATE addtl_info (item_info)
    END
My main problem is how to get the newly inserted addtl_info row's id (underlined above). In a stored proc I can request the id from a sequence and store it in a variable, but maybe there is a more straightforward way. Isn't it something that comes up all the time when programming databases?
I mean, I'm really not interested in what the id of the addtl_info record is as long as it remains unique and is referenced properly. So using sequences seems a bit of an overkill to me in this case.
As a matter of fact, this UPSERT operation should be part of the SQL language as a standard operation (maybe it is, and I just don't know about it?)...

The syntax I was looking for is:
SELECT * FROM NEW TABLE ( INSERT INTO phone_book VALUES ( 'Peter Doe','555-2323' ) )
from Wikipedia (http://en.wikipedia.org/wiki/Insert_%28SQL%29)
This is how to refer to the record that was just inserted in the table.
My colleague called this construct an "in-place trigger", which is what it really is...
Here is the first version that I put together as a compound SQL statement:
begin atomic
    declare addtl_id integer;
    set addtl_id = (select item_addtl_info_id from item where item.item_id = XXX);
    if addtl_id is null
    then
        set addtl_id = (select addtl_info_id from new table
                           (insert into addtl_info
                                (addtl_info_text)
                            values ('My brand new additional info')
                           )
                       );
        update item set item.item_addtl_info_id = addtl_id
            where item.item_id = XXX;
    else
        update addtl_info set addtl_info_text = 'My updated additional info'
            where addtl_info.addtl_info_id = addtl_id;
    end if;
end
XXX being equal to the item id to be updated - this code can now be easily inserted into a sproc, and XXX can be converted to an input parameter.
I also tried using MERGE INTO, but I couldn't figure out a syntax for updating a table different from what was specified as the target.
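For comparison, the same branch-on-NULL logic can be written in application code. The sketch below is an illustration under assumptions, not the DB2 solution: it uses SQLite via Python's sqlite3 module, cursor.lastrowid in place of SELECT ... FROM NEW TABLE, and a hypothetical helper named upsert_info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE addtl_info (
        addtl_info_id INTEGER PRIMARY KEY,
        addtl_info_text TEXT
    );
    CREATE TABLE item (
        item_id INTEGER PRIMARY KEY,
        item_addtl_info_id INTEGER REFERENCES addtl_info (addtl_info_id)
    );
    INSERT INTO item (item_id) VALUES (1);
    """
)


def upsert_info(conn, item_id, text):
    """Create the item's additional info if missing, otherwise update it."""
    with conn:  # one transaction: commit on success, roll back on error
        row = conn.execute(
            "SELECT item_addtl_info_id FROM item WHERE item_id = ?", (item_id,)
        ).fetchone()
        if row is None:
            raise KeyError(item_id)
        addtl_id = row[0]
        if addtl_id is None:
            # No info yet: insert it and point the item at the new row.
            addtl_id = conn.execute(
                "INSERT INTO addtl_info (addtl_info_text) VALUES (?)", (text,)
            ).lastrowid
            conn.execute(
                "UPDATE item SET item_addtl_info_id = ? WHERE item_id = ?",
                (addtl_id, item_id),
            )
        else:
            # Info exists: update it in place, the FK stays unchanged.
            conn.execute(
                "UPDATE addtl_info SET addtl_info_text = ? WHERE addtl_info_id = ?",
                (text, addtl_id),
            )
    return addtl_id


first_id = upsert_info(conn, 1, "My brand new additional info")
second_id = upsert_info(conn, 1, "My updated additional info")
rows = conn.execute(
    "SELECT addtl_info_id, addtl_info_text FROM addtl_info"
).fetchall()
print(first_id, second_id, rows)
```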

Auto Increment after delete in MySQL

I have a MySQL table with a primary key field that has AUTO_INCREMENT on.
After reading other posts on here I've noticed people with the same problem and with varied answers. Some recommend not using this feature, others state it can't be 'fixed'.
I have:
table: course
fields: courseID, courseName
Example: number of records in the table: 18. If I delete records 16, 17 and 18 - I would expect the next record entered to have the courseID of 16, however it will be 19 because the last entered courseID was 18.
My SQL knowledge isn't amazing, but is there any way to refresh or update this count with a query (or a setting in the phpMyAdmin interface)?
This table will relate to others in a database.
Given all the advice, I have decided to ignore this 'problem'. I will simply delete and add records while letting the auto increment do its job. I guess it doesn't really matter what the number is, since it's only being used as a unique identifier and doesn't have a (as mentioned above) business meaning.
For those who I may have confused with my original post: I do not wish to use this field to know how many records I have. I just wanted the database to look neat and have a bit more consistency.

What you're trying to do sounds dangerous, as that's not the intended use of AUTO_INCREMENT.
If you really want to find the lowest unused key value, don't use AUTO_INCREMENT at all, and manage your keys manually. However, this is NOT a recommended practice.
Take a step back and ask "why do you need to recycle key values?" Does an unsigned INT (or BIGINT) not provide a large enough key space?
Are you really going to have more than 18,446,744,073,709,551,615 unique records over the course of your application's lifetime?

ALTER TABLE foo AUTO_INCREMENT=1
If you've deleted the most recent entries, that should set it to use the next lowest available one. As in, as long as there's no 19 already, deleting 16-18 will reset the autoincrement to use 16.
EDIT: I missed the bit about phpmyadmin. You can set it there, too. Go to the table screen, and click the operations tab. There's an AUTOINCREMENT field there that you can set to whatever you need manually.

Primary autoincrement keys in database are used to uniquely identify a given row and shouldn't be given any business meaning. So leave the primary key as is and add another column called for example courseOrder. Then when you delete a record from the database you may want to send an additional UPDATE statement in order to decrement the courseOrder column of all rows that have courseOrder greater than the one you are currently deleting.
As a side note you should never modify the value of a primary key in a relational database because there could be other tables that reference it as a foreign key and modifying it might violate referential constraints.
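A small sketch of the courseOrder idea, under assumptions: SQLite via Python's sqlite3 module stands in for MySQL, the course names are made up, and delete_course is a hypothetical helper. The primary key is never touched; only the display order is closed up after a delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (courseID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "courseName TEXT, courseOrder INTEGER)"
)
conn.executemany(
    "INSERT INTO course (courseName, courseOrder) VALUES (?, ?)",
    [("Maths", 1), ("Physics", 2), ("Art", 3), ("Music", 4)],
)


def delete_course(conn, course_id):
    """Delete a course and close the gap in courseOrder, leaving IDs alone."""
    with conn:  # delete and renumber in one transaction
        row = conn.execute(
            "SELECT courseOrder FROM course WHERE courseID = ?", (course_id,)
        ).fetchone()
        if row is None:
            return
        conn.execute("DELETE FROM course WHERE courseID = ?", (course_id,))
        conn.execute(
            "UPDATE course SET courseOrder = courseOrder - 1 WHERE courseOrder > ?",
            (row[0],),
        )


delete_course(conn, 2)
ordering = conn.execute(
    "SELECT courseID, courseOrder FROM course ORDER BY courseOrder"
).fetchall()
print(ordering)
```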

Try:
SET @num := 0;
UPDATE your_table SET id = @num := (@num+1);
ALTER TABLE `your_table` AUTO_INCREMENT = 1;
This renumbers every existing row starting from 1 and then resets the auto-increment counter, so the next inserted row continues the new sequence.
Example, before:
1 : first value here
2 : second value here
X : deleted value
4 : the rest of the table
5 : the rest of the rest
So the table shows the ids 1, 2, 4, 5.
Example, after running these commands:
1 : first value here
2 : second value here
3 : the rest of the table
4 : the rest of the rest
There is no trace of the deleted value, and the auto-increment continues from this new count.
BUT
if something in your code uses the auto-incremented value, this renumbering may cause problems.
If you don't use this value in your code, everything should be OK.

You shouldn't be relying on the AUTO_INCREMENT id to tell you how many records you have in the table. You should be using SELECT COUNT(*) FROM course. IDs are there to uniquely identify the course and can be used as references in other tables, so you shouldn't repeat IDs and shouldn't be seeking to reset the auto increment field.

I came here looking for an answer to the title question, "MySQL - Auto Increment after delete", but I could only find an answer to that in these questions:
How to delete certain row from mysql table?
How to reset AUTO_INCREMENT in MySQL?
By using something like:
DELETE FROM table;
ALTER TABLE table AUTO_INCREMENT = 1;
Note that Darin Dimitrov's answer explains AUTO_INCREMENT and its usage really well. Take a look there before doing something you might regret.
PS: The real question is more "Why do you need to recycle key values?", and Dolph's answer covers that.

What you are trying to do is very dangerous. Think about this carefully. There is a very good reason for the default behaviour of auto increment.
Consider this:
A record is deleted in one table that has a relationship with another table. The corresponding record in the second table cannot be deleted for auditing reasons. This record becomes orphaned from the first table. If a new record is inserted into the first table, and a sequential primary key is used, this record is now linked to the orphan. Obviously, this is bad. By using an auto incremented PK, an id that has never been used before is always guaranteed. This means that orphans remain orphans, which is correct.
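The difference between reusing and never reusing an id can be seen in a small experiment. This sketch relies on SQLite semantics (via Python's sqlite3 module), not MySQL: a plain INTEGER PRIMARY KEY picks the largest existing id plus one, while AUTOINCREMENT never hands out an id that was used before. The helper next_id_after_delete and the course names are made up for the illustration.

```python
import sqlite3


def next_id_after_delete(id_column_def):
    """Insert three courses, delete the last one, and return the id
    assigned to the next inserted course."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE course (courseID {id_column_def}, courseName TEXT)"
    )
    conn.executemany(
        "INSERT INTO course (courseName) VALUES (?)",
        [("Maths",), ("Physics",), ("Art",)],
    )
    conn.execute("DELETE FROM course WHERE courseID = 3")
    cur = conn.execute("INSERT INTO course (courseName) VALUES (?)", ("Music",))
    return cur.lastrowid


# Sequential key: id 3 is handed out again, so anything that still
# refers to the deleted course 3 would now point at the new course.
reused = next_id_after_delete("INTEGER PRIMARY KEY")

# Never-reused key: the new course gets a fresh id and orphans stay orphans.
fresh = next_id_after_delete("INTEGER PRIMARY KEY AUTOINCREMENT")
print(reused, fresh)
```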

There is actually a way to fix that. First you delete the auto_incremented primary key column, and then you add it again, like this:
ALTER TABLE table_name DROP column_name;
ALTER TABLE table_name ADD column_name int not null auto_increment primary key first;

you can select the ids like so:
set @rank = 0;
select id, @rank:=@rank+1 from tbl order by id
the result is a list of ids, and their positions in the sequence.
you can also reset the ids like so:
set @rank = 0;
update tbl a join (select id, @rank:=@rank+1 as rank from tbl order by id) b
    on a.id = b.id set a.id = b.rank;
you could also just print out the first unused id like so:
select min(id) as next_id from ((select a.id from (select 1 as id) a
    left join tbl b on a.id = b.id where b.id is null) union
    (select min(a.id) + 1 as id from tbl a left join tbl b on a.id+1 = b.id
    where b.id is null)) c;
after each insert, you can reset the auto_increment:
alter table tbl auto_increment = 16
or explicitly set the id value when doing the insert:
insert into tbl values (16, 'something');
typically this isn't necessary, you have count(*) and the ability to create a ranking number in your result sets. a typical ranking might be:
set @rank = 0;
select a.name, a.amount, b.rank from cust a,
    (select amount, @rank:=@rank+1 as rank from cust order by amount desc) b
    where a.amount = b.amount
customers ranked by amount spent.
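the "first unused id" lookup above can be checked on sample data. this is a sketch with assumptions: it runs on SQLite through Python's sqlite3 module rather than MySQL, uses a simpler candidate-list query than the one above, and first_unused_id is a hypothetical helper.

```python
import sqlite3

# Candidates are 1 and "every existing id + 1"; the smallest candidate
# that is not already taken is the first unused id.
FIRST_UNUSED_ID = """
    SELECT MIN(candidate) FROM (
        SELECT 1 AS candidate
        UNION ALL
        SELECT id + 1 FROM tbl
    )
    WHERE candidate NOT IN (SELECT id FROM tbl)
"""


def first_unused_id(ids):
    """Build a throwaway table holding the given ids and return the lowest free id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, label TEXT)")
    conn.executemany("INSERT INTO tbl (id) VALUES (?)", [(i,) for i in ids])
    return conn.execute(FIRST_UNUSED_ID).fetchone()[0]


gap = first_unused_id([1, 2, 4, 5])   # 3 is the first hole
dense = first_unused_id([1, 2, 3])    # no hole, so max + 1
empty = first_unused_id([])           # empty table starts at 1
print(gap, dense, empty)
```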

I can think of plenty of scenarios where you might need to do this, particularly during a migration or development process. For instance, I just now had to create a new table by cross-joining two existing tables (as part of a complex set-up process), and then I needed to add a primary key after the event. You can drop the existing primary key column, and then do this.
ALTER TABLE my_table ADD `ID` INT NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (`ID`);
For a live system, it is not a good idea, and especially if there are other tables with foreign keys pointing to it.

I have a very simple but tricky method.
When deleting a row, save its ID in another temporary table. Later, when you insert new data into the main table, first check the temporary table and reuse an ID from it. If the temporary table has no IDs left, take the maximum ID in the main table and set the new ID as: new_ID = old_max_ID+1.
NB: You cannot use the auto-increment feature here.

You could create an AFTER DELETE trigger that updates the auto-increment value and the ID of every row that no longer looks the way you want.
That way you keep working with the same table, and the trigger fixes the numbering automatically whenever you delete a row.

You can use your mysql client software/script to specify where the primary key should start from after deleting the required records.

It's definitely not recommended. If you have a large database with multiple tables, you have probably saved a userid from table 1 as an id in table 2. If you rearrange table 1, the id stored in table 2 will probably no longer point to the intended user.

MySQL Query
Auto Increment Solution. This works well when you have inserted many records during the testing phase of your software, and now you want to launch your application live for your client with the auto increment starting from 1.
To be on the safe side and avoid unwanted problems, first export a .sql file.
Then follow the steps below:
Step 1)
First create a copy of the existing table.
MySQL command to create the copy:
CREATE TABLE new_Table_Name SELECT * FROM existing_Table_Name;
This creates a copy of the table with all rows, but without constraints.
It doesn't copy constraints like Auto Increment and Primary Key into new_Table_Name.
Step 2)
Delete all rows if the data was only inserted during testing and is not useful.
If the data is important, go directly to Step 3.
DELETE from new_Table_Name;
Step 3) To add constraints, go to the Structure view of the table.
3A) Add the primary key constraint from the More option (if you require it).
3B) Add the Auto Increment constraint from the Change option. For this, set the Default value to None.
3C) Delete existing_Table_Name and
3D) rename new_Table_Name to existing_Table_Name.
Now it will work as expected. The first new record will take the first value in the Auto Increment column.

Here is a step to solve your problem.
In your .php file, just add the query given below:
<?php
$servername = "localhost";
$username = "root";
$password = "";
$dbname = "";
$conn = new mysqli($servername, $username, $password, $dbname);
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}
// replace "number" with the id that the next inserted row should get
$sql = "ALTER TABLE `table_name` AUTO_INCREMENT = number";
$conn->query($sql);
?>
I hope your problem will be solved.

if($id == 1){ // deleting first row
    mysqli_query($db,"UPDATE employees SET id=id-1 WHERE id>1");
}
else if($id>1 && $id<$num){ // deleting middle row
    mysqli_query($db,"UPDATE employees SET id=id-1 WHERE id>$id");
}
else if($id == $num){ // deleting last row
    mysqli_query($db,"ALTER TABLE employees AUTO_INCREMENT = $num");
}
else{
    echo "ERROR";
}
mysqli_query($db,"ALTER TABLE employees AUTO_INCREMENT = $num");

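Here is a function that fixes your problem:
// requires: import java.sql.*;
// Renumbers the ids of the given table to 1..rowCount, in ascending id order.
// The table name is concatenated into the SQL, so it must be a trusted value.
public static void fixID(Connection conn, String table) {
    try {
        Statement myStmt = conn.createStatement();
        ResultSet myRs;
        int i = 1, id = 1, n = 0;
        boolean b;
        String sql;
        // Use the row count, not max(id): with gaps, max(id) is larger than
        // the number of rows and the search below would never terminate.
        myRs = myStmt.executeQuery("select count(*) from " + table);
        if (myRs.next()) {
            n = myRs.getInt(1);
        }
        while (i <= n) {
            b = false;
            myRs = null;
            // Find the next id that actually exists.
            while (!b) {
                myRs = myStmt.executeQuery("select id from " + table + " where id=" + id);
                if (!myRs.next()) {
                    id++;
                } else {
                    b = true;
                }
            }
            // Move that row to the next free position in the sequence.
            sql = "UPDATE " + table + " set id =" + i + " WHERE id=" + id;
            myStmt.execute(sql);
            i++;
            id++;
        }
    } catch (SQLException e) {
        e.printStackTrace();
    }
}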
