Copying rows to the same table without listing the columns - sql

I need to create a stored procedure that copies existing rows, selected by a condition, back into the same table and then deactivates the original records by setting an active-flag column to 0.
I tried insert into [tablename] select * from [tablename] where [condition] and, obviously, got an error due to the primary key constraint. Listing the columns in the select and excluding the primary key column would work, but there are multiple tables and some of them have around 300 columns, so I don't want to maintain a long column list in the select. Since I use SQL Server, there is a solution I found here on SO: read the column list from information_schema.columns and build a dynamic query. However, I am not satisfied with any of those solutions. Is there any other way to do this?
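For reference, a minimal sketch of the dynamic-query approach mentioned above, assuming a table dbo.MyTable whose identity primary key is named Id and a stand-in filter active = 1 for [condition] (all of these names are placeholders); STRING_AGG requires SQL Server 2017+, and on older versions FOR XML PATH can build the same list:

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build a comma-separated list of every column except the identity PK.
-- CONVERT to nvarchar(max) so the aggregate is not capped at 4000 chars.
SELECT @cols = STRING_AGG(CONVERT(nvarchar(max), QUOTENAME(column_name)), ', ')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'dbo'
  AND table_name   = 'MyTable'
  AND column_name  <> 'Id';

SET @sql = N'INSERT INTO dbo.MyTable (' + @cols + N') '
         + N'SELECT ' + @cols + N' FROM dbo.MyTable WHERE active = 1;';

EXEC sp_executesql @sql;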

Related

Querying a SQL table and only transferring updated rows to a different database

I have a database table which constantly gets updated. I am looking to query only the changes/additions that have been made to rows with a specific attribute in a column, e.g. get the rows that have been changed or added whose 'description' column is "xyz". My end goal is to copy these rows to another table in another database. Is this even possible? The reason for not simply querying and overwriting the rows in the other database is to avoid the inefficiency.
What have I tried so far?
A plain select query on the table gives me all the rows, not just the ones that have been changed or recently added. If I add these rows to the table in the other database, the only option I have is to overwrite the existing rows.
A log table logs the changes in a table, but I can't apply an additional filter in SQL to tell me which of those changes belong to rows whose 'description' column is 'xyz'.
Write your update statements to make use of OUTPUT to capture the before and after values and log them to a table of your choice.
Here is a really simple update example that uses OUTPUT to store the RowID and the before and after values of the ActivityType column:
DECLARE @MyTableVar table (
    SummaryBefore nvarchar(max),
    SummaryAfter  nvarchar(max),
    RowID         int
);

UPDATE DBA.dbo.dtest
SET ActivityType = 3
OUTPUT deleted.ActivityType,   -- value before the update
       inserted.ActivityType,  -- value after the update
       inserted.RowID
INTO @MyTableVar;

SELECT * FROM @MyTableVar;
You can do it in two ways:
Add date fields/columns like update_time and/or create_time (they can be defaulted if needed); these fields indicate the status of a record. Save your previous_run_time, have your select query look for records whose update_time/create_time is greater than previous_run_time, and move those records to the other DB (see the sketch after this list).
Turn on CDC (Change Data Capture) for the source table, which SQL Server supports out of the box, and then move only the records that have been impacted.
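A minimal sketch of the first option, assuming a source table dbo.SourceTable with an update_time column, a one-row watermark table dbo.SyncState, and a target database OtherDb reachable by three-part name (all names are placeholders):

DECLARE @previous_run_time datetime2;
DECLARE @current_run_time  datetime2 = SYSUTCDATETIME();

-- Watermark saved by the previous run.
SELECT @previous_run_time = last_run_time FROM dbo.SyncState;

-- Copy only rows touched since the last run that match the attribute filter.
INSERT INTO OtherDb.dbo.TargetTable (id, description, payload, update_time)
SELECT id, description, payload, update_time
FROM dbo.SourceTable
WHERE update_time > @previous_run_time
  AND description = 'xyz';

-- Advance the watermark for the next run.
UPDATE dbo.SyncState SET last_run_time = @current_run_time;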

Column Copy and Update vs. Column Create and Insert

I have a table with 32 Million rows and 31 columns in PostgreSQL 9.2.10. I am altering the table by adding columns with updated values.
For example, if the initial table is:
id  initial_color
--  -------------
1   blue
2   red
3   yellow
I am modifying the table so that the result is:
id  initial_color  modified_color
--  -------------  --------------
1   blue           blue_green
2   red            red_orange
3   yellow         yellow_brown
I have code that will read the initial_color column and update the value.
Given that my table has 32 million rows and that I have to apply this procedure on five of the 31 columns, what is the most efficient way to do this? My present choices are:
Copy the column and update the rows in the new column
Create an empty column and insert new values
I could do either option one column at a time or with all five at once. The column types are either character varying or character.
"The column types are either character varying or character."
Don't use character, that's a misunderstanding. varchar is ok, but I would suggest just text for arbitrary character data.
Any downsides of using data type "text" for storing strings?
"Given that my table has 32 million rows and that I have to apply this procedure on five of the 31 columns, what is the most efficient way to do this?"
If you don't have objects (views, foreign keys, functions) depending on the existing table, the most efficient way is to create a new table. Something like this (details depend on your installation):
BEGIN;
LOCK TABLE tbl_org IN SHARE MODE;  -- to prevent concurrent writes

CREATE TABLE tbl_new (LIKE tbl_org INCLUDING STORAGE INCLUDING COMMENTS);

ALTER TABLE tbl_new
    ADD COLUMN modified_color text
  , ADD COLUMN modified_something text;
 -- , etc.

INSERT INTO tbl_new (<all columns in order here>)
SELECT <all columns in order here>
     , myfunction(initial_color) AS modified_color  -- etc.
FROM tbl_org;
-- ORDER BY tbl_id; -- optionally order rows while being at it.

-- Add constraints and indexes like in the original table here.

DROP TABLE tbl_org;
ALTER TABLE tbl_new RENAME TO tbl_org;
COMMIT;
If you have depending objects, you need to do more.
Either way, be sure to add all five at once. If you update each column in a separate query, you write another row version each time due to the MVCC model of Postgres.
Related cases with more details, links and explanation:
Updating database rows without locking the table in PostgreSQL 9.2
Best way to populate a new column in a large table?
Optimizing bulk update performance in PostgreSQL
While creating a new table you might also order columns in an optimized fashion:
Calculating and saving space in PostgreSQL
Maybe I'm misreading the question, but as far as I know, you have 2 possibilities for creating a table with the extra columns:
CREATE TABLE
This would create a new table, and filling could be done using
CREATE TABLE ... AS SELECT ... to fill it at creation time, or
a separate INSERT ... SELECT ... later on.
Neither variant is what you seem to want, since you asked for a solution without listing all the fields.
Also, this would require all the data (plus the new fields) to be copied.
ALTER TABLE...ADD ...
This creates the new columns. As I'm not aware of any possibility to reference existing column values here, you will need an additional UPDATE ... SET ... to fill in the values.
So I'm not seeing any way to realize your choice 1.
Nevertheless, copying the (column) data just to overwrite it in a second step would be suboptimal in any case. Altering a table to add new columns does minimal I/O, so even if your choice 1 were possible, choice 2 promises better performance by a large factor.
Thus, run two statements: one ALTER TABLE adding all your new columns in one go, then one UPDATE providing the new values for these columns (see the sketch below).
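A minimal sketch of that two-statement approach; the column names and the expressions deriving the new values are placeholders:

-- Add all five columns in a single statement.
ALTER TABLE tbl
    ADD COLUMN modified_color   text,
    ADD COLUMN modified_texture text;  -- ... and the other three

-- One UPDATE fills every new column at once, so each row is
-- rewritten only once under Postgres' MVCC model.
UPDATE tbl
SET modified_color   = initial_color   || '_modified',
    modified_texture = initial_texture || '_modified';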
Create the new column (modified_color); it will have a value of NULL on all records.
Then run an update statement (the table is named my_table here, since table itself is a reserved word):
UPDATE my_table
SET modified_color = 'blue_green'
WHERE initial_color = 'blue';
If I am correct, this can also work like this:
UPDATE my_table SET modified_color = 'blue_green'   WHERE initial_color = 'blue';
UPDATE my_table SET modified_color = 'red_orange'   WHERE initial_color = 'red';
UPDATE my_table SET modified_color = 'yellow_brown' WHERE initial_color = 'yellow';
Once you have done this, you can do another update (assuming you have another column that I will call modified_color1):
UPDATE my_table SET modified_color1 = modified_color;

copying data from one table to another except the Identity column

Is there any way to copy all column values from one table to another except the Identity column, without mentioning all the rest of the column names?
I have a table with 63 columns. I created a temporary table with -
SELECT * INTO #TmpWide FROM WideTable WHERE 1 = 0
Now I want to copy some data from WideTable to #TmpWide. I need all the columns of WideTable except the Identity Id column, because I want the copied data to have their own sequential Ids in #TmpWide, from 1 onward. Is this possible without mentioning the (63-1) column names?
You could try dropping the column after the table is created:
SELECT * INTO #TmpWide FROM WideTable WHERE 1=0
ALTER TABLE #TmpWide DROP COLUMN [Id]
This does feel a little ugly or hack-y, but it should do the trick.
There isn't a way to do that, but it is also a bad idea to use * in a situation like this. If WideTable changes, you will be forced to change the stored procedures that SELECT * from it. I wrote hundreds of stored procs like this, and all it did was create nightmares I'm still dealing with today. Good luck.

Eliminating Duplicate Records in a DB2 Table

How do I delete duplicate records in a DB2 table? I want to be left with a single record for each group of dupes.
Create another table "no_dups" that has exactly the same columns as the table you want to eliminate the duplicates from. (You may want to add an identity column, just to make it easier to identify individual rows.)
Insert into "no_dups" select distinct column1, column2 ... columnN from the original table. The select distinct should bring back only one row for every group of duplicates in the original table. If it doesn't, you may have to alter the list of columns, or take a closer look at your data; it may look like duplicate data but actually is not.
When step 2 is done, you will have your original table and "no_dups", which holds all the rows without duplicates. At that point you can do any number of things: drop and rename the tables, or delete everything from the original and repopulate it with insert into the original select * from no_dups (a sketch follows below).
If you're running into problems identifying duplicates, and you've added an identity column to "no_dups", you should be able to delete rows one by one using the identity column value.
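A minimal sketch of those steps, assuming the original table is named orig_table and that select distinct across all of its columns is enough to collapse the duplicates:

-- Step 1: same shape as the original table.
CREATE TABLE no_dups LIKE orig_table;

-- Step 2: one row per group of duplicates.
INSERT INTO no_dups
SELECT DISTINCT * FROM orig_table;

-- Step 3: repopulate the original from the de-duplicated copy.
DELETE FROM orig_table;
INSERT INTO orig_table
SELECT * FROM no_dups;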

Changing table field to UNIQUE

I want to run the following sql command:
ALTER TABLE `my_table` ADD UNIQUE (
`ref_id` ,
`type`
);
The problem is that some of the data in the table would make this invalid, therefore altering the table fails.
Is there a clever way in MySQL to delete the duplicate rows?
SQL can, at best, handle this arbitrarily. To put it another way: this is your problem.
You have data that currently isn't unique. You want to make it unique. You need to decide how to handle the duplicates.
There are a variety of ways of handling this:
Modifying or deleting duplicate rows by hand if the numbers are sufficiently small;
Running statements to update or delete duplicates that meet certain criteria, to get to a point where the exceptions can be dealt with on an individual basis;
Copying the data to a temporary table, emptying the original and using queries to repopulate the table; and
so on.
Note: these all require user intervention.
You could of course just copy the table to a temporary table, empty the original, and copy the rows back in, ignoring those that fail (sketched below), but I expect that won't give you the results that you really want.
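A minimal sketch of that copy-and-repopulate route for the my_table from the question; INSERT IGNORE silently skips rows that would violate the new unique key, so which duplicate survives is arbitrary:

CREATE TABLE `my_table_tmp` LIKE `my_table`;
INSERT INTO `my_table_tmp` SELECT * FROM `my_table`;

TRUNCATE TABLE `my_table`;
ALTER TABLE `my_table` ADD UNIQUE (`ref_id`, `type`);

-- Rows duplicating an existing (ref_id, type) pair are skipped.
INSERT IGNORE INTO `my_table` SELECT * FROM `my_table_tmp`;
DROP TABLE `my_table_tmp`;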
If you don't care which row gets deleted, use IGNORE:
ALTER IGNORE TABLE `my_table` ADD UNIQUE (
`ref_id` ,
`type`
);
What you can do is add a temporary identity column to your table. With that, you can write a query to identify and delete the duplicates (you can modify the query a little to make sure only one copy from each set of duplicate rows is retained).
Once this is done, drop the temporary column and add the unique constraint to your original columns (see the sketch below).
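A minimal sketch of that idea, assuming my_table has no auto-increment column yet; tmp_id and the self-join are illustrative:

-- Temporary identity column to tell duplicate rows apart.
ALTER TABLE `my_table`
    ADD COLUMN `tmp_id` INT NOT NULL AUTO_INCREMENT,
    ADD UNIQUE (`tmp_id`);

-- Keep the lowest tmp_id in each (ref_id, type) group; delete the rest.
DELETE a
FROM `my_table` a
JOIN `my_table` b
  ON a.`ref_id` = b.`ref_id`
 AND a.`type`   = b.`type`
 AND a.`tmp_id` > b.`tmp_id`;

-- Drop the helper column and add the intended unique constraint.
ALTER TABLE `my_table`
    DROP COLUMN `tmp_id`,
    ADD UNIQUE (`ref_id`, `type`);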
Hope this helps.
What I've done in the past is export the unique set of data, drop the table, recreate it with the unique columns and import the data.
It is often faster than trying to figure out how to delete the duplicate data.
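For example, a sketch under the assumption that my_table has an integer primary key id that can pick one survivor per (ref_id, type) group:

-- Export the unique set: one surviving row per (ref_id, type).
CREATE TABLE `my_table_clean` AS
SELECT t.*
FROM `my_table` t
JOIN (SELECT MIN(`id`) AS `id`
      FROM `my_table`
      GROUP BY `ref_id`, `type`) k ON t.`id` = k.`id`;

DROP TABLE `my_table`;
RENAME TABLE `my_table_clean` TO `my_table`;
ALTER TABLE `my_table` ADD UNIQUE (`ref_id`, `type`);

Note that CREATE TABLE ... AS SELECT does not carry over indexes or the primary key, so those have to be re-added by hand.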
There is a good KB article that provides a step-by-step approach to finding and removing rows that have duplicate values. It provides two approaches - a one-off approach for finding and removing a single row and a broader solution to solving this when many rows are involved.
http://support.microsoft.com/kb/139444
Here is a snippet I used to delete duplicate rows in one of my tables:
BEGIN TRANSACTION

-- Rank the rows within each group of duplicates; rank 1 is the keeper.
SELECT *,
       RANK() OVER (PARTITION BY PolicyId, PlanSeqNum, BaseProductSeqNum,
                                 CoInsrTypeCd, SupplierTypeSeqNum
                    ORDER BY CoInsrAmt DESC) AS MyRank
INTO #tmpTable
FROM PlanCoInsr

-- Keep one row per group.
SELECT DISTINCT PolicyId, PlanSeqNum, BaseProductSeqNum,
                SupplierTypeSeqNum, CoInsrTypeCd, CoInsrAmt
INTO #tmpTable2
FROM #tmpTable
WHERE MyRank = 1

-- Reload the original table from the de-duplicated copy.
TRUNCATE TABLE PlanCoInsr
INSERT INTO PlanCoInsr
SELECT * FROM #tmpTable2

DROP TABLE #tmpTable
DROP TABLE #tmpTable2
COMMIT
This worked for me:
ALTER TABLE table_name ADD UNIQUE KEY field_name (field_name)
You will have to find some other field that is unique because deleting on ref_id and type alone will delete them all.
To get the duplicates:
select ref_id, type from my_table group by ref_id, type having count(*)>1
Xaprb has some clever tricks (maybe too clever): http://www.xaprb.com/blog/2007/02/06/how-to-delete-duplicate-rows-with-sql-part-2/