How to add a row number to new table in SQL? - sql-server-2005

I'm trying to populate a new table from an existing table using:
INSERT INTO NewTable (...,...)
SELECT * from SampleTable
What I need to do is add a record number at the beginning or the end; it really doesn't matter which, as long as it's there.
Sample Table
Elizabeth RI 02914
Emily MA 01834
Prospective New Table
1 Elizabeth RI 02914
2 Emily MA 01834
Is that at all possible?
This is what I'm ultimately shooting for. Right now the tables aren't the same size, because I need ErrorTemporaryTable to have a column whose value starts at a given number in the first row and increases by one on each subsequent row.
declare @counter int
declare @ClientMessage varchar(255)
declare @TestingMessage varchar(255)
select @counter = (select count(*) + 1 as counter from ErrorValidationTesting)
while @counter <= (select count(*) from ErrorValidationTable ET, ErrorValidationMessage EM where ET.Error = EM.Error_ID)
begin
insert into ErrorValidationTesting (Validation_Error_ID, Program_ID, Displayed_ID, Client_Message, Testing_Message, Create_Date)
select * from ErrorTemporaryTable
select @counter = @counter + 1
end

You can use the INTO clause with an IDENTITY column:
SELECT IDENTITY(int, 1,1) AS ID_Num, col0, col1
INTO NewTable
FROM OldTable;
Here is more information
You can also create a table with an identity column:
create table NewTable
(
id int IDENTITY,
col0 varchar(30),
col1 varchar(30)
)
and insert:
insert into NewTable (col0, col1)
SELECT col0, col1
FROM OldTable;
Or, if NewTable already exists and you want to add a new column to it, see this solution on SO.

INSERT INTO NewTable (...,...)
SELECT ROW_NUMBER() OVER (ORDER BY order_column), * from SampleTable

If you are in SQL Server
INSERT INTO newTable (idCol, c1,c2,...cn)
SELECT ROW_NUMBER() OVER(ORDER BY c1), c1,c2,...cn
FROM oldTable
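If you want to try the ROW_NUMBER() pattern without a SQL Server instance, here is a minimal sketch that runs the same INSERT ... SELECT against an in-memory SQLite database from Python. This is SQLite, not SQL Server, and it assumes SQLite 3.25 or later for window functions. The column names (row_num, name, state, zip) are made up for illustration; only SampleTable and NewTable come from the question.

```python
import sqlite3


def copy_with_row_numbers(rows):
    """Copy rows into NewTable, prefixing each one with a sequential number."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema standing in for the question's SampleTable.
    conn.execute("CREATE TABLE SampleTable (name TEXT, state TEXT, zip TEXT)")
    conn.executemany("INSERT INTO SampleTable VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE NewTable (row_num INTEGER, name TEXT, state TEXT, zip TEXT)"
    )
    # Same shape as the T-SQL answer: number the rows while copying them.
    conn.execute(
        """
        INSERT INTO NewTable (row_num, name, state, zip)
        SELECT ROW_NUMBER() OVER (ORDER BY name), name, state, zip
        FROM SampleTable
        """
    )
    result = conn.execute(
        "SELECT row_num, name, state, zip FROM NewTable ORDER BY row_num"
    ).fetchall()
    conn.close()
    return result


numbered = copy_with_row_numbers(
    [("Emily", "MA", "01834"), ("Elizabeth", "RI", "02914")]
)
print(numbered)
```

The numbering follows whatever you put in ORDER BY, so pick a column that gives the order you actually want.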

Try this query to insert 1,2,3... Replace MyTable and ID with your table and column names.
DECLARE @myVar int
SET @myVar = 0
UPDATE
MyTable
SET
ID = @myVar ,
@myVar = @myVar + 1

Related

Looping over columns and changing Null values

I have a table called myTable where some values are null.
I want to replace all null values in all columns with the previous non null value. I found some code that iterates over each row for a specific column, and changes Null Values as I want.
DECLARE @value AS int
UPDATE myTable
SET
@value = COALESCE(col2, @value),
col2 = COALESCE(col2, @value)
Result:
This does what I want it to do, but only for one column at a time. My goal is to alter the code above in some way so that I can automatically loop over each column in the table.
I ran into several issues when trying to achieve this. Here is my attempt
DECLARE @ColNames table (NAMES nvarchar(50), ARRAYINDEX int identity(1,1) )
INSERT INTO @ColNames (NAMES)
VALUES ('col1'),('col2'),('col3')
DECLARE @INDEXVAR int
DECLARE @TOTALCOUNT int
SET @INDEXVAR = 0
SELECT @TOTALCOUNT = COUNT(*) FROM @ColNames
WHILE @INDEXVAR < @TOTALCOUNT
BEGIN
DECLARE @curColName nvarchar (50)
SELECT @INDEXVAR = @INDEXVAR + 1
SELECT @curColName = NAMES from @ColNames where ARRAYINDEX = @INDEXVAR
DECLARE @value AS int
UPDATE myTable
SET
@value = COALESCE(@curColName, @value),
@curColName = COALESCE(@curColName, @value)
END
The issues that I have found and not been able to solve are the following:
@curColName is just an nvarchar variable and not a representation of my actual column, even if the names are the same. This gives me errors on both lines inside the SET statement.
When I hard-code the column names in my loop inside the BEGIN/END block, the script fills ALL null values with a number. So col2 gets the value 3 on ALL rows, not only rows 2 and 3 as in my previous example.
If these two points are hard or impossible to solve, is there an easier way of solving this problem?
Thanks
This is based on your expected results, and that you actually want to assign the value of Col2 to be the value of the previous non-NULL value, when ordered by the column pk.
If so, to achieve this you can use an updatable CTE. The first CTE puts the data into groups, based on the non-NULL values, and then the second gets the MAX value of Col2 in the group (which would be the non-NULL value). Finally you UPDATE against that CTE on rows where Col2 has the value NULL:
CREATE TABLE dbo.YourTable (PK int,
Col1 int,
Col2 int);
INSERT INTO dbo.YourTable (PK,Col1,Col2)
VALUES(1,2,NULL),
(2,NULL,3),
(3,NULL,NULL);
GO
WITH Groups AS(
SELECT Col2,
COUNT(Col2) OVER (ORDER BY PK) AS Grp
FROM dbo.YourTable),
Maxes AS(
SELECT Col2,
MAX(Col2) OVER (PARTITION BY Grp) AS MaxCol2
FROM Groups)
UPDATE Maxes
SET Col2 = MaxCol2
WHERE Col2 IS NULL;
GO
SELECT *
FROM dbo.YourTable;
GO
DROP TABLE dbo.YourTable;
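To see what the two CTEs compute before running the UPDATE, here is a rough sketch of the same grouping logic as a plain SELECT, run against an in-memory SQLite database from Python. SQLite cannot update through a CTE the way SQL Server can, so this only shows the filled value next to the original; it assumes SQLite 3.25 or later for window functions, and FilledCol2 is a name I made up.

```python
import sqlite3


def fill_down(rows):
    """Return (PK, Col2, FilledCol2), carrying the last non-NULL Col2 forward."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (PK INTEGER, Col1 INTEGER, Col2 INTEGER)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)
    filled = conn.execute(
        """
        WITH Groups AS (
            -- COUNT ignores NULLs, so the running count only increases
            -- on rows that have a value; NULL rows inherit the group.
            SELECT PK, Col2, COUNT(Col2) OVER (ORDER BY PK) AS Grp
            FROM YourTable
        )
        -- Each group holds at most one non-NULL value, so MAX picks it.
        SELECT PK, Col2, MAX(Col2) OVER (PARTITION BY Grp) AS FilledCol2
        FROM Groups
        ORDER BY PK
        """
    ).fetchall()
    conn.close()
    return filled


filled = fill_down([(1, 2, None), (2, None, 3), (3, None, None)])
print(filled)
```

Row 1 stays NULL because there is no earlier non-NULL value to carry forward; row 3 picks up the 3 from row 2.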

SQL Loop through tables and columns to find which columns are NOT empty

I created a temp table #test containing 3 fields: ColumnName, TableName, and Id.
I would like to see which rows in the #test table (columns in their respective tables) are not empty. That is, for every column name in the ColumnName field, and for the corresponding table in the TableName field, I would like to see whether the column is empty or not. I tried some things (see below) but didn't get anywhere. Help, please.
declare @LoopCounter INT = 1, @maxloopcounter int, @test varchar(100),
@test2 varchar(100), @check int
set @maxloopcounter = (select count(TableName) from #test)
while @LoopCounter <= @maxloopcounter
begin
DECLARE @PropIDs TABLE (tablename varchar(max), id int )
Insert into @PropIDs (tablename, id)
SELECT [tableName], id FROM #test
where id = @LoopCounter
set @test2 = (select columnname from #test where id = @LoopCounter)
declare @sss varchar(max)
set @sss = (select tablename from @PropIDs where id = @LoopCounter)
set @check = (select count(@test2)
from (select tablename
from @PropIDs
where id = @LoopCounter) A
)
print @test2
print @sss
print @check
set @LoopCounter = @LoopCounter + 1
end
In order to use variables as column names and table names in your @check = query, you will need to use dynamic SQL.
There is most likely a better way to do this, but I can't think of one offhand. Here is what I would do.
Declare a cursor over the select rather than using a WHILE loop as you have it. That way you don't have to count on sequential ids. The cursor would fetch the fields columnname, id and tablename.
In the loop, build a dynamic SQL statement:
Set @Sql = 'Select Count(*) Cnt Into #Temp2 From ' + @tablename + ' Where ' + @columnname + ' Is not null And ' + @columnname + ' <> '''''
Exec(@Sql)
Then check #Temp2 for a value greater than 0 and, if this is what you desire, you can use the @id that was fetched to update your #Temp table. Putting the result into a scalar variable rather than a temp table would be preferable, but I can't remember the best way to do that, and using a temp table allows you to use an update join, so it would work well in my opinion.
https://www.mssqltips.com/sqlservertip/1599/sql-server-cursor-example/
http://www.sommarskog.se/dynamic_sql.html
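The loop described above (build one COUNT query per table/column pair, then run it) can be sketched outside of T-SQL as well. The following is only an illustration of the idea using Python and an in-memory SQLite database, not the cursor-based T-SQL itself. The helper names quote_identifier and non_empty_counts and the customers table are all hypothetical.

```python
import sqlite3


def quote_identifier(name):
    """Double-quote an identifier so it cannot break out of the statement."""
    return '"' + name.replace('"', '""') + '"'


def non_empty_counts(conn, pairs):
    """For each (table, column) pair, count rows that are neither NULL nor ''."""
    counts = {}
    for table, column in pairs:
        # Identifiers cannot be bound as parameters, so the statement text
        # is built per pair; this is the "dynamic SQL" step.
        sql = "SELECT COUNT(*) FROM {t} WHERE {c} IS NOT NULL AND {c} <> ''".format(
            t=quote_identifier(table), c=quote_identifier(column)
        )
        counts[(table, column)] = conn.execute(sql).fetchone()[0]
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, fax TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ann", None), ("Bob", ""), ("Cy", None)],
)
counts = non_empty_counts(conn, [("customers", "name"), ("customers", "fax")])
print(counts)
```

A count of 0 means the column is empty in every row. Because table and column names are concatenated into the statement, quoting them (QUOTENAME in T-SQL) is worth doing even when the names come from your own metadata table.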
Found a way to extract all non-empty tables from the schema, then just joined with the initial temp table that I had created.
select A.tablename, B.[row_count]
from (select * from #test) A
left join
(SELECT r.table_name, r.row_count, r.[object_id]
FROM sys.tables t
INNER JOIN (
SELECT OBJECT_NAME(s.[object_id]) table_name, SUM(s.row_count) row_count, s.[object_id]
FROM sys.dm_db_partition_stats s
WHERE s.index_id in (0,1)
GROUP BY s.[object_id]
) r on t.[object_id] = r.[object_id]
WHERE r.row_count > 0 ) B
on A.[TableName] = B.[table_name]
WHERE ROW_COUNT > 0
order by b.row_count desc
How about this one: a bitmask computed column that checks for NULLs. Each bit in the mask tells you whether the corresponding column is NULL or not (the bits are powers of 2: 1, 2, 4, ...).
CREATE TABLE FindNullComputedMask
(ID int
,val int
,valstr varchar(3)
,NotEmpty as
CASE WHEN ID IS NULL THEN 0 ELSE 1 END
|
CASE WHEN val IS NULL THEN 0 ELSE 2 END
|
CASE WHEN valstr IS NULL THEN 0 ELSE 4 END
)
INSERT FindNullComputedMask
SELECT 1,1,NULL
INSERT FindNullComputedMask
SELECT NULL,2,NULL
INSERT FindNullComputedMask
SELECT 2,NULL, NULL
INSERT FindNullComputedMask
SELECT 3,3,3
SELECT *
FROM FindNullComputedMask

Update table with new value for each row

I need to update a column (of type datetime) in the top 1000 rows of my table. The catch is that with each additional row I must increment GETDATE() by 1 second... something like DATEADD(ss,1,GETDATE())
The only way I know how to do this is something like this:
UPDATE tablename
SET columnname = CASE id
WHEN 1 THEN DATEADD(ss,1,GETDATE())
WHEN 2 THEN DATEADD(ss,2,GETDATE())
...
END
Obviously this is not plausible. Any ideas?
How about using id rather than a constant?
UPDATE tablename
SET columnname = DATEADD(second, id, GETDATE() )
WHERE id <= 1000;
If you want the first 1000 rows (by id), but the id has gaps or other problems, then you can use a CTE:
with toupdate as (
select t.*, row_number() over (order by id) as seqnum
from tablename t
)
update toupdate
set columnname = dateadd(second, seqnum, getdate())
where seqnum <= 1000;
I don't know what your ID is like, and I'm assuming you have at least SQL Server 2005, since ROW_NUMBER() isn't available before that (the multi-row VALUES used to set up the sample data below needs SQL Server 2008).
Note: I used TOP 2 to show you that the TOP works. You can change it to TOP 1000 for your actual query.
DECLARE @table TABLE (ID int, columnName DATETIME);
INSERT INTO @table(ID)
VALUES(1),(2),(3);
UPDATE @table
SET columnName = DATEADD(SECOND,B.row_num,GETDATE())
FROM @table A
INNER JOIN
(
SELECT TOP 2 *, ROW_NUMBER() OVER (ORDER BY ID) row_num
FROM @table
ORDER BY ID
) B
ON A.ID = B.ID
SELECT *
FROM @table
Results:
ID columnName
----------- -----------------------
1 2015-03-31 13:11:59.760
2 2015-03-31 13:12:00.760
3 NULL
You don't say which SQL Server version you're using, so I will assume SQL Server 2005 or above. I believe the WAITFOR DELAY command would be a good option to keep adding 1 second to each row of the datetime column.
See this example:
-- Create temp table
CREATE TABLE #Client
(
RecordID int identity(1,1),
[Name] nvarchar(100) not null,
PurchaseDate datetime null
)
-- Fill in temp table with example values
INSERT INTO #Client
VALUES ( 'Jhon', NULL)
INSERT INTO #Client
VALUES ( 'Martha', NULL)
INSERT INTO #Client
VALUES ( 'Jimmy', NULL)
-- Create local variables
DECLARE @currentRecordId int,
@currentName nvarchar(100)
-- Create cursor
DECLARE ClientsCursor CURSOR FOR
SELECT RecordID,
[Name]
FROM #Client
OPEN ClientsCursor
FETCH FROM ClientsCursor INTO @currentRecordId, @currentName
-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE #Client
SET PurchaseDate = DATEADD(ss,1,GETDATE())
WHERE RecordID = @currentRecordId
AND [Name] = @currentName
WAITFOR DELAY '00:00:01.000'
FETCH NEXT FROM ClientsCursor INTO @currentRecordId, @currentName
END
END
CLOSE ClientsCursor;
DEALLOCATE ClientsCursor;
SELECT *
FROM #Client
And here's the result:
1 Jhon 2015-03-31 15:20:04.477
2 Martha 2015-03-31 15:20:05.473
3 Jimmy 2015-03-31 15:20:06.470
Hope you find this answer helpful
This should be what you need (at least as a guideline). Note that this is MySQL syntax, not T-SQL:
DELIMITER $$
CREATE PROCEDURE ADDTIME()
BEGIN
DECLARE i INT Default 1 ;
simple_loop: LOOP
UPDATE tablename SET columnname = DATE_ADD(NOW(), INTERVAL i SECOND) where rownumber = i;
SET i=i+1;
IF i=1001 THEN
LEAVE simple_loop;
END IF;
END LOOP simple_loop;
END $$
DELIMITER ;
To call that stored procedure use
CALL ADDTIME()

Update specific columns in a table iteratively (Do a bulk update)

My Table Schema is as follows:
Gender: char(1), not null
Last Name: varchar(25), null
First Name: varchar(35), not null
The data in the table looks like:
Gender | Last Name | First Name |
M Doe John
F Marie Jane
M Jones Jameson
F Simpson Alice
I now am trying to update all the names in the table from the names present in the txt file.
My Query is as follows:
-- Sort out the forenames we'll be using for the data. We make a #Name table because I have yet to figure out
-- inserting specific columns using BULK INSERT and without using a format file.
CREATE TABLE #Name (Name VARCHAR(50))
CREATE TABLE #ForeNames (FirstName VARCHAR(50), Gender VARCHAR(1))
-- Move data into the #Name table
BULK INSERT #Name FROM "c:\girlsforenames.txt" WITH (ROWTERMINATOR='\n')
-- Now move it to the forename table and add the gender
INSERT INTO #ForeNames SELECT [Name], 'F' FROM #Name
-- Delete the names from temporary table
TRUNCATE TABLE #Name
-- Same for the boys
BULK INSERT #Name FROM "c:\boysforenames.txt" WITH (ROWTERMINATOR='\n')
INSERT INTO #ForeNames SELECT [Name], 'M' FROM #Name
-- Now do the surnames
TRUNCATE TABLE #Name
BULK INSERT #Name FROM "c:\surnames.txt" WITH (ROWTERMINATOR='\n')
DECLARE @Counter BIGINT
SET @Counter = 4
WHILE (@Counter > 0)
BEGIN
UPDATE TableName
set
[last_name]= (SELECT TOP 1 FirstName from #ForeNames),
[first_name]=(SELECT TOP 1 Name FROM #Name ORDER BY NEWID()),
[gender]= ( SELECT TOP 1 Gender FROM #ForeNames ORDER BY NEWID());
SET @Counter=@Counter-1
END
END
DROP TABLE #Name
DROP TABLE #ForeNames
SELECT * FROM TableName
What happens is that all the rows in the table are updated with the same values, and each time I execute the query they are updated with a new set of values.
What I want is to loop through each row, update it, and then update the next row with a different set of random names. But here it is applying the same random name to all the rows of the table.
Any help would be appreciated.
Each SELECT statement is only being executed once in your example (and thus returning 1 result), and since your UPDATE isn't being limited, you're applying the same value to every row.
If you want to update each row with different values, you can use a CTE and the ROW_NUMBER() function to pair each row with its own value.
There's no need to loop, you can do it in one fell swoop:
WITH cte AS (SELECT *,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS n1
FROM TableName
)
UPDATE cte
SET FirstName = names.Name
FROM cte
JOIN (SELECT *,ROW_NUMBER() OVER (ORDER BY NEWID()) AS n2
FROM #name
)names
on cte.n1 = names.n2
Demo: SQL Fiddle
This example is just for the FirstName.

Delete duplicate records in SQL Server?

Consider a column named EmployeeName in table Employee. The goal is to delete repeated records, based on the EmployeeName field.
EmployeeName
------------
Anand
Anand
Anil
Dipak
Anil
Dipak
Dipak
Anil
Using one query, I want to delete the records which are repeated.
How can this be done with TSQL in SQL Server?
You can do this with window functions. It will order the dupes by empId, and delete all but the first one.
delete x from (
select *, rn=row_number() over (partition by EmployeeName order by empId)
from Employee
) x
where rn > 1;
Run it as a select to see what would be deleted:
select *
from (
select *, rn=row_number() over (partition by EmployeeName order by empId)
from Employee
) x
where rn > 1;
Assuming that your Employee table also has a unique column (ID in the example below), the following will work:
delete from Employee
where ID not in
(
select min(ID)
from Employee
group by EmployeeName
);
This will leave the version with the lowest ID in the table.
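For anyone who wants to check the NOT IN / MIN(ID) approach on the question's sample data before touching a real table, here is a small sketch using Python and an in-memory SQLite database instead of SQL Server. It assumes an auto-assigned integer ID column, which the question does not show, so treat the schema as hypothetical.

```python
import sqlite3


def delete_duplicates(names):
    """Keep only the lowest-ID row for each EmployeeName."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: ID is assigned automatically in insert order.
    conn.execute(
        "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, EmployeeName TEXT)"
    )
    conn.executemany(
        "INSERT INTO Employee (EmployeeName) VALUES (?)", [(n,) for n in names]
    )
    # Same statement as in the answer: keep MIN(ID) per name, delete the rest.
    conn.execute(
        """
        DELETE FROM Employee
        WHERE ID NOT IN (SELECT MIN(ID) FROM Employee GROUP BY EmployeeName)
        """
    )
    remaining = conn.execute(
        "SELECT ID, EmployeeName FROM Employee ORDER BY ID"
    ).fetchall()
    conn.close()
    return remaining


remaining = delete_duplicates(
    ["Anand", "Anand", "Anil", "Dipak", "Anil", "Dipak", "Dipak", "Anil"]
)
print(remaining)
```

Swapping MIN for MAX would keep the most recently inserted row of each group instead.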
Edit
Re McGyver's comment - as of SQL 2012
MIN can be used with numeric, char, varchar, uniqueidentifier, or datetime columns, but not with bit columns
For 2008 R2 and earlier,
MIN can be used with numeric, char, varchar, or datetime columns, but not with bit columns (and it also doesn't work with GUID's)
For 2008R2 you'll need to cast the GUID to a type supported by MIN, e.g.
delete from GuidEmployees
where CAST(ID AS binary(16)) not in
(
select min(CAST(ID AS binary(16)))
from GuidEmployees
group by EmployeeName
);
SqlFiddle for various types in Sql 2008
SqlFiddle for various types in Sql 2012
You could try something like the following:
delete T1
from MyTable T1, MyTable T2
where T1.dupField = T2.dupField
and T1.uniqueField > T2.uniqueField
(this assumes that you have an integer based unique field)
Personally though I'd say you were better off trying to correct the fact that duplicate entries are being added to the database before it occurs rather than as a post fix-it operation.
DELETE
FROM MyTable
WHERE ID NOT IN (
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)
WITH TempUsers (FirstName, LastName, duplicateRecordCount)
AS
(
SELECT FirstName, LastName,
ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY FirstName) AS duplicateRecordCount
FROM dbo.Users
)
DELETE
FROM TempUsers
WHERE duplicateRecordCount > 1
WITH CTE AS
(
SELECT EmployeeName,
ROW_NUMBER() OVER(PARTITION BY EmployeeName ORDER BY EmployeeName) AS R
FROM employee_table
)
DELETE CTE WHERE R > 1;
The magic of common table expressions.
Try
DELETE
FROM employee
WHERE rowid NOT IN (SELECT MAX(rowid) FROM employee
GROUP BY EmployeeName);
If you're looking for a way to remove duplicates, yet you have a foreign key pointing to the table with duplicates, you could take the following approach using a slow yet effective cursor.
It will relocate the duplicate keys on the foreign key table.
create table #properOlvChangeCodes(
id int not null,
name nvarchar(max) not null
)
DECLARE @name VARCHAR(MAX);
DECLARE @id INT;
DECLARE @newid INT;
DECLARE @oldid INT;
DECLARE OLVTRCCursor CURSOR FOR SELECT id, name FROM Sales_OrderLineVersionChangeReasonCode;
OPEN OLVTRCCursor;
FETCH NEXT FROM OLVTRCCursor INTO @id, @name;
WHILE @@FETCH_STATUS = 0
BEGIN
-- determine if it should be replaced (is already in temptable with name)
if(exists(select * from #properOlvChangeCodes where Name=@name)) begin
-- if it is, find its id
Select top 1 @newid = id
from Sales_OrderLineVersionChangeReasonCode
where Name = @name
-- replace terminationreasoncodeid in olv for the new terminationreasoncodeid
update Sales_OrderLineVersion set ChangeReasonCodeId = @newid where ChangeReasonCodeId = @id
-- delete the record from the terminationreasoncode
delete from Sales_OrderLineVersionChangeReasonCode where Id = @id
end else begin
-- insert into temp table if new
insert into #properOlvChangeCodes(Id, name)
values(@id, @name)
end
FETCH NEXT FROM OLVTRCCursor INTO @id, @name;
END;
CLOSE OLVTRCCursor;
DEALLOCATE OLVTRCCursor;
drop table #properOlvChangeCodes
delete from person
where ID not in
(
select t.id from
(select min(ID) as id from person
group by email
) as t
);
Please see the below way of deletion too.
Declare @Employee table (EmployeeName varchar(10))
Insert into @Employee values
('Anand'),('Anand'),('Anil'),('Dipak'),
('Anil'),('Dipak'),('Dipak'),('Anil')
Select * from @Employee
This creates a sample table variable named @Employee and loads it with the given data.
Delete aliasName from (
Select *,
ROW_NUMBER() over (Partition by EmployeeName order by EmployeeName) as rowNumber
From @Employee) aliasName
Where rowNumber > 1
Select * from @Employee
Result:
I know this was asked six years ago; I'm posting just in case it is helpful for anyone.
Here's a nice way of deduplicating records in a table that has an identity column based on a desired primary key that you can define at runtime. Before I start I'll populate a sample data set to work with using the following code:
if exists (select 1 from sys.all_objects where type='u' and name='_original')
drop table _original
declare @startyear int = 2017
declare @endyear int = 2018
declare @iterator int = 1
declare @income money = cast((SELECT round(RAND()*(5000-4990)+4990 , 2)) as money)
declare @salesrepid int = cast(floor(rand()*(9100-9000)+9000) as varchar(4))
create table #original (rowid int identity, monthyear varchar(max), salesrepid int, sale money)
while @iterator<=50000 begin
insert #original
select (Select cast(floor(rand()*(@endyear-@startyear)+@startyear) as varchar(4))+'-'+ cast(floor(rand()*(13-1)+1) as varchar(2)) ), @salesrepid , @income
set @salesrepid = cast(floor(rand()*(9100-9000)+9000) as varchar(4))
set @income = cast((SELECT round(RAND()*(5000-4990)+4990 , 2)) as money)
set @iterator=@iterator+1
end
update #original
set monthyear=replace(monthyear, '-', '-0') where len(monthyear)=6
select * into _original from #original
Next I'll create a Type called ColumnNames:
create type ColumnNames AS table
(Columnnames varchar(max))
Finally I will create a stored proc with the following 3 caveats:
1. The proc will take a required parameter @tablename that defines the name of the table you are deleting from in your database.
2. The proc has an optional parameter @columns that you can use to define the fields that make up the desired primary key that you are deleting against. If this parameter is left empty, it is assumed that all the fields besides the identity column make up the desired primary key.
3. When duplicate records are deleted, the record with the lowest value in its identity column will be kept.
Here is my delete_dupes stored proc:
create proc delete_dupes (@tablename varchar(max), @columns columnnames readonly)
as
begin
declare @table table (iterator int, name varchar(max), is_identity int)
declare @tablepartition table (idx int identity, type varchar(max), value varchar(max))
declare @partitionby varchar(max)
declare @iterator int= 1
if exists (select 1 from @columns) begin
declare @columns1 table (iterator int, columnnames varchar(max))
insert @columns1
select 1, columnnames from @columns
set @partitionby = (select distinct
substring((Select ', '+t1.columnnames
From @columns1 t1
Where T1.iterator = T2.iterator
ORDER BY T1.iterator
For XML PATH ('')),2, 1000) partition
From @columns1 T2 )
end
insert @table
select 1, a.name, is_identity from sys.all_columns a join sys.all_objects b on a.object_id=b.object_id
where b.name = @tablename
declare @identity varchar(max)= (select name from @table where is_identity=1)
while @iterator>=0 begin
insert @tablepartition
Select distinct case when @iterator=1 then 'order by' else 'over (partition by' end ,
substring((Select ', '+t1.name
From @table t1
Where T1.iterator = T2.iterator and is_identity=@iterator
ORDER BY T1.iterator
For XML PATH ('')),2, 5000) partition
From @table T2
set @iterator=@iterator-1
end
declare @originalpartition varchar(max)
if @partitionby is null begin
select @originalpartition = replace(b.value+','+a.type+a.value ,'over (partition by','') from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1
select @partitionby = a.type+a.value+' '+b.type+a.value+','+b.value+') rownum' from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1
end
else
begin
select @originalpartition=b.value +','+ @partitionby from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1
set @partitionby = (select 'OVER (partition by'+ @partitionby + ' ORDER BY'+ @partitionby + ','+b.value +') rownum'
from @tablepartition a cross join @tablepartition b where a.idx=2 and b.idx=1)
end
exec('select row_number() ' + @partitionby +', '+@originalpartition+' into ##temp from '+ @tablename+'')
exec(
'delete a from '+@tablename+' a
left join ##temp b on a.'+@identity+'=b.'+@identity+' and rownum=1
where b.rownum is null')
drop table ##temp
end
Once this is compiled, you can delete all your duplicate records by running the proc. To delete dupes without defining a desired primary key, use this call:
exec delete_dupes '_original'
To delete dupes based on a defined desired primary key use this call:
declare @table1 as columnnames
insert @table1
values ('salesrepid'),('sale')
exec delete_dupes '_original' , @table1