How to UPDATE all columns of a record without having to list every column - sql

I'm trying to figure out a way to update a record without having to list every column name that needs to be updated.
For instance, it would be nice if I could use something similar to the following:
// the parts inside braces are what I am trying to figure out
UPDATE Employee
SET {all columns, without listing each of them}
WITH {this record with id of '111' from other table}
WHERE employee_id = '100'
If this can be done, what would be the most straightforward/efficient way of writing such a query?

It's not possible.
What you're trying to do is not part of the SQL specification and is not supported in plain SQL by any of the major database vendors. See the specifications of the SQL UPDATE statement for MySQL, PostgreSQL, MSSQL, Oracle, Firebird, and Teradata. Every one of those supports only the following syntax:
UPDATE table_reference
SET column1 = {expression} [, column2 = {expression}] ...
[WHERE ...]

This is not possible, but you can do this instead:
begin tran
delete from table where CONDITION
insert into table select * from EqualDesingTabletoTable where CONDITION
commit tran
Be careful with identity fields.

Here's a hardcore way to do it with SQL SERVER. Carefully consider security and integrity before you try it, though.
This uses schema to get the names of all the columns and then puts together a big update statement to update all columns except ID column, which it uses to join the tables.
This only works for a single column key, not composites.
usage: EXEC UPDATE_ALL 'source_table','destination_table','id_column'
CREATE PROCEDURE UPDATE_ALL
@SOURCE VARCHAR(100),
@DEST VARCHAR(100),
@ID VARCHAR(100)
AS
DECLARE @SQL VARCHAR(MAX) =
'UPDATE D SET ' +
-- Google 'for xml path stuff'. This gets the rows from the query results and
-- turns them into a comma-separated list.
STUFF((SELECT ', D.'+ COLUMN_NAME + ' = S.' + COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @DEST
AND COLUMN_NAME <> @ID
FOR XML PATH('')),1,1,'')
+ ' FROM ' + @SOURCE + ' S JOIN ' + @DEST + ' D ON S.' + @ID + ' = D.' + @ID
--SELECT @SQL
EXEC (@SQL)
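The general idea of reading the column names from the catalog and assembling the SET list from them can also be tried outside SQL Server. Below is a minimal sketch in Python against SQLite, chosen only so that it runs without a server. The helper name build_update_all and the tables employee and staging are made up for illustration, and SQLite's PRAGMA table_info stands in for INFORMATION_SCHEMA.COLUMNS. As with the procedure above, table and column names are concatenated into the statement, so they must come from a trusted source.

```python
import sqlite3

def build_update_all(conn, source, dest, id_col):
    """Return an UPDATE that copies every non-key column from source to dest.

    Hypothetical helper: identifiers are interpolated into the SQL text,
    so they must be trusted values, never user input.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{dest}")')]
    # One correlated subquery per column, skipping the key column.
    set_list = ", ".join(
        f'"{c}" = (SELECT s."{c}" FROM "{source}" AS s '
        f'WHERE s."{id_col}" = "{dest}"."{id_col}")'
        for c in cols if c != id_col
    )
    # Only touch rows that actually have a match in the source table.
    return (
        f'UPDATE "{dest}" SET {set_list} '
        f'WHERE "{id_col}" IN (SELECT "{id_col}" FROM "{source}")'
    )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE TABLE staging  (employee_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employee VALUES (100, 'old name', 'old dept'), (101, 'untouched', 'x');
    INSERT INTO staging  VALUES (100, 'new name', 'new dept');
""")

sql = build_update_all(conn, "staging", "employee", "employee_id")
conn.execute(sql)
rows = conn.execute("SELECT * FROM employee ORDER BY employee_id").fetchall()
print(rows)
```

Like the procedure, this sketch assumes a single-column key and identical column names in both tables.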

In Oracle PL/SQL, you can use the following syntax:
DECLARE
r my_table%ROWTYPE;
BEGIN
r.a := 1;
r.b := 2;
...
UPDATE my_table
SET ROW = r
WHERE id = r.id;
END;
Of course that just moves the burden from the UPDATE statement to the record construction, but you might already have fetched the record from somewhere.

How about using Merge?
https://technet.microsoft.com/en-us/library/bb522522(v=sql.105).aspx
It gives you the ability to run Insert, Update, and Delete. One other piece of advice is if you're going to be updating a large data set with indexes, and the source subset is smaller than your target but both tables are very large, move the changes to a temporary table first. I tried to merge two tables that were nearly two million rows each and 20 records took 22 minutes. Once I moved the deltas over to a temp table, it took seconds.

If you are using Oracle, you can use %ROWTYPE:
declare
var_x TABLE_A%ROWTYPE;
Begin
select * into var_x
from TABLE_B where rownum = 1;
update TABLE_A set row = var_x
where ID = var_x.ID;
end;
/
This assumes that TABLE_A and TABLE_B have the same structure.

It is possible. Like npe said it's not a standard practice. But if you really have to:
1. First a scalar function
CREATE FUNCTION [dte].[getCleanUpdateQuery] (@pTableName varchar(40), @pQueryFirstPart VARCHAR(200) = '', @pQueryLastPart VARCHAR(200) = '', @pIncludeCurVal BIT = 1)
RETURNS VARCHAR(8000) AS
BEGIN
DECLARE @pQuery VARCHAR(8000);
WITH cte_Temp
AS
(
SELECT
C.name
FROM SYS.COLUMNS AS C
INNER JOIN SYS.TABLES AS T ON T.object_id = C.object_id
WHERE T.name = @pTableName
)
SELECT @pQuery = (
CASE @pIncludeCurVal
WHEN 0 THEN
(
STUFF(
(SELECT ', ' + name + ' = ' + @pQueryFirstPart + @pQueryLastPart FROM cte_Temp FOR XML PATH('')), 1, 2, ''
)
)
ELSE
(
STUFF(
(SELECT ', ' + name + ' = ' + @pQueryFirstPart + name + @pQueryLastPart FROM cte_Temp FOR XML PATH('')), 1, 2, ''
)
) END)
RETURN 'UPDATE ' + @pTableName + ' SET ' + @pQuery
END
2. Use it like this
DECLARE @pQuery VARCHAR(8000) = dte.getCleanUpdateQuery(<your table name>, <query part before current value>, <query part after current value>, <1 if current value is used. 0 if updating everything to a static value>);
EXEC (@pQuery)
Example 1: set every column of employee to 'Unknown' (you need to make sure each column's type matches the intended value). The value carries its own quotes so that the generated statement contains a string literal:
DECLARE @pQuery VARCHAR(8000) = dte.getCleanUpdateQuery('employee', '', '''Unknown''', 0);
EXEC (@pQuery)
Example 2: Remove an undesired text qualifier (e.g. #)
DECLARE @pQuery VARCHAR(8000) = dte.getCleanUpdateQuery('employee', 'REPLACE(', ', ''#'', '''')', 1);
EXEC (@pQuery)
This query can be improved; it is just the version I saved and sometimes use. You get the idea.

Similar to an upsert, you could check whether the item exists in the table and, if so, delete it and insert it again with the new values (technically updating it). You would lose the original rowid, though, if that is something you need to keep in your case.
Behold, the updelsert
IF NOT EXISTS (SELECT * FROM Employee WHERE ID = @SomeID)
INSERT INTO Employee VALUES(@SomeID, @Your, @Vals, @Here)
ELSE
BEGIN
-- BEGIN/END keeps the second INSERT inside the ELSE branch
DELETE FROM Employee WHERE ID = @SomeID
INSERT INTO Employee VALUES(@SomeID, @Your, @Vals, @Here)
END
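For anyone who wants to try the delete-then-insert idea without a SQL Server instance, here is a minimal sketch in Python against SQLite. The table layout and the helper name replace_row are made up for illustration. The delete and the insert run inside one transaction, so a failure in the insert does not leave the row deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical three-column table standing in for Employee.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (100, 'old name', 'old dept')")
conn.commit()

def replace_row(conn, row):
    """Delete any existing row with the same id, then insert the new one."""
    # 'with conn' commits on success and rolls back if either statement raises.
    with conn:
        conn.execute("DELETE FROM employee WHERE id = ?", (row[0],))
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)

replace_row(conn, (100, "new name", "new dept"))  # existing id: row is replaced
replace_row(conn, (200, "another", "dept b"))     # new id: row is simply inserted
rows = conn.execute("SELECT * FROM employee ORDER BY id").fetchall()
print(rows)
```

The values are passed as parameters rather than concatenated, so no column names need to be listed, but the tuple must still match the table's column order.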

You could do it by dropping the column from the table, adding it back, and giving it a default value of whatever you need it to be. Note that saving this change will require rebuilding the table.

Related

Insert into table the outcome of a select on that table using Row_Number

I am creating a query in which I select data from a table, take a number of rows from that selection, insert those rows into an identical table in another database, and then repeat the process for the next batch of rows from the original table.
For reference, this is what I am trying to do (already built for Oracle):
$" INSERT INTO {destination-table}
SELECT * FROM {original-table}
WHERE ROWID IN (SELECT B.RID
FROM (SELECT ROWID AS RID, rownum as RID2
FROM {original-table}
WHERE {Where Claus}
AND ROWNUM <= {recordsPerStatement * iteration}
) B WHERE RID2 > {recordsPerStatement * (iteration - 1)})"
This is put through a loop in .NET.
For SQL Server, however, I cannot get this to work. I retrieve the data with:
$" Select B.* from (Select A.* from (Select Row_NUMBER()
OVER (order by %%physloc%%) As RowID, {original-table}.* FROM
{original-table} where {where-claus})
A Where A.RowID between {recordsPerStatement * (iteration - 1)}
AND {recordsPerStatement * iteration}) B"
The problem is that the SELECT above produces an extra column (RowID), which prevents me from inserting the data into the destination table.
I have been looking for ways to get rid of the RowID column in the top SELECT, or to insert data from the original table based on the rows retrieved
(something like INSERT INTO destination-table SELECT * FROM original-table WHERE EXISTS (rest of select query)), but to no avail.
TL;DR: get rid of a RowID column used in calculations so that the rows can be inserted into an identical table.
Specifications:
A LOT of data (millions of rows), which is why it is processed in batches.
Unknown tables (so I cannot refer to specific column names, as they are unknown).
The rows need an order (hence the ROW_NUMBER) so that the same data is not copied twice.
Insert using a SELECT query (retrieving the data first and doing some magic locally would severely impact performance).
If necessary, additional variables can be added (such as an ORDER BY clause variable). However, any reference to data in the query will ALWAYS be a variable, and if I can find a way to avoid adding more variables to the query, that would be preferable.
I hope someone has an idea of what I could look at further.
This approach uses a temporary table to save the paginated data before processing it page by page. It has worked for me, but not sure if you might have problems with very large data sets. You could put the whole thing in an SP then call the SP with parameters from .net. You will need to add a parameter for the destination table name and construct/execute an INSERT statement in the final loop.
-- Parameters
DECLARE @PageSize integer = 100;
DECLARE @TableName nVarchar(200) = 'WRD_WordHits';
DECLARE @OrderBy nVarchar(3000) = 'WordID'
STEP_010: BEGIN
-- Get the column definitions for the table
DECLARE @Cols int;
SELECT TABLE_NAME, ORDINAL_POSITION, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
, IS_NULLABLE
INTO #Tspec
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TableName;
-- Number of columns
SET @Cols = @@ROWCOUNT;
END;
STEP_020: BEGIN
-- Create the temporary table that will hold the paginated data
CREATE TABLE #TT2 ( PageNumber int, LineNumber int, SSEQ int )
DECLARE @STMT nvarchar(3000);
END;
STEP_030: BEGIN
-- Add columns to #TT2 using the column definitions
DECLARE @Ord int = 0;
DECLARE @Colspec nvarchar(3000) = '';
DECLARE @AllCols nvarchar(3000) = '';
DECLARE @ColName nvarchar(200) = '';
WHILE @Ord < @Cols BEGIN
SELECT @Ord = @Ord + 1;
-- Get the column name and specification
SELECT @ColName = Column_Name
, @Colspec =
Column_Name + ' ' + DATA_TYPE + CASE WHEN CHARACTER_MAXIMUM_LENGTH IS NULL THEN ''
ELSE '(' + CAST(CHARACTER_MAXIMUM_LENGTH AS varchar(30) ) + ')' END
FROM #Tspec WHERE ORDINAL_POSITION = @Ord;
-- Create and execute statement to add the column and the columns list used later
SELECT @STMT = ' ALTER TABLE #TT2 ADD ' + @Colspec + ';'
, @AllCols = @AllCols + ', ' + @ColName ;
EXEC sp_ExecuteSQL @STMT;
END;
-- Remove leading comma from columns list
SELECT @AllCols = SUBSTRING(@AllCols, 3, 3000);
PRINT @AllCols
-- Finished with the source table spec
DROP TABLE #Tspec;
END;
STEP_040: BEGIN -- Create and execute the statement used to fill #TT2 with the paginated data from the source table
-- The first two cols are the page number and row number within the page
-- The sequence is arbitrary but could use a key list for the order by clause
SELECT @STMT =
'INSERT #TT2
SELECT FLOOR( CAST( SSEQ as float) /' + CAST(@PageSize as nvarchar(10)) + ' ) + 1 PageNumber, (SSEQ) % ' + CAST(@PageSize as nvarchar(10)) + ' + 1 LineNumber, * FROM
(
SELECT ROW_NUMBER() OVER ( ORDER BY ' + @OrderBy + ' ) - 1 AS SSEQ, * FROM ' + @TableName + '
)
A; ' ;
EXEC sp_ExecuteSQL @STMT;
-- *** Test only to show that the table contains the data
--SELECT * FROM #TT2;
--SELECT @STMT = 'SELECT NULL AS EXECSELECT, ' + @AllCols + ' FROM #TT2;' ;
--EXEC sp_ExecuteSQL @STMT;
-- ***
END;
STEP_050: BEGIN -- Loop through paginated data, one page at a time.
-- Variables to control the paginated loop
DECLARE @PageMAX int;
SELECT @PageMAX = MAX(PageNumber) FROM #TT2;
PRINT 'Generated ' + CAST( @PageMAX AS varchar(10) ) + ' pages from table';
DECLARE @Page int = 0;
WHILE @Page < @PageMAX BEGIN
SELECT @Page = @Page + 1;
-- Create and execute the statement to get one page of data - this could be any statement to process data page by page
SELECT @STMT = 'SELECT ' + @AllCols + ' FROM #TT2 WHERE PageNumber = ' + CAST(@Page AS Varchar(10 )) + ' ORDER BY LineNumber '
-- Execute the statement.
PRINT @STMT -- For testing
--EXEC sp_EXECUTESQL @STMT;
END;
-- Finished with Paginated data
DROP TABLE #TT2;
END;
The solution I came up with:
First read the column names from the database and store them locally, then use them to build the INSERT / SELECT query so that only those columns (everything except RowID) are selected from the derived table.
commandText = $"SELECT column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = N'{table}'"
columnNames = "executionfunction with commandText"
columnNamesCount = columnNames.Rows.Count
Dim counter As Int16 = 0
commandText = String.Empty
commandText = $"INSERT INTO {destination} SELECT "
For Each row As DataRow In columnNames.Rows
If counter = columnNamesCount - 1 Then
commandText += $"B.{row("column_name")} "
Else
commandText += $"B.{row("column_name")}, "
End If
counter = counter + 1
Next
commandText += $"FROM
(Select A.* FROM (Select Row_NUMBER()
OVER(order by %%physloc%%) AS RowID, {table}.*
FROM {table} where {filter}) A
WHERE A.RowID between ({recordsPerStatement} * ({iteration}-1)) + 1
AND ({recordsPerStatement} * {iteration})) B"
EDIT: To remove the %%physloc%% clause, an OFFSET ... FETCH NEXT part has been built in. The new approach:
commandText += $"INSERT INTO {destination} SELECT * FROM {table} WHERE {filter}"
For i As Int16 = 1 To columnNamesCount
If i = 1 Then
commandText += $" ORDER BY {columnNames.Rows(i - 1)("column_name")} ASC"
Else
commandText += $"{columnNames.Rows(i - 1)("column_name")} ASC"
End If
If i <> columnNamesCount Then
commandText += ", "
End If
Next
commandText += $" OFFSET ({recordsPerStatement} * ({iteration} -1)) ROWS FETCH Next {recordsPerStatement} ROWS ONLY"
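The batching pattern itself (a stable ORDER BY plus an offset that advances by the batch size) can be tried in isolation. Here is a minimal sketch in Python against SQLite, where LIMIT ... OFFSET plays the role of OFFSET ... ROWS FETCH NEXT ... ROWS ONLY. The function name copy_in_batches and the tables original and destination are made up for illustration, and the sketch assumes the ordering column is unique, since ties in the ordering could cause rows to be copied twice or skipped between batches.

```python
import sqlite3

def copy_in_batches(conn, source, dest, order_col, batch_size):
    """Copy all rows from source to dest, batch_size rows per INSERT ... SELECT.

    Hypothetical helper: table and column names are interpolated into the
    SQL text, so they must be trusted values.
    """
    total = conn.execute(f'SELECT COUNT(*) FROM "{source}"').fetchone()[0]
    batches = 0
    for offset in range(0, total, batch_size):
        # Each batch is its own transaction: committed on success, rolled back on error.
        with conn:
            conn.execute(
                f'INSERT INTO "{dest}" SELECT * FROM "{source}" '
                f'ORDER BY "{order_col}" LIMIT ? OFFSET ?',
                (batch_size, offset),
            )
        batches += 1
    return batches

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE original    (id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE destination (id INTEGER PRIMARY KEY, payload TEXT);
""")
conn.executemany("INSERT INTO original VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 11)])
conn.commit()

batches = copy_in_batches(conn, "original", "destination", "id", batch_size=4)
copied = conn.execute("SELECT COUNT(*) FROM destination").fetchone()[0]
print(batches, copied)
```

Because SELECT * is used inside the database, no column names have to be known in advance; only the ordering column (or columns) must be supplied.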

Dynamic SP returning values in reverse order

I am using MS SQL and created one Dynamic stored procedure:
ALTER Procedure [dbo].[sp_MTracking]
(
@OList varchar(MAX)
)
As
BEGIN TRY
SET NOCOUNT ON
DECLARE @SQL varchar(600)
SET @SQL = 'select os.X,os.Y from Table1 as os join Table2 as s on os.sID=s.sID where s.SCode IN ('+ @OList +')'
exec (@SQL)
END TRY
BEGIN CATCH
Execute sp_DB_ErrorInfo
Select -1 Result
END CATCH
GO
It is working properly, but I am getting the X,Y values in reverse order.
For example, if I pass 'scode1,scode2' as the parameter, I get the X,Y values for scode1 in the second row and those for scode2 in the first row.
How can I fix this issue?
Thanks
This is a bit long for a comment.
SQL tables and results sets represent unordered sets. There is no ordering, unless you explicitly use an ORDER BY clause.
Your query does not have an ORDER BY. Hence, you have no reason to expect the results in any particular order. In addition, the ordering may be different on different runs of the query. If you want the results in a particular order, add ORDER BY.
Probably the easiest way is to use charindex():
order by charindex(',' + s.code + ',' , ',''' + @olist + ''',')
This is a bit more cumbersome in dynamic sql:
SET @SQL = '
select os.X,os.Y
from Table1 os join
Table2 s
on os.sID = s.sID
where s.SCode IN (' + @OList + ')
order by charindex('','' + s.code + '','', '',''' + @OList + ''', '')
';
Well, there are a couple of things here.
The first thing is what Gordon wrote - to ensure the order of the result set you must use the order by clause.
Second, like Devart demonstrated in his answer, you don't need dynamic sql for this kind of procedure.
Third, if you want your results ordered by the order of the parameters in the list, you should use a slightly different approach than the one Devart wrote.
Therefore, here are my 2 cents:
If you can change the stored procedure to accept a table valued parameter instead of VARCHAR(max) that would be your best option IMHO.
If not, you must use a split function to create a table from that varchar and then use that table in your select.
Note that you will have to choose a split function that returns a table with two columns - one for the value and one for its position in the original string.
Whatever the case may be, the rest of the sql should be something like this:
SELECT os.X, os.Y
FROM Table1 os
INNER JOIN Table2 s ON os.[sID] = s.[sID]
INNER JOIN @TVP t ON s.SCode = t.Value
ORDER BY t.Sort
That's assuming @TVP to be a table containing a Value column of the same data type as SCode in Table2, and a Sort column (an int, naturally).
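To see the effect of joining against a (value, position) table, here is a minimal sketch in Python against SQLite. A temp table named wanted stands in for the table valued parameter, and the function name track and the sample rows are made up for illustration. The point is only that ORDER BY on the position column is what fixes the order of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table2 (sID INTEGER PRIMARY KEY, SCode TEXT);
    CREATE TABLE Table1 (sID INTEGER, X INTEGER, Y INTEGER);
    INSERT INTO Table2 VALUES (1, 'scode1'), (2, 'scode2'), (3, 'scode3');
    INSERT INTO Table1 VALUES (1, 10, 11), (2, 20, 21), (3, 30, 31);
""")

def track(conn, codes):
    """Return (X, Y) rows for the given codes, in the order the codes were passed."""
    # The temp table plays the role of the table valued parameter:
    # one column for the value, one for its position in the input list.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted (Value TEXT, Sort INTEGER)")
    conn.execute("DELETE FROM wanted")
    conn.executemany("INSERT INTO wanted VALUES (?, ?)",
                     [(code, pos) for pos, code in enumerate(codes)])
    return conn.execute("""
        SELECT os.X, os.Y
        FROM Table1 AS os
        JOIN Table2 AS s ON os.sID = s.sID
        JOIN wanted AS t ON s.SCode = t.Value
        ORDER BY t.Sort
    """).fetchall()

result = track(conn, ["scode2", "scode1"])
print(result)
```

Passing the codes as data also removes the need to concatenate them into the statement text.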
Without dynamic sql -
ALTER PROCEDURE [dbo].[sp_MTracking]
(
@OList VARCHAR(MAX)
)
AS BEGIN
SET NOCOUNT ON
DECLARE @t TABLE (val VARCHAR(50) PRIMARY KEY WITH(IGNORE_DUP_KEY=ON))
INSERT INTO @t
SELECT item = t.c.value('.', 'INT')
FROM (
SELECT txml = CAST('<r>' + REPLACE(@OList, ',', '</r><r>') + '</r>' AS XML)
) r
CROSS APPLY txml.nodes('/r') t(c)
SELECT os.X, os.Y
FROM Table1 os
JOIN Table2 s ON os.[sID] = s.[sID]
WHERE s.SCode IN (SELECT * FROM @t)
--OPTION(RECOMPILE)
END
GO

Export data from a non-normalized database

I need to export data from a non-normalized database, where the same kind of value is spread over multiple columns, to a new normalized database.
One example is the Products table, which has 30 boolean columns (ValidSize1, ValidSize2, etc.), and every record has a foreign key that points to a Sizes table with 30 columns holding the size codes (XS, S, M, etc.). To get the valid sizes for a product, I have to scan both tables and take the value SizeCodeX from the Sizes table only if ValidSizeX on the product is true. Something like this:
Products Table
--------------
ProductCode <PK>
Description
SizesTableCode <FK>
ValidSize1
ValidSize2
[...]
ValidSize30
Sizes Table
-----------
SizesTableCode <PK>
SizeCode1
SizeCode2
[...]
SizeCode30
For now I am using a "template" query which I repeat 30 times:
SELECT
Products.Code,
Sizes.SizesTableCode, -- I need this code because different codes can have same size codes
Sizes.Size_1
FROM Products
INNER JOIN Sizes
ON Sizes.SizesTableCode = Products.SizesTableCode
WHERE Sizes.Size_1 IS NOT NULL
AND Products.ValidSize_1 = 1
I am just putting this query inside a loop and I replace the "_1" with the loop index:
SET @counter = 1;
SET @max = 30;
SET @sql = '';
WHILE (@counter <= @max)
BEGIN
SET @sql = @sql + ('[...]'); -- Here goes my query with dynamic indexes
IF @counter < @max
SET @sql = @sql + ' UNION ';
SET @counter = @counter + 1;
END
INSERT INTO DestDb.ProductsSizes EXEC(@sql); -- Insert statement
GO
Is there a better, cleaner or faster method to do this? I am using SQL Server and I can only use SQL/TSQL.
You can prepare a dynamic query using the SYS.Syscolumns table to get all the values in a row:
DECLARE @SqlStmt Varchar(MAX)
SET @SqlStmt=''
SELECT @SqlStmt = @SqlStmt + 'SELECT '''+ name +''' column , UNION ALL '
FROM SYS.Syscolumns WITH (READUNCOMMITTED)
WHERE Object_Id('dbo.Products')=Id AND ([Name] like 'SizeCode%' OR [Name] like 'ProductCode%')
IF REVERSE(@SqlStmt) LIKE REVERSE('UNION ALL ') + '%'
SET @SqlStmt = LEFT(@SqlStmt, LEN(@SqlStmt) - LEN('UNION ALL '))
print ( @SqlStmt )
Well, it seems that a "clean" (and much faster!) solution is the UNPIVOT function.
I found a very good example here:
http://pratchev.blogspot.it/2009/02/unpivoting-multiple-columns.html

Pass EXEC statement to APPLY as a parameter

I need to grab data from multiple databases that have tables with the same schema. For this, I created synonyms for these tables in one of the databases. The number of databases will grow over time, so the procedure that grabs the data should be flexible. I wrote the following code snippet to solve the problem:
WHILE @i < @count
BEGIN
SELECT @synonymName = [Name]
FROM Synonyms
WHERE [ID] = @i
SELECT @sql = 'SELECT TOP (1) *
FROM [dbo].[synonym' + @synonymName + '] as syn
WHERE [syn].[Id] = tr.[Id]
ORDER BY [syn].[System.ChangedDate] DESC'
INSERT INTO #tmp
SELECT col1, col2
FROM
(
SELECT * FROM TableThatHasRelatedDataFromAllTheSynonyms
WHERE [Date] > @dateFrom
) AS tr
OUTER APPLY (EXEC(@sql)) result
SET @i = @i + 1
END
I would also appreciate any ideas on how to simplify the solution.
Actually, it's better to import data from all tables into one table (maybe with additional column for source table name) and use it. Importing can be performed through SP or SSIS package.
Regarding initial question - you can achieve it through TVF wrapper for exec statement (with exec .. into inside it).
UPD: As noted in the comments, EXEC doesn't work inside a TVF. So, if you really don't want to change the DB structure and you need to use a lot of tables, I suggest one of the following:
Either select all the data from the synonym*** table into variables (as far as I can see, you select only one row) and use them,
or prepare dynamic SQL for the complete statement (with the INSERT, etc.) and use a temporary table instead of a table variable here.
My solution is quite simple: put the whole query into a string and EXEC it. Unfortunately, it runs 3 times slower than just copying and pasting the code for all the synonyms.
WHILE @i < @count
BEGIN
SELECT @synonymName = [Name]
FROM Synonyms
WHERE [ID] = @i
SELECT @sql = 'SELECT col1, col2
FROM
(
SELECT * FROM TableThatHasRelatedDataFromAllTheSynonyms
WHERE [Date] > ''' + @dateFrom + '''
) AS tr
OUTER APPLY (SELECT TOP (1) *
FROM [dbo].[synonym' + @synonymName + '] as syn
WHERE [syn].[Id] = tr.[Id]
ORDER BY [syn].[System.ChangedDate] DESC) result'
INSERT INTO #tmp
EXEC(@sql)
SET @i = @i + 1
END

SELECT INTO behavior and the IDENTITY property

I've been working on a project and came across some interesting behavior when using SELECT INTO. If I have a table with a column defined as int identity(1,1) not null and use SELECT INTO to copy it, the new table will retain the IDENTITY property unless there is a join involved. If there is a join, then the same column on the new table is defined simply as int not null.
Here is a script that you can run to reproduce the behavior:
CREATE TABLE People (Id INT IDENTITY(1,1) not null, Name VARCHAR(10))
CREATE TABLE ReverseNames (Name varchar(10), ReverseName varchar(10))
INSERT INTO People (Name)
VALUES ('John'), ('Jamie'), ('Joe'), ('Jenna')
INSERT INTO ReverseNames (Name, ReverseName)
VALUES ('John','nhoJ'), ('Jamie','eimaJ'), ('Joe','eoJ'), ('Jenna','anneJ')
--------
SELECT Id, Name
INTO People_ExactCopy
FROM People
SELECT Id, ReverseName as Name
INTO People_WithJoin
FROM People
JOIN ReverseNames
ON People.Name = ReverseNames.Name
SELECT Id, (SELECT ReverseName FROM ReverseNames WHERE Name = People.Name) as Name
INTO People_WithSubSelect
FROM People
--------
SELECT OBJECT_NAME(c.object_id) as [Table],
c.is_identity as [Id Column Retained Identity]
FROM sys.columns c
where
OBJECT_NAME(c.object_id) IN ('People_ExactCopy','People_WithJoin','People_WithSubSelect')
AND c.name = 'Id'
--------
DROP TABLE People
DROP TABLE People_ExactCopy
DROP TABLE People_WithJoin
DROP TABLE People_WithSubSelect
DROP TABLE ReverseNames
I noticed that the execution plans for both the WithJoin and WithSubSelect queries contained one join operator. I'm not sure if one will be significantly better on performance if we were dealing with a larger set of rows.
Can anyone shed any light on this and tell me if there is a way to utilize SELECT INTO with joins and still preserve the IDENTITY property?
From Microsoft:
When an existing identity column is selected into a new table, the new column inherits the IDENTITY property, unless one of the following conditions is true:
The SELECT statement contains a join, GROUP BY clause, or aggregate function.
Multiple SELECT statements are joined by using UNION.
The identity column is listed more than one time in the select list.
The identity column is part of an expression.
The identity column is from a remote data source.
If any one of these conditions is true, the column is created NOT NULL instead of inheriting the IDENTITY property. If an identity column is required in the new table but such a column is not available, or you want a seed or increment value that is different than the source identity column, define the column in the select list using the IDENTITY function.
You could use the IDENTITY function as they suggest and omit the IDENTITY column, but then you would lose the values, as the IDENTITY function would generate new values and I don't think that those are easily determinable, even with ORDER BY.
I don't believe there is much you can do, except build your CREATE TABLE statements manually, SET IDENTITY_INSERT ON, insert the existing values, then SET IDENTITY_INSERT OFF. Yes you lose the benefits of SELECT INTO, but unless your tables are huge and you are doing this a lot, [shrug]. This is not fun of course, and it's not as pretty or simple as SELECT INTO, but you can do it somewhat programmatically, assuming two tables, one having a simple identity (1,1), and a simple INNER JOIN:
SET NOCOUNT ON;
DECLARE
@NewTable SYSNAME = N'dbo.People_ExactCopy',
@JoinCondition NVARCHAR(255) = N' ON p.Name = r.Name';
DECLARE
@cols TABLE(t SYSNAME, c SYSNAME, p CHAR(1));
INSERT @cols SELECT N'dbo.People', N'Id', 'p'
UNION ALL SELECT N'dbo.ReverseNames', N'Name', 'r';
DECLARE @sql NVARCHAR(MAX) = N'CREATE TABLE ' + @NewTable + '
(
';
SELECT @sql += c.name + ' ' + t.name
+ CASE WHEN t.name LIKE '%char' THEN
'(' + CASE WHEN c.max_length = -1
THEN 'MAX' ELSE RTRIM(c.max_length/
(CASE WHEN t.name LIKE 'n%' THEN 2 ELSE 1 END)) END
+ ')' ELSE '' END
+ CASE c.is_identity
WHEN 1 THEN ' IDENTITY(1,1)'
ELSE ' ' END + ',
'
FROM sys.columns AS c
INNER JOIN @cols AS cols
ON c.object_id = OBJECT_ID(cols.t)
INNER JOIN sys.types AS t
ON c.system_type_id = t.system_type_id
AND c.name = cols.c;
SET @sql = LEFT(@sql, LEN(@sql)-1) + '
);
SET IDENTITY_INSERT ' + @NewTable + ' ON;
INSERT ' + @NewTable + '(';
SELECT @sql += c + ',' FROM @cols;
SET @sql = LEFT(@sql, LEN(@sql)-1) + ')
SELECT ';
SELECT @sql += p + '.' + c + ',' FROM @cols;
SET @sql = LEFT(@sql, LEN(@sql)-1) + '
FROM ';
SELECT @sql += t + ' AS ' + p + '
INNER JOIN ' FROM (SELECT DISTINCT
t,p FROM @cols) AS x;
SET @sql = LEFT(@sql, LEN(@sql)-10)
+ @JoinCondition + ';
SET IDENTITY_INSERT ' + @NewTable + ' OFF;';
PRINT @sql;
With the tables given above, this produces the following, which you could pass to EXEC sp_executeSQL instead of PRINT:
CREATE TABLE dbo.People_ExactCopy
(
Id int IDENTITY(1,1),
Name varchar(10)
);
SET IDENTITY_INSERT dbo.People_ExactCopy ON;
INSERT dbo.People_ExactCopy(Id,Name)
SELECT p.Id,r.Name
FROM dbo.People AS p
INNER JOIN dbo.ReverseNames AS r
ON p.Name = r.Name;
SET IDENTITY_INSERT dbo.People_ExactCopy OFF;
I did not deal with other complexities such as DECIMAL columns or other columns that have parameters such as max_length, nor did I deal with nullability, but these things wouldn't be hard to add if you need greater flexibility.
In the next version of SQL Server (code-named "Denali") you should be able to construct a CREATE TABLE statement much easier using the new metadata discovery functions - which do much of the grunt work for you in terms of specifying precision/scale/length, dealing with MAX, etc. You still have to manually create indexes and constraints; but you don't get those with SELECT INTO either.
What we really need is DDL that allows you to say something like "CREATE TABLE a IDENTICAL TO b;" or "CREATE TABLE a BASED ON b;"... it's been asked for here, but has been rejected (this is about copying a table to another schema, but the same concept could apply to a new table in the same schema with a different table name). http://connect.microsoft.com/SQLServer/feedback/details/632689
I realize this is a really late response, but for anyone who is still looking for a solution, as I was until I found this one:
You can't use the JOIN operator for the IDENTITY column property to be inherited.
What you can do is use a WHERE clause like this:
SELECT a.*
INTO NewTable
FROM
MyTable a
WHERE
EXISTS (SELECT 1 FROM SecondTable b WHERE b.ID = a.ID)
This works.