SQL Server 2008 copy table with changing physical sequence of fields - sql-server-2005

The task is to copy
table1 ( A,C,B )
to
table2 ( A,B,C )
The tables are effectively identical: the same fields and constraints, only the physical sequence of the fields differs. Can I do this with standard tools and minimal coding? For bulk copy in this case, a mapping for each field pair seems to be required.

This seems to be a simple
insert into table2(A,B,C)
select A,B,C from table1

If I've understood your question correctly, you can just copy the table without changes and then change the ordering afterwards. Change the field order by dragging the column in Design mode in SQL Enterprise Manager.

Related

Use to SQL to detect changes between tables

I want to create a SQL script that would compare two of the same fields in two different tables. These tables may be on two different servers. I want to use this script to check that if one field gets updated in one table/server, it is also updated in the other table/server. Any ideas on how to approach this?
The first thing you need to be sure of is that your servers are linked; otherwise you won't easily be able to compare the two. If the servers are linked and the tables are identical, you can use an EXCEPT query to identify the changes, e.g.
select * from [server1].[db].[schema].[table]
except
select * from [server2].[db].[schema].[table]
This query will return all rows from the table on server1 that don't appear on server2. From here you can either wrap this in a COUNT, or insert/update the missing/changed rows from one table to the other.
Identifying whether the rows have changed or been inserted will rely on using a primary key, with that you can join one table to another and identify what needs updating using a query like so:
select *
from [server1].[db].[schema].[table] t1
inner join [server2].[db].[schema].[table] t2 on t1.id = t2.id
where ( t1.col1 <> t2.col1 or t1.col2 <> t2.col2 ... )
Another way of tracking changes is to use a DML trigger and have this propagate changes from one table to another.
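A minimal sketch of such a trigger, assuming identical schemas and a hypothetical local table name (note that DML against a linked server from a trigger runs as a distributed transaction, so MSDTC has to be configured on both servers):

```sql
-- Hypothetical example: push new rows from the local table to the
-- linked-server copy as they arrive.
CREATE TRIGGER trg_PropagateInsert
ON dbo.MyTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO [server2].[db].[schema].[MyTable] (id, col1, col2)
    SELECT id, col1, col2
    FROM inserted;
END
```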
I was working on a SQL Server auditing tool that uses these principles; have a look through the code if you like, it's not 100% working: https://github.com/simtbak/panko/blob/main/archive/Panko%20v003.sql

Trying to use cursor on one database using select from another db

So I'm trying to wrap my head around cursors. I have a task to transfer data from one database to another, but they have slightly different schemas. Let's say I have TableOne (Id, Name, Gold) and TableTwo (Id, Name, Lvl). I want to take all records from TableTwo and insert them into TableOne, but there may be duplicate data on the Name column. So if a single record from TableTwo already exists in TableOne (comparing on the Name column), I want to skip it; if it doesn't, I want to create a record in TableOne with a unique Id.
I was thinking about looping over each record in TableTwo, and for every record checking whether it exists in TableOne. So, how do I make this check without making a call to the other database every time? I wanted to first select all records from TableOne, save them into a variable, and in the loop itself make the check against this variable. Is this even possible in SQL? I'm not so familiar with SQL; some code sample would help a lot.
I'm using Microsoft SQL Server Management Studio if that matters. And of course, TableOne and TableTwo exist in different databases.
Try this:
INSERT INTO table1 (id, name, gold)
SELECT id, name, lvl
FROM table2
WHERE table2.name NOT IN (SELECT t1.name FROM table1 t1)
If you want to generate a new id for every row you can try:
INSERT INTO table1 (id, name, gold)
SELECT (SELECT MAX(m.id) FROM table1 m) + ROW_NUMBER() OVER (ORDER BY t2.id),
       name, lvl
FROM table2 t2
WHERE t2.name NOT IN (SELECT t1.name FROM table1 t1)
It is possible yes, but I would not recommend it. Looping (which is essentially what a cursor does) is usually not advisable in SQL when a set-based operation will do.
At a high level, you probably want to join the two tables together (the fact that they're in different databases shouldn't make a difference). You mention one table has duplicates. You can eliminate those in a number of ways, such as using a GROUP BY or a ROW_NUMBER. Both approaches require you to understand which rows you want to "pick" and which ones you want to "ignore". You could also do what another user posted in a comment, where you do an existence check against the target table using a correlated subquery. That essentially means that if the target table already contains rows matching the duplicates you're trying to insert, none of those duplicates will be put in.
As far as cursors are concerned, to do something like this you'd be doing essentially the same thing, except on each pass of the cursor you would be temporarily assigning and using variables instead of columns. This approach is sometimes called RBAR (for "Row By Agonizing Row"). On every pass of the cursor or loop, it has to re-open the table, figure out what data it needs, then operate on it. Even if that's efficient and it's only pulling back one row, there's still a lot of overhead to doing that query. So while, yes, you can force SQL to do what you've described, the database engine already has an operation for this (joins) which does it far faster than any loop you could conceivably write.
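To make that concrete, here is a sketch of the set-based version using the question's tables. The three-part database names and the new-Id formula are assumptions; adapt them to your schema.

```sql
-- Copy rows from TableTwo whose Name is not yet in TableOne,
-- de-duplicating TableTwo on Name and generating fresh Ids.
INSERT INTO DbOne.dbo.TableOne (Id, Name, Gold)
SELECT (SELECT MAX(t1.Id) FROM DbOne.dbo.TableOne t1)
         + ROW_NUMBER() OVER (ORDER BY src.Id),
       src.Name,
       src.Lvl
FROM (
    SELECT t2.Id, t2.Name, t2.Lvl,
           ROW_NUMBER() OVER (PARTITION BY t2.Name ORDER BY t2.Id) AS rn
    FROM DbTwo.dbo.TableTwo t2
) src
WHERE src.rn = 1  -- keep one row per Name from the source
  AND NOT EXISTS (SELECT 1 FROM DbOne.dbo.TableOne t1
                  WHERE t1.Name = src.Name);
```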

Safely insert row data from one table to another - SQL

I need to move some data stored in one table to another using a script, taking into account existing records that may already be in the destination table as well as any relationships that may exist.
I am curious to know the best method of doing this that has a relatively low impact on performance and can be reversed if necessary.
At first I will be moving only one record to ensure the process runs smoothly but then it will be responsible for moving around 1650 rows.
What would be the best approach to take or is there a better alternative?
Edit:
My previous suggestion of using MERGE will not work as I will be operating under the SQL Server 2005 environment, not 2008 like previously mentioned.
The question does not provide any details, so I can't provide any actual real code; just this plan of attack:
step 1 write a query that will SELECT only the rows you need to copy. You will need to JOIN and/or filter (WHERE) this data to only include the rows that don't already exist in the destination table. Make the column list exactly match the destination table's columns, in column order and data type.
step 2 turn that SELECT statement into an INSERT by adding INSERT YourDestinationTable (col1, col2, col3..) before the select.
step 3 if you only want to try a single row, add a TOP 1 to the SELECT part of the new INSERT-SELECT command. You can rerun this command as many times as necessary, with or without the TOP, because the JOINs and WHERE conditions in the SELECT should eliminate any rows you have already added
in the end, you'll have something that looks like:
INSERT YourDestinationTable
(Col1, Col2, Col3, ...)
SELECT
s.Col1, s.Col2, s.Col3, ...
FROM YourSourceTable s
LEFT OUTER JOIN SomeOtherTable x ON s.Col4=x.Col4
WHERE NOT EXISTS (SELECT 1 FROM YourDestinationTable d WHERE s.PK=d.PK)
AND x.Col5='J'
I'm reading the question as only inserting missing rows from a source table to a destination table. If changes need to be migrated as well then prior to the above steps you will need to do an UPDATE of the destination table joining in the source table. This is hard to explain without more specifics of the actual tables, columns, etc.
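As a rough sketch of that update step, with hypothetical column names standing in for the real ones:

```sql
-- Bring existing destination rows up to date before inserting the
-- missing ones. PK, Col1 and Col2 stand in for the real columns.
UPDATE d
SET d.Col1 = s.Col1,
    d.Col2 = s.Col2
FROM YourDestinationTable d
INNER JOIN YourSourceTable s ON s.PK = d.PK
WHERE d.Col1 <> s.Col1
   OR d.Col2 <> s.Col2;
```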
Yes, the MERGE statement is ideal for bulk imports if you are running SQL Server 2008.

My SQL insert/update statement is too inefficient

I'm trying to write code for a batch import of lots of rows into the database.
Currently I bulk copy the raw data (from a .csv file) into a staging table, so that it's all at the database side. That leaves me with a staging table full of rows that identify 'contacts'. These now need to be moved into other tables of the database.
Next I copy over the rows from the staging table that I don't already have in the contacts table, and for the ones I do already have, I need to update the column named "GroupToBeAssignedTo", indicating a later operation I will perform.
I have a feeling I'm going about this wrong. The query isn't efficient and I'm looking for advice of how I could do this better.
update [t1]
set [t1].GroupToBeAssignedTo = [t2].GroupToBeAssignedTo
from Contacts [t1]
inner join ContactImportStaging [t2]
    on [t1].UserID = [t2].UserID
    and [t1].EmailAddress = [t2].EmailAddress
    and [t2].GUID = @GUID
where not exists
(
    select GroupID, ContactID from ContactGroupMapping
    where GroupID = [t2].GroupToBeAssignedTo and ContactID = [t1].ID
)
Might it be better to just import all the rows without checking for duplicates first and then 'clean' the data afterwards? Looking for suggestions of where I'm going wrong. Thanks.
EDIT: To clarify, the question is regarding MS SQL.
This answer is slightly "I wouldn't start from here", but it's the way I'd do it ;)
If you've got the Standard or Enterprise editions of MS SQL Server 2005, and you have access to SQL Server Integration Services, this kind of thing is a doddle to do with a Data Flow.
Create a data source linked to the CSV file (it's faster if it's sorted by some field)
...and another to your existing contacts table (using ORDER BY to sort it by the same field)
Do a Merge Join on their common field -- you'll need to use a Sort transformation if either of the two sources isn't already sorted
Do a Conditional split to focus only on rows that aren't already in your table (i.e. a table-unique field is "null" because the merge join didn't actually match for that row)
Use an OLEDB destination to input to the table.
Probably more individual steps than a single insert-with-select statement, but it'll save you the staging table, and it's pretty intuitive to follow. Plus, you're probably already licensed to use it, and it's pretty easy :)
Next I copy over the rows from the staging table that I don't already have in the contacts table
It seems that implies that ContactGroupMapping does not have records matching Contacts.id, in which case you can just omit the EXISTS:
UPDATE [t1]
SET [t1].GroupToBeAssignedTo = [t2].GroupToBeAssignedTo
FROM Contacts [t1]
INNER JOIN
ContactImportStaging [t2]
ON [t1].UserID = [t2].UserID
AND [t1].EmailAddress = [t2].EmailAddress
AND [t2].GUID = @GUID
Or I am missing something?

SQL Server features/commands that most developers are unaware of [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Hidden Features of SQL Server
I've worked as a .NET developer for a while now, but predominantly against a SQL Server database for a little over 3 years. I feel that I have a fairly decent grasp of SQL Server from a development standpoint, but I'm ashamed to admit that I just learned today about "WITH TIES" from this answer - Top 5 with most friends.
It is humbling to see questions and answers like this on SO because it helps me realize that I really don't know as much as I think I do and helps re-energize my will to learn more, so I figured what better way than to ask the masses of experts for input on other handy commands/features.
What is the most useful feature/command that the average developer is probably unaware of?
BTW - if you are like I was and don't know what "WITH TIES" is for, here is a good explanation. You'll see quickly why I was ashamed I was unaware of it. I could see where it could be useful though. - http://harriyott.com/2007/06/with-ties-sql-server-tip.aspx
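For anyone else who hasn't met it, a quick sketch against a hypothetical table:

```sql
-- Returns the 5 users with the most friends, plus any further rows
-- that tie with the 5th row's FriendCount. Requires an ORDER BY.
SELECT TOP 5 WITH TIES Name, FriendCount
FROM Users
ORDER BY FriendCount DESC;
```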
I realize that this is a subjective question so please allow for at least a few answers before you close it. :) I'll try to edit my question to keep up a list with your response. Thanks
[EDIT] - Here is a summary of the responses Please scroll down for more information. Thanks again guys/gals.
MERGE - A single command to INSERT / UPDATE / DELETE into a table from a row source.
FILESTREAM feature of SQL Server 2008 allows storage of and efficient access to BLOB data using a combination of SQL Server 2008 and the NTFS file system
CAST - get a date without a time portion
Group By - I gotta say you should definitely know this already
SQL Server Management Studio
Transactions
The sharing of local scope temp tables between nested procedure calls
INSERT INTO
MSDN
JOINS
PIVOT and UNPIVOT
WITH(FORCESEEK) - forces the query optimizer to use only an index seek operation as the access path to the data in the table.
FOR XML
COALESCE
How to shrink the database and log files
Information_Schema
SET IMPLICIT_TRANSACTIONS in Management Studio 2005
Derived tables and common table expressions (CTEs)
OUTPUT clause - allows access to the "virtual" tables called inserted and deleted (like in triggers)
CTRL + 0 to insert null
Spatial Data in SQL Server 2008
FileStream in SQL Server 2008: FILESTREAM feature of SQL Server 2008 allows storage of and efficient access to BLOB data using a combination of SQL Server 2008 and the NTFS file system.
Creating a Table for Storing FILESTREAM Data
Once the database has a FILESTREAM filegroup, tables can be created that contain FILESTREAM columns. As mentioned earlier, a FILESTREAM column is defined as a varbinary (max) column that has the FILESTREAM attribute. The following code creates a table with a single FILESTREAM column
USE Production;
GO
CREATE TABLE DocumentStore (
DocumentID INT IDENTITY PRIMARY KEY,
Document VARBINARY (MAX) FILESTREAM NULL,
DocGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL
UNIQUE DEFAULT NEWID ())
FILESTREAM_ON FileStreamGroup1;
GO
In SQL Server 2008 (and in Oracle 10g): MERGE.
A single command to INSERT / UPDATE / DELETE into a table from a row source.
To generate a list of numbers from 1 to 31 (say, for a calendar), use a recursive CTE:
WITH cal AS
(
SELECT 1 AS day
UNION ALL
SELECT day + 1
FROM cal
WHERE day <= 30
)
SELECT day FROM cal
A single-column index with DESC clause in a clustered table can be used for sorting on column DESC, cluster_key ASC:
CREATE INDEX ix_column_desc ON mytable (column DESC)
SELECT TOP 10 *
FROM mytable
ORDER BY
column DESC, pk
-- Uses the index
SELECT TOP 10 *
FROM mytable
ORDER BY
column, pk
-- Doesn't use the index
CROSS APPLY and OUTER APPLY: these let you join row sources that depend on the values of the tables being joined:
SELECT *
FROM mytable
CROSS APPLY
my_tvf(mytable.column1) tvf
SELECT *
FROM mytable
CROSS APPLY
(
SELECT TOP 5 *
FROM othertable
WHERE othertable.column2 = mytable.column1
) q
EXCEPT and INTERSECT operators: they allow comparisons in which NULLs match NULLs, unlike = in a WHERE clause:
DECLARE @var1 INT
DECLARE @var2 INT
DECLARE @var3 INT
SET @var1 = 1
SET @var2 = NULL
SET @var3 = NULL
SELECT col1, col2, col3
FROM mytable
INTERSECT
SELECT @var1, @var2, @var3
-- selects rows with `col1 = 1`, `col2 IS NULL` and `col3 IS NULL`
SELECT col1, col2, col3
FROM mytable
EXCEPT
SELECT @var1, @var2, @var3
-- selects all other rows
WITH ROLLUP clause: selects a grand total for all grouped rows
SELECT month, SUM(sale)
FROM mytable
GROUP BY
month WITH ROLLUP
Month SUM(sale)
--- ---
Jan 10,000
Feb 20,000
Mar 30,000
NULL 60,000 -- a total due to `WITH ROLLUP`
It's amazing how many people work unprotected with SQL Server as they don't know about transactions!
BEGIN TRAN
...
COMMIT / ROLLBACK
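On SQL Server 2005 and later you can pair this with TRY/CATCH so the rollback happens automatically on error; a minimal sketch:

```sql
BEGIN TRY
    BEGIN TRAN;
    -- ... your statements here ...
    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    -- Re-raise so the caller still sees the error.
    DECLARE @msg NVARCHAR(2048);
    SET @msg = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);
END CATCH
```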
After creating a #TempTable in a procedure, it is available in all stored procedures that are then called from the original procedure. It is a nice way to share set data between procedures. See: http://www.sommarskog.se/share_data.html
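A minimal sketch of the pattern (procedure names and columns are hypothetical):

```sql
CREATE PROCEDURE InnerProc
AS
BEGIN
    -- #Shared is visible here because the caller created it.
    UPDATE #Shared SET Val = 'world' WHERE Id = 1;
END
GO
CREATE PROCEDURE OuterProc
AS
BEGIN
    CREATE TABLE #Shared (Id INT, Val VARCHAR(50));
    INSERT INTO #Shared (Id, Val) VALUES (1, 'hello');
    EXEC InnerProc;              -- sees and modifies #Shared
    SELECT Id, Val FROM #Shared; -- the row now reads 'world'
END
```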
COALESCE(): it accepts a list of fields/values and returns the first non-null one, so you can use it to substitute an empty string when a field is null.
For example, if you have a table with City, State and Zipcode columns, you can use COALESCE() to return the addresses as single strings, IE:
City | State | Zipcode
Houston | Texas | 77058
Beaumont | Texas | NULL
NULL | Ohio | NULL
if you were to run this query against the table:
select COALESCE(city, '') + ' ' + COALESCE(State, '') + ' ' + COALESCE(Zipcode, '')
from tb_addresses
Would return:
Houston Texas 77058
Beaumont Texas
Ohio
You can also use it to collapse rows into a single string, IE:
DECLARE @addresses VARCHAR(MAX)
SELECT @addresses = COALESCE(@addresses + ', ', '')
    + COALESCE(city, '') + ' ' + COALESCE(State, '') + ' ' + COALESCE(Zipcode, '')
FROM tb_addresses
SELECT @addresses
Would return:
Houston Texas 77058, Beaumont Texas, Ohio
A lot of SQL Server developers still don't seem to know about the OUTPUT clause (SQL Server 2005 and newer) on the DELETE, INSERT and UPDATE statement.
It can be extremely useful to know which rows have been INSERTed, UPDATEd, or DELETEd, and the OUTPUT clause allows you to do this very easily; it gives access to the "virtual" tables called inserted and deleted (like in triggers):
DELETE FROM (table)
OUTPUT deleted.ID, deleted.Description
WHERE (condition)
If you're inserting values into a table which has an INT IDENTITY primary key field, with the OUTPUT clause, you can get the inserted new ID right away:
INSERT INTO MyTable(Field1, Field2)
OUTPUT inserted.ID
VALUES (Value1, Value2)
And if you're updating, it can be extremely useful to know what changed - in this case, inserted represents the new values (after the UPDATE), while deleted refers to the old values before the UPDATE:
UPDATE (table)
SET field1 = value1, field2 = value2
OUTPUT inserted.ID, deleted.field1, inserted.field1
WHERE (condition)
If a lot of info will be returned, the output of OUTPUT can also be redirected to a temporary table or a table variable (OUTPUT INTO #myInfoTable).
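For example, capturing the before/after values of an update into a table variable (table and column names are hypothetical):

```sql
DECLARE @audit TABLE (ID INT, OldVal VARCHAR(50), NewVal VARCHAR(50));

UPDATE dbo.MyTable
SET Field1 = 'new value'
OUTPUT inserted.ID, deleted.Field1, inserted.Field1
INTO @audit (ID, OldVal, NewVal)
WHERE SomeFlag = 1;

SELECT * FROM @audit;  -- one row per updated row
```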
Extremely useful - and very little known!
Marc
There are a handful of ways to get a date without a time portion; here's one that is quite performant:
SELECT CAST(FLOOR(CAST(getdate() AS FLOAT))AS DATETIME)
Indeed for SQL Server 2008:
SELECT CAST(getdate() AS DATE) AS TodaysDate
The "Information_Schema" gives me a lot of views that I can use to gather information about the SQL objects tables, procedures, views, etc.
If you are using Management Studio 2005 you can have it automatically execute your query as a transaction. In a new query window go to Query->Query Options. Then click on the ANSI "tab" (on the left). Check SET IMPLICIT_TRANSACTIONS. Click OK. Now if you run any query in this current query window it will run as a transaction and you must manually ROLLBACK or COMMIT it before continuing. Additionally, this only works for the current query window; pre-existing/new query windows will need to have the option set.
I've personally found it useful. However, it's not for the faint of heart. You must remember to ROLLBACK or COMMIT your query. It will NOT tell you that you have a pending transaction if you switch to a different query window (or even a new one). However, it will tell you if you try to close the query window.
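The same behaviour can also be switched on per session in T-SQL, which makes it explicit in scripts; a sketch with a hypothetical table:

```sql
SET IMPLICIT_TRANSACTIONS ON;

-- This UPDATE silently opens a transaction:
UPDATE dbo.MyTable SET Col = 1 WHERE Id = 42;

-- Nothing is permanent until you decide:
ROLLBACK;  -- or COMMIT;

SET IMPLICIT_TRANSACTIONS OFF;
```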
PIVOT and UNPIVOT
FOR XML
BACKUP LOG <DB_NAME> WITH TRUNCATE_ONLY
DBCC SHRINKFILE (<DB_LOG_NAME>, <DESIRED_SIZE>)
When I started to manage very large databases on MS SQL Server and a log file had grown to over 300 GB, these statements saved my life. (Note: shrinking the log file is done with DBCC SHRINKFILE; in most cases shrinking the whole database will have no effect.)
Before running them, be sure to make a full backup of the log, and afterwards take a full backup of the database (the restore sequence is no longer valid).
Most SQL Server developers should know about and use derived tables and common table expressions (CTEs).
The documentation.
Sad to say, but I have come to the conclusion that the most hidden feature that developers are unaware of is the documentation on MSDN. Take for instance a Transact-SQL verb like RESTORE. The BOL will cover not only the syntax and arguments of RESTORE. But this is only the tip of the iceberg when it comes to documentation. The BOL covers:
the in depth fundamentals of recovery: Understanding How Restore and Recovery of Backups Work in SQL Server.
end-to-end scenarios on how to deploy a recovery strategy: Implementing Restore Scenarios for SQL Server Databases.
the issues around system databases: Considerations for Backing Up and Restoring System Databases.
optimizing the recovery procedures: Optimizing Backup and Restore Performance in SQL Server.
understanding how to do a restore: Backing Up and Restoring How-to Topics (Transact-SQL).
more corner cases and uncommon scenarios, there are examples like Example: Piecemeal Restore of Only Some Filegroups (Full Recovery Model).
The list goes on and on, and this is just one single topic (backup and restore). Every feature of SQL Server gets similar coverage. Granted, not everything gets the detail backup and recovery gets, but everything is documented and there are How To topics for every feature.
The amount of information available is just ludicrous. Yet the documentation is one of the most underused resources, hence my vote for it being a hidden feature.
How about materialised views? Add a clustered index to a view and you effectively create a table containing duplicate data that is automatically kept up to date. It slows down inserts and updates, because you are doing each operation twice, but it makes selecting a specific subset faster. And the database optimiser can use it without you having to reference it explicitly.
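A minimal sketch of an indexed view with hypothetical names. Note the SCHEMABINDING and COUNT_BIG requirements, and that Amount is assumed NOT NULL (indexed views disallow SUM over nullable columns); automatic matching by the optimiser is an Enterprise edition feature, other editions need the WITH (NOEXPAND) hint.

```sql
CREATE VIEW dbo.SalesByMonth
WITH SCHEMABINDING
AS
SELECT MonthId,
       SUM(Amount)  AS TotalAmount,
       COUNT_BIG(*) AS RowCnt   -- required when the view uses GROUP BY
FROM dbo.Sales
GROUP BY MonthId;
GO
-- The unique clustered index is what materialises the view's data.
CREATE UNIQUE CLUSTERED INDEX IX_SalesByMonth
    ON dbo.SalesByMonth (MonthId);
```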
Is a view faster than a simple query?
It sounds silly to say, but I've looked at a lot of queries where I just asked myself: does the person simply not know what GROUP BY is? I'm not sure if most developers are unaware of it, but it comes up often enough that I wonder sometimes.
Use Ctrl+0 to insert a null value into a cell.
WITH (FORCESEEK) which forces the query optimizer to use only an index seek operation as the access path to the data in the table.
Spatial Data in SQL Server 2008, i.e. storing Lat/Long data in a geography datatype and being able to calculate/query using the functions that go along with it.
It supports both Planar and Geodetic data.
Why am I tempted to say JOINS?
Derived tables are one of my favorites. They perform so much better than correlated subqueries, but many people continue to use correlated subqueries instead.
Example of a derived table:
select f.FailureFieldName, f.RejectedValue, f.RejectionDate,
ft.FailureDescription, f.DataTableLocation, f.RecordIdentifierFieldName,
f.RecordIdentifier , fs.StatusDescription
from dataFailures f
join (select max(dataFlowinstanceid) as dataFlowinstanceid
      from dataFailures
      where dataflowid = 13) a
  on f.dataFlowinstanceid = a.dataFlowinstanceid
join FailureType ft on f.FailureTypeID = ft.FailureTypeID
join FailureStatus fs on f.FailureStatusID = fs.FailureStatusID
When I first started working as programmer, I started with using SQL Server 2000. I had been taught DB theory on Oracle and MySQL so I didn't know much about SQL Server 2000.
But, as it turned out, neither did the development staff I joined, because they didn't know that you could convert datetime (and related) data types to formatted strings with built-in functions. They were using a very inefficient custom function they had developed. I was more than happy to show them the error of their ways... (I'm not with that company anymore... :-D)
On that note, I wanted to add this to the list:
select Convert(varchar, getdate(), 101) -- 08/06/2009
select Convert(varchar, getdate(), 110) -- 08-06-2009
These are the two I use most often. There are a bunch more: CAST and CONVERT on MSDN