Trying to use cursor on one database using select from another db - sql

So I'm trying to wrap my head around cursors. I have a task to transfer data from one database to another, but they have slightly different schemas. Let's say I have TableOne (Id, Name, Gold) and TableTwo (Id, Name, Lvl). I want to take all records from TableTwo and insert them into TableOne, but there can be duplicated data in the Name column. So if a record from TableTwo already exists in TableOne (comparing on the Name column), I want to skip it; if it doesn't, I want to create a record in TableOne with a unique Id.
I was thinking about looping over each record in TableTwo and, for every record, checking whether it exists in TableOne. So, how do I make this check without making a call to the other database every time? I wanted to first select all records from TableOne, save them into a variable, and inside the loop check against this variable. Is this even possible in SQL? I'm not very familiar with SQL, so a code sample would help a lot.
I'm using Microsoft SQL Server Management Studio, if that matters. And of course, TableOne and TableTwo exist in different databases.

Try this:
INSERT INTO table1 (id, name, gold)
SELECT id, name, lvl FROM table2
WHERE table2.name NOT IN (SELECT t1.name FROM table1 t1)
If you want to generate a new Id for every row, you can try:
INSERT INTO table1 (id, name, gold)
-- ISNULL covers the case where table1 is still empty and MAX returns NULL
SELECT ISNULL((SELECT MAX(m.id) FROM table1 m), 0) + ROW_NUMBER() OVER (ORDER BY t2.id), name, lvl FROM table2 t2
WHERE t2.name NOT IN (SELECT t1.name FROM table1 t1)

Yes, it is possible, but I would not recommend it. Looping (which is essentially what a cursor does) is usually not advisable in SQL when a set-based operation will do.
At a high level, you probably want to join the two tables together (the fact that they're in different databases shouldn't make a difference). You mention one table has duplicates. You can eliminate those in a number of ways, such as using a GROUP BY or a row_number. Both approaches require you to understand which rows you want to "pick" and which ones you want to "ignore". You could also do what another user posted in a comment, where you do an existence check against the target table using a correlated subquery. That essentially means that if the target table already has a row matching the rows you're trying to insert, none of those matching rows will be put in.
As far as cursors are concerned, to do something like this you'd be doing essentially the same thing, except that on each pass of the cursor you would be temporarily assigning and using variables instead of columns. This approach is sometimes called RBAR (for "Row by Agonizing Row"). On every pass of the cursor or loop, it has to re-open the table, figure out what data it needs, then operate on it. Even if that's efficient and it's only pulling back one row, there's still a lot of overhead in running that query. So while, yes, you can force SQL to do what you've described, the database engine already has an operation for this (joins) which does it far faster than any loop you could conceivably write.
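A minimal sketch of the set-based approach described above, in T-SQL. The database names DbOne and DbTwo, the dbo schema, and the default value 0 for Gold are assumptions for illustration; the question only gives the table and column names. It assumes both databases live on the same SQL Server instance, so three-part names can reach across them.

```sql
-- Assumed names: DbOne, DbTwo, dbo schema. Gold = 0 is a placeholder default.
INSERT INTO DbOne.dbo.TableOne (Id, Name, Gold)
SELECT
    -- Continue numbering after the current highest Id (0 if TableOne is empty)
    ISNULL((SELECT MAX(Id) FROM DbOne.dbo.TableOne), 0)
        + ROW_NUMBER() OVER (ORDER BY src.Name),
    src.Name,
    0
FROM (
    -- Keep one row per Name from the source, so duplicates inside
    -- TableTwo do not produce duplicate rows in TableOne
    SELECT Name,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id) AS rn
    FROM DbTwo.dbo.TableTwo
) AS src
WHERE src.rn = 1
  -- Skip names that already exist in the target table
  AND NOT EXISTS (
      SELECT 1
      FROM DbOne.dbo.TableOne AS t1
      WHERE t1.Name = src.Name
  );
```

The whole transfer runs as one statement, so the engine resolves the existence check with a join-style plan instead of one lookup per row.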

Related

Difference between two tables, unknown fields

Is there a way in Access using SQL to get the difference between 2 tables?
I'm building an audit function and I want to return all records from table1 where a value (or values) doesn't match the corresponding record in table2. Primary keys will always match between the two tables. They will always contain the exact same number of fields, field names, and types, as each other. However, the number and name of those fields cannot be determined before the query is run.
Please also note, I am looking for an Access SQL solution. I know how to solve this with VBA.
Thanks,
There are several possibilities to compare fields with known names, but there is no way in SQL to access fields without knowing their names, mostly because SQL doesn't consider fields to have a specific order in a table.
So the only way to accomplish what you need in pure Access SQL would be if there were an SQL command for it (something like the * placeholder for all fields). But there isn't; see the Microsoft Access SQL Reference.
What you COULD do is create an SQL clause on the fly in VBA. (I know, you said you didn't want to do it in VBA, but this is still doing it in SQL, just using VBA to create the SQL.)
Doing everything in VBA would probably take some time, but creating an SQL on the fly is very fast and you can optimize it to the specific table. Then executing the SQL is the fastest solution you can get.
I'm not sure without seeing your table structure, but you can probably get that done using the NOT IN operator or WHERE NOT EXISTS, like:
select * from table1
where some_field not in (select some_other_field from table2);
(OR)
select * from table1 t1
where not exists (select 1 from table2 where some_other_field = t1.some_field);
SELECT A.*, B.* FROM A FULL JOIN B ON (A.C = B.C) WHERE A.C IS NULL OR B.C IS NULL;
If you have tables A and B, both with column C, the query above returns the records that are present in only one of the two tables. To get all the differences with a single query, a full join must be used, as above.

minus vs delete where exist in oracle

I have a CREATE TABLE query which can be done using two methods (create as select statement for thousands/million records):
First method:
create table as select some data minus (select data from other table)
OR
Second method: first I should create the table as
create table as select .....
and then
delete from ..where exist.
I guess the second method is better. Which query has the lower cost? Why is the MINUS query not as fast as the second method?
EDIT:
I forgot to mention that the create statement has join from two tables as well.
The minus is slow probably because it needs to sort the tables on disk in order to compare them.
Try to rewrite the first query with NOT EXISTS instead of MINUS, it should be faster and will generate less REDO and UNDO (as a_horse_with_no_name mentioned). Of course, make sure that all the fields involved in the WHERE clauses are indexed!
The second one will write lots of records to disk and then remove them. In 9 out of 10 cases this will take far longer than filtering what you write to begin with.
So if the first one actually isn't faster we need more information about the tables and statements involved.
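A rough sketch of the NOT EXISTS rewrite suggested above, in Oracle SQL. All table and column names here (new_table, table_a, table_b, other_table, id, a_id, col1, col2) are hypothetical, since the question does not show the real statement; the join stands in for the two-table join mentioned in the edit.

```sql
-- Hypothetical names throughout; adapt to the real tables and keys.
CREATE TABLE new_table AS
SELECT a.id, a.col1, b.col2
FROM   table_a a
JOIN   table_b b ON b.a_id = a.id
WHERE  NOT EXISTS (
         -- Filter rows out before they are written, instead of
         -- writing everything and deleting afterwards
         SELECT 1
         FROM   other_table o
         WHERE  o.id = a.id
       );
```

Note that the two forms are not strictly equivalent: MINUS compares every selected column and removes duplicate rows from the result, while NOT EXISTS filters only on the columns named in the subquery and keeps duplicates.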

SQL - renumbering a sequential column to be sequential again after deletion

I've researched and realize I have a unique situation.
First off, I am not allowed to post images yet to the board since I'm a new user, so see appropriate links below
I have multiple tables where a column (not always the identifier column) is sequentially numbered and shouldn't have any breaks in the numbering. My goal is to make sure this stays true.
Down and Dirty
We have an 'Event' table where we randomly select a percentage of the rows and insert the rows into table 'Results'. The "ID" column from the 'Results' is passed to a bunch of delete queries.
This more or less ensures that there are missing rows in several tables.
My problem:
Figuring out an SQL query that will renumber the column I specify. I would prefer not to drop the column.
Example delete query:
delete ItemVoid
from ItemTicket
join ItemVoid
on ItemTicket.item_ticket_id = itemvoid.item_ticket_id
where itemticket.ID in (select ID
from results)
Example Tables Before:
Example Tables After:
As you can see, 2 rows were deleted from both tables based on the ID column. So now I have to figure out how to renumber the item_ticket_id and the item_void_id columns so that the next higher number drops down to fill the missing value, the one after that drops down in turn, and so on. Problem #2: if the item_ticket_id changes in order to be sequential in ItemTicket, then that change has to be applied to ItemVoid's item_ticket_id as well.
I appreciate any advice you can give on this.
(answering an old question as it's the first search result when I was looking this up)
(MS T-SQL)
To resequence an ID column (not an Identity one) that has gaps, a simple CTE with row_number() is enough to generate the new sequence.
The UPDATE works through the CTE 'virtual table' without any extra problems and actually updates the underlying original table.
Don't worry about the ID values clashing during the update. If you wonder what happens when an ID is set to a value that already exists, it doesn't suffer from that problem: the original sequence is changed to the new sequence in one go.
WITH NewSequence AS
(
SELECT
ID,
ROW_NUMBER() OVER (ORDER BY ID) as ID_New
FROM YourTable
)
UPDATE NewSequence SET ID = ID_New;
Since you are looking for advice on this, my advice is you need to redesign this as I see a big flaw in your design.
Instead of deleting the records and then going through the hassle of renumbering the remaining records, use a bit flag that marks the records as Inactive. Then when you are querying the records, just include a WHERE clause to only include the records that are active:
SELECT *
FROM yourTable
WHERE Inactive = 0
Then you never have to worry about re-numbering the records. This also gives you the ability to go back and see the records that would have been deleted and you do not lose the history.
If you really want to delete the records and renumber them then you can perform this task the following way:
create a new table
Insert your original data into your new table using the new numbers
drop your old table
rename your new table with the corrected numbers
As you can see there would be a lot of steps involved in re-numbering the records. You are creating much more work this way when you could just perform an UPDATE of the bit flag.
You would change your DELETE query to something similar to this:
UPDATE ItemVoid
SET InActive = 1
FROM ItemVoid
JOIN ItemTicket
on ItemVoid.item_ticket_id = ItemTicket.item_ticket_id
WHERE ItemTicket.ID IN (select ID from results)
The bit flag is much easier and that would be the method that I would recommend.
The function that you are looking for is a window function. In standard SQL (SQL Server, MySQL), the function is row_number(). You use it as follows:
select row_number() over (order by <col>)
from <table>
In order to use this in your case, you would delete the rows from the table, then use a with statement to recalculate the row numbers, and then assign them using an update. For transactional integrity, you might wrap the delete and update into a single transaction.
Oracle supports similar functionality, but the syntax is a bit different. Oracle calls these functions analytic functions and they support a richer set of operations on them.
I would strongly caution you against using cursors, since these have lousy performance. Of course, this will not work on an identity column, since such a column cannot be modified.
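One way the renumbering could be sketched in T-SQL, including the second problem from the question (keeping ItemVoid.item_ticket_id in step with ItemTicket). This is only an illustration under stated assumptions: item_ticket_id is not an identity column, there is no enforced foreign key between the two tables (or it is disabled for the duration), and the temp table #TicketMap is a made-up name.

```sql
BEGIN TRANSACTION;

-- Build an old -> new mapping for the parent key (assumed temp table name)
SELECT item_ticket_id AS old_id,
       ROW_NUMBER() OVER (ORDER BY item_ticket_id) AS new_id
INTO   #TicketMap
FROM   ItemTicket;

-- Re-point the child rows using the mapping
UPDATE v
SET    item_ticket_id = m.new_id
FROM   ItemVoid AS v
JOIN   #TicketMap AS m ON m.old_id = v.item_ticket_id
WHERE  m.old_id <> m.new_id;

-- Renumber the parent rows in a single statement
UPDATE t
SET    item_ticket_id = m.new_id
FROM   ItemTicket AS t
JOIN   #TicketMap AS m ON m.old_id = t.item_ticket_id
WHERE  m.old_id <> m.new_id;

COMMIT TRANSACTION;

DROP TABLE #TicketMap;
```

Both updates read from the same mapping, so parent and child stay consistent, and wrapping them in one transaction means other sessions never see a half-renumbered state.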

Safely insert row data from one table to another - SQL

I need to move some data stored in one table to another using a script, taking into account existing records that may already be in the destination table as well as any relationships that may exist.
I am curious to know the best method of doing this that has a relatively low impact on performance and can be reversed if necessary.
At first I will be moving only one record to ensure the process runs smoothly but then it will be responsible for moving around 1650 rows.
What would be the best approach to take or is there a better alternative?
Edit:
My previous suggestion of using MERGE will not work as I will be operating under the SQL Server 2005 environment, not 2008 like previously mentioned.
The question does not provide any details, so I can't provide any real code, just this plan of attack:
Step 1: write a query that will SELECT only the rows you need to copy. You will need to JOIN and/or filter (WHERE) this data to only include the rows that don't already exist in the destination table. Make the column list exactly the same as the destination table's columns, in column order and data type.
Step 2: turn that SELECT statement into an INSERT by adding INSERT YourDestinationTable (col1, col2, col3..) before the SELECT.
Step 3: if you only want to try a single row, add a TOP 1 to the SELECT part of the new INSERT - SELECT command. You can rerun this command as many times as necessary, with or without the TOP, because the JOINs and WHERE conditions in the SELECT should exclude any rows you have already added.
In the end, you'll have something that looks like:
INSERT YourDestinationTable
(Col1, Col2, Col3, ...)
SELECT
s.Col1, s.Col2, s.Col3, ...
FROM YourSourceTable s
LEFT OUTER JOIN SomeOtherTable x ON s.Col4=x.Col4
WHERE NOT EXISTS (SELECT 1 FROM YourDestinationTable d WHERE s.PK=d.PK)
AND x.Col5='J'
I'm reading the question as only inserting missing rows from a source table to a destination table. If changes need to be migrated as well then prior to the above steps you will need to do an UPDATE of the destination table joining in the source table. This is hard to explain without more specifics of the actual tables, columns, etc.
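Since the question also asks for something that can be reversed, the INSERT - SELECT above could be wrapped in a transaction. The following is a sketch for SQL Server 2005 (TRY/CATCH is available there, THROW is not); the table and column names are the same placeholders used in the plan, and the column list is shortened to three columns for illustration.

```sql
DECLARE @rows INT;
DECLARE @msg NVARCHAR(2048);

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT YourDestinationTable (Col1, Col2, Col3)
    SELECT s.Col1, s.Col2, s.Col3
    FROM   YourSourceTable s
    WHERE  NOT EXISTS (SELECT 1 FROM YourDestinationTable d WHERE s.PK = d.PK);

    -- Capture the row count immediately; any later statement resets it
    SET @rows = @@ROWCOUNT;
    PRINT 'Rows inserted: ' + CAST(@rows AS VARCHAR(12));

    -- Swap COMMIT for ROLLBACK here to do a dry run
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo everything if any statement failed
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    SET @msg = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);
END CATCH;
```

Running it first with TOP 1 and ROLLBACK in place of COMMIT gives a cheap rehearsal before moving all of the rows.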
Yes, the MERGE statement is ideal for bulk imports if you are running SQL Server 2008.

Optimize query that compares two tables with similar schema in different databases

I have two different tables with similar schemas in different databases. What is the best way to compare records between these two tables? I need to find:
records that exist in the first table whose corresponding record does not exist in the second table, after filtering the first table with some WHERE clauses.
So far I have come up with this SQL construct:
Select t1_col1, t1_col2 from table1
where t1_col1=<condition> AND
t1_col2=<condition> AND
NOT EXISTS
(SELECT * FROM
table2
WHERE
t1_col1=t2_col1 AND
t1_col2=t2_col2)
Is there a better way to do this?
The above query seems fine, but I suspect it is doing a row-by-row comparison without first evaluating the conditions in the first part of the query, even though those conditions would reduce the result set considerably. Is this happening?
Just use the EXCEPT keyword.
Select t1_col1, t1_col2 from table1
where t1_col1=<condition> AND
t1_col2=<condition>
except
SELECT t2_col1, t2_col2 FROM table2
It returns any distinct values from the query to the left of the EXCEPT operator that are not also returned from the right query.
For more information, see the EXCEPT documentation on MSDN.
If the data in both tables is expected to have the same primary key, you can use the IN keyword (as NOT IN) to filter out the rows that are not found in the other table. This could be the simplest way.
If you are open to third-party tools, you can try Redgate Data Compare; it's a very nice tool. Visual Studio 2010 Ultimate edition also has this feature.
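The primary-key variant might look like the sketch below, in T-SQL. The database names DbOne and DbTwo, the dbo schema, the key column pk, and the literal filter value are assumptions; the question gives only the placeholder column names.

```sql
-- Assumed names: DbOne, DbTwo, dbo schema, key column pk
SELECT t1.t1_col1, t1.t1_col2
FROM   DbOne.dbo.table1 AS t1
WHERE  t1.t1_col1 = 'some value'          -- filter on the first table
  AND  t1.pk NOT IN (
           SELECT t2.pk
           FROM   DbTwo.dbo.table2 AS t2
           WHERE  t2.pk IS NOT NULL       -- a NULL in the list would make NOT IN return no rows
       );
```

The explicit IS NOT NULL guard matters only if the compared column is nullable; with a true primary key it is redundant but harmless.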