Retrieving the last inserted rows - SQL

I have a table which contains GUID and Name columns, and I need to retrieve the last inserted rows so I can load them into Table2.
But how would I find out the latest data in Table1? I seem to be lost on this; I have read similar posts posing the same question, but the answers don't seem to work for my situation.
I am using SQL Server 2008 and I upload my data using SSIS.

1 - One way to do this is with triggers. Check out my blog entry that shows how to copy data from one table to another on an insert.
Triggers to replicate data = http://craftydba.com/?p=1995
However, like most things in life, there is overhead with triggers. If you are bulk loading a ton of data via SSIS, this can add up.
2 - Another way to do this is to add a modify date to your first table and modify your SSIS package.
ALTER TABLE [MyTable1]
ADD [ModifyDate] [datetime] NULL DEFAULT GETDATE();
Next, change your SSIS package. In the control flow, add an Execute SQL Task that inserts data from [MyTable1] into [MyTable2] using T-SQL.
INSERT INTO [MyTable2]
SELECT * FROM [MyTable1]
WHERE [ModifyDate] >= 'Start Date/Time Of Package';
Execute SQL Task =
http://technet.microsoft.com/en-us/library/ms141003.aspx
This will be quicker than a data flow or an Execute OLE DB Command since you are working with the data on the server.
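A minimal sketch of the statement the Execute SQL Task could run, assuming you pass the package start time in as a parameter (with an OLE DB connection, the ? placeholder below would be mapped to the System::StartTime system variable on the task's Parameter Mapping page); the column list here is hypothetical:
INSERT INTO [MyTable2] ([GUID], [Name], [ModifyDate])
SELECT [GUID], [Name], [ModifyDate]
FROM [MyTable1]
WHERE [ModifyDate] >= ?;  -- ? = System::StartTime via Parameter Mapping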

Related

SQL Server bulk insert for large data set

I have 1 million rows of data in a file, and I want to insert all the records into SQL Server. While inserting, I do some comparison with the existing data on the server; if the comparison is satisfied I update the existing records on the server, otherwise I insert the record from the file.
I'm currently doing this by looping from C#, which consumes more than 3 hours to complete the work. Can anyone suggest ideas to improve the performance?
Thanks,
Xavier.
Check if your database is in Full or Simple recovery mode:
SELECT recovery_model_desc
FROM sys.databases
WHERE name = 'MyDataBase';
If the database is in SIMPLE recovery mode, you can create a staging table right there. If it is in Full mode, then it is better to create the staging table in a separate database that uses the Simple model.
Use any bulk insert operation/tool (for instance BCP, as already suggested).
Insert only the data from your staging table that does not exist in your target table, as sketched below.
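A minimal sketch of that last step, assuming a staging table dbo.Staging and a target table dbo.Target keyed on Id (all of these names are hypothetical); MERGE, available from SQL Server 2008 onward, covers both the update and the insert case in one statement:
MERGE dbo.Target AS t                         -- hypothetical target table
USING dbo.Staging AS s                        -- hypothetical staging table
      ON t.Id = s.Id                          -- hypothetical key column
WHEN MATCHED THEN
      UPDATE SET t.Name = s.Name              -- update existing records
WHEN NOT MATCHED BY TARGET THEN
      INSERT (Id, Name) VALUES (s.Id, s.Name);  -- insert records new to the target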

SQL Server : Query using data from a file

I have a number of values stored individually on separate lines in a text file, and I need to run a query in SQL Server to check whether a value in a column of the table matches any one of the values stored in the txt file.
How should I go about doing this ?
I am aware of how to formulate various types of queries in SQL Server, just not sure how to run a query that is dependent on a file for its query parameters.
EDIT:
Issue 1: I am not doing this via a program, since the query I need to run traverses over 7 million data points, which results in the program timing out before it can complete. Hence the only alternative I have left is to run the query in SQL Server itself, without worrying about the timeout.
Issue 2: I do not have admin rights to the database I am accessing, which is why there is no way I could create a table, dump the file into it, and then perform a query joining those tables.
Thanks.
One option would be to use BULK INSERT and a temp table. Once in the temp table, you can parse the values. This is likely not the exact answer you need, but based on your experience, I'm sure you could tweak as needed.
Thanks...
SET NOCOUNT ON;
USE Your_DB;
GO
-- Staging table whose columns match the layout of the text file
CREATE TABLE dbo.t (
    i int,
    n varchar(10),
    d decimal(18,4),
    dt datetime
);
GO
-- Load the file; adjust the terminators to match your file's layout
BULK INSERT dbo.t
FROM 'D:\import\data.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
There are lots of approaches.
Mine would be to import the file to a table, do the comparison with a regular SQL query, and then delete the file-data table if you don't need it anymore.
Bulk import the data from the text file into a temporary table.
Execute the query to do the comparison between your actual physical table and the temporary table, as in the sketch below.
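A minimal sketch of that comparison, assuming the file's values landed in the dbo.t table from the example above (dbo.MyTable and SomeColumn are hypothetical names):
-- Return the rows whose column value appears in the imported file
SELECT m.*
FROM dbo.MyTable AS m
WHERE m.SomeColumn IN (SELECT n FROM dbo.t);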

OleDB Destination executes full rollback on error, Bulk Insert Task doesn't

I'm using SSIS and BIDS to process a text file that contains lots (millions) of records. I decided to use the Bulk Insert Task and it worked great, but then the destination table needed an additional column with a default value on the insert operation, and the Bulk Insert Task stopped working. After that, I decided to use a Derived Column with the default value and an OLE DB Destination to insert the bulk data. That solved my last problem but created a new one: if there is an error when inserting the data in the OLE DB Destination, it executes a full rollback and no rows are added to my table, whereas with the Bulk Insert Task, there were rows, based on the BatchSize configuration. Let me explain with a sample:
I use a text file with 5000 lines. The file contained a duplicate id (intentionally) between the rows 3000 and 4000.
Before starting the DTS, the destination table was totally empty.
Using the Bulk Insert Task, after the error was raised (and the package stopped), the destination table had 3000 rows. I had set the BatchSize attribute to 1000.
Using the OLE DB Destination, after the error was raised, the destination table had 0 rows! I set the Rows per batch attribute to 1000 and the Maximum insert commit size to its maximum value, 2147483647. I tried changing the latter to 0, with no effect.
Is this the normal behavior of the OLE DB Destination? Can someone give me guidance on working with these tasks? Should I forget about these tasks and use BULK INSERT from T-SQL?
As a side note, I also tried following the instructions for KEEPNULLS in Keep Nulls or Use Default Values During Bulk Import (SQL Server) to avoid using the OLE DB Destination task, but it didn't work (maybe it's just me).
EDIT: Additional info about the problem.
Table structure (sample)
Table T
id int, name varchar(50), processed int default 0
CSV File (sample)
1, hello
2, world
There is no rolling back on Bulk Inserts, that's why they are fast.
Take a look at using format files:
http://msdn.microsoft.com/en-us/library/ms179250.aspx
You could potentially place this in a transaction in SSIS (you'll need MSDTC running), or you could create a T-SQL script with a TRY...CATCH to handle any exceptions from the bulk insert (probably just rollback or commit), as sketched below.
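A minimal sketch of that TRY...CATCH approach, assuming the sample table T above and a hypothetical file path; the whole load either commits or rolls back as a single unit:
BEGIN TRY
    BEGIN TRANSACTION;
    -- Hypothetical path; KEEPNULLS mirrors the keep-nulls option mentioned above
    BULK INSERT dbo.T
    FROM 'D:\import\data.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', KEEPNULLS);
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;          -- undo the partial load
    DECLARE @msg nvarchar(2048);
    SET @msg = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);            -- re-raise the error for the caller
END CATCH;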

Date of inserting a row into table

Is it possible to find date+time when a row was inserted into a table in SQL Server 2005?
Does SQL Server log insert commands?
Whenever I create a table, I always include the following two columns:
CreatedBy varchar(255) default system_user,
CreatedAt datetime default getdate()
Although this uses a bit of extra space, I've found that the information proves very, very useful over time.
Your question is about the log. The answer is "yes". However, whether you can get the information depends on your recovery model. If Simple, then the records are overwritten for the next transaction. If Bulk-Logged or Full, then the information is in the log, at least back to the last log backup.
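For what it's worth, the undocumented and unsupported fn_dblog function can expose log records that are still in the active log; a rough exploratory sketch, not something to rely on:
-- Undocumented: list transaction-begin and row-insert records still in the log
SELECT [Current LSN], Operation, [Transaction ID], [Begin Time]
FROM fn_dblog(NULL, NULL)
WHERE Operation IN ('LOP_BEGIN_XACT', 'LOP_INSERT_ROWS');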
You can derive the insert date as long as you are using the CDC-generated functions to pull the actual data records.
So, for example, you could pull something like:
DECLARE @from_lsn binary(10), @to_lsn binary(10);
SET @from_lsn = sys.fn_cdc_get_min_lsn('name_of_your_cdc_instance_on_cdc_table');
SET @to_lsn = sys.fn_cdc_get_max_lsn();
SELECT *
FROM cdc.fn_cdc_get_net_changes_name_of_your_cdc_instance_on_cdc_table
(
    @from_lsn,
    @to_lsn,
    N'all'
);
You can use the built-in CDC function sys.fn_cdc_map_lsn_to_time to convert a Log Sequence Number to a datetime. Below is a use case:
SELECT sys.fn_cdc_map_lsn_to_time(__$start_lsn), *
FROM cdc.fn_cdc_get_net_changes_name_of_your_cdc_instance_on_cdc_table
(
    @from_lsn,
    @to_lsn,
    N'all'
);
You can have an InsertDate column with a default of getdate() on your table; that would be the easiest approach.
On SQL Server 2008 you can use CDC to track changed data on your table:
Change data capture records insert, update, and delete activity that
is applied to a SQL Server table. This makes the details of the
changes available in an easily consumed relational format. Column
information and the metadata that is required to apply the changes to
a target environment is captured for the modified rows and stored in
change tables that mirror the column structure of the tracked source
tables. Table-valued functions are provided to allow systematic access
to the change data by consumers.
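A minimal sketch of enabling CDC in the first place, assuming a dbo.MyTable source table (the table name is hypothetical); both procedures ship with SQL Server 2008's CDC feature:
-- Enable CDC at the database level, then on the table to be tracked
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'MyTable',   -- hypothetical table name
    @role_name     = NULL;         -- NULL = no gating role required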

Why doesn't SSIS OLE DB Command transformation insert all the records into a table?

I have an SSIS package that takes data from tables in one SQL Server database and inserts it (or updates existing rows) in a table in another database.
Here is my problem: after the lookup, I either insert or update the rows, but over half of the rows that go into the insert are not added to the table.
For the insert, I am using an OLE DB Command object in which I use an insert command that I have tested. I have not found out why the package runs without any error notification but still does not insert all the rows into the table.
I checked in SQL Profiler and it says the command was RPC:Completed, which I assume means it supposedly worked.
If I do the insert manually in SQL Server Management Studio with the data SQL Profiler gives me (the values it uses to execute the insert statement), it works. I have checked the data and everything seems fine (no illegal data in the rows that are not inserted).
I am totally lost as to how to fix this; does anyone have an idea?
Any specific reason to use OLE DB Command instead of OLE DB Destination to insert the records?
EDIT 1:
So, you are seeing x rows (say 100) sent from the Lookup transformation's match output to the OLE DB destination, but only y rows (say 60) are being inserted. Is that correct? Instead of inserting into your actual destination table, try inserting into a dummy table to see if all the rows are redirected correctly. You can create a dummy table by clicking the New... button on the OLE DB destination; it will create a table for you matching the input columns. That might help narrow down the issue.
EDIT 2:
What is the name of the table that you are trying to use? I don't think it matters; I am just curious whether the name is a reserved keyword. One other thing I can think of: are there any other processes that might trigger some action on your destination table (either from within the package or outside it)? I suspect that some other process might be deleting rows from the table.