SSIS Only inserting data if Error Output set to Redirect Row

Got a weird problem.
I have created an SSIS package that imports data via an ODBC connection from an external database into SQL Server 2008.
The Data Flow has a Source Query (simple: select columns from table).
Data Conversion to String [DT_STR]
Destination: SQL Server table with same structure.
If I execute the Task, 11,000+ rows are entered, with no errors.
However, the data is all whitespace and NULLs, except that the date fields are correct (which is how I can tell that it's trying to enter the right data).
If I change the Error Output on any single column in the Source Query to Redirect Row, all of the fields are populated correctly. If I set it to Ignore Failure or Fail Component, we get the blanks.
I added a Flat File Destination and connected the redirected rows to it, and nothing is written to it.
We have several packages like this, and I've never had this problem.
I also recreated the entire Package, and the same thing happens.
Any ideas?
There are 8 Data Flow tasks doing the same thing in this package for different databases. If I can't figure it out, is there a good way to set the Error Output to Nowhere?
UPDATE 2/26/18:
I added Data Viewers, and on the first Viewer after the Source, I get the blanks/NULLs if the Error Output on the Source is NOT set to Redirect Row. If I set it to Redirect, the Viewer shows data. NOTE: I can set ANY column's Error Output to Redirect, on either the Error column OR the Truncation column. As long as one is selected, the data from the Source comes through.
Since this is the Viewer between the Source and Data Conversion, doesn't that mean the problem would be at the Source Query and NOT the Data Conversion?
(Screenshot: Source Query Error Output set to the default, Fail Component.)
(Screenshot: Source Query Error Output set to Redirect Row.)

Related

SSIS - Error Output doesn't contain the value which is erroring

I have an ADO.NET MySQL source connection, and I execute a stored procedure over that connection to get the data set. As we know, SSIS reads a few rows to define the metadata, so when I run this procedure it defines the column types based on the data it happens to receive. However, I run this procedure in a loop for different sites (server and connection details are parameterized). The columns and data types are the same for all sites, but since SSIS only samples a few rows while defining the metadata, it doesn't get the data lengths and precisions right, and those rows error out at the source connection itself. I am writing these errors to a flat file to process them later, but when SSIS writes the error row it replaces the actual value with NULL.
For example, for a column defined as Numeric(4,2), when it encounters a value such as 243.32, instead of writing the actual value to the error file it writes a NULL.
Is there any way to make SSIS write the actual value to the error file? Below is the job flow. I'd appreciate any input on how to handle this.
Explained in the details of the problem
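To illustrate the mismatch described above: Numeric(4,2) can hold at most 99.99, so a value like 243.32 overflows the type SSIS inferred from the sampled rows. One possible way to keep the inferred metadata wide enough is an explicit cast in the stored procedure or source query (a sketch only; the table and column names below are made up):

    -- Hypothetical MySQL source query: the explicit CAST widens the type SSIS
    -- infers, instead of letting it guess NUMERIC(4,2) from the first few rows.
    SELECT
        site_id,
        CAST(amount AS DECIMAL(10,2)) AS amount  -- wide enough for values like 243.32
    FROM site_measurements;

This doesn't make the error output carry the original value, but it avoids the precision errors at the source in the first place.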

Azure Data Factory: trivial SQL query in Data Flow returns nothing

I am experimenting with Data Flows in Azure Data Factory.
I have:
Set up a LinkedService to a SQL Server db. This db only has 2 tables.
The two tables are called "dummy_data_table1" and "dummy_data_table2" and are registered as Datasets
The ADF is copying data from these 2 tables, and in the Data Flow they are called "source1" and "source2"
However, when I select a source, go to Source options, and change Input from Table to Query and enter a simple query, it returns 0 columns (there are 11 columns in dummy_data_table1). I suspect my syntax is wrong, but how should I change it?
Hopefully this screenshot will help.
The problem was not the syntax. The problem was that the data flow could not recognize "dummy_data_table1" because it didn't refer to anything known. To make it work, I had to:
Enable Data Flow Debug (at the top of the page, not visible in my screenshot)
Once that's enabled, I had to click on "import projection" to import the schema of my table
Once this is done, the table name and fields are all automatically recognized and can be referenced in the query just like one would do in SQL Server.
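For reference, once the projection is imported, a query as plain as the following works in the Query box (the table name is the one from the question; this is only a sketch):

    -- Trivial query against the source dataset; all 11 columns of
    -- dummy_data_table1 show up once Data Flow Debug is on and the
    -- projection has been imported.
    SELECT * FROM dummy_data_table1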
Source:
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-source#import-schema

Is it possible to override the Excel Data Type through SSIS?

I've tried finding a solution for my issue, but alas the problem continues. I've got an Excel Destination which I am trying to map in SSIS. [Please note the issue is with the way SSIS identifies the data type of the Excel input. The scenario is OLE DB Source > Data Conversion > Excel Destination, so please don't tell me to add a Data Conversion or use the Input and Output Properties method; it doesn't work, it just converts back to what SSIS "thinks" it's meant to be the instant I click out of the properties window.] I'm trying to create a new Excel document through SSIS by mapping the template to my data source from the OLE DB Source.
Now when I do it with example data in the Excel Destination, it works fine, because SSIS registers that the value in the workbook is NTEXT [which is what I want]. However, the instant I apply the expression to use a blank template [with just headers, no example data], it converts the data type in my template to NVARCHAR(255), which is wrong, and my package fails when I execute it due to an incompatible data type.
I've tried converting the data type within the Excel workbook to a TEXT format, but it doesn't matter, because when you pull it into the Data Flow SSIS overwrites it and identifies that column as NVARCHAR(255). Even when I give up, comply, and change the input data to NVARCHAR(255) because I'm just so annoyed, it still doesn't work: the package fails with an error message that my column field is being truncated [-_-"]. I can't win.
I'll probably try to use a SQL command to force the column to be identified as NTEXT in the Excel Destination Editor, or find some other way to force SSIS to treat the column as NTEXT, but is there another way I'm not aware of? This feels like a well-known issue, so there should be a plausible solution. Any assistance will be appreciated. Thank you.
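For what it's worth, one way to pursue the "SQL command" idea is to edit the CREATE TABLE statement the Excel Destination generates when you click New for the sheet: columns declared as LONGTEXT are created as memo columns by the Jet/ACE provider, and SSIS then exposes them as NTEXT (DT_NTEXT). A rough sketch, with purely hypothetical sheet and column names:

    -- Sheet-creation statement in the Excel Destination (names are made up).
    -- LONGTEXT columns come through as DT_NTEXT; ordinary text columns
    -- default to NVARCHAR(255) / DT_WSTR(255).
    CREATE TABLE `Template` (
        `Description` LONGTEXT,
        `Comments` LONGTEXT
    )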

Data transfer from a view in one database to a table in another database

I am just trying to find out whether this is the right way to do this task.
Any other suggestions to improve this are greatly appreciated.
I have the following in my SSIS package.
A Data Flow task with an OLE DB connection to the source database where the view is.
An Execute SQL task that runs an INSERT INTO the destination with EXCEPT, to skip the records that are already there from the source (sketched below).
A Send Mail task to send out an email.
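The Execute SQL task query looks roughly like this (table, view, and column names here are placeholders, and it assumes the source view is reachable from the destination connection, e.g. on the same instance or via a linked server):

    -- Sketch: insert only the rows from the source view that the destination
    -- table does not already contain.
    INSERT INTO dbo.DestinationTable (Col1, Col2, Col3)
    SELECT Col1, Col2, Col3 FROM SourceDb.dbo.SourceView
    EXCEPT
    SELECT Col1, Col2, Col3 FROM dbo.DestinationTable;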
How do I know that the data transfer was successful, so that I can use the Send Mail task to indicate success or failure?
How do I schedule this package so that it runs automatically (every Tuesday)?
I have tried the suggestion below. Please refer to the new Data Flow task.
OLE DB Source - Points to a view in database server 1
The Lookup gets all the rows from the OLE DB Source (the row count on the source and on the Lookup matches).
On the Lookup task, I have configured the error output to use 'Redirect row' on all the mapped columns.
The OLE DB Destination is a table that already has a subset of the records from the source, so I configured the error output to send the unmatched rows there for insert.
When I execute the package, I get a primary key constraint error: cannot insert duplicate key.
Any suggestions?
You will want to double-click the connector from the Execute SQL Task to the Send Mail Task. Currently it's green, which indicates it will only take that path on Success. Update the constraint to be on Completion, as you don't care whether it's Success or Failure.
It sounds like you have your data flow pulling all of the data from your source and writing to a staging table. In your Execute SQL Task, you then use a query to add data into your target table where it doesn't exist.
This can be consolidated into a single Data Flow. Between your OLE DB Source and OLE DB Destination, add a Lookup task. Since you are on 2005, the Lookup behaves a bit differently than 2008+. You will write a query that pulls back the business keys in your target table and then compares that to what is coming from your OLE DB Source. Map those keys in the interface.
You only want the rows that aren't matched so you will need to get the "unmatched records" from the lookup. In 2005, the option for Unmatched output didn't exist so you will need to route the Error output to your OLE DB Destination.
Andy Leonard has a nice little writeup on how to accomplish this: Configuring an SSIS 2005 Lookup Transformation for a Left Outer Join. The only difference for your case is that you don't care about the matched rows: instead of Ignore Failure, you want to select Redirect Row. Then, when you go to connect the Lookup to the OLE DB Destination, you will be presented with two options. The green connector is the matched rows; the red connector is the unmatched rows. Tie the red path to your destination.
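As a concrete (and purely hypothetical) illustration, the Lookup's reference query only needs the business keys that already exist in the destination, something like:

    -- Reference query for the Lookup: the keys already present in the target.
    -- Rows from the source whose keys are not found here fall out of the red
    -- (error/unmatched) output and are the ones to insert.
    SELECT BusinessKey
    FROM dbo.DestinationTable;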

How to resume data migration from the point where error happened in ssis?

I am migrating data from an Oracle database to a SQL Server 2008 R2 database using SSIS. My problem is that at a certain point the package fails, say some 40,000 rows in out of 100,000. What can I do so that the next time I run the package, after correcting the errors, it restarts from the 40,001st row, i.e., the row where the error occurred?
I have tried using checkpoints in SSIS, but the problem is that they only work between control flow tasks. I want something that can work on the rows that are being transferred.
There's no native magic I'm aware of that is going to "know" that it failed on row 40,000 and when it restarts, it should start streaming row 40,001. You are correct that checkpoints are not the answer and have plenty of their own issues (can't serialize Object types, loops restart, etc).
How you can address the issue is through good design. If your package is created with the expectation that it's going to fail, then you should be able to handle these scenarios.
There are two approaches I'm familiar with. The first approach is to add a Lookup Transformation in the Data Flow between your source and your destination. The goal of this is to identify which records already exist in the target system; only the rows for which no match is found are sent on to the destination. This is a very common pattern and will also let you detect changes between source and destination (if that is a need). The downside is that you will always be transferring the full data set out of the source system and then filtering rows in the data flow. If it failed on row 99,999 out of 1,000,000, you will still need to stream all 1,000,000 rows back to SSIS for it to find the one that hasn't been sent.
The other approach is to use a dynamic filter in the WHERE clause of your source query. If you can make assumptions such as rows being inserted in order, then you can structure your SSIS package with an Execute SQL Task that runs a query like SELECT COALESCE(MAX(SomeId), 0) + 1 AS startingPoint FROM dbo.MyTable against the destination database and assigns the result to an SSIS variable (@[User::StartingId]). You then use an expression to build the select statement for your source, something like "SELECT * FROM dbo.MyTable T WHERE T.SomeId >= " + (DT_WSTR, 10) @[User::StartingId]. Now when the data flow begins, it will start where it last loaded data. The challenge with this approach is being sure you are in a scenario where data isn't inserted out of order.
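Spelled out, and keeping the same made-up dbo.MyTable / SomeId names, the two pieces look roughly like this (a sketch, not a tested package):

    -- Execute SQL Task, run against the destination: where did the last load stop?
    -- Map the single-row result to the SSIS variable @[User::StartingId].
    SELECT COALESCE(MAX(SomeId), 0) + 1 AS startingPoint
    FROM dbo.MyTable;

    -- Expression used to build the source's SQL command (for example on a
    -- variable the OLE DB Source reads as "SQL command from variable"):
    -- "SELECT * FROM dbo.MyTable T WHERE T.SomeId >= " + (DT_WSTR, 10) @[User::StartingId]

On restart, the source only pulls rows at or beyond the first id the destination hasn't seen, so the rows loaded before the failure aren't transferred again.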
Let me know if you have questions, need things better explained, pictures, etc. Also, the above code is freehanded, so there could be syntax errors, but the logic should be correct.