SSIS - Error Output doesn't contain the value which is erroring

I have an ADO.NET MySQL source connection that executes a stored procedure, which returns the data set. As we know, SSIS reads a few rows of the data to define the metadata; when I run this procedure, it defines the metadata based on the data it samples. However, I run this procedure in a loop for different sites (server and connection details are parameterized), and the columns and data types are the same for all sites. Because SSIS samples only a few rows while defining the metadata, it doesn't always get the data lengths and precisions right, and rows error out at the source connection itself. I am writing these errors to a flat file to process later, but when SSIS writes the error row, it replaces the actual value with NULL.
For example, for a column defined as Numeric(4,2), when SSIS encounters a value such as 243.32, instead of writing the actual value to the error file, it writes NULL.
Is there any way to make SSIS write the actual value to the error file? Below is the job flow. I'd appreciate any input on how to handle this.
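As context for why 243.32 fails a Numeric(4,2) column: precision 4 with scale 2 leaves only two digits before the decimal point, so anything at or above 100 overflows. A minimal sketch of that check (the function and values are illustrative, not part of SSIS):

```python
from decimal import Decimal

def fits(value: str, precision: int, scale: int) -> bool:
    """Return True if value fits the integer-digit budget of a
    NUMERIC(precision, scale) column (scale rounding not considered)."""
    d = Decimal(value)
    # precision - scale = digits allowed before the decimal point.
    integer_digits = precision - scale
    return abs(d) < Decimal(10) ** integer_digits

print(fits("99.99", 4, 2))   # True: within NUMERIC(4,2)
print(fits("243.32", 4, 2))  # False: needs precision >= 5
```

This is why sampling-based metadata bites here: if the sampled rows all fit NUMERIC(4,2), SSIS locks that in, and later rows such as 243.32 overflow at the source.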


Is it possible to override the Excel Data Type through SSIS?

I've tried finding a solution for my issue, but alas the problem continues. I've got an Excel data destination which I am trying to map in SSIS. [Please note the issue is with the way SSIS identifies the data type of the Excel input. The scenario is OLE DB Source > Data Conversion > Excel Destination; please don't tell me to do a Data Conversion or use the Input and Output Properties method, because it doesn't work: it just converts back to what SSIS "thinks" it's meant to be the instant I click out of the properties window.] I'm trying to create a new Excel document through SSIS by mapping the template to my data source from the OLE DB Source.
Now, when I do it with example data in the Excel destination, it works fine, because SSIS registers that the value in the workbook is NTEXT (which is what I want). However, the instant I apply the expression to use a blank template (with just headers, no example data), it converts the data type in my template to NVARCHAR(255), which is wrong, and my package fails when I execute it due to the incompatible data type.
I've tried converting the data type within the Excel workbook to a TEXT format, but it doesn't matter, because when you pull it into the Data Flow, SSIS overwrites it and identifies that column as NVARCHAR(255). Even when I give up and comply, changing the input data to NVARCHAR(255) because I'm just so annoyed, it still doesn't work: the package fails with an error message that my column field is truncated. I can't win.
I'll probably try using a SQL command to force it to identify the column as NTEXT in the Excel Destination Editor, or some other way of forcing SSIS to identify the column as NTEXT, but is there another way I am not aware of? I feel this is quite a well-known issue and there should be a plausible solution. Any assistance will be appreciated. Thank you.
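For context on why the blank template guesses NVARCHAR(255): the ACE/Jet provider infers each column's type by sampling the first few rows of the sheet (8 by default, governed by the TypeGuessRows registry value), so a header-only template gives it nothing to go on. The commonly cited workaround is adding IMEX=1 to the connection string's extended properties, which treats ambiguous or mixed columns as text. A sketch of such a connection string (the file path is illustrative):

```text
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\templates\report.xlsx;
Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";
```

Whether IMEX=1 yields NTEXT or NVARCHAR(255) still depends on the driver's sampling, so treat this as a direction to test, not a guaranteed fix.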

SSIS: Excel data source - if column not exists use other column

I am using a select statement in an Excel source to select just specific columns of data from Excel for import.
But I am wondering: is it possible to select the data in such a way that I select, for example, a column named Column_1, but if this column does not exist in the Excel file, it tries to select a column named Column_2 instead? Currently, if Column_1 is missing, the Data Flow Task fails.
Use a Script task and write .NET code to read the Excel file, then check for the availability of Column_1 in the file. If the column is not present, use Column_2 as the input. A Script Component in SSIS can act as a source.
SSIS is metadata-based and does not support dynamic metadata; however, you can use a Script Component, as @nitin-raj suggested, to handle all known source columns. There is a good post below on how it can be done.
Dynamic File Connections
If you have many such files with varying columns, it is better to create a custom component. However, you cannot have dynamic metadata even with a custom component; the set of columns must be known to SSIS up front.
If the list of columns keeps changing and you cannot know the expected columns in advance, you are better off handling the entire thing in C#/VB.NET using a Script Task in the control flow.
As a best practice, because SSIS metadata is static, any data quality and formatting issues in source files should be corrected before the SSIS Data Flow Task runs.
I have seen this situation before, and there is a very simple fix. At the beginning of your SSIS package, use a File System Task to create a copy of the source Excel file, then run a C# script or execute PowerShell to rename the columns, so that if Column_1 does not exist, it is added at the appropriate spot in the Excel file, or, if the column name is merely wrong, it is corrected.
As a result, you will not need to refresh your SSIS metadata every time it fails. This is a standard data standardization practice.
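The header-normalization step above can be sketched as follows. For a real .xlsx file the C# or PowerShell script would use a spreadsheet library or Excel interop; here the logic is shown against a CSV copy, with illustrative column names:

```python
import csv

def normalize_header(src_path: str, dst_path: str,
                     expected: str, fallback: str) -> None:
    """Copy src to dst, renaming the fallback column to the expected name,
    or appending an empty expected column if neither name is present."""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        if expected not in header:
            if fallback in header:
                # Wrong name: correct it in place.
                header[header.index(fallback)] = expected
            else:
                # Missing entirely: add it (here, at the end).
                header.append(expected)
        writer.writerow(header)
        for row in reader:
            writer.writerow(row)
```

Because the copy always ends up with the expected column names, the static SSIS metadata never needs refreshing.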
The easiest way is to add two Data Flow Tasks, one for each Excel source select statement, and use precedence constraints to execute the second data flow when the first one fails.
The disadvantage of this approach is that if the first Data Flow Task fails for any other reason, it will still try to execute the second one. You would need some advanced error handling to check whether the error was thrown due to missing columns or not.
If I had a similar situation, though, I would use a Script Task to check whether the column exists and build the SQL command dynamically. Note that this SQL command must always return the same metadata (you must use aliases).
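The dynamic-SQL idea can be sketched like this. In SSIS, the check would live in a C# Script Task that writes the statement into a package variable; the logic, with illustrative column and sheet names, is:

```python
def build_select(available_columns):
    """Pick Column_1 if present, otherwise fall back to Column_2, and
    alias the result so the statement always returns the same metadata."""
    source = "Column_1" if "Column_1" in available_columns else "Column_2"
    # The alias keeps the output column name (and SSIS metadata) identical
    # regardless of which physical column was chosen.
    return f"SELECT [{source}] AS [Column_1] FROM [Sheet1$]"

print(build_select(["Column_1", "Other"]))
# SELECT [Column_1] AS [Column_1] FROM [Sheet1$]
print(build_select(["Column_2", "Other"]))
# SELECT [Column_2] AS [Column_1] FROM [Sheet1$]
```

The single Excel source then uses the variable as its SQL command, so only one Data Flow Task is needed.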
Helpful links
Overview of SSIS Precedence Constraints
Working with Precedence Constraints in SQL Server Integration Services
Precedence Constraints

Variable values stored outside of SSIS

This is an SSIS question for advanced programmers. I have a SQL table that holds clientid, clientname, Filename, Ftplocationfolderpath, filelocationfolderpath.
This table holds a unique record for each of my clients. As my client list grows I add a new row in my sql table for that client.
My question is this: Can I use the values in my sql table and somehow reference each of them in my SSIS package variables based on client id?
The reason for the SQL table is that sometimes we get requests to change the delivery location or file name of a file we send externally. We would like to be able to change those things dynamically, on the fly, in the SQL table, instead of having to export the package each time, manually change it, and re-import it. Each client has its own SSIS package.
Let me know if this is feasible; I'd appreciate any insight.
Yes, it is possible. There are two ways to approach this, and it depends on how the job runs: whether a single job run covers a single client or multiple clients.
Either way, you will use the Execute SQL Task to retrieve data from the database and assign it to your variables.
You are running for a single client. This is fairly straightforward. In the Result Set, select the option for Single Row and map the single row's result to the package variables and go about your processing.
You are running for multiple clients. In the Result Set, select Full Result Set and assign the result to a single package variable of type Object; give it a meaningful name like ObjectRs. You will then add a Foreach Loop container:
Type: Foreach ADO Enumerator
ADO object source variable: Select the ObjectRs.
Enumerator Mode: Rows in all the tables (ADO.NET dataset only)
In Variable mappings, map all of the columns in their sequential order to the package variables. This effectively transforms the package into a series of single transactions that are looped.
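The multiple-clients pattern above amounts to: fetch the whole configuration table once, then loop over its rows, assigning each row's columns to per-iteration variables. A minimal sketch with sqlite3 standing in for the SQL Server table (table and sample data are illustrative):

```python
import sqlite3

# In-memory stand-in for the client configuration table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ClientConfig (
    ClientId INTEGER, ClientName TEXT, FileName TEXT,
    FtpLocationFolderPath TEXT, FileLocationFolderPath TEXT)""")
conn.executemany(
    "INSERT INTO ClientConfig VALUES (?, ?, ?, ?, ?)",
    [(1, "Acme", "acme.csv", "/ftp/acme", "/files/acme"),
     (2, "Globex", "globex.csv", "/ftp/globex", "/files/globex")])

# "Full Result Set" step: pull every row once into one object...
rows = conn.execute(
    "SELECT * FROM ClientConfig ORDER BY ClientId").fetchall()

# ...then the Foreach-loop step: map each row's columns, in order,
# to the per-iteration variables and process that client.
for client_id, client_name, file_name, ftp_path, local_path in rows:
    print(f"Processing {client_name}: upload {file_name} to {ftp_path}")
```

Each pass through the loop is the equivalent of one single-client run, which is what makes the package a series of single transactions.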
Yes.
I assume that you run your package once per client or use some loop.
At the beginning of the per-client code, read all the required values from the database into SSIS variables, and then use these variables to define what you need. You should not hardcode client-specific information in the package.

SSIS Only inserting data if Error Output set to Redirect Row

Got a weird problem.
I have created an SSIS package that imports data via an ODBC connection between an External database and SQL Server 2008.
The Data Flow has a Source Query (simple: select columns from table).
Data Conversion to String [DT_STR]
Destination: SQL Server table with same structure.
If I execute the task, 11,000+ rows are inserted, with no errors.
However, the data is all white space and NULLs, except for the date fields, which are correct (which is how I can tell it's trying to enter the right data).
If I change the Error Output on any single column in the Source Query to Redirect, all of the fields are populated correctly. If I set it to Ignore or Failure, we get the blanks.
I added a Flat File Destination and connected the Redirect row, and nothing is written to it.
We have several packages like this, and I've never had this problem.
I also recreated the entire Package, and the same thing happens.
Any ideas?
There are 8 Data Flow Tasks doing the same thing in this package for different databases. If I can't figure this out, is there a good way to point the Error Output at nowhere?
UPDATE 2/26/18:
I added Data Viewers, and in the first viewer after the Source, I get the blanks/NULLs if the Error Output on the Source is NOT set to Redirect Rows. If I set it to Redirect, the viewer shows data. NOTE: I can set ANY column's Error Output to Redirect, on either the Error column or the Truncation column; as long as one is selected, the data from the Source comes through.
Since this viewer sits between the Source and the Data Conversion, doesn't that mean the problem is in the Source Query and NOT the Data Conversion?
Here it is with Source Query - Error Output set to default Fail Component
Here I set to Redirect:

SSIS - Variable truncated/converted when passed to stored procedure

I'm working on an SSIS package that gets data from a CSV file with a single column, where the data from each row is passed as an argument to a stored procedure. Each row is 9 numbers and 1 letter (i.e., a 10-character string).
The structure of the package goes like this:
For loop that iterates over files in a given directory.
Data Flow ( Flat File Source getting CSV file -> Recordset Destination )
-> For loop that iterates over each record.
Execute SQL Task calls stored procedure with the value from each record as a parameter.
The problem:
Based on debug code in the stored procedure, this is what is passed:
Actual data: 111111111X, 222222222Y, 333333333Z.
What's received: 1, 2, 3.
I added some breakpoints and watched the variable used to store the string, and I can see its value during each iteration is correct, i.e., the full 10-character string, both before and after the stored procedure call.
Based on some googling, this leads me to think there is some type conversion happening between the time the data is passed from the variable to the Execute SQL Task and the call is made.
This link indicates that the task will cast data sometimes:
https://social.msdn.microsoft.com/forums/sqlserver/en-US/eadaee01-33b0-47f3-8702-34a40d2fe333/debug-execute-sql-task-what-parameter-value-is-passed
I've ruled out type/length nonsense and the watched variable confirms. I've also tried casting in the statement itself, to no avail.
Can anyone confirm my hypothesis and/or offer a solution?
For what it's worth, this may occur even when both of the following are true: your source flat file is a Unicode file, AND your target column is defined as nvarchar(max).