Is it possible to override the Excel Data Type through SSIS? - sql

I've tried finding a solution for my issue but alas the problem continues. I've got an Excel Destination which I am trying to map in SSIS. [Please note the issue is with the way SSIS identifies the data type of the Excel column. The scenario is OLE DB Source > Data Conversion > Excel Destination. Please don't tell me to do a Data Conversion or use the Input and Output Properties method, because it doesn't work: it just converts back to what SSIS "thinks" it's meant to be the instant I click out of the editor window.] I'm trying to create a new Excel document through SSIS by mapping the template to my data from the OLE DB Source.
Now when I do it with example data in the Excel Destination, it works fine because SSIS registers that the value in the workbook is NTEXT [which is what I want]. However, the instant I apply the expression to use a blank template [with just headers, no example data] it changes the data type in my template to NVARCHAR(255), which is wrong, and my package fails when I execute it due to an incompatible data type.
I've tried converting the column within the Excel workbook to Text format, but it doesn't matter, because when you pull it in to the Data Flow SSIS overwrites it and identifies that column as NVARCHAR(255). Even when I give up and comply and change the input data to NVARCHAR(255), because I'm just so annoyed, it still doesn't work: the package fails with an error message saying my column would be truncated [-_-"]. I can't win.
I'll probably try to use a SQL Command in the Excel Destination Editor to force it to identify the column as NTEXT, or find some other way of forcing SSIS to identify the column as NTEXT, but is there another way I am not aware of? I feel this is quite a well-known issue and there should be a plausible solution. Any assistance will be appreciated. Thank you.

Related

SSIS: Excel data source - if column not exists use other column

I am using a select statement in the Excel source to select just specific columns from the Excel file for import.
But I am wondering, is it possible to select data in such a way that when I select, for example, the column named Column_1, and this column does not exist in the Excel file, it will try to select the column named Column_2 instead? Currently, if Column_1 is missing, the data flow task fails.
Use a Script Task and write .NET code to read the Excel file and check whether Column_1 is available in the file. If the column is not present, use Column_2 as input. A Script Component in SSIS can act as a source.
SSIS is metadata based and does not support dynamic metadata; however, you can use a Script Component, as @nitin-raj suggested, to handle all known source columns. There is a good post below on how it can be done.
Dynamic File Connections
If you have many such files that can have varying columns, then it is better to create a custom component. However, you cannot have dynamic metadata even with a custom component; the set of columns must be known to SSIS upfront.
If the list of columns keeps changing and you cannot know in advance which columns to expect, then you are better off handling the entire thing in C#/VB.NET using a Script Task in the control flow.
As a best practice, because SSIS metadata is static, any data quality and formatting issues in source files should be corrected before the SSIS data flow task runs.
I have seen this situation before and there is a very simple fix. At the beginning of your SSIS package, use a File System Task to create a copy of the source Excel file, then run a C# script or execute PowerShell to rename the columns, so that if Column_1 does not exist it is either added at the appropriate spot in the Excel file or, if the column name is wrong, it is corrected.
As a result of this, you will not need to refresh your ssis meta data every time it fails. This is a standard data standardization practice.
The easiest way is to add two data flow tasks, one data flow for each Excel source select statement and use precedence constraints to execute the second data flow when the first one fails.
The disadvantage of this approach is that if the first data flow task fails for another reason, it will also try to execute the second one. You will need some advanced error handling to check if the error is thrown due to missing columns or not.
But if I had a similar situation, I would use a Script Task to check whether the column exists and build the SQL command dynamically. Note that this SQL command must always return the same metadata (you must use aliases).
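The idea of checking for the column and building the command with an alias can be sketched as follows. This is only an illustration in Python, not Script Task code: the function name `build_excel_select`, the sheet name `Sheet1$`, and the list of available headers are assumptions, and in a real package the header list would come from reading the workbook.

```python
def build_excel_select(available_columns, candidates, alias, sheet="Sheet1$"):
    """Build a SELECT that reads the first candidate column found in the sheet.

    The chosen column is always exposed under the same alias, so the
    data flow sees identical metadata whichever source column was used.
    """
    available = {name.lower() for name in available_columns}
    for candidate in candidates:
        if candidate.lower() in available:
            return f"SELECT [{candidate}] AS [{alias}] FROM [{sheet}]"
    raise ValueError(f"none of {candidates} found in sheet {sheet}")


# Hypothetical header row: Column_1 is missing, so Column_2 is used instead.
query = build_excel_select(
    ["Id", "Column_2"], ["Column_1", "Column_2"], alias="Column_1"
)
print(query)
```

Because the alias stays `Column_1` in both cases, the downstream mappings would not need to be refreshed when the fallback column is used.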
Helpful links
Overview of SSIS Precedence Constraints
Working with Precedence Constraints in SQL Server Integration Services
Precedence Constraints

Padding ssis input source columns to avoid truncation errors?

First post. In SSIS I am using an ODBC Source, and the database (or ODBC driver) doesn't appear to report column metadata correctly for any of the tables in the database for varchar type columns. Therefore, each time I import a table, I get truncation errors on all the varchar fields. Is there any way to set the size of these fields besides doing it ONE AT A TIME in the advanced editor? When importing a flat file source it lets you select a padding % for string fields. Does something like this exist for OLE or ODBC sources? If not, is there any way I can override the column length to, say, force them all to be VARCHAR(1000)?
I have never experienced SQL Server providing the wrong metadata for an ODBC connection, and it is unlikely you have a ghost in the machine (Deus Ex Machina). The metadata of the columns can be set in the ODBC source via the advanced editor. I am willing to bet that is where the difference is. To confirm this:
Right click the ODBC connection and select the Advanced Editor
Click on the Input/Output Properties tab
Expand OLE DB Source Output
Expand both External Columns and Output Columns
Inspect each column pair and verify that the metadata matches
Correct any mismatches in the metadata
Let me know if that works. If it does not work, please provide the data and the SQL query you are using.
The VARCHAR field width must be set to the maximum incoming field width. I know the default field width is 50. Regardless, each field must be set. I previously worked on a project with large numbers of columns on the input files. My solution was to store the meta-data for the columns in a database table and then I built a C# application to read in the meta-data and then modify the *.dtsx file and set the meta data on all columns. This is the best solution that I am aware of to automate the task.
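A rough sketch of that kind of automation is shown below, in Python rather than C#. The XML fragment is a simplified, hypothetical stand-in for the column metadata in a package file (a real *.dtsx has more nesting and attributes and differs between SSIS versions), and `apply_lengths`, the `widths` dictionary, and the default of 1000 are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for column metadata in a package file.
SAMPLE = """<outputColumns>
  <outputColumn name="CustomerName" dataType="str" length="50"/>
  <outputColumn name="Notes" dataType="str" length="50"/>
  <outputColumn name="OrderId" dataType="i4"/>
</outputColumns>"""


def apply_lengths(xml_text, lengths, default_length=None):
    """Set the length attribute on every string column in the fragment.

    Columns listed in `lengths` get their stored width; other string
    columns get `default_length` if one is given. Non-string columns
    are left untouched.
    """
    root = ET.fromstring(xml_text)
    for column in root.iter("outputColumn"):
        if column.get("dataType") not in ("str", "wstr"):
            continue
        new_length = lengths.get(column.get("name"), default_length)
        if new_length is not None:
            column.set("length", str(new_length))
    return root


# Hypothetical metadata, e.g. loaded from a database table of column widths.
widths = {"CustomerName": 200}
updated = apply_lengths(SAMPLE, widths, default_length=1000)
result = {c.get("name"): c.get("length") for c in updated.iter("outputColumn")}
print(result)
```

Editing a package file outside the designer is easy to get wrong, so working on a copy and reopening it in the designer to validate would be a sensible precaution.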
Unfortunately, I don't have much experience with pulling data through ODBC. Are you pulling from an Access database? Or, what are you pulling from?

SSIS Only inserting data if Error Output set to Redirect Row

Got a weird problem.
I have created an SSIS package that imports data via an ODBC connection between an External database and SQL Server 2008.
The Data Flow has a Source Query (simple: select columns from table).
Data Conversion to String [DT_STR]
Destination: SQL Server table with same structure.
If I execute the Task, 11,000+ rows are entered. There are No errors.
However, the data is all White Space and NULLS, except the Date fields are correct (which is how I can tell that it's trying to enter the right data).
If I change the Error Output on any single column in the Source Query to Redirect, all of the fields are populated correctly. If I set it to Ignore or Failure, we get the blanks.
I added a Flat File Destination and connected the Redirect row, and nothing is written to it.
We have several packages like this, and I've never had this problem.
I also recreated the entire Package, and the same thing happens.
Any ideas?
There are 8 Data Flow tasks doing the same thing in this package for different databases. If I can't figure it out, is there a good way to set the Error Output to nowhere?
UPDATE 2/26/18:
I added Data Viewers, and on the first Viewer from the Source, I get the Blanks/NULLS if the Error Output on the Source is NOT set to Redirect Rows. If I set to Redirect, the Viewer shows data. NOTE: I can set ANY column Error Output to Redirect on the Error column OR the Truncation column. As long as one is selected, the data from the Source comes through.
Since this is the Viewer between the Source and Data Conversion, doesn't that mean the problem would be at the Source Query and NOT the Data Conversion?
Here it is with Source Query - Error Output set to default Fail Component
Here I set to Redirect:

SSIS package SQL DB to a Excel Spreadsheet destination Unicode Error

I have an OLE DB Source going to an Excel destination. I receive the following error:
Error at Data Flow [Excel Destination [88]]: Column "X" cannot convert between unicode and non-unicode string data types.
I have added a Data Conversion to change string columns to Unicode. This has not resolved the problem. Any guidance would be appreciated.
Go to your Excel destination component --> Mappings --> hover your mouse over the column in question, and you'll see that it is a Unicode string.
Hence, you need a Data Conversion component to add an alias of the source column converted to DT_WSTR (Unicode string), AND you need to map that alias in the Excel destination component.
I replicated your problem, so I am providing the solution that worked for me.
If this doesn't work, delete these components and re-add them, as this will usually resolve the issue.
Try using a Derived Column instead of the Data Conversion transformation, with the following expression.
If the destination is Unicode:
(DT_WSTR,50)[X]
Else:
(DT_STR,50,1252)[X]
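When many columns need the same treatment, the choice between the two casts can be generated rather than typed by hand. The helper below is a hypothetical Python sketch that only builds the expression text shown above; the function name and the defaults for length and code page are assumptions taken from the example values.

```python
def cast_expression(column, destination_is_unicode, length=50, code_page=1252):
    """Return the Derived Column cast text for one column.

    Unicode destinations get a DT_WSTR cast; non-Unicode destinations
    get a DT_STR cast, which also needs a code page.
    """
    if destination_is_unicode:
        return f"(DT_WSTR,{length})[{column}]"
    return f"(DT_STR,{length},{code_page})[{column}]"


# An Excel destination column reported as a Unicode string.
print(cast_expression("X", destination_is_unicode=True))
# A non-Unicode destination column.
print(cast_expression("X", destination_is_unicode=False))
```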

generated excel from SSIS but getting quote in every column?

I have generated an Excel file from an SSIS package successfully.
But every column has an extra ' (quote) mark. Why is that?
My source sql table is like below
Name price address
ashu 123 pune
jkl 34 UK
In my SQL table I defined all columns as varchar(50).
In the Excel Connection Manager, when it creates the table,
the Excel Destination takes all columns as the same varchar(50) data type.
And in the Data Flow I have used a Data Conversion transformation to prevent the Unicode conversion error.
Please advise what I need to change to get clean columns in the Excel file.
You could create a template Excel file in which you have specified all the column types (change to Text from General) and the headers you will need. Store it in a /Template directory and copy it over to where you will need it from within the SSIS package.
In your SSIS package:
Use Script Component to copy Excel Template file into directory of choice.
Programmatically change its name and store the whole file path in a variable that will be used in your corresponding Data Flow Task.
Use Expression Builder for your Excel Connection Manager. Set the ExcelFilePath to be retrieved from your variable.
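The copy-and-rename steps above can be sketched like this. It is a Python illustration rather than the script that would run inside the package; `copy_template`, the `export` file stem, and the timestamp naming scheme are assumptions, and the demo writes a placeholder file instead of a real workbook.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def copy_template(template_path, output_dir, stem="export", now=None):
    """Copy the template into output_dir under a timestamped name.

    Returns the full path of the copy, which is the value a package
    variable would hold for the connection manager's ExcelFilePath.
    """
    now = now or datetime.now()
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    suffix = Path(template_path).suffix
    target = output_dir / f"{stem}_{now:%Y%m%d_%H%M%S}{suffix}"
    shutil.copyfile(template_path, target)
    return target


# Demo in a throwaway directory with a placeholder file.
with tempfile.TemporaryDirectory() as workdir:
    template = Path(workdir) / "Template" / "template.xlsx"
    template.parent.mkdir()
    template.write_bytes(b"placeholder")  # not a real workbook
    new_file = copy_template(
        template, Path(workdir) / "out", now=datetime(2018, 2, 26, 9, 30, 0)
    )
    copied_name = new_file.name
    copied_ok = new_file.read_bytes() == b"placeholder"
    print(copied_name)
```

Writing to a fresh copy each run leaves the template untouched, so the typed header row is still there for the next execution.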
The single quote or apostrophe is a way of entering any data in Excel and ensuring it is treated as text, so that numbers with leading zeros or fractions are not interpreted by Excel as numbers or dates.
An NJ zip code, for instance 07456, would be interpreted as 7456, but entering it as '07456 keeps its leading zero (please note that the numbers in your example are left aligned, like text is).
I guess SSIS is adding the quotes because your data is of VARCHAR type.
First, define the field types for your Excel destination in SSIS; any non-text fields will then format properly without the '. Then add a Derived Column transformation between your source and destination, and use a REPLACE expression for any text columns.
It should be:
REPLACE(Column1, "'", "")
This caused me major problems! So I completed the following:
You can change the excel version to 'Microsoft Excel 4.0' within the excel connection manager in your SSIS package.
Then within excel follow Options > Trust Center > Trust Center Settings > File Block Settings > Untick the 'Open' checkbox for 'Excel 4 workbooks' and 'sheets'.
It is a particular problem when using the Excel destination, at least with older versions of SSIS anyway. To answer the why question, there is this in the Microsoft documentation:
The following behaviors of the Jet provider that is included with the Excel driver can lead to unexpected results when saving data to an Excel destination.
Saving text data. When the Excel driver saves text data values to an Excel destination, the driver precedes the text in each cell with the single quote character (') to ensure that the saved values will be interpreted as text values. If you have or develop other applications that read or process the saved data, you may need to include special handling for the single quote character that precedes each text value.
Taken from https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms137643(v=sql.105)
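The "special handling" the documentation mentions could look something like the sketch below for an application that reads the saved values. It is a minimal Python illustration; `strip_text_marker` and the sample cell values are hypothetical. It removes only one leading quote, so an apostrophe inside a value is kept.

```python
def strip_text_marker(value):
    """Remove a single leading apostrophe from a text cell value.

    Non-string values and strings without a leading apostrophe are
    returned unchanged.
    """
    if isinstance(value, str) and value.startswith("'"):
        return value[1:]
    return value


# Hypothetical values as a downstream reader might receive them.
cells = ["'07456", "'pune", 123, "O'Brien"]
cleaned = [strip_text_marker(cell) for cell in cells]
print(cleaned)
```

Stripping only the first character is a deliberate choice here: replacing every apostrophe would also alter values that legitimately contain one.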