Finding the column throwing an exception during data migration with SSIS from Oracle to MS SQL

I am working on a data migration project. In the current task, I have to select data from n tables in Oracle, join them, and insert the result into a single SQL Server table. The number of rows is in the millions.
Issue: There is data in Oracle that throws an exception when we try to insert it into SQL Server. For example, the data type of the Oracle column is VARCHAR2 whereas in SQL Server it is int. The data is numeric, but a few columns contain special characters such as ','. This is one example of a value that will fail when we try to insert it into the SQL Server table, and it is failing for many such columns.
I am using SSIS for this task. I am moving the IDs of the rows that throw such errors, as in the example above, into an error table.
Question: I want the name of the column for which the insertion is failing, for each row. Is there an option in SSIS? On error, I want to store the ID and the column name in an error table.
I tried searching the internet but didn't find anything. SSIS does have an option to configure the rows that have errors, but I didn't find an option there that provides the column name so it can be inserted into an error table.
Edit: The data will arrive on a daily basis, i.e. the SSIS package will be executed daily.

The Error Output contains many columns providing information about the error.
The list of columns includes the columns in the component input, the ErrorCode and ErrorColumn columns added by previous error outputs, and the ErrorCode and ErrorColumn columns added by this component.
If you are using an OLE DB Destination, you cannot redirect the error rows while the Fast Load option is enabled. And since you mentioned that
The number of rows is in the millions.
row-by-row insertion is not recommended.
If only a few columns are affected, I suggest adding a Data Conversion Transformation and using its error output to capture the error information; a T-SQL alternative is sketched after the links below.
References and helpful links
Configuring Error Output Columns
SSIS how to redirect the rows in OLEDB Destination when the fast load option is turned on and maximum insert commit size set to zero
Error Handling in Data
Error Handling With OLE DB Destinations
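If the Oracle data can be landed in an all-varchar staging table first, the failing column can also be identified in plain T-SQL rather than inside the data flow. A minimal sketch, assuming SQL Server 2012+ (for TRY_CONVERT); the table and column names (dbo.staging, dbo.error_log, amount, quantity) are illustrative, not from the question:

-- Log the id of each row together with the first column that fails to
-- convert from the staged varchar value to the destination int type.
INSERT INTO dbo.error_log (row_id, failing_column)
SELECT s.id,
       CASE
           WHEN s.amount   IS NOT NULL AND TRY_CONVERT(int, s.amount)   IS NULL THEN 'amount'
           WHEN s.quantity IS NOT NULL AND TRY_CONVERT(int, s.quantity) IS NULL THEN 'quantity'
       END AS failing_column
FROM dbo.staging AS s
-- Keep only the rows where at least one column fails the conversion.
WHERE (s.amount   IS NOT NULL AND TRY_CONVERT(int, s.amount)   IS NULL)
   OR (s.quantity IS NOT NULL AND TRY_CONVERT(int, s.quantity) IS NULL);

Note that a row failing in several columns is logged only for the first one here; one INSERT per column (or a UNION ALL) would capture all of them.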

Related

SQL INSERT sp_cursor Error

I have a pair of linked SQL servers: ServerA and ServerB. I want to write a simple INSERT INTO SELECT statement which will copy a row from ServerA's database to ServerB's database. ServerB's database was copied directly from ServerA's, and so they should have the exact same basic structure (same column names, etc.)
The problem is that when I try to execute the following statement:
INSERT INTO [ServerB].[data_collection].[dbo].[table1]
SELECT * FROM [ServerA].[data_collection].[dbo].[table1]
I get the following error:
Msg 16902, Level 16, State 48, Line 1
sp_cursor: The value of the parameter 'value' is invalid.
On the other hand, if I try to execute the following statement:
INSERT INTO [ServerB].[data_collection].[dbo].[table1] (Time)
SELECT Time FROM [ServerA].[data_collection].[dbo].[table1]
The statement works just fine and executes as expected, regardless of which or how many columns I specify to insert.
So my question here is why would my INSERT INTO SELECT statement function properly when I explicitly specify which columns to copy, but not when I tell it to copy everything using "*"? My second question would then be: how do I fix the problem?
Googling around to follow up on my initial hunch, I found a source I consider reliable enough to cite in an answer.
The 'value' parameter specified isn't one of your columns; it is the optional argument to sp_cursor that is called implicitly via your INSERT INTO...SELECT.
From SQL Server Central...
I have an ssis package that needs to populate a sql table with data
from a pipe-delimited text file containing 992 (!) columns per record.
...Initially I'd set up the package to contain a data flow task to use
an ole db destination control where the access mode was set to Table
or view mode. For some reason though, when running the package it
would crash, with an error stating the parameter 'value' was not valid
in the sp_cursor procedure. On setting up a trace in profiler to see
what this control actually does it appears it tries to insert the
records using the sp_cursor procedure. Running the same query in SQL
Server Management Studio gives the same result. After much testing and
pulling of hair out, I've found that by replacing the sp_cursor
statement with an insert statement the record populated fine which
suggests that sp_cursor cannot cope when more than a certain number
of parameters are attempted. Not sure of the figure.
Note the common theme here between your situation and the one cited - a bazillion columns.
That same source offers a workaround as well.
I've managed to get round this problem however by setting the access
mode to be "Table or view - fast load". Viewing the trace again
confirms that SSIS attempts this via a "insert bulk" statement which
loads fine.
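For the plain T-SQL case, a rough sketch of the same idea is to avoid SELECT * by generating the explicit column list instead of typing hundreds of names by hand. This assumes SQL Server 2005+ (sys.columns, FOR XML PATH) and reuses the database/table names from the question; treat it as illustrative, not as a drop-in fix:

-- Build a comma-separated, quoted list of all columns in the table.
DECLARE @cols nvarchar(max)
SELECT @cols = STUFF((
    SELECT ', ' + QUOTENAME(c.name)
    FROM [data_collection].sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'[data_collection].[dbo].[table1]')
    ORDER BY c.column_id
    FOR XML PATH('')), 1, 2, '')

-- Use the generated list on both sides of the INSERT ... SELECT.
DECLARE @sql nvarchar(max)
SET @sql = N'INSERT INTO [ServerB].[data_collection].[dbo].[table1] (' + @cols + N')
             SELECT ' + @cols + N' FROM [ServerA].[data_collection].[dbo].[table1]'

EXEC sys.sp_executesql @sql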

Can't convert String to Numeric/Decimal in SSIS

I have five or six OLE DB Sources with a String [DT_STR] column, with a length of 500 and 1252 (Latin) as the code page.
The values in the column look like 0,08 or 0,10, etc. As you can see, the decimal separator is a comma.
All of them are equal except one. In this one source, I have a POINT as the separator. This one works when I set the data type in the Advanced Editor of the OLE DB Source. Another one (comma-separated) also works if I set the data type in the Advanced Editor of the OLE DB Source. BUT the weird thing is that it isn't working with the other sources, although they are the same (separated with a comma).
I tested Numeric(18,2) and decimal(2).
Another attempt to solve the problem with a Data Conversion task and/or a Derived Column task failed.
I'm using SQL Server 2008 R2
Slowly, I think SSIS is fooling me :)
Has anyone an idea?
/// EDIT
Here are two screenshots:
Working: click
Not working: click
I would not set the data type in the Advanced Editor of the OLE DB Source. I would convert the data in the SQL code of the OLE DB Source, or in a Script Transformation, e.g. using Decimal.TryParse, which would populate a new column.
SSIS is unbelievably fussy over data types, and trying to mess with its internals is not productive.
Check whether there are any spaces in between the commas; SSIS may be throwing an error trying to convert a blank space to a number. A blank space is not the same as nothing between the separators.
Redirect the error rows and output the data to a file. Then you can examine the data that is being rejected by SSIS and determine why it's causing the error.
Reasons for the error:
1) NULLs are not properly handled, either in the destination database or during SSIS package creation. It is quite possible that the source contains NULL data but the destination does not accept NULLs, which generates the above error.
2) Data types between source and destination do not match. For example, the source column has varchar data and the destination column has an int data type. This can easily generate the above error. Certain data types will be converted to another data type automatically without generating the error, but incompatible data types will generate the "The value could not be converted because of a potential loss of data." error.
The issue arises when there is an unhandled space or NULL. I have worked around it using the conditional (ternary) operator, which checks the length:
LEN(TRIM([Column Name])) >= 1 ? (DT_NUMERIC,38,8)[Column Name] : 0
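Once the text has been landed in SQL Server, the comma separator can also be normalized in T-SQL. A minimal sketch, assuming SQL Server 2012+ (for TRY_CONVERT); the table and column names (dbo.staging_amounts, amount_text) are made up for illustration:

SELECT TRY_CONVERT(decimal(18,2),
       -- treat all-blank values as NULL, then swap the comma for a point
       REPLACE(NULLIF(LTRIM(RTRIM(amount_text)), ''), ',', '.')) AS amount
FROM dbo.staging_amounts;

TRY_CONVERT returns NULL instead of raising an error for anything that still fails to parse, which makes the bad rows easy to isolate.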

VS 2005 SSIS Error value origin

I have an SSIS package created in VS 2005 that has started to give me the following error:
[Lawson Staging Table [4046]] Error: There was an error with input column "JOB_CODE" (4200) on input "OLE DB Destination Input" (4059). The column status returned was: "The value violated the integrity constraints for the column.".
My first question is: what are the 4046, 4200 & 4059 values following my table, column and destination?
My second question is about the integrity constraint message. The destination table is a heap (no keys or indexes) with no constraints. The destination column is defined as a varchar(10). The input column is from oracle, is defined as char(9) and is called job_code. So - where is there an integrity constraint defined?
The final question is about the select statement, which looks like the following:
Select ...
,lpad(trim(e.job_code),10,'0') as job_code ...
If I take the lpad and trim functions out, it works but I need these functions in place because my spec calls for a fixed length column padded with leading zeros. This column returns data as expected in TOAD but fails in the ssis package. Does anyone see an issue with how the functions are being used?
Since this package worked in the past but suddenly started to throw this error, I'm assuming that new invalid data has come into play. However, recently added rows don't seem to be any different than historical records.
Those numbers are most likely the IDs assigned to each task/table/column, etc.
You can go to the Advanced Editor of the data flow component and look at the input and output properties. You will see that there is an ID assigned to each input and each column.
Next: the error that you are getting usually occurs when the "Allow Nulls" option is unchecked on the destination column; a query to spot such columns follows the steps below.
Try this:
Look at the name of the column for this error/warning.
Go to SSMS and find the table
Allow Nulls for that Column
Save the table
Rerun the SSIS package
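A quick way to find the destination columns that currently reject NULLs is to query the catalog views. This assumes SQL Server 2005+; the table name dbo.LawsonStagingTable is a placeholder for your destination table:

-- List every column of the destination table that does not allow NULLs.
SELECT c.name AS column_name
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.LawsonStagingTable')
  AND c.is_nullable = 0;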

string or binary data truncated after server reboot

After rebooting SQL Server 2005 Standard 9.0.3233, we have been experiencing the above error in some of our stored procedures which try to insert into a table variable from a specific column of a table. The base table has the column defined as varchar(10), but the table variable has the column being inserted into defined only as varchar(3). However, the SELECT statement only returns data with 3 or less characters.
We have not changed the data or the code base in any other way, and this is only happening on our production server. If I run the same query on a test server with the same SQL Server 2005 edition installed, but an older backup, the error does not occur. The same data is returned in both queries if the INSERT is removed, or the table variable column is extended to match the base table.
What I have noticed is that the execution plan is different when the same query is run on the two servers. On the server where the query works, there is a Compute Scalar operation that takes the column and does an implicit conversion to varchar(3) before it is output to the nested loop join operation.
On the server that returns an error, there is a hash join and table scan of the base table instead. I have already tried to rebuild indices and update statistics on all tables involved, including using fullscan, and with the same stat_stream as in the server that works, but I can't get the same plan back.
For now we have fixed the few stored procedures which were broken by modifying the size of the table variable column, but I would like to know if there is a way to get the statistics and indices back so that they produce the same plans as before, in case there is more code out there which just hasn't executed yet.
This is known behavior and probably has nothing to do with your reboot. Effectively, what's happening is that the optimizer is re-ordering the logical elements of your query for performance reasons, but this results in the truncation-error check being done before the WHERE clause's filtering.
The recommended solution is to wrap the column expression that gets assigned to your VARCHAR(3) in a CASE expression that duplicates the length test in your WHERE clause. I know that sounds illogical, but it usually fixes the problem.
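A minimal sketch of that workaround; the table, column, and length test are invented for illustration (a varchar(10) base column feeding a varchar(3) table variable column, as in the question):

DECLARE @t TABLE (code varchar(3));

INSERT INTO @t (code)
SELECT CASE WHEN LEN(b.code10) <= 3            -- duplicate the WHERE test...
            THEN CAST(b.code10 AS varchar(3))  -- ...so the conversion only
       END                                     -- runs on rows that pass it
FROM dbo.base_table AS b
WHERE LEN(b.code10) <= 3;

Rows that the optimizer happens to evaluate before the filter now produce NULL from the CASE instead of a truncation error.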

Create delimited string from a row in stored procedure with unknown number of elements

Using Microsoft SQL Server 2000, is there a way to create a delimited string based upon an unknown number of columns per row?
I'm pulling one row at a time from different tables and am going to store them in a column in another table.
A simple SQL query can't do anything like that. You need to specify the fields you are concatenating.
The only method that I'm aware of is to dynamically build a query for each table.
I don't recall the structure of MSSQL 2000, so I won't try to give an exact example; maybe someone else can. But there -are- system tables that contain table definitions. By parsing the contents of those system tables you can dynamically build the necessary query for each source data table (a sketch follows at the end of this answer).
T-SQL that writes T-SQL, however, can be a bit tricky to debug and maintain :) So be careful how you structure everything...
Dems.
EDIT:
Or just do it in your client application.
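For what it's worth, a rough sketch of the dynamic approach on SQL Server 2000, using the old syscolumns system table; the table name, delimiter, and varchar sizes are illustrative, and the variable-concatenation trick relies on undocumented (but widely used) behavior:

DECLARE @cols varchar(8000)
SET @cols = ''

-- Concatenate a conversion expression for every column of the source table.
SELECT @cols = @cols +
       CASE WHEN @cols = '' THEN '' ELSE ' + ''|'' + ' END +
       'ISNULL(CONVERT(varchar(100), [' + name + ']), '''')'
FROM syscolumns
WHERE id = OBJECT_ID('dbo.source_table')
ORDER BY colid

-- Build and run a SELECT that returns one '|'-delimited string per row.
EXEC ('SELECT ' + @cols + ' AS delimited_row FROM dbo.source_table')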