Is it possible to force an error in an Integration Services data flow to demonstrate its rollback? - sql-server-2005

I have been tasked with demoing how Integration Services handles an error during a data flow to show that no data makes it into the destination. This is an existing package and I want to limit the code changes to the package as much as possible (since this is most likely a one time deal).
The scenario we are trying to understand is a "systemic" failure: the source file disappears midstream, the file server loses power, etc.
I know I can make this happen by setting the source's Error Output to Failure and introducing bad data, but I would like to do something lighter than that.
I suppose I could add a Script Transform task, look for a certain value, and throw an error, but I was hoping someone has come up with something easier / more elegant.
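For reference, something like this is what I have in mind. It is only a sketch (C# shown for brevity; the 2005 designer would want VB.NET, and the Amount column and trigger value are made up purely for the demo):

```csharp
// Script Component (transformation) in the data flow; this goes into the generated ScriptMain.
// "Amount" is a hypothetical input column; any column already in the flow will do.
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // When the trigger value shows up, report an error and abort the component,
        // which is about as close to a mid-stream "systemic" failure as a demo gets.
        if (!Row.Amount_IsNull && Row.Amount == 999999)
        {
            bool cancel;
            ComponentMetaData.FireError(0, "Demo failure",
                "Deliberate error raised to demonstrate rollback.", string.Empty, 0, out cancel);
            throw new System.Exception("Deliberate demo failure.");
        }
    }
}
```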
Thanks,
Matt

Mess up the file that you are trying to import by pasting in some bad data, or by saving it in another format (UTF-8 or something like that).

We always have a task at the end that closes the data flow in our metadata tables. To test errors, I simply remove the ? that is the variable for the stored proc it runs. It's easy to do, easy to put back the way it was, and it doesn't mess up anything data-wise, since our error trapping then closes the data flow with an error. You could do something similar by adding a task that calls a stored proc with an input variable but assigning no parameters to it, so it will fail. Then, once the test is done, simply disable that task.
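If you'd rather not touch the stored procedure call at all, an even lighter throwaway option (just a sketch, and not what we actually use) is a Script Task that does nothing but report failure; disable it once the demo is over:

```csharp
// Body of Main() in a throwaway Script Task placed ahead of the data flow.
// (C# for brevity; the 2005 designer uses VB.NET, but the shape is the same.)
public void Main()
{
    Dts.Events.FireError(0, "Demo failure task",
        "Deliberate failure to demonstrate rollback.", string.Empty, 0);
    Dts.TaskResult = (int)ScriptResults.Failure;
}
```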

Data will make it to the destination if the data flow is not running inside a transaction. If you want to prevent loading partial data, you have to use transactions. There is also an option to force the end result of a control flow item to "Failure" irrespective of the actual result (the ForceExecutionResult property), but it is not available on data flow components. You will have to either produce an actual error in the data or code in a situation that will create an error. There is no other way...

Could we try the transaction-level (TransactionOption) property of the package?
If the data flow fails, it will roll back all the data written to the target; the data is committed to the target only when the data flow succeeds.
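If it helps to see where that property lives, here is a rough sketch that flips it through the SSIS runtime API instead of the designer (the package path is hypothetical, and MSDTC must be running for the distributed transaction to start):

```csharp
using Microsoft.SqlServer.Dts.Runtime;

class EnablePackageTransaction
{
    static void Main()
    {
        var app = new Application();

        // Hypothetical path; point it at the package being demoed.
        Package pkg = app.LoadPackage(@"C:\Demo\LoadCustomers.dtsx", null);

        // With Required, the package starts a transaction and the data flow inside it
        // commits or rolls back as a single unit.
        pkg.TransactionOption = DTSTransactionOption.Required;

        app.SaveToXml(@"C:\Demo\LoadCustomers.dtsx", pkg, null);
    }
}
```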

Related

Validations within SSIS

I currently have the following flow in an SSIS package.
My question is: is there any way to place an OLE DB component that lets me insert information into a table called errors if the Excel Source fails? In short, if the Excel file does not exist, I want to save three things in the errors table: the date, an action, and the message produced. Is it possible to do this?
You can find it through the Event Handlers tab in the SSIS package, to the right of the Parameters tab.
To accomplish what you need, explore event handlers in SSIS. As you can see there, an "OnError" handler is available, along with plenty of other events you may want to validate or act on, errors included.
https://learn.microsoft.com/en-us/sql/integration-services/integration-services-ssis-event-handlers?view=sql-server-2017
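To give a rough idea of the logging step itself, here is a sketch of a Script Task dropped inside the OnError event handler (the dbo.errors table, its columns, and the connection string are assumptions; an Execute SQL Task with System::ErrorDescription mapped as a parameter works just as well):

```csharp
// Body of Main() in a Script Task inside the OnError event handler.
// Add System::ErrorDescription and System::SourceName to the task's ReadOnlyVariables.
public void Main()
{
    // Hypothetical connection string and table; substitute your own.
    string connStr = "Data Source=.;Initial Catalog=Staging;Integrated Security=SSPI;";
    string message = Dts.Variables["System::ErrorDescription"].Value.ToString();
    string action  = Dts.Variables["System::SourceName"].Value.ToString();

    using (var conn = new System.Data.SqlClient.SqlConnection(connStr))
    using (var cmd = new System.Data.SqlClient.SqlCommand(
        "INSERT INTO dbo.errors (error_date, action, message) VALUES (@d, @a, @m)", conn))
    {
        cmd.Parameters.AddWithValue("@d", System.DateTime.Now);
        cmd.Parameters.AddWithValue("@a", action);
        cmd.Parameters.AddWithValue("@m", message);
        conn.Open();
        cmd.ExecuteNonQuery();
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}
```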

How to display a status depending on the data flow position

Consider for example this modified Simple TCP sample program:
How can I display the current state of the program like
Wait for Connection
Connected
Connection terminated
on the front panel, depending on where the "data flow" currently is.
The easiest way to do this is to place a string indicator on your front panel and write messages to a local variable of this indicator at each point where you want to see a status update.
You need to keep in mind how LabVIEW dataflow works: code will execute as soon as the data it depends on becomes available. Sometimes you can use existing structures to enforce this - for example, if you put a string constant inside your loop and wire it to a local variable terminal outside the loop, the write will only happen after the loop exits. Sometimes you may need to enforce that dataflow artificially, for example by placing your operation inside a sequence frame and connecting a wire to the border of the sequence: then what's inside the sequence will only happen after data arrives on that wire. (This is about the only thing you should use a sequence for!)
This method is not guaranteed to be deterministic, but it's usually good enough for giving a simple status indication to the user.
A better version of the above would be to send the status messages on a queue or notifier which you read, and update the status indicator, in a separate loop. The queue and notifier write functions have error terminals which can help you to enforce sequence. A notifier is like the local variable in that you will only see the most recent update; a queue keeps all the data you write to it in the right order so would be more suitable if you want to log all the updates to a scrolling list or log file. With this solution you could add more features: for example the read loop could add a timestamp in front of each message so you could see how recent it was.
A really good solution to this general problem is to use a design pattern based on a state machine. Now your program flow is clearly organised into different states and it's very easy to add in functionality like sending a different message from each state. There are good examples and project templates for these design patterns included with recent versions of LabVIEW.
You should be able to find more information on any of the terms in bold in the LabVIEW help or on the NI website.

SSIS SQL Step Seems to fail

I have an SSIS package (ss2k12) I'm working on which begins with a SQL task to check if a table exists, create it if it doesn't, then truncates it. The table is a work table for a Data Flow task that follows.
When I run the task by itself, it works. When I run the package (after deleting the table), it fails looking for the missing table (which the SQL task creates if it's missing). Is this because it's "pre-checking" the data flow task? How do I get around the issue?
When a package receives the signal to start, the SSIS engine looks at every component and verifies that it exists, that the metadata signatures match, and so on. Then, when a component gets the signal that it can run, its metadata is rechecked prior to execution.
To get around this issue, you need to use the DelayValidation property to indicate that validation should only occur when it is ready to execute.
Depending on how your package is structured, you might need to set this at both the task (data flow) and the package (control flow) level.

SSIS Package Not Populating Any Results

I'm trying to load data from my database into an Excel file built from a standard template. The package is ready and it runs, throwing a couple of validation warnings stating that truncation may occur because my template has fields of a slightly smaller size than the DB columns I've matched them to.
However, no data is getting populated into my Excel sheet.
No errors are reported, and when I click Preview on my OLE DB source, it shows me rows of results. None of these are getting populated into my Excel sheet, though.
You should first make sure that you have data coming through the pipeline. Double-click the arrow connecting your source to your destination (I'm assuming you don't have any steps in between) to open the Data Flow Path Editor. Click Data Viewers, then Add, and click OK. That will allow you to see what is moving through the pipeline.
Something to consider with Excel is that it prefers Unicode data types to non-Unicode. Chances are you have a database collation that is non-Unicode, so you might have to convert the values in a Data Conversion task.
Also, you may need to force the package to execute in the 32-bit runtime. Visual Studio is a 32-bit environment, so the drivers you have visibility to at design time are 32-bit. If there is no 64-bit equivalent, it will break when you try to run the package. Right-click your project, click Properties, and under the Debugging page change the Run64BitRuntime setting to False.
You don't provide much information. Add a data viewer between your source and your Excel destination to see if data is passing through. To do it, just double-click the data flow path, select Data Viewers, and then add a grid.
Run your package. If you see data, provide more details so we can help you.
A couple of questions that may lead to an answer:
Have you checked that data is actually passed through the SSIS package at run time?
Have you double checked your mapping?
Try converting within the package so you don't have the truncation issue
If you add some more details about what you're running, I may be able to give a better answer.
EDIT: Considering what you wrote in your comment, I'd definitely try the third option. Let us know if this doesn't solve the problem.
Just as an assist for anyone else running into this - I had a similar issue and beat my head against the wall for a long time before I found out what was going on. My export WAS writing data to the file, but because I was using a template file as the destination, and that template file had previous data that had been deleted, the process was appending the data BELOW the previously used rows. So, I was writing out three lines of data, for example, but the data did not start until row 344!!!
The solution was to select the entire spreadsheet in my template file, and delete every bit of it so that I had a completely clean sheet to begin with. I then added my header lines to the clean sheet and saved it. Then I ran the data flow task and...ta-daa!!! Perfect export!
Hopefully this will help some poor soul who runs into this same issue in the future!

Audit and error handling in SSIS

We are starting a project to handle big, big flat files. These files are kind of 'normalized' and we want to process them first to an intermediate file.
I would like to see a custom table for audit rows and a custom table for errors thrown during processing. Also, errors must be stored in the Event Log.
What are the best practices according to audit & error handling in general for SSIS (VS2008)?
(edit)
We have built (I think) a very elegant solution by designing one master package. This package runs a child package (the one originally intended). The master package subscribes to three events: OnInformation, OnWarning, and OnError. These events are routed to a generic audit & logging service that makes calls to the Enterprise Library Logging & Exception Handling blocks.
What I would recommend is adopting the following philosophy for stable ETL processes that load from files:
Never cast anything in the connector; just import the fields as nvarchars of the maximum length they will reach.
Procedurally add a row count (row_id) for tracking casting errors.
Cast and control each column to your specification (see the sketch below).
If a row cannot be read at some stage, you will not know its index, but you will know that the file is malformed (extremely rare in my experience, except for half-transferred files), and it should be rejected anyway.
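As a rough illustration of that cast-and-control step, this is the kind of thing the casting script component ends up doing; every column name here is invented, and a Derived Column plus Conditional Split pair can achieve the same thing:

```csharp
// Script Component sketch: the inputs arrive as nvarchar (AmountRaw, OrderDateRaw);
// the outputs add typed columns plus a flag that a downstream Conditional Split uses
// to divert bad rows to the reject table while keeping their row_id.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    decimal amount = 0m;
    System.DateTime orderDate = System.DateTime.MinValue;

    bool amountOk = !Row.AmountRaw_IsNull && decimal.TryParse(Row.AmountRaw, out amount);
    bool dateOk   = !Row.OrderDateRaw_IsNull && System.DateTime.TryParse(Row.OrderDateRaw, out orderDate);

    if (amountOk) Row.Amount = amount;
    if (dateOk)   Row.OrderDate = orderDate;

    // RowIsValid is an extra output column; false means "send to the reject path".
    Row.RowIsValid = amountOk && dateOk;
}
```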
A quick screenshot of part of a file-loading process shows how the rejection (after assigning row_id) can work (link to data flow image). To this you can add countless further checks (duplicates, ...) and even keep a repository of the loaded files so the rejects, and whatever else you might want to control, can be checked against them (link to control flow image).
In some of my processes I even use a flat file connector and just import each row as one bulk text column, then split it into columns with an intermediate script component, allowing for different versions of the columns in the files.
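The split itself is nothing fancy; inside that intermediate script component it boils down to something like this (the delimiter and output column names are only examples), after which the typed casting sketched above takes over:

```csharp
// Script Component sketch: the flat file connection delivers the whole record in a
// single wide-string column (RawLine here); the component carves it into columns so
// that older and newer file layouts can be handled side by side.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    if (Row.RawLine_IsNull) return;

    string[] fields = Row.RawLine.Split(';');

    Row.CustomerCode  = fields[0];
    Row.InvoiceNumber = fields[1];
    Row.AmountRaw     = fields[2];
    Row.OrderDateRaw  = fields[3];

    // Newer versions of the file carry an extra column; default it when it is absent.
    Row.CurrencyCode = fields.Length > 4 ? fields[4] : "EUR";
}
```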
Anyway, sorry not to be more detailed (due to my status I can't add more links or any images), but I hope that you understand the concept.
Regards,
Francisco.