Reusing tasks in SSIS - sql-server-2005

How can a task be reused in SSIS without copy/paste?
For example, I'd like to use the tasks I've defined in an event handler for one executable in another executable, but not with all executables in the package. So far, I haven't found any solutions other than writing a complete custom component, which seems like overkill. Any suggestions?

Have you considered using an event at the package level, and filtering to only fire when your particular condition requires it?
E.g. you could use the OnPostExecute event just by putting a dummy task in your flow with a name that starts with a specific string like "RunMyTasks", and then check the System::SourceName to see if it starts with "RunMyTasks". If it does, then branch to run your tasks (and otherwise branch to handle the event as you normally would).
You could do a similar thing using OnVariableValueChanged - this might be better (although you'd need to test it). Create a variable with RaiseChangedEvent=TRUE. Create a script task / component to change the value of the variable; finally, put your task set into the event handler.
Check the scoping notes at the bottom of Jamie's post here.
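The filtering idea in this answer can be sketched outside SSIS. The Python below is only an illustration of the branching decision, assuming the "RunMyTasks" prefix from the answer; the function names are hypothetical, and in a real package the check would sit in an expression or Script Task inside the package-level OnPostExecute handler.

```python
# Illustrative only: mimics the branch a package-level OnPostExecute
# handler would take. "RunMyTasks" is the marker prefix suggested in the
# answer; the function names are hypothetical, not part of any SSIS API.

MARKER_PREFIX = "RunMyTasks"


def should_run_shared_tasks(source_name):
    """Return True when the event was raised by a dummy marker task."""
    return source_name.startswith(MARKER_PREFIX)


def handle_post_execute(source_name):
    """Pick the branch for one event, keyed on System::SourceName."""
    if should_run_shared_tasks(source_name):
        return "shared tasks"
    return "normal handling"


if __name__ == "__main__":
    for name in ["RunMyTasks_AfterLoad", "Load Customers"]:
        print(name, "->", handle_post_execute(name))
```

Only executables whose name starts with the marker prefix take the shared branch, which is what keeps the tasks from running for every executable in the package.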

If you can use third-party solutions, check the commercial CozyRoc SSIS+ library. It includes an enhanced Script Task Plus, which lets you export a script to an external file and then link to it and reuse it in other packages.

TFS CI build trigger include variable

Is it possible to use a variable as build trigger? I've tried and the build doesn't get triggered. If I remove the variable and insert a value, the build gets triggered as expected.
Aren't variables allowed here? $(Mapping.ServerPath) is set to MyRepo/Branches/MyBranch. $/MyRepo/Branches/MyBranch triggers the build correctly.
No. And why should they?
The specified path results in a poll action being performed on the static path.
You can use wildcards if needed.
The build should trigger on a change, hence CI trigger.
Making the path a variable, when would you provide it?
If it's just keeping a static value elsewhere, why not fill it in?
If you want to provide the path when calling the build.
Then you don't intend to use the CI option as intended?
No, it is not supported.
There is a UserVoice suggestion that you can follow: Allow Variables in Repository, variables and triggers Tab.
We have a microservice architecture with dozens of builds, it makes sense to be able to use a variable that we can update when we start our next iteration. With our branching strategy we have a new branch for each sprint and for each release. Changing the CI trigger in every build every couple weeks is inefficient.
We are using on premise TFS2018 and from everything I've seen this is not supported.

How to create a Logic/Script for a Data Extension?

I always implement scripts in a CloudPage or directly in a newsletter, but I have never created a script that runs on its own at a set interval. Would that be possible? Maybe every night?
There is a script activity available that allows you to do that. However, it's for Server-Side JavaScript as opposed to AMPscript. Once you save the script in the script activity you can then add it to an automation just like any other activity and execute it at the required intervals.
The feature isn't typically on by default so you will likely need to request it to be enabled by support. You should then see it listed as an option with the other activities.

How to use if..else in Data Flow based on user variable values in SSIS

I have a fairly straightforward SSIS package with a number of Data Flow tasks, each with data-flows for multiple tables as shown below:
I want to be able to execute each of these data-flows based on some user-defined variable values that I manipulate using a Script Task in control-flow. Something like, if a variable (say BESTELLDRUCK) value is true, then I want to execute the data-flow for this table (source-conversion-destination tasks), else I want to skip this table and proceed to another table (e.g. AKT_FEHLER) in same data-flow task.
How can I do this? Thanks in advance.
You cannot disable or enable transformations within the Data Flow Task. However, you can enable or disable Data Flow Tasks on the Control Flow tab.
Here is one possible way to do this on Control Flow tab:
If it is possible, move the source --> destination transformations to individual data flow tasks. Something like as shown below.
Let's assume you have created variables for each flow to enable or disable the Data Flow Task based on some condition. For this example, I have hard coded some values.
To dynamically enable or disable Data Flow Tasks based on a variable, click on a Data Flow Task and press F4 to view Properties. In the Properties window, click the Ellipsis button next to the Expressions property. You will see the Property Expression Editor.
Select the Property Disable and use the Ellipsis button to enter the expression !@[User::Enable_BESTELLDRUCK]. Notice the exclamation sign: the variable is declared as Enable but only the Disable property is available, so you need to negate it.
Repeat the process for other Data Flow Tasks with appropriate variables. Run the package and you will notice that the second Data Flow Task did not execute because the variable Enable_AKT_FEHLER was set to the value False.
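The relationship between the Enable variables and the Disable property can be sketched as follows. This is a Python illustration of the logic only, not SSIS code; the flag values are hypothetical and mirror the example run above, where the second task is skipped.

```python
# Illustrative only: the Disable property of each Data Flow Task is the
# negation of its "Enable" variable. The flag values are hypothetical.

enable_flags = {
    "BESTELLDRUCK": True,
    "AKT_FEHLER": False,
}


def disable_property(enabled):
    """Equivalent of the property expression: Disable = NOT Enable."""
    return not enabled


def tasks_that_run(flags):
    """Names of the Data Flow Tasks that would execute, in order."""
    return [name for name, enabled in flags.items()
            if not disable_property(enabled)]


if __name__ == "__main__":
    print(tasks_that_run(enable_flags))
```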
Hope that helps.
Reference:
To load multiple tables having same schema within ForEach Loop container, take a look at the below SO answer. It transfers data from MS Access to SQL Server. Hopefully, that should give you an idea.
How do I programmatically get the list of MS Access tables within an SSIS package?
I guess there are enough pointers here for Agent 007 to resolve the issue. I would like to add a few general comments.
Enabling/disabling the tasks dynamically is not a good practice. A better way to disable a task is to use an expression within a precedence constraint. One such reference: http://www.sqlis.com/sqlis/post/Disabling-tasks-Through-Expressions.aspx
As suggested convert each STD (Source-Transform-Destination) into its own DFT. Even better use parent-child pattern. This would help in testing future additions of more DFTs.

Check for multiple files

Okay, I'll try to explain as well as I can... Quite a particular case.
Tools: SSIS 2008
We have a control flow that now needs to be triggered by an event: the presence of one or multiple files. (1,2 or 3)
The variables used:
BO_FileLocation_1
BO_FileLocation_2
BO_FileLocation_3
BO_FileName_1
BO_FileName_2
BO_FileName_3
There can be one, two or three files, defined in the variables above. When the variables are filled in, the files should be processed. When they are empty (meaning there are fewer files, possibly just one), the process should ignore them and jump to the next (file watcher?) task.
For example:
BO_FileLocation_1= "C:\"
BO_FileLocation_2 NULL
BO_FileLocation_3 NULL
BO_FileName_1= "test.csv"
BO_FileName_2 NULL
BO_FileName_3 NULL
The report only needs one file.
I'd need a generic concept that checks the presence of these files; it could be more generic than my SSIS knowledge can handle right now. That would be handy, for example, if a 4th file is added in the future. I was also thinking of working with a single script to handle all the logic.
Thanks in advance
A possibly irrelevant image:
If all you want is to trigger the Copy Source File task when one or more of the files is present, just use the OR constraint in your flow. The following image shows you how:
First connect all to the destination:
Then click one of the green arrows. This will make its properties window pop up. Select Logical OR instead of Logical AND:
If everything went well, you should now see the connections as dashed lines:
There are several possible solutions:
1. Create a sequence container and include all the file imports in the sequence container. Add int variables for RowCountFile1, RowCountFile2, and RowCountFile3 and set the value to 0 (this is the default value when you create an int variable). Add a RowCount transformation to each of the data flows. Create a precedence constraint from the sequence container to the "Do something" task. Set the precedence constraint to success and expression. Set the expression value to @RowCountFile1 > 0 || @RowCountFile2 > 0 || @RowCountFile3 > 0. The advantage of this approach is that you can take an action as soon as the files are detected, you import all available files, and you only take an action after all the files have been imported. You could then schedule running this SSIS package as a SQL Server Agent job step and run it as frequently as you want.
2. A variant on solution 1 is to use Foreach File enumerator containers inside the sequence container. This would be useful if you don't know the exact name of the file and you expect to import more than one under some circumstances. For instance, if you get a file every few minutes with a timestamp in its file name and your process doesn't run for some reason, then you may have to process multiple files to get caught up and then take an action once it has been done.
3. You could use the file watcher task as you outlined in your question. The only problem I have with the file watcher task is that the package has to be in a constantly running state. This makes it hard to troubleshoot problems and performance. It can also introduce other problems; I remember having some trouble with the file watcher task years ago when it first came out. It may well be a totally stable task now, but I prefer other methods after having been burned previously. If you really want the package to run continuously instead of having it be called by a job, then you could always use a script task to check for the file, sleep the thread if it is not found, check again, etc. I'm sure that's what the file watcher task does, but I would trust my own C# over the task. Power to anyone who has had better experiences than me with File Watcher...
4. Use PowerShell. If you just want to take an action if a file appears and you aren't importing the data, then a PowerShell script could do this just as well as an SSIS package. The drawbacks are that you have to learn some basic PowerShell, it may be hard to maintain in the future since PowerShell is probably not your bread-and-butter core language, and you may have to rewrite the code as an SSIS package if you later want to import the data. You would probably call the PowerShell script from a SQL Server Agent job step, so scheduling can be handled pretty easily.
There are more options than what I listed, so let me know if you still want more suggestions.
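As a rough sketch of the generic check the question asks for, and of the check/sleep/check-again loop mentioned above, the Python below pairs each location with its name, skips empty slots, and polls until every configured file exists. It illustrates the logic only; inside SSIS this would live in a Script Task (C# or VB), and the function names, timeout, and polling interval are assumptions made for the example.

```python
import os
import time


def configured_files(locations, names):
    """Pair locations with names, skipping slots left empty (None or "")."""
    return [os.path.join(loc, name)
            for loc, name in zip(locations, names)
            if loc and name]


def missing_files(paths):
    """Return the configured paths that do not exist yet."""
    return [p for p in paths if not os.path.isfile(p)]


def wait_for_files(paths, timeout_s=60.0, poll_s=5.0):
    """Poll until every path exists; return False if the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if not missing_files(paths):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    # Values from the question: only the first slot is filled in.
    paths = configured_files(["C:\\", None, None], ["test.csv", None, None])
    print(paths, wait_for_files(paths, timeout_s=0))
```

Because the slots are plain lists, a 4th file only means one more entry in each list. Note that with no configured files at all, `wait_for_files` returns True immediately, which may or may not be the behaviour you want.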

SSIS Intermittent variable error: The system cannot find the file specified

Our SSIS packages are structured as one Control package and many child packages (about 30) that are invoked from the control package. The child packages are invoked with Execute Package Task. There is one Execute Package Task per child package. Each Execute Package Task uses a File Connection Manager to specify the path to the child package dtsx file. There is one File Connection Manager per child package. Each File Connection Manager has an expression defined for the ConnectionString property. This expression looks like this:
@[Template::FolderPackages]+"MyPackage.dtsx"
The file name is different for each package. The variable (FolderPackages) is specified in the SSIS package configuration file.
The error that is generated during run time is
"Error 0x80070002 while loading package file "MyPackage.dtsx". The system cannot find the file specified."
The package that fails is different from run to run and sometimes no packages fail at all. This is when run on exactly the same environment/data etc.
I ran FileMon during this error and found out that when the error happens SSIS tries to read the dtsx file from the wrong place, namely from system32. I checked that this is identical to what would happen if the @[Template::FolderPackages] variable were empty, but because the very same variable is used for every child package and works for some while sometimes failing for others, I have no explanation for this.
Anything obvious, or time to raise a support call with Microsoft?
Are you using Expressions on the SSIS variables directly? Variables with Expressions are calculated each time the variable is referenced by the consuming object which needs to use it. That is where the race condition bug exists, because sometimes the expression doesn't get evaluated if another thread is already evaluating a different variable, and the default value for the variable is provided to the consumer object.
If that matches your design, these two bugs on the connect site discuss the problem, and the workarounds:
https://connect.microsoft.com/SQLServer/feedback/details/332372/ssis-variable-expressions-dont-always-evaluate
A second one at
connect.microsoft.com/SQLServer/feedback/details/406534/ssis-2008-variable-expressions-dont-always-evaluate
A summary of workarounds is
- Note the parallel tasks in your SSIS control flow that use these expression variables. If two tasks sit side by side, each relies on the same variable, and that variable has an Expression to set its value, then you could hit this.
- Manually sequence such tasks so that they don't run in parallel, i.e. add a green arrow on the control flow so that the tasks run in order (Task1, Task2, Task3) rather than side by side on parallel paths or inside the same container with no paths.
- Avoid variable expressions: assign local variables in the required order using a home-made script task that does the same kind of work, so that variables are not evaluated using expressions (i.e. the thing that can hit this race condition). In other words, manually assign the variable values at a point in your control flow just before they are used. The point of using expressions on variables is to set a value dynamically based on another value whenever it is used, so this achieves a similar design goal in a manual way.
- Reduce threads to minimize the potential: set the Data Flow task's EngineThreads to 1 and MaxConcurrentExecutables to 1. This helps serialize execution of your package to one task at a time, with the side effect of possibly slower performance.
- Create and set values on distinct copies of variables at different scope levels in the design, so that they evaluate in different parallel execution scopes and avoid expression evaluation on parallel threads, e.g. Master::Var1, Child1::Var1, Child2::Var1.
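The "assign values before use" workaround can be pictured with a small Python analogy. This is not SSIS code and does not reproduce the SSIS bug; it only shows the design idea of computing each path once on a single thread and handing the finished value to the parallel consumers, rather than letting each consumer evaluate a shared expression. The folder and package names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_package_paths(folder, package_names):
    """Build every child package path once, on a single thread."""
    return {name: folder + name for name in package_names}


def run_children(paths, run_one, max_workers=4):
    """Run children in parallel; each gets an already-computed path."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, paths.values()))


if __name__ == "__main__":
    # Hypothetical folder and package names, for illustration only.
    paths = resolve_package_paths("C:\\Packages\\", ["A.dtsx", "B.dtsx"])
    print(run_children(paths, lambda p: "loaded " + p))
```

No worker ever sees an unevaluated or default value here, because nothing is evaluated lazily once the parallel part starts.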
A bit of a stab in the dark but...
I've had a similar issue with variables where readonly=false and multiple components were reading the variable at the same time and causing locking issues.
I consistently recreated the problem by running a pair of dataflows that did nothing but reference the variable inside a for loop container. Changing the variable to be read-only resolved the problem.
If you temporarily hardcode the package name does this resolve the issue?
Turns out after sending trace info to Microsoft that we are encountering heap corruption. I'll update this question if we get to the bottom of it.
The current suggestion is to disable heap lookaside for dtexec.exe.
The official answer to this issue is that it is a bug in SQL 2005 and 2008. Many tasks accessing the same variable cause a race condition, and some tasks get the default value for the expression instead of the evaluated value.
The workaround is to ensure that the default value (the value defined in the property sheet for whatever property you are having trouble with) is the value that will work in your production environment.
This way, when the race condition happens in prod, SSIS will fall back to the package value, which will still work.
In dev? Well you're just going to have to deal with that manually until we get a bug fix from Microsoft.
There is a KB article relating to this issue: http://support.microsoft.com/kb/2448991 which states when and where this was fixed.