SSIS value of variable not changing - variables

I am new to SSIS. I am trying to use a Script Task to get the last modified date and creation date of a file. I have declared two package-scoped variables of data type String (File_Path, Filename) that the Script Task reads to get the file path and file name.
I want to store the creation date and modified date in two different output variables (Create_Date, Last_Updated) of data type DateTime.
My code for the script is as follows:
FileInfo fileInfo = new FileInfo(Path.Combine(Dts.Variables["File_Path"].Value.ToString(), Dts.Variables["Filename"].Value.ToString()));
if (fileInfo.Exists)
{
    // Get file creation date
    Dts.Variables["Create_Date"].Value = fileInfo.CreationTime;
    // Get last modified date
    Dts.Variables["Last_Updated"].Value = fileInfo.LastWriteTime;
}
else
{
    Dts.Events.FireWarning(1, Dts.Variables["System::TaskName"].Value.ToString()
        , string.Format("File '{0}' does not exist", fileInfo.FullName)
        , "", 0);
}

SSIS has a design-time and a run-time interface.
Variables are created in the design-time space, where you assign a data type and a value. There is an explicit Variables window for doing all of this.
During run time, the Variables window is still visible, but the values shown there are not the run-time values. They are just a reference for what the package was initialized with. The actual values of SSIS variables are found in the debug windows. I favor the Locals window (Ctrl+Alt+V, L).
From there, expand the Variables node.
You can also add explicit logging to your Script Tasks. This little bit enumerates all the variables you selected for read-only or read/write access and writes their names and values to the run log. If you're running in Visual Studio, the messages show up in the Results tab or the Output window (a great place to copy errors from for further research or for asking on forums). If you're running from the server, they show up in the SSISDB.catalog.operation_messages view (unless you picked an incompatible logging mode).
bool fireAgain = false;
string message = "{0}::{1} : {2}";
foreach (var item in Dts.Variables)
{
    Dts.Events.FireInformation(0, "SCR Echo Back", string.Format(message, item.Namespace, item.Name, item.Value), string.Empty, 0, ref fireAgain);
}

Related

Repast: how to add and set a new parameter directly from the code instead of GUI

I want to create a parameter that contains a list of strings (a list of hub codes). The list is created by reading an external CSV file, so it can contain different codes depending on the hub codes in that file.
What I want is an easy, automatic way to perform batch runs, one per hub code in the list.
So my questions are:
1) How do I add and set a new parameter directly from the code (during initialization, when reading the CSV) instead of through the GUI parameter panel?
2) How do I avoid manually configuring the hub list in the batch run configuration?
Something like this for adding the parameters should work in your ContextBuilder.
Parameters params = RunEnvironment.getInstance().getParameters();
((DefaultParameters)params).addParameter("foo", "Big Foo", Integer.class, 3, false);
You would read the csv file to get the parameter name and value.
I'm not sure I completely understand the batch run configuration question, but each batch run has a run number associated with it, which you can read with:
RunState.getInstance().getRunInfo().getRunNumber()
If you can associate line numbers in your csv parameter file with run number (e.g. run number 1 should use line 1, and so on), then each batch run would use a different parameter line.
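The run-number idea above could be sketched roughly like this. It is only an illustration: HubCodeSelector and hubCodeForRun are made-up names, the hub codes are assumed to have been read from the CSV into a list already (one code per line, header removed), and run numbers are assumed to start at 1.

```java
import java.util.List;

/** Hypothetical helper: picks the hub code that belongs to a given batch run. */
final class HubCodeSelector {

    private HubCodeSelector() {}

    /**
     * Returns the hub code for the given run number.
     * Assumption: run number 1 maps to the first entry, run number 2 to the second, and so on.
     */
    static String hubCodeForRun(List<String> hubCodes, int runNumber) {
        int index = runNumber - 1; // convert the 1-based run number to a 0-based list index
        if (index < 0 || index >= hubCodes.size()) {
            throw new IllegalArgumentException("No hub code for run number " + runNumber);
        }
        return hubCodes.get(index).trim();
    }
}
```

In the ContextBuilder you would pass the value of RunState.getInstance().getRunInfo().getRunNumber() as runNumber, so each batch run picks up its own line.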

Pentaho Data Integration setVariable not working

I am on PDI 7.0 and have a "Modified Java Script Value" step inside a transformation as below:
var numberOfDays = 100;
Alert(numberOfDays);
setVariable("NUMBER_OF_DAYS", numberOfDays, "r");
Alert(getVariable("NUMBER_OF_DAYS", ""));
However, when I run the transformation, the first Alert correctly throws 100, but the next Alert is blank (meaning the variable is not set).
What is wrong here?
As a rule of thumb, you should never set a variable and read it within the same transformation.
Spoon shows a warning to this effect when you set up a Set Variables step.
That said, if you really insist on setting this via JavaScript, you could split the work into two transformations, where:
1) The "Set variable" transformation is used only to set the value:
var numberOfDays = 100;
Alert(numberOfDays);
setVariable("NUMBER_OF_DAYS", numberOfDays, "r");
2) The "Get variable" transformation only reads it:
Alert(getVariable("NUMBER_OF_DAYS", ""));
Both transformations use the same steps, but they have separate tasks.

Debugging u-sql Jobs

I would like to know if there are any tips and tricks for finding errors in Data Lake Analytics jobs. Most of the time the error message is not very detailed.
When trying to extract from a CSV file I often get an error like this:
Vertex failure triggered quick job abort. Vertex failed: SV1_Extract[0] with error: Vertex user code error.
Vertex failed with a fail-fast error
It seems that these errors occur when trying to convert the columns to the specified types.
The technique I found is to extract all columns as string and then do a SELECT that tries to convert the columns to the expected types. Doing that column by column helps find the specific column in error.
@data =
    EXTRACT ClientID string,
            SendID string,
            FromName string
    FROM "wasb://..."
    USING Extractors.Csv();
//convert some columns to INT, condition to skip header
@clean =
    SELECT Int32.Parse(ClientID) AS ClientID,
           Int32.Parse(SendID) AS SendID,
           FromName
    FROM @data
    WHERE !ClientID.StartsWith("ClientID");
Is it also possible to use something like a TryParse to return null or default values in case of a parsing error, instead of the whole job failing?
Thanks
Here is a solution that does not require code-behind (although code-behind will make your code a bit more readable):
SELECT ((Func<string, Int32?>)(v => { Int32 res; return Int32.TryParse(v, out res)? (Int32?) res : (Int32?) null; }))(ClientID) AS ClientID
Also, the cryptic error messages you are seeing are caused by a bug in returning the so-called inner error messages, which should be fixed soon. The workaround today is the following:
In the ADL Tools for VisualStudio, open the Job View of the failed job.
In the lower left corner, click on “resources” link in the job detail area.
Once the job resources are loaded, click on “Profile”.
Search for the string “jobError” at the beginning of the line. Copy the entire line of text and paste in notepad (or other text editor) to read the actual error.
That should give you the exact error message.
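If you save the profile text to a file, the search step above could also be scripted. A minimal sketch, assuming the profile has already been read into a list of lines; JobErrorFinder and firstJobErrorLine are hypothetical names and not part of any ADL tooling.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: finds the line that carries the inner job error. */
final class JobErrorFinder {

    private JobErrorFinder() {}

    /** Returns the first line that starts with "jobError", if there is one. */
    static Optional<String> firstJobErrorLine(List<String> profileLines) {
        return profileLines.stream()
                .filter(line -> line.startsWith("jobError"))
                .findFirst();
    }
}
```

The match is deliberately strict (the marker must be at the very beginning of the line), mirroring the manual search described above.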
Yes, you can use TryParse using U-SQL user defined functions. You can do this like:
In code behind:
namespace TestNS
{
    public class TestClass
    {
        public static int TryConvertToInt(string s)
        {
            int i = 0;
            if (Int32.TryParse(s, out i))
                return i;
            return 0;
        }
    }
}
In U-SQL Script:
TestNS.TestClass.TryConvertToInt(ClientID) AS clientID
It looks like you have some other issue, as I always get an appropriate error when there is a conversion problem, something like:
"E_RUNTIME_USER_EXTRACT_COLUMN_CONVERSION_INVALID_ERROR","message":"Invalid character when attempting to convert column data."

How to change the SqlStatementSource in a SSIS package through job step advanced tab

I have an SSIS package deployed and have created a SQL Agent job that executes it. I need to change the SqlStatementSource of one of the SQL tasks in the package through the job step's Advanced tab. Can anyone help me with how to do that? I read somewhere that it is possible, but I cannot recall exactly how it is done.
You have two choices based upon the source provider for changing your query.
DFT Test uses an OLE DB Source and DFT T2 is an ADO.NET Source.
My Data Flow is a Source feeding a Script Component.
The Source is a simple in-line query: SELECT 1 AS Foo;
The Script Component simply fires an OnInformation event so I can see the data row(s) as they flow through:
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        bool fireAgain = false;
        ComponentMetaData.FireInformation(0, "Foo value", Row.Foo.ToString(), string.Empty, 0, ref fireAgain);
    }
}
OLE DB Source
This is going to require that you have done your work ahead of time. You can control the source statement through an SSIS Variable of type String. Here I chose to name it @[User::QuerySource] and have set the OLE DB Source to use a Variable as the query source.
You can then set that variable at run time, for example from SQL Agent:
DTEXEC /file so_31100091.dtsx /set "\Package.Variables[User::QuerySource].Properties[Value]";"SELECT 2 AS Foo" /REP I
The above command would run the package and assign the value of SELECT 2 AS Foo to the Variable QuerySource, which lives at the root of the package. Finally, I have the engine report the Information events so it gets logged.
ADO NET Source
This is one of the few times an ADO NET Source can be helpful. It can be configured directly, without modifications to the package itself:
DTEXEC /file so_31100091.dtsx /set "\Package\DFT T2.Properties[[ADO_SRC tempdb].[SqlCommand]]";"SELECT 3 AS Foo" /REP I
Here I use the command SELECT 3 AS Foo and set the SqlCommand property of the source called "ADO_SRC tempdb", which lives in the task "DFT T2".
Wrapup
The important thing to note is the query provided must match the signature (column names and data types).
In the above I have manually executed the SSIS package. In the SQL Agent Job Step Editor, you will use the "Set values" tab to access the key value pairs.

Applying SSIS Package Configuration to multiple packages

I have about 85 SSIS packages that are using the same connection manager.
I understand that each package has its own connection manager.
I am trying to decide what would be the best configurations approach to simply set the connectionstring of the connection manager based on the server the packages are residing on.
I have visited all kinds of suggestions online, but cannot find anywhere the practice where I can simply copy the configuration from one package to the rest of the packages.
There are obviously many approaches such as XML file, SQL Server, Environment Variable, etc.
All the articles out there point to using an indirect method with the XML or SQL approach. Why is using an environment variable just to hold a connection string such a bad approach?
Any suggestions are highly appreciated.
Thanks!
Why is using an environment variable just to hold a connection string such a bad approach?
I find the environment variable or registry key configuration approach to be severely limited by the fact that it can only configure one item at a time. For a connection string, you'd need to define an environment variable for each catalog on a given server. Maybe it's only 2 or 3 and that's manageable. We had a good 30+ per database instance and we had multi-instanced machines so you can see how quickly this problem explodes into a maintenance nightmare. Contrast that with a table or xml based approach which can hold multiple configuration items for a given configuration key.
...best configurations approach to simply set the connectionstring of the connection manager based on the server the packages are residing on.
If you go this route, I'd propose creating a variable, ConnectionString and using it to configure the property. It's an extra step but again I find it's easier to debug a complex expression on a variable versus a complex expression on a property. With a variable, you can always pop a breakpoint on the package and look at the locals window to see the current value.
After creating a variable named ConnectionString, I right click on it, select Properties and set EvaluateAsExpression equal to True and the Expression property to something like "Data Source="+ @[System::MachineName] +"\\DEV2012;Initial Catalog=FOO;Provider=SQLNCLI11.1;Integrated Security=SSPI;"
When that is evaluated, it'd fill in the current machine's name (DEVSQLA) and I'd have a valid OLE DB connection string that connects to a named instance DEV2012.
Data Source=DEVSQLA\DEV2012;Initial Catalog=FOO;Provider=SQLNCLI11.1;Integrated Security=SSPI;
If you have more complex configuration needs than just the one variable, then I could see you using this to configure a connection manager to a sql table that holds the full repository of all the configuration keys and values.
...cannot find anywhere the practice where I can simply copy the configuration from one package to the rest of the packages
I'd go about modifying all 80-something packages through a programmatic route. We received a passel of packages from a third party and they had not followed our procedures for configuration and logging. The code wasn't terribly hard, and if you describe exactly the types of changes you'd make to solve your need, I'd be happy to toss some code onto this answer. It could be as simple as the following. When called, the function modifies a package by adding a SQL Server configuration that uses the SSISDB OLE DB connection manager, a table called dbo.sysdtsconfig and a filter named Default.2008.Sales.
using Microsoft.SqlServer.Dts.Runtime;

// Example call: CleanUpPackages(@"C:\Src\Package1.dtsx");
public static void CleanUpPackages(string currentPackage)
{
    // Load the package from disk
    Application app = new Application();
    Package p = app.LoadPackage(currentPackage, null);

    // Apply configuration Default.2008.Sales
    // ConfigurationString => "SSISDB";"[dbo].[sysdtsconfig]";"Default.2008.Sales"
    // Name => SalesConfiguration
    Configuration c = p.Configurations.Add();
    c.Name = "SalesConfiguration";
    c.ConfigurationType = DTSConfigurationType.SqlServer;
    c.ConfigurationString = @"""SSISDB"";""[dbo].[sysdtsconfig]"";""Default.2008.Sales""";

    // Save the modified package back to the same file
    app.SaveToXml(currentPackage, p, null);
}
Adding a variable to the packages would not take much more code. Inside the cleanup proc, add code like this to create a new variable in your package that has an expression like the one above.
string variableName = string.Empty;
bool readOnly = false;
string nameSpace = "User";
string variableValue = string.Empty;
string literalExpression = string.Empty;
variableName = "ConnectionString";
literalExpression = @"""Data Source=""+ @[System::MachineName] +""\\DEV2012;Initial Catalog=FOO;Provider=SQLNCLI11.1;Integrated Security=SSPI;""";
p.Variables.Add(variableName, readOnly, nameSpace, variableValue);
p.Variables[variableName].EvaluateAsExpression = true;
p.Variables[variableName].Expression = literalExpression;
Let me know if I missed anything or you'd like clarification on any points.