Duplicating a job in Pentaho Data Integration for different connections - pentaho

I generated a job with the Copy Tables wizard in the Spoon UI that copies some tables from an Oracle database to a SQL Server database, and then made some changes to the job.
Now I want to duplicate that job (same tables and same changes), changing only the connections. Is that possible in Spoon?
I've looked through the Spoon UI and didn't find any option that lets me duplicate a job while changing its connections.
EDIT
After creating the two steps (one to generate rows, the other to obfuscate the password), the encrypted field does not contain the expected 'Encrypted : Obfuscated Password' output.
Here is what the Generate rows step looks like:
And here is another picture, for the Modified Java Script Value step:

You need to make a copy of your .kjb file. Jobs and transformations are in fact XML files, so you can then edit the copy manually.
This is pretty straightforward: connections are defined in <connection> tags, so you should be able to figure it all out by yourself.
I find this the fastest way if you want to keep two jobs instead of changing the database connection credentials every time.
If you need to provide an obfuscated password (passwords are not encrypted, just obfuscated), you can create a transformation that obfuscates it for you and gives you the value to put into the XML file.
Steps to create a transformation for obfuscating passwords in Kettle 6.1 (in older versions the Script Values / Mod step is named Modified Java Script Value):
1. A Generate rows step with just one row, storing the password as a value
2. A Script Values / Mod step for the basic obfuscation

There is an example in $KETTLE_HOME/samples/transformation/job-executor.
Pass the connection parameters to a sub-job.
The downside is that you can't pass the JDBC driver name, so both connections have to be the same type of database with different connection settings.

There is no way to do what you want directly from Pentaho, but one option is to alter the transformation's XML directly to change the connections. The idea is the following:
1. Figure out what the connection's XML looks like. To do this, register a new connection, use it somewhere in your transformation, and look in the XML source for an element like
........
2. Make a physical copy of your transformations.
3. Replace the connection definition and the references to it in the XML file. For this you may use XSLT like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- This template will replace the connection definition -->
<xsl:template match="connection[./name='SOURCE_CONNECTION_NAME']">
<!-- This is the connection configuration -->
<connection>
<name>TARGET_CONNECTION_NAME</name>
<server>localhost</server>
<type>ORACLE</type>
<access>Native</access>
<database><!-- DB NAME --> </database>
<port>1521</port>
<username><!-- USERNAME --> </username>
<password><!-- PWD --></password>
<servername/>
<data_tablespace><!-- --></data_tablespace>
<index_tablespace/>
<attributes>
<attribute><code>FORCE_IDENTIFIERS_TO_LOWERCASE</code><attribute>N</attribute></attribute>
<attribute><code>FORCE_IDENTIFIERS_TO_UPPERCASE</code><attribute>N</attribute></attribute>
<attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
<attribute><code>PORT_NUMBER</code><attribute>1521</attribute></attribute>
<attribute><code>PRESERVE_RESERVED_WORD_CASE</code><attribute>Y</attribute></attribute>
<attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
<attribute><code>SUPPORTS_BOOLEAN_DATA_TYPE</code><attribute>Y</attribute></attribute>
<attribute><code>SUPPORTS_TIMESTAMP_DATA_TYPE</code><attribute>Y</attribute></attribute>
<attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
</attributes>
</connection>
</xsl:template>
<!-- And that one will replace the connection's reference in table input/table output -->
<xsl:template match="connection[text()='SOURCE_CONNECTION_NAME']">
<connection>TARGET_CONNECTION_NAME</connection>
</xsl:template>
</xsl:stylesheet>

I believe you can do this, though I haven't done it myself yet, as I've had no need. You should be able to use shared objects to get the functionality you want and keep just one transformation, which is much easier to maintain. Here's a forum link where it's discussed.
Let me know how it works out. I'm pretty curious.
http://forums.pentaho.com/showthread.php?75069-Fully-Dynamic-Database-Configuration-Including-underlying-databsae-type

Related

Using Xsl <fo:external-graphic> to hand over data

Is there a way in XSL 1.0 to hand over variables or parameters using XSL fo:external-graphic, like I would when using xsl:call-template?
I know how I could work around the problem, but I wanted to know whether there is a way I am not seeing.
If your SVGs are small enough, you could use fo:instream-foreign-object. (If the SVGs are very large, the size of the XSL-FO file might become a problem.)
main.xsl:
<xsl:import href="svg/svg_graphic.xsl" />
<xsl:template match="some/context">
<fo:instream-foreign-object>
<xsl:call-template name="make-svg">
<xsl:with-param name="param-a" select="..." />
</xsl:call-template>
</fo:instream-foreign-object>
</xsl:template>
svg_graphic.xsl:
<xsl:template name="make-svg">
<xsl:param name="param-a" select="..." />
<svg:svg>
...
</svg:svg>
</xsl:template>
<fo:external-graphic src="svg/svg_graphic.xsl" /> is not going to work. Inside your XSLT stylesheet, the elements in the XSL-FO namespace are just literal result elements. They are copied to the result tree, and the XSLT processor does not otherwise act on them. The processor does act on XSLT-specific attributes on literal result elements (attributes that are in the XSLT namespace and are defined as meaning something when used on literal result elements). It also acts on attribute value templates ({...}) in the values of literal attributes (and of some XSLT-defined attributes).
There is no XSLT 1.0 way to get the XSLT processor to run another stylesheet based on the value of an XSL-FO-defined attribute.
There's also no XSLT 1.0 way to generate multiple result documents from one run of a stylesheet. Your XSLT processor probably has a processor-specific (or EXSLT) way to do that. The extension, if it exists, might not let you generate part of the XSL-FO result document, generate an SVG result document, and then go back to generating more of the XSL-FO document.
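To illustrate the point about attribute value templates, here is a minimal sketch of what the XSLT processor does act on inside an fo:external-graphic literal result element. The parameter name graphic-name and the svg/ path are assumptions made up for this example; the sketch only computes the src value and does not run another stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical parameter: base name of an SVG file that already exists -->
  <xsl:param name="graphic-name" select="'chart'"/>

  <xsl:template match="/">
    <fo:block>
      <!-- The {...} part is an attribute value template: the XSLT processor
           evaluates it and writes the result into the src attribute.
           The fo:external-graphic element itself is copied as-is. -->
      <fo:external-graphic src="url('svg/{$graphic-name}.svg')"/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

With the default parameter value, the result tree contains src="url('svg/chart.svg')". The SVG file still has to be produced separately, which is why the fo:instream-foreign-object approach is needed when the graphic depends on stylesheet parameters.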

Can I exclude a specific file name with Wix Heat using transforms?

Is it possible to exclude a specific file name with Wix using transforms?
I can exclude files that contain a certain string, but this excludes every file whose name contains the string. For example, I can exclude file.exe with the following:
<xsl:key name="fileexe-search" match="wix:Component[contains(wix:File/@Source, 'file.exe')]" use="@Id"/>
but this will also exclude files with file.exe in their name, like file.exe.config.
Thanks.
Looks like you should use ends-with instead of contains. But ends-with does not exist in XSLT 1.0. :)
This answer gives enough details to get the idea of how to implement it. Basically it is a combination of substring and string-length functions.
Besides, you should also consider normalizing the casing before comparison. That is, it is better to lower-case (or upper-case) both strings - the original and the one it ends with. This post can give you an idea of how to do it.
Keeping all this in mind, you will end up with something similar to this:
<!-- The starting backslash is there to filter out files like 'abcfile.exe' -->
<!-- Besides, it's lower-cased to ease comparison -->
<xsl:variable name="FileName">\file.exe</xsl:variable>
<xsl:variable name="ABC">ABCDEFGHIJKLMNOPQRSTUVWXYZ</xsl:variable>
<xsl:variable name="abc">abcdefghijklmnopqrstuvwxyz</xsl:variable>
<xsl:key name="fileexe-search" match="wix:Component[translate(substring(wix:File/@Source, string-length(wix:File/@Source) - string-length($FileName) + 1), $ABC, $abc) = $FileName]" use="@Id"/>
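For context, here is a sketch of how such a key might sit in a complete heat transform. It rests on several assumptions not stated in the answer: the WiX v3 namespace, one File per Component (as heat usually generates), and two empty templates that drop the matching Component and ComponentRef elements. The literals are inlined rather than held in variables because XSLT 1.0 formally disallows variable references in a key's match pattern, so this form should also work if your processor rejects the variable-based version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:wix="http://schemas.microsoft.com/wix/2006/wi">

  <!-- Identity transform: copy everything that is not explicitly excluded -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Index components whose file path ends with '\file.exe' (case-insensitive).
       '\file.exe' is 9 characters long, so the tail of the path starts at
       string-length(path) - 9 + 1 = string-length(path) - 8. -->
  <xsl:key name="fileexe-search"
           match="wix:Component[translate(
                    substring(wix:File/@Source, string-length(wix:File/@Source) - 8),
                    'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                    'abcdefghijklmnopqrstuvwxyz') = '\file.exe']"
           use="@Id"/>

  <!-- Drop the indexed components and any references to them -->
  <xsl:template match="wix:Component[key('fileexe-search', @Id)]"/>
  <xsl:template match="wix:ComponentRef[key('fileexe-search', @Id)]"/>

</xsl:stylesheet>
```

A path ending in \file.exe.config is kept, because its last nine characters are not \file.exe.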
Although the answer provided by @Yan works, I prefer to use C#, which is simpler to use.
<xsl:stylesheet version="1.0"
...
xmlns:my="urn:my-installer">
...
<msxsl:script language="C#" implements-prefix="my">
<msxsl:using namespace="System.IO" />
<![CDATA[
public bool EndsWith(string str, string end)
{
if (string.IsNullOrEmpty(str))
return false;
if (string.IsNullOrEmpty(end))
return false;
return str.EndsWith(end);
}
]]>
</msxsl:script>
...
Usage example:
<xsl:key name="ignored-components-search" match="wix:Component[my:EndsWith(wix:File/@Source, '.pssym')
or my:EndsWith(wix:File/@Source, '.pdb')
or my:EndsWith(wix:File/@Source, '.cs')
or my:EndsWith(wix:File/@Source, '.xml')
]" use="@Id" />

How costly is usage of unnecessary variables in XSLT?

This is more a request for clarification.
According to this answer to another question, XSLT variables are cheap! My question is: does this statement hold in all scenarios? Short-lived variables that are created and destroyed within four lines of code aren't bothersome, but loading a root node or child entities into a variable is, in my opinion, bad practice.
I have two XSLT files, designed for same input and output requirement:
XSLT1 (without unnecessary variable):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<Collection>
<xsl:for-each select="CATALOG/CD">
<DVD>
<Cover>
<xsl:value-of select="string(TITLE)"/>
</Cover>
<Author>
<xsl:value-of select="string(ARTIST)"/>
</Author>
<BelongsTo>
<xsl:value-of select="concat(concat(string(COUNTRY), ' '), string(COMPANY))"/>
</BelongsTo>
<SponsoredBy>
<xsl:value-of select="string(COMPANY)"/>
</SponsoredBy>
<Price>
<xsl:value-of select="string(number(string(PRICE)))"/>
</Price>
<Year>
<xsl:value-of select="string(floor(number(string(YEAR))))"/>
</Year>
</DVD>
</xsl:for-each>
</Collection>
</xsl:template>
</xsl:stylesheet>
XSLT2 (with unnecessary variable "root" in which whole XML is loaded):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="root" select="."/>
<Collection>
<xsl:for-each select="$root/CATALOG/CD">
<DVD>
<Cover>
<xsl:value-of select="string(TITLE)"/>
</Cover>
<Author>
<xsl:value-of select="string(ARTIST)"/>
</Author>
<BelongsTo>
<xsl:value-of select="concat(concat(string(COUNTRY), ' '), string(COMPANY))"/>
</BelongsTo>
<SponsoredBy>
<xsl:value-of select="string(COMPANY)"/>
</SponsoredBy>
<Price>
<xsl:value-of select="string(number(string(PRICE)))"/>
</Price>
<Year>
<xsl:value-of select="string(floor(number(string(YEAR))))"/>
</Year>
</DVD>
</xsl:for-each>
</Collection>
</xsl:template>
</xsl:stylesheet>
Approach 2 is what exists in the live system, and in fact the XML ranges from several KB to a few MB. In the XSLT, the use of variables extends to child entities as well.
To put forward my proposal to change the approach, I need to verify the theory behind it.
As I understand it, with approach 2 the system reloads the XML data into memory over and over (and the situation gets worse when multiple variables are used to load child entities), thereby slowing down the transformation.
Before posting this question I timed the two XSLTs. The first approach takes a few milliseconds less than approach 2. (I used copies of the XML file to test the two XSL files, to avoid complications with the system cache.) But the system cache might still play a hugely confusing role here.
Despite this analysis, I still have a question: do we really need to avoid using variables? And as far as my system is concerned, how worthwhile is it to modify the live XSLT files to use approach 1?
Or are XSLT variables different from variables in other programming languages (in case I'm not aware)? For example, perhaps XSLT variables don't actually store the data when you do select="." but rather point to the data, in which case I can continue using XSLT variables without hesitation.
What is your suggestion on this?
Quick Info on current system:
Host Programming Language or System: Siebel (C++ is the backend code)
XSLT Processor: Xalan (unless Saxon is used explicitly)
I agree with the comments made that you need to measure performance with your particular XSLT processor.
But your descriptions or expectations like "approach-2, system is reloading the XML data over and over in memory" seem wrong to me. The XSLT processor builds an input tree of the primary input XML document anyway, and I can't imagine that any implementation would load the document completely again for <xsl:variable name="root" select="."/>. Doing so would even be wrong, as node identity and generate-id would not work. The variable will simply keep a reference to the document node of the existing input tree.
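One way to check the reference behaviour on your own processor is a small stylesheet like the sketch below. It compares the node held in the variable with the document node itself, using generate-id and a union count; both tests rely only on standard XSLT 1.0 functions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="root" select="."/>

    <!-- generate-id returns the same identifier for the same node, so this
         is true only if $root refers to the original document node -->
    <xsl:text>same id: </xsl:text>
    <xsl:value-of select="generate-id($root) = generate-id(/)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- A union removes duplicate nodes, so a count of 1 means
         $root and / are one and the same node, not two copies -->
    <xsl:text>union count: </xsl:text>
    <xsl:value-of select="count($root | /)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

A conforming processor reports true and 1 for any input document, which is what you would expect if the variable is a reference rather than a copy.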
Of course, in your sample, where you have a single input document and a single template in which the current node is the document node anyway, the variable is superfluous. But there are cases where you need to store the document node of the primary input document, in particular when you deal with multiple documents.
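A sketch of the multiple-document case, where the variable does earn its keep. The second document prices.xml and its PRICES/ITEM structure with a title attribute are hypothetical, invented only for this illustration; CATALOG/CD/TITLE/ARTIST come from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <!-- Keep a reference to the document node of the primary input -->
    <xsl:variable name="root" select="."/>
    <Collection>
      <!-- Hypothetical secondary document -->
      <xsl:for-each select="document('prices.xml')/PRICES/ITEM">
        <xsl:variable name="title" select="@title"/>
        <DVD>
          <Cover>
            <xsl:value-of select="$title"/>
          </Cover>
          <Author>
            <!-- Inside this loop "/" is the root of prices.xml,
                 so the primary input is only reachable through $root -->
            <xsl:value-of select="$root/CATALOG/CD[TITLE = $title]/ARTIST"/>
          </Author>
        </DVD>
      </xsl:for-each>
    </Collection>
  </xsl:template>

</xsl:stylesheet>
```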

xslt result tree changing element position

Is it possible to reorder the result generated by the XSLT processor? E.g. in the case below I want Benefit type="Main" and its child elements to be output before Benefit type="Rider".
Two separate templates are applied for Rider and Main, so I think xsl:sort cannot be applied, since it sorts within a single collection.
<Policy>
<Benefit type="Rider">
<ProductAbbreviatedName>BBB</ProductAbbreviatedName>
<ProductCode>U30</ProductCode>
<ProductName>BBB</ProductName>
</Benefit>
<Benefit type="Main">
<ProductAbbreviatedName>AAA</ProductAbbreviatedName>
<ProductCode>231Y</ProductCode>
<ProductName>AAAA</ProductName>
</Benefit>
</Policy>
Kindly advise on some ideas to produce the desired output. Many thanks.
Just use in your code:
<xsl:apply-templates select="Benefit[@type='Main']"/>
<xsl:apply-templates select="Benefit[@type='Rider']"/>
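In context, the two instructions might be used as in the following sketch. It assumes an identity template stands in for the asker's own Main and Rider templates, which were not shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Identity template, standing in for the real Main/Rider templates -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Output order follows the order of the apply-templates instructions,
       not the order of the Benefit elements in the input -->
  <xsl:template match="Policy">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="Benefit[@type='Main']"/>
      <xsl:apply-templates select="Benefit[@type='Rider']"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample Policy, this emits the Main benefit first and the Rider benefit second. Children of Policy other than Benefit would be dropped by this sketch, so add a further apply-templates for them if your real input has any.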

Using a variable as part of a XPath selection

I'm looking to use a variable as part of an XPath expression.
My problem might be the msxsl node-set function... not sure. But don't let that cloud your judgement... read on...
I'm using a .NET function to load the content file, which is passed in via bespoke XML content. The @file attribute resolves to an XML file.
The bespoke XML that sits on the page looks like:
<control name="import" file="information.xml" node="r:container/r:group[@id='set01']/r:item[1]" />
The XSL looks like :
<xsl:variable name="document">
<xsl:copy-of select="ext:getIncludedContent(@file)" />
</xsl:variable>
I'm then translating this to a node-set so I can query the document
<xsl:variable name="preset-xml" select="msxsl:node-set($document)" />
The source file I am loading in looks like :
<container>
<group id="set01">
<item>value01</item>
<item>value02</item>
<item>value03</item>
</group>
<group id="set02">
<item>value04</item>
<item>value05</item>
</group>
</container>
It works up until this point. I can see the source file being brought thru and output as XML.
My problem comes in when I am trying to query the source file with an XPath expression fed in from the content.
I've tried :
<xsl:value-of select="$preset-xml/@node" />
Clearly that doesn't work, as it looks for @node as an attribute of the loaded XML.
I've tried:
<xsl:variable name="node" select="@node" />
<xsl:value-of select="$preset-xml/$node" />
But it doesn't like the second variable.
I've tried concat($preset-xml,'/',$node), but that still draws out the entire document in the result.
The only way I can get this working at the moment is to write out the full expression in the template:
<xsl:value-of select="$preset-xml/r:container/r:group[@id='set01']/r:item[1]" />
Which correctly brings thru value01
But that is then a hard coded solution.
I want to develop a solution which can be manipulated from the content to suit any sort of imported content, with the parameters declared in the content file.
p.s. I'm using XSLT 1.0 because the tech admin won't change the Microsoft parser to support later versions of XSL, so any solution would need to be written with that in mind.
There is no feature similar to reflection in XSLT. The MS XSLT engine, in particular, compiles XPath queries when the XSLT is loaded, not when it is applied to your XML document. There is a reason for this: it separates the code (XSLT) from the data (input XML).
What are you trying to achieve by mixing code and data? Are you sure you really need to execute an arbitrary XPath expression from the input XML? Consider, for example, what happens if the user passes in @node equal to //*: you would then send the content of the entire input XML document to the output, which might contain sensitive data.
If you really need to do the thing you described, you may write your own XSLT extension object, which will accept the document root and the XPath query and will return the latter applied to the former. Then you'll be able to call it like that:
<xsl:value-of select="myExtensionNs:apply-xpath($preset-xml, @node)"/>
As for your original attempt with concat($preset-xml,'/',$node): concat is the string concatenation function. You supply some arguments to it, and it returns the concatenation of their string representations. So concat($preset-xml,'/',$node) returns the string value of $preset-xml, concatenated with /, concatenated with the string value of $node. If you were trying to build the needed XPath query, concat('$preset-xml','/',$node) would seem more logical (note the quotes); still, there is no way to evaluate the resulting string (e.g. $preset-xml/r:container/r:group[@id='set01']/r:item[1]) as the corresponding XPath query; it could only be sent to the output.
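If the shape of the path is fixed and only the values vary, one alternative that stays within plain XSLT 1.0 is to pass the values rather than the whole expression. The sketch below is only an illustration under several assumptions: the group and index attributes on control are hypothetical replacements for @node, the standard document() function stands in for ext:getIncludedContent, and the element names are unprefixed to match the sample source file shown in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical control element:
       <control name="import" file="information.xml" group="set01" index="1"/> -->
  <xsl:template match="control[@name='import']">
    <xsl:variable name="group" select="@group"/>
    <xsl:variable name="index" select="number(@index)"/>
    <!-- document() loads the file relative to the input document -->
    <xsl:variable name="preset-xml" select="document(@file)"/>

    <!-- The path structure is compiled with the stylesheet;
         only the compared values come from the content -->
    <xsl:value-of
        select="$preset-xml/container/group[@id = $group]/item[position() = $index]"/>
  </xsl:template>

</xsl:stylesheet>
```

With the sample source file, group="set01" and index="1" select value01. This also avoids the //* concern raised above, since the content can only choose among nodes the stylesheet already addresses.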
One possible trick is to call a parameterized xsl:template that carries out the crucial comparison and to evaluate its output as a string.
I had a similar problem (counting all nodes with an arbitrary attribute value), which I solved this way (by the way, $sd is document($filename), so I load the content of a secondary XML file):
<xsl:template match="/">
<xsl:variable name="string_cntsuccess">
<xsl:for-each select="./Testcase">
<xsl:variable name="tcname" select="@location"/>
<xsl:for-each select="$sd//TestCase[@state='Passed']">
<xsl:call-template name="producedots">
<xsl:with-param name="name" select="$tcname"/>
</xsl:call-template>
</xsl:for-each>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="cntsuccess" select="string-length($string_cntsuccess)"/>
</xsl:template>
<xsl:template name="producedots">
<xsl:param name="name"/>
<xsl:variable name="ename" select="@name"/>
<xsl:if test="$name = $ename">
<xsl:text>.</xsl:text>
</xsl:if>
</xsl:template>