Executing a Pentaho transformation (.ktr) using Node.js with Pentaho CE - pentaho

I am able to successfully execute the .ktr files from the browser as well as from Postman using the URL below:
http://localhost:8089/kettle/executeTrans/?trans=D:\Pentaho\ktr\MyJson_to_Database.ktr
But I want to automate the process, and the ktr needs to accept a JSON file as input (right now the JSON data is inside the ktr file itself). As I am using Node.js to automate the ktr execution, I am trying to use wreck and its POST method to execute it (I am new to wreck), and I am having difficulty working out whether the error comes from wreck or from the Kettle transformation itself.
In the meantime, I am trying to execute it without passing the path as a query string in the URL; instead I want to pass it in the body. I have searched Google with no success so far.
EDIT 1
I am able to reach the ktr file from the Node.js microservice, and now the challenge is to read the file path inside the Docker image.

Could you make it work by storing the JSON data in a file, and modifying/extending the transformation to read that file and pass the information along?
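For example (a minimal sketch only, built on assumptions: Carte listening on localhost:8089 with its default cluster/cluster credentials, the transformation modified to read its JSON input from a fixed file, and placeholder paths throughout), the Node.js side could look roughly like this with wreck:

```
// Sketch: write the JSON to a file the transformation reads, then trigger it via Carte.
// All paths, the port and the credentials are assumptions - adjust them to your setup.
import Wreck from '@hapi/wreck';
import { promises as fs } from 'fs';

const CARTE = 'http://localhost:8089';
const TRANS_PATH = 'D:\\Pentaho\\ktr\\MyJson_to_Database.ktr'; // path as seen by Carte
const INPUT_FILE = 'D:\\Pentaho\\ktr\\input.json';             // file the JSON Input step reads

async function runTransformation(payload: object): Promise<void> {
  // 1. Write the JSON the transformation should process to the agreed location.
  await fs.writeFile(INPUT_FILE, JSON.stringify(payload), 'utf8');

  // 2. Trigger the transformation; the ktr path stays in the query string,
  //    only the data has moved into the file.
  const url = `${CARTE}/kettle/executeTrans/?trans=${encodeURIComponent(TRANS_PATH)}`;
  const auth = 'Basic ' + Buffer.from('cluster:cluster').toString('base64');
  const { res, payload: body } = await Wreck.get(url, { headers: { authorization: auth } });

  console.log('Carte responded with status', res.statusCode, body.toString());
}

runTransformation({ example: 'data' }).catch((err) => console.error(err));
```

If the Node.js service and Carte run in different containers, the INPUT_FILE location has to be a path both of them can see (for example a shared volume), which is exactly the Docker challenge mentioned in the edit.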

Related

Azure Data Factory HTTP Connector Data Copy - Bank Of England Statistical Database

I'm trying to use the HTTP connector to read a CSV of data from the BoE statistical database.
Take the SONIA rate for instance.
There is a download button for a CSV extract.
I've converted this to the following URL which downloads a CSV via web browser.
[https://www.bankofengland.co.uk/boeapps/database/_iadb-fromshowcolumns.asp?csv.x=yes&Datefrom=01/Dec/2021&Dateto=01/Dec/2021 &SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y][1]
Putting this in the Base URL it connects and pulls the data.
I'm trying to split this out so that I can parameterise some of it.
Base
https://www.bankofengland.co.uk/boeapps/database
Relative
_iadb-fromshowcolumns.asp?csv.x=yes&Datefrom=01/Dec/2021&Dateto=01/Dec/2021 &SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y
It won't fetch the data; however, when it's all combined into the base URL, it does.
I've tried adding a "/" at the start of the relative URL as well, and that hasn't worked either.
According to the documentation, ADF puts the "/" in for you: [Base]/[Relative].
Does anyone know what I'm doing wrong?
Thanks,
Dan
[1]: https://www.bankofengland.co.uk/boeapps/database/_iadb-fromshowcolumns.asp?csv.x=yes&Datefrom=01/Dec/2021&Dateto=01/Dec/2021 &SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y
I don't see a way you could download that data directly as a CSV file. The data seems to have to be copied manually from the site, using their Save as option.
They have used read-only blocks and hidden elements, so I doubt there is any easy or out-of-the-box method within the ADF Web activity to help with this.
You can just manually copy-paste the data into a CSV file.

SAP ArchiveLink object cannot be opened

I have a requirement to convert an SF spool to PDF and save it with ArchiveLink.
I've used the FMs RSTS_GET_ATTRIBUTES, CONVERT_OTFSPOOLJOB_2_PDF and SCMS_XSTRING_TO_BINARY, then saved to the application server via OPEN DATASET.
For the archiving part, the FMs are ARCHIVOBJECT_CREATE_TABLE and ARCHIV_CONNECTION_INSERT. The setup was done by Basis in OAC0, and I've set up OAC2 and OAC3. Upon executing the program, the TOA01 table has entries.
But when checking the PDF file using FM ARCHIVOBJECT_DISPLAY, the "ArchiveLink object cannot be opened" error pops up.
When I directly download the PDF file from the application server to the presentation layer, it can be viewed normally.
What am I missing?
Managed to solve it after looking closely at the data types:
the BINARCHIVOBJECT parameter of ARCHIVOBJECT_CREATE_TABLE is typed as RAW1024, while
the BINARY_TAB parameter of SCMS_XSTRING_TO_BINARY is RAW128.
I executed cl_bcs_convert=>xstring_to_xtab instead.

Is it possible to automate updating Tableau extract for Tableau Reader?

Situation now:
I have a data warehouse job profile that publishes a .txt file to a Data folder every morning. I open a Tableau workbook, which automatically updates the data visualisations because of a union I made. I save this workbook as an extract, and colleagues without Tableau Desktop can view it via Tableau Reader.
What I need:
This reporting format is heavily dependent on me and I need to automate this.
Is this even possible without Tableau Server?
Since Tableau Reader can only use packaged workbooks with extracted data, you cannot achieve this directly.
However, you can automate the packaging process using Tableau's command-line parameters, and the process will no longer depend on anyone.
You can check the .PDF file at the link below. Using that help document, you can create a .BAT file and have it run periodically via Task Scheduler on your computer. The users can then open the packaged file from the network location where you saved it. Alternatively (if all user computers have Tableau Desktop installed), you can put a file-opening line at the end of the .BAT file, so users can run the .BAT whenever they want to see the report.
https://community.tableau.com/docs/DOC-5209
Bernardo was correct in saying that the Extract API can be used to programmatically create extracts, and thus "refresh" an extract by simply recreating it (the point about Tableau Server is only relevant if you want to publish the extract that you create with the Extract API).
Where you might have trouble is that there is currently no supported way to programmatically replace an extract within a .twbx file. That said, it should be possible to do this by simply renaming the .twbx to .zip (it is, after all, just an archive) and then using something like Python's zip module to manipulate the archive and replace the extract with your new one.
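As an illustration of that idea (a rough sketch only, using the Node adm-zip package rather than Python's zip module; the file names and the entry path inside the .twbx are made up, so inspect your own workbook first):

```
// Treat the .twbx as the zip archive it is and swap the extract inside it.
// TWBX, NEW_EXTRACT and EXTRACT_IN_ARCHIVE are placeholders for this sketch.
import AdmZip from 'adm-zip';
import { copyFileSync } from 'fs';

const TWBX = 'report.twbx';
const NEW_EXTRACT = 'refreshed.hyper';                  // extract you just rebuilt
const EXTRACT_IN_ARCHIVE = 'Data/Extracts/data.hyper';  // entry name inside the workbook

const zip = new AdmZip(TWBX);            // adm-zip doesn't care about the extension
zip.deleteFile(EXTRACT_IN_ARCHIVE);      // drop the stale extract
zip.addLocalFile(NEW_EXTRACT, 'Data/Extracts/', 'data.hyper'); // add the fresh one

copyFileSync(TWBX, TWBX + '.bak');       // keep a backup of the original workbook
zip.writeZip(TWBX);                      // repack in place
```

This is not an officially supported workflow, so keep the backup until Tableau Reader has opened the repacked file successfully.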
NB: The Extract API can only be used to create .hyper files. If you want to work with .tde files, then you'll need to use the Tableau SDK instead.

Execute a script in Pentaho DI using a URL

I am new to this tool.
I am trying to load or execute a PHP script (http:\.....\re_certification/re_certification.php) in Pentaho. I don't know which tool I can use to do it.
Any idea or example?
Use the HTTP Client step; it will fetch a URL using the given parameters - http://wiki.pentaho.com/display/EAI/HTTP+Client.
Do not forget that it needs to be triggered - use a Generate Rows step or any other input step before it.
There is also a sample at the
samples/transformations/HTTP Client - simple retrieval example.ktr

Executing Abaqus Model in Taverna

I'm pretty new to both Taverna and Abaqus but I am trying to run an Abaqus model using a "Tool" in Taverna remotely on a HPC. This works fine if I already have my model file and inputs on the HPC but I need a way of uploading the files dynamically in Taverna (trying to generically wrap Abaqus models).
I've tried adding an input port that takes a file list, but I don't know how I can copy it to the "location" that I've set for the tool. Could a Beanshell service be the answer, or can I iterate through the file list and copy the files up before executing the Abaqus model?
Thanks
When you say that you created an input port that takes a file list, I guess you mean an input to the tool service.
Assuming the input port is called my_file_list, when the tool service is run, it will take a list of data values on port my_file_list. As an example, say "hello", "hi" and "hola" are the three values in the list.
At the location where the tool service is run, it executes in a temporary directory - a different directory for each execution of the service. It is normally something like /tmp/usecase-2029778474741087696
Three files will be created in the temporary directory; those files contain the (in this example) three values the tool service received on port my_file_list. The files could be called
/tmp/usecase-2029778474741087696/tempfile.0.tmp containing hello
/tmp/usecase-2029778474741087696/tempfile.1.tmp containing hi
/tmp/usecase-2029778474741087696/tempfile.2.tmp containing hola
There will also be a file called my_input_list. That file will contain
/tmp/usecase-2029778474741087696/tempfile.0.tmp
/tmp/usecase-2029778474741087696/tempfile.1.tmp
/tmp/usecase-2029778474741087696/tempfile.2.tmp
The script of your tool service would normally read the contents of my_input_list line by line and do something with the contents of the listed file(s).
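Just to illustrate that pattern (the tool script would normally be a plain shell script; this sketch uses TypeScript purely for readability, and the file names come from the hypothetical example above):

```
// Read my_input_list line by line and process the contents of each listed temp file.
import { readFileSync } from 'fs';

const listedFiles = readFileSync('my_input_list', 'utf8')
  .split('\n')
  .filter((line) => line.trim().length > 0);

for (const tempFile of listedFiles) {
  const value = readFileSync(tempFile, 'utf8'); // e.g. "hello", "hi" or "hola"
  console.log(`processing ${tempFile}: ${value}`);
}
```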
I have also seen some scripts that 'cheat' and iterate directly over tempfile*.tmp, but that would be "a bad thing". The problem with that trick is that if you want to add a second list of files to the tool service, then the file my_input_list could contain
/tmp/usecase7932018053449784034/tempfile.4.tmp
/tmp/usecase7932018053449784034/tempfile.5.tmp
/tmp/usecase7932018053449784034/tempfile.6.tmp
as other temporary files were used for the other file list port.
I hope that helps
The tool service allows you to upload files - but if you are using the HPC through a job submission node, then you would have to modify your command-line tool to use the job's file-staging command to push the files further as part of the job. The files would be available in the current (temporary) directory of the specified tool script.
I would try to do it through the Tool service and not involve Beanshell - then you can keep your workflow simpler.
A good thing to remember is that you can write multiple shell commands in the box.
Similarly, you would probably want to retrieve the results so that you can process them further in the workflow (unless they are massive, in which case you should just output their remote filenames and feed them into the next HPC job).
The exact commands to use for staging files and retrieving them depends on the HPC job submission system. Which one are you using?
Thanks for the input guys.
It was my misunderstanding of how Taverna uses the File list. All the files in the list are copied to the temp "sandbox" and are therefore available for use.
Another nice easy way is to zip the directory and pass the zipped files into an input port for the service. Then just unzip the files inside the command.
Thanks again