I am using Pentaho Community Edition 7. I want to schedule a job with two sub-transformations in it.
I want to schedule it to run every Monday. Can anyone please guide me on saving the files with the correct file paths and scheduling them from the BI server?
You can find guidance for your question in the accepted answer here:
How to deploy scheduled Kettle jobs on Pentaho BI server v6 CE
Basically:
Add the job file and an xaction file (which triggers the job file) to the Pentaho server
Add the transformation files to the server's filesystem
Schedule the xaction file in Pentaho using the "Schedule..." file action (Weekly, Monday, and the preferred time can be set in the pop-up dialog)
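As a rough sketch of where the files could live (the paths below are hypothetical; adjust them to your own installation):

    # The two sub-transformations go on the server's filesystem so the job can
    # reference them by path:
    mkdir -p /opt/pentaho/etl
    cp sub_transformation_1.ktr sub_transformation_2.ktr /opt/pentaho/etl/
    # The .kjb job and the .xaction that triggers it are added to the Pentaho
    # server itself (as in steps 1 and 3 above), not copied onto the filesystem.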
I'm looking at a migration project to migrate a client from Data Manager to DataStage. I can see that IBM has helpfully added the Migration Assistant tool, but, less helpfully, I cannot find any details on how to actually use it.
I did use it some years ago, and I'm aware that you need to run it from a command-line interface; it works by taking an extract file and creating a DataStage job out of it, which is then reinstalled. However, I no longer have my notes from that process.
If there is a user guide out there for this tool, I'd love to see it.
Cheers
JK
Your starting point would be the IBM support page; from there, see "How to download the tool" and ensure you have the required version (10.2.1 + Fix Pack 13 + Interim Fix) installed. The user guide PDF is part of the install, in the sub-folder "datamanager/migration".
I am trying to use Pentaho, which I downloaded from SourceForge (pentaho files). I run the schema-workbench shell script correctly and a window opens with the interface, but I still haven't been able to connect to the admin console at http://localhost:8080/pentaho.
Any ideas on why this doesn't seem to work for me?
Best regards
You have a start-pentaho.sh which launches the Pentaho server on port 8080 (it takes a long time the first time).
That is, if you have downloaded the correct package, because Pentaho ships as several packages: one is the server, another is the client tools, which contains Schema Workbench as well as PDI (Pentaho Data Integration) and PRD (Pentaho Report Designer), plus a few others.
You are running the wrong file. To open the Pentaho console, you need to download the PENTAHO SERVER package and run 'start-pentaho.sh'.
By default, Pentaho will start the PUC (Pentaho User Console) on http://localhost:8080/pentaho once the server is up and running. To get the data integration (i.e. Spoon) interface, go to:
For Windows: Pentaho install directory >> design-tools >> data-integration >> spoon.bat
For Linux/Mac: Pentaho install directory >> design-tools >> data-integration >> spoon.sh
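For example, assuming the server package was unpacked to /opt/pentaho/pentaho-server and the client tools to /opt/pentaho (both paths are just placeholders), a minimal sketch on Linux looks like:

    # Start the BI server; the first start can take several minutes
    cd /opt/pentaho/pentaho-server
    ./start-pentaho.sh
    # then open http://localhost:8080/pentaho for the Pentaho User Console

    # Spoon (the PDI designer) is a separate client tool:
    /opt/pentaho/design-tools/data-integration/spoon.sh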
I hope this helps.
We can extend U-SQL scripts with R/Python code in Azure Data Lake Analytics, but how can we do it locally?
Install U-SQL Advanced Analytics extensions in your Data Lake Analytics Account
1.1 Launch your Azure Portal
1.2 Navigate to your Data Lake Analytics Account
1.3 Click Sample Scripts
1.4 Click More and select Install U-SQL Extensions
1.5 Wait until the extensions have finished installing (2GB)
1.6 Have you waited? Then go to your Data Lake Analytics Account
1.7 Navigate to your default Data Lake Store Account
1.8 Click on Data Explorer and verify that a folder /usqlext exists
Get your USQL Local Run path
2.1 Launch your Visual Studio
2.2 Select Tools > Options > Azure Data Lake > General
2.3 Under U-SQL Local Run, find and copy the value for DataRoot
2.4 The value will look like this: C:\Users\username\AppData\Local\USQLDataRoot
Copy U-SQL Advanced Analytics extensions from Azure to your localhost
3.1 Use a PowerShell script, or ... go to the next step
3.2 Launch Microsoft Azure Storage Explorer (great tool, install it)
3.3 Locate your default Data Lake Store, the one of your Data Lake Analytics Account
3.4 Open data explorer and Download the folder /usqlext to your USQL Local Run's path
3.5 The full path should look like this: C:\Users\username\AppData\Local\USQLDataRoot\usqlext
Final step, register all Azure U-SQL Extensions under U-SQL Local Run
4.1 Launch your Visual Studio
4.2 Start a new U-SQL project
4.3 Open the file C:\Users\username\AppData\Local\USQLDataRoot\usqlext\RegisterAll.usql
4.4 Copy the text into your own U-SQL script
4.5 Run it in Local Execution mode
4.6 Once the script finishes...
You will be able to use all the U-SQL Advanced Analytics features (Python, R, Cognitive) on your own machine and explore all the samples in \usqlext\samples!
Have a nice day!
This answer does not apply directly to the OP, but this thread is the closest match to the issue that I had. Since others may find it while searching for a solution, I am recording my answer here.
Problem: In the Azure Portal (not locally), if you choose "Install Azure U-SQL Extensions", the job eventually fails with a nondescript error.
What happens behind the scenes is that all of the files are copied into storage, but the assemblies fail to register. You have to create a job manually to register the assemblies.
Answer:
Open Data Explorer
Navigate to /usqlext
Download the file "RegisterAll.usql". Open in Notepad, and copy the text
Create a new Job. Paste in the text.
Execute the job.
The assemblies will be registered (verify by checking Catalog > master > Assemblies). You can now run the Cognitive and Python samples.
I could not see any reference to Pentaho in the Skybot documentation. Is there a way to schedule Pentaho transformations and jobs with Skybot? I have tried creating agents and referring to the file path, but nothing is working! Any pointers?
To execute or schedule Pentaho jobs with Skybot, you need Pentaho installed on that system. If you are using a Linux system, first install Pentaho DI. Once you have done that, use the Skybot scheduler: point it at Pentaho DI's kitchen.sh or pan.sh and at the files you need to schedule/execute, as in the sketch below. You can take help from this link:
How to schedule Pentaho Kettle transformations?
Once that is done, you can execute a transformation. Skybot only needs the OS and Pentaho installed to execute/schedule a job; the same goes for the Windows scheduler or any other scheduling tool.
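As a sketch, the command the Skybot agent would be configured to run could look like this (the install path and job file are placeholders for your own setup):

    /opt/pentaho/data-integration/kitchen.sh -file=/opt/etl/weekly_load.kjb -level=Basic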
Hope it helps :)
This should be pretty simple. Just use the CLI tools to start your job: Kitchen runs jobs, Pan runs transformations.
Here's the documentation for Kitchen. It's very straightforward.
http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation
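A minimal sketch of both tools from the command line (the job and transformation paths are placeholders):

    cd /opt/pentaho/data-integration
    # run a job with Kitchen
    ./kitchen.sh -file=/home/etl/jobs/my_job.kjb -level=Basic
    # run a single transformation with Pan
    ./pan.sh -file=/home/etl/transformations/my_transformation.ktr -level=Basic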
I developed a simple job that loads data from multiple Excel files into a MySQL database in Pentaho Kettle, and I'm using Kitchen.bat from the command line to run the job. If I need to move the job to another production server, the application is more than a GB in size; is there any way I could deploy the job without moving all the libraries to the production server?