I have jobs and transformations on a Pentaho server, but when I try to open, edit, save, or close them I have to wait a long time. When I try to move a step, the screen feels like it's running at 10 fps.
How can I improve Spoon's performance? Can somebody help me?
The Pentaho server is on a CentOS machine with 16 GB of RAM.
I modified spoon.sh to use -Xms5G and -Xmx10G.
There is another parameter, -XX:MaxPermSize, which is set to 256 MB (I don't know what this parameter means).
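For context, -Xms and -Xmx set the JVM's initial and maximum heap size, and -XX:MaxPermSize caps the permanent generation (the area where the JVM stores class metadata); Java 8 and later ignore it. In recent Kettle releases these options are usually set through the PENTAHO_DI_JAVA_OPTIONS variable in spoon.sh; a sketch of what the edited line might look like (the exact variable name and defaults vary by release):

# in spoon.sh: 5 GB initial heap, 10 GB max heap, 256 MB PermGen (pre-Java-8 only)
PENTAHO_DI_JAVA_OPTIONS="-Xms5g -Xmx10g -XX:MaxPermSize=256m"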
One thing we found was under "Administration" in the Pentaho User Console: click on "Settings". There is an option to "Delete Generated Files Now". Each time the Pentaho Server runs a job it creates files, and over time these build up. You can also set a schedule on the same page to remove these files periodically. Our time to browse jobs/transformations went from almost 2 minutes to 5-10 seconds.
I'm new to Pentaho and I need your help investigating a problem.
I have a job scheduled in crontab that runs via the kitchen command. I'm using Pentaho release 6.0.1.0.386.
Sometimes (it's not a deterministic problem) one of the transformations stops after "Loading transformation from repository" and before "Dispatching started for transformation". The log just stops. No errors. Nothing. And the job doesn't go on.
Any ideas? Any checks I can do? Thanks.
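For reference, a crontab entry for such a setup might look like this; the install path, repository, credentials, job name, and log path are all placeholders:

# run the job nightly at 02:00; all names and paths below are placeholders
0 2 * * * /opt/pentaho/data-integration/kitchen.sh -rep=myrepo -user=admin -pass=secret -job=nightly_job -level=Basic >> /var/log/kitchen_nightly.log 2>&1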
Is the quantity of data in this transformation that much bigger?
There are some files that can cause errors; you can find them in this path:
(screenshot of the .kettle folder, highlighting the files to delete)
C:\Users\<your user>\.kettle
If you delete the files marked in the image, they will be recreated automatically the next time you open Pentaho.
I hope this helps you.
Here's my situation: every day I import some data into Power Pivot through a query on a SQL database.
Currently, every morning I open Power Pivot and refresh it to import the previous day's data from the database.
This takes 20 minutes because I have a lot of data to import.
I was wondering if there is a way to do this during the night, maybe an automatic refresh, so that I can open the file in the morning and already have the previous day's data.
I hope I was clear with my request; thanks in advance.
If the Excel workbook is on a machine that does not shut down, you can keep the workbook open and configure the query to refresh automatically every x minutes.
Or you can keep the workbook open and run VBA code to refresh the query on a timer (see the sketch below).
There are plenty of examples of VBA timers if you just care to search.
Or you can configure the queries to refresh automatically when the file is opened, then create a Windows Task Scheduler job to open the workbook at a specific time. Again, the computer running this must be turned on.
As you can see, there are many options, and they are all well documented and just a short Google search away.
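A minimal sketch of the VBA timer option, assuming the workbook's connections (including the Power Pivot model) refresh via ThisWorkbook.RefreshAll; the interval and procedure names are just examples. Place it in a standard module and run StartRefreshTimer once:

Public RunWhen As Date

Sub StartRefreshTimer()
    RunWhen = Now + TimeValue("00:30:00")   ' example: refresh every 30 minutes
    Application.OnTime EarliestTime:=RunWhen, Procedure:="DoRefresh"
End Sub

Sub DoRefresh()
    ThisWorkbook.RefreshAll                 ' refreshes all workbook connections
    StartRefreshTimer                       ' reschedule the next run
End Sub

Sub StopRefreshTimer()
    On Error Resume Next                    ' ignore error if no timer is pending
    Application.OnTime EarliestTime:=RunWhen, Procedure:="DoRefresh", Schedule:=False
End Sub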
The problem is that I received a ticket from the AMS support team which I cannot debug, because for the given input parameters on the selection screen the program runs for 10 hours, which is why it is set up as a background job.
The point of the program is that it should save some data to an xls file on the application server.
The important thing is that for some input parameters on the selection screen the program WORKS (smaller date intervals, and so less data to work with), but right now I have to explain to the consultant why the program cannot write that much data into the file on the application server.
To conclude: a background job is linked to the program, which grabs a lot of data from the DB; in some cases, when there is an enormous amount of data, the program cannot open the file for output, so there is no data in the xls.
My question is: how big is the limit for OUTPUT mode in OPEN DATASET, and why do I get an "error opening file" when I use bigger intervals on the selection screen?
OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE ENCODING NON-UNICODE
  IGNORING CONVERSION ERRORS.
IF sy-subrc EQ 0. "PROGRAM FAILS HERE, SY-SUBRC eq 3
  ...
ENDIF.
The program works when we select less data from the DB. I have to provide an answer to the question: "Why does it fail when I grab a big amount of data?"
Error in dialog mode: (screenshot)
Error in background mode: (screenshot)
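One check worth adding: OPEN DATASET supports a MESSAGE addition that returns the operating system's error text, which usually says exactly why the open failed. A minimal sketch; lv_msg is a variable introduced here just for illustration:

DATA lv_msg TYPE string.

OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE ENCODING NON-UNICODE
  IGNORING CONVERSION ERRORS
  MESSAGE lv_msg.
IF sy-subrc <> 0.
  "lv_msg holds the OS reason, e.g. 'No such file or directory'
  WRITE: / 'OPEN DATASET failed:', lv_msg.
ENDIF.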
UPDATE: this answer assumes that the original diagnosis ("because of data volume") was based on a misinterpretation of what happened, due to a simple coincidence. That often happens, but I may be wrong of course. The assumption is based on the latest OP comment: "What I found interesting is that in the background job list, if there are 3 jobs for that user, two of them failed and their target server was the 2nd one, but the one job which succeeded in opening the file had system #1 as its target system; the difference is that that job had a duration of ~1 hour, not 10 hours like the two others."
When you run a background job and there is an intermittent error opening a file, it may be because your ABAP system has several application servers and (at least) one of them is not configured correctly to map a given folder to a "network" folder shared by all the other application servers.
To make sure, check which application server the failed job ran on by displaying its details (transaction code SM37). Then run the program twice with the same input parameters: once on the application server where a job failed, and once on the application server where a job succeeded.
It should fail and succeed accordingly.
To run a program on a given application server, there are two solutions:
Either start a job indicating the desired target application server (see the sketch after this list),
Or switch your SAP GUI user session to the application server you want:
Use SM51 to display the list of all application servers,
double-click the server concerned,
this opens the overview screen in a new user session started on that server,
enter /NSE38 in the command field and start the program in dialog mode (it will run on that server).
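For the first option, besides entering the target server in SM36 ("Exec. target"), a job can be created programmatically and pinned to a server through the TARGETSERVER parameter of JOB_CLOSE. A minimal sketch; the job, report, and server names are placeholders:

DATA: lv_jobname  TYPE tbtcjob-jobname VALUE 'Z_TEST_JOB', "placeholder
      lv_jobcount TYPE tbtcjob-jobcount.

"Create the job
CALL FUNCTION 'JOB_OPEN'
  EXPORTING
    jobname  = lv_jobname
  IMPORTING
    jobcount = lv_jobcount.

"Add the report to run (placeholder report name)
SUBMIT zmy_report VIA JOB lv_jobname NUMBER lv_jobcount AND RETURN.

"Release the job: start immediately, pinned to one application server
CALL FUNCTION 'JOB_CLOSE'
  EXPORTING
    jobname      = lv_jobname
    jobcount     = lv_jobcount
    strtimmed    = 'X'
    targetserver = 'hostname_SID_00'. "server name as shown in SM51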
Now that it's almost certain this is the cause, you should ask the administrator to fix the issue: on the affected application server, they should add a "mapping" from the file folder to the shared folder (the same as was done on the other application servers).
I have been asked to analyze an issue regarding one of the BizTalk servers. I was asked to free up space on a particular drive, where I found that a single file, BiztalkMsgBoxDB_log.bak, is taking up close to 90% of the drive.
Running the following query, I later found out that the log space used is only 1.25%.
EXEC ('DBCC sqlperf(LOGSPACE) WITH NO_INFOMSGS')
Database Name    Log Size (MB)  Log Space Used (%)  Status
BizTalkMsgBoxDb  24930.49       1.257622            0
Currently the recovery model is FULL, and the transaction log backup was taken an hour ago.
I have no clue as to why the log file grew so large.
How can I free up space on this drive?
Thanks in advance,
GHR
You have to shrink your database log.
Right-click your database => Tasks => Shrink => Files, pick the log file, and that's it (the T-SQL equivalent is sketched below).
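For reference, a hedged T-SQL equivalent; the logical file name below is an assumption, so confirm it in sys.database_files first, and under the FULL recovery model shrink the log only after a log backup has run:

USE BizTalkMsgBoxDb;
-- confirm the logical name and current size of the log file
SELECT name, type_desc, size / 128 AS size_mb
FROM sys.database_files;

-- shrink the log file to roughly 1 GB (the target size is only an example)
DBCC SHRINKFILE ('BizTalkMsgBoxDb_log', 1024);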
Make sure your "Backup BizTalk Server Job" is properly configured and is not failing (check the SQL Server Agent node on the BizTalk database server).
For reference on how to configure this job (and more details of what it does), check MSDN.
I am using cfprint from ColdFusion to print multiple PDFs from a directory. The problem I am having is that when the files are spooled to the printer, the file size increases dramatically and slows everything down. A file that is 125 KB in the folder grows to 15.7 MB in the printer spool. Here is the ColdFusion code:
<cfprint
source="[FILELOCATION]/[FILE].pdf"
color="yes"
printer="[printer name]">
The files eventually print, but it can take upwards of 15-20 minutes. Does anyone have any solutions for this issue? I have tried with both CF-generated PDFs and ones I created from scratch. Thanks.
Queue up two to five at a time, pause to allow processing, mark them as printed, move or delete them, then move on to the next batch (see the sketch below). Time this yourself to see how much delay you need to allow. That way you don't pile up a bunch of work and create a bottleneck on your CF server.
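A minimal hedged sketch of that batching idea; the directory paths, printer name, and batch size are all placeholders:

<!--- print PDFs in small batches so the spooler is never flooded --->
<cfdirectory action="list" directory="C:\pdf\outbox" filter="*.pdf" name="qFiles">
<cfset batchSize = 5>
<cfif qFiles.recordCount>
    <cfloop query="qFiles" endrow="#min(batchSize, qFiles.recordCount)#">
        <cfprint source="C:\pdf\outbox\#qFiles.name#"
                 color="yes"
                 printer="[printer name]">
        <!--- mark as printed by moving the file out of the outbox --->
        <cffile action="move"
                source="C:\pdf\outbox\#qFiles.name#"
                destination="C:\pdf\printed\#qFiles.name#">
    </cfloop>
</cfif>
<!--- run this template every few minutes as a CF Administrator scheduled
      task; each run prints at most batchSize files --->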
If you are doing this with just one server, consider using a secondary low-priority server running a fully paid-for, EULA-compliant registered version of ColdFusion (or Railo), and dedicate that server to printing so your other server can do useful things.
Edit
So the OP has a ColdFusion print bottleneck. On the server that does the printing (the same as your CF server, I assume), and IF it is a Windows server (I'm not sure of your server version), there is a print queue folder. Provided you have access to this folder, you can do a few things. You can FTP your files to this folder (or copy them if it is the same server); the printer will queue up the job and off it goes. You can also add checks, such as counting the files in the print queue: if the count is greater than zero, check back in 15 minutes; if it is zero, copy over a few more files.
You can create a scheduled task in your CF Administrator and automate this. There is a getPrinterInfo() function, so you can check whether the printer is offline and do other things, like checking for another printer somewhere else if you need to reroute print jobs. You can also set up several print servers, attach printers to them, and check each server's print queue folder.
The magic is endless; the goal is to offload work from your ColdFusion server.
So to recap:
Separate concerns by not doing cfprint.
Create escape routes to other printers if you can.
If you must use ColdFusion, then set up a dedicated ColdFusion server for print management.
Use getPrinterInfo() and dump out the result to see what you can use, trap, etc. (see the sketch at the end).
Ben Forta has a tool that can check several printers; consider incorporating it.
Next, use cfftp (or cffile if you are on the same server), provided you have access, and copy files straight to the print queue folders, doing no cfprint at all.
Here is a link on print spool handling (another link in that doc shows you how to change the spool location).
When it is all over, you are going to be the ColdFusion print master, with escape routes and checks and everything.
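The getPrinterInfo() sketch referenced in the recap above; the printer name is a placeholder, and the exact struct keys returned vary by ColdFusion version:

<!--- inspect printer attributes (status, paper sizes, etc.) --->
<cfset printerInfo = getPrinterInfo("[printer name]")>
<cfdump var="#printerInfo#">
<!--- e.g. reroute jobs to another printer when the dumped status
      shows this one is offline --->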