How to set a Kofax KTM Server global variable that is initialized in Batch open, updated in SeparateCurrentPage, and used in BatchClose?

I am trying to count occurrences of a specific barcode value in Project.Document_SeparateCurrentPage and use the count in BatchClose: if the count is greater than 1, the batch is sent to a specific queue with a specific priority. I used a global variable in the KTM Project Script to hold the count, initialized to 0 in Batch open. It worked fine through unit testing. But our automation team found that, out of 20 similar batches, a few were sent to the queue reserved for batches whose count is greater than one, even though they contained only one barcode.
I googled and found that KTM Server script events cannot share information across different processes (https://docshield.kofax.com/KTM/en_US/6.4.0-uuxag78yhr/help/SCRIPT/ScriptDocumentation/c_ServerScriptEvents.html). I then tried to use a batch field to hold the barcode count, but I was unable to update its value from Project.Document_SeparateCurrentPage using pXRootFolder.Fields.ItemByName("BatchFieldName").Text = "GreaterThanOne". The logs show that the batch reads the first page three times and then errors out.
Any links would help. Thanks in advance.

As you mentioned, the different phases of batch/document processing can execute in different processes, so global variables initialized in one event won’t necessarily be available in others. Ideally you should only use global variables if their content can be set from Application_InitializeScript or Application_InitializeBatch, because these events occur in each separate process. As you’ve found out, you shouldn’t use a global variable for your use case, because Document_SeparateCurrentPage and Batch_Close for one batch may occur in different processes, just as the same process will likely execute those events for multiple batches.
Also, you cannot set batch fields from document level events for a related reason: any number of separate processes could be processing documents of a batch in parallel, so batch level data is read-only to document events. It is a bit unintuitive, but separation is a document level event even though it seems like it is acting on the whole batch. (The three times you saw is just an error retry mechanism.)
If it meets your needs, the simplest answer might be to use a barcode locator as part of normal extraction (not just separation), and assign to a field if needed. While you cannot set batch fields from document events, you can read document data from batch events. So instead of trying to track something like a count over the course of document events, just make sure whatever data you need is saved at a document level. Then in a Batch_Close you can iterate the documents and count/calculate whatever you need. (In your case maybe the number of locator alternatives for the barcode locator, across each document.)
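The "store per document, aggregate at batch close" idea can be sketched as follows. This is a minimal Python model of the pattern, not KTM script: the Document class, its barcode_values attribute, and the "SEP-01" value are all hypothetical stand-ins for whatever document-level data (for example, locator alternatives) your project actually saves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    # Hypothetical stand-in for document-level data saved during extraction.
    barcode_values: List[str] = field(default_factory=list)


def count_barcode(documents: List[Document], target: str) -> int:
    """Count occurrences of `target` across all documents of one batch."""
    return sum(doc.barcode_values.count(target) for doc in documents)


def needs_special_queue(documents: List[Document], target: str) -> bool:
    """Decision made once, at batch-close time, from document-level data only."""
    return count_barcode(documents, target) > 1


# Example batch: the target barcode appears on two different documents.
batch = [Document(["SEP-01"]), Document([]), Document(["SEP-01", "OTHER"])]
print(count_barcode(batch, "SEP-01"), needs_special_queue(batch, "SEP-01"))
```

Because nothing is accumulated across events, the result does not depend on which process handled which document.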

Related

How to add (concatenate) variables inside batch processing in Mule 4?

I am processing records from one DB to another DB. The batch job is called multiple times within a single request (the process API URL is triggered only once).
How can I add up the total records processed (given by the payload in the On Complete phase) for one complete request?
For example, I ran the process and the batch job executed three times. I want the sum of the records across all 3 batch jobs.
That's not possible because of how the Batch scope works:
In the On Complete phase, none of these variables (not even the
original ones) are visible. Only the final result is available in this
phase. Moreover, since the Batch Job Instance executes asynchronously
from the rest of the flow, no variable set in either a Batch Step or
the On Complete phase will be visible outside the Batch Scope.
source: https://docs.mulesoft.com/mule-runtime/4.3/batch-processing-concept#variable-propagation
What you could do is store the results in a persistent repository, for example in your database.
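As a rough illustration of the persistent-repository idea (in Python with an in-memory SQLite database rather than Mule and your real DB), each batch job run records its processed count under a shared request id, and the total is summed afterwards. The table name batch_results, the request_id column, and the helper names are assumptions made up for this sketch.

```python
import sqlite3


def record_batch_result(conn, request_id, processed):
    """Called once per batch job run, e.g. from its completion step."""
    conn.execute(
        "INSERT INTO batch_results (request_id, processed) VALUES (?, ?)",
        (request_id, processed),
    )
    conn.commit()


def total_processed(conn, request_id):
    """Sum the counts of every batch job run that belongs to one request."""
    row = conn.execute(
        "SELECT COALESCE(SUM(processed), 0) FROM batch_results WHERE request_id = ?",
        (request_id,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batch_results (request_id TEXT, processed INTEGER)")

# Three batch job runs triggered by the same request.
for processed in (120, 80, 45):
    record_batch_result(conn, "req-1", processed)

print(total_processed(conn, "req-1"))
```

The request id has to be generated before the batch jobs start and handed to each of them, since variables set inside the batch scope are not visible outside it.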

How can I make sysdate variable in VS 2019 SSIS

I want to make a variable within SSIS that is the current date so that I can reference it in a script task but I have only been able to do this with start date and creation date instead of sysdate. Can anyone help?
SSIS has two states: design-time and run-time. Design-time is the experience in Visual Studio/BIDS/SSDT. There are artifacts on the screen and interactive windows, and the Variables window shows the values of the package "at rest".
Run-time is the experience in the Debugger (or an unattended execution). In the debugger it looks much like design-time: you see the objects, the data flow components light up, and you can see data flowing between components. But there are discrepancies between the two. For example, the Variables window won't show you what the value of a variable is "RIGHT NOW." Instead, it shows the design-time value. If you want to see what the internals look like now, use the Debug menu, Locals window. There you'd see the current values of all the variables that were defined at design-time.
System::StartTime has its run-time value set when the package begins (OnPackageStart event). The time the package starts is constant for the run of a package: whether the run lasts a minute or 3 days, the start time is the time the package started. The design-time value won't ever be passed to a consumer of that variable, because the value is updated when the package starts. SSIS does not update the design-time values with the previous run's values, i.e. a design-time start time of 2021-02-18 will always be the at-rest value despite the package being run every day.
You cannot control this behavior, nor do you need to worry about it never being accurate as it is part of how run-time works.
There is also an expression, GetDate(), which is evaluated every time it is inspected (at design-time and at run-time). I usually advise against it because I am likely using the current time to correlate database activities.
e.g. I created these 10, 100, or 1000000 records at 2021-02-22T11:16:32.123. With a single start time, every row is recorded under the same timestamp even if I insert in batches of ten. With GetDate(), it would look something like the first 10 at 2021-02-22T11:16:32.123, the next 10 at 2021-02-22T11:16:32.993, the next ten at 2021-02-22T11:16:33.223, etc. Maybe more, maybe less. This matters because I can no longer prove to the business that "these 10/100/1000000 are the rows from load X because they all have the same timestamp". Instead, I need to find all the rows from 2021-02-22T11:16:32.123 to 2021-02-22T11:16:38.532 and, oops, a different process also ran in that timeframe, so my range query now identifies 10/105/1000003 rows.
GetDate for longer running processes that start before, but near the midnight boundary can result in frustrating explanations to the business.
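The difference between capturing the time once and re-evaluating it per batch can be shown with a small simulation. This is Python, not SSIS; stamp_rows, make_fake_clock, and the one-second step are hypothetical and only model the behavior described above.

```python
from datetime import datetime, timedelta


def make_fake_clock(start, step):
    """Return a clock that advances by `step` each time it is read."""
    current = [start]

    def clock():
        value = current[0]
        current[0] = value + step
        return value

    return clock


def stamp_rows(batches, clock, capture_once):
    """Attach a timestamp to every row, read either once or once per batch."""
    load_time = clock() if capture_once else None
    stamped = []
    for batch in batches:
        ts = load_time if capture_once else clock()
        stamped.extend((row, ts) for row in batch)
    return stamped


start = datetime(2021, 2, 22, 11, 16, 32)
batches = [[1, 2], [3, 4], [5, 6]]

once = stamp_rows(batches, make_fake_clock(start, timedelta(seconds=1)), True)
per_batch = stamp_rows(batches, make_fake_clock(start, timedelta(seconds=1)), False)

# Number of distinct timestamps in each load.
print(len({ts for _, ts in once}), len({ts for _, ts in per_batch}))
```

With a single captured value, an equality filter on the timestamp identifies the load; with per-batch values, only a range query can, and a range can overlap other processes.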
Finally, since you're referencing a Script Task, you're already in .NET space so you can use Now/Today in your methods and not worry about passing an SSIS variable into the environment.

ABAP Program to notify Users X amount of days before user account will be disabled

I'm currently learning ABAP and trying to make an enhancement but have broken down in confusion on how to go about building on top of existing code. I have a program that runs periodically via a background job that disables user accounts X amount of days (in this case 90 days of inactive usage based on USR02~TRDAT).
I want to add an enhancement to notify the User via their email address (result of usr02~bname to match usr21~bname to pass the usr21~persnumber and usr21~addrnumber to adr6 which will point to the adr6~smtp_addr of the user, providing the usr02~bname -> adr6~smtp_addr relationship) based on their last logon date being 30, 15, 7, 5, 3, and 1 day away from the 90 day inactivity threshold with a link to the SAP system to help them reactivate the account with ease.
I'm beginning to think that an enhancement might not be a good idea, and that I should instead create a new program and schedule it as a daily background job. Any guidance or information would be greatly appreciated...
Extract
CLASS cl_inactive_users_reader DEFINITION.
  PUBLIC SECTION.
    TYPES:
      BEGIN OF ts_inactive_user,
        user_name          TYPE syst_uname,
        days_of_inactivity TYPE int1,
      END OF ts_inactive_user.
    TYPES tt_inactive_users TYPE STANDARD TABLE OF ts_inactive_user WITH EMPTY KEY.
    " Returns all users inactive for at least min_days_of_inactivity days
    CLASS-METHODS read_inactive_users
      IMPORTING
        min_days_of_inactivity TYPE int1
      RETURNING
        VALUE(result) TYPE tt_inactive_users.
ENDCLASS.
Then refactor
REPORT block_inactive_users.
DATA(inactive_users) = cl_inactive_users_reader=>read_inactive_users( 90 ).
LOOP AT inactive_users INTO DATA(inactive_user).
  " block user
ENDLOOP.
And add
REPORT warn_inactive_users.
DATA(inactive_users) = cl_inactive_users_reader=>read_inactive_users( 60 ).
LOOP AT inactive_users INTO DATA(inactive_user).
  CASE inactive_user-days_of_inactivity.
    " choose urgency
  ENDCASE.
  " send e-mail
ENDLOOP.
and run both reports daily.
Don't create a big ball of mud by squeezing new features into existing code.
From SAP wiki:
The enhancement concept allows you to add your own functionality to SAP's standard business applications without having to modify the original applications. To modify the standard SAP behavior as per customer requirements, we can use enhancement framework.
As per your description, it doesn't sound like a use case for an enhancement. It isn't an intervention in an existing process. The original process and your new requirement are two different processes with some mutual logical part - selection of days of inactivity of users. The two shouldn't rely on each other.
Structurally I think it is best to have a separate program for computing which e-mails need to be sent and when, and a separate program for actually sending them.
I would copy your original program to a new one, and modify it a little bit so that instead of disabling a user, it records into some table for each user: 1) an e-mail 2) a date when to send 3) how many days left (30, 15, 7, etc) 4) status if the e-mail was sent or not. Initially you can even have multiple such jobs for each period (30, 15, 7 etc) and pass it as a parameter (which you use inside instead of 90).
This program you run daily as a job and it populates that table with e-mail "tasks" of what needs to be sent today. It just adds new lines, so lines from yesterday should stay in there.
The 2nd program should just read that table and send actual e-mails and update the statuses. You run that program daily as well.
This way you have:
overview: just check the table to see what's going on
control: if the e-mailer dies or hangs, you can restart it and it will continue where it left off; with statuses you avoid sending duplicate mails
you can make sure that you don't send outdated e-mails if in your mailer script you ignore all tasks older than say 2 days
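The two-program split described above could look roughly like this. It is a Python sketch of the logic only, not ABAP: the task dictionary layout, the function names, and the example users and addresses are invented for illustration, while the 90-day limit and the 30/15/7/5/3/1 thresholds come from the question.

```python
from datetime import date, timedelta

LOCK_AFTER_DAYS = 90
WARN_WHEN_DAYS_LEFT = {30, 15, 7, 5, 3, 1}


def build_tasks(users, today):
    """Program 1: create one e-mail task per user who hits a threshold today."""
    tasks = []
    for user_name, email, last_logon in users:
        days_left = LOCK_AFTER_DAYS - (today - last_logon).days
        if days_left in WARN_WHEN_DAYS_LEFT:
            tasks.append({"user": user_name, "email": email, "send_on": today,
                          "days_left": days_left, "sent": False})
    return tasks


def send_pending(tasks, today, send, max_age_days=2):
    """Program 2: send unsent tasks, skipping ones that are already outdated."""
    for task in tasks:
        too_old = (today - task["send_on"]).days > max_age_days
        if task["sent"] or too_old:
            continue
        send(task["email"], task["days_left"])
        task["sent"] = True  # the status prevents duplicates on a restart


today = date(2021, 3, 1)
users = [
    ("ALICE", "alice@example.com", today - timedelta(days=60)),  # 30 days left
    ("BOB", "bob@example.com", today - timedelta(days=10)),      # 80 days left
]
outbox = []


def fake_send(email, days_left):
    outbox.append((email, days_left))


tasks = build_tasks(users, today)
send_pending(tasks, today, fake_send)
send_pending(tasks, today, fake_send)  # a rerun sends nothing new
print(outbox)
```

Keeping the task list separate from the mailer is what makes the rerun safe: the second call finds every task already marked as sent.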
I want to clarify your confusion about the use of enhancements:
You would want to use an enhancement when 'something' happens or is about to happen in the system and you want to change the standard way it is handled.
That something, let's call it an event or process, could be, for example, an order being placed, a certain user logging on to the system, or a material that has been or is about to be changed.
The change could be notifying another system of an order, or running additional checks on the logged-on user, for example checking his/her GUI version and warning him/her if it is not up to date.
Ask yourself what process on the system the execution of your program or code depends on. Does anything need to happen before the program is executed? No, only elapsing time.
Even if you had found an enhancement you would want to use: if the process using that enhancement were not run within 90 days, your mails would not be sent, because the enhancement would never have been called.
edit: That being said, if by enhancement you mean 'building on your existing program' instead of 'creating a new one', that is absolutely not the right terminology for an enhancement in the SAP universe.
I would extend the functionality of your existing program, since you already compute how many days are left and you would have only one job to maintain.

Why does BigtableIO write records one by one after a GroupBy/Combine DoFn?

Is anyone aware of how bundles work within BigtableIO? Everything looks fine until one uses a GroupBy or Combine DoFn. At that point, the pipeline changes the pane of our PCollection element from PaneInfo.NO_FIRING to PaneInfo{isFirst=true, isLast=true, timing=ON_TIME, index=0, onTimeIndex=0}, and BigtableIO then outputs the following log: INFO o.a.b.sdk.io.gcp.bigtable.BigtableIO - Wrote 1 records. Is the logging causing a performance issue when one has millions of records to output, or is it the fact that BigtableIO is opening and closing a writer for each record?
BigtableIO sends multiple records in a batch RPC. However, that assumes that multiple records are sent in the "bundle". Bundle sizes depend on a combination of the preceding step and the Dataflow framework. The problems you're seeing don't seem to be related to BigtableIO directly.
FWIW, here's the code for logging the number of records that occurs in the finishBundle() method.

How to create a Priority queue schedule in Autosys?

Technologies available: Autosys, Informatica, Unix scripting, Database (available via informatica)
How our batch currently works is with filewatchers looking for a file called "control.txt" which gets deleted when a feed starts processing. It gets recreated once completed which allows all "control" autosys jobs waiting, to have one pick up the control file and begin processing data feeds one by one.
However, the system has grown large, and some feeds have become more important than others, and we're looking at ways to improve our scheduler to prioritize feeds over others.
With the current design, in which a single file decides when the next feed runs, it can't be done, and I haven't been able to come up with a simple solution to make it happen.
Example:
1. Feed A is processing
2. Feed B, Feed C, Feed X, Feed F come in while Feed A is processing
3. Need to ensure that Feed B is processed next, even though C, X, F are ready.
4. C, X, F have a lower priority than A and B, but have the same priority and can process in any order
A very interesting question. One thing that I can think of is to have an extra Autosys job with a shell script that copies the files in a certain order, like this:
Create an input folder, e.g. StageFolder
Let's call your current Autosys input folder "the InputFolder"
Have Autosys monitor StageFolder and, for any file, run an OrderedFileCopyScript.sh every minute
OrderedFileCopyScript.sh should copy one file from StageFolder to InputFolder, in the desired order, only if InputFolder is empty
I hope I made myself clear.
I oppose the use of Autosys for this requirement! Wrong tool!
I don't know all the details, but I am assuming an application with the usual reference tables.
In this case you should make use of a feed reference table that includes relative priorities.
I would suggest creating (or reusing) a table to be loaded by the successor job of the file watcher.
1) The table contains the unprocessed files with their corresponding priorities; use this table to process the files in priority order.
2) Remove/archive the entries once done.
3) Have another job do this processing and run it like a daemon with start_times/run_window.
This gives the flexibility to deal with change in priorities and keeps overall design simple.
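The priority-table idea can be modeled with a small priority queue. The sketch below is Python rather than Autosys or Informatica; the FeedQueue class, the numeric priorities, and the default priority of 100 are assumptions chosen for illustration. Feeds with equal priority are served in arrival order.

```python
import heapq
from itertools import count


class FeedQueue:
    """Pending feeds ordered by priority (lower number first), then by arrival."""

    def __init__(self, priorities, default_priority=100):
        self._priorities = priorities      # plays the role of the feed reference table
        self._default = default_priority
        self._heap = []
        self._arrival = count()            # tie-breaker for equal priorities

    def add(self, feed):
        priority = self._priorities.get(feed, self._default)
        heapq.heappush(self._heap, (priority, next(self._arrival), feed))

    def next_feed(self):
        """Return the next feed to process, or None when nothing is waiting."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


# A and B are high priority; C, X and F share the default priority.
queue = FeedQueue({"A": 1, "B": 1})
for feed in ("C", "X", "B", "F"):          # these arrive while Feed A is processing
    queue.add(feed)

order = [queue.next_feed() for _ in range(4)]
print(order)
```

Changing a feed's priority then only means updating the reference data, not the job definitions.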