How to do batch processing with Apex?

How can I create a batch processing application with Apache Apex?
All the examples I've found are streaming applications, which means they never terminate. I would like my app to shut down once it has processed all the data.
Thanks

What is your use case? Native batch support is on the roadmap and is being worked on right now.
Until then, once you are sure that processing is done, the input operator can throw a ShutdownException, which propagates through the DAG and shuts it down.
Let us know if you need further details.

You can add an exit condition before running the app, for example:
public void testMapOperator() throws Exception
{
    LocalMode lma = LocalMode.newInstance();
    DAG dag = lma.getDAG();
    NumberGenerator numGen = dag.addOperator("numGen", new NumberGenerator());
    FunctionOperator.MapFunctionOperator<Integer, Integer> mapper
        = dag.addOperator("mapper", new FunctionOperator.MapFunctionOperator<Integer, Integer>(new Square()));
    ResultCollector collector = dag.addOperator("collector", new ResultCollector());
    dag.addStream("raw numbers", numGen.output, mapper.input);
    dag.addStream("mapped results", mapper.output, collector.input);
    // Create local cluster
    LocalMode.Controller lc = lma.getController();
    lc.setHeartbeatMonitoringEnabled(false);
    // Condition to exit the application
    ((StramLocalCluster)lc).setExitCondition(new Callable<Boolean>()
    {
        @Override
        public Boolean call() throws Exception
        {
            return TupleCount == NumTuples;
        }
    });
    lc.run();
    Assert.assertEquals(sum, 285);
}
For the complete code, see https://github.com/apache/apex-malhar/blob/master/stream/src/test/java/org/apache/apex/malhar/stream/FunctionOperator/FunctionOperatorTest.java

Related

How to determine job's queue at runtime

Our web app lets the end user set the queue of recurring jobs in the UI. (We create a queue for each server, named after the server, and let users choose which server a job runs on.)
How the job is registered:
RecurringJob.AddOrUpdate<IMyTestJob>(input.Id, x => x.Run(), input.Cron, TimeZoneInfo.Local, input.QueueName);
It works properly most of the time, but when checking the logs on Production we sometimes found that a job ran on the wrong queue (server). We have no further access to Production, so we tried to reproduce it on Development, but it did not happen there.
As a temporary fix, we need to get the queue name while the job is running, compare it with the current server name, and stop the job when they are different.
Is it possible to get it from PerformContext, and if so, how?
Note: we use Hangfire 1.7.9 and ASP.NET Core 3.1.
You may have a look at https://github.com/HangfireIO/Hangfire/pull/502
A dedicated filter intercepts the queue changes and restores the original queue.
I guess you can stop the execution in a very similar filter, or cleanly stop it during the IElectStateFilter.OnStateElection phase by changing the CandidateState to a FailedState.
Maybe your problem comes from an existing filter that interferes with the queues.
Here is the code from the link above:
public class PreserveOriginalQueueAttribute : JobFilterAttribute, IApplyStateFilter
{
    public void OnStateApplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
    {
        var enqueuedState = context.NewState as EnqueuedState;
        // Activating only when enqueueing a background job
        if (enqueuedState != null)
        {
            // Checking if an original queue is already set
            var originalQueue = JobHelper.FromJson<string>(context.Connection.GetJobParameter(
                context.BackgroundJob.Id,
                "OriginalQueue"));
            if (originalQueue != null)
            {
                // Override any other queue value that is currently set (by other filters, for example)
                enqueuedState.Queue = originalQueue;
            }
            else
            {
                // Queueing for the first time, we should set the original queue
                context.Connection.SetJobParameter(
                    context.BackgroundJob.Id,
                    "OriginalQueue",
                    JobHelper.ToJson(enqueuedState.Queue));
            }
        }
    }

    public void OnStateUnapplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
    {
    }
}
I found a simple solution: since we know the recurring job ID, we can get its information from JobStorage and compare its queue with the current queue (the current server name):
public bool IsCorrectQueue()
{
    // Dispose the storage connection once the recurring jobs have been read
    using (var connection = Hangfire.JobStorage.Current.GetConnection())
    {
        List<RecurringJobDto> recurringJobs = connection.GetRecurringJobs();
        var myJob = recurringJobs.FirstOrDefault(x => x.Id.Equals("My job Id"));
        var definedQueue = myJob.Queue;
        var currentServerQueue = string.Concat(Environment.MachineName.ToLowerInvariant().Where(char.IsLetterOrDigit));
        return definedQueue == "default" || definedQueue == currentServerQueue;
    }
}
Then check it inside the job:
public async Task Run()
{
    // Check correct queue
    if (!IsCorrectQueue())
    {
        Logger.Error("Wrong queue detected");
        return;
    }
    // Job logic
}

Custom command to go back in a process instance (execution)

I have a process with 3 sequential user tasks (something like Task 1 -> Task 2 -> Task 3). So, to validate Task 3, I first have to validate Task 1, then Task 2.
My goal is to implement a workaround to go back in the execution of a process instance using a Command, as suggested in this link. The problem is that I started to implement the command but it does not work as I want. The algorithm should be something like:
Retrieve the task with the passed id
Get the process instance of this task
Get the historic tasks of the process instance
From the list of the historic tasks, deduce the previous one
Create a new task from the previous historic task
Make the execution point to this new task
Maybe clean up the task the execution pointed to before the update
So the code of my command looks like this:
public class MoveTokenCmd implements Command<Void> {

    protected String fromTaskId = "20918";

    public MoveTokenCmd() {
    }

    public Void execute(CommandContext commandContext) {
        HistoricTaskInstanceEntity currentUserTaskEntity = commandContext.getHistoricTaskInstanceEntityManager()
                .findHistoricTaskInstanceById(fromTaskId);
        ExecutionEntity currentExecution = commandContext.getExecutionEntityManager()
                .findExecutionById(currentUserTaskEntity.getExecutionId());
        // Get process Instance
        HistoricProcessInstanceEntity historicProcessInstanceEntity = commandContext
                .getHistoricProcessInstanceEntityManager()
                .findHistoricProcessInstance(currentUserTaskEntity.getProcessInstanceId());
        HistoricTaskInstanceQueryImpl historicTaskInstanceQuery = new HistoricTaskInstanceQueryImpl();
        historicTaskInstanceQuery.processInstanceId(historicProcessInstanceEntity.getId()).orderByExecutionId().desc();
        List<HistoricTaskInstance> historicTaskInstances = commandContext.getHistoricTaskInstanceEntityManager()
                .findHistoricTaskInstancesByQueryCriteria(historicTaskInstanceQuery);
        int index = 0;
        for (HistoricTaskInstance historicTaskInstance : historicTaskInstances) {
            if (historicTaskInstance.getId().equals(currentUserTaskEntity.getId())) {
                break;
            }
            index++;
        }
        if (index > 0) {
            HistoricTaskInstance previousTask = historicTaskInstances.get(index - 1);
            TaskEntity newTaskEntity = createTaskFromHistoricTask(previousTask, commandContext);
            currentExecution.addTask(newTaskEntity);
            commandContext.getTaskEntityManager().insert(newTaskEntity);
            AtomicOperation.TRANSITION_CREATE_SCOPE.execute(currentExecution);
        } else {
            // TODO: find the last task of the previous process instance
        }
        // To overcome the "Task cannot be deleted because is part of a running
        // process"
        TaskEntity currentUserTask = commandContext.getTaskEntityManager().findTaskById(fromTaskId);
        if (currentUserTask != null) {
            currentUserTask.setExecutionId(null);
            commandContext.getTaskEntityManager().deleteTask(currentUserTask, "jumped to another task", true);
        }
        return null;
    }

    private TaskEntity createTaskFromHistoricTask(HistoricTaskInstance historicTaskInstance,
            CommandContext commandContext) {
        TaskEntity newTaskEntity = new TaskEntity();
        newTaskEntity.setProcessDefinitionId(historicTaskInstance.getProcessDefinitionId());
        newTaskEntity.setName(historicTaskInstance.getName());
        newTaskEntity.setTaskDefinitionKey(historicTaskInstance.getTaskDefinitionKey());
        newTaskEntity.setProcessInstanceId(historicTaskInstance.getExecutionId());
        newTaskEntity.setExecutionId(historicTaskInstance.getExecutionId());
        return newTaskEntity;
    }
}
The problem is that I can see my task is created, but the execution still points to the current task instead of the new one.
I had the idea of setting the activity on the execution (via an ActivityImpl object), but I don't know how to retrieve the activity of my new task.
Can someone help me, please?
Unless something has changed significantly in the engine, the code in the link you reference should still work (I have used it on a number of projects).
That said, when scanning your code I don't see the most important command.
Once you have the current execution, you can move the token by setting the current activity.
Like I said, the code in the referenced article used to work and still should.
Greg
Referring to the same link in your question, I would personally recommend reworking the design of your process. Use an exclusive gateway to decide whether the process should end or return to the previous task. If the tasks are generated dynamically, you can point to the same task and delete the local variable. Activiti has constructs that save you from implementing this yourself :).

usbManager openDevice call fails after several hundred successful attempts

I'm using the UsbManager class to manage USB host mode on my Android 4.1.1 machine.
Everything seems to work quite well for a few hundred transactions until, after about 900 transactions, opening the device fails, returning null without an exception.
Using a profiler, it doesn't seem to be a memory leak.
This is how I initialize the communication from my main activity (done once):
public class MainTestActivity extends Activity {

    private BroadcastReceiver m_UsbReceiver = null;
    private PendingIntent mPermissionIntent = null;
    UsbManager m_manager = null;
    DeviceFactory m_factory = null;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        mPermissionIntent = PendingIntent.getBroadcast(this, 0, new Intent(ACTION_USB_PERMISSION), 0);
        IntentFilter filter = new IntentFilter(ACTION_USB_PERMISSION);
        filter.addAction(UsbManager.ACTION_USB_DEVICE_DETACHED);
        m_UsbReceiver = new BroadcastReceiver() {
            public void onReceive(Context context, Intent intent) {
                String action = intent.getAction();
                if (UsbManager.ACTION_USB_DEVICE_DETACHED.equals(action)) {
                    UsbDevice device = (UsbDevice)intent.getParcelableExtra(UsbManager.EXTRA_DEVICE);
                    if (device != null) {
                        // call your method that cleans up and closes communication with the device
                        Log.v("BroadcastReceiver", "Device Detached");
                    }
                }
            }
        };
        registerReceiver(m_UsbReceiver, filter);
        m_manager = (UsbManager) getSystemService(Context.USB_SERVICE);
        m_factory = new DeviceFactory(this, mPermissionIntent);
    }
And this is the code of my test:
ArrayList<DeviceInterface> devList = m_factory.getDevicesList();
if (devList.size() > 0) {
    DeviceInterface devIf = devList.get(0);
    UsbDeviceConnection connection;
    try
    {
        connection = m_manager.openDevice(m_device);
    }
    catch (Exception e)
    {
        return null;
    }
The test will work OK for 900 to 1000 calls and after this the following call will return null (without exception):
UsbDeviceConnection connection;
try
{
    connection = m_manager.openDevice(m_device);
}
You might just run out of file handles, a typical limit would be 1024 open files per process.
Try calling close() on the UsbDeviceConnection, see doc.
The UsbDeviceConnection object holds system resources, e.g. a file descriptor, which in your code are released only when the object is garbage collected. In this case you run out of file descriptors before you run out of memory, so the garbage collector has not been invoked yet.
I had openDevice fail on repeated runs on Android 4.0 even though I open the device only once in my code. Some exit paths did not close the resources, and I had assumed the OS would free them on process termination.
However, there seems to be some issue with the release of resources on process termination: I had problems even when I terminated the process and launched a fresh one.
Once I ensured that resources were released on exit, the problem went away.
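To illustrate the file-handle explanation in the answers above, here is a minimal, self-contained Java sketch. It does not use the Android USB API: Pool and Connection are hypothetical stand-ins for the per-process handle limit and for UsbDeviceConnection, and the limit of 1024 is only the typical value mentioned above, not something measured on a device.

```java
/** Simulates a per-process handle limit; none of these types are Android APIs. */
final class UsbHandleDemo {

    /** Hypothetical stand-in for the per-process file descriptor table. */
    static final class Pool {
        private final int limit;
        private int inUse;

        Pool(int limit) {
            this.limit = limit;
        }

        /** Like openDevice(): returns null, without throwing, once the limit is reached. */
        Connection open() {
            if (inUse >= limit) {
                return null;
            }
            inUse++;
            return new Connection(this);
        }

        int inUse() {
            return inUse;
        }
    }

    /** Hypothetical stand-in for UsbDeviceConnection. */
    static final class Connection {
        private final Pool pool;
        private boolean closed;

        Connection(Pool pool) {
            this.pool = pool;
        }

        /** Releases the handle; safe to call more than once. */
        void close() {
            if (!closed) {
                closed = true;
                pool.inUse--;
            }
        }
    }

    /** Opens a connection per transaction and never closes it; returns the successful count. */
    static int transactLeaky(Pool pool, int transactions) {
        int succeeded = 0;
        for (int i = 0; i < transactions; i++) {
            Connection connection = pool.open();
            if (connection == null) {
                break; // out of handles: open() fails silently
            }
            succeeded++; // the actual transfer would happen here
        }
        return succeeded;
    }

    /** Same loop, but every connection is closed in a finally block. */
    static int transactClosing(Pool pool, int transactions) {
        int succeeded = 0;
        for (int i = 0; i < transactions; i++) {
            Connection connection = pool.open();
            if (connection == null) {
                break;
            }
            try {
                succeeded++; // the actual transfer would happen here
            } finally {
                connection.close(); // runs on every exit path
            }
        }
        return succeeded;
    }

    public static void main(String[] args) {
        System.out.println("leaky:   " + transactLeaky(new Pool(1024), 2000));
        System.out.println("closing: " + transactClosing(new Pool(1024), 2000));
    }
}
```

In this simulation the leaky loop stops after 1024 transactions, while the closing loop completes all 2000 and leaves no handle in use. On a real device the equivalent change would be calling close() on the UsbDeviceConnection in a finally block after each transaction.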

Is there an easy way to subscribe to the default error queue in EasyNetQ?

In my test application I can see messages that were processed with an exception being automatically inserted into the default EasyNetQ_Default_Error_Queue, which is great. I can then successfully dump or requeue these messages using the Hosepipe, which also works fine, but requires dropping down to the command line and calling against both Hosepipe and the RabbitMQ API to purge the queue of retried messages.
So I'm thinking the easiest approach for my application is to simply subscribe to the error queue, so I can re-process the messages using the same infrastructure. But in EasyNetQ, the error queue seems to be special. We need to subscribe using a proper type and routing ID, so I'm not sure what these values should be for the error queue:
bus.Subscribe<WhatShouldThisBe>("and-this", ReprocessErrorMessage);
Can I use the simple API to subscribe to the error queue, or do I need to dig into the advanced API?
If the type of my original message was TestMessage, then I'd like to be able to do something like this:
bus.Subscribe<ErrorMessage<TestMessage>>("???", ReprocessErrorMessage);
where ErrorMessage is a class provided by EasyNetQ to wrap all errors. Is this possible?
You can't use the simple API to subscribe to the error queue because it doesn't follow EasyNetQ queue type naming conventions - maybe that's something that should be fixed ;)
But the Advanced API works fine. You won't get the original message back, but it's easy to get the JSON representation which you could de-serialize yourself quite easily (using Newtonsoft.JSON). Here's an example of what your subscription code should look like:
[Test]
[Explicit("Requires a RabbitMQ server on localhost")]
public void Should_be_able_to_subscribe_to_error_messages()
{
    var errorQueueName = new Conventions().ErrorQueueNamingConvention();
    var queue = Queue.DeclareDurable(errorQueueName);
    var autoResetEvent = new AutoResetEvent(false);
    bus.Advanced.Subscribe<SystemMessages.Error>(queue, (message, info) =>
    {
        var error = message.Body;
        Console.Out.WriteLine("error.DateTime = {0}", error.DateTime);
        Console.Out.WriteLine("error.Exception = {0}", error.Exception);
        Console.Out.WriteLine("error.Message = {0}", error.Message);
        Console.Out.WriteLine("error.RoutingKey = {0}", error.RoutingKey);
        autoResetEvent.Set();
        return Task.Factory.StartNew(() => { });
    });
    autoResetEvent.WaitOne(1000);
}
I had to fix a small bug in the error message writing code in EasyNetQ before this worked, so please get a version >= 0.9.2.73 before trying it out. You can see the code example here
Code that works:
(I took a guess)
The awkward 'foo' variable is there because if I pass HandleErrorMessage2 directly into the Consume call, the compiler can't tell that it returns void rather than a Task, so it can't pick an overload (VS 2012).
Assigning it to a variable first resolves this.
You will want to keep the return value of the call so that you can unsubscribe by disposing the object.
Also note that EasyNetQ uses a type name (Queue) that clashes with a system type, so you have to add a using alias for the compiler, or fully qualify the name.
using Queue = EasyNetQ.Topology.Queue;

private const string QueueName = "EasyNetQ_Default_Error_Queue";

public static void Should_be_able_to_subscribe_to_error_messages(IBus bus)
{
    Action<IMessage<Error>, MessageReceivedInfo> foo = HandleErrorMessage2;
    IQueue queue = new Queue(QueueName, false);
    bus.Advanced.Consume<Error>(queue, foo);
}

private static void HandleErrorMessage2(IMessage<Error> msg, MessageReceivedInfo info)
{
}

Async Web Service call from Silverlight 3

I have a question regarding the sequencing of events when calling a WCF service from Silverlight 3 and updating the UI on a separate thread. Basically, I would like to know whether what I am doing is correct. This is my first post here, so bear with me, because I am not sure how to post actual code. The sample is as follows:
// Starts the asynchronous request for the user name.
public static void Load(string userId)
{
    // Build the request.
    GetUserNameRequest request =
        new GetUserNameRequest { UserId = userId };
    // Open the connection.
    instance.serviceClient = ServiceController.UserService;
    // Make the request.
    instance.serviceClient.GetUserNameCompleted
        += UserService_GetUserNameCompleted;
    instance.serviceClient.GetUserNameAsync(request);
    return instance.VM;
}

// Handles the response of the asynchronous request.
private static void UserService_GetUserNameCompleted(object sender, GetUserNameCompletedEventArgs e)
{
    try
    {
        Controller.UIDispatcher.BeginInvoke(() =>
        {
            // Load the response.
            if (e.Result != null && e.Result.Success)
            {
                LoadResponse(e.Result);
            }
            // Completed loading data.
        });
    }
    finally
    {
        instance.serviceClient.GetUserNameCompleted
            -= UserService_GetUserNameCompleted;
        ServiceHelper.CloseService(instance.serviceClient);
    }
}
So my question is: if loading the response on the UI thread throws an exception, will the "finally" block catch it? If not, should I put another try/catch inside the lambda where I load the response?
Also, since I am executing the load on the UI thread, is it possible that the finally block will execute before the UI thread is done updating, and as a result call ServiceHelper.CloseService() before the load has been done?
I ask because I am having intermittent problems using this approach.
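The finally block will normally execute before the response is processed inside the BeginInvoke lambda. BeginInvoke only queues the code; it runs later, in the next UI cycle. For the same reason, an exception thrown inside the lambda is raised on the UI thread after the try/finally has already completed, so the lambda needs its own try/catch.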
Typically the best approach to this type of thing is to pull all the data you need out of the response and store it in a variable and then clean up your service code. Then make a call to BeginInvoke and update the UI using the data in the variable.
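The ordering described above can be sketched as follows. This is a rough illustration in Java rather than Silverlight C#: a single-thread executor stands in for the UI dispatcher, and ServiceClient, readUserName and the "alice" value are hypothetical names made up for the example, not part of any WCF proxy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class CopyThenDispatchDemo {

    /** Hypothetical service client; its result can only be read while it is open. */
    static final class ServiceClient {
        private boolean open = true;

        String readUserName() {
            if (!open) {
                throw new IllegalStateException("client already closed");
            }
            return "alice";
        }

        void close() {
            open = false;
        }
    }

    /** Completion handler: copy the data, clean up, then queue the UI update. */
    static Future<?> onCompleted(ServiceClient client, ExecutorService ui, List<String> log) {
        final String userName;
        try {
            userName = client.readUserName(); // 1. copy what the UI needs
        } finally {
            client.close();                   // 2. clean up on the calling thread
            log.add("closed");
        }
        // 3. The queued work only touches the local copy, never the client.
        return ui.submit(() -> {
            log.add("ui:" + userName);
        });
    }

    /** Runs the handler once and returns the recorded order of events. */
    static List<String> runDemo() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        ExecutorService ui = Executors.newSingleThreadExecutor(); // stands in for the UI dispatcher
        try {
            onCompleted(new ServiceClient(), ui, log).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            ui.shutdown();
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because the user name is copied into a local variable before the client is closed, the queued UI update no longer depends on when the cleanup runs. The recorded order is "closed" first, then the UI update.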