We are working on a problem wherein we have multiple Camunda workflows running at the same time. All of them have the same context.
For example, each of them has the same user tasks, service tasks, and process variables; only the ordering can differ. Some workflows may run these tasks in parallel, others sequentially.
Context: each workflow (BPMN file) represents the same journey through a different medium. One might be the journey via PC, another the journey via mobile.
We need the workflow states to stay in sync throughout the life of the application. The business data is decoupled from Camunda and lives in a centralised place, so it always remains in sync. But we also want to sync the Camunda process variables, task completion states, and other such workflow parameters.
If a person completes step A, it should be reflected in the other workflows as well, in near real time.
Is there a suggested design/way to handle this?
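For concreteness, here is a minimal sketch of what mirroring a completed step onto the sibling instances could look like with the Camunda Java API, assuming every journey instance is started with the same business key and the models share task definition keys. The listener class, the variable names stepACompleted and mirroredCompletion, and the wiring are illustrative only; transaction boundaries and race conditions would still need careful handling.

```java
import org.camunda.bpm.engine.RuntimeService;
import org.camunda.bpm.engine.TaskService;
import org.camunda.bpm.engine.delegate.DelegateTask;
import org.camunda.bpm.engine.delegate.TaskListener;
import org.camunda.bpm.engine.runtime.ProcessInstance;
import org.camunda.bpm.engine.task.Task;

/**
 * Hypothetical "complete" task listener attached to user task A in every
 * journey model. When A is completed in one instance, it completes the
 * matching open task in the sibling instances that share the same business key
 * and mirrors a process variable there.
 */
public class MirrorStepCompletionListener implements TaskListener {

  @Override
  public void notify(DelegateTask completedTask) {
    // Guard: if this completion was itself triggered by the mirroring below, stop here.
    if (Boolean.TRUE.equals(completedTask.getVariableLocal("mirroredCompletion"))) {
      return;
    }

    RuntimeService runtimeService = completedTask.getProcessEngineServices().getRuntimeService();
    TaskService taskService = completedTask.getProcessEngineServices().getTaskService();

    String businessKey = completedTask.getExecution().getProcessBusinessKey();
    String ownInstanceId = completedTask.getProcessInstanceId();

    // All journey instances (PC, mobile, ...) started with the same business key.
    for (ProcessInstance sibling : runtimeService.createProcessInstanceQuery()
        .processInstanceBusinessKey(businessKey).list()) {
      if (sibling.getId().equals(ownInstanceId)) {
        continue; // skip the instance that triggered the sync
      }

      // Mirror the relevant process variables (names are illustrative).
      runtimeService.setVariable(sibling.getId(), "stepACompleted", true);

      // Complete the same user task in the sibling instance, if it is currently open.
      Task siblingTask = taskService.createTaskQuery()
          .processInstanceId(sibling.getId())
          .taskDefinitionKey(completedTask.getTaskDefinitionKey())
          .singleResult();
      if (siblingTask != null) {
        taskService.setVariableLocal(siblingTask.getId(), "mirroredCompletion", true);
        taskService.complete(siblingTask.getId());
      }
    }
  }
}
```

In each BPMN model the class would be registered as a "complete" task listener on the shared user tasks; the mirroredCompletion local variable is only there to stop the listeners from triggering each other in a loop.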
I want to manage a huge workflow in Camunda.
I have decided to split this into different processes like Create, Configuration, Review & Confirm. Each of these processes has 10 to 15 tasks. These processes should be executed in sequence.
If I want to design my workflow like this, how will I link each process? What is the proper way to structure a modular Camunda design?
You would probably go with some kind of sub-process. If you plan to model different processes, you will most likely use Call Activities and execute them one after another in some kind of root process.
Beware of the fact that each called sub-process starts its own process instance, so you have to handle different execution scopes. That becomes relevant when you request information from the system, e.g. the list of user tasks: you cannot use the processInstanceId of the root process in this case and will have to use a businessKey instead.
You also have to handle the process variables and decide which variables you want to propagate to the sub-processes.
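A minimal sketch of that business-key approach with the Camunda Java API (the process definition key "rootProcess" and the method names are illustrative):

```java
import java.util.List;
import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.task.Task;

public class RootProcessClient {

  // Start the root process with a business key so that tasks created inside
  // the called sub-processes can later be found without knowing their
  // individual process instance ids.
  public void startJourney(ProcessEngine engine, String businessKey) {
    engine.getRuntimeService()
        .startProcessInstanceByKey("rootProcess", businessKey);
  }

  // Query user tasks across the whole call-activity hierarchy by business key
  // instead of by the root processInstanceId.
  public List<Task> openUserTasks(ProcessEngine engine, String businessKey) {
    return engine.getTaskService()
        .createTaskQuery()
        .processInstanceBusinessKey(businessKey)
        .list();
  }
}
```

Note that the business key only reaches the called process if the call activity passes it on, e.g. with camunda:in businessKey="#{execution.processBusinessKey}"; the same camunda:in / camunda:out mappings are where you decide which process variables get propagated.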
I am modeling some processes to be used by non-IT people (i.e. they need to be as clear as possible, but I also don't want to break any BPMN rules).
I attached a mockup of what I'm trying to show: a person performs some steps in the system, but it's also important for the people reading the model to understand what the system does after each of the user steps (e.g. that the system automatically calculates a risk score). What's the best practice to model this in BPMN? I assume in any case (read: if this is a good approach in general) it is a pool, not a lane - but in this case the system pool would also need a start and finish, right?
The system is part of your organization, so model it as a separate lane in the same pool as the rest of your process.
To indicate whether a step is automated or done by a user, use the task types: a script task for steps performed automatically by the system and a user task for those performed by a person.
Activities within the same pool are connected with solid sequence flows to indicate the business flow.
If you use MDA/CIM, the system is not modeled as part of the process (as a lane). Software is a tool, not a role.
(PS: two pools, one for the company and a second for the system, is bad practice; in BPMN, one pool represents one process.)
We use an "activity to use case" mapping to show where the system is used.
How should we best handle code that is part of a single Rails app, but is used in several different "modes"?
We have several different cases of an app that is driven from the same data sources (MySQL, MongoDB, SOLR) and shares core logic, assets, etc. across multiple different uses.
Background/details:
HTML vs REST API
A common scenario is that we have HTML and REST interfaces. These differences are handled through routing (e.g. /api/v1/user/new vs /user/new) -- with minor differences they provide the same functions. This seems reasonably clean to me.
Multi-tenant
Another common scenario is that the app is "multi-tenant", determined mainly by the subdomain of the URL, e.g. partner1.example.com and partner2.example.com (or a query-string parameter for API customers) -- each has a number of features or properties that differ. This is handled by a filter in ApplicationController using data largely stored in a set of tenant-specific database tables, with tenant-specific functionality encapsulated in methods. This also seems reasonably clean to me.
Offline Tasks
One scenario is that a great deal of the data is acquired through a very large number of tasks, running pretty much continuously: feed loaders, scrapers, crawlers, and other tasks of this sort ... the kinds of things you would find in a search engine, which is a large part of what we do. These tasks are launched on idle server instances and run periodically ... but are just rake tasks that are part of the app.
These tasks are characteristically different than our front-end code -- they update data, run calculations, do maintenance tasks and so on -- some tasks run for days (e.g. update 30M documents from an external web service). In the end, these tasks create and keep fresh the core data that our front end app uses.
This one doesn't seem as clean to me; in particular, in some cases these tasks are running and doing data updates at the same time as our application is using that data, so they occasionally need to defer to the front-end app when we're under peak load.
Major Variants of the App
This last case is clearly wrong -- we have made major customizations of our app -- 15% or 20% different, by making branches and then running as an entirely separate app, sharing some of the core data sometimes, but using some of its own data other times. We have mostly fixed this now, as it was, of course, untenable.
OK, there's a question in here somewhere, right?
So in particular for the offline tasks I feel like the app really needs to be launched in a "mode" or perhaps "sub-environment". But we still have normal development, test, qa, demo, pre_release, production environments that have their own isolated data and other configuration parameters. For each of these, we want to be able to run, develop, test and deploy the various "modes" of the application.
Can anyone suggest an appropriate architecture that is similar to the declarative notions of standard Rails environments?
If the number of modes is ever-increasing:
Perhaps the offline tasks could be separated from the main app, into their own application (or a parent abstract task with actual tasks inheriting from it and deployed individually).
If the number of modes is relatively small and won't be changing often:
You could put the per-mode configuration into a config file, logically separate from the rest of the code. Then during the deployments, you would be able to provide a combination of (environment, mode, set of hosts) and get a good level of control of your environments while using the same codebase.
The way my team currently schedules jobs is through the SQL Server Job Agent. Many of these jobs have dependencies on other internal servers which in turn have their own SQL Server Jobs that need to be run to keep their data up to date.
This has created dependencies in the start time and length of each of our SQL Server jobs. Job A might depend on Job B finishing, so we schedule Job B a certain estimated time ahead of Job A. This whole process is very subjective and doesn't scale as we add more jobs and servers, which create more dependencies.
I would love to get out of the business of subjectively scheduling these jobs and hoping that the dominos fall in the right order. I am wondering what the accepted practices for scheduling SQL Server jobs are. Do people use SSIS to chain jobs together? Is there tooling already built into the SQL Server Job Agent to handle this?
What is the accepted way to handle the scheduling of multiple SQL Server jobs with dependencies on each other?
I have used Control-M before to schedule multiple inter-dependent jobs in different environments. Control-M generally works by using batch files (from what I remember) to execute SSIS packages.
We had a complicated environment hosting two data warehouses side by side (one international and one US local). There were jobs that were dependent on other jobs, and those jobs on others, and so on, but by using Control-M we could easily define the dependencies (it has a really nice and intuitive GUI). Another tool that comes to mind is Tidal Scheduler.
There is no set standard for job scheduling, but I think it's safe to say that job schedules depend entirely on what an organization needs. For example, Finance jobs might be dependent on Sales, and Sales on Inventory, and so on. The point is, if you need inter-dependent jobs, using third-party software such as Control-M is a safe bet. It can control jobs in different environments and give you a real sense of company-wide job control.
We too had a requirement to manage dependencies between multiple agent jobs. After looking at various 3rd-party tools and discounting them for various reasons (mainly internal constraints relating to the use of 3rd-party software), we decided to create our own solution.
The solution centres around a configuration database that holds details about processes (jobs) that need to run and how they are grouped (batches), along with the dependencies between processes.
Summary of configuration tables used:
Batch - high-level definition of a group of related processes; includes metadata such as max concurrent processes, current batch instance, etc.
Process - metadata relating to a process (job), such as name, max wait time, earliest run time, status (enabled/disabled), batch (which batch the process belongs to), process job name, etc.
Batch Instance - the active instance of a given batch
Process Instance - active instances of processes for a given batch
Process Dependency - dependency matrix
Batch Instance Status - lookup for batch instance status
Process Instance Status - lookup for process instance status
Each batch has two control jobs - START BATCH and UPDATE BATCH. The first starts all processes that belong to the batch; the second is the last to run in any given batch and updates the outcome statuses.
Each process has an agent job associated with it that gets executed by the START BATCH job. Processes have capped concurrency (defined in the batch configuration), so processes are started up to a maximum of x at a time, and START BATCH then waits until a free slot becomes available before starting the next process.
The process agent job steps call a templated SSIS package that deals with the actual ETL work and with the decision-making around whether the process needs to run, has to wait for dependencies, etc.
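The actual implementation lives in the agent jobs and the templated SSIS package driven by the configuration tables, but purely as an illustration of the slot-capping and dependency-wait logic described above, here is a rough, hypothetical sketch in Java (the tables are reduced to a max-concurrency number and a dependency map; all names are made up):

```java
import java.util.*;
import java.util.concurrent.*;

/**
 * Generic sketch of the START BATCH idea: run processes up to a concurrency
 * cap, and start each process only once all of its dependencies have finished.
 */
public class BatchRunner {

  private final int maxConcurrentProcesses;            // from the Batch table
  private final Map<String, Set<String>> dependencies; // from the Process Dependency matrix
  private final Set<String> finished = ConcurrentHashMap.newKeySet();

  public BatchRunner(int maxConcurrentProcesses, Map<String, Set<String>> dependencies) {
    this.maxConcurrentProcesses = maxConcurrentProcesses;
    this.dependencies = dependencies;
  }

  public void runBatch(Map<String, Runnable> processes) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(maxConcurrentProcesses); // the "slots"
    CountDownLatch allDone = new CountDownLatch(processes.size());

    Deque<String> pending = new ArrayDeque<>(processes.keySet());
    while (!pending.isEmpty()) {
      String name = pending.poll();
      if (finished.containsAll(dependencies.getOrDefault(name, Collections.emptySet()))) {
        pool.submit(() -> {                 // occupies one of the capped slots
          try {
            processes.get(name).run();      // the actual work (ETL package, job, ...)
            finished.add(name);
          } finally {
            allDone.countDown();
          }
        });
      } else {
        pending.addLast(name);              // dependencies not met yet, re-queue
        Thread.sleep(1000);                 // crude wait-for-dependency poll
      }
    }
    allDone.await();                        // everything finished before final bookkeeping
    pool.shutdown();
  }
}
```

In the real solution the "slots" are agent jobs started by START BATCH, the dependency map is the Process Dependency table, and the final wait corresponds to UPDATE BATCH recording the outcome statuses.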
We are currently looking to move to a Service Broker solution for greater flexibility and control.
Anyway, this is probably too much detail and not enough example, so the VS2010 project is available on request.
I'm not sure how much this will help, but we ended up creating an email solution for scheduling.
We built an email reader that accesses an Exchange mailbox. As jobs finish, they send an email to the mail reader to start another job. The other nice part is that most applications have email notifications built in, so there really isn't much in the way of custom programming.
We really only built it in the first place to handle data files coming in from lots of other partners. It was much easier to give them an email address than to set them up with an FTP site, etc.
The mail reader app now has grown to include basic filtering, time of day scheduling, use of semaphores to prevent concurrent jobs, etc. It really works great.
Given a SQL Server-persisted .NET 4 Windows Workflow Foundation (WF) workflow service deployed under AppFabric, how can I "jump" the service from one activity to another? The workflow could be sequential or flowchart.
The use case is administrative. A long-running workflow is idle at Receive activity A. Some client mistakenly calls the service, progressing it to Receive activity B. The workflow (which could be embedded in a larger workflow) has no path back to A. The client calls the support desk and requests that the workflow be set back to A.
We've seen this case occur frequently in production. Our existing BPM system supports a "goto" call. How can this be accomplished in WF 4?
EDIT: If the above is not practical, what is a good design pattern for implementing a "fail" activity off the "happy path" that can branch to one of a limited number of known prior activities (restart from here) based on a variable? The goal is to avoid creating an unreadable workflow with a multitude of lines.
EDIT 2: We decided not to go this route, but there's a newer MSDN article on doing just this.
EDIT 3: We changed our minds again and are going with Leon Welicki's solution from the MSDN article linked above. :)
This can't be done out of the box.
If it can be done at all, it would mean opening up the workflow state, which is stored in four binary columns, and changing it back to the previous state, knowing that any number of activities could have executed and any variables could have been changed or even dropped because they are no longer in scope.
If I were going to try this, I would copy the state from the SQL database every time a workflow went idle, so you end up with a sort of stack of all previous idle states of a workflow. Then at some later time, when the workflow is idle and not in memory, you could replace the current state with a previous state and reload the workflow. I have never tried it, so I don't know whether it will work, and I see quite a few potential problems, such as DB transactions having already completed, or emails having been sent but being executed a second time.