Save history of incremental changes to a flow without cloning it in Mosaic Decisions - mosaic-decisions

While configuring a particular data pipeline in Mosaic Decisions, I want to try out different operations by using the available process nodes. I would like to keep the first few configured nodes for future reference and continue to add some other nodes.
To do this, I'm currently cloning the flow after each incremental change. But because of this, many flows get configured and it becomes very difficult to keep track of them.
Is there any alternative way to save the history of these multiple configurations of the flow for future reference without cloning and executing them separately?

You can save the history of changes you have made to the flow by saving it as a version, using the Save As Version option provided in the canvas header.
You can also add a description for each incremental step and edit a particular version later if you want. Later, you can execute each saved version separately by publishing that version from the Version tab and then executing it normally.

Related

Update old processes with the new process definition - Activiti

I have some processes that ran with old process definitions. But due to a requirement change, the user task has been updated with new attributes and this new process definition has been deployed. I'm aware that "SetProcessDefinitionVersionCmd" can be used to point the processes to the new definition/version.
I would like to know how to migrate the old process data so that the newly added attributes of the user task are updated in it.
There is no easy way to migrate process instance data. However, when you set an instance to the new process definition version, the instance data goes along with the migrated instance.
What you have to be careful of is to include null checks for any of the data that may not be present in the migrated process instances.
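For reference, here is a minimal sketch of the migration in code; the process instance id, version number, and variable name are hypothetical, and SetProcessDefinitionVersionCmd lives in Activiti's impl package, so treat it as an internal API:

```java
import org.activiti.engine.ProcessEngine;
import org.activiti.engine.ProcessEngines;
import org.activiti.engine.impl.cmd.SetProcessDefinitionVersionCmd;

public class MigrateProcessInstance {
    public static void main(String[] args) {
        ProcessEngine engine = ProcessEngines.getDefaultProcessEngine();

        // Point an existing process instance at version 2 of its definition.
        engine.getManagementService()
              .executeCommand(new SetProcessDefinitionVersionCmd("myProcessInstanceId", 2));

        // Variables created before the migration will not have the new attributes,
        // so null-check them (and optionally backfill a default).
        Object newAttribute = engine.getRuntimeService()
                                    .getVariable("myProcessInstanceId", "newAttribute");
        if (newAttribute == null) {
            engine.getRuntimeService()
                  .setVariable("myProcessInstanceId", "newAttribute", "defaultValue");
        }
    }
}
```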
Hope this helps,
Greg
Indeed, there is no easy way to migrate. However, depending on the differences between the two definitions and on the extent to which you would rather avoid SetProcessDefinitionVersionCmd, you may find DynamicBpmnService useful when combined with detecting the definitions' versions inside your logic.
And yes, another way would be to use SetProcessDefinitionVersionCmd, but be extra cautious about tasks that were actually active prior to migration. Since Activiti's database model has some redundant data (some of it for performance reasons), you are better off studying the DB tables for these tasks first and then inspecting the before and after migration state. For example, keeping up with a simple changed attribute is much easier than with a boundary event added to an active User Task, which affects the "execution tree".
I would also advise comparing SetProcessDefinitionVersionCmd's implementations between Activiti and Camunda; it is sad to have such enhancement efforts separated, but that is another story.
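To illustrate the DynamicBpmnService route mentioned above, a rough sketch (assuming a recent Activiti version that exposes DynamicBpmnService, and hypothetical task and process definition ids):

```java
import org.activiti.engine.DynamicBpmnService;
import org.activiti.engine.ProcessEngine;
import org.activiti.engine.ProcessEngines;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class DynamicTaskTweak {
    public static void main(String[] args) {
        ProcessEngine engine = ProcessEngines.getDefaultProcessEngine();
        DynamicBpmnService dynamicBpmnService = engine.getDynamicBpmnService();

        // Override a property of the user task "reviewTask" for one specific
        // process definition without deploying a new version (ids are made up).
        ObjectNode infoNode = dynamicBpmnService.changeUserTaskAssignee("reviewTask", "kermit");
        dynamicBpmnService.saveProcessDefinitionInfo("myProcess:1:1234", infoNode);
    }
}
```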

Call a pipeline from a pipeline in Amazon Data Pipeline

My team at work is currently looking for a replacement for a rather expensive ETL tool that, at this point, we are using as a glorified scheduler. We have improved on any of the integrations offered by the ETL tool with our own Python code, so I really just need its scheduling ability. One option we are looking at is Data Pipeline, which I am currently piloting.
My problem is thus: imagine we have two datasets to load - products and sales. Each of these datasets requires a number of steps to load (get source data, call a python script to transform, load to Redshift). However, product needs to be loaded before sales runs, as we need product cost, etc to calculate margin. Is it possible to have a "master" pipeline in Data Pipeline that calls products first, waits for its successful completion, and then calls sales? If so, how? I'm open to other product suggestions as well if Data Pipeline is not well-suited to this type of workflow. Appreciate the help
I think I can relate to this use case. Anyhow, Data Pipeline does not do this kind of dependency management on its own. It can, however, be simulated using file preconditions.
In this example, your child pipelines may depend on a file being present (as a precondition) before starting. A master pipeline would create trigger files based on some logic executed in its activities. A child pipeline may create other trigger files that will start a subsequent pipeline downstream.
Another solution is to use the Simple Workflow product. That has the features you are looking for, but it would need custom coding using the Flow SDK.
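A rough sketch of what the trigger-file approach can look like in a child pipeline's definition, using an S3KeyExists precondition (the bucket, key, commands, and object ids are made up; the schedule and resource objects are omitted):

```json
{
  "objects": [
    {
      "id": "ProductLoadedTrigger",
      "name": "ProductLoadedTrigger",
      "type": "S3KeyExists",
      "s3Key": "s3://my-etl-bucket/triggers/product-loaded"
    },
    {
      "id": "LoadSales",
      "name": "LoadSales",
      "type": "ShellCommandActivity",
      "command": "python load_sales.py",
      "precondition": { "ref": "ProductLoadedTrigger" }
    }
  ]
}
```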
This is a basic use case of Data Pipeline and should definitely be possible. You can use the graphical pipeline editor for creating this pipeline. Breaking down the problem:
There are two datasets:
Product
Sales
Steps to load these datasets:
Get source data: Say from S3. For this, use S3DataNode
Call a python script to transform: Use ShellCommandActivity with staging. Data Pipeline does data staging implicitly for S3DataNodes attached to a ShellCommandActivity; you can access the staged data through the special environment variables it provides (see the documentation for details).
Load output to Redshift: Use RedshiftDatabase
You will need to add the above components for each of the datasets you need to work with (product and sales in this case). For easy management, you can run these on an EC2 instance.
Condition: 'product' needs to be loaded before 'sales' runs
Add a dependsOn relationship: add this field on the ShellCommandActivity of Sales so that it refers to the ShellCommandActivity of Product. See the dependsOn field in the documentation; it says: 'One or more references to other Activities that must reach the FINISHED state before this activity will start'. (A sketch of how these pieces fit together follows the tip below.)
Tip: In most cases, you would not want your next day's execution to start while the previous day's execution is still active (RUNNING). To avoid such a scenario, use the 'maxActiveInstances' field and set it to '1'.
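Putting the activity-level pieces together, a minimal sketch of the relevant part of a pipeline definition (object ids, commands, and the EC2 resource name are hypothetical; the S3 data nodes, schedule, and Redshift load are omitted for brevity):

```json
{
  "objects": [
    {
      "id": "TransformProduct",
      "name": "TransformProduct",
      "type": "ShellCommandActivity",
      "command": "python transform_product.py",
      "maxActiveInstances": "1",
      "runsOn": { "ref": "MyEc2Resource" }
    },
    {
      "id": "TransformSales",
      "name": "TransformSales",
      "type": "ShellCommandActivity",
      "command": "python transform_sales.py",
      "maxActiveInstances": "1",
      "runsOn": { "ref": "MyEc2Resource" },
      "dependsOn": { "ref": "TransformProduct" }
    }
  ]
}
```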

How to track a process's status in TIBCO?

I hope you can show me a resolution for my case.
When I define many processes, how do I get the tracking status data of those processes? In other words, I want to get a process's history. My purpose is to show it to my client for checking.
I have defined a process that communicates with 3 applications and deployed it to the client. But unfortunately, my client would like to add one more application (up to 4 apps) in the future. I wonder how to do that? Perhaps I open the process again and edit it. Is there a way to create a dynamic process?
Thanks very much.
PVA.
You get a very limited "history" in TIBCO Administrator (more or less which process instances completed with success/failure; in case of failure it will also provide the exception and where in the process it failed). However, that doesn't show you any tracking of the individual steps/activities that the process passed through. For this, you'd either have to put lots of logging steps into your process (and build something that parses this information from the log files), or you could use BusinessWorks ProcessMonitoring, which gives you a full history trail for each process automatically. However, it is not included with BW and you'll probably need a separate license.
Change the process in TIBCO Designer, build a new EAR file, and re-deploy the new EAR file in TIBCO Administrator.

Accurev - why not Auto-Update?

Why isn't it standard behavior for Accurev to automatically run an "Update" upon opening the program? "Update" updates a user's local sandbox with the latest files from the building/promoted area.
It seems like expected functionality that the most recent files should be synchronized first.
I'm not claiming that it should always update, but curious as to why an auto-Update wouldn't be correct.
Auto-updating could produce some very unwanted results.
Take this scenario: you're in the middle of a development task, but you've made a mistake and need to revert a file that you just modified. So you open AccuRev, but before you have a chance to "revert to most recent version", you are bombarded with 100 files that have been changed upstream including the one you want to revert. You are now forced into the position of resolving all the merge conflicts before your solution will build, including the merge of your (possibly unstable) code in progress.
Requiring the user to manually update keeps a protective 'bubble' around the developer, allowing them to commit (keep) changes within their own workspace without bringing down code changes that could destabilise the work in their sandbox. When the developer gets to a point where his code is ready to share with others, that is the appropriate time to do an update and subsequently build/retest the merged codebase before promoting.
However, there is one scenario in which I do believe auto-updating could be useful: after a workspace is reparented, i.e. when a developer's workspace is moved from one part of the stream hierarchy to another. Every time we reparent, we have to do a little dance:
Accept the confirmation dialog that reminds us (rather verbosely) that we need to update our workspace before we can promote any changes.
Double-click the workspace to view its files.
Wait for AccuRev to do a "Pending" search, to determine whether any file changes are waiting to be committed.
And finally, perform the Update.
Instead of just giving us a confirmation dialog, it would be nice if AccuRev could just ask us if we want to Update immediately.
I guess it depends on preference. I, for one, wouldn't like the auto-update feature.
Imagine you have a huge project and you don't want to rebuild it every time you start AccuRev; but after an automatic update you also can't debug, because the source files and the debugging info no longer correspond.

TeamCity: Managing deployment dependencies for acceptance tests?

I'm trying to configure a set of build configurations in TeamCity 6 and am trying to model a specific requirement in the cleanest possible manner enabled by TeamCity.
I have a set of acceptance tests (around 4-8 suites of tests grouped by the functional area of the system they pertain to) that I wish to run in parallel (I'll model them as build configurations so they can be distributed across a set of agents).
From my initial research, it seems that having an AcceptanceTests meta-build config that pulls in the set of individual acceptance test configs via snapshot dependencies should do the trick. Then all I have to do is say that my Commit build config should trigger AcceptanceTests and they'll all get pulled in. So, let's say I also have AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC.
So far, so good (I know I could also turn it around the other way and have the Commit config trigger AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC - the problem there is that I need to manually aggregate the results to determine the overall success of the acceptance tests as a whole).
The complicating bit is that while AcceptanceSuiteC just needs some Commit artifacts and can then live on its own, AcceptanceSuiteA and AcceptanceSuiteB need to:
DeploySite (let's say it takes 2 minutes and I can't afford to spin up a completely isolated one just for this run)
Run tests against the deployed site
The problem is that I need to be able to ensure that:
the website only gets configured once
The website does not get clobbered while the two suites are running
If I set up DeploySite as a build config and have AcceptanceSuiteA and AcceptanceSuiteB pull it in as a snapshot dependency, AFAICT:
a subsequent or parallel run of AcceptanceSuiteB could trigger another DeploySite which would clobber the deployment that AcceptanceSuiteA and/or AcceptanceSuiteB are in the middle of using.
While I can set 'Limit the number of simultaneously running builds' to force only one to happen at a time, I need it to run one at a time and not while the dependent pieces are still running.
Is there a way in TeamCity to model such a hierarchy?
EDIT: Ideas:
A crap solution is that DeploySite could set an 'in use' flag marker and then have the AcceptanceTests config clear that flag [after AcceptanceSuiteA and AcceptanceSuiteB have completed]. The problem then becomes one of having the next DeploySite down the pipeline wait until said gate has been opened again (doing a blocking wait within the build doesn't feel right - I want it to be flagged as 'not yet started' rather than looking like it's taking a long time to do something). However, this sort of 'stuff a flag over here and have this bit check it' approach is exactly the kind of mutable state / flakiness smell I'm trying to get away from.
EDIT 2: If I could programmatically alter the agent configuration, I could set Agent Requirements to require InUse=false, and then set the flag when a deploy starts and clear it after the tests have run.
It seems you should go look on the JetBrains DevNet and the YouTrack tracker first, and remember to use the magic word 'clobber' in your search.
Then you install groovy-plug and use the StartBuildPrecondition facility:
To use the feature, add system.locks.readLock. or system.locks.writeLock. property to the build configuration.
The build with writeLock will only start when there are no builds running with read or write locks of the same name.
The build with readLock will only start when there are no builds running with write lock of the same name.
Use that to manage the fact that the dependent configs 'read' and the DeploySite config 'writes' the shared item.
(This is not a fully productised solution, hence the tracker item remains open.)
EDIT: And I still don't know whether the lock should be under Build Parameters | System Properties, and what the exact name format should be: is it locks.writeLock.MYLOCKNAME (i.e., showing up in the config with the reference syntax %system.locks.writeLock.MYLOCKNAME%)?
Other puzzlers are: how does one manage giving read access to builds triggered by the completion of a writeLock task? Does the lock get dropped until the next one picks it up (which would allow another writer in), or is it necessary to have something queue up the parent and child dependency at the same time?