I have a Read Side Processor that uses JdbcReadSide. The globalPrepare method creates a table in the DB, and the handler maps some entities from the event and inserts some records. How can I unit test it?
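For context, here is a minimal sketch of the kind of processor being described, assuming the Lagom Java API; the event type (`OrderEvent`), the `order_summary` table and the column names are hypothetical stand-ins for whatever the real processor uses:

```java
import com.lightbend.lagom.javadsl.persistence.AggregateEventTag;
import com.lightbend.lagom.javadsl.persistence.ReadSideProcessor;
import com.lightbend.lagom.javadsl.persistence.jdbc.JdbcReadSide;
import org.pcollections.PSequence;

import javax.inject.Inject;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical processor of the shape described in the question.
public class OrderEventProcessor extends ReadSideProcessor<OrderEvent> {

    private final JdbcReadSide jdbcReadSide;

    @Inject
    public OrderEventProcessor(JdbcReadSide jdbcReadSide) {
        this.jdbcReadSide = jdbcReadSide;
    }

    @Override
    public ReadSideHandler<OrderEvent> buildHandler() {
        return jdbcReadSide.<OrderEvent>builder("order-summary-offset")
                // globalPrepare: create the read-side table once
                .setGlobalPrepare(this::createTable)
                // event handler: map the event to a row and insert it
                .setEventHandler(OrderEvent.OrderAdded.class, this::insertOrder)
                .build();
    }

    private void createTable(Connection connection) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(
                "CREATE TABLE IF NOT EXISTS order_summary (id VARCHAR(64) PRIMARY KEY, total INT)")) {
            ps.execute();
        }
    }

    private void insertOrder(Connection connection, OrderEvent.OrderAdded event) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(
                "INSERT INTO order_summary (id, total) VALUES (?, ?)")) {
            ps.setString(1, event.getOrderId());
            ps.setInt(2, event.getTotal());
            ps.execute();
        }
    }

    @Override
    public PSequence<AggregateEventTag<OrderEvent>> aggregateTags() {
        return OrderEvent.TAG.allTags();
    }
}
```

One pragmatic way to unit test this shape is to keep `createTable` and `insertOrder` as plain methods that take a `java.sql.Connection`, so a test can run them against an embedded database such as H2 and assert on the resulting rows; Lagom's documentation also describes test support for driving read-side processors with fed events, which is worth looking at for a fuller integration-style test.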
I have a number of pipelines that need to cycle depending on the availability of data. If the data is not there, they should wait and try again. The pipeline behaviours are largely controlled by a database that captures logs, which are used to make decisions about processing.
I read the Microsoft documentation about the Execute Pipeline activity which states that
The Execute Pipeline activity allows a Data Factory or Synapse pipeline to invoke another pipeline.
It does not explicitly state that a pipeline invoking itself is impossible, though. I tried to reference Pipe_A from Pipe_A, but the pipeline is not visible in the drop-down. I need a work-around for this restriction.
Constraints:
The pipe must not call all pipes again, just the pipe in question. The preceding pipe is running all pipes in parallel.
I don't know how many iterations are needed and cannot specify this quantity.
As far as possible, a best-effort approach has been implemented, and this pattern should continue.
Ideas:
Create an intermediary pipe that can be referenced. This is no good: I would need to do this for every pipe that requires this behaviour, because dynamic content is not allowed for pipeline selection. This approach would also pollute the Data Factory workspace.
Direct the control flow backwards after waiting inside the same pipeline if a condition is met. This won't work either; the If activity does not allow flow to be expressed within the same context as the If activity itself.
I thought about externalising this behaviour to a Python application which could be attached to an Azure Function if needed. The application would handle the scheduling and waiting. The application could call any pipe it needed and could itself be invoked by the pipe in question. This seems drastic!
Finally, I discovered the Until activity, which has do-while behaviour. I could wrap these pipes in an Until: the pipe executes and either finishes and sets the database state to 'finished', or cannot finish, sets the state to 'incomplete', and waits. The Until expression then either kicks off another execution or it does not. Additional conditional logic can be included as required in the procedure that sets the value of the variable used by the Until expression. I would need a variable per pipe.
I think idea 4 makes sense, but I thought I would post this anyway in case people can spot limitations in this approach and/or recommend a better one.
Yes, I absolutely agree with All About BI; it seems that in your scenario the best-suited ADF activity is Until:
The Until activity in ADF functions as a wrapper and parent component for iterations, with inner child activities comprising the block of items to iterate over. The result(s) from those inner child activities must then be used in the parent Until expression to determine if another iteration is necessary. Alternatively, if the pipeline can be maintained …
The assessment condition for the Until activity might comprise outputs from other activities, pipeline parameters, or variables.
When used in conjunction with the Wait activity, the Until activity allows you to create loop conditions to periodically check the status of specific operations. Here are some examples:
Check to see if the database table has been updated with new rows.
Check to see if the SQL job is complete.
Check to see whether any new files have been added to a specific folder.
I would like to have a simple CQRS implementation on an API.
In short:
Separate routes for Command and Query.
Separate DB tables (on the same DB at the moment). Normalized one for Command and a de-normalized one for Query.
Asynchronous event-driven update of Query Read Model, using existing external Event Bus.
After the Command is executed, naturally I need to raise an event and pass it to the Event Bus.
The Event Bus would process the event and pass it to its subscriber(s).
In this case the subscriber is Read Model which needs to be updated.
So I need a callback route on the API which gets the event from the Event Bus and updates the Read Model projection (i.e. updates the de-normalized DB table which is used for Queries).
The problem is that the update of the Read Model projection is neither a Command (we do not execute any Domain Logic) nor a Query.
The question is:
How should this async Read Model update work in order to be compliant both with CQRS and DDD?
How should this async Read Model update work in order to be compliant both with CQRS and DDD?
I normally think of the flow of information as a triangle.
We copy information from the outside world into our "write model", via commands
We copy information from the write model into our "read model"
We copy information from the read model to the outside world, via queries.
Common language for that middle step is "projection".
So the projection (typically) runs asynchronously, querying the "write model" and updating the "read model".
In the architecture you outlined, it would be the projection that is subscribed to the bus. When the bus signals that the write model has changed, we wake up the projection, and let it run so that it can update the read model.
(Note the flow of information - the signal we get from the bus triggers the projection to run, but the projection copies data from the write model, not from the event bus message. This isn't the only way to arrange things, but it is simple, and therefore easy to reason about when things start going pear shaped.)
It is often the case that the projection will store some of its own metadata when it updates the read model, so as to not repeat work.
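As a rough illustration of that triangle, here is a minimal sketch; `EventBus`, `WriteModelStore`, `ReadModelStore` and `OrderRow` are hypothetical stand-ins for your own infrastructure, not any particular library:

```java
// Hypothetical infrastructure interfaces; replace with your own bus and persistence.
interface EventBus {
    void subscribe(String topic, Runnable onSignal);
}

interface WriteModelStore {
    long currentVersion();                          // monotonically increasing change marker
    Iterable<OrderRow> changedSince(long version);  // normalized rows changed after that marker
}

interface ReadModelStore {
    void upsertDenormalized(OrderRow row);          // write to the de-normalized query table
    long loadCheckpoint();                          // projection metadata: last processed version
    void saveCheckpoint(long version);
}

record OrderRow(String id, String customerName, int total) {}

// The projection: the bus only wakes it up; the data is copied from the write model.
class OrderSummaryProjection {

    private final WriteModelStore writeModel;
    private final ReadModelStore readModel;

    OrderSummaryProjection(EventBus bus, WriteModelStore writeModel, ReadModelStore readModel) {
        this.writeModel = writeModel;
        this.readModel = readModel;
        bus.subscribe("orders-changed", this::run);
    }

    synchronized void run() {
        long checkpoint = readModel.loadCheckpoint();
        long target = writeModel.currentVersion();
        for (OrderRow row : writeModel.changedSince(checkpoint)) {
            readModel.upsertDenormalized(row);
        }
        readModel.saveCheckpoint(target);           // remember progress so work is not repeated
    }
}
```

In this arrangement a duplicated or late bus signal just re-runs an idempotent upsert pass, so the read model can only be briefly stale, not inconsistent, which is part of why triggering from the bus but copying from the write model is easy to reason about.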
I am currently implementing a server system which has both an SQL database and a Redis datastore. The data written to Redis heavily depends on the SQL data (cache, objects representing logic entities defined by a number of SQL models and their relationships).
While looking for an error handling methodology to wrap client requests, something similar to SQL's transaction commit & rollback (Redis doesn't support rollbacks), I thought of a mechanism which can serve this purpose and I'd appreciate input regarding it.
Basically, I intend to wrap (with before/after middleware) every client request with an SQL transaction and a Redis multi command (pipes commands until exec or discard command is invoked), and allow both transactions to occur only if the request was processed successfully.
The problem is that once you start a Redis multi command, you are not able to perform any reads/writes and actually use their values while processing a request. I reduced the problem to just reads, since a dependence on just-written values can usually be optimized away.
My (simple) solution: split the Redis connection into two, a writer and a reader. The writer connection is the one initialized with the multi command and executed/discarded at the end. All writing is performed through it, while reading is done using the reader (and executes instantly).
The downside: as opposed to SQL, you can't rely on values written earlier in the same request (transaction). Then again, that is usually quite easy to overcome.
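A rough sketch of that wrapper, assuming Jedis for Redis and plain JDBC for SQL; the connection strings, key and table names are made up:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class RequestWrapper {

    public void handleRequest(String userId, String payload) throws Exception {
        // Two Redis connections: writes queue behind MULTI on the writer,
        // reads on the reader return real values immediately.
        try (Jedis writer = new Jedis("localhost", 6379);
             Jedis reader = new Jedis("localhost", 6379);
             Connection sql = DriverManager.getConnection("jdbc:postgresql://localhost/app")) {

            sql.setAutoCommit(false);               // open the SQL transaction
            Transaction redisTx = writer.multi();   // start queuing Redis writes

            try {
                // A read mid-request goes through the reader, so the value is usable now.
                String cached = reader.get("user:" + userId);

                // A write mid-request is only queued; it applies on exec().
                if (!payload.equals(cached)) {
                    redisTx.set("user:" + userId, payload);
                }

                try (PreparedStatement ps = sql.prepareStatement(
                        "INSERT INTO request_log (user_id, payload) VALUES (?, ?)")) {
                    ps.setString(1, userId);
                    ps.setString(2, payload);
                    ps.executeUpdate();
                }

                sql.commit();       // request succeeded: apply the SQL side...
                redisTx.exec();     // ...and flush the queued Redis writes
            } catch (Exception e) {
                sql.rollback();     // request failed: drop both sides
                redisTx.discard();
                throw e;
            }
        }
    }
}
```

Note that this is still not atomic across the two stores: if the SQL commit succeeds and the Redis exec then fails, the two diverge, which is the inherent limitation of Redis having no rollback that the question already accepts.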
We are developing an application that makes use of deep object models that are being stored in Gemfire. Clients will access a REST service to perform cache operations instead of accessing the cache via the Gemfire client. We are also planning to have a database underneath Gemfire as the system of record making use of asynchronous write-behind to do the DB updates and inserts.
In this scenario it seems to be a non-trivial problem to guarantee that an insert or update into Gemfire will result in a successful insert or update in the DB without setting up elaborate server-side validation (essentially the constraints on the Gemfire operation would have to match the DB operation constraints). We cannot bubble the DB insert or update success/failure back to the client without making the DB call synchronous with the Gemfire operation. This would obviously defeat the purpose of using Gemfire for low-latency client operations.
We are curious how other Gemfire adopters who are using write-behind have solved the problem of keeping the DB in sync with the Gemfire data fabric.
We generally recommend implementing validations in GemFire in a CacheWriter, since by definition it is going to be called before any modification occurs on the cache.
http://gemfire.docs.gopivotal.com/javadocs/japi/com/gemstone/gemfire/cache/CacheWriter.html
That said, a pattern I saw at a customer was to receive data on region "A" and implement basic validations there, accepting that data. The data is then copied to the DB using write-behind as you describe, batched through an AsyncEventListener. In the try/catch around the insertion, if any error occurs, they store that data in another region that has another listener for the write-behind, non-batched, so you can actually see which record failed and decide what to do accordingly.
Something like:
data -> CacheWriter basic checks -> region A -> AEL (batch 500~N events) -> DB
If an error occurs, copy to region B -> listener persists individual records to the DB -> take some action on the failed record.
In their case, they had an interface to manually clear pending records on region B.
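A rough sketch of that wiring, assuming the classic `com.gemstone.gemfire` API; `Order`, `OrderDao` and the region names are made-up placeholders:

```java
import com.gemstone.gemfire.cache.CacheFactory;
import com.gemstone.gemfire.cache.CacheWriterException;
import com.gemstone.gemfire.cache.EntryEvent;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.asyncqueue.AsyncEvent;
import com.gemstone.gemfire.cache.asyncqueue.AsyncEventListener;
import com.gemstone.gemfire.cache.util.CacheWriterAdapter;

import java.util.List;

class Order {                                      // hypothetical domain object
    String id; int total;
    String getId() { return id; }
    int getTotal() { return total; }
}

class OrderDao {                                   // hypothetical JDBC access
    void insert(Order order) throws Exception { /* INSERT ... */ }
}

// Basic validation before anything lands in region A; rejections reach the client synchronously.
class OrderCacheWriter extends CacheWriterAdapter<String, Order> {
    @Override
    public void beforeCreate(EntryEvent<String, Order> event) throws CacheWriterException {
        Order order = event.getNewValue();
        if (order.getId() == null || order.getTotal() < 0) {
            throw new CacheWriterException("Invalid order: " + order.getId());
        }
    }
}

// Write-behind listener attached to region A's async event queue:
// batch the DB inserts and park any failed record in region B.
class OrderWriteBehindListener implements AsyncEventListener {

    private final OrderDao dao = new OrderDao();

    @Override
    public boolean processEvents(List<AsyncEvent> events) {
        Region<String, Order> failed = CacheFactory.getAnyInstance().getRegion("B");
        for (AsyncEvent event : events) {
            Order order = (Order) event.getDeserializedValue();
            try {
                dao.insert(order);                 // write-behind to the DB
            } catch (Exception e) {
                // Region B has its own non-batched listener, so the failed
                // record is visible individually and can be retried or cleared.
                failed.put(order.getId(), order);
            }
        }
        return true;                               // batch processed, remove it from the queue
    }

    @Override
    public void close() {}
}
```

The async event queue itself (batch size, persistence, ordering) would be configured in cache.xml or via the API when the listener is attached to region A.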
Hope that helps, if not maybe you can provide more details about your use case...
Cheers.
You can register an AsyncEventListener and update your DB with batch updates. For more info:
http://pubs.vmware.com/vfabric53/topic/com.vmware.vfabric.gemfire.7.0/developing/events/implementing_write_behind_event_handler.html
GemFire version 8 and above solves your problem of accessing cache objects via a REST service.
Gemfire REST
I'm planning to build a console app to run as part of a SQL 2005 job which will gather records from a database table, create a request object for a WCF service, pass this object to the service for processing, receive a response object, and update a log table with its data. This will be for processing at least several thousand records each time the job step executes.
The WCF service currently exposes a single method which I'd be hitting once for each record in the table, so I imagine I'd want to open a channel to the service, keep it open during processing, then close and dispose and such when complete.
Beyond maintaining the connection, how else could I keep this console app from becoming a performance bottleneck? Should I avoid a console app and instead use SQLCLR or some other means to perform this processing?
You've probably considered Service Broker...