How to handle authentication count to be compliant with CQRS pattern?

I need to count improper authentication attempts for some accounts in my application. If a certain threshold is reached, I need to block the account. In my understanding of CQS/CQRS, an authentication request is a kind of query, and queries should not modify any data on the server side. To solve this problem I would have to update some attribute in the database while handling the query, which I guess would violate CQRS principles. What should I do? Is authentication a command in my case (remember, commands cannot return any value, so how would I know that the authentication succeeded, for example)? Maybe I should publish some event after an unsuccessful authentication? How can I solve such a problem? Thanks for any answer.

Queries should not modify domain state.
I'm not aware of a general prohibition on a command returning data (strictly interpreted, that constraint would preclude any sort of acknowledgement of a command).
The "S" in CQRS is often interpreted too strictly in my experience. Any write model (at least any write model which retains the right to decide that a command is inapplicable based on the results of previous commands) carries with it a read model; if you're event sourcing, that read model is typically a snapshot derived from the write model, but the principle holds. If there's a query which can be effectively answered from that read model, there's not necessarily any gain from having a different read model handle the query (the best reason to make that separation in this scenario is if there'd be enough query volume that handling "query commands" degrades write performance).
So I'd advise modeling authentication as a command (it almost certainly changes the state of the system) and returning whatever auth status info is relevant/available as the response to that command.
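To make that concrete, here is a minimal sketch of authentication modelled as a command handler that both mutates state (failed-attempt counter, lockout) and returns a status to the caller. The Account/AccountStore types and the threshold of 5 are illustrative assumptions, not anything from the question:

    // Illustrative sketch only: Account, AccountStore and the threshold are assumed.
    interface AccountStore {
        Account findByUsername(String username);
        void save(Account account);
    }

    interface Account {
        boolean isLocked();
        void lock();
        boolean passwordMatches(String password);
        int failedAttempts();
        void incrementFailedAttempts();
        void resetFailedAttempts();
    }

    enum AuthStatus { SUCCESS, INVALID_CREDENTIALS, ACCOUNT_LOCKED }

    class AuthenticateCommandHandler {
        private static final int MAX_FAILED_ATTEMPTS = 5; // assumed blocking threshold

        private final AccountStore accounts;

        AuthenticateCommandHandler(AccountStore accounts) {
            this.accounts = accounts;
        }

        // The command changes state (counter, lock) AND returns the outcome to the caller.
        AuthStatus handle(String username, String password) {
            Account account = accounts.findByUsername(username);
            if (account == null) {
                return AuthStatus.INVALID_CREDENTIALS;
            }
            if (account.isLocked()) {
                return AuthStatus.ACCOUNT_LOCKED;
            }
            if (account.passwordMatches(password)) {
                account.resetFailedAttempts();   // state change: clear the counter on success
                accounts.save(account);
                return AuthStatus.SUCCESS;
            }
            account.incrementFailedAttempts();   // state change: count the failed attempt
            if (account.failedAttempts() >= MAX_FAILED_ATTEMPTS) {
                account.lock();                  // state change: block the account
            }
            accounts.save(account);
            return account.isLocked() ? AuthStatus.ACCOUNT_LOCKED : AuthStatus.INVALID_CREDENTIALS;
        }
    }

If other parts of the system need to react (as the question hints), publishing something like an "authentication failed" event from inside handle would fit naturally here as well.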

Related

Multiple data insertions using async writing with Apache Geode

We have Apache Geode connected to Postgres using an AEQ + AsyncCacheListener configured to write data to Postgres. During an async write, we submit the list of events that we want to persist and it asynchronously inserts those events. Let's say I have two client applications which call processEvents for async writing and both have some events in common which violate some key. But after the client calls processEvents, control is immediately returned to the client. In such cases, how will the client know if some issue occurred? What are the best practices to tackle this?
What do you mean by the events in common "violate some key"? Like a primary or foreign key constraint, or some other database constraint perhaps (e.g. uniqueness, non-null values, etc)?
Handling a conflict depends on the importance and nature of the data being inserted, or written to the backend (Postgres) database from Geode and its significance to the application, from a requirements and business logic POV.
If 2 (or more) client applications are writing to the same cache/database entries/records, then certainly some type of collision will eventually occur, and how it is handled will depend on the data and the type of operation performed on the data.
In general, handling the violation closer to where and when the violation occurs (e.g. inside the AsyncEventListener itself) may be preferable or ideal, since you should then have most of the necessary information (e.g. the DataAccessException, the events, the ability to query the DB) to deal with the situation.
Inside the AEQ Listener, you could employ different strategies depending on the data and operation as determined by the application:
First update wins (enforced by optimistic locking)
Perform a merge
Log [failed] event(s)
Overwrite value(s) (last update wins).
...
You could employ Geode to conflate events stored in the AEQ for the same key, which should minimize collisions/conflicts.
If the client (as in a "client" in a client/server topology) needs to be informed, then you could write the failed events to another Region where a client registers a CQ to be notified when entries are written to this (failed events) Region. The client-side handler associated with the CQ could then take the appropriate action, such as notifying the end user, refreshing and then retrying the operation, and so on.
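As a rough sketch of that idea (assuming a "/FailedEvents" Region is configured and writeToPostgres stands in for your existing JDBC write, neither of which comes from your actual setup), the server-side listener could look something like this; the client would then register a CQ such as SELECT * FROM /FailedEvents to be notified:

    import java.util.List;
    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.asyncqueue.AsyncEvent;
    import org.apache.geode.cache.asyncqueue.AsyncEventListener;

    public class PostgresWriteListener implements AsyncEventListener {

        private final Region<Object, Object> failedEvents; // e.g. the "/FailedEvents" Region

        public PostgresWriteListener(Region<Object, Object> failedEvents) {
            this.failedEvents = failedEvents;
        }

        @Override
        public boolean processEvents(List<AsyncEvent> events) {
            for (AsyncEvent event : events) {
                try {
                    writeToPostgres(event);          // your existing JDBC insert/update
                } catch (Exception e) {              // e.g. a key/constraint violation
                    // record the failure so interested clients can be notified via a CQ
                    failedEvents.put(event.getKey(), event.getDeserializedValue());
                }
            }
            return true; // tell Geode the batch has been dealt with
        }

        private void writeToPostgres(AsyncEvent event) {
            // JDBC code omitted in this sketch
        }

        @Override
        public void close() { }
    }
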
Given the async nature of the initial write, you can only respond asynchronously once the violation occurs. This is not unlike the Reactive world (namely onSuccess/onFailure event handlers).
So, in this situation, I don't think there really is a "best practice" per se, only "recommendations": for example, handle the situation as near to the actual occurrence of the violation as possible, since that is where you usually have the necessary information readily available to make the best possible, informed decision on the right course of action.
Sometimes you can automate the recovery; other times you might need manual intervention. Most definitely, do not guess. Clearly document your application's/system's (configured) behavior for when it can handle a situation and when it cannot.
I don't think there is a general, one-size-fits-all solution in this case.
I hope this gives you some ideas to think about.

Fix inconsistent state right away or lazily when data is requested

Our users go through several steps of a workflow - the further they go, the more objects we create. We also allow users to go back to Step#1 and change one of the existing objects, which may cause inconsistencies, so we must update/delete some of the objects at Step#2. I see 2 options:
Update/delete objects from Step#2 right away. This leads to:
An operation that's supposed to be a simple PATCH of an entity field becomes complicated. And it's a shared object between multiple workflows - so we'll have to add if-statements and do different things depending on the workflow.
Circular dependencies. Operations on Step#1 have to know about objects/operations on Step#2.
On each request in Step#1 we'd have to load data for Step#2 in order to determine whether Step#2 really needs to be updated. Which slows down operations on Step#1. So to change 1 record in DB we'll have to load hundreds (or even thousands) records for Step#2.
Many actions on Step#1 may need fixing state at Step#2. So we have to ensure we don't forget anything today and in the future.
Fix Step#2 lazily - when the user goes there (our current approach). Step#2 will recognize that objects are inconsistent and fix them. Which leads to just 1 place where we need to care, but:
Until the user opens Step#2, the DB will contain inconsistent objects. This hasn't resulted in any problems so far. But I can imagine it may complicate future SQL migrations.
We update DB state on GET request. This one doesn't seem like that big of a deal since GET stays idempotent anyway. But still it feels awkward.
Anyone knows better approaches? Or maybe improvements to these two?
Update
I haven't found a perfect solution, but eventually we implemented an improved version of #1. When updating state on Step#1 we also set a flag "need to rebuild Step#2"; when the UI opens Step#2, it first checks this flag and issues a PUT to rebuild the state, and only then it GETs Step#2.
This still means that the DB state is inconsistent for some period of time. But at least we'll know this for sure from the flag in the DB, and if needed we could write migrations that take this flag into account. This also allows us (if needed in the future) to create an async job to fix the state.
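For future readers, a minimal sketch of that flag-based flow (WorkflowRepository and Step2Builder are assumed abstractions for illustration, not part of the actual system):

    // Sketch only: the repository/builder abstractions are assumed.
    interface WorkflowRepository {
        boolean needsStep2Rebuild(long workflowId);
        void setNeedsStep2Rebuild(long workflowId, boolean value);
    }

    interface Step2Builder {
        void rebuild(long workflowId); // recomputes/fixes the Step #2 objects
    }

    class Step2RebuildService {
        private final WorkflowRepository repository;
        private final Step2Builder step2Builder;

        Step2RebuildService(WorkflowRepository repository, Step2Builder step2Builder) {
            this.repository = repository;
            this.step2Builder = step2Builder;
        }

        // Called by Step #1 handlers after any change that can invalidate Step #2.
        void markStep2Stale(long workflowId) {
            repository.setNeedsStep2Rebuild(workflowId, true);
        }

        // Backs the PUT the UI issues before it GETs Step #2.
        void rebuildIfStale(long workflowId) {
            if (repository.needsStep2Rebuild(workflowId)) {
                step2Builder.rebuild(workflowId);
                repository.setNeedsStep2Rebuild(workflowId, false); // ideally in the same transaction as the rebuild
            }
        }
    }
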
I think it is more flexible to separate the state from the context where the objects are stored. Any creation of a new object at any step should preserve the invariants and consistency of the context.
There are separate rules for states (rules for transitioning from one state to another and which objects are available for creation) and separate rules for the context (rules for its consistency, which are enforced every time it changes).
What about asynchronous cleanup of the dirty data?
Whenever user goes back to Step #1 and changes something, mark all related data as "dirty" (e.g. add links to it in "DirtyData" table) and be done for now.
Have a DataCleanup worker (e.g. a separate thread or similar) that constantly looks for data to be cleaned up.
Before editing data for Step #2, check if the data is not dirty.
Depending on your logic, 3) might result in a user-facing error (e.g. the user would need to repeat Step #2). If the DataCleanup worker has enough resources (i.e. it processes the DirtyData table almost instantaneously), that should happen only on very rare occasions. If that is not OK, you could opt for checking for dirty data on each fetch, but that could be expensive.
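A minimal sketch of such a worker, assuming a DirtyDataStore abstraction over the "DirtyData" table (the names and the 1-second polling interval are illustrative):

    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Assumed abstraction over the "DirtyData" table.
    interface DirtyDataStore {
        List<Long> fetchDirtyIds(int limit);   // links to data marked dirty by Step #1 edits
        void cleanUp(long id);                 // delete/update the stale Step #2 objects
        boolean isDirty(long id);              // used by Step #2 before editing
    }

    class DataCleanupWorker {
        private final DirtyDataStore store;
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        DataCleanupWorker(DirtyDataStore store) {
            this.store = store;
        }

        void start() {
            // poll frequently so dirty entries are usually gone before Step #2 is opened
            scheduler.scheduleWithFixedDelay(this::drain, 0, 1, TimeUnit.SECONDS);
        }

        private void drain() {
            for (long id : store.fetchDirtyIds(100)) {
                store.cleanUp(id);
            }
        }

        void stop() {
            scheduler.shutdown();
        }
    }
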
It sounds like you're familiar with the HTTP spec regarding GET requests, but for future readers:
Why shouldn't a GET request change data on the server?
Why is using a HTTP GET to update state on the server in a RESTful call incorrect?
For the other bullet under 2, we probably don't need a specification to agree that persisting valid data is preferable to persisting invalid data.
So what can we do for the bullets under 1 to avoid complex branching logic in a particular step and also circular dependencies? My suggestion is an event-driven design. When a step changes, it should fire a change event. The step firing the event has no knowledge of the concrete listener(s) who may receive it, so it remains decoupled from any complex handling logic.
There's probably no way to guarantee you don't forget anything in the future; but if every step in the workflow is defined as a listener, it forces you to consider change events to some extent every time you implement a new step.
One side note on granularity: if a step has many changes, it can batch up its events rather than firing each one individually. You can adjust the batch size for efficiency.
In summary, I would strongly consider the Observer design pattern.
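A bare-bones sketch of that Observer arrangement (all names are illustrative; here Step #2 subscribes and reacts to Step #1 changes, matching the scenario in the question):

    import java.util.ArrayList;
    import java.util.List;

    interface StepChangeListener {
        void onStepChanged(int stepNumber, long workflowId);
    }

    class StepChangePublisher {
        private final List<StepChangeListener> listeners = new ArrayList<>();

        void subscribe(StepChangeListener listener) {
            listeners.add(listener);
        }

        // Called by a step whenever it changes objects; the step doesn't know who listens.
        void publish(int stepNumber, long workflowId) {
            for (StepChangeListener listener : listeners) {
                listener.onStepChanged(stepNumber, workflowId);
            }
        }
    }

    // Step #2 registers itself and reacts only to the changes it cares about.
    class Step2ConsistencyListener implements StepChangeListener {
        @Override
        public void onStepChanged(int stepNumber, long workflowId) {
            if (stepNumber == 1) {
                // fix or invalidate the Step #2 objects for this workflow
                System.out.println("Rebuilding Step #2 objects for workflow " + workflowId);
            }
        }
    }
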

Good ways to decouple GUIs from SOAP/WS-API update/write calls?

Let's assume we have some configuration GUI that in its current form uses direct DB transactions to submit new configurations for more than one configurable component in a consistent manner.
Now let's move the data (DB) stuff behind some SOAP/WS API. The GUI has no direct DB access anymore. The transactional behaviour must remain, but the API should NOT be designed to explicitly accommodate the GUI form submissions. In fact, I don't even know how the new GUI will work or how the user input will be structured. Therefore I need to provide something like WS-AtomicTransaction on the API server side. However, there are (at least) two caveats:
The GUI is written in PHP: I don't think there is any WS-Transaction support in PHP available.
I don't want to keep DB transactions open on the server side while waiting for additional client requests.
Solutions I can think of:
using Camel's aggregation. However, that would make things more complicated in at least two ways:
You cannot use the DB row IDs of newly inserted rows in subsequent calls inside the same transaction. You would need some sort of symbolic back-referencing, because there would be no communication between client and server while processing the aggregated messages.
Call replies would not be immediate (or the immediate and separate reply to each single call would only be some sort of stub, i.e. not containing any useful information beyond "your message has been attached to TX xyz" -- if that's at all possible in the Camel aggregation case).
the two disadvantages of the previous solution make me think of request batches where possibly the WS standards provide means for referencing call results in subsequent calls inside the batch transaction. Is there any such thing already available? Maybe even as a PHP client?
trying to eliminate lock contention in the database by carefully using row-level locks etc. However, when inserting new elements, my guess is that usually pages and index pages need to be locked by the DB.
maybe some server-side persistence layer using optimistic locking? But again, that would not return any DB IDs back to the client before the final commit if DB writes would be postponed until the commit (don't know if that's possible at all).
What do YOU think?
Transactions are a powerful tool, and we easily get into a thinking pattern in which we see every problem as a nail to hit with this big hammer. I can relate to your confusion because I've experienced it myself. Unfortunately, I have no better advice for you than to try not to think in terms of transactions but in terms of atomic API calls.
When I think in terms of transactions, my thought pattern usually goes like this:
start transaction
read (repeat as required)
update (repeat as required)
commit/roll back
It takes some time to realize that we overuse this pattern. Actual conflicts are rare and there are many other ways of dealing with them. Here is a commonly used one in APIs:
read and send data to client (atomic API call)
update data (on the client)
send original + updates back to the server (atomic API call)
start transaction (on server)
read
compare with original from client
if not same, return error (client should retry)
if same, update
commit
The last six points are part of the implementation of the API call.
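As a minimal JDBC sketch of that server-side compare-and-update (table and column names are made up for illustration), the read/compare/update steps are collapsed here into a single conditional UPDATE, which is a common shortcut for the same pattern:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class OptimisticUpdate {

        /** Returns true if the update succeeded, false if the row changed since the client read it. */
        public static boolean update(Connection conn, long id, String originalValue, String newValue)
                throws SQLException {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE config SET value = ? WHERE id = ? AND value = ?")) {
                ps.setString(1, newValue);
                ps.setLong(2, id);
                ps.setString(3, originalValue);   // compare with the original the client sent back
                boolean changed = ps.executeUpdate() == 1;
                if (changed) {
                    conn.commit();                // the commit step above
                } else {
                    conn.rollback();              // return an error; the client should re-read and retry
                }
                return changed;
            }
        }
    }
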
Ferenc Mihaly
http://theamiableapi.com

Log user activity - which is better

I am using Action Filter Attributes for logging user activity on certain actions which involve SQL database interaction. Similarly, I could log the activity in the SQL tables using triggers on the tables during each activity on those tables. I would like to know which of the above two methods is best practice (performance-wise).
I think that the action filter is certainly the cleanest and best-practice approach, since it is in the application layer. Part of the benefit of being there is that it's managed code, and if something breaks you can easily locate the problem. There is also the benefit that all your code is in one spot.
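To illustrate the application-layer idea (sketched here as a plain Java decorator around an action rather than a real ASP.NET ActionFilterAttribute; the Action and ActivityLog types are made up stand-ins):

    // Illustrative only: logging wrapped around the action in the application layer.
    interface Action<T> {
        T execute();
    }

    interface ActivityLog {
        void record(String user, String actionName, long elapsedMillis);
    }

    class LoggingAction<T> implements Action<T> {
        private final Action<T> inner;
        private final ActivityLog log;
        private final String user;
        private final String actionName;

        LoggingAction(Action<T> inner, ActivityLog log, String user, String actionName) {
            this.inner = inner;
            this.log = log;
            this.user = user;
            this.actionName = actionName;
        }

        @Override
        public T execute() {
            long start = System.currentTimeMillis();
            try {
                return inner.execute();   // run the real action (the DB interaction)
            } finally {
                // the logging code lives next to the rest of the application code
                log.record(user, actionName, System.currentTimeMillis() - start);
            }
        }
    }
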
Database triggers are a big no-no in many companies, since they have a habit of causing infinite loops when an unwitting programmer creates some logic that sets off the trigger over and over again, causing the database to fail. Some companies do allow triggers, but only when well documented and very lightly used. Hope this helps.
Performance of logging depends greatly on the system architecture. If you have 3 load balanced web servers hitting one main database, triggers would have to handle all the load while Action Filters would split the load in three. In that scenario, Action Filters would be better.
In terms of best practices, I wouldn't use either of those approaches. I would set up Transactional Replication to another SQL server. This approach would run without impacting performance at all. The transaction log is already being generated and replication would just spin up a separate process that's reading that log.

Asynchronous SQL Operations

I've got a problem I'm not sure how best to solve.
I have an application which updates a database in response to ad hoc requests. One request in particular is quite common. The request is an update that by itself is quite simple, but has some complex preconditions.
For this request the business layer first requests a set of data from the data layer.
The business logic layer evaluates the data from the database and the parameters from the request; from this the action to be performed is determined and the request's response message(s) are created.
The business layer then executes the actual update command that is the purpose of the request.
This last step is the problem, this command is dependent on the state of the database, which might have changed since the business logic ran. Locking down the data read in this operation across several round-trips to the database doesn't seem like a good idea either. Is there a 'best-practice' way to accomplish something like this?
Thanks!
In simple terms: when you execute the update command, you are concerned that the database may have changed?
Then call stored procedures that are written defensively and will only update if the data is in an acceptable state when they are called (by checking foreign key references, data integrity, etc.).
Let me know if I can help in mocking up some aspect of this.
You could store the original state of the modified business objects and compare the original objects to their database counterparts to check if anything has been changed.
If changes have been made, then you either have the choice to merge the objects based on the original, modified and stored (database) objects, or to cancel the update and tell the client the update has failed.
This is kind of difficult, because there are not many specifics in the question, so I'll just give a simple example that you may be able to apply to your situation.
Load all the data as well as the last changed date (yyyy-mm-dd hh:mi:ss.mmm):

    SELECT AAA, BBB, LastChgDate FROM YourTable WHERE ID = xxxxxx

Do your business logic, then save the data:

    UPDATE YourTable SET AAA = aaaaa, BBB = bbbbb WHERE ID = xxxxxx AND LastChgDate = zzzzzz

If the row count != 1, raise an error: someone else has changed the data. Otherwise, the data is saved.
Use a proper transaction isolation mode and do everything in a single database transaction (i.e. start the transaction in step 1 and commit after step 3).
Your question is a little bit vague, but my guess is that you need either SNAPSHOT or READ COMMITTED mode.
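A rough JDBC sketch of "read, check preconditions, update" inside one transaction, reusing the table/column placeholders from the earlier example. The right isolation level depends on your preconditions (SNAPSHOT has no standard java.sql constant, so READ COMMITTED plus an explicit row lock is shown here):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class SingleTransactionUpdate {

        public static void update(Connection conn, long id, String newValue) throws SQLException {
            conn.setAutoCommit(false);
            conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
            try {
                String current;
                // "FOR UPDATE" locks the row for the rest of the transaction, where the DB supports it
                try (PreparedStatement sel = conn.prepareStatement(
                        "SELECT AAA FROM YourTable WHERE ID = ? FOR UPDATE")) {
                    sel.setLong(1, id);
                    try (ResultSet rs = sel.executeQuery()) {
                        if (!rs.next()) {
                            throw new SQLException("row not found");
                        }
                        current = rs.getString(1);
                    }
                }
                // ... evaluate the complex preconditions against 'current' here ...
                try (PreparedStatement upd = conn.prepareStatement(
                        "UPDATE YourTable SET AAA = ? WHERE ID = ?")) {
                    upd.setString(1, newValue);
                    upd.setLong(2, id);
                    upd.executeUpdate();
                }
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();   // nothing is visible to others until the commit above
                throw e;
            }
        }
    }
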