Async Message Testing

Here is the problem I am facing with respect to asynchronous testing. The problem statement is as follows:
I get a big batch of XML containing data for multiple candidates. We do some validations and split that big XML into multiple XMLs, one for each candidate. Every XML is persisted to the file-structured database with a unique identifier. A unique identifier is generated for each of the messages that got persisted to the database, and each of those unique identifiers is put on the queue for subscription.
I am working on developing the automation test framework. I am looking for a way to let the test class know that the unique identifier has been picked up by the next step in data processing.
I have read up on this problem, and most of what I found suggests thread sleeps and timers. The problem is that when we run a large number of test cases, that approach takes an enormously long time.
I have read about Awaitility and had some hopes for it. Any ideas? Has anyone faced a similar situation? Please help.
Thanks
DevAutotester

You could use Awaitility to wait until all IDs exist in the DB or queue (if I understand it correctly) and then continue with the validation afterwards. You will have to provide a supplier to Awaitility that checks that all IDs are present; Awaitility will then wait for this statement to be true.
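A minimal sketch of that supplier-based wait, assuming a hypothetical repository.countByIds(expectedIds) lookup against your database:

import static org.awaitility.Awaitility.await;
import java.util.concurrent.TimeUnit;

// Poll until every persisted ID is visible, failing only at the timeout.
await().atMost(30, TimeUnit.SECONDS)
       .pollInterval(500, TimeUnit.MILLISECONDS)
       .until(() -> repository.countByIds(expectedIds) == expectedIds.size());

Unlike a fixed Thread.sleep, this returns as soon as the condition holds, so a large suite only pays the worst-case wait when something is actually wrong.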
/Johan

Related

Pentaho large source table processing to target table same schema

I currently have an ETL job that reads a source table with over 1 million records and then sequentially processes them into a target table. Both source and target are in the same schema, but in between there is an external REST endpoint call to post some data from the source table. This job is performing very badly right now. Can someone please let me know some ways to improve its performance, e.g. how to parallelize it or reduce the fetch size, to cut the job's running time?
Check if your REST endpoint supports batching, and then implement that. Most APIs do these days. (In this case, you send multiple requests in one JSON/XML payload to the endpoint.)
Otherwise you simply need to use multiple copies of the REST Client step. You should be able to get away with 8-10 at least, but check that you're not limited in some way at the other end.
Finally, if none of that helps, try concocting your own HTTP client in the User Defined Java Class step (not the JavaScript one), and be sure that you only authenticate with the REST endpoint once, not on every request, by keeping the session open; see the sketch below. I'm not 100% convinced the REST Client step does this, and authentication is often the most expensive part.
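A rough sketch of that last idea for the Java class step, using the JDK's own HTTP client (how you obtain the token, and the class and method names, are placeholders, not Pentaho APIs):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReusableRestPoster {
    // One shared client: connections are pooled and reused across all rows
    // instead of being re-established per request.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();
    private static String token; // authenticate once up front, then reuse

    public static String post(String url, String payload) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Authorization", "Bearer " + token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}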

How to pause a Web API? Is it even possible?

We are facing an odd issue.
We have two parts:
1. A Windows task to update the database
2. A Web API using the same database to provide search results
We want to pause the API while the Windows task is updating the database, so search results won't be partial or incorrect.
Is it possible to pause API requests while the database is being updated? The database update takes about 10-15 seconds.
When you say "pause", what do you expect to happen to callers? It seems like you are choosing to give them errors instead of incomplete data.
If possible, your database updates should be wrapped in a transaction so consumers get current, complete data until the transaction is committed. Then, the next call will have updated and complete data.
I would hope that transactional processing would also help you recover from errors in your updates. What happens now if something fails part way through an update?
This post may help you: How to Decide to use Database Transactions
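A minimal sketch of that wrapping on the Windows-task side (the connection string and update SQL are placeholders); with READ_COMMITTED_SNAPSHOT or snapshot isolation enabled on the database, readers keep getting the last committed data until the commit:

using Microsoft.Data.SqlClient; // or System.Data.SqlClient

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    {
        try
        {
            using (var command = new SqlCommand(updateSql, connection, transaction))
            {
                command.ExecuteNonQuery(); // every update enlists in the same transaction
            }
            transaction.Commit(); // searches never observe a half-applied update
        }
        catch
        {
            transaction.Rollback(); // a failure leaves the previous data intact
            throw;
        }
    }
}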
If the API knows when this task is starting, you could have the thread sleep for 10 seconds by calling:
System.Threading.Thread.Sleep(10000)

Detect a long-running process and notify users

I have some processing to do on the server side.
When a user selects a large amount of data for processing (let's say inserts, updates, and deletes in the database plus file read/write work), it takes a long time.
I am using C# with a .NET Core MVC web application.
In this case, is it possible to detect when a process takes more than some decided time, run it in the background (or, say, hand the process off to another tool if possible), and notify the user that it will take some time and that they will be notified once it's done? (That notification need not be real time; we can e-mail.)
So is there any mechanism to do this?
You can go ahead and create a job for processing the data. You could try Hangfire, which allows you to create background jobs inside your ASP.NET Core application.
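A minimal sketch (assuming the Hangfire.AspNetCore and Hangfire.SqlServer packages; ProcessLargeSelection and the connection string are placeholders):

using Hangfire;

// In Startup.ConfigureServices: register Hangfire and its processing server.
services.AddHangfire(config => config.UseSqlServerStorage(connectionString));
services.AddHangfireServer();

// In the controller: enqueue the heavy work and respond to the user immediately.
var jobId = BackgroundJob.Enqueue(() => ProcessLargeSelection(selectionId));

The returned job ID can be shown to the user so the outcome can be looked up later.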
I don't think you will be able to do exactly what you want. On the other hand, you can use a parallel foreach to process the data faster.
Ex:
Parallel.ForEach(entities, // 'entities' is your list of entities to process
    new ParallelOptions { MaxDegreeOfParallelism = 2 },
    entity =>
    {
        // your code processing the entity (insert, update, delete)
    });
The MaxDegreeOfParallelism property defines the maximum number of threads to use. I suggest you start with 2 and increase it one by one to see what fits best for you.
Parallel foreach will use more threads to process your data.
If a parallel foreach does not solve your problem, you can use a strategy that consists of receiving your user's data, assuming the processing is going to take a long time, storing that data as-is in your database or some other kind of storage, and giving the user an answer back with a transaction ID and a message explaining that it is going to take a while and that the response will be sent by e-mail. You will then need to build another service that processes these transactions and e-mails the users with whatever you think is necessary.
Another possibility: instead of notifying the user through e-mail, you can create a method that checks the processing status for a given transaction ID and have the client use a polling strategy, so the user won't even notice that the processing is being done in the background.
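A sketch of such a status check inside an ASP.NET Core controller (the _statusStore lookup is a placeholder for wherever you record per-transaction progress):

// The client polls this endpoint with the transaction ID it was given.
[HttpGet("status/{transactionId}")]
public IActionResult GetStatus(string transactionId)
{
    var status = _statusStore.Find(transactionId); // placeholder lookup
    if (status == null)
        return NotFound();
    return Ok(status); // e.g. Pending, Processing, Done, Failed
}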
Hope you succeed
Best Regards

Suggestion to handle Deadlock

There are six procedures which are called internally to get data from a transactional table, do aggregations on the retrieved data, format it as XML, and then send e-mails hourly.
During this process a lot of logging is done, and the logs are also sent by e-mail in HTML format (in the same e-mail). There is one procedure where a deadlock occurs, and one section of the e-mail (LOGS) is always missing whenever we have a deadlock occurrence. To handle this, I am trying to use READ_COMMITTED_SNAPSHOT in that particular procedure. Can anyone suggest whether this has worked for them, or what the best way is to handle this kind of deadlock?
Can I retry that particular procedure internally by checking whether the output is null or not?
I can't let the other process fail, as it is a transaction. But I need the HTML to show all the information without anything missing from the body.
EDIT: This occurs very rarely, but the frequency is increasing daily now. I don't understand it: this procedure is just trying to read from the transactional table, make some calculations, and format the result into XML, while the other transaction is writing to the transactional table. So how does a WRITE affect a READ?
You need to fix the deadlock in order to resolve this.
A deadlock occurs when one process holds a resource that the other requires in order to proceed, and vice versa. You'll get a deadlock when you have two processes that acquire the same set of resources in different orders. For instance, if process P1 acquires resources in the following order:
Resource A
Resource B
And a competing process, P2, requires the same resources in a different order:
Resource B
Resource A
P1 starts and acquires exclusive access to Resource A.
P2 starts and acquires exclusive access to Resource B.
In order for each to continue, P1 needs access to Resource B and P2 needs access to Resource A.
Neither of them can acquire the resource they need, thus causing the deadlock.
This is different than blocking, where one process is simply waiting for another process to release the needed resource. Given sufficient time, the blocking will be resolved. In a deadlock, the blocking cannot be resolved.
The SQL Engine can (and does) detect the deadlock situation. It resolves it by selecting one process or the other as the deadlock victim and rolling back.
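In application code, the same interleaving can be reproduced with two locks; a minimal C# sketch of the scenario above (an illustration, not the poster's procedures):

using System.Threading;

static readonly object resourceA = new object();
static readonly object resourceB = new object();

static void Process1()
{
    lock (resourceA)         // P1 acquires exclusive access to Resource A
    {
        Thread.Sleep(100);   // window in which P2 grabs Resource B
        lock (resourceB) { } // P1 now waits forever for Resource B
    }
}

static void Process2()
{
    lock (resourceB)         // P2 acquires exclusive access to Resource B
    {
        Thread.Sleep(100);   // window in which P1 grabs Resource A
        lock (resourceA) { } // P2 now waits forever for Resource A
    }
}

Unlike the SQL engine, the CLR will not pick a victim here; both threads simply hang until the process is killed.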
Fix the deadlock by identifying the problem and resolving it, not by simply retrying and hoping it goes through. SQL Trace may help you identify the problem. You may need a DBA to help you.
A simpler (less dangerous) approach would be to change the six procedures in question so that they do dirty reads (i.e., WITH (NOLOCK)). This should work even in a deadlock, although you might get garbage data.

WCF InstanceContextMode: Per Call vs. Single in this scenario

I want to avoid generating duplicate numbers in my system.
CreateNextNumber() will:
Find the last number created.
Increment the value by one.
Update the value in the database with the new number.
I want to avoid two clients calling this method at the same time. My fear is they will pull the same last number created, increment it by one, and return the duplicate number for both clients.
Questions:
Do I need to use single mode here? I'd rather use Per Call if possible.
The default concurrency mode is single. I don't understand how Per Call would create multiple instances, yet have a single thread. Does that mean that even though multiple instances are created, only one client at a time can call a method in their instance?
If you use InstanceContextMode.Single and ConcurrencyMode.Single, your service will handle one request at a time and so would give you this behaviour; however, this issue is better handled in the database.
A couple of options:
Make the field that requires the unique number an identity column, and the database will ensure no duplicate values.
Wrap the incrementing of the control value in a stored procedure that uses isolation level REPEATABLE READ and reads, increments, and writes in a transaction.
For your questions, you might find my blog article on instancing and concurrency useful.
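For reference, that WCF configuration looks like this on the service class (a sketch; NumberService and INumberService are placeholder names):

using System;
using System.ServiceModel;

// One service instance, one thread: WCF itself serializes all calls,
// so two clients can never run CreateNextNumber() at the same time.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Single)]
public class NumberService : INumberService
{
    public int CreateNextNumber()
    {
        // find the last number, increment it, update the database
        throw new NotImplementedException(); // placeholder body
    }
}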
I don't think a single instance alone will stop the service from handling requests concurrently. You need a server-side synchronisation mechanism, such as a Mutex, so that all code that tries to get this number locks first. You might actually get away with a static locking object inside the service code, which will likely be simpler than a mutex.
Basically, this isn't a WCF configuration issue, this is a more general concurrency issue.
private static readonly object ServiceLockingObject = new object();

lock (ServiceLockingObject)
{
    // Read, increment, and write your number here; only one thread
    // can be inside this block at a time.
}
Don't bother with WCF settings; generate unique numbers in the database instead. See my answer to this question for details. Anything you try to do in WCF will have the following problems:
If someone deploys multiple instances of your service in a web farm, each instance will generate clashing numbers.
If there is a database error during the reading or writing of the table, then problems will ensue.
The mere act of reading and writing to the table in separate steps will introduce massive concurrency problems. Do you really want to force a serializable table lock and have everything queue up on the unique number generator?
If you begin a transaction in your service code, all other requests will block on the unique number table because it will be part of a long-running transaction.