In integration testing, does it make sense to replace an async process with a synchronous one for the sake of testing?

In integration tests, asynchronous processes (methods, external services) make for very tough test code. If, instead, I factored the async part out into a dependency and replaced it with a synchronous one for the sake of testing, would that be a "good thing"?
By replacing the async process with a synchronous one, am I still testing in the spirit of integration testing? I'm assuming that integration testing means testing something close to the real thing.

Nice question.
In a unit test this approach would make sense, but for integration testing you should be testing the real system as it will behave in real life. This includes any asynchronous operations and any side effects they may have; this is the most likely place for bugs to exist and is probably where you should concentrate your testing, not factor it out.
I often use a "waitFor" approach where I poll to see if an answer has been received and timeout after a while if not. A good implementation of this pattern, although java-specific you can get the gist, is the JUnitConditionRunner. For example:
conditionRunner = new JUnitConditionRunner(browser, WAIT_FOR_INTERVAL, WAIT_FOR_TIMEOUT);

protected void waitForText(String text) {
    try {
        conditionRunner.waitFor(new Text(text));
    } catch (Throwable t) {
        throw new AssertionFailedError("Expecting text " + text + " failed to become true. Complete text [" + browser.getBodyText() + "]");
    }
}
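The same poll-until-true idea can be written in a few lines of plain Java. This is only a minimal sketch; the Condition interface and WaitUtil class are hypothetical names, not part of JUnitConditionRunner:

interface Condition {
    boolean isSatisfied();
}

final class WaitUtil {
    static void waitFor(Condition condition, long timeoutMillis, long pollIntervalMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.isSatisfied()) {
                return; // the condition became true within the timeout
            }
            Thread.sleep(pollIntervalMillis); // back off briefly before polling again
        }
        throw new AssertionError("Condition not satisfied within " + timeoutMillis + " ms");
    }
}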

We have a number of automated unit tests that send off asynchronous requests and need to test the output/results. The way we handle it is to perform all of the testing as if it were part of the actual application; in other words, asynchronous requests remain asynchronous. But the test harness acts synchronously: it sends off the asynchronous request, sleeps for [up to] a period of time (the maximum in which we would expect a result to be produced), and if still no result is available, the test has failed. There are callbacks, so in almost all cases the test is woken up and continues running before the timeout has expired, but the timeouts mean that a failure (or a change in expected performance) will not stall or halt the entire test suite.
This has a few advantages:
The unit test is very close to the actual calling patterns of the application
No new code/stubs are needed to make the application code (the code being tested) run synchronously
Performance is tested implicitly: if the test slept for too short a period, then some performance characteristic has changed, and that needs looking into
The last point may need a small amount of explanation. Performance testing is important, and it is often left out of test plans. The way these unit tests are run, they end up taking a lot longer (running time) than if we had rearranged the code to do everything synchronously. However this way, performance is tested implicitly, and the tests are more faithful to their usage in the application. Plus all of our message queueing infrastructure gets tested "for free" along the way.
Edit: Added note about callbacks
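A minimal sketch of this harness pattern using a CountDownLatch; AsyncClient and Callback here are hypothetical stand-ins for the application's real asynchronous API, not names from the original suite:

import static org.junit.Assert.*;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

import org.junit.Test;

public class AsyncRequestTest {

    // Hypothetical collaborators standing in for the real asynchronous API.
    interface Callback { void onResult(String result); }
    interface AsyncClient { void send(String request, Callback callback); }

    private AsyncClient asyncClient; // assumed to be wired up by the real suite's setup

    @Test
    public void asyncRequestProducesResultWithinTimeout() throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> received = new AtomicReference<>();

        // The request stays asynchronous; the callback wakes the test up early.
        asyncClient.send("some request", result -> {
            received.set(result);
            latch.countDown();
        });

        // Wait at most as long as a result should reasonably take; a timeout
        // fails this test without stalling the rest of the suite.
        boolean completed = latch.await(30, TimeUnit.SECONDS);
        assertTrue("No result received within 30 seconds", completed);
        assertNotNull(received.get());
    }
}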

What are you testing? The behaviour of your class in response to certain stimuli? In that case, don't suitable mocks do the job?
class Orchestrator implements AsynchCallback {

    TheAsynchService myDelegate; // initialised by injection

    public Orchestrator(TheAsynchService aDelegate) {
        this.myDelegate = aDelegate;
    }

    public void doSomething(Request aRequest) {
        myDelegate.doTheWork(aRequest, this);
    }

    public void tellMeTheResult(Response aResponse) {
        // process response
    }
}
Your test can do something like
Orchestrator orch = new Orchestrator(mockAsynchService);
orch.doSomething(request);
// assertions here that the mockAsynchService received the expected request
// now either the mock really does call back
// or (probably more easily) make an explicit call to the tellMeTheResult() method
// assertions here that the Orchestrator did the right thing with the response
Note that there's no true asynch processing here, and the mock itself need have no logic other than to allow verification of the receipt of the correct request. For a Unit test of the Orchestrator this is sufficient.
I used this variation on the idea when testing BPEL processes in WebSphere Process Server.
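For completeness, here is a hedged sketch of that outline as a JUnit test using Mockito; Orchestrator, TheAsynchService, Request and Response are the names from the pseudocode above, not a real API:

import static org.mockito.Mockito.*;

import org.junit.Test;

public class OrchestratorTest {

    @Test
    public void passesRequestToDelegateAndProcessesResponse() {
        TheAsynchService mockAsynchService = mock(TheAsynchService.class);
        Orchestrator orch = new Orchestrator(mockAsynchService);
        Request request = new Request();

        orch.doSomething(request);

        // Verify the delegate received the expected request, with the
        // orchestrator itself registered as the callback.
        verify(mockAsynchService).doTheWork(request, orch);

        // Drive the callback explicitly instead of waiting for any real
        // asynchronous processing.
        orch.tellMeTheResult(new Response());

        // ...assert here that the orchestrator did the right thing with the response
    }
}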

Proper logging in reactive application - WebFlux

Lately I have been thinking about how to use a logger properly in our applications.
For example, I have a controller that returns a stream of users, but in the log I see that "Fetch users" is logged by a different thread than the one running the processing pipeline. Is that a good approach?
@Slf4j
class AwesomeController {

    @GetMapping(path = "/users")
    public Flux<User> getUsers() {
        log.info("Fetch users..");
        return Flux.just(...).subscribeOn(Schedulers.newParallel("my-custom"));
    }
}
In this case two threads are used, which from my perspective is not a good option, but I can't find good practices for loggers in reactive applications. I think the approach below is better, because the memory allocation happens on the processing thread rather than on the Spring WebFlux thread, which the potentially blocking logger could otherwise hold up.
#GetMapping(path = "/users")
public Flux<User> getUsers() {
return Flux.defer(() -> {
return Mono.fromCallable(() -> {
log.info("Fetch users..");
.....
})
}).subscribeOn(Schedulers.newParallel("my-custom"))
}
The normal thing to do would be to configure the logger as asynchronous (this usually has to be enabled explicitly, as noted in the comments, but all modern logging frameworks support it) and then just include it "normally": either as a separate line as you have there, or in a side-effect method such as doOnNext() if you want it halfway through the reactive chain.
If you want to be sure that the logger's call isn't blocking, then use BlockHound to make sure (this is never a bad idea anyway.) But in any case, I can't see a use case for your second example there - that makes the code rather difficult to follow with no real advantage.
One final thing to watch out for: remember that if you include the logging statement separately, as you have above, rather than as part of the reactive chain, it will execute when the method is called rather than at subscription time. That may not matter in scenarios like this, where the two happen nearly simultaneously, but it would be rather confusing if (for example) you were returning a publisher that may be subscribed to multiple times; in that case, you'd only ever see the "Fetch users.." statement once, which isn't obvious when glancing through the code.
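As a hedged illustration (userService is an assumed dependency, not something from the question), keeping the statement inside the chain as a side-effect operator makes it run once per subscription, on the pipeline's scheduler:

@GetMapping(path = "/users")
public Flux<User> getUsers() {
    return userService.findAll()                               // assumed reactive source
            .doOnSubscribe(s -> log.info("Fetch users.."))     // logged once per subscription
            .subscribeOn(Schedulers.newParallel("my-custom"));
}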

How to test handle_cast in a GenServer properly?

I have a GenServer which is responsible for contacting an external resource. The result of calling the external resource is not important, and even occasional failures are acceptable, so using handle_cast seems appropriate for the rest of the code. I do have an interface-like module for that external resource, and I'm using one GenServer to access the resource. So far so good.
But when I tried to write tests for this GenServer, I couldn't figure out how to test handle_cast. I have interface functions for the GenServer, and I tried to test those, but they always return :ok (except when the GenServer is not running), so I couldn't really verify anything that way.
I changed the code a bit. I abstracted the code in handle_cast into another function and created a similar handle_call callback. Then I could test handle_call easily, but that felt like a hack.
I would like to know how people generally test async code like this. Was my approach correct, or at least acceptable? If not, what should I do instead?
The trick is to remember that a GenServer process handles messages one by one sequentially. This means we can make sure the process received and handled a message, by making sure it handled a message we sent later. This in turn means we can change any async operation into a synchronous one, by following it with a synchronisation message, for example some call.
The test scenario would look like this:
Issue asynchronous request.
Issue synchronous request and wait for the result.
Assert on the effects of the asynchronous request.
If the server doesn't have any suitable function for synchronisation, you can consider using :sys.get_state/2 - a call meant for debugging purposes that is properly handled by all special processes (including GenServer) and, probably most importantly, is synchronous. I'd consider it perfectly valid to use it for testing.
You can read more about other useful functions from the :sys module in the GenServer documentation on debugging.
A cast request is of the form:
Module:handle_cast(Request, State) -> Result

Types:
    Request = term()
    State = term()
    Result = {noreply,NewState} |
             {noreply,NewState,Timeout} |
             {noreply,NewState,hibernate} |
             {stop,Reason,NewState}
    NewState = term()
    Timeout = int()>=0 | infinity
    Reason = term()
so it is quite easy to unit test it by calling it directly (no need to even start a server), providing a Request and a State and asserting on the returned Result. Of course, it may also have side effects (like writing to an ETS table or modifying the process dictionary), so you will need to initialize those resources beforehand and check their state after the assertion.
For example:
test_add() ->
    {noreply, 15} = my_counter:handle_cast({add, 5}, 10).

Elasticsearch testing (unit/integration) best practices in C# using Nest

I've been searching for a while for how I should test my data access layer, without a lot of success. Let me list my concerns:
Unit tests
This guy (here: Using InMemoryConnection to test ElasticSearch) says that:
Although asserting the serialized form of a request matches your expectations in the SUT may be sufficient.
Is it really worth asserting the serialized form of requests? Do these kinds of tests have any value? It doesn't seem likely that a function would change in a way that shouldn't change the serialized request.
If it is worth it, what is the correct way to assert these requests?
Unit tests once again
Another guy (here: ElasticSearch 2.0 Nest Unit Testing with MOQ) shows a good looking example:
void Main()
{
    var people = new List<Person>
    {
        new Person { Id = 1 },
        new Person { Id = 2 },
    };

    var mockSearchResponse = new Mock<ISearchResponse<Person>>();
    mockSearchResponse.Setup(x => x.Documents).Returns(people);

    var mockElasticClient = new Mock<IElasticClient>();
    mockElasticClient.Setup(x => x
        .Search(It.IsAny<Func<SearchDescriptor<Person>, ISearchRequest>>()))
        .Returns(mockSearchResponse.Object);

    var result = mockElasticClient.Object.Search<Person>(s => s);

    Assert.AreEqual(2, result.Documents.Count()).Dump();
}

public class Person
{
    public int Id { get; set; }
}
Probably I'm missing something, but I can't see the point of this code snippet. First he mocks an ISearchResponse to always return the people list. Then he mocks an IElasticClient to return this previous search response to any search request he makes.
Well, it doesn't really surprise me that the assertion is true after that. What is the point of these kinds of tests?
Integration tests
Integration tests make more sense to me for testing a data access layer. So after a little searching I found this package (https://www.nuget.org/packages/elasticsearch-inside/). If I'm not mistaken, it is essentially an embedded JVM and Elasticsearch instance. Is it good practice to use it? Shouldn't I use my already running instance?
If anyone has good experience with testing approaches that I didn't include, I would be happy to hear about those as well.
Each of the approaches that you have listed may be a reasonable one to take, depending on exactly what it is you are trying to achieve with your tests. You haven't specified this in your question :)
Let's go over the options that you have listed
Asserting the serialized form of the request to Elasticsearch may be a sufficient approach if you build a request to Elasticsearch based on a varying number of inputs. You may have tests that provide different input instances and assert the form of the query that will be sent to Elasticsearch for each. These kinds of tests are going to be fast to execute, but they assume that the generated query whose form you are asserting will actually return the results that you expect.
This is another form of unit test that stubs out the interaction with the Elasticsearch client. The system under test (SUT) in this example is not the client but another component that internally uses the client, so the interaction with the client is controlled through the stub object to return an expected response. The example is contrived in that in a real test, you wouldn't assert on the results of the client call as you point out but rather on the output of the SUT.
Integration/behavioural tests against a known data set within an Elasticsearch cluster may provide the most value and go beyond points 1 and 2, as they will not only incidentally test the generated queries sent to Elasticsearch for a given input, but will also test the interaction and produce an expected result. There is no doubt, however, that these types of tests are harder to set up than 1 and 2, but the investment in setup may be outweighed by their benefit for your project.
So, you need to ask yourself what kinds of tests are sufficient to achieve the level of assurance that you require to assert that your system is doing what you expect it to do; it may be a combination of all three different approaches for different elements of the system.
You may want to check out how the .NET client itself is tested; there are components within the Tests project that spin up an Elasticsearch cluster with different plugins installed, seed it with known generated data and make assertions on the results. The source is open and licensed under Apache 2.0 license, so feel free to use elements within your project :)

How to apply TDD to web API development

I'm designing a web service running on Google App Engine that scrapes a number of websites and presents their data via a RESTful interface. Based on some background reading, I think I'd like to attempt Test Driven Development (TDD) and develop my tests before I write any business code.
My problem is caused by the fact that my list of scraped elements includes timetables and other records that change quite frequently. The limit of my knowledge on TDD is that you write tests that examine the results of code execution and compare these results to a hardcoded result set. Seeing as the data set changes frequently, this method seems impossible. Assuming that this is true, what would be the best approach to test such an API? How would a large-scale web API be tested (Twitter, Google, Netflix etc.)?
You have to choose the type of test:
Unit tests just test the proper operation of your modules (units). You provide input data and test that the code outputs proper results. If there are system-dependent classes, you try to mock them, or in the case of GAE services, you use the Google-provided local services. Unit tests can be run locally on your machine or on CI servers. There are two popular unit test libraries for Java: JUnit and TestNG.
Integration tests check that various modules (internal and external) work together - they basically check that the APIs between modules are working. They are usually run on real servers and call real external services. They are technology-specific and harder to run.
In your case, I'd go with unit tests and provide sets of different input data which your logic should parse and act upon. Since your flow is pretty simple (load data from a fixed URL, parse it), you could also embed loading of real data into the unit tests (we do this when we parse external sources).
From what you are describing, you could easily find yourself writing integration tests. If your aim is to test the logic for processing the scraped data (e.g. you know that you are going to get a timetable in a specific format and you have logic to process that data), you will need to create a seam between your web service logic and your processing logic. Once you have done this, you can mock the web service call to always return the same timetable data, and then you can write consistent unit tests against it.
public class ScrapingService : IScrapingService
{
    public string Scrape(string url)
    {
        // scraping logic
    }
}

public interface IScrapingService
{
    string Scrape(string url);
}

public class ScrapingProcessor
{
    private IScrapingService _scrapingService;

    // inject the dependency
    public ScrapingProcessor(IScrapingService scrapingService)
    {
        _scrapingService = scrapingService;
    }

    public void Process(string url)
    {
        var scrapedData = _scrapingService.Scrape(url);
        // now process the scrapedData
    }
}
To test, you can now create a FakeScrapingService that implements the IScrapingService interface and return whatever data you like from the Scrape method. There are some very good mocking frameworks out there that make this type of thing easy. My personal favorite is NSubstitute.
I hope this explanation helps.

Browser session in setUp(), tearDown(), no per testcase setup?

I've previously written some selenium tests using ruby/rspec, and found it quite powerful. Now, I'm using Selenium with PHPUnit, and there are a couple of things I'm missing, it might just be because of inexperience. In Ruby/RSpec, I'm used to being able to define a "global" setup, for each test case, where I, among other things, open up the browser window and log into my site.
I feel that PHPUnit is a bit lacking here, in that 1) you only have setUp() and tearDown(), which are run before and after each individual test, and that 2) it seems that the actual browser session is set up between setUp() and the test, and closed before tearDown().
This makes for a bit more clutter in the tests themselves, because you explicitly have to open the page at the beginning and perform cleanups at the end, in every single test. It also seems like unnecessary overhead to close and reopen the browser for every single test, instead of just going back to the landing page.
Are there any alternative ways of achieving what I'm looking for?
What I have done in the past is to make a protected method that initializes the browser session, like so:
protected function initBrowserSession() {
    if (!$this->browserSession) {
        $this->setBrowser('*firefox');
        $this->setBrowserUrl('http://www.example.com/');
        // Initialize session
        $this->open('http://www.example.com/login.php');
        // Do whatever other setup you need here
    }
    $this->browserSession = true;
}

public function testSomePage() {
    $this->initBrowserSession();
    // Perform your test here
}
You can't really use the setupBefore/AfterClass functions since they are static (and as such you won't have access to the instance).
Now, with that said, I would question your motivation for doing so. By having a test that re-uses a session between tests, you're introducing the possibility of side effects between the tests. By opening a new session for each test, you're isolating the effects down to just that test. Who cares about the performance (to a reasonable extent, at least) of re-opening the browser? Doing so actually increases the validity of the test, since it's isolated. Then again, there could be something to be said for testing a prolonged session. But if that were the case, I would make that a separate test case/class from the individual functionality tests...
Although I agree with @ircmaxell that it might be best to reset the session between tests, I can see the case where tests would go from taking minutes to taking hours just from restarting the browser.
Therefore, I did some digging, and found out that you can override the start() method in a base class. In my setup, I have the following:
<?php
require_once 'PHPUnit/Extensions/SeleniumTestCase.php';

class SeleniumTestCase extends PHPUnit_Extensions_SeleniumTestCase
{
    public function setUp() {
        parent::setUp();
        // Set browser, URL, etc.
        $this->setBrowser('firefox');
        $this->setBrowserUrl('http://www.example.com');
    }

    public function start() {
        parent::start();
        // Perform any setup steps that depend on
        // the browser session being started, like logging in/out
    }
}
This will automatically affect any classes that extend SeleniumTestCase, so you don't have to worry about setting up the environment in every single test.
I haven't tested, but it seems likely that there is a stop() method called before tearDown() as well.
Hope this helps.