Mocking EntityManager - testing

I am getting an NPE while mocking EntityManager. Below is my code:
@Stateless
public class NodeChangeDeltaQueryBean implements NodeChangeDeltaQueryLocal {

    @PersistenceContext
    private EntityManager em;

    @Override
    public String findIdByNaturalKey(final String replicationDomain, final int sourceNodeIndex,
            final int nodeChangeNumber) {
        List<String> result =
            NodeChangeDelta.findIdByNaturalKey(this.em, replicationDomain, sourceNodeIndex,
                nodeChangeNumber).getResultList();
        return result.isEmpty() ? null : result.get(0);
    }
}
My Entity Class
@Entity
public class NodeChangeDelta implements Serializable, Cloneable, GeneratedEntity, KeyedEntity<String> {

    public static TypedQuery<String> findIdByNaturalKey(final EntityManager em, final String replicationDomain, final int sourceNodeIndex, final int nodeChangeNumber) {
        return em.createNamedQuery("NodeChangeDelta.findIdByNaturalKey", String.class)
            .setParameter("replicationDomain", replicationDomain)
            .setParameter("sourceNodeIndex", sourceNodeIndex)
            .setParameter("nodeChangeNumber", nodeChangeNumber);
    }
}
My Test Class
@RunWith(MockitoJUnitRunner.class)
public class NodeChangeDeltaQueryBeanTest {

    @InjectMocks
    NodeChangeDeltaQueryBean nodeChangeDeltaQueryBean;

    @Mock
    EntityManager em;

    @Test
    public void testFindIdByNaturalKey() {
        this.addNodeChangeDelta();
        this.nodeChangeDeltaQueryBean.findIdByNaturalKey(this.REPLICATION_DOMAIN,
            this.SOURCE_NODE_INDEX, this.NODE_CHANGE_NUMDER);
    }
}
While debugging, em is not null (nor are the other arguments REPLICATION_DOMAIN, SOURCE_NODE_INDEX, NODE_CHANGE_NUMDER) in the entity class, whereas em.createNamedQuery("NodeChangeDelta.findIdByNaturalKey", String.class) returns null.

On the Mockito wiki: Don't mock types you don't own!
This is not a hard line, but crossing it may have repercussions! (It most likely will.)
Imagine code that mocks a third-party lib. After a particular upgrade of that library, the logic might change a bit, but the test suite will execute just fine, because it's mocked. So later on, thinking everything is good to go, the build wall being green after all, the software is deployed and... boom!
It may be a sign that the current design is not decoupled enough from this third-party library.
Another issue is that the third-party lib might be complex and require a lot of mocks to even work properly. That leads to overly specified tests and complex fixtures, which in itself compromises the compact and readable goal. Or it leads to tests which do not cover the code enough, because of the complexity of mocking the external system.
Instead, the most common way is to create wrappers around the external lib/system, though one should be aware of the risk of abstraction leakage, where too many low-level APIs, concepts, or exceptions cross the boundary of the wrapper. In order to verify integration with the third-party library, write integration tests, and make them as compact and readable as possible as well.
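To make that concrete, a wrapper for the question's use case could look roughly like the sketch below. The interface and class names are made up for illustration; callers and unit tests then depend only on the small interface, while the JPA-backed implementation is covered by an integration test against a real database.

// Hypothetical wrapper around the EntityManager; names are illustrative.
import java.util.List;
import javax.persistence.EntityManager;

interface NodeChangeDeltaRepository {
    String findIdByNaturalKey(String replicationDomain, int sourceNodeIndex, int nodeChangeNumber);
}

class JpaNodeChangeDeltaRepository implements NodeChangeDeltaRepository {

    private final EntityManager em;

    JpaNodeChangeDeltaRepository(EntityManager em) {
        this.em = em;
    }

    @Override
    public String findIdByNaturalKey(String replicationDomain, int sourceNodeIndex, int nodeChangeNumber) {
        // The only place that touches the JPA API; verified by an integration test.
        List<String> result = em
            .createNamedQuery("NodeChangeDelta.findIdByNaturalKey", String.class)
            .setParameter("replicationDomain", replicationDomain)
            .setParameter("sourceNodeIndex", sourceNodeIndex)
            .setParameter("nodeChangeNumber", nodeChangeNumber)
            .getResultList();
        return result.isEmpty() ? null : result.get(0);
    }
}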
Mocking a type you don't control can be considered a (mocking) anti-pattern. While EntityManager is pretty much standard, one should not assume there won't be any behavior change in upcoming JDK / JSR releases (it has already happened numerous times in other parts of the API; just look at the JDK release notes). Plus, the real implementations may have subtleties in their behavior that can hardly be mocked; tests may be green, but the production Tomcats are on fire (true story).
My point is that if the code needs to mock a type I don't own, the design should change ASAP so that I, my colleagues, or future maintainers of this code won't fall into these traps.
The wiki also links to other blog entries describing issues people had when they tried to mock types they didn't control.
Instead, I really advise everyone not to use mocks when testing integration with another system. I believe that for database stuff, Arquillian is the way to go; the project appears to be quite active.
Adapted from my answer: https://stackoverflow.com/a/28698223/48136

In Mockito, any method invocation on a mock that is not explicitly stubbed returns a default value, which is null for object return types. Therefore, in findIdByNaturalKey, em.createNamedQuery returns null and the subsequent setParameter call throws the NPE. You need to configure the mock to return something, e.g. with the RETURNS_MOCKS or RETURNS_DEEP_STUBS answer.
Also, I am not sure whether @InjectMocks supports @PersistenceContext. If it does not, then em is probably null. If it does, please let me know, and the above is your issue.
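If you do go down the mocking route anyway, here is a minimal sketch of how the chained calls could be stubbed with deep stubs. The parameter values and the expected id are made up, and it assumes the bean from the question:

import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.when;

import java.util.Collections;
import javax.persistence.EntityManager;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Answers;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.MockitoJUnitRunner; // org.mockito.runners in older Mockito versions

@RunWith(MockitoJUnitRunner.class)
public class NodeChangeDeltaQueryBeanTest {

    @InjectMocks
    NodeChangeDeltaQueryBean nodeChangeDeltaQueryBean;

    // Deep stubs make em.createNamedQuery(...).setParameter(...) return mocks
    // instead of null, so only the end of the chain needs explicit stubbing.
    @Mock(answer = Answers.RETURNS_DEEP_STUBS)
    EntityManager em;

    @Test
    public void testFindIdByNaturalKey() {
        when(em.createNamedQuery("NodeChangeDelta.findIdByNaturalKey", String.class)
                .setParameter("replicationDomain", "domainA")
                .setParameter("sourceNodeIndex", 1)
                .setParameter("nodeChangeNumber", 2)
                .getResultList())
            .thenReturn(Collections.singletonList("some-id"));

        assertEquals("some-id",
            nodeChangeDeltaQueryBean.findIdByNaturalKey("domainA", 1, 2));
    }
}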

Related

Does Inversion of Control lead to Side Effects?

A question I've been struggling with a lot lately is how, in my opinion, Inversion of Control breaks encapsulation and can easily lead to side effects in a program. At the same time, some of the big advantages of IoC are loose coupling/modularity as well as Test-Driven Development, since it makes unit testing a class much easier (I think TDD is really pushing IoC in the industry).
Here is my argument against IoC.
If the injected types are immutable and pure then IoC is acceptable, for example primitive types. However, if they are impure and can modify the state of the program or hold their own state, then side effects can easily occur.
Take the following C#/pseudo example:
public class FileSearcher : IFileSearcher
{
    private string searchPath;

    public void SetSearchPath(string path)
    {
        searchPath = path;
    }

    public List<string> FindFiles(string searchPattern)
    {
        //...Search for files matching searchPattern starting at searchPath
    }
}
public class PlayListViewer
{
    private readonly IFileSearcher searcher;

    public PlayListViewer(string playlistName, IFileSearcher searcher)
    {
        this.searcher = searcher;
        searcher.SetSearchPath($"playlists/{playlistName}");
    }

    public List<string> FindSongNames()
    {
        return searcher.FindFiles("*.mp3|*.wav|*.flac")
            .Select(f => Path.GetFileName(f))
            .ToList();
    }

    //....other methods
}
public class Program
{
    public static void Main()
    {
        var searcher = new FileSearcher();
        var viewer = new PlayListViewer("Hits 2021", searcher);

        searcher.SetSearchPath("C:/Users");         // Messes up the viewer's search path
        var pictures = searcher.FindFiles("*.jpg"); // Using the searcher for something else

        viewer
            .FindSongNames()
            .ForEach(s => Console.WriteLine(s));    // WRONG SONGS
    }
}
In the (very uncreative) example above, the PlayListViewer has a method for finding the songs within a playlist. It attempts to set the correct search path for the playlist on the injected IFileSearcher, but the user of the class overwrote the path. Now, when they try to find the songs in the playlist, the results are incorrect.
The users of a class do not always know the implementation of the class they're using, and they don't know the side effects they're causing by mutating the objects they pass in.
Some other simple examples of this:
The Date class in Java is not immutable and has a (now deprecated) setDate method. The following could occur:
date = new Date(2021, 10, 1)
a = new A(date)
a.SomethingInteresting() //Adds 1 year to the Date using setDate
b = new B(date) //No longer the correct date
I/O abstractions such as streams:
audioInput = new MemoryStream()
gainStage = new DSPGain(audioInput)
audioInput.write(....)
audioInput.close()
gainStage.run() //Error because memory stream is already closed
etc...
Other issues can come up too if the object gets passed to multiple classes that use it concurrently across different threads. In these cases a user might not know or realize that class X internally launches or processes work on a different thread.
I think the simple, and functional, answer would be to only write pure functions and immutable classes, but that isn't always practical in the real world.
So when should IoC really be used? Maybe only when the injected types are immutable and pure and anything else should be composed and encapsulated? If that's the answer, then what does that mean for TDD?
First, Inversion of Control is not the same as Dependency Injection. DI is just one implementation of IoC. This question makes more sense if we limit it to just DI.
Second, Dependency Injection is orthogonal to Test Driven Development. DI can make writing unit tests easier, which may encourage you to write more unit tests; but that does not necessitate TDD. You can certainly use DI without TDD, and I suspect that's the way the vast majority of developers use it. TDD is not a widespread practice.
Conversely, practicing TDD may encourage you to implement DI; but that is far from a requirement. Don't confuse statements like, "TDD and DI work well together," with "TDD and DI require each other." They can be used together or separately.
Finally, if you want to use your DI container as a repository of global variables, you certainly can. This approach of storing mutable state and injecting it across your application brings the same caveats and pitfalls as sharing mutable state anywhere else.
That should be the main takeaway from this question: not the downside of DI or TDD, but the downside of mutable state in general. You don't need DI to run afoul of mutable state. Trouble with mutable state is virtually guaranteed in the practice of imperative programming, which is by far the most common programming paradigm.
Consider that the functional programmers might really be onto something with their declarative approach.
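As a rough illustration of that takeaway, here is the playlist example reworked in Java with the mutable state removed from the injected dependency (the names are adapted from the question, not a drop-in fix). Because the searcher no longer holds a search path, sharing one instance between callers cannot change anyone else's results.

import java.util.List;

// Sketch: the search path is an argument, not state held by the searcher.
interface FileSearcher {
    List<String> findFiles(String searchPath, String searchPattern);
}

final class PlayListViewer {

    private final FileSearcher searcher;
    private final String playlistName;

    PlayListViewer(String playlistName, FileSearcher searcher) {
        this.playlistName = playlistName;
        this.searcher = searcher;
    }

    List<String> findSongNames() {
        // The viewer owns its path; no call order or external mutation can redirect it.
        return searcher.findFiles("playlists/" + playlistName, "*.mp3|*.wav|*.flac");
    }
}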

Modular design and intermodule references

I'm not so sure the title is a good match for this question I want to put on the table.
I'm planning to create a web MVC framework as my graduation dissertation, and in a previous conversation with my advisor, while trying to define some achievements, he convinced me that I should choose a modular design for this project.
I already had some things developed by then and stopped for a while to analyze how modular it would be, and I couldn't really do it because I don't know the real meaning of "modular".
Some things are not very clear to me. For example: does just referencing another module blow up the modularity of my system?
Let's say I have a Database Access module and it can OPTIONALLY use a Cache module for storing the results of complex queries. As anyone can see, I will at least have a naming dependency on the Cache module.
In my conception of "modular design", I can distribute each component separately and make it interact with others developed by other people. In the case I showed, if someone wants to use my Database Access module, they will have to take the Cache as well, even if they will not use it, just for referencing/naming purposes.
And so, I was wondering if this is really a modular design.
I came up with an alternative that is something like creating each component on its own, without it even knowing about the existence of other components that are not absolutely required for its functioning. To extend functionality, I could create some structure based on Decorators and Adapters.
To clarify things a little bit, here is an example (in PHP):
Before
interface Cache {
    public function isValid();
    public function setValue();
    public function getValue();
}

interface CacheManager {
    public function get($name);
    public function put($name, $value);
}

// Some concrete implementations...

interface DbAccessInterface {
    public function doComplexOperation();
}

class DbAccess implements DbAccessInterface {
    private $cacheManager;

    public function __construct(..., CacheManager $cacheManager = null) {
        // ...
        $this->cacheManager = $cacheManager;
    }

    public function doComplexOperation() {
        if ($this->cacheManager !== null) {
            // return from cache if valid
        }
        // complex operation
    }
}
After
interface Cache {
    public function isValid();
    public function setValue();
    public function getValue();
}

interface CacheManager {
    public function get($name);
    public function put($name, $value);
}

// Some concrete implementations...

interface DbAccessInterface {
    public function doComplexOperation();
}

class DbAccess implements DbAccessInterface {
    public function __construct(...) {
        // ...
    }

    public function doComplexOperation() {
        // complex operation
    }
}
// And now the integration module
class CachedDbAccess implements DbAccessInterface {
    private $dbAccess;
    private $cacheManager;

    public function __construct(DbAccessInterface $dbAccess, CacheManager $cacheManager) {
        $this->dbAccess = $dbAccess;
        $this->cacheManager = $cacheManager;
    }

    public function doComplexOperation() {
        $cache = $this->cacheManager->get("Foo");
        if ($cache->isValid()) {
            return $cache->getValue();
        }
        // Not cached: delegate to the wrapped DbAccess (and store the result in the cache)
        return $this->dbAccess->doComplexOperation();
    }
}
Now my question is:
Is this the best solution? Should I do this for all the modules that are not required to work together but can be more efficient when doing so?
Would anyone do it in a different way?
I have some further questions involving this, but I don't know if this is an acceptable question for Stack Overflow.
P.S.: English is not my first language, so some parts may be a little confusing.
Some resources (not theoretical):
Nuclex Plugin Architecture
Python Plugin Application
C++ Plugin Architecture (use NoScript on that site, they have some weird login policies)
Other SO threads (design pattern for plugins in php)
Django Middleware concept
Does just referencing another module blow up the modularity of my system?
Not necessarily. It's a dependency. Having dependencies is perfectly normal. Without dependencies, modules can't interact with each other (unless you make them interact indirectly, which in general is a bad practice since it hides dependencies and complicates the code). Modular design implies managing dependencies, not removing them.
One tool is using interfaces. Referencing a module via an interface creates a so-called soft dependency. Such a module can accept any implementation of the interface as a dependency, so it is more independent and, as a result, more maintainable.
The other tool is designing modules (and their interfaces) so that they have only a single responsibility. This also makes them more granular, independent, and maintainable.
But there is a line you should not cross: blindly applying these tools may lead to an overly modular and overly generic design. Making things too granular makes the whole system more complex. You should not try to solve universe-scale problems by making generic modules that all developers can use (unless that is your goal). First of all, your system should solve your domain tasks; make things generic enough, but not more than that.
I came up with an alternative that is something like creating each component on its own, without it even knowing about the existence of other components that are not absolutely required for its functioning
It is great if you came up with this idea by yourself. The statement itself is a key to modular programming.
A plugin architecture is the best in terms of extensibility, but IMHO it is hard to maintain, especially within a single application. And depending on the complexity of the plugin architecture, it can make your code more complex by adding plugin logic, etc.
Thus, for intra-application modular design, I choose an N-tier, interface-based architecture. Basically, the architecture relies on these tiers:
1. Domain / Entity
2. Interface [depends on 1]
3. Services [depend on 1 and 2]
4. Repository / DAL [depends on 1 and 2]
5. Presentation Layer [depends on 1, 2, 3, and 4]
Unfortunately, I don't think this is achievable neatly in PHP projects, as it needs separate project / DLL references for each tier. However, following the architecture can help to modularize the application.
For each module, we need to do interface-based design. It can help to enhance the modularity of your code, because you can change the implementation later but still keep the consumer the same.
I have provided an answer about a similar interface-based design at this Stack Overflow question.
Last but not least, if you want to make your application modular up to the UI, you can use a Service Oriented Architecture. This simply means making your application a bunch of services and then having the UI consume those services. This design can help to separate your UI from your logic. You can later use a different UI, such as a desktop app, but still use the same logic. Unfortunately, I don't have any reliable source for SOA.
EDIT:
I misunderstood the question. This is my point of view about a modular framework. Unfortunately, I don't know much about Zend, so I will give examples in C#:
It consists of modules, from the smallest to larger ones. An example in C# is that you can use Windows Forms (larger) in your application, and also the Graphics (smaller) class to draw custom shapes on the screen.
It is extensible or replaceable without making changes to the base class. In C# you can attach a handler to the Form's Load event (extensible), inherit from the Form or List class (extensible), or override the form's drawing method to create custom window graphics (replaceable).
(Optional) It is easy to use. In normal DI interface design, we usually inject smaller modules into a larger (higher-level) module. This will require an IoC container. Refer to my question for details.
It is easy to configure and does not involve any magical logic such as the Service Locator pattern. Search for "Service Locator is an anti-pattern" on Google.
I don't know much about Zend, but I guess that modularity in Zend can mean that it can be extended without changing the core (replacing code) inside the framework.
If you said that:
if someone wants to use my Database Access module, they will have to take the Cache as well, even if they will not use it, just for referencing/naming purposes.
then it is not modular. It is integrated, meaning that your Database Access module will not work without the Cache. As a comparison from C#, the framework chooses to provide List<T> and BindingList<T> for different functionality. In your case, IMHO it is better to provide CachedDbAccess and DbAccess separately.

What's an appropriate DAO structure with jpa2/eclipselink?

I have JPA entities and need to perform logic with them. Until now, a huge static database class did the job. It's ugly because every public interface method had a private equivalent that used the EntityManager to perform transactions. But I could solve that by having a static em too!
However, I'm wondering if that's an appropriate design, especially as the class is responsible for many things.
Not surprisingly, the code I found online from real projects was not easy to understand (so I might as well remain with my code).
The code here is easy to understand, although maybe overly generic? Anyway, it's built on top of JDBC. Still, it is insightful: why use factories and singletons for DAOs?
I've thought of making the em instance a singleton as follows:
public class Database {

    private static final Map<String, EntityManager> ems = new HashMap<String, EntityManager>();

    private final EntityManager em;
    private final EntityManagerFactory emf;
    private final String persistenceUnitName;

    public Database(final String persistenceUnitName) {
        if (ems.containsKey(persistenceUnitName)) {
            em = ems.get(persistenceUnitName);
        } else {
            ems.put(persistenceUnitName,
                em = Persistence.createEntityManagerFactory(persistenceUnitName).createEntityManager());
        }
        emf = em.getEntityManagerFactory();
        this.persistenceUnitName = persistenceUnitName;
    }

    public void beginTransaction() {
        em.getTransaction().begin();
    }

    public void commitTransaction() {
        em.getTransaction().commit();
    }
}
This way, creation of instances is standard while still maintaining a singleton Connection/EntityManager per persistence unit.
On the other hand, I wondered whether there was a need to make the ems singletons in the first place.
The motivation is that with multiple ems I run into locking problems (I'm not using em.lock()).
Any feedback? Any real-world or tutorial code that demonstrates DAOs with JPA 2 and EclipseLink?
Personally, I don't see the added value of shielding the EntityManager (which is an implementation of the Domain Store pattern) with a DAO and I would use it directly from the services, unless switching from JPA is a likely event. But, quoting An interesting debate about JPA and the DAO:
Adam said that he had met only very few cases in which a project switched the database vendor, and no cases in which the persistence moved to something other than an RDBMS. Why should you pay more for something that is unlikely to happen? Sometimes, when it happens, a simpler solution might have already paid for itself, and it might turn out to be simpler to rewrite a component.
I totally share the above point of view.
Anyway, the question that remains open is the lifecycle of the EntityManager and the answer highly depends on the nature of your application (a web application, a desktop application).
Here are some links that might help to decide what would be appropriate in your case:
Re: JPA DAO in Desktop Application
Using the Java Persistence API in Desktop Applications
Eclipselink in J2SE RCP Applications
Developing Applications Using EclipseLink JPA (ELUG)
An interesting debate about JPA and the DAO
And if you really want to go the DAO way, you could:
use Spring JPA support,
use some generic DAO library like generic-dao, krank, DAO Fusion,
roll your own generic DAO (a rough sketch follows below).
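For the "roll your own" option, a hand-written generic DAO can stay quite small. The sketch below is illustrative only (class and method names are made up, not taken from any of the libraries above) and uses plain JPA 2 APIs, so it behaves the same under EclipseLink:

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.criteria.CriteriaQuery;

// Illustrative, hand-rolled generic DAO.
public class GenericJpaDao<T, ID> {

    private final Class<T> entityClass;
    private final EntityManager em;

    public GenericJpaDao(Class<T> entityClass, EntityManager em) {
        this.entityClass = entityClass;
        this.em = em;
    }

    public T findById(ID id) {
        return em.find(entityClass, id);
    }

    public T save(T entity) {
        return em.merge(entity);
    }

    public void remove(T entity) {
        em.remove(em.contains(entity) ? entity : em.merge(entity));
    }

    public List<T> findAll() {
        // The Criteria API keeps this type-safe and vendor-neutral.
        CriteriaQuery<T> query = em.getCriteriaBuilder().createQuery(entityClass);
        query.select(query.from(entityClass));
        return em.createQuery(query).getResultList();
    }
}

A concrete DAO is then just an instance like new GenericJpaDao<MyEntity, Long>(MyEntity.class, em) (MyEntity is a placeholder), or a thin subclass that adds entity-specific queries.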
You could consider using Spring 3. Just follow their documentation for a clean design.

Alternatives for the singleton pattern?

I have been a web developer for some time now, using ASP.NET and C#. I want to try to improve my skills by following best practices.
I have a website. I want to load the settings once and just reference them wherever I need them. So I did some research, and 50% of the developers seem to be using the singleton pattern to do this. The other 50% of the developers are anti-singleton. They all hate singletons. They recommend dependency injection.
Why are singletons bad? What is best practice to load websites settings? Should they be loaded only once and referenced where needed? How would I go about doing this with dependency injection (I am new at this)? Are there any samples that someone could recommend for my scenario? And I also would like to see some unit test code for this (for my scenario).
Thanks
Brendan
Generally, I avoid singletons because they make it harder to unit test your application. Singletons are hard to mock for unit tests precisely because of their nature -- you always get the same one, not one you can configure easily for a unit test. Configuration data -- strongly-typed configuration data, anyway -- is one exception I make, though. Typically configuration data is relatively static, and the alternative involves writing a fair amount of code to avoid the static classes the framework provides to access the web.config.
There are a couple of different ways to use it that will still allow you to unit test your application. One way (maybe both ways, if your singleton doesn't lazily read the app.config) is to have a default app.config file in your unit test project providing the defaults required for your tests. You can use reflection to replace any specific values as needed in your unit tests. Typically, I'd provide a private method that allows the singleton instance to be reset in test set-up if I do make changes for particular tests.
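For reference, the "reset the singleton in test set-up" trick looks something like this in Java with JUnit. The singleton class here is a stand-in for the configuration class being discussed, and the private field name "instance" is an assumption:

import java.lang.reflect.Field;
import org.junit.Before;

// Assumed minimal singleton, standing in for the configuration class discussed above.
class AppConfiguration {
    private static AppConfiguration instance;

    static AppConfiguration getInstance() {
        if (instance == null) {
            instance = new AppConfiguration(); // would load config defaults here
        }
        return instance;
    }
}

public class AppConfigurationTestBase {

    @Before
    public void resetSingleton() throws Exception {
        // Clear the cached instance so each test reloads configuration from scratch.
        Field instance = AppConfiguration.class.getDeclaredField("instance");
        instance.setAccessible(true);
        instance.set(null, null); // null target because the field is static
    }
}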
Another way is to not actually use the singleton directly, but create an interface for it that the singleton class implements. You can use hand injection of the interface, defaulting to the singleton instance if the supplied value is null. This allows you to create a mock instance that you can pass to the class under test for your tests, but in your real code use the singleton instance. Essentially, every class that needs it maintains a private reference to the singleton instance and uses it. I like this way a little better, but since the singleton will be created you may still need the default app.config file, unless all of the values are lazily loaded.
public class Foo
{
    private IAppConfiguration Configuration { get; set; }

    public Foo() : this(null) { }

    public Foo(IAppConfiguration config)
    {
        this.Configuration = config ?? AppConfiguration.Instance;
    }

    public void Bar()
    {
        var value = this.Configuration.SomeMaximum;
        ...
    }
}
There's a good discussion of singleton patterns, and coding examples here... http://en.wikipedia.org/wiki/Singleton_pattern See also here... http://en.wikipedia.org/wiki/Dependency_injection
For some reason, singletons seem to divide programmers into strong pro- and anti- camps. Whatever the merits of the approach, if your colleagues are against it, it's probably best not to use one. If you're on your own, try it and see.
Design patterns can be amazing things. Unfortunately, the singleton seems to stick out like a sore thumb and in many cases can be considered an anti-pattern (it promotes bad practices). Bizarrely, the majority of developers will only know one design pattern, and that is the singleton.
Ideally your settings should be a member variable in a high level location, for example the application object which owns the webpages you are spawning. The pages can then ask the app for the settings, or the application can pass the settings as pages are constructed.
One way to approach this problem is to flog it off as a DAL problem.
Whatever class, web page, etc. needs to use config settings should declare a dependency on an IConfigSettingsService (factory/repository/whatever you like to call them).
private IConfigSettingsService _configSettingsService;

public WebPage(IConfigSettingsService configSettingsService)
{
    _configSettingsService = configSettingsService;
}
So your class would get settings like this:
ConfigSettings _configSettings = _configSettingsService.GetTheOnlySettings();
The ConfigSettingsService implementation would have a dependency on a DAL class. How would that DAL populate the ConfigSettings object? Who cares.
Maybe it would populate a ConfigSettings from a database or a .config XML file every time.
Maybe it would do that the first time, but then populate a static _configSettings for subsequent calls.
Maybe it would get the settings from Redis. If something indicates the settings have changed, then the DAL, or something external, can update Redis. (This approach will be useful if you have more than one app using the settings.)
Whatever it does, your only dependency is a non-singleton service interface. That is very easy to mock. In your tests you can have it return a ConfigSettings with whatever you want in it.
In reality it would more likely be MyPageBase that has the IConfigSettingsService dependency, but it could just as easily be a web service, a Windows service, an MVC whatsit, or all of the above.
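To show how easy that is to mock, here is a rough Java/Mockito sketch of the same shape; the type and method names simply mirror the hypothetical C# ones above (in C# you would typically do the equivalent with a library such as Moq):

import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

// Minimal stand-ins mirroring the C# sketch above.
interface ConfigSettingsService {
    ConfigSettings getTheOnlySettings();
}

class ConfigSettings {
    final String connectionString;
    ConfigSettings(String connectionString) {
        this.connectionString = connectionString;
    }
}

class WebPage {
    private final ConfigSettingsService configSettingsService;
    WebPage(ConfigSettingsService configSettingsService) {
        this.configSettingsService = configSettingsService;
    }
    String connectionString() {
        return configSettingsService.getTheOnlySettings().connectionString;
    }
}

public class WebPageTest {
    @Test
    public void usesWhateverSettingsTheTestProvides() {
        // The class under test sees only the interface, so the test controls the settings.
        ConfigSettingsService service = mock(ConfigSettingsService.class);
        when(service.getTheOnlySettings())
            .thenReturn(new ConfigSettings("Server=test;Database=app"));

        assertEquals("Server=test;Database=app",
            new WebPage(service).connectionString());
    }
}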

How can I avoid global state?

So, I was reading the Google testing blog, and it says that global state is bad and makes it hard to write tests. I believe it--my code is difficult to test right now. So how do I avoid global state?
The biggest thing I use global state (as I understand it) for is managing key pieces of information across our development, acceptance, and production environments. For example, I have a static class named "Globals" with a static member called "DBConnectionString." When the application loads, it determines which connection string to load and populates Globals.DBConnectionString. I load file paths, server names, and other information in the Globals class.
Some of my functions rely on the global variables. So, when I test my functions, I have to remember to set certain globals first or else the tests will fail. I'd like to avoid this.
Is there a good way to manage state information? (Or am I understanding global state incorrectly?)
Dependency injection is what you're looking for. Rather than having those functions go out and look for their dependencies, inject the dependencies into the functions. That is, when you call the functions, pass them the data they need. That way it's easy to put a testing framework around a class, because you can simply inject mock objects where appropriate.
It's hard to avoid all global state, but the best way to minimize it is to use factory classes at the highest level of your application and base everything below that top level on dependency injection.
Two main benefits: one, testing is a heck of a lot easier, and two, your application is much more loosely coupled. You rely on being able to program against the interface of a class rather than its implementation.
Keep in mind that if your tests involve actual resources such as databases or filesystems, then what you are writing are integration tests rather than unit tests. Integration tests require some preliminary setup, whereas unit tests should be able to run independently.
You could look into using a dependency injection framework such as Castle Windsor, but for simple cases you may be able to take a middle-of-the-road approach such as:
public interface ISettingsProvider
{
    string ConnectionString { get; }
}

public class TestSettings : ISettingsProvider
{
    public string ConnectionString { get { return "testdatabase"; } }
}

public class DataStuff
{
    private ISettingsProvider settings;

    public DataStuff(ISettingsProvider settings)
    {
        this.settings = settings;
    }

    public void DoSomething()
    {
        // use settings.ConnectionString
    }
}
In reality you would most likely read from config files in your implementation. If you're up for it, a full-blown DI framework with swappable configurations is the way to go, but I think this is at least better than using Globals.ConnectionString.
Great first question.
The short answer: make sure your application is a function from ALL its inputs (including implicit ones) to its outputs.
The problem you're describing doesn't seem like global state. At least not mutable state. Rather, what you're describing seems like what is often referred to as "The Configuration Problem", and it has a number of solutions. If you're using Java, you may want to look into light-weight injection frameworks like Guice. In Scala, this is usually solved with implicits. In some languages, you will be able to load another program to configure your program at runtime. This is how we used to configure servers written in Smalltalk, and I use a window manager written in Haskell called Xmonad whose configuration file is just another Haskell program.
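For instance, with Guice the connection string from the question could be bound once at startup and injected wherever it's needed. This is only a rough sketch: the binding name, module, and classes below are made up, though the Guice APIs used (AbstractModule, bindConstant, @Named, @Inject) are real.

import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Inject;
import com.google.inject.Injector;
import com.google.inject.name.Named;
import com.google.inject.name.Names;

class ConfigModule extends AbstractModule {
    @Override
    protected void configure() {
        // Chosen once at startup, per environment (dev / acceptance / production).
        bindConstant()
            .annotatedWith(Names.named("db.connectionString"))
            .to("jdbc:postgresql://localhost/dev");
    }
}

class ReportRepository {
    private final String connectionString;

    @Inject
    ReportRepository(@Named("db.connectionString") String connectionString) {
        this.connectionString = connectionString; // injected, not read from a Globals class
    }
}

class Main {
    public static void main(String[] args) {
        Injector injector = Guice.createInjector(new ConfigModule());
        // Guice builds ReportRepository and supplies the bound connection string.
        ReportRepository repository = injector.getInstance(ReportRepository.class);
    }
}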
Here is an example of dependency injection in an MVC setting:
index.php
$container = new Container();
include 'container.php';
container.php
$container->add('database.driver', 'mysql');
$container->add('database.name', 'app');
// ...
$container->add('database', new Database($container->get('database.driver'), $container->get('database.name')));
$container->add('dao', new Dao($container->get('database')));
$container->add('service', new Service($container->get('dao')));
$container->add('controller', new Controller($container->get('service')));
$container->add('frontController', new FrontController());
index.php continues here:
$frontController = $container->get('frontController');
$controllerClass = $frontController->getController($_SERVER['REQUEST_URI']);
$controllerAction = $frontController->getAction($_SERVER['REQUEST_URI']);
$controller = $container->get('controller');
$controller->$controllerAction();
And there you have it: the controller depends on a service-layer object, which depends on a DAO (data access object), which depends on a database object, which depends on the database driver, name, etc.