Jackson serialization should ignore already included objects

This seems like a fairly simple answer to a common problem: the infinite recursion that Jackson detects during serialization. If, while serializing an object tree, Jackson comes upon an object it has already serialized, why doesn't it just ignore it? Is there a way to do this in Jackson, or has someone created something similar?
Why all this mucking around with JsonManagedReference/JsonBackReference, which is completely insufficient if you serialize child objects some of the time (which need a reference to the parent) and parent objects some of the time (which obviously don't want the child to refer back to them)?
It seems I now have to create custom views that take into account every possible type of circular reference and use case, which in any non-trivial ORM is a huge task.

EDIT (October 2012)
Jackson 2.x now DOES support identity information handling with the @JsonIdentityInfo annotation! So the original answer is a bit out of date...
OBSOLETE
Jackson does not support handling of object identity. This is a non-trivial task, not so much because of identifying shared objects, which can be done by traversing the object graph (incurring some overhead), but because of figuring out how to include identity information: which ids to use and how. This in turn is somewhat similar to inclusion of type information, but adds a second dimension of extra wrapping to handle.
Doing this has been requested before, and some thought has gone into figuring out how to do it, but the ratio of effort to benefit (i.e. number of requests, how badly it is needed) has been higher than for adding other features.
So your best bet is to use wrapper objects and implement this manually, or have a look at XStream, which can solve this (when enabled; it adds significant overhead in time) and also has a JSON output mode using Jettison.
Implementing this manually for your use case is a bit easier than solving the general case: you could start with BeanSerializerModifier to add a wrapper handler that keeps track of object identities and knows what to serialize instead as the object id.
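The idea of tracking object identities and writing an id in place of an already serialized object can be sketched without Jackson at all. The following is a minimal, hand-rolled illustration in plain Java, not Jackson's implementation and not a BeanSerializerModifier; the class names GraphNode and IdentityWriter and the "@id" property name are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model class used only for this illustration.
final class GraphNode {
    final String name;
    final List<GraphNode> links = new ArrayList<>();

    GraphNode(String name) {
        this.name = name;
    }
}

// Hypothetical writer: the first occurrence of an object is written in full
// together with an id; every later occurrence is written as that id only.
final class IdentityWriter {
    // IdentityHashMap compares keys by reference (==), not by equals().
    private final Map<GraphNode, Integer> seen = new IdentityHashMap<>();

    static String write(GraphNode root) {
        StringBuilder out = new StringBuilder();
        new IdentityWriter().append(root, out);
        return out.toString();
    }

    private void append(GraphNode node, StringBuilder out) {
        Integer id = seen.get(node);
        if (id != null) {
            out.append(id); // already serialized: emit the id and stop recursing
            return;
        }
        id = seen.size() + 1;
        seen.put(node, id);
        // Names are written as-is; a real writer would have to escape them.
        out.append("{\"@id\":").append(id)
           .append(",\"name\":\"").append(node.name).append("\",\"links\":[");
        for (int i = 0; i < node.links.size(); i++) {
            if (i > 0) {
                out.append(',');
            }
            append(node.links.get(i), out);
        }
        out.append("]}");
    }
}
```

For a parent and child that reference each other, the child's back-link is written as the bare id 1 rather than recursing forever. A consumer therefore has to resolve ids to see the attributes of an object that occurs more than once.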

Related

Best way of handling Jackson bi-directional references

I'm trying to build REST APIs for our core components using Jackson, and some of the objects trigger this exception:
javax.ws.rs.ProcessingException: com.fasterxml.jackson.databind.JsonMappingException: Infinite recursion (StackOverflowError)
After searching, I learned about several ways to solve it, e.g.
https://www.baeldung.com/jackson-bidirectional-relationships-and-infinite-recursion
I used @JsonIdentityInfo, which is working fine for me, but the question is WHAT IS THE BETTER WAY TO DO IT?
In this post:
Infinite Recursion with Jackson JSON and Hibernate JPA issue
There is a claim that @JsonIdentityInfo needs to be used with caution because it can cause problems:
In this case you've got to be careful, since you could need to read your object's attributes more than once (for example in a products list with more products that share the same seller), and this annotation prevents you to do so. I suggest to always take a look at firebug logs to check the Json response and see what's going on in your code.
I reached this article as well: http://springquay.blogspot.com/2016/01/new-approach-to-solve-json-recursive.html
@JsonIdentityInfo
I understand that @JsonIdentityInfo is the newer approach, available in Jackson 2.
An advantage is that it requires minimal code change (just putting this annotation on the problematic object model, with no need to handle it from the other side).
A drawback is explained in the quote above.
@JsonIgnoreProperties
It requires changing more classes rather than just annotating the base one, and I'm not sure how it will work if I have more than one class inheriting from that object model.

Tracking model object attributes changes (dirty) in Cocoa

I'm trying to gain insight into the least overhead solution to tracking model object changes in Cocoa.
As I see it there are 3 options:
1. Use Core Data: lots of functionality exists for monitoring model object changes (Core Data NSManagedObject - tracking if attribute was changed). I don't know what the overhead of Core Data's management infrastructure is compared to other approaches, but its well-established architecture for multi-threading support is a plus. For cross-platform devs there is some downside in not having a readily accessible schema, but there are ways around that issue.
2. Write custom accessors that mark the object as dirty when updating a field with a new value. I've been using this technique with mixed success for quite some time. There are some sticky issues to deal with when sharing objects across threads. You also don't get the benefits of enhancements to automatic synthesis of attributes, etc. You do, however, have greater control of your data store than when using Core Data, which can be of benefit (e.g. certain operations can be done in a SQL store across many objects in a much more efficient way). Note: There could be a lot of variation here depending on how you write the accessors. For the sake of conversation let's assume setters check the new value against the old one, make appropriate calls to KVO (willChange / didChange), and set a boolean flag (all within synchronization of course).
3. Use KVO to monitor object fields (à la keyPathsForValuesAffectingValueForKey:) and mark the object as dirty in the KVO callout. I have yet to use this method but it seems like a decent approach. The obvious downside would be the callout every time a setter is called.
I am inclined to think that option 2 has the lowest overhead (in terms of raw processing requirements) given that Core Data and KVO both have some additional overhead either in the generated accessors or in the KVO callouts. The question is, how substantial is the overhead?
And lastly, did I miss an option?
Thanks.
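The custom-accessor approach (option 2) boils down to a setter that compares the old and new values, stores the new one, and sets a flag, all under synchronization. The sketch below shows that shape in Java rather than Objective-C, so it leaves out the KVO willChange/didChange calls entirely; TrackedPerson and its method names are hypothetical.

```java
import java.util.Objects;

// Hypothetical model object with a hand-written dirty flag.
final class TrackedPerson {
    private String name;
    private boolean dirty;

    // The setter only marks the object dirty when the value really changes.
    synchronized void setName(String newName) {
        if (Objects.equals(name, newName)) {
            return; // same value: no write, no dirty flag
        }
        name = newName;
        dirty = true;
    }

    synchronized String getName() {
        return name;
    }

    synchronized boolean isDirty() {
        return dirty;
    }

    // Called by the persistence code once the object has been saved.
    synchronized void markClean() {
        dirty = false;
    }
}
```

The per-call cost here is one lock acquisition and one equality check, which is the baseline the other two options would be compared against.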

Does adding PetaPoco attributes to POCO's have any negative side effects?

Our current application uses a smart object style for working with the database. We are looking at the feasibility of moving to PetaPoco instead. Looking over the features I notice you can add attributes to make it easier to CRUD objects. Does adding these attributes have any negative side effects that I should be aware of?
Has anyone found a reason NOT to use these decorators?
Directly to the use of the POCO object instance itself? None.
At least not that I would be aware of. Jon Skeet should be able to provide more info because he knows compiler inner workings through and through, so he knows exactly what happens with this metadata after it's been compiled.
Other implications indirectly related to these
There are of course implications when accessing these declarative attributes, because they're read using reflection, which is normally a slow process.
But there's nothing to worry about here: PetaPoco reads the attributes only once, then compiles and caches the result, so you are penalized once and get fast, compiled-code performance afterwards.
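The read-once-then-cache pattern described above can be illustrated with a Java analogue. This is not PetaPoco's code (PetaPoco is a .NET library); the TableName annotation, the Article class, and the TableNames helper are all hypothetical names invented for this sketch, and the counter exists only to make the single reflection read observable.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical mapping annotation, kept at runtime so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface TableName {
    String value();
}

// Hypothetical annotated model class.
@TableName("articles")
final class Article {
}

// Hypothetical metadata cache: reflection runs once per class, later
// lookups are plain map reads.
final class TableNames {
    private static final Map<Class<?>, String> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger reflectionReads = new AtomicInteger();

    static String of(Class<?> type) {
        return CACHE.computeIfAbsent(type, t -> {
            reflectionReads.incrementAndGet();
            TableName annotation = t.getAnnotation(TableName.class);
            // Fall back to the simple class name when no annotation is present.
            return annotation != null ? annotation.value() : t.getSimpleName();
        });
    }
}
```

Calling TableNames.of(Article.class) repeatedly pays the reflection cost on the first call only.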
Non-performance related implications
By putting attributes (of any kind) on your classes/properties/methods you bind your code to the particular engine that will use the class, because the attributes are directives for that engine to understand your code.
In the case of PetaPoco attributes, this means that your class can be used with PetaPoco but not with some other DAL (e.g. EF) unless you add that one's attributes as well (EF Code First uses the very same approach with attributes).
The second implication is related to back-end database. In case you rename a table, column or any other part that is provided in your PetaPoco attribute as a constant magic string, you will subsequently have to change this string as well. This just means that you have to be thorough when doing database changes...
One downside is that it breaks the separation between the "domain" layer and the "data" layer, since it introduces the PetaPoco file (which contains data logic) to domain classes that should really not have any knowledge or dependency on the data layer.
If you're doing a single-project MVC app or something then it's okay to just use the Models directory for both, but for non-trivial and separated apps you'll have to have two PetaPoco files or play around with abstracting portions of the file in order to annotate your models without making them "know too much" about the underlying data, or else have you specify the table and/or primary key name all over the place.

best practice for a function which interacts with a database

Say I have a User object, which is generated by a Usermapper. The User object does not know anything about the database/repository in use (which I believe to be good design).
When creating a User, I only want the mapper to fill it with the most trivial things, e.g. name, address etc. However, after object instantiation I might have a method userX.getTotalDebt(), and getTotalDebt() would need to reconnect to the database, because I don't want this relatively expensive operation to be done for every User instantiation (multiple tables needed etc.). If I simply put some SQL in getTotalDebt(), or add a dependency back to the Mapper, the coupling grows tight very fast.
There must be an obvious good/best practice for this, because it's a situation that arises often; however, I can't find it, or I'm looking at this problem from totally the wrong angle.
Say I have a User object, which is generated by a Usermapper. The User object does not know anything about the database/repository in use (which I believe to be good design).
They are often referred to as POCOs (Plain Old CLR Objects).
When creating a User I only want it to have it filled by the mapper with the most trivial things e.g. Name, address etc.
There are several OR/M layers which can achieve this. Use either NHibernate or Entity Framework 4.1 Code First.
I might have a method userX.getTotalDebt(), getTotalDebt() would need to reconnect to the database
Then it's not a POCO anymore, although it is possible using a transparent proxy. Both EF and NHibernate support this; it's called lazy loading.
There is an obvious good/best practice for this, because it's a situation arises often, however I can't find it or I'm looking at this problem totally from a wrong angle
I usually keep my objects dumb and disconnected. I use the Repository pattern (even if I use nhibernate or another orm) since it makes my classes testable.
I either use the repository classes directly or create a service class which contains all logic. It depends on how complex my application is.
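A minimal sketch of that arrangement, written in Java rather than C# and under assumed names (AppUser, DebtRepository, InMemoryDebtRepository, and BillingService are all hypothetical): the user object stays dumb, the expensive lookup lives behind a repository interface, and a service class holds the logic, so a test can swap in an in-memory repository.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Dumb, disconnected object: knows nothing about the database.
final class AppUser {
    final int id;
    final String name;

    AppUser(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The expensive, multi-table lookup hides behind this interface.
interface DebtRepository {
    BigDecimal totalDebtFor(int userId);
}

// Test double; a production implementation would query the database instead.
final class InMemoryDebtRepository implements DebtRepository {
    private final Map<Integer, BigDecimal> debts = new HashMap<>();

    void put(int userId, BigDecimal amount) {
        debts.put(userId, amount);
    }

    @Override
    public BigDecimal totalDebtFor(int userId) {
        return debts.getOrDefault(userId, BigDecimal.ZERO);
    }
}

// Service class that contains the logic and receives the repository.
final class BillingService {
    private final DebtRepository debts;

    BillingService(DebtRepository debts) {
        this.debts = debts;
    }

    boolean isInDebt(AppUser user) {
        return debts.totalDebtFor(user.id).signum() > 0;
    }
}
```

The debt is only fetched when a caller asks the service for it, so instantiating a user never triggers the expensive query.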

Passing object references needlessly through a middleman

I often find myself needing reference to an object that is several objects away, or so it seems. The options I see are passing a reference through a middle-man or just making something available statically. I understand the danger of global scope, but passing a reference through an object that does nothing with it feels ridiculous. I'm okay with a little bit passing around, I suppose. I suspect there's a line to be drawn somewhere.
Does anyone have insight on where to draw this line?
Or a good way to deal with the problem of distributing references amongst dependent objects?
Use the Law of Demeter (with moderation and good taste, not dogmatically). If you're coding a.b.c.d.e, something IS wrong -- you've nailed forevermore the implementation of a to have a b which has a c which... EEP!-) One or at the most two dots is the maximum you should be using. But the alternative is NOT to plump things into globals (and ensure thread-unsafe, buggy, hard-to-maintain code!), it is to have each object "surface" those characteristics it is designed to maintain as part of its interface to clients going forward, instead of just letting poor clients go through such unending chains of nested refs!
This smells of an abstraction that may need some improvement. You seem to be violating the Law of Demeter.
In some cases a global isn't too bad.
Consider, you're probably programming against an operating system's API. That's full of globals, you can probably access a file or the registry, write to the console. Look up a window handle. You can do loads of stuff to access state that is global across the whole computer, or even across the internet... and you don't have to pass a single reference to your class to access it. All this stuff is global if you access the OS's API.
So, when you consider the number of global things that often exist, a global in your own program probably isn't as bad as many people try and make out and scream about.
However, if you want to have very nice OO code that is all unit testable, I suppose you should be writing wrapper classes around any access to globals, whether they come from the OS or are declared by yourself, to encapsulate them. This means your class that uses this global state can get references to the wrappers, and they could be replaced with fakes.
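As one possible illustration of wrapping a global and replacing it with a fake, the sketch below hides the system clock (a piece of OS-level global state) behind a small interface. TimeSource, SystemTimeSource, and Session are hypothetical names chosen for this example.

```java
// Wrapper interface around a global: the current time.
interface TimeSource {
    long nowMillis();
}

// Production wrapper: delegates to the real global.
final class SystemTimeSource implements TimeSource {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

// The class under test receives the wrapper instead of touching the global.
final class Session {
    private final TimeSource time;
    private final long expiresAtMillis;

    Session(TimeSource time, long lifetimeMillis) {
        this.time = time;
        this.expiresAtMillis = time.nowMillis() + lifetimeMillis;
    }

    boolean isExpired() {
        return time.nowMillis() >= expiresAtMillis;
    }
}
```

In a test, a lambda such as `() -> fakeNow[0]` can stand in for the clock, so expiry can be checked without waiting for real time to pass.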
Hmm, anyway. I'm not quite sure what advice I'm trying to give here, other than say, structuring code is all a balance! And, how to do it for your particular problem depends on your preferences, preferences of people who will use the code, how you're feeling on the day on the academic to pragmatic scale, how big the code base is, how safety critical the system is and how far off the deadline for completion is.
I believe your question is revealing something about your classes. Maybe the responsibilities could be improved? Maybe moving some code would solve problems?
Tell, don't ask.
That's how it was explained to me. There is a natural tendency to call classes to obtain some data. Taken too far, asking too much, typically leads to heavy "getter sequences". But there is another way. I must admit it is not easy to find, but improves gradually in a specific code and in the coder's habits.
Class A wants to perform a calculation, and asks B's data. Sometimes, it is appropriate that A tells B to do the job, possibly passing some parameters. This could replace B's "getName()", used by A to check the validity of the name, by an "isValid()" method on B.
"Asking" has been replaced by "telling" (calling a method that executes the computation).
For me, this is the question I ask myself when I find too many getter calls. Gradually, the methods find their place in the correct object, and everything gets a bit simpler: I have fewer getters and fewer calls to them. I have less code, and it carries more meaning and aligns better with the functional requirement.
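The getName()-versus-isValid() example above might look like this in Java. The names Customer (playing the role of B) and Registration (playing the role of A) are hypothetical, and the validity rule is an assumption picked only to have something to check.

```java
// Plays the role of B: owns the data and the rule about it.
final class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    // "Tell": the caller asks for the answer, not for the raw name.
    boolean isValid() {
        return name != null && !name.trim().isEmpty();
    }
}

// Plays the role of A: no getName() call, no knowledge of the rule.
final class Registration {
    static String register(Customer customer) {
        return customer.isValid() ? "registered" : "rejected";
    }
}
```

If the validity rule changes later, only Customer changes; Registration and every other caller stay as they are.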
Move the data around
There are other cases where I move some data. For example, if a field moves two objects up, the length of the "getter chain" is reduced by two.
I believe nobody can find the correct model at first.
I first think about it (using hand-written diagrams is quick and a big help), then code it, then think again facing the real thing... Then I code the rest, and any smells I feel in the code, I think again...
Split and merge objects
If a method on A needs data from C, with B as a middle man, I can check whether A and C have something in common. Possibly, A or a part of A could become C (possible splitting of A, merging of A and C) ...
However, there are cases where I keep the getters of course.
But it's less likely a long chain will be created.
A long chain will probably get broken by one of the techniques above.
I have three patterns for this:
Pass the necessary reference to the object's constructor -- the reference can then be stored as a data member of the object, and doesn't need to be passed again; this implies that the object's factory has the necessary reference. For example, when I'm creating a DOM, I pass the element name to the DOM node when I construct the DOM node.
Let things remember their parent, and get references to properties via their parent; this implies that the parent or ancestor has the necessary property. For example, when I'm creating a DOM, there are various things which are stored as properties of the top-level DomDocument ancestor, and its child nodes can access those properties via the reference which each one has to its parent.
Put all the different things which are passed around as references into a single class, and then pass around just that one class instance as the only thing that's passed around. For example, there are many properties required to render a DOM (e.g. the GDI graphics handle, the viewport coordinates, callback events, etc.) ... I put all of these things into a single 'Context' instance which is passed as the only parameter to the methods of the DOM nodes to be rendered, and each method can get whichever properties it needs out of that context parameter.
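The first and third patterns can be sketched together in Java. This is an illustration under assumed names, not the answerer's actual DOM code: DomElement receives its element name in the constructor, and RenderContext is a hypothetical bundle holding just two made-up properties (an output buffer and a tag-case switch) in place of the graphics handle, viewport, and callbacks mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Pattern 3: everything rendering needs travels in one object.
final class RenderContext {
    final StringBuilder output = new StringBuilder();
    final boolean upperCaseTags;

    RenderContext(boolean upperCaseTags) {
        this.upperCaseTags = upperCaseTags;
    }
}

final class DomElement {
    private final String name;
    private final List<DomElement> children = new ArrayList<>();

    // Pattern 1: the element name is passed once, at construction.
    DomElement(String name) {
        this.name = name;
    }

    DomElement add(DomElement child) {
        children.add(child);
        return this;
    }

    // The context is the only parameter; each node takes what it needs from it.
    void render(RenderContext context) {
        String tag = context.upperCaseTags ? name.toUpperCase(Locale.ROOT) : name;
        context.output.append('<').append(tag).append('>');
        for (DomElement child : children) {
            child.render(context);
        }
        context.output.append("</").append(tag).append('>');
    }
}
```

Adding a new rendering property later means adding a field to RenderContext rather than changing the signature of every render method along the chain.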