Performance difference - Jackson ObjectMapper.writeValue(writer, val) vs ObjectMapper.writeValueAsString(val)

Is there any significant performance difference between the following two?
String json = mapper.writeValueAsString(searchResult);
response.getWriter().write(json);
vs
mapper.writeValue(response.getWriter(), searchResult);

writeValueAsString JavaDoc says:
Method that can be used to serialize any Java value as a String.
Functionally equivalent to calling writeValue(Writer,Object) with
StringWriter and constructing String, but more efficient.
So, in case you want to write JSON to a String, it is better to use this method than writeValue. Both of these methods use _configAndWriteValue.
In your case it is better to write JSON directly to response.getWriter() than to generate a String object and then write it to response.getWriter().
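For what it's worth, a minimal sketch of the direct-writing variant inside a servlet (the SearchServlet class and the findResults helper are made up for illustration; only mapper.writeValue(resp.getWriter(), ...) reflects the question):
import com.fasterxml.jackson.databind.ObjectMapper;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

public class SearchServlet extends HttpServlet {
    // Reuse a single ObjectMapper; it is thread-safe once configured.
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Object searchResult = findResults(req); // hypothetical lookup
        resp.setContentType("application/json");
        resp.setCharacterEncoding("UTF-8");
        // Serialize straight to the response writer; no intermediate String is built.
        mapper.writeValue(resp.getWriter(), searchResult);
    }

    private Object findResults(HttpServletRequest req) {
        // Placeholder result so the sketch is self-contained.
        return java.util.Collections.singletonMap("query", req.getParameter("q"));
    }
}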

Apply SQL "LIKE" to bytes

I must create a DAO with Hibernate that works in a generic way, meaning it executes queries based on property types.
My generic DAO works OK when filtering String properties of any class; it supports "contains", "starts with", and "ends with" using "like" restrictions:
Restrictions.like(propertyName, (String) value, getMatchMode());
The problem I have is that I also need similar "contains", "starts with", and "ends with" filtering for byte[] properties. The Hibernate
SimpleExpression like(String propertyName, Object value)
API does not work here (probably entirely expected), so I was thinking maybe I could convert the bytes stored in the DB into a String and then, as a workaround, apply the normal String-based Restrictions.like API.
The problem is that I think there's no standard way to convert a byte[] into a String, since there's no standard binary data type among DB platforms. I mean, Oracle uses "RAW", HSQLDB uses "VARBINARY", and so on (Oracle has its own RAWTOHEX, for instance).
If any of you have an idea how to sort out this problem, it will be very welcome.
Cheers.
///RGB
In MySQL you could use HEX to convert BINARY to a String, i.e.
SELECT *
FROM myTable
WHERE HEX(myBinaryField) LIKE 'abc%'
In your Java code you could use some Base64 Encoder, which will convert bytes to string. Then you could just persist the Base64 encoded String and use normal LIKE queries. Maybe not the most efficient way, but it should work well.
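A rough sketch of that idea in plain Java (the values are made up; note that a byte prefix only maps onto a Base64 prefix when its length is a multiple of 3, so this is easiest for "starts with"):
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64LikeSketch {
    public static void main(String[] args) {
        byte[] payload = "abcdef".getBytes(StandardCharsets.UTF_8);

        // Persist this encoded form in a String column instead of a byte[] column.
        String stored = Base64.getEncoder().encodeToString(payload);

        // "starts with" on the raw bytes becomes a LIKE pattern on the encoded prefix
        // (a prefix length of 3 bytes keeps the Base64 alignment intact).
        byte[] prefix = "abc".getBytes(StandardCharsets.UTF_8);
        String pattern = Base64.getEncoder().withoutPadding().encodeToString(prefix) + "%";

        System.out.println(stored);  // YWJjZGVm
        System.out.println(pattern); // YWJj%
    }
}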

Logger slf4j advantages of formatting with {} instead of string concatenation

Is there any advantage of using {} instead of string concatenation?
An example from slf4j
logger.debug("Temperature set to {}. Old temperature was {}.", t, oldT);
instead of
logger.debug("Temperature set to"+ t + ". Old temperature was " + oldT);
I think it's about speed optimization, because parameter evaluation (and string concatenation) can be avoided at runtime depending on a config file. But only two parameters are possible, so sometimes there is no choice other than string concatenation. I'd like to hear views on this.
It is about string concatenation performance. It's potentially significant if you have dense logging statements.
(Prior to SLF4J 1.7) But only two parameters are possible
Because the vast majority of logging statements have 2 or fewer parameters, the SLF4J API up to version 1.6 covers (only) the majority of use cases. The API designers have provided overloaded methods with varargs parameters since API version 1.7.
For those cases where you need more than 2 and you're stuck with pre-1.7 SLF4J, just use either string concatenation or new Object[] { param1, param2, param3, ... }. There should be few enough of them that the performance is not as important.
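For example, with a pre-1.7 API a three-argument call would look roughly like this (time and user are made-up variables):
// Pre-1.7: no varargs overload, so pass an explicit Object[].
logger.debug("Temperature set to {} at {} by {}.", new Object[] { t, time, user });

// SLF4J 1.7+: the varargs overload lets you drop the array.
logger.debug("Temperature set to {} at {} by {}.", t, time, user);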
Short version: Yes it is faster, with less code!
String concatenation does a lot of work without knowing whether it is needed (the traditional "is debugging enabled" test known from log4j), and should be avoided if possible, since {} allows delaying the toString() call and the string construction until after it has been decided whether the event needs capturing. Having the logger format a single string also makes the code cleaner, in my opinion.
You can provide any number of arguments. Note that if you use an old version of slf4j and you have more than two arguments for the {} placeholders, you must use the new Object[]{a,b,c,d} syntax to pass an array instead. See e.g. http://slf4j.org/apidocs/org/slf4j/Logger.html#debug(java.lang.String, java.lang.Object[]).
Regarding the speed: Ceki posted a benchmark a while back on one of the lists.
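To make the "is debugging enabled" point concrete, here is roughly what the classic guarded style looks like next to the placeholder style (a sketch only, using the same variables as the question):
// Classic log4j-era guard: avoids the concatenation when DEBUG is off, but clutters the code.
if (logger.isDebugEnabled()) {
    logger.debug("Temperature set to " + t + ". Old temperature was " + oldT);
}

// Placeholder form: no guard needed; the message (and the arguments' toString())
// is only produced if the event is actually logged.
logger.debug("Temperature set to {}. Old temperature was {}.", t, oldT);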
Since String is immutable in Java, the left and right strings have to be copied into a new String for every single concatenation. So you'd better go for the placeholder.
Another alternative is String.format(). We are using it in jcabi-log (static utility wrapper around slf4j).
Logger.debug(this, "some variable = %s", value);
It's much more maintainable and extendable. Besides, it's easy to translate.
I think that, from the author's point of view, the main reason is to reduce the overhead of string concatenation. If you read the logger's documentation, you will find the following words:
/**
 * <p>This form avoids superfluous string concatenation when the logger
 * is disabled for the DEBUG level. However, this variant incurs the hidden
 * (and relatively small) cost of creating an <code>Object[]</code> before invoking the method,
 * even if this logger is disabled for DEBUG. The variants taking
 * {@link #debug(String, Object) one} and {@link #debug(String, Object, Object) two}
 * arguments exist solely in order to avoid this hidden cost.</p>
 *
 * @param format the format string
 * @param arguments a list of 3 or more arguments
 */
public void debug(String format, Object... arguments);
Concatenation is expensive, so you want it to happen only when needed. By using {}, slf4j performs the concatenation only if the trace is needed. In production, you may configure the log level to INFO, thus ignoring all debug traces.
A trace like this will concatenate the string even if the trace is ignored, which is a waste of time:
logger.debug("Temperature set to " + t + ". Old temperature was " + oldT);
A trace like this will be ignored at no cost:
logger.debug("Temperature set to {}. Old temperature was {}.", t, oldT);
If you have a lot of debug traces that you ignore in production, using {} is definitely better as it has no impact on performance.
Compliant logging is highly important for application development, as it affects performance. The non-compliant (concatenation) style results in a redundant toString() invocation on every call, plus redundant temporary memory allocation and CPU processing; this shows up clearly in high-scale test executions when you look at the allocated temporary memory and at method profiling.
Note: I am the author of the blog post Logging impact on application performance.

cli/c++ increment operator overloading

I have a question regarding operator overloading in a C++/CLI environment:
static Length^ operator++(Length^ len)
{
    Length^ temp = gcnew Length(len->feet, len->inches);
    ++temp->inches;
    temp->feet += temp->inches / temp->inchesPerFoot;
    temp->inches %= temp->inchesPerFoot;
    return temp;
}
(The code is from Ivor Horton's book.)
Why do we need to declare a new class object (temp) on the heap just to return it?
I've googled for information on overloading, but there's really not much out there and I feel kind of lost.
This is the way operator overloading is implemented in .NET. An overloaded operator is a static function that returns a new instance instead of changing the current one. Therefore, the postfix and prefix ++ operators are the same. Most information about operator overloading talks about native C++. You can find .NET-specific information by looking at C# samples, for example this: http://msdn.microsoft.com/en-us/library/aa288467(v=vs.71).aspx
The .NET GC allows you to create a lot of lightweight new instances, which are collected automatically. This is why .NET overloaded operators are simpler than in native C++.
Yes, because you're overloading the post-increment operator here. Hence, the original value may be used a lot in the code, copied and stored somewhere else, despite the existence of the new value. Example:
store_length_somewhere( len++ );
While len will be increased, the original value might be stored by the function somewhere else. That means that you might need two different values at the same time. Hence the creation and return of a new value.

Most appropriate data structure for dynamic languages field access

I'm implementing a dynamic language that will compile to C#, and it's implementing its own reflection API (.NET's is too slow, and the DLR is limited only to more recent and resourceful implementations).
For this, I've implemented a simple .GetField(string f) and .SetField(string f, object val) interface. Until recently, the implementation just switched over all possible field string values and performed the corresponding action.
Also, this dynamic language has the possibility to define anonymous objects. For those anonymous objects, at first, I had implemented a simple hash algorithm.
By now, I am looking for ways to optimize the dynamic parts of the language, and I have come across the fact that a hash algorithm for anonymous objects would be overkill. This is because the objects are usually small: I'd say they contain 2 or 3 fields, normally, and very rarely more than 15. It would take more time to hash the string and perform the lookup than to simply test for equality against them all. (This is not tested, just theoretical.)
The first thing I did was, at compile time, to create a red-black tree for each anonymous object declaration and lay it out onto an array, so that the object can look fields up in a very optimized way.
I am still divided, though, on whether that's the best way to do this. I could go for a perfect hashing function. Even more radically, I'm thinking about dropping the need for strings and actually working with a struct of 2 longs.
Those two longs will be encoded to support 10 chars (A-Za-z0-9_) each, which is mostly a good prediction of the size of the fields. For fields larger than this, a special (slower) function receiving a string will also be provided.
The result will be that strings will be inlined (not references), and their comparisons will be as cheap as a long comparison.
Anyway, it's a little hard to find good information about this kind of optimization, since it is normally considered at the VM level, not in a static-language compilation implementation.
Does anyone have any thoughts or tips about the best data structure to handle dynamic calls?
Edit:
For now, I'm really going with the string as long representation and a linear binary tree lookup.
I don't know if this is helpful, but I'll chuck it out in case;
If this is compiling to C#, do you know the complete list of fields at compile time? So as an idea, if your code reads
// dynamic
myObject.foo = "some value";
myObject.bar = 32;
then during the parse, your symbol table can assign an int to each field name;
// parsing code
symbols[0] == "foo"
symbols[1] == "bar"
then generate code using arrays or lists;
// generated c#
runtimeObject[0] = "some value"; // assign myobject.foo
runtimeObject[1] = 32; // assign myobject.bar
and build up reflection as a separate array;
runtimeObject.FieldNames[0] == "foo"; // Dictionary<int, string>
runtimeObject.FieldIds["foo"] == 0; // Dictionary<string, int>
As I say, thrown out in the hope it'll be useful. No idea if it will!
Since you are likely to be using the same field and method names repeatedly, something like string interning would work well to quickly generate keys for your hash tables. It would also make string equality comparisons constant-time.
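As a rough illustration of the interning idea in Java (the question targets C#, where String.Intern plays the same role): once two keys are interned, comparing them degenerates to a reference comparison.
public class InternSketch {
    public static void main(String[] args) {
        String part = "field";                        // defeat compile-time constant folding
        String a = new String("fieldName").intern();
        String b = (part + "Name").intern();          // runtime concatenation, then canonicalized

        // Both now point at the same canonical instance, so identity comparison suffices.
        System.out.println(a == b); // true
    }
}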
For such a small data set (an expected upper bound of 15) I think almost any hashing will be more expensive than a tree or even a list lookup, but that is really dependent on your hashing algorithm.
If you want to use a dictionary/hash then you'll need to make sure the objects you use for the key return a hash code quickly (perhaps a single constant hash code that's built once). If you can prevent collisions inside of an object (sounds pretty doable) then you'll gain the speed and scalability (well for any realistic object/class size) of a hash table.
Something that comes to mind is Ruby's symbols and message passing. I believe Ruby's symbols act as a constant that is just a memory reference, so comparison is constant-time, they are very lightweight, and you can use symbols like variables (I'm a little hazy on this and don't have a Ruby interpreter on this machine). Ruby's method "calling" really turns into message passing: something like obj.func(arg) turns into obj.send(:func, arg) (":func" is the symbol). I would imagine that the symbol makes looking up the message handler (as I'll call it) inside the object pretty efficient, since its hash code most likely doesn't need to be calculated like most objects'.
Perhaps something similar could be done in .NET.

How to convert this code to LINQ?

I'm trying to write this as LINQ,
Original code:
For Each CurrentForm As Form In MyForms
    AddLink(CurrentForm.GetLink())
Next
I'm a LINQ beginner, so far I'm not quite sure where to use it and where not to. If in this case LINQ will do more harm than help, feel free to flame me.
Edit: You can assume that there is an overload of AddLink() which takes an IEnumerable.
Unless there is an overload of AddLink which takes a collection, LINQ won't avoid the loop.
If there is such an overload then something like:
AddLinks(MyForms.Select(f => f.GetLink()));
would do it.
How the above expression works (briefly):
LINQ is about expressions that take some object (for LINQ to Objects, used here, always a collection).
The Select extension method takes a collection and a function and returns a new collection: the function is passed each element of the input collection, and Select returns the collection made up of all the function's return values.
I have used a lambda expression to create an anonymous function that takes one argument called f (its type will be determined by the compiler) and returns the value of the expression (now corrected).
AddLinks is an assumed variant of your AddLink which takes a collection of links.
There is a lot going on here; this is one of the advantages of LINQ: it is a compact way of expressing data manipulation without the usual overhead of explicit loops and temporary variables.
No flames here, but LINQ won't really help here. If LINQ had a ForEach method (as has been discussed in a previous question, as well as elsewhere) then you could use that - but it's not built into LINQ, and in this case there doesn't really seem to be much use for it.
Of course, it depends exactly what AddLink does - if it adds a link to a list, and you could instead use (say) List.AddRange, then you could use LINQ. But this code seems pretty simple and readable already, so I wouldn't worry in this case.