Out of memory is one of the most frequent errors in Java and other OO languages.
Can we apply static program analysis to reduce "out of memory" errors?
I am looking for approaches that aim to reduce this error without running the code.
Are there any particular areas of analysis that research this?
Yes, static analyzers are able to detect out-of-bounds errors. They do it somewhat worse than dynamic analyzers, as they cannot trace complex cases of pointer usage. Moreover, some information is simply unavailable to a static analyzer. Let's consider an abstract situation:
void OutstandingIssue(const char *strCount)
{
    unsigned nCount;
    sscanf_s(strCount, "%u", &nCount);
    int array[10];
    memset(array, 0, nCount * sizeof(int));
}
Whether the array index goes out of bounds depends on the data received from outside. A static code analyzer is powerless here; the most it can do is warn that the entered value is used without a check. A dynamic analyzer will detect the error while the program is being tested, as it receives various variants of input data. These topics are covered here in more detail.
But if the question is "does a static analyzer really help?", the answer is yes, it does. For example, the collection of bugs gathered by the PVS-Studio team contains a good number of errors of this type detected by a static analyzer: V512, V645.
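For a Java flavour of the same limitation, here is a minimal sketch (class and method names are mine, purely illustrative): the allocation size comes straight from external input, so a static analyzer can at best warn that the value is used unchecked, while an explicit bound makes the allocation verifiably safe.

import java.util.Scanner;

public class UnvalidatedAllocation {
    // Analogous to the C example above: the size comes straight from external
    // input, so whether an OutOfMemoryError (or NegativeArraySizeException)
    // occurs depends entirely on the data supplied at run time.
    public static int[] allocateFromInput(Scanner in) {
        int count = in.nextInt();   // tainted, unvalidated value
        return new int[count];      // a static analyzer can only warn here
    }

    // A defensive variant: with an explicit upper bound, the allocation is
    // visibly bounded and the warning goes away.
    public static int[] allocateBounded(Scanner in, int maxCount) {
        int count = in.nextInt();
        if (count < 0 || count > maxCount) {
            throw new IllegalArgumentException("count out of range: " + count);
        }
        return new int[count];
    }

    public static void main(String[] args) {
        System.out.println(allocateBounded(new Scanner("42"), 1000).length);
    }
}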
I know that DbC mandates that the caller is responsible for the precondition (parameters or perhaps values of member variables), and I have just read, in one of the books, that actually few people are bold enough to really leave all the responsibility up to the calling code and not check the input in the called routine.
But doesn't that also lead to duplication? What if I need to call a method from several places? In all those places I would need to make sure the preconditions are met...
bool AddEmployee(Employee e)
{
    //precondition: List of employees is not full, employee is not empty...
    EmployeeList.Add(e);
}
I could call it from several modules (employee management, the HR module, ...), so I am not sure whether I truly should check the preconditions in all those places.
The contract states that it is the caller's responsibility to ensure that the pre-conditions are met.
The contract states clearly who is responsible for a bug. If you fail to meet a pre-condition, it's the caller. If you fail to meet a post-condition, it's the callee. That alone is useful enough to make the contract worth documenting.
Sometimes you can write your code so that the pre-conditions don't need to be checked. For example:
void Foo()
{
    int x = 1;
    Bar(x);
}

void Bar(int x) [[expects: x > 0]]
{
}
You set x yourself, so you know it satisfies x > 0.
At other times you do need to check them. It does sometimes create duplication. I have not often found this to be a significant problem, but you may sometimes see patterns like:
void SafeBar(int x)
{
    if (x <= 0) throw SomeException();
    else Bar(x);
}
This assumes of course that errors can be handled in the same way for each use, which is not always the case.
Removing pre-condition checks is a performance optimisation. As we know, premature optimisation is the root of all evil, so it should only be done when necessary.
Now another factor is implementation. Few languages support checking contracts at compile time. Contract support was recently voted into C++20, but at the time of writing there is only an experimental implementation.
C++20 uses attributes as above. Attributes are not supposed to change runtime behaviour.
If you don't have compile time support you will typically find implementations using some sort of assertion macro. Personally I use one that throws an exception. You are then using the standard exception handling mechanism for handling bugs (some consider this inappropriate), but you don't necessarily need to check the contract at the call site.
Why might it be inappropriate? It is worth remembering that a contract violation is a bug. If you run the function without meeting the pre-condition you are invoking undefined behaviour. In principle anything could happen. It could even format your hard-drive (though that is unlikely). Checking the pre-condition at runtime is like defensive coding. If the assertion causes an exception then undefined behaviour never occurs. That is safer and makes it easier to debug. But from one point of view you've modified the contract.
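Since the question is language-agnostic, here is a minimal Java sketch of that idea. Contract.require is a hypothetical helper standing in for the throwing assertion macro described above; a contract violation then surfaces as a well-defined exception instead of undefined behaviour.

// Hypothetical helper, standing in for a throwing assertion macro.
final class Contract {
    static void require(boolean condition, String message) {
        if (!condition) {
            // The violation becomes a defined, catchable failure instead of
            // undefined behaviour.
            throw new IllegalArgumentException("Precondition violated: " + message);
        }
    }
    private Contract() {}
}

class BarExample {
    static void bar(int x) {
        Contract.require(x > 0, "x > 0");   // pre-condition checked in the callee
        // ... do stuff
    }

    public static void main(String[] args) {
        bar(1);    // fine
        bar(-1);   // throws, flagging the caller's bug
    }
}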
In general checking contracts at compile time is undecidable. Quoting the linked answer:
If the Theorem Prover can prove that a contract will always be violated, that's a compile error. If the Theorem Prover can prove that a contract will never be violated, that's an optimization.
Proving contracts in general is equivalent to solving the Halting Problem and thus not possible. So, there will be a lot of cases where the Theorem Prover can neither prove nor disprove the contract.
In that case, a runtime check is emitted.
A slight aside, since the question is marked language-agnostic: an issue I have with the C++20 proposal is that it seems to omit the runtime check in the other cases. It also says explicitly that it should not be possible to set the violation handler at run time:
There should be no programmatic way of setting or modifying the violation handler
It also mandates the default choice of calling std::terminate() on contract violation, ending the whole process. This would be a bad thing(tm) for something like a multithreaded, fault-tolerant task scheduler. A bug in one task should not kill the whole process.
I think the reasoning is that C++20 contracts are intended as a compile-time feature only. That includes evaluating them in compile-time meta-programs using constexpr and consteval. The feature allows compiler vendors to start adding theorem provers to check contracts, which was not possible before. This is important and opens up many new opportunities.
Hopefully a pragmatic modification considering the runtime possibilities will follow.
The downside is that in the short term you will need to retain your assertions. If, like me, you use Doxygen for documentation (which does not yet understand contracts) you have triple redundancy. For example:
///
/// @brief
/// Bar does stuff with x
///
/// @pre
/// @code x > 0 @endcode
///
void Bar(int x) [[expects: x > 0]]
{
    { //pre-conditions
        assertion(x > 0);
    }
    // ...do stuff
}
Note that the C assert() macro doesn't throw. Hence we use our own assertion() macro that does. The CppCoreGuidelines support library includes Expects() and Ensures(). I'm not sure if they throw or call std::terminate().
TL;DR
Please provide a piece of code written in some well known dynamic language (e.g. JavaScript) and show how that code would look in Java bytecode using invokedynamic, and explain why the usage of invokedynamic is a step forward here.
Background
I have googled and read quite a lot about the not-that-new-anymore invokedynamic instruction, which everyone on the internet agrees will help speed up dynamic languages on the JVM. Thanks to Stack Overflow I managed to get my own bytecode instructions with Sable/Jasmin to run.
I have understood that invokedynamic is useful for lazy constants and I also think that I understood how the OpenJDK takes advantage of invokedynamic for lambdas.
Oracle has a small example, but as far as I can tell the usage of invokedynamic in this case defeats the purpose, as the "adder" example could be expressed more simply, faster, and with roughly the same effect using the following bytecode:
aload whereeverAIs
checkcast java/lang/Integer
aload whereeverBIs
checkcast java/lang/Integer
invokestatic IntegerOps/adder(Ljava/lang/Integer;Ljava/lang/Integer;)Ljava/lang/Integer;
because for some reason Oracle's bootstrap method knows that both arguments are integers anyway. They even "admit" that:
[..]it assumes that the arguments [..] will be Integer objects. A bootstrap method requires additional code to properly link invokedynamic [..] if the parameters of the bootstrap method (in this example, callerClass, dynMethodName, and dynMethodType) vary.
Well yes, and without that interesting "additional code" there is no point in using invokedynamic here, is there?
So after that and a couple of further Javadoc and blog entries I think that I have a pretty good grasp on how to use invokedynamic as a poor replacement where invokestatic/invokevirtual or getfield would work just as well.
Now I am curious how to actually apply the invokedynamic instruction to a real-world use case so that it actually is an improvement over what we could do with "traditional" invocations (except lazy constants, I got those...).
Actually, lazy operations are the main advantage of invokedynamic if you take the term “lazy creation” broadly. E.g., the lambda creation feature of Java 8 is a kind of lazy creation that includes the possibility that the actual class containing the code that will be finally invoked by the invokedynamic instruction doesn’t even exist prior to the execution of that instruction.
This can be projected to all kinds of scripting languages delivering code in a form other than Java bytecode (it might even be source code). Here, the code may be compiled right before the first invocation of a method and remain linked afterwards. But it may even become unlinked if the scripting language supports redefinition of methods. This uses the second important feature of invokedynamic: it allows mutable CallSites, which may be changed afterwards while supporting maximal performance when invoked frequently without redefinition.
This possibility to change an invokedynamic target afterwards allows another option, linking to an interpreted execution on the first invocation, counting the number of executions and compiling the code only after exceeding a threshold (and relinking to the compiled code then).
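A hedged sketch of that counting-and-relinking idea, using the java.lang.invoke API directly (the class, method names and threshold are invented for illustration; a real language runtime would wire the call site to an actual invokedynamic instruction via its bootstrap method):

import java.lang.invoke.*;

public class RelinkingDemo {
    private static final MutableCallSite SITE;
    private static final MethodHandle INVOKER;
    private static int calls;

    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class, int.class);
            // Initially linked to the slow, "interpreted" path.
            SITE = new MutableCallSite(l.findStatic(RelinkingDemo.class, "slowPath", type));
            INVOKER = SITE.dynamicInvoker();   // what an invokedynamic call would go through
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int slowPath(int x) {
        if (++calls > 3) {   // threshold exceeded: relink the call site to the fast path
            try {
                SITE.setTarget(MethodHandles.lookup().findStatic(
                        RelinkingDemo.class, "fastPath",
                        MethodType.methodType(int.class, int.class)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return x + 1;        // same semantics as the fast path
    }

    static int fastPath(int x) {
        return x + 1;        // stands in for the "compiled" code
    }

    public static void main(String[] args) throws Throwable {
        for (int i = 0; i < 6; i++) {
            // first calls hit slowPath, later ones the relinked fastPath
            System.out.println((int) INVOKER.invokeExact(i));
        }
    }
}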
Regarding dynamic method dispatch based on a runtime instance, it's clear that invokedynamic can't elide the dispatch algorithm. But if you detect at runtime that a particular call site will always call the method of the same concrete type, you may relink the CallSite to optimized code which does a short check that the target is the expected type and performs the optimized action, branching to the generic code that performs the full dynamic dispatch only if that test fails. The implementation may even de-optimize such a call site if it detects that the fast-path check failed a certain number of times.
This is close to how invokevirtual and invokeinterface are optimized internally in the JVM as for these it’s also the case that most of these instructions are called on the same concrete type. So with invokedynamic you can use the same technique for arbitrary lookup algorithms.
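Here is a similarly hedged sketch of the guarded fast path just described, built with MethodHandles.guardWithTest (again, the classes and methods are invented for illustration):

import java.lang.invoke.*;

public class InlineCacheSketch {
    static boolean isString(Object o) { return o instanceof String; }
    static int fastLength(Object o) { return ((String) o).length(); }       // optimized path for the expected type
    static int genericLength(Object o) { return o.toString().length(); }    // full dynamic path

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType mt = MethodType.methodType(int.class, Object.class);
        // guardWithTest: run the cheap type check first, fall back to the
        // generic handle only when it fails.
        MethodHandle guarded = MethodHandles.guardWithTest(
                l.findStatic(InlineCacheSketch.class, "isString",
                        MethodType.methodType(boolean.class, Object.class)),
                l.findStatic(InlineCacheSketch.class, "fastLength", mt),
                l.findStatic(InlineCacheSketch.class, "genericLength", mt));

        System.out.println((int) guarded.invokeExact((Object) "hello"));             // fast path: 5
        System.out.println((int) guarded.invokeExact((Object) Integer.valueOf(42))); // fallback: 2
    }
}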
But if you want an entirely different use case, you can use invokedynamic to implement friend semantics, which are not supported by the standard access modifier rules. Suppose you have classes A and B which are meant to have such a friend relationship, in that A is allowed to invoke private methods of B. Then all these invocations may be encoded as invokedynamic instructions with the desired name and signature, pointing to a public bootstrap method in B which may look like this:
public static CallSite bootStrap(Lookup l, String name, MethodType type)
        throws NoSuchMethodException, IllegalAccessException {
    if (l.lookupClass() != A.class || (l.lookupModes() & 0xf) != 0xf)
        throw new SecurityException("unprivileged caller");
    l = MethodHandles.lookup();
    return new ConstantCallSite(l.findStatic(B.class, name, type));
}
It first verifies that the provided Lookup object has full access to A, as only A is capable of constructing such an object. So sneaky attempts by wrong callers are rejected at this point. Then it uses a Lookup object having full access to B to complete the linkage. So each of these invokedynamic instructions is permanently linked to the matching private method of B after the first invocation, running at the same speed as ordinary invocations afterwards.
Both instructions use static rather than dynamic dispatch. It seems like the only substantial difference is that invokespecial will always have, as its first argument, an object that is an instance of the class that the dispatched method belongs to. However, invokespecial does not actually put the object there; the compiler is the one responsible for making that happen by emitting the appropriate sequence of stack operations before emitting invokespecial. So replacing invokespecial with invokestatic should not affect the way the runtime stack / heap gets manipulated -- though I expect that it will cause a VerifyError for violating the spec.
I'm curious about the possible reasons behind making two distinct instructions that do essentially the same thing. I took a look at the source of the OpenJDK interpreter, and it seems like invokespecial and invokestatic are handled almost identically. Does having two separate instructions help the JIT compiler better optimize code, or does it help the classfile verifier prove some safety properties more efficiently? Or is this just a quirk in the JVM's design?
Disclaimer: It is hard to tell for sure since I never read an explicit Oracle statement about this, but I pretty much think this is the reason:
When you look at Java byte code, you could ask the same question about other instructions. Why would the verifier stop you when pushing two ints on the stack and treating them as a single long right after? (Try it, it will stop you.) You could argue that by allowing this, you could express the same logic with a smaller instruction set. (To go further with this argument, a byte cannot express too many instructions, the Java byte code set should therefore cut down wherever possible.)
Of course, in theory you would not need a byte code instruction for pushing ints and longs onto the stack, and you are right that you would not need two instructions for INVOKESPECIAL and INVOKESTATIC in order to express method invocations. A method is uniquely identified by its method descriptor (name and raw argument types), and you cannot define both a static and a non-static method with an identical descriptor within the same class. Nevertheless, in order to validate the byte code, the verifier must still check whether the target method is static.
Remark: This contradicts v6ak's answer. However, a non-static method's descriptor is not altered to include a reference to this.getClass(). The Java runtime could therefore always infer the appropriate method binding from the method descriptor for a hypothetical INVOKESMART instruction. See JVMS §4.3.3.
So much for the theory. However, the intentions expressed by the two invocation types are quite different. And remember that Java byte code is also meant to be produced by tools other than javac. With byte code, these tools produce something that is more similar to machine code than your Java source code, but it is still rather high-level. For example, byte code is verified, and it is automatically optimized when compiled to machine code. Yet byte code is an abstraction that intentionally contains some redundancy in order to make its meaning more explicit. And just like the Java language uses different names for similar things to make the language more readable, the byte code instruction set contains some redundancy as well. As another benefit, verification and byte code interpretation/compilation can be faster, since a method's invocation type does not need to be inferred but is explicitly stated in the byte code. This is desirable because verification, interpretation and compilation are done at run time.
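As a small, hedged illustration of that explicitness (not taken from the original answer), the following class compiles to different invocation kinds; the comments show the instructions javac typically emits:

public class InvocationKinds extends Thread {
    static int utility() { return 7; }

    @Override
    public void run() {
        super.run();      // invokespecial java/lang/Thread.run:()V  (non-virtual super call)
        utility();        // invokestatic  InvocationKinds.utility:()I (no receiver on the stack)
        new Object();     // 'new' followed by invokespecial java/lang/Object."<init>":()V
    }

    public static void main(String[] args) {
        new InvocationKinds().run();
    }
}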
As a final anecdote, I should mention that a class's static initializer <clinit> was not required to be flagged ACC_STATIC before class file version 51 (Java 7). In this context, the static invocation could also have been inferred from the method's name, but this would cause even more run-time overhead.
Here are the definitions:
http://docs.oracle.com/javase/specs/jvms/se5.0/html/Instructions2.doc6.html#invokestatic
http://docs.oracle.com/javase/specs/jvms/se5.0/html/Instructions2.doc6.html#invokespecial
There are significant differences. Say we want to design an invokesmart instruction, which would choose smartly between invokestatic and invokespecial:
(Update: it would actually not be a problem to distinguish between static and virtual calls, since we can't have two methods with the same name, parameter types and return type, even if one is static and the other is virtual. The JVM does not allow that, for a strange reason. Thanks raphw for noticing that.)
First, what would invokesmart foo/Bar.baz(I)I mean? It may mean:
A static method call foo.Bar.baz that consumes an int from the operand stack and pushes another int. // (int) -> (int)
An instance method call foo.Bar.baz that consumes a foo.Bar and an int from the operand stack and pushes an int. // (foo.Bar, int) -> (int)
How would you choose from them? There may exist both methods.
We may try to solve it by requiring foo/Bar.baz(Lfoo/Bar;I)I for the static call. However, we may have both public static int baz(Bar, int) and public int baz(int).
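To make that ambiguity concrete, here is a small, hypothetical example (package prefix omitted): both methods are legal in one class, yet under the proposed encoding a single invokesmart Bar.baz(LBar;I)I could refer to either one.

public class Bar {
    // Static variant: the Bar "receiver" is an ordinary first parameter,
    // descriptor (LBar;I)I.
    public static int baz(Bar b, int x) { return x + 1; }

    // Instance variant: descriptor (I)I, with the receiver passed implicitly.
    public int baz(int x) { return x - 1; }

    public static void main(String[] args) {
        Bar b = new Bar();
        System.out.println(Bar.baz(b, 10));   // javac emits invokestatic
        System.out.println(b.baz(10));        // javac emits invokevirtual
    }
}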
We may say that it does not matter and simply disallow such a situation. (I don't think that would be a good idea, but just imagine.) What would it mean?
If the method is static, there are probably no additional restrictions. On the other hand, if the method is not static, there are some restrictions: "Finally, if the resolved method is protected (§4.6), and it is either a member of the current class or a member of a superclass of the current class, then the class of objectref must be either the current class or a subclass of the current class."
There are some further differences, see the note about ACC_SUPER.
It would mean that all the referenced classes must be loaded before bytecode verification. I hope this is not necessary now, but I am not 100% sure.
So, it would mean very inconsistent behavior.
Possible Duplicate:
Functional programming vs Object Oriented programming
Can someone explain to me why I would need functional programming instead of OOP?
E.g. why would I need to use Haskell instead of C++ (or a similar language)?
What are the advantages of functional programming over OOP?
One of the big things I prefer in functional programming is the lack of "spooky action at a distance". What you see is what you get – and no more. This makes code far easier to reason about.
Let's use a simple example. Let's say I come across the code snippet X = 10 in either Java (OOP) or Erlang (functional). In Erlang I can know these things very quickly:
The variable X is in the immediate context I'm in. Period. It's either a parameter passed in to the function I'm reading or it's being assigned the first (and only—c.f. below) time.
The variable X has a value of 10 from this point onward. It will not change again within the block of code I'm reading. It cannot.
In Java it's more complicated:
The variable X might be defined as a parameter.
It might be defined somewhere else in the method.
It might be defined as part of the class the method is in.
Whatever the case is, since I'm not declaring it here, I'm changing its value. This means I don't know what the value of X will be without constantly scanning backward through the code to find the last place it was assigned or modified explicitly or implicitly (like in a for loop).
When I call another method, if X happens to be a class variable it may change out from underneath me with no way for me to know this without inspecting the code of that method (see the sketch below).
In the context of a threading program it's even worse. X can be changed by something I can't even see in my immediate environment. Another thread may be calling the method in #5 that modifies X.
And Java is a relatively simple OOP language. The number of ways that X can be screwed around with in C++ is even higher and potentially more obscure.
And the thing is? This is just a simple example of how a common operation can be far more complicated in an OOP (or other imperative) language than in a functional one. It also doesn't address the benefits of functional programming that don't involve mutable state, such as higher-order functions.
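A tiny Java sketch (mine, not part of the original answer) of the class-variable case mentioned above: because x is a field, its value at the print statement depends on code far away from the line you are reading.

class Counter {
    private int x = 10;

    void doWork() {
        helper();                  // may silently change x
        System.out.println(x);     // 10? 0? you can't tell without reading helper()
    }

    private void helper() {
        x = 0;                     // mutation hidden from the call site
    }

    public static void main(String[] args) {
        new Counter().doWork();    // prints 0, not 10
    }
}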
There are three things about Haskell that I think are really cool:
1) It's a statically-typed language that is extremely expressive and lets you build highly maintainable and refactorable code quickly. There's been a big debate between statically typed languages like Java and C# and dynamic languages like Python and Ruby. Python and Ruby let you quickly build programs, using only a fraction of the number of lines required in a language like Java or C#. So, if your goal is to get to market quickly, Python and Ruby are good choices. But, because they're dynamic, refactoring and maintaining your code is tough. In Java, if you want to add a parameter to a method, it's easy to use the IDE to find all instances of the method and fix them. And if you miss one, the compiler catches it. With Python and Ruby, refactoring mistakes will only be caught as run-time errors. So, with traditional languages, you get to choose between quick development and lousy maintainability on the one hand and slow development and good maintainability on the other hand. Neither choice is very good.
But with Haskell, you don't have to make this type of choice. Haskell is statically typed, just like Java and C#. So, you get all the refactorability, potential for IDE support, and compile-time checking. But at the same time, types can be inferred by the compiler. So, they don't get in your way like they do with traditional static languages. Plus, the language offers many other features that allow you to accomplish a lot with only a few lines of code. So, you get the speed of development of Python and Ruby along with the safety of static languages.
2) Parallelism. Because functions don't have side effects, it's much easier for the compiler to run things in parallel without much work from you as a developer. Consider the following pseudo-code:
a = f x
b = g y
c = h a b
In a pure functional language, we know that functions f and g have no side effects. So, there's no reason that f has to be run before g. The order could be swapped, or they could be run at the same time. In fact, we really don't have to run f and g at all until their values are needed in function h. This is not true in a traditional language since the calls to f and g could have side effects that could require us to run them in a particular order.
As computers get more and more cores on them, functional programming becomes more important because it allows the programmer to easily take advantage of the available parallelism.
3) The final really cool thing about Haskell is also possibly the most subtle: lazy evaluation. To understand this, consider the problem of writing a program that reads a text file and prints out the number of occurrences of the word "the" on each line of the file. Suppose you're writing in a traditional imperative language.
Attempt 1: You write a function that opens the file and reads it one line at a time. For each line, you calculate the number of "the's", and you print it out. That's great, except your main logic (counting the words) is tightly coupled with your input and output. Suppose you want to use that same logic in some other context? Suppose you want to read text data off a socket and count the words? Or you want to read the text from a UI? You'll have to rewrite your logic all over again!
Worst of all, what if you want to write an automated test for your new code? You'll have to build input files, run your code, capture the output, and then compare the output against your expected results. That's do-able, but it's painful. Generally, when you tightly couple IO with logic, it becomes really difficult to test the logic.
Attempt 2: So, let's decouple IO and logic. First, read the entire file into a big string in memory. Then, pass the string to a function that breaks the string into lines, counts the "the's" on each line, and returns a list of counts. Finally, the program can loop through the counts and output them. It's now easy to test the core logic since it involves no IO. It's now easy to use the core logic with data from a file or from a socket or from a UI. So, this is a great solution, right?
Wrong. What if someone passes in a 100GB file? You'll blow out your memory since the entire file must be loaded into a string.
Attempt 3: Build an abstraction around reading the file and producing results. You can think of these abstractions as two interfaces. The first has methods nextLine() and done(). The second has outputCount(). Your main program implements nextLine() and done() to read from the file, while outputCount() just directly prints out the count. This allows your main program to run in constant memory. Your test program can use an alternate implementation of this abstraction that has nextLine() and done() pull test data from memory, while outputCount() checks the results rather than outputting them.
This third attempt works well at separating the logic and the IO, and it allows your program to run in constant memory. But, it's significantly more complicated than the first two attempts.
In short, traditional imperative languages (whether static or dynamic) frequently leave developers making a choice between
a) Tight coupling of IO and logic (hard to test and reuse)
b) Load everything into memory (not very efficient)
c) Building abstractions (complicated, and it slows down implementation)
These choices come up when reading files, querying databases, reading sockets, etc. More often than not, programmers seem to favor option A, and unit tests suffer as a consequence.
So, how does Haskell help with this? In Haskell, you would solve this problem exactly like in Attempt 2. The main program loads the whole file into a string. Then it calls a function that examines the string and returns a list of counts. Then the main program prints the counts. It's super easy to test and reuse the core logic since it's isolated from the IO.
But what about memory usage? Haskell's lazy evaluation takes care of that for you. So, even though your code looks like it loaded the whole file contents into a string variable, the whole contents really aren't loaded. Instead, the file is only read as the string is consumed. This allows it to be read one buffer at a time, and your program will in fact run in constant memory. That is, you can run this program on a 100GB file, and it will consume very little memory.
Similarly, you can query a database, build a resulting list containing a huge set of rows, and pass it to a function to process. The processing function has no idea that the rows came from a database. So, it's decoupled from its IO. And under-the-covers, the list of rows will be fetched lazily and efficiently. So, even though it looks like it when you look at your code, the full list of rows is never all in memory at the same time.
End result, you can test your function that processes the database rows without even having to connect to a database at all.
Lazy evaluation is really subtle, and it takes a while to get your head around its power. But, it allows you to write nice simple code that is easy to test and reuse.
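As an aside that is not part of the original answer: modern Java can approximate some of this with lazy streams. Files.lines() reads the file lazily, so the counting logic below stays decoupled from IO while the whole file is never in memory at once (class and method names are mine, purely illustrative):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class CountTheStreams {
    // Pure-ish logic: works on any Stream<String>, whether it comes from a
    // file, a socket, or an in-memory test fixture.
    static Stream<Long> countThe(Stream<String> lines) {
        return lines.map(line ->
                Stream.of(line.toLowerCase().split("([.,!?:;'\"-]|\\s)+"))
                      .filter("the"::equals)
                      .count());
    }

    public static void main(String[] args) throws IOException {
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            countThe(lines).forEach(System.out::println);
        }
    }
}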
Here's the final Haskell solution and the Attempt 3 Java solution. Both use constant memory and separate IO from processing so that testing and reuse are easy.
Haskell:
module Main
  where

import System.Environment (getArgs)
import Data.Char (toLower)

main = do
  (fileName : _) <- getArgs
  fileContents <- readFile fileName
  mapM_ (putStrLn . show) $ getWordCounts fileContents

getWordCounts = (map countThe) . lines . map toLower
  where countThe = length . filter (== "the") . words
Java:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.Reader;

class CountWords {

    public interface OutputHandler {
        void handle(int count) throws Exception;
    }

    static public void main(String[] args) throws Exception {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(new File(args[0])));
            OutputHandler handler = new OutputHandler() {
                public void handle(int count) throws Exception {
                    System.out.println(count);
                }
            };
            countThe(reader, handler);
        } finally {
            if (reader != null) reader.close();
        }
    }

    static public void countThe(BufferedReader reader, OutputHandler handler) throws Exception {
        String line;
        while ((line = reader.readLine()) != null) {
            int num = 0;
            for (String word: line.toLowerCase().split("([.,!?:;'\"-]|\\s)+")) {
                if (word.equals("the")) {
                    num += 1;
                }
            }
            handler.handle(num);
        }
    }
}
If we compare Haskell and C++, functional programming makes debugging much easier, because there is no mutable state and there are no variables like the ones found in C, Python, etc. that you must always keep track of; it is guaranteed that, given the same arguments, a function will always return the same result no matter how many times you evaluate it.
OOP is orthogonal to any programming paradigm, and there are languages which combine FP with OOP, OCaml being the most popular, plus several Haskell implementations, etc.
From PMD:
IntegerInstantiation: In JDK 1.5, calling new Integer() causes memory allocation. Integer.valueOf() is more memory friendly.
ByteInstantiation: In JDK 1.5, calling new Byte() causes memory allocation. Byte.valueOf() is more memory friendly.
ShortInstantiation: In JDK 1.5, calling new Short() causes memory allocation. Short.valueOf() is more memory friendly.
LongInstantiation: In JDK 1.5, calling new Long() causes memory allocation. Long.valueOf() is more memory friendly.
Does the same apply to JDK 1.6? I am just wondering if the compiler or the JVM optimizes these into their respective valueOf methods.
In theory the compiler could optimize a small subset of cases where (for example) new Integer(n) was used instead of the recommended Integer.valueOf(n).
Firstly, we should note that the optimization can only be applied if the compiler can guarantee that the wrapper object will never be compared against other objects using == or !=. (If this occurred, then the optimization changes the semantics of wrapper objects, such that "==" and "!=" would behave in a way that is contrary to the JLS.)
In the light of this, it is unlikely that such an optimization would be worth implementing:
The optimization would only help poorly written applications that ignore the javadoc recommendations. For a well-written application, testing to see if the optimization could be applied only slows down the optimizer, e.g. the JIT compiler.
Even for a poorly written application, the limitation on where the optimization is allowed means that few of the actual calls to new Integer(n) qualify. In most situations it is too expensive to trace all of the places where the wrapper created by a new expression might be used. If you include reflection in the picture, the tracing is virtually impossible for anything but local variables. Since most uses of the primitive wrappers entail putting them into collections, it is easy to see that the optimization would hardly ever be found (by a practical optimizer) to be allowed.
Even in the cases where the optimization was actually applied, it would only help for values of n within a limited range. For example, calling Integer.valueOf(n) for n outside the cached range (by default -128 to 127) will always create a new object.
This holds for Java 6 as well. Try the following with Java 6 to prove it:
System.out.println(new Integer(3) == new Integer(3));               // false: two distinct objects
System.out.println(Integer.valueOf(3) == Integer.valueOf(3));       // true: both come from the cache
System.out.println(new Long(3) == new Long(3));                     // false
System.out.println(Long.valueOf(3) == Long.valueOf(3));             // true
System.out.println(new Byte((byte)3) == new Byte((byte)3));         // false
System.out.println(Byte.valueOf((byte)3) == Byte.valueOf((byte)3)); // true
However, with values outside the cached range the valueOf() calls also return distinct objects, as expected.
The same applies for Java SE 6. In general it is difficult to optimise away the creation of new objects. It is guaranteed that, say, new Integer(42) != new Integer(42). There is potential in some circumstances to remove the need to get an object altogether, but I believe all that is disabled in HotSpot production builds at the time of writing.
Most of the time it doesn't matter. Unless you have identified you are running a critical piece of code which is called lots of times (e.g. 10K or more) it is likely it won't make much difference.
If in doubt, assume the compiler does no optimisations; it does, in fact, very little. The JVM, however, can do lots of optimisations, but removing the need to create an object is not one of them. The general assumption is that object allocation is fast enough most of the time.
Note: code which is run only a few times (fewer than 10K times by default) will not even be fully compiled to native code, and this is likely to slow down your code more than the object allocation.
On a recent JVM with escape analysis and scalar replacement, new Integer() might actually be faster than Integer.valueOf() when the scope of the variable is restricted to one method or block. See Autoboxing versus manual boxing in Java for some more details.
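A hedged sketch of that "restricted scope" case (illustrative only, not a benchmark): the boxed value below never escapes sum(), so an escape-analysis-capable JIT may scalar-replace it and the heap allocation from new Integer(i) can disappear entirely; whether it actually does depends on the JVM and its flags.

public class EscapeDemo {
    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Integer boxed = new Integer(i);   // candidate for scalar replacement:
            total += boxed;                   // it is unboxed immediately and never escapes
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000));
    }
}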