execution vs. call join point - AOP

I have two different aspect classes to count the number of non-static method calls for an execution of a test program. The first aspect counts methods on "call" join points:
pointcut methodCalls() : call (!static * test..*(..));
before(): methodCalls() {
counter.methodCallCounter();
}
while the second aspect counts methods on "execution" join points:
pointcut methodCalls() : execution (!static * test..*(..));
before(): methodCalls() {
counter.methodCallCounter();
}
methodCallCounter() is a static method in the counter class.
For a small test program, both aspects report the same number of method calls. But when I replace the test program with a larger one, the aspect with the execution pointcut counts more method calls than the aspect with the call pointcut. This is reasonable, since the call join point does not pick out calls made with super and therefore does not count them.
However, I encountered a case where, for one specific execution of the program, the number of non-static method calls counted by the aspect with the call pointcut was higher than the number counted by the aspect with the execution pointcut. I cannot find any explanation for why this happens. Any thoughts on the reason for this second situation are appreciated.

Actually, the explanation is quite simple if you understand the basic difference between call() and execution() pointcuts: the former intercepts all callers (i.e. the sources of method calls), while the latter intercepts the called methods themselves, no matter where the calls originate from.
So how can the number of interceptions triggered by both pointcuts differ?
If you call JRE/JDK methods from your own code, AspectJ can weave into your calls, but not into the execution joinpoints within the JDK (unless you have created a woven JDK as a preparatory step). Thus, the number of calls will be higher than the number of executions.
Similarly, if you call methods in third party libraries which you have not woven with AspectJ because they were not on the in-path during LTW or CTW, again the executions will not be captured.
Last, but not least, it can happen the other way around if your own woven code is called by third party libs or by JRE/JDK classes. In this case the counted number of executions will be higher than the number of calls because they originate from places outside the control of your AspectJ code.
Generally, in all cases the reason is the difference between overall used code and the subset of woven code. In other words: the difference between code under and beyond your (or the aspects') control.
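To make the asymmetry concrete, here is a minimal plain-Java sketch that does not use AspectJ at all. The counters are incremented by hand at the places where a call() or execution() advice would plausibly fire in woven code; all class and method names (JoinPointCounters, LengthComparator, CallVsExecutionDemo) are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hand-written stand-in for the two aspects (assumption: no real weaving happens here).
class JoinPointCounters {
    static int calls;       // call sites located inside "woven" application code
    static int executions;  // method bodies located inside "woven" application code

    static void reset() { calls = 0; executions = 0; }
}

// Application class: its method body is under our control, so executions are visible.
class LengthComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        JoinPointCounters.executions++;  // where an execution() advice would run
        return Integer.compare(a.length(), b.length());
    }
}

class CallVsExecutionDemo {
    // Our code calls our own method: one call and one execution.
    static void directCall() {
        LengthComparator c = new LengthComparator();
        JoinPointCounters.calls++;       // where a call() advice would run
        c.compare("a", "bb");
    }

    // JDK code calls back into our class: the call sites of compare() live
    // inside java.util and are not woven, so only executions are counted.
    static void calledByJdk() {
        List<String> words = new ArrayList<>(Arrays.asList("ccc", "a", "bb"));
        Collections.sort(words, new LengthComparator());
    }

    // Our code calls a JDK method: a broad pointcut such as call(* *(..)) sees
    // the call site, but the method body inside the JDK is not woven.
    static int callIntoJdk() {
        JoinPointCounters.calls++;
        return "abc".length();
    }
}
```

Calling directCall() moves both counters in lockstep, calledByJdk() only increases executions, and callIntoJdk() only increases calls. Which of the two totals ends up higher in a real program depends on how often each situation occurs.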


Is there a way to defer parameter resolution?

(I'm reasonably sure the answer is "no", but I want to make sure.)
In JUnit 5 you can write an extension that is an implementation of ParameterResolver. Before your test runs, if the method in question has parameters, then an extension that implements ParameterResolver can return the object suitable as an argument for that parameter.
You can also write an extension that is an implementation of InvocationInterceptor, that is in charge of intercepting a test method's execution. You can get any arguments (such as those resolved by ParameterResolvers), but it appears you cannot change them.
In terms of execution order, if there are relevant parameters, then a ParameterResolver will "fire" first, and then any InvocationInterceptors will "fire" next.
(Lastly, if your test method declares parameters, but there are no ParameterResolvers to resolve them, everything craps out.)
Putting this all together:
Consider the case when a parameter can't really be properly resolved until the stuff that an interceptor sets up prior to execution is complete:
What is the best way, if there is one, to have all of the following:
A parameter that conceivably only the interceptor could resolve
Deferred resolution of that parameter (i.e. the actual parameter value is not sought by the JUnit internals until interception time, so that the interceptor could resolve it just-in-time before calling proceed())
…?
(In my very concrete case, I got lucky: the parameter I'm interested in is an interface, so I "resolve" it to a dummy implementation, and then, at interception time, "fill" the dummy implementation with a delegate that does the real work. I can't think of a better way with the existing JUnit 5 toolkit.)
(I can almost get there if ReflectiveInvocationContext would allow me to set its arguments: my resolveParameter implementation could return null and my interceptor could replace the null reference it found in the arguments with an appropriate non-null argument just-in-time.)
(I also am at least aware of the ExecutableInvoker interface that is reachable from the ExtensionContext, but I'm unclear how that would help me in this scenario, since parameter resolution happens before interception.)
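For reference, the "dummy implementation filled with a delegate" workaround described above can be sketched roughly as follows. This is plain Java with no JUnit dependency: the Greeter interface, DeferredGreeter and DeferredResolutionDemo are hypothetical names, and the two static methods only stand in for what a ParameterResolver and an InvocationInterceptor would do.

```java
import java.util.function.Function;

// Hypothetical parameter type; the workaround relies on it being an interface.
interface Greeter {
    String greet(String name);
}

// Placeholder that is handed out at resolution time and filled in later.
final class DeferredGreeter implements Greeter {
    private volatile Greeter delegate;

    void setDelegate(Greeter delegate) {
        this.delegate = delegate;
    }

    @Override
    public String greet(String name) {
        Greeter d = delegate;
        if (d == null) {
            throw new IllegalStateException("delegate has not been set yet");
        }
        return d.greet(name);
    }
}

class DeferredResolutionDemo {
    // Stand-in for resolveParameter(): nothing real is available yet,
    // so an empty placeholder is returned.
    static DeferredGreeter resolveParameter() {
        return new DeferredGreeter();
    }

    // Stand-in for the interceptor: do the setup, fill the placeholder
    // just in time, then run the test body (the equivalent of proceed()).
    static String intercept(DeferredGreeter argument, Function<Greeter, String> testBody) {
        argument.setDelegate(name -> "Hello, " + name);
        return testBody.apply(argument);
    }
}
```

The test body only ever sees the Greeter interface, so it cannot tell that the real implementation arrived after parameter resolution. The obvious limitation is the one already mentioned: this only works when the parameter type can be proxied or implemented by a placeholder.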

AspectJ, separating native library calls from application calls

I am using AspectJ and Load-time weaving to trace methods calls in an arbitrary java program. I can trace all calls using the standard:
call(* *.*(..))
But what I am now trying to do is separate calls to the native Java libraries from calls to application code:
pointcut nativeCalls(): !within(MethodTracer) && call(* java..*.*(..));
pointcut appCalls(): !within(MethodTracer) && call(* *.*(..)) && !call(* java..*.*(..));
The issue is that the nativeCalls() pointcut is picking out calls to application classes that inherit from native java classes, even though the signatures do not start with java.lang. or java.util, etc.
For example:
If I have a class tetris.GameComponent that inherits from java.awt.Component, my nativeCalls() pointcut will pick out tetris.GameComponent.getBackground() when the method is actually implemented in java.awt.Component.getBackground().
Is there a way to have my nativeCalls() pointcut ignore the calls to inherited methods?
I hope this is clear. I can provide additional info if necessary. Thanks for any help that can be provided.
Actually I have no idea why you want to exclude those inherited method calls from your trace because IMO it is important or at least interesting to know if a method was called on one of your classes, even if that method was defined in a JDK super class.
But anyway, the answer is no, you cannot exclude calls to JDK methods from your nativeCalls() pointcut if those calls are actually made upon target objects typed to one of your application classes. At the time the call is made, AspectJ does not know how the JVM will resolve the polymorphism. There can be several cases:
Call to Foo.aaa(), existing method Foo.aaa() is executed. This is the simple case where a called method actually exists.
Call to Foo.bbb(), inherited method Base.bbb() is executed (polymorphism). This is the case you want to exclude, but you cannot because the fact that a base method is called will only be known when the method is executed. Furthermore, if Base is a JDK class, you cannot even intercept its method executions with AspectJ.
Call to Base.ccc(), non-overridden method Base.ccc() is executed. This can happen if you directly create an instance of Base or also if you assign/cast a Foo instance to a variable typed Base, e.g. Base obj = new Foo(), and call obj.ccc() which has not been overridden by Foo.
Call to Base.ddd(), overridden method Foo.ddd() is executed (polymorphism). This also happens if you assign/cast a Foo instance to a variable typed Base, e.g. Base obj = new Foo(), and call obj.ddd() which has been overridden by Foo.
So much for not being able to easily exclude the polymorphism stuff when calling inherited JDK methods.
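The four cases can be reproduced in plain Java. A minimal sketch, reusing the Foo and Base names from the list with hypothetical method bodies that simply report which implementation ran (DispatchCases is an invented helper class):

```java
class Base {
    String bbb() { return "Base.bbb"; }
    String ccc() { return "Base.ccc"; }
    String ddd() { return "Base.ddd"; }
}

class Foo extends Base {
    String aaa() { return "Foo.aaa"; }

    @Override
    String ddd() { return "Foo.ddd"; }
}

class DispatchCases {
    // Case 1: static type Foo, method declared in Foo.
    static String case1() {
        Foo foo = new Foo();
        return foo.aaa();
    }

    // Case 2: static type Foo, but the inherited Base.bbb() is what runs.
    static String case2() {
        Foo foo = new Foo();
        return foo.bbb();
    }

    // Case 3: static type Base, Foo does not override ccc(), so Base.ccc() runs.
    static String case3() {
        Base obj = new Foo();
        return obj.ccc();
    }

    // Case 4: static type Base, but the override Foo.ddd() runs.
    static String case4() {
        Base obj = new Foo();
        return obj.ddd();
    }
}
```

In each method the call site only knows the declared type of the variable, while the returned string shows which body was actually executed. That gap is what a call() pointcut cannot see.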
Now the other way around: You can easily intercept execution() instead of call() upon your application classes and take advantage of the fact that JDK method executions cannot be intercepted anyway: pointcut appMethod() : execution(* *(..));

Will the Hotspot VM inline functions as necessary?

I am converting some C++ code to Java and I was wondering what I can do about the inlined functions. Can I assume that functions will be inlined by the VM (as and when necessary) and just not worry about this? How do I profile to observe this behaviour? Suppose there is a main outer function, and I throw a for loop around it and cause a million invocations. Should I expect to see an improvement as the VM inlines more and more?
Yes, Java does inline method calls. The inlining is performed by the JIT compiler, so you won't see it by examining the bytecode files.
Whether inlining actually occurs for a given method call depends on the size of the method body and on whether the call is inlineable. (If a method call still involves dynamic dispatch after the JVM has applied its global optimizations designed to remove unnecessary dispatching, then it cannot be inlined.)
The same applies to your example with your outer main function. It depends on how big the method body is. On the other hand, if the method takes a significant time to execute, then the relative importance of the optimization decreases correspondingly.
My advice is to not worry about things like this at this stage. Just write the code clearly and simply, and let the JIT compiler deal with the problem of optimizing. When your application is working, you can profile it and see if there are any "hot spots" in the code that are worthwhile optimizing by hand.
But I should be able to see this in something like VisualVM, right? I mean initially no inlining, then more and more stuff is inlined, so the average time for the outer method is slightly reduced.
It may be observable and it may not, depending on the amount of time spent making the calls relative to executing the method bodies. (Profiling often relies on sampling the program counter. The reported times may be inaccurate if the number of samples for a given region of code is too small ... and for other reasons.)
It also depends on the JVM you are using. Not all JVMs will re-optimize code that they have previously optimized.
Finally, there is a way to get the JVM to dump the native code output by the JIT compiler. That will give you a definitive answer as to what has been inlined ... if you are prepared to read the machine instructions.
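If reading machine code is more than you need, HotSpot can also log its inlining decisions. Below is a small sketch of a program to try this on; the class name InlineProbe and the loop sizes are arbitrary choices for illustration. Running it as `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineProbe` should print inlining decisions alongside the timings, though the exact log format and the decisions themselves vary between JVM versions.

```java
class InlineProbe {
    private final int value;

    InlineProbe(int value) {
        this.value = value;
    }

    // Tiny method body: a typical candidate for inlining by the JIT compiler.
    int getValue() {
        return value;
    }

    // Hot loop that calls the small method many times.
    static long sum(InlineProbe probe, int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += probe.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        InlineProbe probe = new InlineProbe(3);
        // Repeat the measurement so that later rounds run JIT-compiled code.
        for (int round = 0; round < 10; round++) {
            long start = System.nanoTime();
            long total = sum(probe, 1_000_000);
            long elapsed = System.nanoTime() - start;
            System.out.println("round " + round + ": total=" + total + ", ns=" + elapsed);
        }
    }
}
```

The per-round timings are only a rough indication, for the reasons given above. The inlining log is the more reliable signal of what the compiler actually did.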

Custom performance profiler for Objective C

I want to create a simple to use and lightweight performance profile framework for Objective C. My goal is to measure the bottlenecks of my application.
Just to mention that I am not a beginner and I am aware of Instruments/Time Profiler. This is not what I am looking for. Time Profiler is a great tool but is too developer oriented. I want a framework that can collect performance data from a QA or pre production users and even incorporate in a real production environment to gather the real data.
The main part of this framework is the ability to measure how much time is spent in each Objective C message (I am going to profile only Objective C messages).
The easiest way is to start a timer at the beginning of a message and stop it at the end. It is the simplest way, but its disadvantage is that it is too tedious and error-prone: if a message has more than one return path, the "stop timer" code has to be added before each return.
I am thinking of using method swizzling (just to note that I am aware that Apple are not happy with method swizzling but these profiled builds will be used internally only - will not be uploaded on the App Store).
My idea is to mark each message I want to profile and to generate automatically code for the method swizzling method (maybe using macros). When started, the application will swizzle the original selector with the generated one. The generated one will just start a timer, will call the original method and then will stop the timer. So in general the swizzled method will be just a wrapper of the original one.
One of the problems with the above idea is that I cannot think of an easy way to automatically generate the methods to use for swizzling.
So I would greatly appreciate it if anyone has any ideas on how to automate the whole process. The perfect scenario is to write just one line of code anywhere, mentioning the class and the selector I want to profile, and have the rest generated automatically.
I would also be very thankful for any other idea (besides method swizzling) of how to measure the performance.
I came up with a solution that works pretty well for me. First, to clarify: I was unable to find an easy (and fast) way to automatically generate the appropriate swizzled methods for arbitrary selectors (i.e. with arbitrary arguments and return value) using only the selector name. So I had to supply the argument types and the return type for each selector, not only the selector name. In reality it should be relatively easy to create a small tool that parses all source files and automatically detects the argument types and return type of each selector we want to profile (and prepares the swizzled methods), but right now I don't need such an automated solution.
So right now my solution includes the above ideas for method swizzling, some C++ code and macros to automate and minimize some coding.
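First, here is the simple C++ class that measures time:
#include <cstdint>

struct PerfProfiledDataCounter; // defined elsewhere: execution count and total elapsed time

// Scope-bound timer: created at the start of a method, measures until it is destroyed.
class PerfTimer
{
public:
    explicit PerfTimer(PerfProfiledDataCounter* perfProfiledDataCounter);
    ~PerfTimer();
private:
    uint64_t _startTime;
    PerfProfiledDataCounter* _perfProfiledDataCounter;
};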
I am using C++ so that the destructor is executed when the object leaves the current scope. The idea is to create a PerfTimer at the beginning of each swizzled method, and it takes care of measuring the elapsed time for that method.
The PerfProfiledDataCounter is a simple struct that counts the number of executions and the total elapsed time (so the average time spent can be computed).
Also, for each class I'd like to profile, I create a category named "__Performance_Profiler_Category" that conforms to the "__Performance_Profiler_Marker" protocol. To make this easier, I use some macros that automatically create such categories. I also have a set of macros that take the selector name, return type and argument types and create the profiled selector for each selector name.
I also have a single file with the .mm extension that registers all classes and all selectors I'd like to profile. On app start, I use the runtime to retrieve all classes that conform to the "__Performance_Profiler_Marker" protocol (i.e. the registered ones) and search for selectors that are marked for profiling (these selectors start with a predefined prefix). Note that this .mm file is the only file that needs the .mm extension; there is no need to change the file extension of each class I want to profile.
Afterwards the code swizzles the original selectors with the profiled ones. In each profiled one, I just create a PerfTimer and call the swizzled method.
In brief that is my idea which turned out to work pretty smoothly.

Aspects scanning too many classes and method cache fills memory

In our application we have several (actually many, about 30) web services. Each web service resides in its own WAR file and has its own Spring context that is initialised when the application starts.
We also have a number of annotation-driven aspect classes that we apply to web service classes. In the beginning the pointcut expression looked like this:
@Pointcut("execution(public * my.package.service.business.*BusinessServiceImpl.*(..))")
public void methodsToBeLogged() {
}
And AOP was enabled on services through entry in configuration.
But when the number of web services grew, we began to experience OutOfMemoryErrors on our servers. After some profiling and analysis it appeared that the memory was taken by the cache kept by instances of the AspectJExpressionPointcut class.
Each instance's cache was about 5 MB. As we had 3 aspects and 30 services, this resulted in 90 instances holding 450 MB of data in total.
After examining the contents of the cache we realised that it contained Java reflection Method instances for all classes existing in the WAR, even those which are not part of the my.package.service.business package. We then modified the pointcut expression to additionally include a within clause:
@Pointcut("execution(public * my.package.service.business.*BusinessServiceImpl.*(..)) && "
        + "within(my.package.service.business..*)")
public void methodsToBeLogged() {
}
After that, memory usage was back to normal, and all AspectJExpressionPointcut instances took less than 1 MB altogether.
Can someone explain why that is? Why is the first pointcut expression not enough? And why is the cache of AspectJExpressionPointcut not shared?
The AspectJExpressionPointcut uses a cache (shadowMatchCache) which speeds up the decision of whether AOP should be applied to a certain method call or not, based on the pointcut expression. This cache possibly consumes a lot of memory.
Additionally, before offering all methods of a specific bean to the pointcut to see whether there is a match, Spring first checks whether the bean class could possibly match at all, by calling AspectJExpressionPointcut.matches(Class targetClass).
This method delegates to AspectJ's PointcutExpressionImpl.couldPossiblyMatch() method, which performs a fast check of whether a class could 'possibly' match a pointcut expression or will 'definitely' never match.
According to the AspectJ developers, using a within pointcut results in more definite no's. They also recommend never using one of the kinded pointcuts (execution, call, get, set) on its own, but always combining it with within.
The shadowMatchCache cannot be shared, however, because it contains the result (match or no match) per pointcut expression.
But at least you can limit what gets cached. I also think Spring could possibly improve on this by not keeping this whole cache around once the applicationContext is started. For example, it could throw away all the no-matches, at the expense of redoing some of the matching when a new bean is dynamically added to the applicationContext after it has started.
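To illustrate why a class-level "definite no" keeps the cache small, here is a deliberately simplified model in plain Java. It is not Spring's or AspectJ's actual implementation: ShadowCacheModel, its method names and the choice of JDK classes are assumptions made purely for the sketch. The model caches one entry per method, match or no match, but only for classes that pass the class-level pre-check.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class ShadowCacheModel {
    // Stand-ins for "all classes in the WAR".
    static final List<Class<?>> CLASSES =
            Arrays.asList(String.class, ArrayList.class, HashMap.class);

    // Returns how many per-method entries end up in the cache.
    static int cachedEntries(Predicate<Class<?>> couldPossiblyMatch,
                             Predicate<Method> methodMatches) {
        Map<Method, Boolean> cache = new HashMap<>();
        for (Class<?> type : CLASSES) {
            if (!couldPossiblyMatch.test(type)) {
                continue; // definite "no": none of its methods are examined or cached
            }
            for (Method method : type.getDeclaredMethods()) {
                // Both matches and non-matches are remembered.
                cache.put(method, methodMatches.test(method));
            }
        }
        return cache.size();
    }

    // Hypothetical method-level match: only ArrayList methods are "advised".
    static boolean isTarget(Method method) {
        return method.getDeclaringClass() == ArrayList.class;
    }

    // Models a pre-check that can never rule a class out.
    static int withoutWithin() {
        return cachedEntries(type -> true, ShadowCacheModel::isTarget);
    }

    // Models a pre-check narrowed by something like within(java.util..*).
    static int withWithin() {
        return cachedEntries(type -> type.getName().startsWith("java.util."),
                             ShadowCacheModel::isTarget);
    }
}
```

In this model withWithin() caches strictly fewer entries than withoutWithin(), because the methods of String are never offered to the method-level matcher. The number of actual matches is the same in both variants; only the number of cached non-matches differs.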
Another possible memory hog inside the AspectJExpressionPointcut class is the pointCutParser. This parser could possibly be shared across all AspectJExpressionPointcuts in the applicationContext. Take a look at JIRA ticket SPR-7678.