In my previous post, I gave a quick overview of some new features of the
aspect weaver.
Just as important as features is runtime performance.
PostSharp 1.5 already did a great job compared to other aspect frameworks.
However, there was still a lot of room for improvement, and PostSharp 2.0
delivers it.
Take, for instance, a rather simple aspect: a performance counter. Say we
want to increment a counter every time a method is executed. The easiest and most
efficient way with PostSharp 1.5 is to create an aspect derived from
OnMethodBoundaryAspect and to override the OnEntry method:
    using System;
    using System.Diagnostics;
    using PostSharp.Laos;

    [Serializable]
    public sealed class MethodInvocationCounterAttribute : OnMethodBoundaryAspect
    {
        // Created at runtime, so it must not be serialized with the aspect.
        [NonSerialized]
        private PerformanceCounter performanceCounter;

        public MethodInvocationCounterAttribute(string category, string name)
        {
            this.Category = category;
            this.Name = name;
        }

        public string Category { get; private set; }
        public string Name { get; private set; }

        // Invoked once at runtime, before the aspect is first used.
        public override void RuntimeInitialize(System.Reflection.MethodBase method)
        {
            this.performanceCounter = new PerformanceCounter(this.Category,
                this.Name, false);
        }

        // Invoked every time the target method is entered.
        public override void OnEntry(MethodExecutionEventArgs eventArgs)
        {
            this.performanceCounter.Increment();
        }
    }
Now let's apply this aspect to an empty method (with inlining disabled) and put
it on the test bench (here on my AMD Phenom II X4 940, 3.00 GHz):
| Implementation | CPU Time (ns) | Overhead (ns) |
|---|---|---|
| Manual implementation | 26 | 0 |
| PostSharp 1.5 | 108 | 82 |
As you can see, in PostSharp 1.5, the overhead of using an aspect,
compared to hand coding, is 108 - 26 = 82 ns (remember, a nanosecond is a
billionth of a second). What's the problem? The aspect overhead is
more than 3 times the cost of the aspect's effect itself (82 ns versus 26 ns)!
If the aspect has to be invoked thousands of
times per second, its cost can surely not be ignored.
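For readers who want to reproduce this kind of measurement, here is a minimal sketch of a timing harness. It is not the harness used for the numbers above: the class and method names (OverheadBenchmark, Baseline, MeasureNanoseconds, OverheadNanoseconds) are hypothetical, a plain field stands in for the real PerformanceCounter, and Stopwatch measures wall-clock time rather than CPU time.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical harness, for illustration only.
public static class OverheadBenchmark
{
    // Stand-in for the real PerformanceCounter (assumption).
    private static long counter;

    // NoInlining keeps the JIT from folding the method into the loop,
    // which would make the measurement meaningless.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Baseline()
    {
        counter++;
    }

    // Returns the average wall-clock time of one call, in nanoseconds.
    public static double MeasureNanoseconds(Action action, int iterations)
    {
        action(); // Warm up so that JIT compilation is not measured.
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        // 1 millisecond = 1,000,000 nanoseconds.
        return stopwatch.Elapsed.TotalMilliseconds * 1e6 / iterations;
    }

    // Overhead is the instrumented time minus the hand-coded time.
    public static double OverheadNanoseconds(double instrumented, double manual)
    {
        return instrumented - manual;
    }
}
```

The delegate call adds its own cost to every measurement, but that cost is the same for the hand-coded and the instrumented method, so it cancels out in the subtraction. With the figures from the table, OverheadNanoseconds(108, 26) gives the 82 ns quoted above.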
Now look at the benchmark with PostSharp 2.0:
| Implementation | CPU Time (ns) | Overhead (ns) |
|---|---|---|
| Manual implementation | 26 | 0 |
| PostSharp 2.0 | 29 | 3 |
This time, it's much better. For this specific aspect, the runtime overhead of
PostSharp 2.0 is more than 25 times smaller than that of PostSharp 1.5 (3 ns
versus 82 ns)! Most importantly, the
overhead of the aspect is now only a small fraction of the cost of the effect
itself. So it now makes perfect sense to use an aspect for lightweight
instrumentation.
How does PostSharp 2.0 achieve this performance gain? In short, by being
smarter about the code it generates. PostSharp 1.5 did not look into the code of
your aspects, so it had to generate code for every feature it offered, even if your
aspect did not use it. As a result, PostSharp 1.5 generated a lot of useless
instructions. This is clear if we look at the output assembly using Reflector:

As you can see, there are instructions to pass the parameters of the method
to the aspect, even if the aspect never uses them. The OnSuccess,
OnException and OnExit handlers are invoked, even if the aspect does not
implement them. That's bad overhead: it does not translate into anything
useful.
PostSharp 2.0 is way smarter: it analyzes the code of your aspect, figures
out which features you actually use, and generates instructions only for those. Watch the difference:

No wonder it's faster. Since the aspect does not even use the eventArgs
object, why should we pass it? PostSharp 2.0 is smarter than you might think: it
also looks at which members of MethodExecutionEventArgs you are using. So
if you read the Instance property and not the Arguments property,
you will get Instance, not Arguments.
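To make the difference concrete, here is a hand-written approximation of the two code generation styles. It is not actual weaver output, and every type in it (SketchEventArgs, SketchAspect, WovenSketch) is a hypothetical stand-in rather than a PostSharp type. AddFull mimics the "generate everything" style described for PostSharp 1.5; AddTrimmed mimics what remains when the aspect only implements OnEntry and never reads the event arguments.

```csharp
using System;

// Hypothetical stand-ins; these are NOT PostSharp types.
public sealed class SketchEventArgs
{
    public object[] Arguments;
    public Exception Exception;
}

public class SketchAspect
{
    public int Entries;

    // The only handler that does real work in this example.
    public virtual void OnEntry(SketchEventArgs e) { Entries++; }
    public virtual void OnSuccess(SketchEventArgs e) { }
    public virtual void OnException(SketchEventArgs e) { }
    public virtual void OnExit(SketchEventArgs e) { }
}

public static class WovenSketch
{
    public static readonly SketchAspect Aspect = new SketchAspect();

    // "Generate everything": arguments are boxed into an array and every
    // handler is called, whether or not the aspect needs it.
    public static int AddFull(int a, int b)
    {
        SketchEventArgs e = new SketchEventArgs();
        e.Arguments = new object[] { a, b }; // boxes both integers
        Aspect.OnEntry(e);
        try
        {
            int result = a + b;
            Aspect.OnSuccess(e);
            return result;
        }
        catch (Exception ex)
        {
            e.Exception = ex;
            Aspect.OnException(e);
            throw;
        }
        finally
        {
            Aspect.OnExit(e);
        }
    }

    // "Generate only what is used": no event arguments object, no
    // try/catch/finally, a single handler call.
    public static int AddTrimmed(int a, int b)
    {
        Aspect.OnEntry(null);
        return a + b;
    }
}
```

Both methods return the same result and increment the counter once per call; the second one simply does far less work around the method body.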
Sure, there is a catch in this benchmark: I have intentionally chosen an
example where the improvement is dramatic. But, from today, you know that you can
achieve amazing performance with OnMethodBoundaryAspect, and that it's up
to you to design an aspect that is really lightweight at runtime. The learning
curve of PostSharp has always been pay-as-you-consume. Now runtime performance
is also pay-as-you-consume.
What about other aspects? Take
OnMethodInvocationAspect. It has been reimplemented from scratch, is now
named MethodInterceptionAspect, and is
"just" 77% faster in PostSharp 2.0 than in PostSharp 1.5.
PostSharp 2.0 delivers better runtime performance by playing on five factors:
- Adaptive Code Generation, as demonstrated above (major benefits in
OnMethodBoundaryAspect, minor benefits elsewhere).
- Use of generic tuples instead of untyped arrays to store arguments (no
boxing-unboxing, no casting).
- Use of binding classes instead of delegates.
- Aggressively optimized design (we do consider a virtual call or a
boxing/unboxing cycle as an expensive operation).
- Cross-aspect optimizations, delivering great benefits when many aspects
are applied to the same method (artifacts used by one aspect can be reused by
the next aspects in the chain).
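The second point in the list above can be illustrated with a small sketch. The Args<T0, T1> struct and the ArgumentPassing class are hypothetical and only show the general idea of typed versus untyped argument storage; they are not the types PostSharp actually generates.

```csharp
using System;

// Hypothetical typed container for two arguments; not a PostSharp type.
public struct Args<T0, T1>
{
    public T0 Arg0;
    public T1 Arg1;

    public Args(T0 arg0, T1 arg1)
    {
        Arg0 = arg0;
        Arg1 = arg1;
    }
}

public static class ArgumentPassing
{
    // Untyped storage: each int is boxed when the array is built and must
    // be cast (unboxed) again when it is read.
    public static int SumUntyped(object[] args)
    {
        return (int)args[0] + (int)args[1];
    }

    // Typed storage: the values stay as plain ints, with no boxing and no casts.
    public static int SumTyped(Args<int, int> args)
    {
        return args.Arg0 + args.Arg1;
    }
}
```

Calling SumUntyped(new object[] { 2, 3 }) allocates an array plus two boxed integers on the heap, while SumTyped(new Args<int, int>(2, 3)) needs no heap allocation at all.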
Now a last piece of code. What if a 3 ns overhead is still too much for your
case? Look at the following piece of code: it has zero overhead.
    [Serializable]
    public sealed class MethodInvocationCounter3Attribute : MethodLevelAspect
    {
        [NonSerialized]
        private readonly static PerformanceCounter performanceCounter;

        static MethodInvocationCounter3Attribute()
        {
            if ( !PostSharpEnvironment.IsPostSharpRunning )
            {
                performanceCounter = new PerformanceCounter("Custom", "NumberOfItems64", false);
            }
        }

        [OnMethodEntryHandler, SelfSelector]
        public static void OnEntry(MethodExecutionArgs eventArgs)
        {
            performanceCounter.Increment();
        }
    }
The OnEntry handler is a static method. So, when applied to a target
method, it gives the following:

Since the OnEntry handler is inlined by the JIT compiler, this method is
strictly equivalent to invoking performanceCounter.Increment manually
from the instrumented method. Sure, your possibilities are very limited when you
use a static method (you can't access instance fields of the aspect instance, so
the counter name has to be the same for all methods using this aspect). But the
promise holds: you use nothing, you pay nothing. Here, the aspect is free of any
overhead.
Happy PostSharping!
-gael