My group has recently been adding support for managed applications to our platform. As part of this work, we are experimenting with various approaches - rewriting existing libraries in C#, writing managed wrappers for native libraries in Managed C++, DllImport, etc.
Although the best approach varies depending on the specifics, such as the stability of the underlying C++ library, how closely its programming model matches that of .NET, and even our plans for the future of the component, the performance of each approach has been a constant question.
One of the first libraries to be incorporated was a networking library. This is a fairly complex, ~10,000 line C++ library with lots of threading, memory management, and sockets programming, and a treasure trove of subtle bugs that we have worked out over the course of several services and multiple years. The choice to wrap this library with a thin managed C++ assembly was easy: the interface is fairly simple (only a handful of APIs, each of which does something rather large and complex, like "send message"), while the implementation is complex. With some clever pinning and hooks into the native library, we were able to make this work very smoothly from .NET, while still providing great performance and without introducing lots of defects from a rewrite. The performance difference between managed clients and native clients of this library is under 5%.
I had wondered if a different approach would be needed for a "chatty" API, something where the client makes many function calls, each of which does a relatively small amount of work. I chose our serialization library for this experiment, as it is fairly small (~2,500 lines of code) and very chatty - each method rarely does more than copy a few bytes into a buffer.
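To make "chatty" concrete, here is a minimal C++ sketch of that kind of interface. The Serializer class, its method names, and the length-prefixed layout are all hypothetical and are not taken from the actual library; the only point is that each call appends a handful of bytes, so per-call overhead (such as a managed-to-native transition) can dominate the work done.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical chatty serializer: every public call does little more
// than copy a few bytes into a growing buffer.
class Serializer {
public:
    // Appends the 4 bytes of the value in host byte order (an assumption;
    // a real wire format would fix the byte order).
    void WriteInt32(std::int32_t value) {
        AppendRaw(&value, sizeof(value));
    }

    // Appends a length prefix followed by the string bytes. The input is
    // assumed to be UTF-8 already, as with the native clients.
    void WriteString(const std::string& utf8) {
        WriteInt32(static_cast<std::int32_t>(utf8.size()));
        AppendRaw(utf8.data(), utf8.size());
    }

    const std::vector<std::uint8_t>& Buffer() const { return buffer_; }

private:
    void AppendRaw(const void* data, std::size_t size) {
        const auto* bytes = static_cast<const std::uint8_t*>(data);
        buffer_.insert(buffer_.end(), bytes, bytes + size);
    }

    std::vector<std::uint8_t> buffer_;
};
```

A message with roughly 600 properties would make on the order of 600 such calls, which is why the cost of each individual call matters far more here than it does for a coarse "send message" style API.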
To test this scenario, I wrote both a managed C++ wrapper for the serialization library and a C# port. I kept the interface exactly the same between the wrapper and the port, and the implementations of the original library and the port are as close as possible.
After writing the new library and wrapper, I took some real serialization code from one of our clients and converted it into 3 test drivers: a native C++ implementation that uses the original library, a C# version that uses the managed wrapper, and a C# version that uses the port. The messages generated by the sample code are about 20KB in size and consist of roughly 600 properties each, and the test generates 1000 messages.
Results for the first test (total time to generate all 1000 messages):
Native | 90 ms |
Managed C++ wrapper | 300 ms |
C# rewrite | 350 ms |
Wow... the penalty is over 200% for the managed C++ wrapper, and nearly 300% for the C# rewrite. There was one significant difference between both managed implementations and the original, native C++ library - string handling. The serialization format specifies UTF-8, and the native clients traditionally use UTF-8 internally as well. Therefore the managed implementations had to convert from the UTF-16 System.String into UTF-8 for every string property, whereas the native implementation simply copied bytes.
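To give a feel for the extra work involved, the following is an illustrative UTF-16 to UTF-8 encoder written in C++. It is a sketch only: Utf16ToUtf8 is a hypothetical helper, not the .NET implementation, and the post does not say which conversion API the managed code actually called. The choice to replace unpaired surrogates with U+FFFD is also an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative UTF-16 -> UTF-8 encoder (hypothetical helper).
// Unpaired surrogates are replaced with U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // stray low surrogate
        }

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Compared with the native path, which only copies bytes that are already UTF-8, this is a branch-heavy loop over every character plus an allocation for the output, repeated for every string property in every message.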
To remove this difference, I wrote a second version of the test client that converted all of the string values to UTF-8 byte arrays once, outside of the timed section, then passed the pre-encoded byte arrays to the serialization library:
Native | 90 ms |
Managed C++ wrapper | 130 ms |
C# rewrite | 230 ms |
This puts the managed C++ wrapper within 50% of the native implementation (about 44% slower), while the C# rewrite takes about 77% longer than the wrapper, or roughly 2.5 times the native time. Given that the managed C++ wrapper is only a couple hundred lines of code, and took less than an hour to write, it seems like the best choice.
As a general observation, the real decision here should not be made on the basis of performance alone. Even going from the best implementation to the worst (90 ms versus 350 ms for 1000 messages) adds only about 1/4 of a millisecond per message - a quantity that in nearly every situation will be dwarfed by other factors. Using System.String is probably worth 0.26 ms per message.
In the future I would like to profile and optimize the C# rewrite, to see how close to the native implementation we can get.
-randy