Few Lines of Code: Excel export can be fast

The open source JumboExcel library presented is in this writing is juggling with hundreds of thousands of Excel cells in subsecond time, is memory friendly due to the streaming approach and is able to work in a multi-threaded server environment. You can check out the project's GitHub page and also use it in a form of binary NuGet package. It has not presented a single performance issue in two years of using in a heavily-loaded production system.

Many developers in the enterprise field face the performance and maintainability issues of integrating Excel export functionality as part of application reporting features in a server environment.

On the .NET platform, naive attempts to integrate with Microsoft Excel in a server environment often lead to infamous Out of Memory Exception for a not so obvious reason.

So, does it need to be that bad?

Not at all. Look at the performance characteristics of my JumboExcel library.

This picture displays the average number of columns in the column headers, the average number of rows in the row headers, and the corresponding average time (milliseconds), taken to export an excel file of that size.

JumboExcel performance

I can add that the initial report in the picture was rendered with the library itself, and the memory usage has not raised above few thousands of megabytes during the entire test run while generating all those files.

JumboExcel is based on the OpenXmlSdk package, so it's basically a wrapper.

Why would one need to wrap it with another library?

In my experience, the OpenXmlSdk is basically a means of creating or reading a well-formed XML in the Open XML format and package it using System.IO.Packaging.

OpenXmlSdk is capable of writing and reading just everything. You can use the Document Object Model to form the structure of the document, but you need to know the format, and this process is extremely error prone. The model is largely loosely-typed, OpenXmlSdk does not address the semantics of the document so you can easily misuse the classes. Also, there are some required nodes that must be present in the document structure. You can end up with a document written successfully, but that would be an invalid Microsoft Excel document.

In contrast, using the JumboExcel library, you are coding against a simplistic and strongly typed API with throughout usage of immutable classes which makes it really difficult to make a mistake.

Basically, if you code and it compiles - the file will open with Microsoft Excel (unless you make something extremely non-trivial with IEnumerable implementations you pass to the API).

The implementation of v1.0.1 is stable and is tested in a real production system.

Please use the GitHub project page to request features, report bugs and ask questions.

The best way to get started with the library is to install the NuGet package and look at the example code in the DemoTests project.

Here is the an example file demonstrating the v1.0.1 formatting features generated with this test, from the latest stable release available at the time of the writing. For more features you can check out the development branch at GitHub or download an alpha package from GuGet.

The library is meant to be as simple as it gets, and v1.0.1 does not support some Excel features.
It's aimed at providing an extremely performant solution to the problem of exporting many rows and columns with basic formatting features. It does not support formulas, for example.

If you need just that and need it to work on a server under heavy load - you found it.

You can use it with Few Lines of Code!

Few Lines of Code

Labels

Wednesday, 13 April 2016

Excel export can be fast

1 comment:

Blog Archive