XSSF Performance: degrade when creating huge number of rows

Apr 5, 2013 at 6:48 PM
Edited Apr 8, 2013 at 9:15 AM
Hello,

We use NPOI HSSF to make an Excel of 50K rows in 9 seconds.
But when changed to use the XSSF (IWorkbook interface), the time needed increase until 3 minutes.
Our code for both tests is the same, we only change the constructor and some type definition.

Also, it is remarkable that, the performance gets worst in time.
For 30k, it takes 25 seconds.
For 70k, it takes 4 minutes.
For 300k, it takes 80 minutes!!!!

We need to use XSSF to support more than 65K rows.

Is there any problem with the performance for XSSF?
Why is the performance degrading when more rows are created?

[EDITED]
We have use a NPOI example to see this.
Just change the new XSSFWorkbook() to new HSSFWorkbook(), and you can see that the HSSF takes 1 second, and the XSSF takes 1 minute!!!

using System;
using System.Collections.Generic;
using System.Text;
using NPOI.SS.UserModel;
using NPOI.HSSF.UserModel;
using NPOI.XSSF.UserModel;
using System.IO;
using System.Diagnostics;

namespace NPOI.Examples.XSSF.SetCellValuesInXlsx
{
class Program
{
    static void Main(string[] args)
    {
        IWorkbook workbook = new XSSFWorkbook();
        ISheet sheet1 = workbook.CreateSheet("Sheet1");
        sheet1.CreateRow(0).CreateCell(0).SetCellValue("This is a Sample");
        int x = 1;

        Debug.WriteLine("Start at " + DateTime.Now.ToString());
        for (int i = 1; i <= 30000; i++)
        {
            IRow row = sheet1.CreateRow(i);
            for (int j = 0; j < 15; j++)
            {
                row.CreateCell(j).SetCellValue(x++);
            }
        }
        Debug.WriteLine("End at " + DateTime.Now.ToString());

        FileStream sw = File.Create("test.xls");
        workbook.Write(sw);
        sw.Close();
    }
}
}


Thanks in advance!
Raúl.
Coordinator
Apr 9, 2013 at 2:29 PM
Ths is a known issue in OOXML library. Since the team is busy implementing features, the performance issue may not be fixed in 2.0.1 final release.

The main reason of the issue is because XSSF uses XmlSerializer as the serializer for OOXML document. However, XmlSerializer is not working as fast as expected. It's always slow. To fix this issue, we have to find a alternative way of serializing OOXML document.
Apr 10, 2013 at 9:24 PM
Thanks for your answer, tonyqus.
The software is great, but with this issue we can use it.
I wish we can use it in the future.

Regards!
Oct 11, 2013 at 5:32 PM
Is a new version going to be released anytime soon? I'm experiencing major performance issues when using XSSF to generate a xlsx spreadsheet.

Thanks.
Coordinator
Nov 3, 2013 at 9:04 AM
Please try the latest source code in github. The performance of XSSF is greatly improved after removing XmlSerializer.
Nov 3, 2013 at 11:15 AM
May I know when next version release?
Coordinator
Nov 24, 2013 at 12:49 PM
The next version is NPOI 2.0 beta 2. It's planned to be released next month (Dec, 2013)
Coordinator
Nov 24, 2013 at 1:55 PM
Edited Nov 24, 2013 at 1:55 PM
Btw, I ran your program based on the latest code in github just now. Here is the latest result.

Start at 2013/11/24 21:51:59
End at 2013/11/24 21:52:01

I switched to Stopwatch to get more precious result. On my computer, it's 1830ms. Much shorter than 1 minute :)