Saturday, September 22, 2007

The Excel Interop

I just finished up a project working on a win32 client application that is used to import data from an Excel spreadsheet into a database via a secure web service. Working with the interop was interesting and it was nice to get exposed to something new. I'll post more about web services later (as well as returning custom data types via web service methods, serializable Dictionary objects, and converting rtf to html) but I wanted to rant about the Excel Interop a bit.

Indexing
When I started working with the interop I went in and created an instance of Excel:

Microsoft.Office.Interop.Excel.Application excelObj = new Microsoft.Office.Interop.Excel.Application();


This is all well and good. After that I set visible to false and went about starting to access some of the bits and pieces in my xls file. Oh, and because we're using C# and because Open isn't overloaded:

Microsoft.Office.Interop.Excel.Workbook workbook = excelObj.Workbooks.Open(
filename,
Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value,
Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value,
Missing.Value, Missing.Value, Missing.Value, Missing.Value
);

Microsoft.Office.Interop.Excel.Sheets sheets = workbook.Worksheets;
Microsoft.Office.Interop.Excel.Worksheet worksheet = (Microsoft.Office.Interop.Excel.Worksheet)sheets.get_Item(0);

Now, if you run this code you'll get an exception citing an invalid index:

System.Runtime.InteropServices.COMException (0x8002000B): Invalid index

If you're like me you'll immediately try starting your index off at 1 because I suppose it makes sense to start off your worksheets at 1. As much as this seems like VB and as much as I don't like VB I think I can live with it. So you change the sheet number to one and go on your merry way.

Then you try to access some data in your first worksheet:

string str = (string)((Microsoft.Office.Interop.Excel.Range)worksheet.Cells[0, 0]).Value2;

Running this code gives you an exception and you want to assume it's because you started your indices off at 0 but it doesn't quite make sense because it gave you a different exception:
System.Runtime.InteropServices.COMException (0x800A03EC): Exception from HRESULT

Hopefully at this point you immediately try Cells[1, 1] and don't spend some time trying to figure out what you did to go from a meaningful exception that made sense to a generic COM Exception. If you try 1,1 you'll find that it works and all will be right in the world.

What bothers me about this isn't the fact that I have to start the index off at 1 rather than 0. What bothers me is that when I tried to open a worksheet with an out of bounds index I got an exception telling me that it was an index issue. When I tried cells that were out of bounds I got a generic exception. The worksheet exception was actually helpful and immediately let me know what the problem was whereas the cell exception wasn't even though it was pretty much the same exact issue.

Killing Your Instance:
I'm not even going to into detail on this as it's been beaten to death already. Killing off an instance of Excel is a pain though I don't think that's really the fault of the interop. You have COM to thank for that one. Essentially you have to make sure that for each Excel COM object you create you're calling ReleaseComObject:

Marshal.ReleaseComObject(excelObj);

And then calling the Garbage Collector:

GC.GetTotalMemory(false);
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.GetTotalMemory(true);

I'm not really one to talk in this instance though since my client app still has a bug where when multiple instances of the Excel application are created once I close them out all of the processes go away except for the first one I created. It's really bizarre. See the references section for some links.

Hyperlinks
This isn't so much a beef with the Excel Interop as it is with Excel in general. In Excel you can't have a link within a cell. You can have the entire cell be a link to some location but you can't have the text "click HERE for pictures of my cat" and have only the text "HERE" act as a hyperlink. This ended up saving me some time on the rtf to html conversion but is kind of lame in my opinion.

References:
Here are some of the pages that I found the most helpful when working through this:
- MSDN: Excel Tasks
- Craig Murphy: Excel Interop - killing Excel.exe
- MS Help and Support: Office application does not quit after automation from Visual Studio .NET client
Excel Solutions Forum

3 comments:

  1. That was a good catch on the Missing.Value bit. I believe when I first came across that VB optional parameter issue I tried passing NULLS along, which is probably why I had trouble getting it to work in C#. I'm assuming NULL works in a lot of situations but not in all.

    Working almost exclusively in .NET gives me an imbalanced bias toward Microsoft. It factors in all of the lessons from Java, previous Microsoft languages, and other OO trials into a clean, consistent API. It's not that it doesn't have its bugs, but in working with VBA or some of the older MOSS APIs I'm always a bit surprised at how sloppy they were; they feel like they were designed by multiple people and hacked together to meet immediate needs without a clear long-term objective. Which is probably exactly what happened. They feel like spaghetti code - and that's just looking at the public interfaces.

    In .NET, however, I can often guess the API without Intellisense or the SDK documentation because the naming conventions and overall architectural approach is so consistent. I love that.

    ReplyDelete
  2. Hi,

    I am facing an error when i tried to insert a huge data more than 60000 chars into one excel cell.
    Following is the error i am getting
    "Not enough storage is available to complete this operation. (Exception from HRESULT: 0x8007000E (E_OUTOFMEMORY))"

    How to solve this issue

    ReplyDelete
  3. Missing.Value instead of null. Now I can stop pulling my hair out :) Thanks!

    ReplyDelete