[c#] Using Excel OleDb to get sheet names IN SHEET ORDER

I'm using OleDb to read from an excel workbook with many sheets.

I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;

|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/

Then I need to get the dictionary

1="GERMANY", 
2="UK", 
3="IRELAND"

I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;

GERMANY, IRELAND, UK

which has changed the order of UK and IRELAND.

The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.

Any ideas would be greatly appreciated.

if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.

This question is related to c# excel oledb server-side

The answer is


This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?

object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
                                         opt, opt, opt, opt, opt, opt, opt,
                                         opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;

Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.

Edit

I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:

/// <summary>
/// This method retrieves the excel sheet names from 
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
    OleDbConnection objConn = null;
    System.Data.DataTable dt = null;

    try
    {
        // Connection String. Change the excel file to the file you
        // will search.
        String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" + 
          "Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
        // Create connection object by using the preceding connection string.
        objConn = new OleDbConnection(connString);
        // Open connection with the database.
        objConn.Open();
        // Get the data table containg the schema guid.
        dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

        if(dt == null)
        {
           return null;
        }

        String[] excelSheets = new String[dt.Rows.Count];
        int i = 0;

        // Add the sheet name to the string array.
        foreach(DataRow row in dt.Rows)
        {
           excelSheets[i] = row["TABLE_NAME"].ToString();
           i++;
        }

        // Loop through all of the sheets if you want too...
        for(int j=0; j < excelSheets.Length; j++)
        {
            // Query each excel sheet.
        }

        return excelSheets;
   }
   catch(Exception ex)
   {
       return null;
   }
   finally
   {
      // Clean up.
      if(objConn != null)
      {
          objConn.Close();
          objConn.Dispose();
      }
      if(dt != null)
      {
          dt.Dispose();
      }
   }
}

Extracted from Article on the CodeProject.


This is short, fast, safe, and usable...

public static List<string> ToExcelsSheetList(string excelFilePath)
{
    List<string> sheets = new List<string>();
    using (OleDbConnection connection = 
            new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x")) 
            ? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
            : "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
    {
        connection.Open();
        DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        foreach (DataRow drSheet in dt.Rows)
            if (drSheet["TABLE_NAME"].ToString().Contains("$"))
            {
                string s = drSheet["TABLE_NAME"].ToString();
                sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
            }
        connection.Close();
    }
    return sheets;
}

Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:

public List<string> ListSheetInExcel(string filePath)
{
   OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
   String strExtendedProperties = String.Empty;
   sbConnection.DataSource = filePath;
   if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
   {
      sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
      strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
   }
   else if (Path.GetExtension(filePath).Equals(".xlsx"))  //for 2007 Excel file
   {
      sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
      strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
   }
   sbConnection.Add("Extended Properties",strExtendedProperties);
   List<string> listSheet = new List<string>();
   using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
   {
     conn.Open();
     DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);         
     foreach (DataRow drSheet in dtSheet.Rows)
     {
        if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
        {
             listSheet.Add(drSheet["TABLE_NAME"].ToString());
        } 
     }
  }
 return listSheet;
}

Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).


Try this. Here is the code to get the sheet names in order.

private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
    Excel.Application _excel = null;
    Excel.Workbook _workBook = null;
    Dictionary<int, string> excelSheets = new Dictionary<int, string>();
    try
    {
        object missing = Type.Missing;
        object readOnly = true;
        Excel.XlFileFormat.xlWorkbookNormal
        _excel = new Excel.ApplicationClass();
        _excel.Visible = false;
        _workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
            missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
        if (_workBook != null)
        {
            int index = 0;
            foreach (Excel.Worksheet sheet in _workBook.Sheets)
            {
                // Can get sheet names in order they are in workbook
                excelSheets.Add(++index, sheet.Name);
            }
        }
    }
    catch (Exception e)
    {
        return null;
    }
    finally
    {
        if (_excel != null)
        {

            if (_workBook != null)
                _workBook.Close(false, Type.Missing, Type.Missing);
            _excel.Application.Quit();
        }
        _excel = null;
        _workBook = null;
    }
    return excelSheets;
}

I like the idea of @deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.

Sub Macro1()
'
' Macro1 Macro
'

'
Dim i As Integer
For i = 1 To Sheets.Count
 Dim prefix As String
 prefix = i
 If Len(prefix) < 4 Then
  prefix = "000"
 ElseIf Len(prefix) < 3 Then
  prefix = "00"
 ElseIf Len(prefix) < 2 Then
  prefix = "0"
 End If
 Dim sheetName As String
 sheetName = Sheets(i).Name
 Dim names
 names = Split(sheetName, "-")
 If (UBound(names) > 0) And IsNumeric(names(0)) Then
  'do nothing
 Else
  Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
 End If
Next

End Sub

UPDATE: After reading @SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.

1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names

UPDATE: Another solution - NPOI might be helpful here http://npoi.codeplex.com/

 FileStream file = new FileStream(@"yourexcelfilename", FileMode.Open, FileAccess.Read);

      HSSFWorkbook  hssfworkbook = new HSSFWorkbook(file);
        for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
        {
            Console.WriteLine(hssfworkbook.GetSheetName(i));
        }
        file.Close();

This solution works for xls. I didn't try xlsx.

Thanks,

Esen


As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.

Code to get sheets name as per their visual appearance using interop:

Add reference to Microsoft Excel 12.0 Object Library.

Following code will give the sheets name in the actual order stored in workbook, not the sorted name.

Sample Code:

using Microsoft.Office.Interop.Excel;

string filename = "C:\\romil.xlsx";

object missing = System.Reflection.Missing.Value;

Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();

Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename,  missing,  missing,  missing,  missing,missing,  missing,  missing,  missing,  missing,  missing,  missing,  missing,  missing,  missing);

ArrayList sheetname = new ArrayList();

foreach (Microsoft.Office.Interop.Excel.Worksheet  sheet in wb.Sheets)
{
    sheetname.Add(sheet.Name);
}

I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.

The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.

So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -

`

<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" /> 
  <workbookPr defaultThemeVersion="124226" /> 
- <bookViews>
  <workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" /> 
  </bookViews>
- <sheets>
  <sheet name="By song" sheetId="1" r:id="rId1" /> 
  <sheet name="By actors" sheetId="2" r:id="rId2" /> 
  <sheet name="By pit" sheetId="3" r:id="rId3" /> 
  </sheets>
- <definedNames>
  <definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName> 
  </definedNames>
  <calcPr calcId="145621" /> 
  </workbook>

`


Another way:

a xls(x) file is just a collection of *.xml files stored in a *.zip container. unzip the file "app.xml" in the folder docProps.

<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
  -<vt:vector baseType="variant" size="2">
    -<vt:variant>
      <vt:lpstr>Arbeitsblätter</vt:lpstr>
    </vt:variant>
    -<vt:variant>
      <vt:i4>4</vt:i4>
    </vt:variant>
  </vt:vector>
</HeadingPairs>
-<TitlesOfParts>
  -<vt:vector baseType="lpstr" size="4">
    <vt:lpstr>Tabelle3</vt:lpstr>
    <vt:lpstr>Tabelle4</vt:lpstr>
    <vt:lpstr>Tabelle1</vt:lpstr>
    <vt:lpstr>Tabelle2</vt:lpstr>
  </vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>

The file is a german file (Arbeitsblätter = worksheets). The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)

regards


I have created the below function using the information provided in the answer from @kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.

    using System.IO.Compression;
    using System.Xml;
    using System.Xml.Linq;

    static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
    {
        //open the excel file
        using (FileStream data = new FileStream(fileName, FileMode.Open))
        {
            //unzip
            ZipArchive archive = new ZipArchive(data);

            //select the correct file from the archive
            ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");

            //read the xml
            XDocument xdoc = XDocument.Load(appxmlFile.Open());

            //find the titles element
            XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();

            //extract the worksheet names
            return titlesElement
                .Elements().Where(e => e.Name.LocalName == "vector").Single()
                .Elements().Where(e => e.Name.LocalName == "lpstr")
                .Select(e => e.Value);
        }
    }

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to oledb

'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? how to resolve DTS_E_OLEDBERROR. in ssis Check if ADODB connection is open Import Excel to Datagridview "Cannot create an instance of OLE DB provider" error as Windows Authentication user Optimal way to Read an Excel file (.xls/.xlsx) How to connect to a MS Access file (mdb) using C#? "The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine" Error in importing process of xlsx to a sql server How can I export data to an Excel file

Examples related to server-side

How do I parse a HTML page with Node.js Send message to specific client with socket.io and node.js Do copyright dates need to be updated? Using Excel OleDb to get sheet names IN SHEET ORDER