[c#] Importing Excel into a DataTable Quickly

I am trying to read an Excel file into a list of Data.DataTable, although with my current method it can take a very long time. I essentually go Worksheet by Worksheet, cell by cell, and it tends to take a very long time. Is there a quicker way of doing this? Here is my code:

    List<DataTable> List = new List<DataTable>();

    // Counting sheets
    for (int count = 1; count < WB.Worksheets.Count; ++count)
    {
        // Create a new DataTable for every Worksheet
        DATA.DataTable DT = new DataTable();

        WS = (EXCEL.Worksheet)WB.Worksheets.get_Item(count);

        textBox1.Text = count.ToString();

        // Get range of the worksheet
        Range = WS.UsedRange;


        // Create new Column in DataTable
        for (cCnt = 1; cCnt <= Range.Columns.Count; cCnt++)
        {
            textBox3.Text = cCnt.ToString();


                Column = new DataColumn();
                Column.DataType = System.Type.GetType("System.String");
                Column.ColumnName = cCnt.ToString();
                DT.Columns.Add(Column);

            // Create row for Data Table
            for (rCnt = 0; rCnt <= Range.Rows.Count; rCnt++)
            {
                textBox2.Text = rCnt.ToString();

                try
                {
                    cellVal = (string)(Range.Cells[rCnt, cCnt] as EXCEL.Range).Value2;
                }
                catch (Microsoft.CSharp.RuntimeBinder.RuntimeBinderException)
                {
                    ConvertVal = (double)(Range.Cells[rCnt, cCnt] as EXCEL.Range).Value2;
                    cellVal = ConvertVal.ToString();
                }

                // Add to the DataTable
                if (cCnt == 1)
                {

                    Row = DT.NewRow();
                    Row[cCnt.ToString()] = cellVal;
                    DT.Rows.Add(Row);
                }
                else
                {

                    Row = DT.Rows[rCnt];
                    Row[cCnt.ToString()] = cellVal;

                }
            }
        }
        // Add DT to the list. Then go to the next sheet in the Excel Workbook
        List.Add(DT);
    }

This question is related to c# .net excel office-interop

The answer is



MS Office Interop is slow and even Microsoft does not recommend Interop usage on server side and cannot be use to import large Excel files. For more details see why not to use OLE Automation from Microsoft point of view.

Instead, you can use any Excel library, like EasyXLS for example. This is a code sample that shows how to read the Excel file:

ExcelDocument workbook = new ExcelDocument();
DataSet ds = workbook.easy_ReadXLSActiveSheet_AsDataSet("excel.xls");
DataTable dataTable = ds.Tables[0];

If your Excel file has multiple sheets or for importing only ranges of cells (for better performances) take a look to more code samples on how to import Excel to DataTable in C# using EasyXLS.


In case anyone else is using EPPlus. This implementation is pretty naive, but there are comments that draw attention to such. If you were to layer one more method GetWorkbookAsDataSet() on top it would do what the OP is asking for.

    /// <summary>
    /// Assumption: Worksheet is in table format with no weird padding or blank column headers.
    /// 
    /// Assertion: Duplicate column names will be aliased by appending a sequence number (eg. Column, Column1, Column2)
    /// </summary>
    /// <param name="worksheet"></param>
    /// <returns></returns>
    public static DataTable GetWorksheetAsDataTable(ExcelWorksheet worksheet)
    {
        var dt = new DataTable(worksheet.Name);
        dt.Columns.AddRange(GetDataColumns(worksheet).ToArray());
        var headerOffset = 1; //have to skip header row
        var width = dt.Columns.Count;
        var depth = GetTableDepth(worksheet, headerOffset);
        for (var i = 1; i <= depth; i++)
        {
            var row = dt.NewRow();
            for (var j = 1; j <= width; j++)
            {
                var currentValue = worksheet.Cells[i + headerOffset, j].Value;

                //have to decrement b/c excel is 1 based and datatable is 0 based.
                row[j - 1] = currentValue == null ? null : currentValue.ToString();
            }

            dt.Rows.Add(row);
        }

        return dt;
    }

    /// <summary>
    /// Assumption: There are no null or empty cells in the first column
    /// </summary>
    /// <param name="worksheet"></param>
    /// <returns></returns>
    private static int GetTableDepth(ExcelWorksheet worksheet, int headerOffset)
    {
        var i = 1;
        var j = 1;
        var cellValue = worksheet.Cells[i + headerOffset, j].Value;
        while (cellValue != null)
        {
            i++;
            cellValue = worksheet.Cells[i + headerOffset, j].Value;
        }

        return i - 1; //subtract one because we're going from rownumber (1 based) to depth (0 based)
    }

    private static IEnumerable<DataColumn> GetDataColumns(ExcelWorksheet worksheet)
    {
        return GatherColumnNames(worksheet).Select(x => new DataColumn(x));
    }

    private static IEnumerable<string> GatherColumnNames(ExcelWorksheet worksheet)
    {
        var columns = new List<string>();

        var i = 1;
        var j = 1;
        var columnName = worksheet.Cells[i, j].Value;
        while (columnName != null)
        {
            columns.Add(GetUniqueColumnName(columns, columnName.ToString()));
            j++;
            columnName = worksheet.Cells[i, j].Value;
        }

        return columns;
    }

    private static string GetUniqueColumnName(IEnumerable<string> columnNames, string columnName)
    {
        var colName = columnName;
        var i = 1;
        while (columnNames.Contains(colName))
        {
            colName = columnName + i.ToString();
            i++;
        }

        return colName;
    }

Dim sSheetName As String
Dim sConnection As String
Dim dtTablesList As DataTable
Dim oleExcelCommand As OleDbCommand
Dim oleExcelReader As OleDbDataReader
Dim oleExcelConnection As OleDbConnection

sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Test.xls;Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""

oleExcelConnection = New OleDbConnection(sConnection)
oleExcelConnection.Open()

dtTablesList = oleExcelConnection.GetSchema("Tables")

If dtTablesList.Rows.Count > 0 Then
    sSheetName = dtTablesList.Rows(0)("TABLE_NAME").ToString
End If

dtTablesList.Clear()
dtTablesList.Dispose()

If sSheetName <> "" Then

    oleExcelCommand = oleExcelConnection.CreateCommand()
    oleExcelCommand.CommandText = "Select * From [" & sSheetName & "]"
    oleExcelCommand.CommandType = CommandType.Text

    oleExcelReader = oleExcelCommand.ExecuteReader

    nOutputRow = 0

    While oleExcelReader.Read

    End While

    oleExcelReader.Close()

End If

oleExcelConnection.Close()

 class DataReader
    {
        Excel.Application xlApp;
        Excel.Workbook xlBook;
        Excel.Range xlRange;
        Excel.Worksheet xlSheet;
        public DataTable GetSheetDataAsDataTable(String filePath, String sheetName)
        {
            DataTable dt = new DataTable();
            try
            {
                xlApp = new Excel.Application();
                xlBook = xlApp.Workbooks.Open(filePath);
                xlSheet = xlBook.Worksheets[sheetName];
                xlRange = xlSheet.UsedRange;
                DataRow row=null;
                for (int i = 1; i <= xlRange.Rows.Count; i++)
                {
                    if (i != 1)
                        row = dt.NewRow();
                    for (int j = 1; j <= xlRange.Columns.Count; j++)
                    {
                        if (i == 1)
                            dt.Columns.Add(xlRange.Cells[1, j].value);
                        else
                            row[j-1] = xlRange.Cells[i, j].value;
                    }
                    if(row !=null)
                        dt.Rows.Add(row);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            finally
            {
                xlBook.Close();
                xlApp.Quit();
            }
            return dt;
        }
    }

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to office-interop

How to properly set Column Width upon creating Excel file? (Column properties) Importing Excel into a DataTable Quickly How to fix 'Microsoft Excel cannot open or save any more documents' What reference do I need to use Microsoft.Office.Interop.Excel in .NET? How to read an excel file in C# without using Microsoft.Office.Interop.Excel libraries Reading Datetime value From Excel sheet How to make correct date format when writing data to Excel Create Excel files from C# without office Exporting the values in List to excel C# - How to add an Excel Worksheet programmatically - Office XP / 2003