[c#] Best / Fastest way to read an Excel Sheet into a DataTable?

I'm hoping someone here can point me in the right direction - I'm trying to create a fairly robust utility program to read the data from an Excel sheet (may be .xls OR .xlsx) into a DataTable as quickly and leanly as possible.

I came up with this routine in VB (although I'd be just as happy with a good C# answer):

Public Shared Function ReadExcelIntoDataTable(ByVal FileName As String, ByVal SheetName As String) As DataTable
    Dim RetVal As New DataTable

    Dim strConnString As String
    strConnString = "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=" & FileName & ";"

    Dim strSQL As String 
    strSQL = "SELECT * FROM [" & SheetName & "$]"

    Dim y As New Odbc.OdbcDataAdapter(strSQL, strConnString)

    y.Fill(RetVal)

    Return RetVal

End Function

I'm wondering if this is the best way to do it, or if there are better / more efficient ways (or just more intelligent ways, maybe LINQ / native .NET providers) to use instead?

ALSO, just a quick and silly additional question - Do I need to include code such as y.Dispose() and y = Nothing or will that be taken care of since the variable should die at the end of the routine, right??

Thanks!!

This question is related to c# .net vb.net

The answer is


This is how to read from Excel using OLE DB:

try
{
    System.Data.OleDb.OleDbConnection MyConnection;
    System.Data.DataSet DtSet;
    System.Data.OleDb.OleDbDataAdapter MyCommand;
    string strHeader7 = "";
    strHeader7 = (hdr7) ? "Yes" : "No";
    MyConnection = new System.Data.OleDb.OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fn + ";Extended Properties=\"Excel 12.0;HDR=" + strHeader7 + ";IMEX=1\"");
    MyCommand = new System.Data.OleDb.OleDbDataAdapter("select * from [" + wks + "$]", MyConnection);
    MyCommand.TableMappings.Add("Table", "TestTable");
    DtSet = new System.Data.DataSet();
    MyCommand.Fill(DtSet);
    dgv7.DataSource = DtSet.Tables[0];
    MyConnection.Close();
}
catch (Exception ex)
{
    MessageBox.Show(ex.ToString());
}

I have tested the code below myself; it is simple, easy to follow, and fast. It first reads all the sheet names, then loads every sheet of the Excel file into a DataSet.

    public static DataSet ToDataSet(string exceladdress, int startRecord = 0, int maxRecord = -1, string condition = "")
    {
        DataSet result = new DataSet();
        using (OleDbConnection connection = new OleDbConnection(
                (exceladdress.TrimEnd().ToLower().EndsWith("x"))
                ? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + exceladdress + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
                : "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + exceladdress + "';Extended Properties=Excel 8.0;"))
            try
            {
                connection.Open();
                DataTable schema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
                foreach (DataRow drSheet in schema.Rows)
                    if (drSheet["TABLE_NAME"].ToString().Contains("$"))
                    {
                        string s = drSheet["TABLE_NAME"].ToString();
                        if (s.StartsWith("'")) s = s.Substring(1, s.Length - 2);
                        System.Data.OleDb.OleDbDataAdapter command =
                            new System.Data.OleDb.OleDbDataAdapter(string.Join("", "SELECT * FROM [", s, "] ", condition), connection);
                        DataTable dt = new DataTable();
                        if (maxRecord > -1 && startRecord > -1) command.Fill(startRecord, maxRecord, dt);
                        else command.Fill(dt);
                        result.Tables.Add(dt);
                    }
                return result;
            }
            catch (Exception ex) { return null; }
            finally { connection.Close(); }
    }

Enjoy...


public DataTable ImportExceltoDatatable(string filepath)
{
    // string sqlquery= "Select * From [SheetName$] Where YourCondition";
    string sqlquery = "Select * From [SheetName$] Where Id='ID_007'";
    DataSet ds = new DataSet();
    string constring = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;HDR=YES;\"";
    using (OleDbConnection con = new OleDbConnection(constring))
    using (OleDbDataAdapter da = new OleDbDataAdapter(sqlquery, con))
    {
        // Fill opens the connection if it is closed and closes it again afterwards.
        da.Fill(ds);
    }
    return ds.Tables[0];
}

''' <summary>
''' ReadToDataTable reads the given Excel file to a datatable.
''' </summary>
''' <param name="table">The table to be populated.</param>
''' <param name="incomingFileName">The file to attempt to read from.</param>
''' <returns>TRUE if success, FALSE otherwise.</returns>
''' <remarks></remarks>
Public Function ReadToDataTable(ByRef table As DataTable,
                                incomingFileName As String) As Boolean
    Dim returnValue As Boolean = False
    Try

        Dim sheetName As String = ""
        Dim connectionString As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & incomingFileName & ";Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""
        Dim tablesInFile As DataTable
        Dim oleExcelCommand As OleDbCommand
        Dim oleExcelReader As OleDbDataReader
        Dim oleExcelConnection As OleDbConnection

        oleExcelConnection = New OleDbConnection(connectionString)
        oleExcelConnection.Open()

        tablesInFile = oleExcelConnection.GetSchema("Tables")

        If tablesInFile.Rows.Count > 0 Then
            sheetName = tablesInFile.Rows(0)("TABLE_NAME").ToString
        End If

        If sheetName <> "" Then

            oleExcelCommand = oleExcelConnection.CreateCommand()
            oleExcelCommand.CommandText = "Select * From [" & sheetName & "]"
            oleExcelCommand.CommandType = CommandType.Text

            oleExcelReader = oleExcelCommand.ExecuteReader

            'Determine what row of the Excel file we are on
            Dim currentRowIndex As Integer = 0

            While oleExcelReader.Read
                'If we are on the First Row, then add the item as Columns in the DataTable
                If currentRowIndex = 0 Then
                    For currentFieldIndex As Integer = 0 To (oleExcelReader.VisibleFieldCount - 1)
                        Dim currentColumnName As String = oleExcelReader.Item(currentFieldIndex).ToString
                        table.Columns.Add(currentColumnName, GetType(String))
                        table.AcceptChanges()
                    Next
                End If
                'If we are on a Row with Data, add the data to the SheetTable
                If currentRowIndex > 0 Then
                    Dim newRow As DataRow = table.NewRow
                    For currentFieldIndex As Integer = 0 To (oleExcelReader.VisibleFieldCount - 1)
                        Dim currentColumnName As String = table.Columns(currentFieldIndex).ColumnName
                        newRow(currentColumnName) = oleExcelReader.Item(currentFieldIndex)
                        If IsDBNull(newRow(currentFieldIndex)) Then
                            newRow(currentFieldIndex) = ""
                        End If
                    Next
                    table.Rows.Add(newRow)
                    table.AcceptChanges()
                End If

                'Increment the CurrentRowIndex
                currentRowIndex += 1
            End While

            oleExcelReader.Close()

        End If

        oleExcelConnection.Close()
        returnValue = True
    Catch ex As Exception
        'LastError = ex.ToString
        Return False
    End Try


    Return returnValue
End Function

I found it pretty easy like this

    using System;
    using System.Data;
    using System.IO;
    using Excel;

    public DataTable ExcelToDataTableUsingExcelDataReader(string storePath)
    {
        // Dispose the stream even if reading fails.
        using (FileStream stream = File.Open(storePath, FileMode.Open, FileAccess.Read))
        {
            string fileExtension = Path.GetExtension(storePath).ToLowerInvariant();
            IExcelDataReader excelReader;
            if (fileExtension == ".xls")
            {
                excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
            }
            else if (fileExtension == ".xlsx")
            {
                excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
            }
            else
            {
                // Fail clearly instead of hitting a null reference below.
                throw new NotSupportedException("Unsupported file extension: " + fileExtension);
            }

            using (excelReader)
            {
                excelReader.IsFirstRowAsColumnNames = true;
                DataSet result = excelReader.AsDataSet();
                return result.Tables[0];
            }
        }
    }

Note: you need to install SharpZipLib package for this

Install-Package SharpZipLib

neat and clean! ;)


Use the snippet below; it should be helpful.

string POCpath = @"G:\Althaf\abc.xlsx";

string POCConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + POCpath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=1\";";

OleDbConnection POCcon = new OleDbConnection(POCConnection);
DataTable dt = new DataTable();
OleDbDataAdapter POCadapter = new OleDbDataAdapter("select * from [Sheet1$] ", POCcon);
POCadapter.Fill(dt);
Console.WriteLine(dt.Rows.Count);

If you want to do the same thing in C#, based on Ciarán's answer:

string sSheetName = null;
string sConnection = null;
DataTable dtTablesList = default(DataTable);
OleDbCommand oleExcelCommand = default(OleDbCommand);
OleDbDataReader oleExcelReader = default(OleDbDataReader);
OleDbConnection oleExcelConnection = default(OleDbConnection);

sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\\Test.xls;Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";

oleExcelConnection = new OleDbConnection(sConnection);
oleExcelConnection.Open();

dtTablesList = oleExcelConnection.GetSchema("Tables");

if (dtTablesList.Rows.Count > 0) 
{
    sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
}

dtTablesList.Clear();
dtTablesList.Dispose();


if (!string.IsNullOrEmpty(sSheetName)) {
    oleExcelCommand = oleExcelConnection.CreateCommand();
    oleExcelCommand.CommandText = "Select * From [" + sSheetName + "]";
    oleExcelCommand.CommandType = CommandType.Text;
    oleExcelReader = oleExcelCommand.ExecuteReader();
    int nOutputRow = 0;

    while (oleExcelReader.Read())
    {
        // Process the current row here.
        nOutputRow++;
    }
    oleExcelReader.Close();
}
oleExcelConnection.Close();

Here is another, very quick way to read Excel data into a DataTable without using OLE DB. Keep in mind that the file extension has to be .csv for this to work properly.

private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
    csvData = new DataTable(defaultTableName);
    try
    {
        using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.SetDelimiters(new string[]
            {
                tableDelim 
            });
            csvReader.HasFieldsEnclosedInQuotes = true;
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn datecolumn = new DataColumn(column);
                datecolumn.AllowDBNull = true;
                csvData.Columns.Add(datecolumn);
            }

            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();
                //Skip rows that have any csv header information or blank rows in them
                //(checked before the field loop so that 'continue' skips the whole row)
                if (string.IsNullOrEmpty(fieldData[0]) || fieldData[0].Contains("Disclaimer"))
                {
                    continue;
                }
                //Making empty value as null
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == string.Empty)
                    {
                        fieldData[i] = string.Empty; //fieldData[i] = null
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
    }
    catch (Exception ex)
    {
    }
    return csvData;
}

This seemed to work pretty well for me.

private DataTable ReadExcelFile(string sheetName, string path)
{

    using (OleDbConnection conn = new OleDbConnection())
    {
        DataTable dt = new DataTable();
        string Import_FileName = path;
        string fileExtension = Path.GetExtension(Import_FileName);
        if (fileExtension == ".xls")
            conn.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 8.0;HDR=YES;'";
        if (fileExtension == ".xlsx")
            conn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'";
        using (OleDbCommand comm = new OleDbCommand())
        {
            comm.CommandText = "Select * from [" + sheetName + "$]";
            comm.Connection = conn;
            using (OleDbDataAdapter da = new OleDbDataAdapter())
            {
                da.SelectCommand = comm;
                da.Fill(dt);
                return dt;
            }
        }
    }
}

Here is another way of doing it

public DataSet CreateTable(string source)
{
    using (var connection = new OleDbConnection(GetConnectionString(source, true)))
    {
        var dataSet = new DataSet();
        connection.Open();
        var schemaTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        if (schemaTable == null)
            return dataSet;

        var sheetName = "";
        foreach (DataRow row in schemaTable.Rows)
        {
            sheetName = row["TABLE_NAME"].ToString();
            break;
        }

        // TABLE_NAME from the schema already ends with "$", so do not append another one.
        var command = string.Format("SELECT * FROM [{0}]", sheetName);
        var adapter = new OleDbDataAdapter(command, connection);
        adapter.TableMappings.Add("TABLE", "TestTable");
        adapter.Fill(dataSet);
        connection.Close();

        return dataSet;
    }
}

//

private string GetConnectionString(string source, bool hasHeader)
{
    return string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0;HDR={1};IMEX=1\"", source, (hasHeader ? "YES" : "NO"));
}

You can use the Open XML SDK for *.xlsx files; it is very fast. I made a simple C# IDataReader implementation for this SDK. See here. With it you can easily read an Excel file into a DataTable, and you can import an Excel file into a SQL Server database (using SqlBulkCopy). ExcelDataReader reads very fast: on my machine, 10,000 records take less than 3 seconds and 60,000 take less than 8 seconds.

Read to DataTable example:

class Program
{
    static void Main(string[] args)
    {
        var dt = new DataTable();
        using (var reader = new ExcelDataReader(@"data.xlsx"))
            dt.Load(reader);

        Console.WriteLine("done: " + dt.Rows.Count);
        Console.ReadKey();
   }
}
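
For the SQL Server import mentioned above, a minimal sketch could look something like the following. It assumes the DataTable has already been loaded and that the destination table already exists with matching column names; `ExcelBulkImport`, `BulkImport`, `connectionString` and `destinationTable` are hypothetical names for illustration, not part of the reader linked above.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class ExcelBulkImport
{
    // Hypothetical helper: copies every row of an already-loaded DataTable
    // into an existing SQL Server table.
    public static void BulkImport(DataTable table, string connectionString, string destinationTable)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = destinationTable;

            // Map columns by name so the column order in the sheet does not matter.
            foreach (DataColumn column in table.Columns)
            {
                bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
            }

            bulkCopy.WriteToServer(table);
        }
    }
}
```

Something like ExcelBulkImport.BulkImport(dt, connectionString, "dbo.ImportedRows") would then push the rows in one batch; the table name here is only a placeholder.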

I've used this method and for me, it is so efficient and fast.

// Step 1. Download NuGet source of Generic Parsing by Andrew Rissing
// Step 2. Reference this to your project
// Step 3. Reference Microsoft.Office.Interop.Excel to your project
// Step 4. Follow the logic below

public static DataTable ExcelSheetToDataTable(string filePath) {

    // Save a copy of the Excel file as CSV
    var xlApp = new XL.Application();
    var xlWbk = xlApp.Workbooks.Open(filePath);
    var tempPath =
        Path.Combine(Environment
            .GetFolderPath(Environment.SpecialFolder.UserProfile)
            , "AppData"
            , "Local"
            , "Temp"
            , Path.GetFileNameWithoutExtension(filePath) + ".csv");

    xlApp.DisplayAlerts = false;
    xlWbk.SaveAs(tempPath, XL.XlFileFormat.xlCSV);
    xlWbk.Close(SaveChanges: false);
    xlApp.Quit();

    // The actual parsing
    using (var parser = new GenericParserAdapter(tempPath)) {
        parser.FirstRowHasHeader = true;
        return parser.GetDataTable();
    }

}

Generic Parsing by Andrew Rissing

