[c#] How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel?

What I want to do

I'm trying to use the Microsoft.Office.Interop.Excel namespace to open an Excel file (XSL or CSV, but sadly not XSLX) and import it into a DataSet. I don't have control over the worksheet or column names, so I need to allow for changes to them.

What I've tried

I've tried the OLEDB method of this in the past, and had a lot of problems with it (buggy, slow, and required prior knowledge of the Excel file's schema), so I want to avoid doing that again. What I'd like to do is use Microsoft.Office.Interop.Excel to import the workbook directly to a DataSet, or loop through the worksheets and load each one into a DataTable.

Believe it or not, I've had trouble finding resources for this. A few searches on StackOverflow have found mostly people trying to do the reverse (DataSet => Excel), or the OLEDB technique. Google hasn't been much more helpful.

What I've got so far

    public void Load(string filename, Excel.XlFileFormat format = Excel.XlFileFormat.xlCSV)
    {
        app = new Excel.Application();
        book = app.Workbooks.Open(Filename: filename, Format: format);

        DataSet ds = new DataSet();

        foreach (Excel.Worksheet sheet in book.Sheets)
        {
            DataTable dt = new DataTable(sheet.Name);
            ds.Tables.Add(dt);

            //??? Fill dt from sheet 
        }

        this.Data = ds;
    }

I'm fine with either importing the entire book at once, or looping through one sheet at a time. Can I do this with Interop.Excel?

This question is related to c# excel interop dataset excel-interop

The answer is


What about using Excel Data Reader (previously hosted here) an open source project on codeplex? Its works really well for me to export data from excel sheets.

The sample code given on the link specified:

FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);

//1. Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
//...
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
//...
//3. DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();
//...
//4. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();

//5. Data Reader methods
while (excelReader.Read())
{
//excelReader.GetInt32(0);
}

//6. Free resources (IExcelDataReader is IDisposable)
excelReader.Close();

UPDATE

After some search around, I came across this article: Faster MS Excel Reading using Office Interop Assemblies. The article only uses Office Interop Assemblies to read data from a given Excel Sheet. The source code is of the project is there too. I guess this article can be a starting point on what you trying to achieve. See if that helps

UPDATE 2

The code below takes an excel workbook and reads all values found, for each excel worksheet inside the excel workbook.

private static void TestExcel()
    {
        ApplicationClass app = new ApplicationClass();
        Workbook book = null;
        Range range = null;

        try
        {
            app.Visible = false;
            app.ScreenUpdating = false;
            app.DisplayAlerts = false;

            string execPath = Path.GetDirectoryName(Assembly.GetExecutingAssembly().CodeBase);

            book = app.Workbooks.Open(@"C:\data.xls", Missing.Value, Missing.Value, Missing.Value
                                              , Missing.Value, Missing.Value, Missing.Value, Missing.Value
                                             , Missing.Value, Missing.Value, Missing.Value, Missing.Value
                                            , Missing.Value, Missing.Value, Missing.Value);
            foreach (Worksheet sheet in book.Worksheets)
            {

                Console.WriteLine(@"Values for Sheet "+sheet.Index);

                // get a range to work with
                range = sheet.get_Range("A1", Missing.Value);
                // get the end of values to the right (will stop at the first empty cell)
                range = range.get_End(XlDirection.xlToRight);
                // get the end of values toward the bottom, looking in the last column (will stop at first empty cell)
                range = range.get_End(XlDirection.xlDown);

                // get the address of the bottom, right cell
                string downAddress = range.get_Address(
                    false, false, XlReferenceStyle.xlA1,
                    Type.Missing, Type.Missing);

                // Get the range, then values from a1
                range = sheet.get_Range("A1", downAddress);
                object[,] values = (object[,]) range.Value2;

                // View the values
                Console.Write("\t");
                Console.WriteLine();
                for (int i = 1; i <= values.GetLength(0); i++)
                {
                    for (int j = 1; j <= values.GetLength(1); j++)
                    {
                        Console.Write("{0}\t", values[i, j]);
                    }
                    Console.WriteLine();
                }
            }

        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
        finally
        {
            range = null;
            if (book != null)
                book.Close(false, Missing.Value, Missing.Value);
            book = null;
            if (app != null)
                app.Quit();
            app = null;
        }
    }

In the above code, values[i, j] is the value that you need to be added to the dataset. i denotes the row, whereas, j denotes the column.


Have you seen this one? From http://www.aspspider.com/resources/Resource510.aspx:

public DataTable Import(String path)
{
    Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
    Microsoft.Office.Interop.Excel.Workbook workBook = app.Workbooks.Open(path, 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);

    Microsoft.Office.Interop.Excel.Worksheet workSheet = (Microsoft.Office.Interop.Excel.Worksheet)workBook.ActiveSheet;

    int index = 0;
    object rowIndex = 2;

    DataTable dt = new DataTable();
    dt.Columns.Add("FirstName");
    dt.Columns.Add("LastName");
    dt.Columns.Add("Mobile");
    dt.Columns.Add("Landline");
    dt.Columns.Add("Email");
    dt.Columns.Add("ID");

    DataRow row;

    while (((Microsoft.Office.Interop.Excel.Range)workSheet.Cells[rowIndex, 1]).Value2 != null)
    {

        row = dt.NewRow();
        row[0] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)workSheet.Cells[rowIndex, 1]).Value2);
        row[1] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)workSheet.Cells[rowIndex, 2]).Value2);
        row[2] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)workSheet.Cells[rowIndex, 3]).Value2);
        row[3] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)workSheet.Cells[rowIndex, 4]).Value2);
        row[4] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)workSheet.Cells[rowIndex, 5]).Value2);
        index++;

        rowIndex = 2 + index;
        dt.Rows.Add(row);
    }
    app.Workbooks.Close();
    return dt;
}

object[,] valueArray = (object[,])excelRange.get_Value(XlRangeValueDataType.xlRangeValueDefault);

//Get the column names
for (int k = 0; k < valueArray.GetLength(1); )
{
    //add columns to the data table.
    dt.Columns.Add((string)valueArray[1,++k]);
}

//Load data into data table
object[] singleDValue = new object[valueArray.GetLength(1)];
//value array first row contains column names. so loop starts from 1 instead of 0
for (int i = 1; i < valueArray.GetLength(0); i++)
{
    Console.WriteLine(valueArray.GetLength(0) + ":" + valueArray.GetLength(1));
    for (int k = 0; k < valueArray.GetLength(1); )
    {
        singleDValue[k] = valueArray[i+1, ++k];
    }
    dt.LoadDataRow(singleDValue, System.Data.LoadOption.PreserveChanges);
}

Quiet Late though!.

This method is properly tested and it converts the excel to DataSet.

public DataSet Dtl()
        {
            //Instance reference for Excel Application
            Microsoft.Office.Interop.Excel.Application objXL = null;        
            //Workbook refrence
            Microsoft.Office.Interop.Excel.Workbook objWB = null;
            DataSet ds = new DataSet();
            try
            {
                objXL = new Microsoft.Office.Interop.Excel.Application();
                objWB = objXL.Workbooks.Open(@"Book1.xlsx");//Your path to excel file.
                foreach (Microsoft.Office.Interop.Excel.Worksheet objSHT in objWB.Worksheets)
                {
                    int rows = objSHT.UsedRange.Rows.Count;
                    int cols = objSHT.UsedRange.Columns.Count;
                    DataTable dt = new DataTable();
                    int noofrow = 1;
                    //If 1st Row Contains unique Headers for datatable include this part else remove it
                    //Start
                    for (int c = 1; c <= cols; c++)
                    {
                        string colname = objSHT.Cells[1, c].Text;
                        dt.Columns.Add(colname);
                        noofrow = 2;
                    }
                    //END
                    for (int r = noofrow; r <= rows; r++)
                    {
                        DataRow dr = dt.NewRow();
                        for (int c = 1; c <= cols; c++)
                        {
                            dr[c - 1] = objSHT.Cells[r, c].Text;
                        }
                        dt.Rows.Add(dr);
                    }
                   ds.Tables.Add(dt);
                }
                //Closing workbook
                objWB.Close();
                //Closing excel application
                objXL.Quit();
                return ds;
            }

            catch (Exception ex)
            {
               objWB.Saved = true;
                //Closing work book
                objWB.Close();
                //Closing excel application
                objXL.Quit();
                //Response.Write("Illegal permission");
                return ds;
            }
        }

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Reflection;
using Microsoft.Office.Interop.Excel;

namespace trg.satmap.portal.ParseAgentSkillMapping
{
    class ConvertXLStoDT
    {
        private StringBuilder errorMessages;

        public StringBuilder ErrorMessages
        {
            get { return errorMessages; }
            set { errorMessages = value; }
        }

        public ConvertXLStoDT()
        {
            ErrorMessages = new StringBuilder();
        }

        public System.Data.DataTable XLStoDTusingInterOp(string FilePath)
        {
            #region Excel important Note.
            /*
             * Excel creates XLS and XLSX files. These files are hard to read in C# programs. 
             * They are handled with the Microsoft.Office.Interop.Excel assembly. 
             * This assembly sometimes creates performance issues. Step-by-step instructions are helpful.
             * 
             * Add the Microsoft.Office.Interop.Excel assembly by going to Project -> Add Reference.
             */
            #endregion

            Microsoft.Office.Interop.Excel.Application excelApp = null;
            Microsoft.Office.Interop.Excel.Workbook workbook = null;


            System.Data.DataTable dt = new System.Data.DataTable(); //Creating datatable to read the content of the Sheet in File.

            try
            {

                excelApp = new Microsoft.Office.Interop.Excel.Application(); // Initialize a new Excel reader. Must be integrated with an Excel interface object.

                //Opening Excel file(myData.xlsx)
                workbook = excelApp.Workbooks.Open(FilePath, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value);

                Microsoft.Office.Interop.Excel.Worksheet ws = (Microsoft.Office.Interop.Excel.Worksheet)workbook.Sheets.get_Item(1);

                Microsoft.Office.Interop.Excel.Range excelRange = ws.UsedRange; //gives the used cells in sheet

                ws = null; // now No need of this so should expire.

                //Reading Excel file.               
                object[,] valueArray = (object[,])excelRange.get_Value(Microsoft.Office.Interop.Excel.XlRangeValueDataType.xlRangeValueDefault);

                excelRange = null; // you don't need to do any more Interop. Now No need of this so should expire.

                dt = ProcessObjects(valueArray);                

            }
            catch (Exception ex)
            {
                ErrorMessages.Append(ex.Message);
            }
            finally
            {
                #region Clean Up                
                if (workbook != null)
                {
                    #region Clean Up Close the workbook and release all the memory.
                    workbook.Close(false, FilePath, Missing.Value);                    
                    System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook);
                    #endregion
                }
                workbook = null;

                if (excelApp != null)
                {
                    excelApp.Quit();
                }
                excelApp = null;                

                #endregion
            }
            return (dt);
        }

        /// <summary>
        /// Scan the selected Excel workbook and store the information in the cells
        /// for this workbook in an object[,] array. Then, call another method
        /// to process the data.
        /// </summary>
        private void ExcelScanIntenal(Microsoft.Office.Interop.Excel.Workbook workBookIn)
        {
            //
            // Get sheet Count and store the number of sheets.
            //
            int numSheets = workBookIn.Sheets.Count;

            //
            // Iterate through the sheets. They are indexed starting at 1.
            //
            for (int sheetNum = 1; sheetNum < numSheets + 1; sheetNum++)
            {
                Worksheet sheet = (Worksheet)workBookIn.Sheets[sheetNum];

                //
                // Take the used range of the sheet. Finally, get an object array of all
                // of the cells in the sheet (their values). You can do things with those
                // values. See notes about compatibility.
                //
                Range excelRange = sheet.UsedRange;
                object[,] valueArray = (object[,])excelRange.get_Value(XlRangeValueDataType.xlRangeValueDefault);

                //
                // Do something with the data in the array with a custom method.
                //
                ProcessObjects(valueArray);
            }
        }
        private System.Data.DataTable ProcessObjects(object[,] valueArray)
        {
            System.Data.DataTable dt = new System.Data.DataTable();

            #region Get the COLUMN names

            for (int k = 1; k <= valueArray.GetLength(1); k++)
            {
                dt.Columns.Add((string)valueArray[1, k]);  //add columns to the data table.
            }
            #endregion

            #region Load Excel SHEET DATA into data table

            object[] singleDValue = new object[valueArray.GetLength(1)];
            //value array first row contains column names. so loop starts from 2 instead of 1
            for (int i = 2; i <= valueArray.GetLength(0); i++)
            {
                for (int j = 0; j < valueArray.GetLength(1); j++)
                {
                    if (valueArray[i, j + 1] != null)
                    {
                        singleDValue[j] = valueArray[i, j + 1].ToString();
                    }
                    else
                    {
                        singleDValue[j] = valueArray[i, j + 1];
                    }
                }
                dt.LoadDataRow(singleDValue, System.Data.LoadOption.PreserveChanges);
            }
            #endregion


            return (dt);
        }
    }
}

Years after everyone's answer, I too want to present how I did it for my project

    /// <summary>
    /// /Reads an excel file and converts it into dataset with each sheet as each table of the dataset
    /// </summary>
    /// <param name="filename"></param>
    /// <param name="headers">If set to true the first row will be considered as headers</param>
    /// <returns></returns>
    public DataSet Import(string filename, bool headers = true)
    {
        var _xl = new Excel.Application();
        var wb = _xl.Workbooks.Open(filename);
        var sheets = wb.Sheets;
        DataSet dataSet = null;
        if (sheets != null && sheets.Count != 0)
        {
            dataSet = new DataSet();
            foreach (var item in sheets)
            {
                var sheet = (Excel.Worksheet)item;
                DataTable dt = null;
                if (sheet != null)
                {
                    dt = new DataTable();
                    var ColumnCount = ((Excel.Range)sheet.UsedRange.Rows[1, Type.Missing]).Columns.Count;
                    var rowCount = ((Excel.Range)sheet.UsedRange.Columns[1, Type.Missing]).Rows.Count;

                    for (int j = 0; j < ColumnCount; j++)
                    {
                        var cell = (Excel.Range)sheet.Cells[1, j + 1];
                        var column = new DataColumn(headers ? cell.Value : string.Empty);
                        dt.Columns.Add(column);
                    }

                    for (int i = 0; i < rowCount; i++)
                    {
                        var r = dt.NewRow();
                        for (int j = 0; j < ColumnCount; j++)
                        {
                            var cell = (Excel.Range)sheet.Cells[i + 1 + (headers ? 1 : 0), j + 1];
                            r[j] = cell.Value;
                        }
                        dt.Rows.Add(r);
                    }

                }
                dataSet.Tables.Add(dt);
            }
        }
        _xl.Quit();
        return dataSet;
    }

Questions with c# tag:

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array How can I add raw data body to an axios request? Couldn't process file resx due to its being in the Internet or Restricted zone or having the mark of the web on the file Convert string to boolean in C# Entity Framework Core: A second operation started on this context before a previous operation completed ASP.NET Core - Swashbuckle not creating swagger.json file Is ConfigurationManager.AppSettings available in .NET Core 2.0? No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization Getting value from appsettings.json in .net core .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Automatically set appsettings.json for dev and release environments in asp.net core? How to use log4net in Asp.net core 2.0 Get ConnectionString from appsettings.json instead of being hardcoded in .NET Core 2.0 App Unable to create migrations after upgrading to ASP.NET Core 2.0 Update .NET web service to use TLS 1.2 Using app.config in .Net Core How to send json data in POST request using C# ASP.NET Core form POST results in a HTTP 415 Unsupported Media Type response How to enable CORS in ASP.net Core WebAPI VS 2017 Metadata file '.dll could not be found How to set combobox default value? How to get root directory of project in asp.net core. Directory.GetCurrentDirectory() doesn't seem to work correctly on a mac ALTER TABLE DROP COLUMN failed because one or more objects access this column Error: the entity type requires a primary key How to POST using HTTPclient content type = application/x-www-form-urlencoded CORS: credentials mode is 'include' Visual Studio 2017: Display method references Where is NuGet.Config file located in Visual Studio project? Unity Scripts edited in Visual studio don't provide autocomplete How to create roles in ASP.NET Core and assign them to users? Return file in ASP.Net Core Web API ASP.NET Core return JSON with status code auto create database in Entity Framework Core Class Diagrams in VS 2017 How to read/write files in .Net Core? How to read values from the querystring with ASP.NET Core? how to set ASPNETCORE_ENVIRONMENT to be considered for publishing an asp.net core application? ASP.NET Core Get Json Array using IConfiguration Entity Framework Core add unique constraint code-first No templates in Visual Studio 2017 ps1 cannot be loaded because running scripts is disabled on this system

Questions with excel tag:

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel? Creating an Array from a Range in VBA Excel: macro to export worksheet as CSV file without leaving my current Excel sheet VBA: Convert Text to Number EPPlus - Read Excel Table How to label scatterplot points by name? What's the difference between "end" and "exit sub" in VBA? Rename Excel Sheet with VBA Macro Extract Data from PDF and Add to Worksheet Quicker way to get all unique values of a column in VBA? Multiple conditions in an IF statement in Excel VBA How to find and replace with regex in excel Unprotect workbook without password Excel is not updating cells, options > formula > workbook calculation set to automatic Find row number of matching value If "0" then leave the cell blank Clear contents and formatting of an Excel cell with a single command Remove Duplicates from range of cells in excel vba Delete worksheet in Excel using VBA Get list of Excel files in a folder using VBA Excel doesn't update value unless I hit Enter Declare a variable as Decimal Parse XLSX with Node and create json Detect if a Form Control option button is selected in VBA Get length of array? Object of class stdClass could not be converted to string - laravel Java - Writing strings to a CSV file Quickest way to clear all sheet contents VBA VBA: Counting rows in a table (list object) Excel VBA If cell.Value =... then VBA Excel - Insert row below with same format including borders and frames excel - if cell is not blank, then do IF statement filter out multiple criteria using excel vba Referencing value in a closed Excel workbook using INDIRECT? Use Excel VBA to click on a button in Internet Explorer, when the button has no "name" associated IndexError: too many indices for array File name without extension name VBA (Excel) Conditional Formatting based on Adjacent Cell Value Easy way to export multiple data.frame to multiple Excel worksheets Using ExcelDataReader to read Excel data starting from a particular cell What are the RGB codes for the Conditional Formatting 'Styles' in Excel?

Questions with interop tag:

Exception from HRESULT: 0x800A03EC Error How to use Microsoft.Office.Interop.Excel on a machine without installed MS Office? How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel? How can I make SQL case sensitive string comparison on MySQL? Better way to cast object to int Write Array to Excel Range How do I properly clean up Excel interop objects?

Questions with dataset tag:

How to convert a Scikit-learn dataset to a Pandas dataset? Convert floats to ints in Pandas? Convert DataSet to List C#, Looping through dataset and show each record from a dataset column How to add header to a dataset in R? How I can filter a Datatable? Stored procedure return into DataSet in C# .Net Looping through a DataTable adding a datatable in a dataset How to fill Dataset with multiple tables? Iterate through DataSet A simple explanation of Naive Bayes Classification How to delete the first row of a dataframe in R? Sort columns of a dataframe by column name How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel? Direct method from SQL command text to DataSet DataAdapter.Fill(Dataset) Reading DataSet .NET - How do I retrieve specific items out of a Dataset? Filtering DataSet Select method in List<t> Collection Adding rows to dataset How to test if a DataSet is empty? Convert generic list to dataset in C# Should I Dispose() DataSet and DataTable? How to loop through a dataset in powershell? How to check for a Null value in VB.NET How do I find out if a column exists in a VB.Net DataRow

Questions with excel-interop tag:

Closing Excel Application Process in C# after Data Access Optimal way to Read an Excel file (.xls/.xlsx) How to fix 'Microsoft Excel cannot open or save any more documents' How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel? HRESULT: 0x800A03EC on Worksheet.range How to count the number of rows in excel with data? System.Runtime.InteropServices.COMException (0x800A03EC) How do I auto size columns through the Excel interop objects? Exporting the values in List to excel