[sql] How to run a SQL query on an Excel table?

I'm trying to create a sub-table from another table of all the last name fields sorted A-Z which have a phone number field that isn't null. I could do this pretty easily with SQL, but I have no clue how to go about running a SQL query within Excel. I'm tempted to import the data into PostgreSQL and just query it there, but that seems a little excessive.

For what I'm trying to do, a SQL query such as SELECT lastname, firstname, phonenumber FROM mytable WHERE phonenumber IS NOT NULL ORDER BY lastname would do the trick. It seems too simple for it to be something that Excel can't do natively. How can I run a SQL query like this from within Excel?


The answer is


tl;dr: Excel does all of this natively. Use filters and/or tables.

(http://office.microsoft.com/en-gb/excel-help/filter-data-in-an-excel-table-HA102840028.aspx)

You can open Excel programmatically through an OLE DB connection and execute SQL on the tables within the worksheet.

But you can do everything you are asking for with no formulas, just filters.

  1. Click anywhere within the data you are looking at.
  2. Go to Data on the ribbon bar.
  3. Select "Filter"; it's about the middle and looks like a funnel.
    • You will now have arrows on the right-hand side of each cell in the first row of your table.
  4. Click the arrow on phone number and de-select blanks (last option).
  5. Click the arrow on last name and select A-Z ordering (top option).

Have a play around. Some things to note:

  1. You can select the filtered rows and paste them somewhere else.
  2. In the status bar on the left you will see how many rows meet your filter criteria out of the total number of rows (e.g. 308 of 313 records found).
  3. You can filter by color in Excel 2010 onwards.
  4. Sometimes I create calculated columns that give statuses or cleaned versions of data; you can then filter or sort by these too (e.g. like the formulae in the other answers).
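
For readers who think in code, the two filter steps above (drop blanks, then sort A-Z) amount to roughly the following. This is only a minimal Python sketch: the sample rows and the helper name `filter_and_sort` are made up for illustration, and Excel itself needs none of it.

```python
# Hypothetical sample rows: (lastname, firstname, phonenumber).
rows = [
    ("Smith", "Ann", "555-0101"),
    ("Jones", "Bob", ""),
    ("Adams", "Cy", "555-0199"),
    ("Brown", "Di", None),
]


def filter_and_sort(rows):
    """Keep rows with a non-blank phone number, sorted A-Z by last name."""
    kept = [r for r in rows if r[2] not in (None, "")]
    # Case-insensitive ordering is an assumption about the desired sort.
    return sorted(kept, key=lambda r: r[0].lower())


result = filter_and_sort(rows)
# Same shape as the status bar message mentioned in note 2.
status = f"{len(result)} of {len(rows)} records found"
print(status)
for row in result:
    print(row)
```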

Do it with filters unless you are going to do it a lot, or you want to automate importing data somewhere. But for completeness:

A C# option:

 OleDbConnection ExcelFile = new OleDbConnection( String.Format( "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0;HDR=YES\"", filename));
 ExcelFile.Open();

A handy place to start is to take a look at the schema, as there may be more there than you think:

// Ask the provider for its table list (worksheets and named ranges).
DataTable dt = ExcelFile.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
List<string> excelSheets = new List<string>();

// Add each worksheet name (these end in '$') to the list.
foreach (DataRow row in dt.Rows) {
    string temp = row["TABLE_NAME"].ToString();
    if (temp.EndsWith("$")) {
        excelSheets.Add(temp);
    }
}

Then when you want to query a sheet:

// sheet is one of the names collected in excelSheets, e.g. "Sheet1$"
OleDbDataAdapter da = new OleDbDataAdapter("select * from [" + sheet + "]", ExcelFile);
dt = new DataTable();
da.Fill(dt);
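
The adapter above selects every column of the sheet. To run the query from the question instead, only the SQL text needs to change. The Python sketch below just builds that string. The helper `build_sheet_query` and the sheet name `Sheet1$` are hypothetical, and it assumes the first row of the sheet holds the column names (HDR=YES).

```python
def build_sheet_query(sheet, columns, not_null=None, order_by=None):
    """Build a SELECT statement against a worksheet exposed through OLE DB.

    `sheet` is a name as listed in the schema, e.g. "Sheet1$"; it is wrapped
    in square brackets, as in the C# snippet.
    """
    sql = f"SELECT {', '.join(columns)} FROM [{sheet}]"
    if not_null:
        sql += f" WHERE {not_null} IS NOT NULL"
    if order_by:
        sql += f" ORDER BY {order_by}"
    return sql


query = build_sheet_query(
    "Sheet1$",  # hypothetical sheet name
    ["lastname", "firstname", "phonenumber"],
    not_null="phonenumber",
    order_by="lastname",
)
print(query)
```

The resulting string could be passed to the OleDbDataAdapter in place of the "select * ..." literal. Names are interpolated directly, so this is only suitable for trusted sheet and column names.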

NOTE - Use tables in Excel!

Excel has "tables" functionality that makes data behave more like a table. This gives you some great benefits but is not going to let you do every type of query.

http://office.microsoft.com/en-gb/excel-help/overview-of-excel-tables-HA010048546.aspx

For tabular data in Excel this is my default. The first thing I do is click into the data, then select "Format as Table" from the Home section on the ribbon. This gives you filtering and sorting by default, and allows you to access the table and fields by name (e.g. table[fieldname]). This also allows aggregate functions on columns, e.g. max and average.


You can experiment with the native DB driver for Excel in the language/platform of your choice. In the Java world, you can try http://code.google.com/p/sqlsheet/, which provides a JDBC driver for working with Excel sheets directly. Similar drivers exist for the DB technologies of other platforms.

However, I can guarantee that you will soon hit a wall with the number of features these wrapper libraries provide. A better way is to use Apache POI (HSSF) or a library at a similar level, but it will need more coding effort.


I suggest having a look at the MySQL CSV storage engine, which essentially allows you to load any CSV file (easily created from Excel) into the database. Once you have that, you can use any SQL command you want.

It's worth a look.
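
The same idea can be tried without a MySQL server. The sketch below swaps in SQLite (bundled with Python) for the MySQL CSV engine: it loads CSV text, as exported from Excel, into an in-memory table and runs the query from the question. The table name `people`, the sample data, and the choice to store blank cells as NULL are all assumptions.

```python
import csv
import io
import sqlite3

# Stand-in for a file saved from Excel as CSV (hypothetical data).
CSV_TEXT = """lastname,firstname,phonenumber
Smith,Ann,555-0101
Jones,Bob,
Adams,Cy,555-0199
"""


def query_csv(csv_text):
    """Load CSV text into an in-memory SQLite table and run the question's query."""
    reader = csv.DictReader(io.StringIO(csv_text))
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (lastname TEXT, firstname TEXT, phonenumber TEXT)"
    )
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        # CSV has no NULL, so empty cells are stored as NULL (an assumption).
        [
            (r["lastname"], r["firstname"], r["phonenumber"] or None)
            for r in reader
        ],
    )
    rows = conn.execute(
        "SELECT lastname, firstname, phonenumber FROM people "
        "WHERE phonenumber IS NOT NULL ORDER BY lastname"
    ).fetchall()
    conn.close()
    return rows


print(query_csv(CSV_TEXT))
```

For a real export, the text would come from reading the saved .csv file rather than from a string constant.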


Microsoft Access and LibreOffice Base can open a spreadsheet as a source and run SQL queries on it. That would be the easiest way to run all kinds of queries, and avoid the mess of running macros or writing code.

Excel also has autofilters and data sorting that will accomplish a lot of simple queries like your example. If you need help with those features, Google would be a better source for tutorials than me.


If you have GDAL/OGR compiled against the Expat library, you can use the XLSX driver to read .xlsx files and run SQL expressions from a command prompt. For example, from an OSGeo4W shell in the same directory as the spreadsheet, use the ogrinfo utility:

ogrinfo -dialect sqlite -sql "SELECT name, count(*) FROM sheet1 GROUP BY name" Book1.xlsx

will run a SQLite query on sheet1, and output the query result in an unusual form:

INFO: Open of `Book1.xlsx'
      using driver `XLSX' successful.

Layer name: SELECT
Geometry: None
Feature Count: 36
Layer SRS WKT:
(unknown)
name: String (0.0)
count(*): Integer (0.0)
OGRFeature(SELECT):0
  name (String) = Red
  count(*) (Integer) = 849

OGRFeature(SELECT):1
  name (String) = Green
  count(*) (Integer) = 265
...

Or run the same query using ogr2ogr to make a simple CSV file:

$ ogr2ogr -f CSV out.csv -dialect sqlite \
          -sql "SELECT name, count(*) FROM sheet1 GROUP BY name" Book1.xlsx

$ cat out.csv
name,count(*)
Red,849
Green,265
...

To do the same with older .xls files, you would need the XLS driver, built against the FreeXL library, which is not really common (e.g. it is not included in OSGeo4W).


You can do this natively as follows:

  1. Select the table and use Excel to sort it on Last Name
  2. Create a 2-row by 1-column advanced filter criteria range, say in E1 and E2, where E1 is empty and E2 contains the formula =C6<>"" where C6 is the first data cell of the phone number column. Rows for which the formula is TRUE (phone number not blank) are kept.
  3. Select the table and use advanced filter, copy to a range, using the criteria range in E1:E2 and specify where you want to copy the output to

If you want to do this programmatically I suggest you use the Macro Recorder to record the above steps and look at the code.


Might I suggest giving QueryStorm a try - it's a plugin for Excel that makes it quite convenient to use SQL in Excel.

Also, it's freemium. If you don't care about autocomplete, error squigglies etc, you can use it for free. Just download and install, and you have SQL support in Excel.

Disclaimer: I'm the author.


If you need to do this once, just follow Charles' description, but it is also possible to do this with Excel formulas and helper columns in case you want to make the filter dynamic.

Let's assume your data is on the sheet DataSheet and starts in row 2 of the following columns:

  • A: lastname
  • B: firstname
  • C: phonenumber

You need two helper columns on this sheet.

  • D2: =if(C2 <> "", 1, 0), this is the filter column, corresponding to your where condition (1 when the phone number is not blank)
  • E2: =if(D2 <> 1, "", sumifs(D$2:D$1048576, A$2:A$1048576, "<"&A2) + sumifs(D$2:D2, A$2:A2, A2)), this corresponds to the order by

Copy down these formulas as far as your data goes.

On the sheet which should display your result create the following columns.

  • A: A sequence of numbers starting with 1 in row 2, this limits the total number of rows you can get (kind of like a LIMIT in SQL)
  • B2: =match(A2, DataSheet!$E$2:$E$1048576, 0), this is the row of the corresponding data
  • C2: =iferror(index(DataSheet!A$2:A$1048576, $B2), ""), this is the actual data or empty if no data exists

Copy down the formulas in B2 and C2 and copy-paste column C to D and E.
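
To make the helper-column logic easier to follow, here is a rough Python sketch of what the formulas compute. The sample rows are hypothetical, column D is taken to flag rows with a non-blank phone number, and last names are compared case-insensitively, which is an assumption about how Excel matches text here.

```python
# Hypothetical DataSheet rows: (lastname, firstname, phonenumber).
data = [
    ("Smith", "Ann", "555-0101"),
    ("Jones", "Bob", ""),
    ("Adams", "Cy", "555-0199"),
    ("Smith", "Al", "555-0123"),
]

# Column D: 1 when the row satisfies the WHERE condition (phone not blank).
d = [1 if phone else 0 for _, _, phone in data]

# Column E: rank of each kept row in last-name order; "" for filtered-out rows.
e = []
for i, (last, _, _) in enumerate(data):
    if d[i] != 1:
        e.append("")
        continue
    # First SUMIFS: kept rows whose last name sorts before this one.
    before = sum(d[j] for j, row in enumerate(data) if row[0].lower() < last.lower())
    # Second SUMIFS: kept rows with the same last name, up to and including this row.
    ties = sum(d[j] for j in range(i + 1) if data[j][0].lower() == last.lower())
    e.append(before + ties)

# Result sheet: MATCH finds the row holding rank n, INDEX pulls its values.
limit = 5  # length of the 1, 2, 3, ... sequence in column A
result = []
for n in range(1, limit + 1):
    if n in e:  # MATCH(n, E, 0) succeeds
        result.append(data[e.index(n)])  # INDEX(..., row)
    else:
        result.append(("", "", ""))  # IFERROR(..., "")

print(e)
print(result)
```

The second SUMIFS is what breaks ties: two rows with the same last name get consecutive ranks in their original order, so MATCH always finds exactly one row per rank.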


I might be misunderstanding, but isn't this exactly what a pivot table does? Do you have the data in a table or just a filtered list? If it's not a table, make it one (Ctrl+L); if it is, then simply activate any cell in the table and insert a pivot table on another sheet. Then add the columns lastname, firstname, phonenumber to the rows section. Then add phonenumber to the filter section and filter out the null values. Now sort like normal.


You can use SQL in Excel; it is just well hidden. See this tutorial:

http://smallbusiness.chron.com/use-sql-statements-ms-excel-41193.html

