[python] python xlrd unsupported format, or corrupt file.

My code:

import xlrd
wb = xlrd.open_workbook("Z:\\Data\\Locates\\3.8 locates.xls")
sh = wb.sheet_by_index(0)
print sh.cell(0,0).value

The error:

Traceback (most recent call last):
File "Z:\Wilson\tradedStockStatus.py", line 18, in <module>
wb = xlrd.open_workbook("Z:\\Data\\Locates\\3.8 locates.xls")
File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 429, in open_workbook
biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1545, in getbof
bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1539, in bof_error
raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<table r'

The file doesn't seem to be corrupted or of a different format. Anything to help find the source of the issue would be great.

This question is related to python excel xlrd

The answer is


Open the file in Google Sheets, download it from Sheets as CSV, and re-upload it to Drive. Then you can open the CSV file from Python.


There's nothing wrong with your file. xlrd does not yet support xlsx (Excel 2007+) files, although it has been purported to support them for some time.

Simplistix GitHub

Two days ago they committed a pre-alpha version to their Git repository that integrates xlsx support. Other forums suggest using a DOM parser for xlsx files, since an xlsx file is just a zip archive containing XML. I have not tried this. There is another package with functionality similar to xlrd, called openpyxl, which you can get with easy_install or pip. I have not tried this either; however, its API is supposed to be similar to xlrd's.
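To illustrate the point that an xlsx file is a zip archive containing XML, here is a minimal sketch that uses only the standard library. The function name list_xlsx_parts is hypothetical, and the sketch only lists the archive members; it does not parse any worksheet data.

```python
import zipfile


def list_xlsx_parts(path):
    """Return the member names of an .xlsx archive, or None if it is not a zip.

    `path` may be a filename or a binary file-like object.
    """
    if not zipfile.is_zipfile(path):
        # Not a zip container, so it cannot be a real .xlsx file
        return None
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

A genuine workbook would normally list entries such as xl/workbook.xml; a None result suggests the file is something else behind an Excel extension.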


Try to open it with pandas:

import pandas as pd
data = pd.read_html('filename.xls')

Or try any other Python HTML parser.

That's not a proper Excel file, but an HTML file that Excel can read.
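Before choosing a parser, it can help to check what the file really contains. The sketch below guesses the format from the first few bytes; the function names and the returned labels are hypothetical, and the check is a heuristic rather than a full validation (for example, it does not handle a leading byte-order mark).

```python
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # container used by binary .xls files
ZIP_MAGIC = b"PK\x03\x04"  # .xlsx files are zip archives


def guess_kind_from_bytes(head):
    """Guess the real format from the leading bytes of a file."""
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    if head.lstrip().startswith(b"<"):
        # Markup such as '<table ...' or '<?xml ...' saved with an .xls extension
        return "html-or-xml"
    return "text"


def guess_kind(path):
    """Read the first bytes of the file at `path` and guess its format."""
    with open(path, "rb") as f:
        return guess_kind_from_bytes(f.read(8))
```

The '<table r' fragment in the error message from the question is exactly the kind of prefix that lands in the "html-or-xml" branch.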


Sometimes it helps to add ?raw=true at the end of the file path. For example:

wb = xlrd.open_workbook("Z:\\Data\\Locates\\3.8 locates.xls?raw=true")

I just downloaded xlrd, created an Excel document (Excel 2007) for testing, and got the same error (the message says 'found PK\x03\x04\x14\x00\x06\x00'). The extension is .xlsx. I tried saving it in the older .xls format and the error disappeared.


I know there should be a proper way to solve this, but just to save time:

I uploaded my xlsx sheet to Google Sheets and then downloaded it again from Google Sheets. It is working now.

If you don't have time to solve the problem properly, you can try this.


I worked on the same issue and finally solved it. This is the top result for the question, so I'm just posting what I did.

Observations:

1 - The file was not actually XLS. I renamed it to .txt and noticed HTML text in the file.

2 - I renamed the file to .html and tried reading it with pd.read_html. That failed.

3 - I added the html, head, and body tags, as they were not there in the text file, and removed the style block to make sure the table displayed in a browser locally. That worked.

Below is the code; it may help someone.

import shutil

import pandas as pd
from bs4 import BeautifulSoup

# Work on a copy so the original file stays untouched
shutil.copy('your.xls', 'file.txt')

with open('file.txt', 'r') as f:
    txt = f.read()

# Modify the text to ensure the data displays in an HTML page: delete the style block
txt = txt.replace(r'<style> .text { mso-number-format:\@; } </script>', '')

# Add head and body if they are not there in the HTML text
txt_with_head = '<html><head></head><body>' + txt + '</body></html>'

# Save the file as HTML; the with block closes it before it is read back
with open('output.html', 'w') as html_file:
    html_file.write(txt_with_head)

# Use Beautiful Soup to read the table
with open('output.html') as page:
    soup = BeautifulSoup(page.read(), features="lxml")
my_table = soup.find("table", attrs={'border': '1'})

frame = pd.read_html(str(my_table))[0]
print(frame.head())
frame.to_excel('testoutput.xlsx', sheet_name='sheet1', index=False)

I faced the same xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; error and solved it by writing an XML to XLSX converter. The reason is that xlrd does not support the XML Spreadsheet (*.xml) format, which is neither XLS nor XLSX.


import pandas as pd
from bs4 import BeautifulSoup

def convert_to_xlsx():
    with open('sample.xls') as xml_file:
        soup = BeautifulSoup(xml_file.read(), 'xml')
    # The context manager saves and closes the workbook on exit
    with pd.ExcelWriter('sample.xlsx') as writer:
        for sheet in soup.find_all('Worksheet'):
            sheet_as_list = []
            for row in sheet.find_all('Row'):
                sheet_as_list.append([cell.Data.text if cell.Data else '' for cell in row.find_all('Cell')])
            pd.DataFrame(sheet_as_list).to_excel(writer, sheet_name=sheet.attrs['ss:Name'], index=False, header=False)


In my case, after opening the file with a text editor as @john-machin suggested, I realized the file was not in the binary format a real Excel file uses; it was plain CSV that had been saved with an Excel extension. I renamed the file to change its extension and used the read_csv function instead:

import os
import pandas as pd
os.rename('sample_file.xls', 'sample_file.csv')
csv = pd.read_csv("sample_file.csv", on_bad_lines='skip')  # pandas >= 1.3; older versions used error_bad_lines=False

I had a similar problem and it was related to the version. In a Python terminal, check:

>>> import xlrd
>>> xlrd.__VERSION__

If you have '0.9.0' you can open almost all files. If you have '0.6.0', which is what I found on Ubuntu, you may have problems with the newest Excel files. You can download the latest version of xlrd and install it the standard Distutils way.


I ran into this problem too. I opened the file in Excel, saved it in another format such as Excel 97-2003, and that solved the problem.


This can also happen with some files while they are open in Excel.


I had the same issue. Those old files are formatted like a tab-delimited file. I've been able to open my problem files with read_table, i.e. df = pd.read_table('trouble_maker.xls').
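If it is not obvious whether such a file is tab-delimited or comma-delimited, the standard library's csv.Sniffer can make a guess from a sample of the text. This is a minimal sketch; the function name detect_delimiter and the list of candidate delimiters are assumptions chosen for illustration.

```python
import csv


def detect_delimiter(sample, candidates=",\t;|"):
    """Guess the delimiter used in a text sample.

    Raises csv.Error if the sniffer cannot decide.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter
```

The result can then be passed on, for example as the sep argument of pd.read_csv, after reading the first few kilobytes of the problem file as the sample.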


I ran into the same problem.

The cause lies in the .xls file itself: it looks like an Excel file, but it isn't one. (Check whether a warning pops up when you simply open the .xls file in Excel.)

The comment by sjmachin on Jan 19, 2013 at https://github.com/python-excel/xlrd/issues/26 helps.


I hit a similar problem when I downloaded an .xls file and opened it with the xlrd library. I then tried the solution of converting the .xls file to .xlsx, as detailed here: how to convert xls to xlsx

It works like a charm, and rather than opening the .xls file, I am now working with the .xlsx file using the openpyxl library.

I hope it helps to solve your issue.

