[linux] Convert xlsx to csv in Linux with command line

I'm looking for a way to convert xlsx files to csv files on Linux.

I do not want to use PHP/Perl or anything like that since I'm looking at processing several millions of lines, so I need something quick. I found a program on the Ubuntu repos called xls2csv but it will only convert xls (Office 2003) files (which I'm currently using) but I need support for the newer Excel files.

Any ideas?

This question is related to linux excel csv converter xlsx

The answer is


If you are OK to run Java command line then you can do it with Apache POI HSSF's Excel Extractor. It has a main method that says to be the command line extractor. This one seems to just dump everything out. They point out to this example that converts to CSV. You would have to compile it before you can run it but it too has a main method so you should not have to do much coding per se to make it work.

Another option that might fly but will require some work on the other end is to make your Excel files come to you as Excel XML Data or XML Spreadsheet of whatever MS calls that format these days. It will open a whole new world of opportunities for you to slice and dice it the way you want.


In bash, I used this libreoffice command to convert all my xlsx files in the current directory:

for i   in *.xlsx; do  libreoffice --headless --convert-to csv "$i" ; done

It takes care of spaces in the filename.

Tried again some years later, and it didn't work. This thread gives some tips, but the quickiest solution was to run as root (or running a sudo libreoffice). Not elegant, but quick.

Use the command scalc.exe in Windows


Using the Gnumeric spreadsheet application which comes which a commandline utility called ssconvert is indeed super simple:

find . -name '*.xlsx' -exec ssconvert -T Gnumeric_stf:stf_csv {} \;

and you're done!


Use csvkit

in2csv data.xlsx > data.csv

For details check their excellent docs


Another option would be to use R via a small bash wrapper for convenience:

xlsx2txt(){
echo '
require(xlsx)
write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=F, row.names=FALSE, col.names=T, sep="\t")
' | Rscript --vanilla - $1 2>/dev/null
}

xlsx2txt file.xlsx > file.txt

The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:

$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv

$ cat newfile.csv 
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line

To install on Ubuntu:

apt-get install gnumeric

To install on Mac:

brew install gnumeric

If you already have a Desktop environment then I'm sure Gnumeric / LibreOffice would work well, but on a headless server (such as Amazon Web Services), they require dozens of dependencies that you also need to install.

I found this Python alternative:

https://github.com/dilshod/xlsx2csv

$ easy_install xlsx2csv
$ xlsx2csv file.xlsx > newfile.csv

Took 2 seconds to install and works like a charm.

If you have multiple sheets you can export all at once, or one at a time:

$ xlsx2csv file.xlsx --all > all.csv
$ xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
$ xlsx2csv file.xlsx -s 1 > sheet1.csv

He also links to several alternatives built in Bash, Python, Ruby, and Java.


As others said, libreoffice can convert xls files to csv. The problem for me was the sheet selection.

This libreoffice Python script does a fine job at converting a single sheet to CSV.

Usage is:

./libreconverter.py File.xls:"Sheet Name" output.csv

The only downside (on my end) is that --headless doesn't seem to work. I have a LO window that shows up for a second and then quits.
That's OK with me, it's the only tool that does the job rapidly.


If .xlsx file has many sheets, -s flag can be used to get the sheet you want. For example:

xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv

second_sheet.csv would contain data of 2nd sheet in my_file.xlsx.


You can do this with LibreOffice:

libreoffice --headless --convert-to csv $filename --outdir $outdir

For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding this line to you sudoers file:

users ALL=(ALL) NOPASSWD: libreoffice

Examples related to linux

grep's at sign caught as whitespace How to prevent Google Colab from disconnecting? "E: Unable to locate package python-pip" on Ubuntu 18.04 How to upgrade Python version to 3.7? Install Qt on Ubuntu Get first line of a shell command's output Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running? Run bash command on jenkins pipeline How to uninstall an older PHP version from centOS7 How to update-alternatives to Python 3 without breaking apt?

Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to csv

Pandas: ValueError: cannot convert float NaN to integer Export result set on Dbeaver to CSV Convert txt to csv python script How to import an Excel file into SQL Server? "CSV file does not exist" for a filename with embedded quotes Save Dataframe to csv directly to s3 Python Data-frame Object has no Attribute (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape How to write to a CSV line by line? How to check encoding of a CSV file

Examples related to converter

No converter found capable of converting from type to type How to convert datetime to integer in python Convert Int to String in Swift Concatenating elements in an array to a string Decimal to Hexadecimal Converter in Java Calendar date to yyyy-MM-dd format in java ffmpeg - Converting MOV files to MP4 Java: How to convert String[] to List or Set Convert xlsx to csv in Linux with command line How to convert string to long

Examples related to xlsx

How to save .xlsx data to file as a blob Parse XLSX with Node and create json Easy way to export multiple data.frame to multiple Excel worksheets Using Pandas to pd.read_excel() for multiple worksheets of the same workbook Python convert csv to xlsx How to Bulk Insert from XLSX file extension? Convert xlsx to csv in Linux with command line Importing Excel files into R, xlsx or xls Convert xlsx file to csv using batch Excel "External table is not in the expected format."