How can I print a pandas dataframe as a nice text-based table, like the following?
+------------+---------+-------------+
| column_one | col_two | column_3 |
+------------+---------+-------------+
| 0 | 0.0001 | ABCD |
| 1 | 1e-005 | ABCD |
| 2 | 1e-006 | long string |
| 3 | 1e-007 | ABCD |
+------------+---------+-------------+
Maybe you're looking for something like this:
def tableize(df):
if not isinstance(df, pd.DataFrame):
return
df_columns = df.columns.tolist()
max_len_in_lst = lambda lst: len(sorted(lst, reverse=True, key=len)[0])
align_center = lambda st, sz: "{0}{1}{0}".format(" "*(1+(sz-len(st))//2), st)[:sz] if len(st) < sz else st
align_right = lambda st, sz: "{0}{1} ".format(" "*(sz-len(st)-1), st) if len(st) < sz else st
max_col_len = max_len_in_lst(df_columns)
max_val_len_for_col = dict([(col, max_len_in_lst(df.iloc[:,idx].astype('str'))) for idx, col in enumerate(df_columns)])
col_sizes = dict([(col, 2 + max(max_val_len_for_col.get(col, 0), max_col_len)) for col in df_columns])
build_hline = lambda row: '+'.join(['-' * col_sizes[col] for col in row]).join(['+', '+'])
build_data = lambda row, align: "|".join([align(str(val), col_sizes[df_columns[idx]]) for idx, val in enumerate(row)]).join(['|', '|'])
hline = build_hline(df_columns)
out = [hline, build_data(df_columns, align_center), hline]
for _, row in df.iterrows():
out.append(build_data(row.tolist(), align_right))
out.append(hline)
return "\n".join(out)
df = pd.DataFrame([[1, 2, 3], [11111, 22, 333]], columns=['a', 'b', 'c'])
print tableize(df)
Output: +-------+----+-----+ | a | b | c | +-------+----+-----+ | 1 | 2 | 3 | | 11111 | 22 | 333 | +-------+----+-----+
A simple approach is to output as html, which pandas does out of the box:
df.to_html('temp.html')
If you want an inbuilt function to dump your data into some github markdown, you now have one. Take a look at to_markdown
:
df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b'])
print(df.to_markdown())
| | A | B |
|:---|----:|----:|
| a | 1 | 1 |
| a | 2 | 2 |
| b | 3 | 3 |
Here's what that looks like on github:
Note that you will still need to have the tabulate
package installed.
Following up on Mark's answer, if you're not using Jupyter for some reason, e.g. you want to do some quick testing on the console, you can use the DataFrame.to_string
method, which works from -- at least -- Pandas 0.12 (2014) onwards.
import pandas as pd
matrix = [(1, 23, 45), (789, 1, 23), (45, 678, 90)]
df = pd.DataFrame(matrix, columns=list('abc'))
print(df.to_string())
# outputs:
# a b c
# 0 1 23 45
# 1 789 1 23
# 2 45 678 90
You can use prettytable to render the table as text. The trick is to convert the data_frame to an in-memory csv file and have prettytable read it. Here's the code:
from StringIO import StringIO
import prettytable
output = StringIO()
data_frame.to_csv(output)
output.seek(0)
pt = prettytable.from_csv(output)
print pt
If you are in Jupyter notebook, you could run the following code to interactively display the dataframe in a well formatted table.
This answer builds on the to_html('temp.html') answer above, but instead of creating a file displays the well formatted table directly in the notebook:
from IPython.display import display, HTML
display(HTML(df.to_html()))
Credit for this code due to example at: Show DataFrame as table in iPython Notebook
I used Ofer's answer for a while and found it great in most cases. Unfortunately, due to inconsistencies between pandas's to_csv and prettytable's from_csv, I had to use prettytable in a different way.
One failure case is a dataframe containing commas:
pd.DataFrame({'A': [1, 2], 'B': ['a,', 'b']})
Prettytable raises an error of the form:
Error: Could not determine delimiter
The following function handles this case:
def format_for_print(df):
table = PrettyTable([''] + list(df.columns))
for row in df.itertuples():
table.add_row(row)
return str(table)
If you don't care about the index, use:
def format_for_print2(df):
table = PrettyTable(list(df.columns))
for row in df.itertuples():
table.add_row(row[1:])
return str(table)
I've just found a great tool for that need, it is called tabulate.
It prints tabular data and works with DataFrame
.
from tabulate import tabulate
import pandas as pd
df = pd.DataFrame({'col_two' : [0.0001, 1e-005 , 1e-006, 1e-007],
'column_3' : ['ABCD', 'ABCD', 'long string', 'ABCD']})
print(tabulate(df, headers='keys', tablefmt='psql'))
+----+-----------+-------------+
| | col_two | column_3 |
|----+-----------+-------------|
| 0 | 0.0001 | ABCD |
| 1 | 1e-05 | ABCD |
| 2 | 1e-06 | long string |
| 3 | 1e-07 | ABCD |
+----+-----------+-------------+
Note:
To suppress row indices for all types of data, pass
showindex="never"
orshowindex=False
.
Source: Stackoverflow.com