[python] Sort Pandas Dataframe by Date

I have a pandas dataframe as follows:

Symbol  Date
A       02/20/2015
A       01/15/2016
A       08/21/2015

I want to sort it by Date, but the column is just an object.

I tried to make the column a date object, but I ran into an issue where that format is not the format needed. The format needed is 2015-02-20, etc.

So now I'm trying to figure out how to have numpy convert the 'American' dates into the ISO standard, so that I can make them date objects, so that I can sort by them.

How would I convert these American dates into the ISO standard, or is there a more straightforward method I'm missing within pandas?


The answer is


The date column can be parsed while the data is being read:

data = pd.read_csv(file_path, parse_dates=[date_column])

If the column is still stored as strings after reading, convert it with pd.to_datetime(), passing the format the dates are written in. For the month/day/year dates in the question:

data[date_column] = pd.to_datetime(data[date_column], format='%m/%d/%Y')

The format argument describes how the input strings are laid out; it does not change how the resulting dates are displayed.
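A minimal, self-contained sketch of both approaches, using an in-memory CSV in place of a real file. The CSV text and the variable names (csv_text, parsed, raw) are made up for illustration and only mirror the sample data from the question:

```python
import io

import pandas as pd

# Hypothetical CSV content mirroring the question's sample data.
csv_text = "Symbol,Date\nA,02/20/2015\nA,01/15/2016\nA,08/21/2015\n"

# Option 1: ask read_csv to parse the column while loading.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Option 2: load as plain strings, then convert with an explicit format.
raw = pd.read_csv(io.StringIO(csv_text))
raw["Date"] = pd.to_datetime(raw["Date"], format="%m/%d/%Y")

print(parsed.dtypes)
print(raw.sort_values("Date"))
```

Spelling out the format is usually the safer choice when the dates could be read as either month-first or day-first.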


@JAB's answer is fast and concise. But it changes the DataFrame you are trying to sort, which you may or may not want.

(Note: You almost certainly will want it, because your date columns should be dates, not strings!)

In the unlikely event that you don't want to change the dates into dates, you can also do it a different way.

First, get the index from your sorted Date column:

In [25]: pd.to_datetime(df.Date).order().index
Out[25]: Int64Index([0, 2, 1], dtype='int64')

Then use it to index your original DataFrame, leaving it untouched:

In [26]: df.ix[pd.to_datetime(df.Date).order().index]
Out[26]: 
        Date Symbol
0 2015-02-20      A
2 2015-08-21      A
1 2016-01-15      A

Magic!

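Note: the transcript above is from an older Pandas release. Series.order() was deprecated in 0.17.0 and later removed in favor of sort_values(), and ix was deprecated in 0.20.0 and removed in 1.0.0. On current versions, use df.loc[pd.to_datetime(df.Date).sort_values().index] instead.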
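For reference, here is roughly how the same idea looks on a recent Pandas. This is a sketch, not the original answer's code: the DataFrame is rebuilt from the question's sample rows, and the names order, sorted_df and sorted_with_key are made up for the example.

```python
import pandas as pd

# Sample data from the question; Date is stored as plain strings.
df = pd.DataFrame({
    "Symbol": ["A", "A", "A"],
    "Date": ["02/20/2015", "01/15/2016", "08/21/2015"],
})

# Sort a temporary datetime Series and keep only its index order.
order = pd.to_datetime(df["Date"], format="%m/%d/%Y").sort_values().index

# Select rows by label in that order; df itself is left untouched.
sorted_df = df.loc[order]

# Alternative (assumes pandas >= 1.1): let sort_values convert on the fly.
sorted_with_key = df.sort_values(
    "Date", key=lambda s: pd.to_datetime(s, format="%m/%d/%Y")
)

print(sorted_df)
```

In both variants the Date column of the result still holds the original strings, since only the ordering was computed from the parsed dates.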


The sort method has been deprecated and replaced with sort_values. After converting the column to datetime with df['Date'] = pd.to_datetime(df['Date']), sort with:

df.sort_values(by=['Date'])

Note: to sort in place and/or in descending order (most recent first):

df.sort_values(by=['Date'], inplace=True, ascending=False)
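Putting the conversion and the sort together, a small end-to-end sketch on the question's sample rows. The explicit format string and the names oldest_first and newest_first are additions for illustration:

```python
import pandas as pd

# Sample data from the question; Date starts out as strings (object dtype).
df = pd.DataFrame({
    "Symbol": ["A", "A", "A"],
    "Date": ["02/20/2015", "01/15/2016", "08/21/2015"],
})

# Convert the month/day/year strings into real datetimes.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Ascending sort returns a new DataFrame, oldest date first.
oldest_first = df.sort_values(by="Date")

# Descending sort, with the index renumbered from 0 afterwards.
newest_first = df.sort_values(by="Date", ascending=False).reset_index(drop=True)

print(oldest_first)
print(newest_first)
```

Once the column holds datetimes, Pandas displays them in ISO form (2015-02-20), so no separate conversion to ISO strings is needed just for sorting.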