Filtering Pandas DataFrames on dates

Question

I have a Pandas DataFrame with a 'date' column. Now I need to filter out all rows in the DataFrame that have dates outside of the next two months. Essentially, I only need to retain the rows that are within the next two months. What is the best way to achieve this?

User · Answer

If you have already converted the strings to dates using pd.to_datetime, you can just use:

    df = df[(df['Date'] > "2018-01-01") & (df['Date'] < "2019-07-01")]
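The same boolean-mask idea can be applied to the "next two months" window from the question. This is a minimal sketch: the sample rows and the `value` column are invented for illustration, and it assumes "next two months" means from today through today plus two calendar months, inclusive.

```python
import pandas as pd

# Hypothetical sample data; the column name 'Date' follows the answer above.
today = pd.Timestamp.today().normalize()
df = pd.DataFrame({
    "Date": [
        today - pd.Timedelta(days=10),   # in the past
        today + pd.Timedelta(days=5),    # inside the window
        today + pd.Timedelta(days=40),   # inside the window
        today + pd.Timedelta(days=100),  # beyond two months
    ],
    "value": [1, 2, 3, 4],
})

# Window boundaries: today up to two calendar months from today.
start = today
end = today + pd.DateOffset(months=2)

# Combine both conditions with & and keep only the matching rows.
mask = (df["Date"] >= start) & (df["Date"] <= end)
next_two_months = df[mask]
print(next_two_months)
```

`pd.DateOffset(months=2)` moves by calendar months rather than a fixed number of days, which is usually what "two months" means in practice.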

User · Answer

If the dates are in the index, then simply:

    df['20160101':'20160301']

User · Answer

When loading the CSV data file, set the date column as the index, as below, in order to filter data based on a range of dates. This was not needed for the now deprecated method pd.DataFrame.from_csv().

If you just want to show the data for two months, from Jan to Feb (e.g. 2020-01-01 to 2020-02-29), you can do so:

    import pandas as pd

    mydata = pd.read_csv('mydata.csv', index_col='date')  # or its index number, e.g. index_col=[0]
    mydata['2020-01-01':'2020-02-29']  # will pull all the columns

    # if you just need one column, e.g. Cost:
    mydata.loc['2020-01-01':'2020-02-29', 'Cost']

This has been tested working for Python 3.7. Hope you will find this useful.
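A self-contained sketch of the same approach follows. The CSV content and the `Units` column are made up for illustration, and `parse_dates=True` is an addition not present in the answer above; it asks pandas to parse the index as dates so that the result is a DatetimeIndex rather than plain strings.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'mydata.csv'.
csv_text = """date,Cost,Units
2019-12-31,10,1
2020-01-15,20,2
2020-02-29,30,3
2020-03-01,40,4
"""

# index_col makes 'date' the index; parse_dates=True converts it to datetimes.
mydata = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)

# Label-based slicing on a DatetimeIndex includes both endpoints.
jan_feb = mydata.loc["2020-01-01":"2020-02-29"]

# Restrict to a single column by passing it as the second argument to .loc.
jan_feb_cost = mydata.loc["2020-01-01":"2020-02-29", "Cost"]

print(jan_feb)
print(jan_feb_cost)
```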

User · Answer

You could just select the time range by doing:

    df.loc[start_date:end_date]

User · Answer

You can use pd.Timestamp to perform a query with a local reference:

    from datetime import datetime

    import pandas as pd
    import numpy as np

    df = pd.DataFrame()
    ts = pd.Timestamp

    df['date'] = np.array(np.arange(10) + datetime.now().timestamp(), dtype='M8[s]')

    print(df)
    print(df.query('date > @ts("20190515T071320")'))

with the output:

                     date
    0 2019-05-15 07:13:16
    1 2019-05-15 07:13:17
    2 2019-05-15 07:13:18
    3 2019-05-15 07:13:19
    4 2019-05-15 07:13:20
    5 2019-05-15 07:13:21
    6 2019-05-15 07:13:22
    7 2019-05-15 07:13:23
    8 2019-05-15 07:13:24
    9 2019-05-15 07:13:25

                     date
    5 2019-05-15 07:13:21
    6 2019-05-15 07:13:22
    7 2019-05-15 07:13:23
    8 2019-05-15 07:13:24
    9 2019-05-15 07:13:25

Have a look at the pandas documentation for DataFrame.query, specifically the mention of local variables referenced using the @ prefix. In this case we reference pd.Timestamp through the local alias ts so that we can supply a timestamp string.
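A reproducible variant of the query approach is sketched below. Instead of building the column from the current time, it uses fixed timestamps (chosen to match the output shown above) and references a precomputed local variable, here named `cutoff` for illustration, with the @ prefix.

```python
import pandas as pd

# Ten consecutive seconds starting at a fixed, illustrative timestamp.
df = pd.DataFrame(
    {"date": pd.date_range("2019-05-15 07:13:16", periods=10, freq="s")}
)

# A local variable that the query string refers to as @cutoff.
cutoff = pd.Timestamp("2019-05-15 07:13:20")

# Keep only rows strictly later than the cutoff.
later = df.query("date > @cutoff")
print(later)
```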

User · Answer

The previous answer is not correct in my experience: you can't pass it a simple string; it needs to be a datetime object. So:

    import datetime

    df.loc[datetime.date(year=2014, month=1, day=1):datetime.date(year=2014, month=2, day=1)]

User · Answer

If your datetime column has the Pandas datetime type (e.g. datetime64[ns]), you need a pd.Timestamp object for proper filtering, for example:

    from datetime import date

    import pandas as pd

    value_to_check = pd.Timestamp(date.today().year, 1, 1)
    filter_mask = df['date_column'] < value_to_check
    filtered_df = df[filter_mask]

User · Answer

How about using pyjanitor? It has cool features.

After pip install pyjanitor:

    import janitor

    df_filtered = df.filter_date(your_date_column_name, start_date, end_date)

User · Answer

The shortest way to filter your dataframe by date. Let's suppose your date column is of type datetime64[ns]:

    # filter by single day
    df = df[df['date'].dt.strftime('%Y-%m-%d') == '2014-01-01']

    # filter by single month
    df = df[df['date'].dt.strftime('%Y-%m') == '2014-01']

    # filter by single year
    df = df[df['date'].dt.strftime('%Y') == '2014']
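The following sketch runs the string-formatting filters on a small invented frame, and also shows a comparison on the `.dt.year` attribute as an alternative that avoids formatting every date to a string. The sample dates and result variable names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data with a datetime64 'date' column.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2014-01-01", "2014-01-20", "2014-02-03", "2015-01-01"]
    )
})

# Format each date as text and compare against the wanted prefix.
single_day = df[df["date"].dt.strftime("%Y-%m-%d") == "2014-01-01"]
single_month = df[df["date"].dt.strftime("%Y-%m") == "2014-01"]
single_year = df[df["date"].dt.strftime("%Y") == "2014"]

# Alternative: compare the integer year component directly.
same_year = df[df["date"].dt.year == 2014]

print(single_day, single_month, single_year, sep="\n\n")
```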

User · Answer

If the date column is the index, then use .loc for label-based indexing or .iloc for positional indexing. For example:

    df.loc['2014-01-01':'2014-02-01']

See details here: http://pandas.pydata.org/pandas-docs/stable/dsintro.html#indexing-selection

If the column is not the index, you have two choices: make it the index (either temporarily, or permanently if it's time-series data), or filter with a boolean mask:

    df[(df['date'] > '2013-01-01') & (df['date'] < '2013-02-01')]

Note: .ix is deprecated.
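The two choices can be compared side by side in a short sketch. The sample frame and the column `x` are invented for illustration; the mask here uses inclusive comparisons (>= and <=) so that it selects the same rows as the .loc slice, which includes both endpoints.

```python
import pandas as pd

# Hypothetical sample data with the dates in an ordinary column.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2013-12-31", "2014-01-01", "2014-01-15", "2014-02-01", "2014-02-02",
    ]),
    "x": [0, 1, 2, 3, 4],
})

# Choice 1: make the column the index (temporarily) and slice by label.
indexed = df.set_index("date").sort_index()
by_label = indexed.loc["2014-01-01":"2014-02-01"]

# Choice 2: keep the column as is and build a boolean mask.
by_mask = df[(df["date"] >= "2014-01-01") & (df["date"] <= "2014-02-01")]

print(by_label)
print(by_mask)
```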

User · Answer

And if your dates are standardized using the datetime package, you can simply use:

    df[(df['date'] > datetime.date(2016, 1, 1)) & (df['date'] < datetime.date(2016, 3, 1))]

To standardize your date strings using the datetime package, you can use this function:

    import datetime

    datetime.datetime.strptime
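As a rough illustration of the parsing step, the sketch below converts date strings with datetime.datetime.strptime and then filters on the result. The sample strings and the "%Y-%m-%d" format are assumptions, and the bounds are datetime.datetime objects rather than datetime.date, because pandas stores the parsed values as a datetime64 column.

```python
import datetime

import pandas as pd

# Hypothetical sample data: dates stored as plain strings.
df = pd.DataFrame({"date": ["2016-01-15", "2016-02-20", "2016-03-05"]})

# Parse each string into a datetime object using an assumed format.
df["date"] = df["date"].apply(
    lambda s: datetime.datetime.strptime(s, "%Y-%m-%d")
)

# Bounds of the range, exclusive on both sides as in the answer above.
start = datetime.datetime(2016, 1, 1)
end = datetime.datetime(2016, 3, 1)

selected = df[(df["date"] > start) & (df["date"] < end)]
print(selected)
```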

User · Answer

I'm not allowed to write comments yet, so I'll write an answer, in case somebody reads all of them and reaches this one. If the index of the dataset is a datetime and you want to filter it just by, for example, month, you can do the following:

    df.loc[df.index.month == 3]

That will filter the dataset for you by March.
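A small runnable sketch of the month filter follows, using invented data. Note that `df.index.month == 3` matches March of every year, so the second selection adds a year condition for cases where only one specific March is wanted.

```python
import pandas as pd

# Hypothetical sample data with a DatetimeIndex spanning two years.
idx = pd.to_datetime(["2020-02-28", "2020-03-01", "2021-03-15", "2021-04-01"])
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=idx)

# All rows whose index falls in March, regardless of year.
march = df.loc[df.index.month == 3]

# Only March of one particular year.
march_2021 = df.loc[(df.index.month == 3) & (df.index.year == 2021)]

print(march)
print(march_2021)
```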
