Extracting just Month and Year separately from Pandas Datetime column

Question

I have a DataFrame, df, with the following column:

df['ArrivalDate'] =
...
936   2012-12-31
938   2012-12-29
965   2012-12-31
966   2012-12-31
967   2012-12-31
968   2012-12-31
969   2012-12-31
970   2012-12-29
971   2012-12-31
972   2012-12-29
973   2012-12-29
...

The elements of the column are pandas.tslib.Timestamp.

I want to just include the year and month. I thought there would be a simple way to do it, but I can't figure it out.

Here's what I've tried:

df['ArrivalDate'].resample('M', how='mean')

I got the following error:

Only valid with DatetimeIndex or PeriodIndex

Then I tried:

df['ArrivalDate'].apply(lambda x: x[:-2])

I got the following error:

'Timestamp' object has no attribute '__getitem__'

Any suggestions?

Edit: I sort of figured it out.

df.index = df['ArrivalDate']

Then I can resample another column using the index. But I'd still like a method for reconfiguring the entire column. Any ideas?

User · Answer

There are two steps to extract the year for the whole DataFrame without using the apply method.

Step 1

Convert the column to datetime:

df['ArrivalDate'] = pd.to_datetime(df['ArrivalDate'], format='%Y-%m-%d')

Step 2

Extract the year or the month using pd.DatetimeIndex:

pd.DatetimeIndex(df['ArrivalDate']).year

User · Answer

Single line: adding a column with 'year-month' pairs (pd.to_datetime first changes the column dtype to datetime before the operation):

df['yyyy-mm'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%Y-%m')

Accordingly, for an extra 'year' or 'month' column:

df['yyyy'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%Y')
df['mm'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%m')
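Note that strftime produces strings rather than datetime values. A minimal sketch, assuming a small made-up sample that mirrors the dates in the question:

```python
import pandas as pd

# Hypothetical sample data, mirroring the dates in the question.
df = pd.DataFrame({"ArrivalDate": ["2012-12-31", "2012-12-29", "2013-01-02"]})

# Parse to datetime, then format each value as a 'YYYY-MM' string.
df["yyyy-mm"] = pd.to_datetime(df["ArrivalDate"]).dt.strftime("%Y-%m")

# The new column holds plain Python strings, not timestamps.
print(df["yyyy-mm"].tolist())
print(type(df["yyyy-mm"].iloc[0]))
```

Because the result is text, date arithmetic on this column would need another pd.to_datetime call first.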

User · Answer

You can first convert your date strings with pandas.to_datetime, which gives you access to all of the numpy datetime and timedelta facilities. For example:

df['ArrivalDate'] = pandas.to_datetime(df['ArrivalDate'])
df['Month'] = df['ArrivalDate'].values.astype('datetime64[M]')
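As a rough illustration of what the datetime64[M] cast does, the sketch below uses made-up sample dates. The cast truncates each value to the first day of its month. The exact dtype pandas stores afterwards depends on the pandas version, so the example only compares values.

```python
import pandas as pd

# Hypothetical sample data, mirroring the dates in the question.
df = pd.DataFrame({"ArrivalDate": ["2012-12-31", "2012-12-29", "2013-01-02"]})
df["ArrivalDate"] = pd.to_datetime(df["ArrivalDate"])

# Casting the underlying NumPy array to month resolution drops the day,
# so every value lands on the first day of its month.
df["Month"] = df["ArrivalDate"].values.astype("datetime64[M]")

print(df)
```

The Month column stays a datetime column, so sorting, grouping and date arithmetic keep working on it.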

User · Answer

Thanks to jaknap32. I wanted to aggregate the results according to year and month, so this worked:

df_join['YearMonth'] = df_join['timestamp'].apply(lambda x: x.strftime('%Y%m'))

Output was neat:

0    201108
1    201108
2    201108
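The aggregation step itself is not shown above. One way it might look is sketched here, assuming a hypothetical "value" column to sum and made-up timestamps:

```python
import pandas as pd

# Hypothetical frame; the "value" column and the dates are illustrative only.
df_join = pd.DataFrame({
    "timestamp": pd.to_datetime(["2011-08-01", "2011-08-15", "2011-09-03"]),
    "value": [10, 20, 5],
})

# Build the 'YYYYMM' key (vectorized equivalent of the apply/strftime call).
df_join["YearMonth"] = df_join["timestamp"].dt.strftime("%Y%m")

# Aggregate the values per year-month key.
totals = df_join.groupby("YearMonth")["value"].sum()
print(totals)
```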

User · Answer

If you want new columns showing year and month separately, you can do this:

df['year'] = pd.DatetimeIndex(df['ArrivalDate']).year
df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month

or:

df['year'] = df['ArrivalDate'].dt.year
df['month'] = df['ArrivalDate'].dt.month

Then you can combine them or work with them just as they are.
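A short sketch of both forms side by side, plus one possible way to combine the two columns (grouping on the pair). The sample dates are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data, mirroring the dates in the question.
df = pd.DataFrame(
    {"ArrivalDate": pd.to_datetime(["2012-12-31", "2012-12-29", "2013-01-02"])}
)

# The .dt accessor form.
df["year"] = df["ArrivalDate"].dt.year
df["month"] = df["ArrivalDate"].dt.month

# The DatetimeIndex form yields the same numbers.
years_via_index = list(pd.DatetimeIndex(df["ArrivalDate"]).year)

# One way to combine them: count rows per (year, month) pair.
counts = df.groupby(["year", "month"]).size()
print(counts)
```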

User · Answer

df['year_month'] = df.datetime_column.apply(lambda x: str(x)[:7])

This worked fine for me. I didn't think pandas would interpret the resulting string as a date, but when I did the plot, it knew very well my agenda and the year_month strings were ordered properly. Gotta love pandas!
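The values produced this way remain plain strings. Zero-padded 'YYYY-MM' strings do sort lexicographically in chronological order, which may be why the ordering looked right. A small sketch with made-up dates:

```python
import pandas as pd

# Hypothetical sample data, deliberately out of chronological order.
df = pd.DataFrame(
    {"datetime_column": pd.to_datetime(["2013-02-10", "2012-11-05", "2012-03-20"])}
)

# str(Timestamp) starts with 'YYYY-MM-DD', so the first 7 characters are 'YYYY-MM'.
df["year_month"] = df.datetime_column.apply(lambda x: str(x)[:7])

# Plain string sorting already gives chronological order for this format.
ordered = sorted(df["year_month"])
print(ordered)
```

This would not hold for formats such as 'MM-YYYY' or unpadded months.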

User · Answer

You can directly access the year and month attributes, or request a datetime.datetime:

In [15]: t = pandas.tslib.Timestamp.now()

In [16]: t
Out[16]: Timestamp('2014-08-05 14:49:39.643701', tz=None)

In [17]: t.to_pydatetime() #datetime method is deprecated
Out[17]: datetime.datetime(2014, 8, 5, 14, 49, 39, 643701)

In [18]: t.day
Out[18]: 5

In [19]: t.month
Out[19]: 8

In [20]: t.year
Out[20]: 2014

One way to combine year and month is to make an integer encoding them, such as 201408 for August 2014. Along a whole column, you could do this as:

df['YearMonth'] = df['ArrivalDate'].map(lambda x: 100*x.year + x.month)

or many variants thereof.

I'm not a big fan of doing this, though, since it makes date alignment and arithmetic painful later, and especially painful for others who come upon your code or data without this same convention. A better way is to choose a day-of-month convention, such as the final non-US-holiday weekday, or the first day, etc., and leave the data in a date/time format with the chosen date convention.

The calendar module is useful for obtaining the number value of certain days such as the final weekday. Then you could do something like:

import calendar
import datetime
df['AdjustedDateToEndOfMonth'] = df['ArrivalDate'].map(
    lambda x: datetime.datetime(
        x.year,
        x.month,
        max(calendar.monthcalendar(x.year, x.month)[-1][:5])
    )
)

If you happen to be looking for a way to solve the simpler problem of just formatting the datetime column into some stringified representation, you can just make use of the strftime function from the datetime.datetime class, like this:

In [5]: df
Out[5]:
            date_time
0 2014-10-17 22:00:03

In [6]: df.date_time
Out[6]:
0   2014-10-17 22:00:03
Name: date_time, dtype: datetime64[ns]

In [7]: df.date_time.map(lambda x: x.strftime('%Y-%m-%d'))
Out[7]:
0    2014-10-17
Name: date_time, dtype: object
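For the "first day" variant of the day-of-month convention mentioned above, a minimal sketch follows. It swaps in pandas periods instead of the calendar module, which is an assumption rather than something from the answer, and the sample dates are made up:

```python
import pandas as pd

# Hypothetical sample data, mirroring the dates in the question.
df = pd.DataFrame(
    {"ArrivalDate": pd.to_datetime(["2012-12-31", "2012-12-29", "2013-01-02"])}
)

# Convert to monthly periods, then back to timestamps at the period start,
# so every date is moved to the first day of its month.
df["MonthStart"] = df["ArrivalDate"].dt.to_period("M").dt.to_timestamp()

# The column is still datetime-typed, so date arithmetic keeps working.
diff = df["MonthStart"].iloc[2] - df["MonthStart"].iloc[0]
print(df)
print(diff)
```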

User · Answer

If you want the month-year unique pair, using apply is pretty sleek:

df['mnth_yr'] = df['date_column'].apply(lambda x: x.strftime('%B-%Y'))

This outputs month-year in one column.

Don't forget to first change the format to datetime beforehand (I generally forget):

df['date_column'] = pd.to_datetime(df['date_column'])

User · Answer

KieranPC's solution is the correct approach for Pandas, but is not easily extendible for arbitrary attributes. For this, you can use getattr within a generator comprehension and combine using pd.concat:

# input data
list_of_dates = ['2012-12-31', '2012-12-29', '2012-12-30']
df = pd.DataFrame({'ArrivalDate': pd.to_datetime(list_of_dates)})

# define list of attributes required
L = ['year', 'month', 'day', 'dayofweek', 'dayofyear', 'weekofyear', 'quarter']

# define generator expression of series, one for each attribute
date_gen = (getattr(df['ArrivalDate'].dt, i).rename(i) for i in L)

# concatenate results and join to original dataframe
df = df.join(pd.concat(date_gen, axis=1))

print(df)

  ArrivalDate  year  month  day  dayofweek  dayofyear  weekofyear  quarter
0  2012-12-31  2012     12   31          0        366           1        4
1  2012-12-29  2012     12   29          5        364          52        4
2  2012-12-30  2012     12   30          6        365          52        4
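On recent pandas releases (2.0 and later) the weekofyear attribute is no longer available on the .dt accessor. A hedged variant of the same getattr idea takes the week number from isocalendar() instead. The names attrs and parts are illustrative, and a list is used in place of the generator:

```python
import pandas as pd

list_of_dates = ["2012-12-31", "2012-12-29", "2012-12-30"]
df = pd.DataFrame({"ArrivalDate": pd.to_datetime(list_of_dates)})

# Attributes that exist directly on the .dt accessor.
attrs = ["year", "month", "day", "dayofweek", "dayofyear", "quarter"]

# One Series per attribute, each renamed to the attribute name.
parts = [getattr(df["ArrivalDate"].dt, name).rename(name) for name in attrs]

# Concatenate the results column-wise and join to the original frame.
df = df.join(pd.concat(parts, axis=1))

# ISO week number, taken from isocalendar() (available since pandas 1.1).
df["weekofyear"] = df["ArrivalDate"].dt.isocalendar().week

print(df)
```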

User · Answer

Extracting the year, say from '2018-03-04':

df['Year'] = pd.DatetimeIndex(df['date']).year

The df['Year'] assignment creates a new column. If you want to extract the month, just use .month instead.

User · Answer

Best way found! The df['date_column'] has to be in datetime format.

df['month_year'] = df['date_column'].dt.to_period('M')

You could also use D for day, 2M for 2 months, etc. for different sampling intervals. In case one has time series data with timestamps, we can go for more granular sampling intervals such as 45Min for 45-minute or 15Min for 15-minute sampling.
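A minimal sketch of to_period('M') in use, assuming made-up sample dates and a hypothetical "value" column to aggregate:

```python
import pandas as pd

# Hypothetical frame; the "value" column and the dates are illustrative only.
df = pd.DataFrame({
    "date_column": pd.to_datetime(["2012-12-31", "2012-12-29", "2013-01-02"]),
    "value": [1, 2, 3],
})

# Each timestamp becomes a monthly Period such as 2012-12.
df["month_year"] = df["date_column"].dt.to_period("M")

# Periods can be used directly as group keys and sort chronologically.
monthly = df.groupby("month_year")["value"].sum()
print(monthly)
```

Unlike a formatted string, a Period keeps its calendar meaning, so it can be converted back to a timestamp later if needed.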
