Convert pandas timezone-aware DateTimeIndex to naive timestamp, but in certain timezone

Question

You can use the function tz_localize to make a Timestamp or DateTimeIndex timezone aware, but how can you do the opposite: how can you convert a timezone-aware Timestamp to a naive one, while preserving its timezone?

An example:

    In [82]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10, freq='s', tz="Europe/Brussels")

    In [83]: t
    Out[83]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
    Length: 10, Freq: S, Timezone: Europe/Brussels

I could remove the timezone by setting it to None, but then the result is converted to UTC (12 o'clock became 10):

    In [86]: t.tz = None

    In [87]: t
    Out[87]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2013-05-18 10:00:00, ..., 2013-05-18 10:00:09]
    Length: 10, Freq: S, Timezone: None

Is there another way I can convert a DateTimeIndex to timezone naive, but while preserving the timezone it was set in?

Some context on the reason I am asking this: I want to work with timezone-naive timeseries (to avoid the extra hassle with timezones, and I do not need them for the case I am working on). But for some reason, I have to deal with a timezone-aware timeseries in my local timezone (Europe/Brussels). As all my other data are timezone naive (but represented in my local timezone), I want to convert this timeseries to naive to further work with it, but it also has to be represented in my local timezone (so just remove the timezone info, without converting the user-visible time to UTC).

I know the time is actually stored internally as UTC and only converted to another timezone when you represent it, so there has to be some kind of conversion when I want to "delocalize" it. For example, with the python datetime module you can "remove" the timezone like this:

    In [119]: d = pd.Timestamp("2013-05-18 12:00:00", tz="Europe/Brussels")

    In [120]: d
    Out[120]: <Timestamp: 2013-05-18 12:00:00+0200 CEST, tz=Europe/Brussels>

    In [121]: d.replace(tzinfo=None)
    Out[121]: <Timestamp: 2013-05-18 12:00:00>

So, based on this, I could do the following, but I suppose this will not be very efficient when working with a larger timeseries:

    In [124]: t
    Out[124]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
    Length: 10, Freq: S, Timezone: Europe/Brussels

    In [125]: pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
    Out[125]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
    Length: 10, Freq: None, Timezone: None

User · Answer

Building on D.A.'s suggestion that "the only way to do what you want is to modify the underlying data", and using numpy to modify the underlying data, this works for me and is pretty fast:

    import numpy as np
    import pandas as pd

    def tz_to_naive(datetime_index):
        """Converts a tz-aware DatetimeIndex into a tz-naive DatetimeIndex,
        effectively baking the timezone into the internal representation.

        Parameters
        ----------
        datetime_index : pandas.DatetimeIndex, tz-aware

        Returns
        -------
        pandas.DatetimeIndex, tz-naive
        """
        # Calculate timezone offset relative to UTC.
        # Note: only the offset of the first timestamp is used, so the
        # result is off for an index that spans a DST transition.
        timestamp = datetime_index[0]
        tz_offset = (timestamp.replace(tzinfo=None) -
                     timestamp.tz_convert('UTC').replace(tzinfo=None))
        tz_offset_td64 = np.timedelta64(tz_offset)

        # Now convert to naive DatetimeIndex
        return pd.DatetimeIndex(datetime_index.values + tz_offset_td64)
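The function in this answer applies the UTC offset of the first timestamp to the whole index. A minimal sketch of what that means across a DST change, assuming an index around the 2013-03-31 switch to summer time in Europe/Brussels; the helper name first_offset_to_naive and the sample dates are illustrative, not taken from the answer:

```python
import pandas as pd


def first_offset_to_naive(datetime_index):
    """Shift the UTC data by the offset of the FIRST timestamp only."""
    first = datetime_index[0]
    offset = first.replace(tzinfo=None) - first.tz_convert("UTC").replace(tzinfo=None)
    return pd.DatetimeIndex(datetime_index.values + offset.to_timedelta64())


# Four hourly steps that cross the CET -> CEST transition (02:00 local is skipped).
idx = pd.date_range(
    "2013-03-31 00:00:00",
    periods=4,
    freq=pd.Timedelta(hours=1),
    tz="Europe/Brussels",
)

shifted = first_offset_to_naive(idx)   # constant +1h shift
localized = idx.tz_localize(None)      # per-timestamp wall-clock time

# Element-wise agreement between the two approaches.
matches = (shifted == localized).tolist()
print(matches)
```

The two results agree before the transition and diverge after it, so the single-offset shortcut is only safe when the whole index shares one UTC offset.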

User · Answer

Because I always struggle to remember, a quick summary of what each of these do:

    >>> pd.Timestamp.now()  # naive local time
    Timestamp('2019-10-07 10:30:19.428748')

    >>> pd.Timestamp.utcnow()  # tz aware UTC
    Timestamp('2019-10-07 08:30:19.428748+0000', tz='UTC')

    >>> pd.Timestamp.now(tz='Europe/Brussels')  # tz aware local time
    Timestamp('2019-10-07 10:30:19.428748+0200', tz='Europe/Brussels')

    >>> pd.Timestamp.now(tz='Europe/Brussels').tz_localize(None)  # naive local time
    Timestamp('2019-10-07 10:30:19.428748')

    >>> pd.Timestamp.now(tz='Europe/Brussels').tz_convert(None)  # naive UTC
    Timestamp('2019-10-07 08:30:19.428748')

    >>> pd.Timestamp.utcnow().tz_localize(None)  # naive UTC
    Timestamp('2019-10-07 08:30:19.428748')

    >>> pd.Timestamp.utcnow().tz_convert(None)  # naive UTC
    Timestamp('2019-10-07 08:30:19.428748')

User · Answer

I think you can't achieve what you want in a more efficient manner than you proposed.

The underlying problem is that the timestamps (as you seem aware) are made up of two parts: the data that represents the UTC time, and the timezone, tz_info. The timezone information is used only for display purposes when printing the timestamp to the screen. At display time, the data is offset appropriately and +01:00 (or similar) is added to the string. Stripping off the tz_info value (using tz_convert(tz=None)) doesn't actually change the data that represents the naive part of the timestamp.

So, the only way to do what you want is to modify the underlying data (pandas doesn't allow this; DatetimeIndex are immutable, see the help on DatetimeIndex), or to create a new set of timestamp objects and wrap them in a new DatetimeIndex. Your solution does the latter:

    pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])

For reference, here is the replace method of Timestamp (see tslib.pyx):

    def replace(self, **kwds):
        return Timestamp(datetime.replace(self, **kwds),
                         offset=self.offset)

You can refer to the docs on datetime.datetime to see that datetime.datetime.replace also creates a new object.

If you can, your best bet for efficiency is to modify the source of the data so that it (incorrectly) reports the timestamps without their timezone. You mentioned:

    I want to work with timezone naive timeseries (to avoid the extra hassle with timezones, and I do not need them for the case I am working on)

I'd be curious what extra hassle you are referring to. I recommend as a general rule for all software development: keep your timestamp "naive values" in UTC. There is little worse than looking at two different int64 values wondering which timezone they belong to. If you always, always, always use UTC for the internal storage, then you will avoid countless headaches. My mantra is: timezones are for human I/O only.
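The two-part picture described in this answer (UTC data plus a timezone used for display) can be checked directly. A rough sketch, assuming a small hourly index in Europe/Brussels; the variable names are illustrative only:

```python
import pandas as pd

t = pd.date_range(
    "2013-05-18 12:00:00",
    periods=2,
    freq=pd.Timedelta(hours=1),
    tz="Europe/Brussels",
)

utc_naive = t.tz_convert(None)     # drops the tz, keeps the stored UTC data
local_naive = t.tz_localize(None)  # drops the tz, keeps the wall-clock time

# The stored integers are identical before and after tz_convert(None).
same_data = bool((t.asi8 == utc_naive.asi8).all())

# The wall-clock version is a genuinely different set of values,
# shifted by the UTC offset (+02:00 in May).
shift = local_naive - utc_naive

print(same_data)
print(shift)
```

Here asi8 is the integer view of the index data; comparing it before and after each operation shows which call rewrites the data and which only drops the label.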

User · Answer

Setting the tz attribute of the index explicitly seems to work:

    ts_utc = ts.tz_convert("UTC")
    ts_utc.index.tz = None
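As far as I am aware, recent pandas versions no longer accept direct assignment to the tz attribute of an index, so the snippet in this answer may raise there. A hedged sketch of getting the same naive-UTC result with methods only, assuming ts is a Series with a tz-aware DatetimeIndex (the sample data is made up for illustration):

```python
import pandas as pd

idx = pd.date_range(
    "2013-05-18 12:00:00",
    periods=3,
    freq=pd.Timedelta(hours=1),
    tz="Europe/Brussels",
)
ts = pd.Series(range(3), index=idx)  # hypothetical sample series

# Naive UTC index: the equivalent of converting to UTC and dropping the tz.
ts_utc_naive = ts.tz_convert(None)

# Naive local index: what the question actually asks for.
ts_local_naive = ts.tz_localize(None)

print(ts_utc_naive.index[0])
print(ts_local_naive.index[0])
```

Series.tz_convert and Series.tz_localize act on the index and return a new Series, so nothing is mutated in place.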

User · Answer

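The most important thing is to add tzinfo when you define a datetime object:

    from datetime import datetime, timezone
    # tzinfo_examples is the example module from the Python datetime
    # documentation; it has to be saved locally before it can be imported.
    from tzinfo_examples import HOUR, Eastern

    u0 = datetime(2016, 3, 13, 5, tzinfo=timezone.utc)
    for i in range(4):
        u = u0 + i*HOUR
        t = u.astimezone(Eastern)
        print(u.time(), 'UTC =', t.time(), t.tzname())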
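The same loop can be sketched with only the standard library, assuming Python 3.9+ where zoneinfo is available and a system (or tzdata) timezone database is installed. Using ZoneInfo("America/New_York") in place of the Eastern class is my substitution, not part of the answer:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")  # stands in for tzinfo_examples.Eastern
hour = timedelta(hours=1)

u0 = datetime(2016, 3, 13, 5, tzinfo=timezone.utc)
rows = []
for i in range(4):
    u = u0 + i * hour               # aware UTC datetime
    t = u.astimezone(eastern)       # same instant, shown in US Eastern time
    rows.append((u.time().isoformat(), t.time().isoformat(), t.tzname()))
    print(u.time(), "UTC =", t.time(), t.tzname())

# Dropping tzinfo keeps the wall-clock time, as in the question.
naive_local = u0.astimezone(eastern).replace(tzinfo=None)
```

The loop crosses the 2016-03-13 switch to daylight saving time, which is why the local time jumps from 01:00 to 03:00 while UTC advances by one hour each step.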

User · Answer

Late contribution, but I just came across something similar in "Python datetime and pandas give different timestamps for the same date".

If you have timezone-aware datetime in pandas, technically, tz_localize(None) changes the POSIX timestamp (that is used internally) as if the local time from the timestamp was UTC. Local in this context means local in the specified timezone. Ex:

    import pandas as pd

    t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H', tz="US/Central")
    # DatetimeIndex(['2013-05-18 12:00:00-05:00', '2013-05-18 13:00:00-05:00'], dtype='datetime64[ns, US/Central]', freq='H')

    t_loc = t.tz_localize(None)
    # DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'], dtype='datetime64[ns]', freq='H')

    # offset in seconds according to timezone:
    (t_loc.values-t.values)//1e9
    # array([-18000, -18000], dtype='timedelta64[ns]')

Note that this will leave you with strange things during DST transitions, e.g.

    t = pd.date_range(start="2020-03-08 01:00:00", periods=2, freq='H', tz="US/Central")
    (t.values[1]-t.values[0])//1e9
    # numpy.timedelta64(3600,'ns')

    t_loc = t.tz_localize(None)
    (t_loc.values[1]-t_loc.values[0])//1e9
    # numpy.timedelta64(7200,'ns')

In contrast, tz_convert(None) does not modify the internal timestamp, it just removes the tzinfo.

    t_utc = t.tz_convert(None)
    (t_utc.values-t.values)//1e9
    # array([0, 0], dtype='timedelta64[ns]')

My bottom line would be: stick with timezone-aware datetime if you can, or only use t.tz_convert(None), which doesn't modify the underlying POSIX timestamp. Just keep in mind that you're practically working with UTC then.

(Python 3.8.2 x64 on Windows 10, pandas v1.0.5.)

User · Answer

To answer my own question, this functionality has been added to pandas in the meantime. Starting from pandas 0.15.0, you can use tz_localize(None) to remove the timezone, resulting in local time. See the whatsnew entry: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#timezone-handling-improvements

So with my example from above:

    In [4]: t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H',
                              tz="Europe/Brussels")

    In [5]: t
    Out[5]: DatetimeIndex(['2013-05-18 12:00:00+02:00', '2013-05-18 13:00:00+02:00'],
                          dtype='datetime64[ns, Europe/Brussels]', freq='H')

Using tz_localize(None) removes the timezone information, resulting in naive local time:

    In [6]: t.tz_localize(None)
    Out[6]: DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'],
                          dtype='datetime64[ns]', freq='H')

Further, you can also use tz_convert(None) to remove the timezone information but converting to UTC, so yielding naive UTC time:

    In [7]: t.tz_convert(None)
    Out[7]: DatetimeIndex(['2013-05-18 10:00:00', '2013-05-18 11:00:00'],
                          dtype='datetime64[ns]', freq='H')

This is much more performant than the datetime.replace solution:

    In [31]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10000, freq='H',
                               tz="Europe/Brussels")

    In [32]: %timeit t.tz_localize(None)
    1000 loops, best of 3: 233 µs per loop

    In [33]: %timeit pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
    10 loops, best of 3: 99.7 ms per loop
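The examples in this answer operate on a DatetimeIndex. When the timestamps live in a column rather than in the index, the same two methods are reachable through the .dt accessor. A minimal sketch, assuming a DataFrame with a hypothetical column named "when":

```python
import pandas as pd

df = pd.DataFrame(
    {
        "when": pd.date_range(
            "2013-05-18 12:00:00",
            periods=2,
            freq=pd.Timedelta(hours=1),
            tz="Europe/Brussels",
        )
    }
)

# Naive local time: the wall-clock values are kept, the tz is dropped.
df["when_local_naive"] = df["when"].dt.tz_localize(None)

# Naive UTC: the values are converted to UTC first, then the tz is dropped.
df["when_utc_naive"] = df["when"].dt.tz_convert(None)

print(df)
```

Both calls return new Series, so the original tz-aware column stays available for anything that still needs the timezone.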

User · Answer

The accepted solution does not work when there are multiple different timezones in a Series. It throws ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

The solution is to use the apply method.

Please see the examples below:

    # Let's have a series `a` with different multiple timezones.
    > a
    0    2019-10-04 16:30:00+02:00
    1    2019-10-07 16:00:00-04:00
    2    2019-09-24 08:30:00-07:00
    Name: localized, dtype: object

    > a.iloc[0]
    Timestamp('2019-10-04 16:30:00+0200', tz='Europe/Amsterdam')

    # trying the accepted solution
    > a.dt.tz_localize(None)
    ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

    # Make it tz-naive. This is the solution:
    > a.apply(lambda x: x.tz_localize(None))
    0   2019-10-04 16:30:00
    1   2019-10-07 16:00:00
    2   2019-09-24 08:30:00
    Name: localized, dtype: datetime64[ns]

    # a.tz_convert() also does not work with multiple timezones, but this works:
    > a.apply(lambda x: x.tz_convert('America/Los_Angeles'))
    0   2019-10-04 07:30:00-07:00
    1   2019-10-07 13:00:00-07:00
    2   2019-09-24 08:30:00-07:00
    Name: localized, dtype: datetime64[ns, America/Los_Angeles]
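A self-contained sketch of the mixed-timezone case, for anyone who wants to reproduce it. The answer only shows the first timezone (Europe/Amsterdam); the zones used here for the -04:00 and -07:00 rows are my assumptions, chosen to match those offsets. The pd.to_datetime(..., utc=True) line is an extra alternative, not part of the answer, for when naive UTC is acceptable:

```python
import pandas as pd

a = pd.Series(
    [
        pd.Timestamp("2019-10-04 16:30:00", tz="Europe/Amsterdam"),
        pd.Timestamp("2019-10-07 16:00:00", tz="America/New_York"),     # assumed zone
        pd.Timestamp("2019-09-24 08:30:00", tz="America/Los_Angeles"),  # assumed zone
    ],
    name="localized",
)

# Per-element approach from the answer: each value keeps its own wall-clock time.
naive_local = a.apply(lambda x: x.tz_localize(None))

# Alternative: normalize everything to UTC first, then drop the timezone.
naive_utc = pd.to_datetime(a, utc=True).dt.tz_convert(None)

print(naive_local)
print(naive_utc)
```

Note that the per-element result mixes wall-clock times from different zones in one column, so the values are no longer comparable as instants; the UTC variant keeps them comparable at the cost of losing the local reading.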
