I think you can't achieve what you want in a more efficient manner than you proposed.
The underlying problem is that the timestamps (as you seem aware) are made up of two parts. The data that represents the UTC time, and the timezone, tz_info. The timezone information is used only for display purposes when printing the timezone to the screen. At display time, the data is offset appropriately and +01:00 (or similar) is added to the string. Stripping off the tz_info value (using tz_convert(tz=None)) doesn't doesn't actually change the data that represents the naive part of the timestamp.
So, the only way to do what you want is to modify the underlying data (pandas doesn't allow this... DatetimeIndex are immutable -- see the help on DatetimeIndex), or to create a new set of timestamp objects and wrap them in a new DatetimeIndex. Your solution does the latter:
pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
For reference, here is the replace
method of Timestamp
(see tslib.pyx):
def replace(self, **kwds):
return Timestamp(datetime.replace(self, **kwds),
offset=self.offset)
You can refer to the docs on datetime.datetime
to see that datetime.datetime.replace
also creates a new object.
If you can, your best bet for efficiency is to modify the source of the data so that it (incorrectly) reports the timestamps without their timezone. You mentioned:
I want to work with timezone naive timeseries (to avoid the extra hassle with timezones, and I do not need them for the case I am working on)
I'd be curious what extra hassle you are referring to. I recommend as a general rule for all software development, keep your timestamp 'naive values' in UTC. There is little worse than looking at two different int64 values wondering which timezone they belong to. If you always, always, always use UTC for the internal storage, then you will avoid countless headaches. My mantra is Timezones are for human I/O only.