I have a CSV dumpfile from a Blackberry IPD backup, created using IPDDump.
The date/time strings in here look something like this (where EST is an Australian time zone):
Tue Jun 22 07:46:22 EST 2010
I need to be able to parse this date in Python. At first, I tried to use the strptime() function from datetime:
>>> datetime.datetime.strptime('Tue Jun 22 12:10:20 2010 EST', '%a %b %d %H:%M:%S %Y %Z')
However, for some reason, the datetime object that comes back doesn't seem to have any tzinfo associated with it.
I did read on this page that datetime.strptime apparently silently discards tzinfo; however, I checked the documentation and can't find anything to that effect documented there.
I have been able to get the date parsed using a third-party Python library, dateutil; however, I'm still curious as to how I was using the built-in strptime() incorrectly. Is there any way to get strptime() to play nicely with timezones?
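For reference, here is a minimal sketch of how strptime() treats the two timezone directives in Python 3. It assumes Python 3.2 or later and the default English day/month names. The sample strings are made up for illustration: they use "UTC" and "+1000" rather than "EST", because %Z only accepts UTC, GMT and the machine's own zone names, so whether "EST" parses at all depends on where the code runs.

```python
from datetime import datetime

FMT_NAME = "%a %b %d %H:%M:%S %Y %Z"    # zone given as a name
FMT_OFFSET = "%a %b %d %H:%M:%S %Y %z"  # zone given as a numeric UTC offset

# %Z consumes the zone name, but the result is still a naive datetime.
by_name = datetime.strptime("Tue Jun 22 12:10:20 2010 UTC", FMT_NAME)

# %z reads the offset and produces an aware datetime.
by_offset = datetime.strptime("Tue Jun 22 12:10:20 2010 +1000", FMT_OFFSET)

print(by_name.tzinfo)         # None
print(by_offset.utcoffset())  # 10:00:00
```

So if the abbreviation can be swapped for a numeric offset before parsing, strptime() alone is enough to get an aware object.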
I recommend using python-dateutil. Its parser has been able to parse every date format I've thrown at it so far.
>>> from dateutil import parser
>>> parser.parse("Tue Jun 22 07:46:22 EST 2010")
datetime.datetime(2010, 6, 22, 7, 46, 22, tzinfo=tzlocal())
>>> parser.parse("Fri, 11 Nov 2011 03:18:09 -0400")
datetime.datetime(2011, 11, 11, 3, 18, 9, tzinfo=tzoffset(None, -14400))
>>> parser.parse("Sun")
datetime.datetime(2011, 12, 18, 0, 0)
>>> parser.parse("10-11-08")
datetime.datetime(2008, 10, 11, 0, 0)
And so on. No dealing with strptime() format nonsense... just throw a date at it and it Does The Right Thing.
Update: Oops. I missed that your original question mentions you already used dateutil, sorry about that. But I hope this answer is still useful to other people who stumble across this question when they have date parsing questions and see the utility of that module.
Since strptime returns a datetime object, which has a tzinfo attribute, we can simply replace it with the desired timezone.
>>> import datetime
>>> date_time_str = '2018-06-29 08:15:27.243860'
>>> date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S.%f').replace(tzinfo=datetime.timezone.utc)
>>> date_time_obj.tzname()
'UTC'
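One thing worth spelling out is that replace() only labels the datetime with a zone; it does not shift the clock time. A rough sketch applied to the string from the question, assuming its "EST" means a fixed UTC+10 offset (the AUS_EST name is made up for this example, and a fixed offset ignores daylight saving):

```python
from datetime import datetime, timedelta, timezone

# Assumption: the file's "EST" is Australian Eastern Standard Time, UTC+10.
AUS_EST = timezone(timedelta(hours=10), "EST")

# Parse without the zone abbreviation, which gives a naive datetime.
naive = datetime.strptime("Tue Jun 22 07:46:22 2010", "%a %b %d %H:%M:%S %Y")

# replace() attaches the zone and keeps the wall-clock fields unchanged.
aware = naive.replace(tzinfo=AUS_EST)

# astimezone() keeps the same instant and recomputes the wall-clock fields.
in_utc = aware.astimezone(timezone.utc)

print(aware.isoformat())   # 2010-06-22T07:46:22+10:00
print(in_utc.isoformat())  # 2010-06-21T21:46:22+00:00
```

Use replace() when you already know which zone the naive value was recorded in, and astimezone() when you want the same moment shown in another zone.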
Ran into this exact problem.
What I ended up doing:
# starting with date string
sdt = "20190901"
sdt_format = '%Y%m%d'
# create naive datetime object
from datetime import datetime
dt = datetime.strptime(sdt, sdt_format)
# extract the relevant date time items
dt_formatters = ['%Y','%m','%d']
dt_vals = tuple(map(lambda formatter: int(datetime.strftime(dt,formatter)), dt_formatters))
# set timezone
import pendulum
tz = pendulum.timezone('utc')
dt_tz = datetime(*dt_vals,tzinfo=tz)
Your time string is similar to the time format in RFC 2822 (the date format used in email and HTTP headers). You could parse it using only the stdlib:
>>> from email.utils import parsedate_tz
>>> parsedate_tz('Tue Jun 22 07:46:22 EST 2010')
(2010, 6, 22, 7, 46, 22, 0, 1, -1, -18000)
See solutions that yield timezone-aware datetime objects for various Python versions: parsing date with timezone from an email.
In this format, EST is semantically equivalent to -0500. In general, though, a timezone abbreviation is not enough to identify a timezone uniquely.
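As a rough sketch, the tuple from parsedate_tz can be turned into an aware datetime with the stdlib alone. The helper name to_aware_datetime is made up for this example. Note that it yields the US reading of EST (-0500), not the Australian one the question is after, which is exactly the ambiguity mentioned above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_tz


def to_aware_datetime(text):
    """Parse an RFC 2822-style date string (hypothetical helper)."""
    fields = parsedate_tz(text)
    if fields is None:
        raise ValueError(f"unparseable date: {text!r}")
    offset_seconds = fields[9]  # None when no usable zone was found
    if offset_seconds is None:
        return datetime(*fields[:6])  # fall back to a naive datetime
    tz = timezone(timedelta(seconds=offset_seconds))
    return datetime(*fields[:6], tzinfo=tz)


dt = to_aware_datetime("Tue Jun 22 07:46:22 EST 2010")
print(dt.isoformat())  # 2010-06-22T07:46:22-05:00
```

If the data is known to come from an Australian device, the offset would have to be overridden after parsing, for example with replace(tzinfo=...).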
Source: Stackoverflow.com