[python] Converting between datetime, Timestamp and datetime64

How do I convert a numpy.datetime64 object to a datetime.datetime (or Timestamp)?

In the following code, I create a datetime, timestamp and datetime64 objects.

import datetime
import numpy as np
import pandas as pd
dt = datetime.datetime(2012, 5, 1)
# A strange way to extract a Timestamp object, there's surely a better way?
ts = pd.DatetimeIndex([dt])[0]
dt64 = np.datetime64(dt)

In [7]: dt
Out[7]: datetime.datetime(2012, 5, 1, 0, 0)

In [8]: ts
Out[8]: <Timestamp: 2012-05-01 00:00:00>

In [9]: dt64
Out[9]: numpy.datetime64('2012-05-01T01:00:00.000000+0100')

Note: it's easy to get the datetime from the Timestamp:

In [10]: ts.to_datetime()
Out[10]: datetime.datetime(2012, 5, 1, 0, 0)

But how do we extract the datetime or Timestamp from a numpy.datetime64 (dt64)?

.

Update: a somewhat nasty example in my dataset (perhaps the motivating example) seems to be:

dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100')

which should be datetime.datetime(2002, 6, 28, 1, 0), and not a long (!) (1025222400000000000L)...

This question is related to python datetime numpy pandas

The answer is


indeed, all of these datetime types can be difficult, and potentially problematic (must keep careful track of timezone information). here's what i have done, though i admit that i am concerned that at least part of it is "not by design". also, this can be made a bit more compact as needed. starting with a numpy.datetime64 dt_a:

dt_a

numpy.datetime64('2015-04-24T23:11:26.270000-0700')

dt_a1 = dt_a.tolist() # yields a datetime object in UTC, but without tzinfo

dt_a1

datetime.datetime(2015, 4, 25, 6, 11, 26, 270000)

# now, make your "aware" datetime:

dt_a2=datetime.datetime(*list(dt_a1.timetuple()[:6]) + [dt_a1.microsecond], tzinfo=pytz.timezone('UTC'))

... and of course, that can be compressed into one line as needed.


I think there could be a more consolidated effort in an answer to better explain the relationship between Python's datetime module, numpy's datetime64/timedelta64 and pandas' Timestamp/Timedelta objects.

The datetime standard library of Python

The datetime standard library has four main objects

  • time - only time, measured in hours, minutes, seconds and microseconds
  • date - only year, month and day
  • datetime - All components of time and date
  • timedelta - An amount of time with maximum unit of days

Create these four objects

>>> import datetime
>>> datetime.time(hour=4, minute=3, second=10, microsecond=7199)
datetime.time(4, 3, 10, 7199)

>>> datetime.date(year=2017, month=10, day=24)
datetime.date(2017, 10, 24)

>>> datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199)
datetime.datetime(2017, 10, 24, 4, 3, 10, 7199)

>>> datetime.timedelta(days=3, minutes = 55)
datetime.timedelta(3, 3300)

>>> # add timedelta to datetime
>>> datetime.timedelta(days=3, minutes = 55) + \
    datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199)
datetime.datetime(2017, 10, 27, 4, 58, 10, 7199)

NumPy's datetime64 and timedelta64 objects

NumPy has no separate date and time objects, just a single datetime64 object to represent a single moment in time. The datetime module's datetime object has microsecond precision (one-millionth of a second). NumPy's datetime64 object allows you to set its precision from hours all the way to attoseconds (10 ^ -18). It's constructor is more flexible and can take a variety of inputs.

Construct NumPy's datetime64 and timedelta64 objects

Pass an integer with a string for the units. See all units here. It gets converted to that many units after the UNIX epoch: Jan 1, 1970

>>> np.datetime64(5, 'ns') 
numpy.datetime64('1970-01-01T00:00:00.000000005')

>>> np.datetime64(1508887504, 's')
numpy.datetime64('2017-10-24T23:25:04')

You can also use strings as long as they are in ISO 8601 format.

>>> np.datetime64('2017-10-24')
numpy.datetime64('2017-10-24')

Timedeltas have a single unit

>>> np.timedelta64(5, 'D') # 5 days
>>> np.timedelta64(10, 'h') 10 hours

Can also create them by subtracting two datetime64 objects

>>> np.datetime64('2017-10-24T05:30:45.67') - np.datetime64('2017-10-22T12:35:40.123')
numpy.timedelta64(147305547,'ms')

Pandas Timestamp and Timedelta build much more functionality on top of NumPy

A pandas Timestamp is a moment in time very similar to a datetime but with much more functionality. You can construct them with either pd.Timestamp or pd.to_datetime.

>>> pd.Timestamp(1239.1238934) #defautls to nanoseconds
Timestamp('1970-01-01 00:00:00.000001239')

>>> pd.Timestamp(1239.1238934, unit='D') # change units
Timestamp('1973-05-24 02:58:24.355200')

>>> pd.Timestamp('2017-10-24 05') # partial strings work
Timestamp('2017-10-24 05:00:00')

pd.to_datetime works very similarly (with a few more options) and can convert a list of strings into Timestamps.

>>> pd.to_datetime('2017-10-24 05')
Timestamp('2017-10-24 05:00:00')

>>> pd.to_datetime(['2017-1-1', '2017-1-2'])
DatetimeIndex(['2017-01-01', '2017-01-02'], dtype='datetime64[ns]', freq=None)

Converting Python datetime to datetime64 and Timestamp

>>> dt = datetime.datetime(year=2017, month=10, day=24, hour=4, 
                   minute=3, second=10, microsecond=7199)
>>> np.datetime64(dt)
numpy.datetime64('2017-10-24T04:03:10.007199')

>>> pd.Timestamp(dt) # or pd.to_datetime(dt)
Timestamp('2017-10-24 04:03:10.007199')

Converting numpy datetime64 to datetime and Timestamp

>>> dt64 = np.datetime64('2017-10-24 05:34:20.123456')
>>> unix_epoch = np.datetime64(0, 's')
>>> one_second = np.timedelta64(1, 's')
>>> seconds_since_epoch = (dt64 - unix_epoch) / one_second
>>> seconds_since_epoch
1508823260.123456

>>> datetime.datetime.utcfromtimestamp(seconds_since_epoch)
>>> datetime.datetime(2017, 10, 24, 5, 34, 20, 123456)

Convert to Timestamp

>>> pd.Timestamp(dt64)
Timestamp('2017-10-24 05:34:20.123456')

Convert from Timestamp to datetime and datetime64

This is quite easy as pandas timestamps are very powerful

>>> ts = pd.Timestamp('2017-10-24 04:24:33.654321')

>>> ts.to_pydatetime()   # Python's datetime
datetime.datetime(2017, 10, 24, 4, 24, 33, 654321)

>>> ts.to_datetime64()
numpy.datetime64('2017-10-24T04:24:33.654321000')

Welcome to hell.

You can just pass a datetime64 object to pandas.Timestamp:

In [16]: Timestamp(numpy.datetime64('2012-05-01T01:00:00.000000'))
Out[16]: <Timestamp: 2012-05-01 01:00:00>

I noticed that this doesn't work right though in NumPy 1.6.1:

numpy.datetime64('2012-05-01T01:00:00.000000+0100')

Also, pandas.to_datetime can be used (this is off of the dev version, haven't checked v0.9.1):

In [24]: pandas.to_datetime('2012-05-01T01:00:00.000000+0100')
Out[24]: datetime.datetime(2012, 5, 1, 1, 0, tzinfo=tzoffset(None, 3600))

You can just use the pd.Timestamp constructor. The following diagram may be useful for this and related questions.

Conversions between time representations


Some solutions work well for me but numpy will deprecate some parameters. The solution that work better for me is to read the date as a pandas datetime and excract explicitly the year, month and day of a pandas object. The following code works for the most common situation.

def format_dates(dates):
    dt = pd.to_datetime(dates)
    try: return [datetime.date(x.year, x.month, x.day) for x in dt]    
    except TypeError: return datetime.date(dt.year, dt.month, dt.day)

One option is to use str, and then to_datetime (or similar):

In [11]: str(dt64)
Out[11]: '2012-05-01T01:00:00.000000+0100'

In [12]: pd.to_datetime(str(dt64))
Out[12]: datetime.datetime(2012, 5, 1, 1, 0, tzinfo=tzoffset(None, 3600))

Note: it is not equal to dt because it's become "offset-aware":

In [13]: pd.to_datetime(str(dt64)).replace(tzinfo=None)
Out[13]: datetime.datetime(2012, 5, 1, 1, 0)

This seems inelegant.

.

Update: this can deal with the "nasty example":

In [21]: dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100')

In [22]: pd.to_datetime(str(dt64)).replace(tzinfo=None)
Out[22]: datetime.datetime(2002, 6, 28, 1, 0)

This post has been up for 4 years and I still struggled with this conversion problem - so the issue is still active in 2017 in some sense. I was somewhat shocked that the numpy documentation does not readily offer a simple conversion algorithm but that's another story.

I have come across another way to do the conversion that only involves modules numpy and datetime, it does not require pandas to be imported which seems to me to be a lot of code to import for such a simple conversion. I noticed that datetime64.astype(datetime.datetime) will return a datetime.datetime object if the original datetime64 is in micro-second units while other units return an integer timestamp. I use module xarray for data I/O from Netcdf files which uses the datetime64 in nanosecond units making the conversion fail unless you first convert to micro-second units. Here is the example conversion code,

import numpy as np
import datetime

def convert_datetime64_to_datetime( usert: np.datetime64 )->datetime.datetime:
    t = np.datetime64( usert, 'us').astype(datetime.datetime)
return t

Its only tested on my machine, which is Python 3.6 with a recent 2017 Anaconda distribution. I have only looked at scalar conversion and have not checked array based conversions although I'm guessing it will be good. Nor have I looked at the numpy datetime64 source code to see if the operation makes sense or not.


>>> dt64.tolist()
datetime.datetime(2012, 5, 1, 0, 0)

For DatetimeIndex, the tolist returns a list of datetime objects. For a single datetime64 object it returns a single datetime object.


import numpy as np
import pandas as pd 

def np64toDate(np64):
    return pd.to_datetime(str(np64)).replace(tzinfo=None).to_datetime()

use this function to get pythons native datetime object


If you want to convert an entire pandas series of datetimes to regular python datetimes, you can also use .to_pydatetime().

pd.date_range('20110101','20110102',freq='H').to_pydatetime()

> [datetime.datetime(2011, 1, 1, 0, 0) datetime.datetime(2011, 1, 1, 1, 0)
   datetime.datetime(2011, 1, 1, 2, 0) datetime.datetime(2011, 1, 1, 3, 0)
   ....

It also supports timezones:

pd.date_range('20110101','20110102',freq='H').tz_localize('UTC').tz_convert('Australia/Sydney').to_pydatetime()

[ datetime.datetime(2011, 1, 1, 11, 0, tzinfo=<DstTzInfo 'Australia/Sydney' EST+11:00:00 DST>)
 datetime.datetime(2011, 1, 1, 12, 0, tzinfo=<DstTzInfo 'Australia/Sydney' EST+11:00:00 DST>)
....

NOTE: If you are operating on a Pandas Series you cannot call to_pydatetime() on the entire series. You will need to call .to_pydatetime() on each individual datetime64 using a list comprehension or something similar:

datetimes = [val.to_pydatetime() for val in df.problem_datetime_column]

I've come back to this answer more times than I can count, so I decided to throw together a quick little class, which converts a Numpy datetime64 value to Python datetime value. I hope it helps others out there.

from datetime import datetime
import pandas as pd

class NumpyConverter(object):
    @classmethod
    def to_datetime(cls, dt64, tzinfo=None):
        """
        Converts a Numpy datetime64 to a Python datetime.
        :param dt64: A Numpy datetime64 variable
        :type dt64: numpy.datetime64
        :param tzinfo: The timezone the date / time value is in
        :type tzinfo: pytz.timezone
        :return: A Python datetime variable
        :rtype: datetime
        """
        ts = pd.to_datetime(dt64)
        if tzinfo is not None:
            return datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second, tzinfo=tzinfo)
        return datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)

I'm gonna keep this in my tool bag, something tells me I'll need it again.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to datetime

Comparing two joda DateTime instances How to format DateTime in Flutter , How to get current time in flutter? How do I convert 2018-04-10T04:00:00.000Z string to DateTime? How to get current local date and time in Kotlin Converting unix time into date-time via excel Convert python datetime to timestamp in milliseconds SQL Server date format yyyymmdd Laravel Carbon subtract days from current date Check if date is a valid one Why is ZoneOffset.UTC != ZoneId.of("UTC")?

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float