[python] Extracting date from a string in Python

How can I extract the date from a string like "monkey 2010-07-10 love banana"? Thanks!

This question is related to python string date

The answer is


Using python-dateutil:

In [1]: import dateutil.parser as dparser

In [18]: dparser.parse("monkey 2010-07-10 love banana",fuzzy=True)
Out[18]: datetime.datetime(2010, 7, 10, 0, 0)

Invalid dates raise a ValueError:

In [19]: dparser.parse("monkey 2010-07-32 love banana",fuzzy=True)
# ValueError: day is out of range for month

It can recognize dates in many formats:

In [20]: dparser.parse("monkey 20/01/1980 love banana",fuzzy=True)
Out[20]: datetime.datetime(1980, 1, 20, 0, 0)

Note that it makes a guess if the date is ambiguous:

In [23]: dparser.parse("monkey 10/01/1980 love banana",fuzzy=True)
Out[23]: datetime.datetime(1980, 10, 1, 0, 0)

But the way it parses ambiguous dates is customizable:

In [21]: dparser.parse("monkey 10/01/1980 love banana",fuzzy=True, dayfirst=True)
Out[21]: datetime.datetime(1980, 1, 10, 0, 0)

If you know the position of the date object in the string (for example in a log file), you can use .split()[index] to extract the date without fully knowing the format.

For example:

>>> string = 'monkey 2010-07-10 love banana'
>>> date = string.split()[1]
>>> date
'2010-07-10'

Using Pygrok, you can define abstracted extensions to the Regular Expression syntax.

The custom patterns can be included in your regex in the format %{PATTERN_NAME}.

You can also create a label for that pattern, by separating with a colon: %s{PATTERN_NAME:matched_string}. If the pattern matches, the value will be returned as part of the resulting dictionary (e.g. result.get('matched_string'))

For example:

from pygrok import Grok

input_string = 'monkey 2010-07-10 love banana'
date_pattern = '%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day}'

grok = Grok(date_pattern)
print(grok.match(input_string))

The resulting value will be a dictionary:

{'month': '07', 'day': '10', 'year': '2010'}

If the date_pattern does not exist in the input_string, the return value will be None. By contrast, if your pattern does not have any labels, it will return an empty dictionary {}

References:


For extracting the date from a string in Python; the best module available is the datefinder module.

You can use it in your Python project by following the easy steps given below.

Step 1: Install datefinder Package

pip install datefinder

Step 2: Use It In Your Project

import datefinder

input_string = "monkey 2010-07-10 love banana"
# a generator will be returned by the datefinder module. I'm typecasting it to a list. Please read the note of caution provided at the bottom.
matches = list(datefinder.find_dates(input_string))

if len(matches) > 0:
    # date returned will be a datetime.datetime object. here we are only using the first match.
    date = matches[0]
    print date
else:
    print 'No dates found'

note: if you are expecting a large number of matches; then typecasting to list won't be a recommended way as it will be having a big performance overhead.


You could also try the dateparser module, which may be slower than datefinder on free text but which should cover more potential cases and date formats, as well as a significant number of languages.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to date

How do I format {{$timestamp}} as MM/DD/YYYY in Postman? iOS Swift - Get the Current Local Time and Date Timestamp Typescript Date Type? how to convert current date to YYYY-MM-DD format with angular 2 SQL Server date format yyyymmdd Date to milliseconds and back to date in Swift Check if date is a valid one change the date format in laravel view page Moment js get first and last day of current month How can I convert a date into an integer?