[python] How to extract a floating number from a string

I have a number of strings similar to Current Level: 13.4 db. and I would like to extract just the floating point number. I say floating and not decimal as it's sometimes whole. Can RegEx do this or is there a better way?

This question is related to python regex floating-point data-extraction

The answer is


Another approach that may be more readable is simple type conversion. I've added a replacement function to cover instances where people may enter European decimals:

>>> for possibility in "Current Level: -13.2 db or 14,2 or 3".split():
...     try:
...         str(float(possibility.replace(',', '.')))
...     except ValueError:
...         pass
'-13.2'
'14.2'
'3.0'

This has disadvantages too however. If someone types in "1,000", this will be converted to 1. Also, it assumes that people will be inputting with whitespace between words. This is not the case with other languages, such as Chinese.


re.findall(r"[-+]?\d*\.?\d+|\d+", "Current Level: -13.2 db or 14.2 or 3")

as described above, works really well! One suggestion though:

re.findall(r"[-+]?\d*\.?\d+|[-+]?\d+", "Current Level: -13.2 db or 14.2 or 3 or -3")

will also return negative int values (like -3 in the end of this string)


You can use the following regex to get integer and floating values from a string:

re.findall(r'[\d\.\d]+', 'hello -34 42 +34.478m 88 cricket -44.3')

['34', '42', '34.478', '88', '44.3']

Thanks Rex


I think that you'll find interesting stuff in the following answer of mine that I did for a previous similar question:

https://stackoverflow.com/q/5929469/551449

In this answer, I proposed a pattern that allows a regex to catch any kind of number and since I have nothing else to add to it, I think it is fairly complete


Python docs has an answer that covers +/-, and exponent notation

scanf() Token      Regular Expression
%e, %E, %f, %g     [-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?
%i                 [-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)

This regular expression does not support international formats where a comma is used as the separator character between the whole and fractional part (3,14159). In that case, replace all \. with [.,] in the above float regex.

                        Regular Expression
International float     [-+]?(\d+([.,]\d*)?|[.,]\d+)([eE][-+]?\d+)?

You may like to try something like this which covers all the bases, including not relying on whitespace after the number:

>>> import re
>>> numeric_const_pattern = r"""
...     [-+]? # optional sign
...     (?:
...         (?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc
...         |
...         (?: \d+ \.? ) # 1. 12. 123. etc 1 12 123 etc
...     )
...     # followed by optional exponent part if desired
...     (?: [Ee] [+-]? \d+ ) ?
...     """
>>> rx = re.compile(numeric_const_pattern, re.VERBOSE)
>>> rx.findall(".1 .12 9.1 98.1 1. 12. 1 12")
['.1', '.12', '9.1', '98.1', '1.', '12.', '1', '12']
>>> rx.findall("-1 +1 2e9 +2E+09 -2e-9")
['-1', '+1', '2e9', '+2E+09', '-2e-9']
>>> rx.findall("current level: -2.03e+99db")
['-2.03e+99']
>>>

For easy copy-pasting:

numeric_const_pattern = '[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) )(?: [Ee] [+-]? \d+ ) ?'
rx = re.compile(numeric_const_pattern, re.VERBOSE)
rx.findall("Some example: Jr. it. was .23 between 2.3 and 42.31 seconds")

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to floating-point

Convert list or numpy array of single element to float in python Convert float to string with precision & number of decimal digits specified? Float and double datatype in Java C convert floating point to int Convert String to Float in Swift How do I change data-type of pandas data frame to string with a defined format? How to check if a float value is a whole number Convert floats to ints in Pandas? Converting Float to Dollars and Cents Format / Suppress Scientific Notation from Python Pandas Aggregation Results

Examples related to data-extraction

How to extract a floating number from a string