Removing all non-numeric characters from string in Python

Question

How do we remove all non-numeric characters from a string in Python?

User · Accepted Answer

    >>> import re
    >>> re.sub("[^0-9]", "", "sdkjh987978asd098as0980a98sd")
    '987978098098098'
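One detail worth knowing about the character class used above: [^0-9] keeps only the ASCII digits, whereas \D (the complement of \d) on a Python 3 str pattern also treats other Unicode decimal digits as digits unless re.ASCII is passed. The sample string below is made up for illustration; treat this as a small sketch of the difference, not a recommendation of one form over the other.

```python
import re

# Hypothetical sample: ARABIC-INDIC DIGIT THREE (U+0663) mixed with ASCII text.
text = "a\u0663b12"

# Explicit ASCII range: the Arabic-Indic digit is removed along with the letters.
ascii_only = re.sub(r"[^0-9]", "", text)

# \D on a str pattern is Unicode-aware, so the Arabic-Indic digit is kept.
unicode_digits = re.sub(r"\D", "", text)

# With re.ASCII, \D behaves like [^0-9] again.
ascii_flag = re.sub(r"\D", "", text, flags=re.ASCII)

print(ascii_only, unicode_digits, ascii_flag)
```

If the input can only ever contain ASCII, the two spellings behave the same.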

User · Answer

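This should work for both strings and unicode objects in Python 2, and both strings and bytes in Python 3:

    # Python < 3.0
    def only_numerics(seq):
        return filter(type(seq).isdigit, seq)

    # Python >= 3.0
    def only_numerics(seq):
        seq_type = type(seq)
        # Slice instead of iterating directly: iterating over bytes yields
        # ints in Python 3, which seq_type.isdigit cannot accept.
        items = (seq[i:i + 1] for i in range(len(seq)))
        return seq_type().join(filter(seq_type.isdigit, items))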

User · Answer

Many right answers here, but in case you want the result as a float directly, without using a regex:

    x = '$123.45M'
    float(''.join(c for c in x if c.isdigit() or c == '.'))
    # 123.45

You can change the point for a comma depending on your needs. If you know your number is an integer, use this instead:

    x = '$1123'
    int(''.join(c for c in x if c.isdigit()))
    # 1123
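A note on swapping the point for a comma: float() itself only accepts a point, so the separator you keep has to be converted back before parsing. Here is a minimal sketch of that idea; the function name extract_float and its decimal_sep parameter are made up for illustration, and it assumes the input contains at most one separator.

```python
def extract_float(text, decimal_sep="."):
    """Hypothetical helper: keep digits and one kind of decimal separator."""
    kept = "".join(c for c in text if c.isdigit() or c == decimal_sep)
    # float() expects a point, so normalise the separator before parsing.
    return float(kept.replace(decimal_sep, "."))


print(extract_float("$123.45M"))
print(extract_float("$123,45M", decimal_sep=","))
```

Input with several separators (for example both thousands and decimal marks of the same kind) would still raise ValueError, so that case needs its own handling.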

User · Answer

@Ned Batchelder and @newacct provided the right answer, but just in case you have a comma (,) or a decimal point (.) in your string:

    import re
    re.sub(r"[^\d.]", "", "$1,999,888.77")
    # '1999888.77'

User · Answer

Not sure if this is the most efficient way, but:

    >>> ''.join(c for c in "abc123def456" if c.isdigit())
    '123456'

The ''.join part combines all the resulting characters without any separator between them. The rest is a generator expression (it reads just like a list comprehension) in which, as you can probably guess, we keep only the characters for which isdigit is true.
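If the one-liner is hard to read, it may help to see the two steps separately. This is only an unrolled sketch of the same expression, using a list comprehension so the intermediate result can be inspected.

```python
text = "abc123def456"

# Step 1: keep only the characters for which str.isdigit() is true.
kept = [c for c in text if c.isdigit()]

# Step 2: glue the kept characters together with an empty separator.
result = "".join(kept)

print(kept)
print(result)
```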

User · Answer

Fastest approach, if you need to perform more than just one or two such removal operations (or even just one, but on a very long string), is to rely on the translate method of strings, even though it does need some prep (the sessions below are Python 2):

    >>> import string
    >>> allchars = ''.join(chr(i) for i in xrange(256))
    >>> identity = string.maketrans('', '')
    >>> nondigits = allchars.translate(identity, string.digits)
    >>> s = 'abc123def456'
    >>> s.translate(identity, nondigits)
    '123456'

The translate method is different, and maybe a tad simpler to use, on Unicode strings than it is on byte strings, btw:

    >>> unondig = dict.fromkeys(xrange(65536))
    >>> for x in string.digits: del unondig[ord(x)]
    ...
    >>> s = u'abc123def456'
    >>> s.translate(unondig)
    u'123456'

You might want to use a mapping class rather than an actual dict, especially if your Unicode string may potentially contain characters with very high ord values (that would make the dict excessively large). For example:

    >>> class keeponly(object):
    ...     def __init__(self, keep):
    ...         self.keep = set(ord(c) for c in keep)
    ...     def __getitem__(self, key):
    ...         if key in self.keep:
    ...             return key
    ...         return None
    ...
    >>> s.translate(keeponly(string.digits))
    u'123456'
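The sessions in this answer are Python 2 (xrange, string.maketrans, u'' literals). For anyone on Python 3, here is a rough equivalent as I understand it: str.translate accepts any object with __getitem__, and bytes.translate takes None as the table plus a second argument listing the bytes to delete. The names KeepOnly, digits_only_str and digits_only_bytes are my own, not from the answer.

```python
import string


class KeepOnly:
    """Hypothetical mapping for str.translate that keeps only `keep`."""

    def __init__(self, keep):
        self.keep = {ord(c) for c in keep}

    def __getitem__(self, key):
        # Returning None tells str.translate to delete that code point;
        # returning the key itself leaves the character unchanged.
        return key if key in self.keep else None


def digits_only_str(text):
    return text.translate(KeepOnly(string.digits))


# Every byte value except the ASCII digits, computed once up front.
NON_DIGIT_BYTES = bytes(range(256)).translate(None, string.digits.encode("ascii"))


def digits_only_bytes(data):
    # table=None means "no remapping"; the second argument lists bytes to drop.
    return data.translate(None, NON_DIGIT_BYTES)


print(digits_only_str("abc123def456"))
print(digits_only_bytes(b"abc123def456"))
```

Whether this is still the fastest option on a current interpreter is something to measure on your own data rather than assume.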

User · Answer

Just to add another option to the mix, there are several useful constants within the string module. While more useful in other cases, they can be used here:

    >>> from string import digits
    >>> ''.join(c for c in "abc123def456" if c in digits)
    '123456'

There are several constants in the module, including:

    ascii_letters (abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ)
    hexdigits (0123456789abcdefABCDEF)

If you are using these constants heavily, it can be worthwhile to convert them to a frozenset. That enables O(1) lookups, rather than O(n), where n is the length of the constant for the original strings:

    >>> digits = frozenset(digits)
    >>> ''.join(c for c in "abc123def456" if c in digits)
    '123456'
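The same pattern carries over to the other constants mentioned above. A small sketch, assuming a helper named keep_only (my own name, not part of the string module) that takes the allowed characters as a frozenset:

```python
import string

# Build the lookup sets once; membership tests on a frozenset are O(1) on average.
HEX_DIGITS = frozenset(string.hexdigits)
LETTERS = frozenset(string.ascii_letters)


def keep_only(text, allowed):
    """Hypothetical helper: drop every character not in `allowed`."""
    return "".join(c for c in text if c in allowed)


print(keep_only("0xDEAD-beef!", HEX_DIGITS))
print(keep_only("abc123def456", LETTERS))
```

For constants as short as string.digits the speed difference is likely small, so the frozenset mostly pays off when the allowed set is large or the check runs many times.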
