Evaluating a mathematical expression in a string

Question

    stringExp = "2^4"
    intVal = int(stringExp)      # Expected value: 16

This returns the following error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: invalid literal for int() with base 10: '2^4'

I know that eval can work around this, but isn't there a better and - more importantly - safer method to evaluate a mathematical expression that is being stored in a string?

User · Answer

This is a massively late reply, but I think useful for future reference. Rather than write your own math parser (although the pyparsing example in this thread is great) you could use SymPy. I don't have a lot of experience with it, but it contains a much more powerful math engine than anyone is likely to write for a specific application, and the basic expression evaluation is very easy:

    >>> import sympy
    >>> x, y, z = sympy.symbols('x y z')
    >>> sympy.sympify("x**3 + sin(y)").evalf(subs={x:1, y:-3})
    0.858879991940133

Very cool indeed! A "from sympy import *" brings in a lot more function support, such as trig functions, special functions, etc., but I've avoided that here to show what's coming from where.

User · Answer

Here s my solution to the problem without using eval  Works with Python2 and Python3  It doesn t work with negative numbers     python -m pytest test py   test py  from solution import Solutions  class SolutionsTestCase unittest TestCase       def setUp self           self solutions   Solutions        def test evaluate self           expressions                  2 3 5                6 4 2 2 10                3 2 45 8 3 30625                3  3 3 3 3 30                2 4 6                    results    x split      1  for x in expressions          for e in range len expressions                if     in results e                   results e    float results e               else                  results e    int results e               self assertEqual                  results e                   self solutions evaluate expressions e                   solution py  class Solutions object       def evaluate self  exp           def format res               if     in res                  try                      res   float res                  except ValueError                      pass             else                  try                      res   int res                  except ValueError                      pass             return res         def splitter item  op               mul   item split op              if len mul     2                  for x in                       -                        if x in mul 0                           mul    mul 0  split x  1   mul 1                       if x in mul 1                           mul    mul 0   mul 1  split x  0               elif len mul   gt  2                  pass             else                  pass             for x in range len mul                    mul x    format mul x               return mul         exp   exp replace                  if     in exp              res   exp split      1              res   format res              exp   exp replace    s    res              while     in exp              
if     in exp                  itm   splitter exp                       res   itm 0    itm 1                  exp   exp replace   s  s     str itm 0    str itm 1     str res           while      in exp              if      in exp                  itm   splitter exp                        res   itm 0     itm 1                  exp   exp replace   s   s     str itm 0    str itm 1     str res           while     in exp              if     in exp                  itm   splitter exp                       res   itm 0    itm 1                  exp   exp replace   s  s     str itm 0    str itm 1     str res           while     in exp              if     in exp                  itm   splitter exp                       res   itm 0    itm 1                  exp   exp replace   s  s     str itm 0    str itm 1     str res           while     in exp              if     in exp                  itm   splitter exp                       res   itm 0    itm 1                  exp   exp replace   s  s     str itm 0    str itm 1     str res           while  -  in exp              if  -  in exp                  itm   splitter exp   -                   res   itm 0  - itm 1                  exp   exp replace   s- s     str itm 0    str itm 1     str res            return format exp

User · Answer

Use eval in a clean namespace:

    >>> ns = {'__builtins__': None}
    >>> eval('2 ** 4', ns)
    16

The clean namespace should prevent injection. For instance:

    >>> eval('__builtins__.__import__("os").system("echo got through")', ns)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute '__import__'

Otherwise you would get:

    >>> eval('__builtins__.__import__("os").system("echo got through")')
    got through
    0

You might want to give access to the math module:

    >>> import math
    >>> ns = vars(math).copy()
    >>> ns['__builtins__'] = None
    >>> eval('cos(pi/3)', ns)
    0.50000000000000011

User · Answer

You can use the ast module and write a NodeVisitor that verifies that the type of each node is part of a whitelist.

    import ast, math

    locals = {key: value for (key, value) in vars(math).items() if key[0] != '_'}
    locals.update({"abs": abs, "complex": complex, "min": min, "max": max, "pow": pow, "round": round})

    class Visitor(ast.NodeVisitor):
        def visit(self, node):
            if not isinstance(node, self.whitelist):
                raise ValueError(node)
            return super().visit(node)

        # ast.Constant replaces ast.Num from the original answer (ast.Num is
        # deprecated since Python 3.8); note that it also admits string literals.
        whitelist = (ast.Module, ast.Expr, ast.Load, ast.Expression, ast.Add, ast.Sub, ast.UnaryOp, ast.Constant, ast.BinOp,
                ast.Mult, ast.Div, ast.Pow, ast.BitOr, ast.BitAnd, ast.BitXor, ast.USub, ast.UAdd, ast.FloorDiv, ast.Mod,
                ast.LShift, ast.RShift, ast.Invert, ast.Call, ast.Name)

    def evaluate(expr, locals={}):
        # Reject newlines and comments before parsing.
        if any(elem in expr for elem in '\n#'):
            raise ValueError(expr)
        try:
            node = ast.parse(expr.strip(), mode='eval')
            Visitor().visit(node)
            return eval(compile(node, "<string>", "eval"), {'__builtins__': None}, locals)
        except Exception:
            raise ValueError(expr)

Because it works via a whitelist rather than a blacklist, it is safe. The only functions and variables it can access are those you explicitly give it access to. I populated a dict with math-related functions so you can easily provide access to those if you want, but you have to explicitly use it.

If the string attempts to call functions that haven't been provided, or invoke any methods, an exception will be raised, and it will not be executed.

Because this uses Python's built-in parser and evaluator, it also inherits Python's precedence and promotion rules.

    >>> evaluate("7 + 9 * (2 << 2)")
    79
    >>> evaluate("6 // 2 + 0.0")
    3.0

The above code has only been tested on Python 3.

If desired, you can add a timeout decorator on this function.
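To see why method calls never get past a whitelist like the one above, it helps to look at the node types the parser produces: attribute access becomes an `ast.Attribute` node, which is simply not in the allowed set. The following is a minimal sketch of that check using `ast.walk`; the helper name `uses_only` and the reduced node set are assumptions made for illustration, not part of the answer's code.

```python
import ast

# Illustrative, reduced whitelist (an assumption, not the full set from the answer).
ALLOWED = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
           ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub,
           ast.Call, ast.Name, ast.Load)

def uses_only(expr, allowed=ALLOWED):
    """Return True if every node of the parsed expression has an allowed type."""
    tree = ast.parse(expr.strip(), mode="eval")
    return all(isinstance(node, allowed) for node in ast.walk(tree))

print(uses_only("2 * (3 + 4)"))     # plain arithmetic
print(uses_only("cos(pi / 3)"))     # names and calls are whitelisted here
print(uses_only("(1).__class__"))   # attribute access parses to ast.Attribute
```

A check like this only decides whether an expression is acceptable; which names and functions are actually reachable is still controlled by the namespace passed to eval.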

User · Answer

eval is evil

    eval("__import__('os').remove('important file')")  # arbitrary commands
    eval("9**9**9**9**9**9**9**9", {'__builtins__': None})  # CPU, memory

Note: even if you set __builtins__ to None, it still might be possible to break out using introspection:

    eval('(1).__class__.__bases__[0].__subclasses__()', {'__builtins__': None})

Evaluate arithmetic expression using ast

    import ast
    import operator as op

    # supported operators
    operators = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
                 ast.Div: op.truediv, ast.Pow: op.pow, ast.BitXor: op.xor,
                 ast.USub: op.neg}

    def eval_expr(expr):
        """
        >>> eval_expr('2^6')
        4
        >>> eval_expr('2**6')
        64
        >>> eval_expr('1 + 2*3**(4^5) / (6 + -7)')
        -5.0
        """
        return eval_(ast.parse(expr, mode='eval').body)

    def eval_(node):
        # ast.Constant / node.value replace ast.Num / node.n from the original
        # answer (deprecated since Python 3.8); only numbers are accepted.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, complex)):  # <number>
            return node.value
        elif isinstance(node, ast.BinOp):  # <left> <operator> <right>
            return operators[type(node.op)](eval_(node.left), eval_(node.right))
        elif isinstance(node, ast.UnaryOp):  # <operator> <operand> e.g., -1
            return operators[type(node.op)](eval_(node.operand))
        else:
            raise TypeError(node)

You can easily limit the allowed range for each operation or any intermediate result, e.g., to limit input arguments for a**b:

    def power(a, b):
        if any(abs(n) > 100 for n in [a, b]):
            raise ValueError((a, b))
        return op.pow(a, b)
    operators[ast.Pow] = power

Or to limit the magnitude of intermediate results:

    import functools

    def limit(max_=None):
        """Return decorator that limits allowed returned values."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                ret = func(*args, **kwargs)
                try:
                    mag = abs(ret)
                except TypeError:
                    pass  # not applicable
                else:
                    if mag > max_:
                        raise ValueError(ret)
                return ret
            return wrapper
        return decorator

    eval_ = limit(max_=10**100)(eval_)

Example:

    >>> evil = "__import__('os').remove('important file')"
    >>> eval_expr(evil) #doctest:+IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    TypeError:
    >>> eval_expr("9**9")
    387420489
    >>> eval_expr("9**9**9**9**9**9**9**9") #doctest:+IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ...
    ValueError:
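The same AST-walking idea can be stretched to cover named constants and a few functions, which the question's follow-ups often ask for. The sketch below is one possible way to do that, not the answer author's code: the tables `NAMES` and `FUNCS` and the function `eval_math` are hypothetical names chosen for illustration, and the set of supported operators is deliberately small.

```python
import ast
import math
import operator as op

# Hypothetical lookup tables for illustration; extend or trim as needed.
BIN_OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
           ast.Div: op.truediv, ast.Pow: op.pow}
UNARY_OPS = {ast.USub: op.neg, ast.UAdd: op.pos}
NAMES = {"pi": math.pi, "e": math.e}
FUNCS = {"sin": math.sin, "cos": math.cos, "sqrt": math.sqrt}

def eval_math(expr):
    """Evaluate an arithmetic expression without calling eval()."""
    return _eval(ast.parse(expr, mode="eval").body)

def _eval(node):
    # Numeric literal
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    # <left> <operator> <right>
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        return BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    # <operator> <operand>, e.g. -1
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](_eval(node.operand))
    # Known constant such as pi
    if isinstance(node, ast.Name) and node.id in NAMES:
        return NAMES[node.id]
    # Call of a known function by bare name, positional arguments only
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in FUNCS and not node.keywords):
        return FUNCS[node.func.id](*[_eval(arg) for arg in node.args])
    raise TypeError(ast.dump(node))

print(eval_math("sqrt(16) + 2"))
print(eval_math("cos(pi)"))
```

Anything not matched by one of the branches, including attribute access and calls to unknown names, ends in the final `TypeError`. The range limits on `**` discussed above would still need to be added separately.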

User · Answer

Pyparsing can be used to parse mathematical expressions. In particular, fourFn.py shows how to parse basic arithmetic expressions. Below, I've rewrapped fourFn into a numeric parser class for easier reuse.

    from __future__ import division
    from pyparsing import (Literal, CaselessLiteral, Word, Combine, Group, Optional,
                           ZeroOrMore, Forward, nums, alphas, oneOf)
    import math
    import operator

    __author__ = 'Paul McGuire'
    __version__ = '$Revision: 0.0 $'
    __date__ = '$Date: 2009-03-20 $'
    __source__ = '''http://pyparsing.wikispaces.com/file/view/fourFn.py
    http://pyparsing.wikispaces.com/message/view/home/15549426
    '''
    __note__ = '''
    All I've done is rewrap Paul McGuire's fourFn.py as a class, so I can use it
    more easily in other places.
    '''


    class NumericStringParser(object):
        '''
        Most of this code comes from the fourFn.py pyparsing example
        '''

        def pushFirst(self, strg, loc, toks):
            self.exprStack.append(toks[0])

        def pushUMinus(self, strg, loc, toks):
            if toks and toks[0] == '-':
                self.exprStack.append('unary -')

        def __init__(self):
            """
            expop   :: '^'
            multop  :: '*' | '/'
            addop   :: '+' | '-'
            integer :: ['+' | '-'] '0'..'9'+
            atom    :: PI | E | real | fn '(' expr ')' | '(' expr ')'
            factor  :: atom [ expop factor ]*
            term    :: factor [ multop factor ]*
            expr    :: term [ addop term ]*
            """
            point = Literal(".")
            e = CaselessLiteral("E")
            fnumber = Combine(Word("+-" + nums, nums) +
                              Optional(point + Optional(Word(nums))) +
                              Optional(e + Word("+-" + nums, nums)))
            ident = Word(alphas, alphas + nums + "_$")
            plus = Literal("+")
            minus = Literal("-")
            mult = Literal("*")
            div = Literal("/")
            lpar = Literal("(").suppress()
            rpar = Literal(")").suppress()
            addop = plus | minus
            multop = mult | div
            expop = Literal("^")
            pi = CaselessLiteral("PI")
            expr = Forward()
            atom = ((Optional(oneOf("- +")) +
                     (ident + lpar + expr + rpar | pi | e | fnumber).setParseAction(self.pushFirst))
                    | Optional(oneOf("- +")) + Group(lpar + expr + rpar)
                    ).setParseAction(self.pushUMinus)
            # by defining exponentiation as "atom [ ^ factor ]..." instead of
            # "atom [ ^ atom ]...", we get right-to-left exponents, instead of left-to-right
            # that is, 2^3^2 = 2^(3^2), not (2^3)^2.
            factor = Forward()
            factor << atom + \
                ZeroOrMore((expop + factor).setParseAction(self.pushFirst))
            term = factor + \
                ZeroOrMore((multop + factor).setParseAction(self.pushFirst))
            expr << term + \
                ZeroOrMore((addop + term).setParseAction(self.pushFirst))
            # addop_term = ( addop + term ).setParseAction( self.pushFirst )
            # general_term = term + ZeroOrMore( addop_term ) | OneOrMore( addop_term)
            # expr <<  general_term
            self.bnf = expr
            # map operator symbols to corresponding arithmetic operations
            epsilon = 1e-12
            self.opn = {"+": operator.add,
                        "-": operator.sub,
                        "*": operator.mul,
                        "/": operator.truediv,
                        "^": operator.pow}
            self.fn = {"sin": math.sin,
                       "cos": math.cos,
                       "tan": math.tan,
                       "exp": math.exp,
                       "abs": abs,
                       "trunc": lambda a: int(a),
                       "round": round,
                       # The original used cmp(a, 0), which does not exist in Python 3.
                       "sgn": lambda a: abs(a) > epsilon and ((a > 0) - (a < 0)) or 0}

        def evaluateStack(self, s):
            op = s.pop()
            if op == 'unary -':
                return -self.evaluateStack(s)
            if op in "+-*/^":
                op2 = self.evaluateStack(s)
                op1 = self.evaluateStack(s)
                return self.opn[op](op1, op2)
            elif op == "PI":
                return math.pi  # 3.1415926535
            elif op == "E":
                return math.e  # 2.718281828
            elif op in self.fn:
                return self.fn[op](self.evaluateStack(s))
            elif op[0].isalpha():
                return 0
            else:
                return float(op)

        def eval(self, num_string, parseAll=True):
            self.exprStack = []
            results = self.bnf.parseString(num_string, parseAll)
            val = self.evaluateStack(self.exprStack[:])
            return val

You can use it like this:

    nsp = NumericStringParser()
    result = nsp.eval('2^4')
    print(result)
    # 16.0

    result = nsp.eval('exp(2^4)')
    print(result)
    # 8886110.520507872

User · Answer

Okay, so the problem with eval is that it can escape its sandbox too easily, even if you get rid of __builtins__. All the methods for escaping the sandbox come down to using getattr or object.__getattribute__ (via the . operator) to obtain a reference to some dangerous object via some allowed object (''.__class__.__bases__[0].__subclasses__ or similar). getattr is eliminated by setting __builtins__ to None. object.__getattribute__ is the difficult one, since it cannot simply be removed, both because object is immutable and because removing it would break everything. However, __getattribute__ is only accessible via the . operator, so purging that from your input is sufficient to ensure eval cannot escape its sandbox.

In processing formulas, the only valid use of a decimal point is when it is preceded or followed by [0-9], so we just remove all other instances of '.':

    import re
    inp = re.sub(r"\.(?![0-9])", "", inp)
    val = eval(inp, {'__builtins__': None})

Note that while Python normally treats 1 + 1. as 1 + 1.0, this will remove the trailing . and leave you with 1 + 1. You could add ), space, and EOF to the list of things allowed to follow ., but why bother?
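To make the effect of that substitution concrete, here is a small sketch that applies the same pattern to a few inputs. The wrapper name `strip_stray_dots` is an assumption for illustration; the pattern is a dot followed by a negative lookahead for a digit.

```python
import re

def strip_stray_dots(expr):
    # Remove every "." that is not immediately followed by a digit.
    return re.sub(r"\.(?![0-9])", "", expr)

print(strip_stray_dots("2.5 * .5"))        # digits follow both dots, so nothing changes
print(strip_stray_dots("1 + 1."))          # the trailing dot is dropped
print(strip_stray_dots("(1).__class__"))   # the attribute access no longer parses as one
```

The third input comes back as `(1)__class__`, which is a syntax error when passed to eval, so the attribute lookup never happens.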

User · Answer

I think I would use eval(), but would first check to make sure the string is a valid mathematical expression (as opposed to something malicious). You could use a regex for the validation. eval() also takes additional arguments which you can use to restrict the namespace it operates in, for greater security.
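A rough sketch of that validate-then-eval idea follows. The character set in `_ALLOWED` and the name `checked_eval` are assumptions for illustration: only digits, whitespace, parentheses, a decimal point and the four basic operators are accepted, so no identifiers can reach eval at all.

```python
import re

# Assumption: digits, whitespace, parentheses, "." and + - * / are the only
# characters a plain arithmetic expression needs.
_ALLOWED = re.compile(r"[0-9\s()+\-*/.]+")

def checked_eval(expr):
    if not _ALLOWED.fullmatch(expr):
        raise ValueError("not a plain arithmetic expression: %r" % expr)
    # Empty builtins and locals: the expression has no names to look up anyway.
    return eval(expr, {"__builtins__": {}}, {})

print(checked_eval("2 * (3 + 4)"))
print(checked_eval("2**4"))
```

Because `*` is allowed, `**` is too, so an input such as a long chain of powers can still consume a great deal of CPU and memory; a character whitelist blocks code injection but not resource exhaustion. Malformed input such as `2 +` passes the regex and then raises SyntaxError from eval.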

User · Answer

The reason eval and exec are so dangerous is that the default compile function will generate bytecode for any valid Python expression, and the default eval or exec will execute any valid Python bytecode. All the answers to date have focused on restricting the bytecode that can be generated (by sanitizing input) or building your own domain-specific language using the AST.

Instead, you can easily create a simple eval function that is incapable of doing anything nefarious and can easily have runtime checks on memory or time used. Of course, if it is simple math, then there is a shortcut:

    # Python 2: co_code is a str, so indexing yields one-character strings.
    c = compile(stringExp, 'userinput', 'eval')
    if c.co_code[0] == b'd' and c.co_code[3] == b'S':
        return c.co_consts[ord(c.co_code[1]) + ord(c.co_code[2]) * 256]

The way this works is simple: any constant mathematical expression is safely evaluated during compilation and stored as a constant. The code object returned by compile consists of d, which is the bytecode for LOAD_CONST, followed by the number of the constant to load (usually the last one in the list), followed by S, which is the bytecode for RETURN_VALUE. If this shortcut doesn't work, it means that the user input isn't a constant expression (it contains a variable or function call or similar).

This also opens the door to some more sophisticated input formats. For example:

    stringExp = "1 + cos(2)"

This requires actually evaluating the bytecode, which is still quite simple. Python bytecode is a stack-oriented language, so everything is a simple matter of TOS=stack.pop(); op(TOS); stack.put(TOS) or similar. The key is to only implement the opcodes that are safe (loading/storing values, math operations, returning values) and not unsafe ones (attribute lookup). If you want the user to be able to call functions (the whole reason not to use the shortcut above), simply make your implementation of CALL_FUNCTION only allow functions in a 'safe' list.

    # Python 2 (Queue module, func_code, str bytecode)
    from dis import opmap
    from Queue import LifoQueue
    from math import sin, cos
    import operator

    globs = {'sin': sin, 'cos': cos}
    safe = globs.values()

    stack = LifoQueue()

    class BINARY(object):
        def __init__(self, operator):
            self.op = operator
        def __call__(self, context):
            stack.put(self.op(stack.get(), stack.get()))

    class UNARY(object):
        def __init__(self, operator):
            self.op = operator
        def __call__(self, context):
            stack.put(self.op(stack.get()))

    def CALL_FUNCTION(context, arg):
        argc = arg[0] + arg[1] * 256
        args = [stack.get() for i in range(argc)]
        func = stack.get()
        if func not in safe:
            raise TypeError("Function %r not allowed" % func)
        stack.put(func(*args))

    def LOAD_CONST(context, arg):
        cons = arg[0] + arg[1] * 256
        stack.put(context['code'].co_consts[cons])

    def LOAD_NAME(context, arg):
        name_num = arg[0] + arg[1] * 256
        name = context['code'].co_names[name_num]
        if name in context['locals']:
            stack.put(context['locals'][name])
        else:
            stack.put(context['globals'][name])

    def RETURN_VALUE(context):
        return stack.get()

    opfuncs = {
        opmap['BINARY_ADD']: BINARY(operator.add),
        opmap['UNARY_INVERT']: UNARY(operator.invert),
        opmap['CALL_FUNCTION']: CALL_FUNCTION,
        opmap['LOAD_CONST']: LOAD_CONST,
        opmap['LOAD_NAME']: LOAD_NAME,
        opmap['RETURN_VALUE']: RETURN_VALUE,
    }

    def VMeval(c):
        context = dict(locals={}, globals=globs, code=c)
        bci = iter(c.co_code)
        for bytecode in bci:
            func = opfuncs[ord(bytecode)]
            if func.func_code.co_argcount == 1:
                ret = func(context)
            else:
                args = ord(bci.next()), ord(bci.next())
                ret = func(context, args)
            if ret:
                return ret

    def evaluate(expr):
        return VMeval(compile(expr, 'userinput', 'eval'))

Obviously, the real version of this would be a bit longer (there are 119 opcodes, 24 of which are math related). Adding STORE_FAST and a couple of others would allow for input like 'x=5;return x+x' or similar, trivially easily. It can even be used to execute user-created functions, so long as the user-created functions are themselves executed via VMeval (don't make them callable, or they could get used as a callback somewhere). Handling loops requires support for the goto bytecodes, which means changing from a for iterator to while and maintaining a pointer to the current instruction, but isn't too hard. For resistance to DoS, the main loop should check how much time has passed since the start of the calculation, and certain operators should deny input over some reasonable limit (BINARY_POWER being the most obvious).

While this approach is somewhat longer than a simple grammar parser for simple expressions (see above about just grabbing the compiled constant), it extends easily to more complicated input, and doesn't require dealing with grammar (compile takes anything arbitrarily complicated and reduces it to a sequence of simple instructions).

User · Answer

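I know this is an old question, but it is worth pointing out new useful solutions as they pop up. Since Python 3.6, this capability is built into the language, coined "f-strings".

See: PEP 498 -- Literal String Interpolation

For example (note the f prefix):

    f'{2**4}'
    => '16'

Note that this only evaluates expressions written literally in the source code; it does not evaluate an expression held in a string variable.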
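The difference between an expression written inside an f-string and a string that merely contains an expression can be seen in a couple of lines. This is a small illustration only; the variable names are arbitrary.

```python
# The expression is part of the source code, so it is compiled like any other code.
literal = f"{2**4}"

# Here the braces contain a variable holding text; the text is inserted, not evaluated.
string_exp = "2**4"
interpolated = f"{string_exp}"

print(literal)       # 16
print(interpolated)  # 2**4
```

So an f-string helps when the expression is known while writing the program, but it does not by itself evaluate a string such as the question's stringExp.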

User · Answer

Based on Perkins' amazing approach, I've updated and improved his "shortcut" for simple algebraic expressions (no functions or variables). Now it works on Python 3.6+ and avoids some pitfalls. It relies on the exact byte layout that CPython produced at the time, so other interpreter versions may compile to different bytecode.

    import re, sys

    # Kept outside simple_eval() just for performance
    _re_simple_eval = re.compile(rb'd([\x00-\xFF]+)S\x00')

    def simple_eval(expr):
        c = compile(expr, 'userinput', 'eval')
        m = _re_simple_eval.fullmatch(c.co_code)
        if not m:
            raise ValueError(f"Not a simple algebraic expression: {expr}")
        return c.co_consts[int.from_bytes(m.group(1), sys.byteorder)]

Testing, using some of the examples in other answers:

    for expr, res in (
        ('2^4',                         6      ),
        ('2**4',                       16      ),
        ('1 + 2*3**(4^5) / (6 + -7)',  -5.0    ),
        ('7 + 9 * (2 << 2)',           79      ),
        ('6 // 2 + 0.0',                3.0    ),
        ('2+3',                         5      ),
        ('6+4/2*2',                    10.0    ),
        ('3+2.45/8',                    3.30625),
        ('3**3*3/3+3',                 30.0    ),
    ):
        result = simple_eval(expr)
        ok = (result == res and type(result) == type(res))
        print("{} {} = {}".format("OK!" if ok else "FAIL!", expr, result))

Output:

    OK! 2^4 = 6
    OK! 2**4 = 16
    OK! 1 + 2*3**(4^5) / (6 + -7) = -5.0
    OK! 7 + 9 * (2 << 2) = 79
    OK! 6 // 2 + 0.0 = 3.0
    OK! 2+3 = 5
    OK! 6+4/2*2 = 10.0
    OK! 3+2.45/8 = 3.30625
    OK! 3**3*3/3+3 = 30.0
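Since raw bytecode bytes differ between CPython releases, one way to express the same shortcut less brittly is to inspect opcode names with the `dis` module instead of matching bytes. The following is a sketch under assumptions, not a tested replacement for the answer's code: the set `_CONSTANT_ONLY` lists the opcode names I would expect a fully constant-folded expression to compile to on several CPython versions, and other versions may need more entries. The name `constant_eval` is hypothetical.

```python
import dis

# Assumption: opcode names produced for an expression that the compiler has
# folded into a single constant; other interpreter versions may differ.
_CONSTANT_ONLY = {"RESUME", "LOAD_CONST", "LOAD_SMALL_INT",
                  "RETURN_VALUE", "RETURN_CONST"}

def constant_eval(expr):
    code = compile(expr, "userinput", "eval")
    names = {ins.opname for ins in dis.get_instructions(code)}
    if not names <= _CONSTANT_ONLY:
        raise ValueError("Not a constant expression: %r" % expr)
    # The code only loads a constant and returns it, so running it is harmless.
    return eval(code, {"__builtins__": {}})

print(constant_eval("2**4"))
print(constant_eval("6 // 2 + 0.0"))
```

Any expression that needs a name lookup, a call or an attribute access compiles to additional opcodes and is rejected before anything runs. Expressions the compiler chooses not to fold (for example very large powers) are rejected as well, which is a limitation rather than a safety problem.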

User · Answer

Some safer alternatives to eval() and sympy.sympify().evalf()*: asteval and numexpr.

*SymPy sympify is also unsafe according to the following warning from the documentation:

    Warning: Note that this function uses eval, and thus shouldn't be used on unsanitized input.
