[python] Python SQL query string formatting

I'm trying to find the best way to format an sql query string. When I'm debugging my application I'd like to log to file all the sql query strings, and it is important that the string is properly formated.

Option 1

def myquery():
    sql = "select field1, field2, field3, field4 from table where condition1=1 and condition2=2"
    con = mymodule.get_connection()
    ...
  • This is good for printing the sql string.
  • It is not a good solution if the string is long and not fits the standard width of 80 characters.

Option 2

def query():
    sql = """
        select field1, field2, field3, field4
        from table
        where condition1=1
        and condition2=2"""
    con = mymodule.get_connection()
    ...
  • Here the code is clear but when you print the sql query string you get all these annoying white spaces.

    u'\nselect field1, field2, field3, field4\n_____from table\n____where condition1=1 \n_____and condition2=2'

Note: I have replaced white spaces with underscore _, because they are trimmed by the editor

Option 3

def query():
    sql = """select field1, field2, field3, field4
from table
where condition1=1
and condition2=2"""
    con = mymodule.get_connection()
    ...
  • I don't like this option because it breaks the clearness of the well tabulated code.

Option 4

def query():
    sql = "select field1, field2, field3, field4 " \
          "from table " \
          "where condition1=1 " \
          "and condition2=2 "
    con = mymodule.get_connection()    
    ...
  • I don't like this option because all the extra typing in each line and is difficult to edit the query also.

For me the best solution would be Option 2 but I don't like the extra whitespaces when I print the sql string.

Do you know of any other options?

This question is related to python sql string-formatting

The answer is


I would suggest sticking to option 2 (I'm always using it for queries any more complex than SELECT * FROM table) and if you want to print it in a nice way you may always use a separate module.


You can use inspect.cleandoc to nicely format your printed SQL statement.

This works very well with your option 2.

Note: the print("-"*40) is only to demonstrate the superflous blank lines if you do not use cleandoc.

from inspect import cleandoc
def query():
    sql = """
        select field1, field2, field3, field4
        from table
        where condition1=1
        and condition2=2
    """

    print("-"*40)
    print(sql)
    print("-"*40)
    print(cleandoc(sql))
    print("-"*40)

query()

Output:

----------------------------------------

        select field1, field2, field3, field4
        from table
        where condition1=1
        and condition2=2

----------------------------------------
select field1, field2, field3, field4
from table
where condition1=1
and condition2=2
----------------------------------------

From the docs:

inspect.cleandoc(doc)

Clean up indentation from docstrings that are indented to line up with blocks of code.

All leading whitespace is removed from the first line. Any leading whitespace that can be uniformly removed from the second line onwards is removed. Empty lines at the beginning and end are subsequently removed. Also, all tabs are expanded to spaces.


Using 'sqlparse' library we can format the sqls.

>>> import sqlparse
>>> raw = 'select * from foo; select * from bar;'
>>> print(sqlparse.format(raw, reindent=True, keyword_case='upper'))
SELECT *
FROM foo;

SELECT *
FROM bar;

Ref: https://pypi.org/project/sqlparse/


sql = ("select field1, field2, field3, field4 "
       "from table "
       "where condition1={} "
       "and condition2={}").format(1, 2)

Output: 'select field1, field2, field3, field4 from table 
         where condition1=1 and condition2=2'

if the value of condition should be a string, you can do like this:

sql = ("select field1, field2, field3, field4 "
       "from table "
       "where condition1='{0}' "
       "and condition2='{1}'").format('2016-10-12', '2017-10-12')

Output: "select field1, field2, field3, field4 from table where
         condition1='2016-10-12' and condition2='2017-10-12'"

sql = """\
select field1, field2, field3, field4
from table
where condition1=1
and condition2=2
"""

[edit in responese to comment]
Having an SQL string inside a method does NOT mean that you have to "tabulate" it:

>>> class Foo:
...     def fubar(self):
...         sql = """\
... select *
... from frobozz
... where zorkmids > 10
... ;"""
...         print sql
...
>>> Foo().fubar()
select *
from frobozz
where zorkmids > 10
;
>>>

You've obviously considered lots of ways to write the SQL such that it prints out okay, but how about changing the 'print' statement you use for debug logging, rather than writing your SQL in ways you don't like? Using your favourite option above, how about a logging function such as this:

def debugLogSQL(sql):
     print ' '.join([line.strip() for line in sql.splitlines()]).strip()

sql = """
    select field1, field2, field3, field4
    from table"""
if debug:
    debugLogSQL(sql)

This would also make it trivial to add additional logic to split the logged string across multiple lines if the line is longer than your desired length.


To avoid formatting entirely, I think a great solution is to use procedures.

Calling a procedure gives you the result of whatever query you want to put in this procedure. You can actually process multiple queries within a procedure. The call will just return the last query that was called.

MYSQL

DROP PROCEDURE IF EXISTS example;
 DELIMITER //
 CREATE PROCEDURE example()
   BEGIN
   SELECT 2+222+2222+222+222+2222+2222 AS this_is_a_really_long_string_test;
   END //
 DELIMITER;

#calling the procedure gives you the result of whatever query you want to put in this procedure. You can actually process multiple queries within a procedure. The call just returns the last query result
 call example;

Python

sql =('call example;')

For short queries that can fit on one or two lines, I use the string literal solution in the top-voted solution above. For longer queries, I break them out to .sql files. I then use a wrapper function to load the file and execute the script, something like:

script_cache = {}
def execute_script(cursor,script,*args,**kwargs):
    if not script in script_cache:
        with open(script,'r') as s:
            script_cache[script] = s
    return cursor.execute(script_cache[script],*args,**kwargs)

Of course this often lives inside a class so I don't usually have to pass cursor explicitly. I also generally use codecs.open(), but this gets the general idea across. Then SQL scripts are completely self-contained in their own files with their own syntax highlighting.


you could put the field names into an array "fields", and then:


sql = 'select %s from table where condition1=1 and condition2=2' % (
 ', '.join(fields))

This is slightly modified version of @aandis answer. When it comes to raw string, prefix 'r' character before the string. For example:

sql = r"""
    SELECT field1, field2, field3, field4
      FROM table
     WHERE condition1 = 1
       AND condition2 = 2;
"""

This is recommended when your query has any special character like '\' which requires escaping and lint tools like flake8 reports it as error.


Cleanest way I have come across is inspired by the sql style guide.

sql = """
    SELECT field1, field2, field3, field4
      FROM table
     WHERE condition1 = 1
       AND condition2 = 2;
"""

Essentially, the keywords that begin a clause should be right-aligned and the field names etc, should be left aligned. This looks very neat and is easier to debug as well.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to string-formatting

JavaScript Chart.js - Custom data formatting to display on tooltip Format in kotlin string templates Converting Float to Dollars and Cents Chart.js - Formatting Y axis String.Format not work in TypeScript Use StringFormat to add a string to a WPF XAML binding How to format number of decimal places in wpf using style/template? How to left align a fixed width string? Convert Java Date to UTC String Format a Go string without printing?