Sqlparse
Release 0.4.4.dev0
Andi Albrecht
1 Quick Start
2 Contents
2.1 Introduction
2.2 sqlparse – Parse SQL statements
2.3 Analyzing the Parsed Statement
2.4 User Interfaces
2.5 Changes in python-sqlparse
2.6 License
2.7 Indices and tables
3 Resources
Index
python-sqlparse Documentation, Release 0.4.4.dev0
sqlparse is a non-validating SQL parser for Python. It provides support for parsing, splitting and formatting SQL
statements.
The module is compatible with Python 3.5+ and released under the terms of the New BSD license.
Visit the project page at https://github.com/andialbrecht/sqlparse for further information about this project.
CHAPTER 1
Quick Start
CHAPTER 2
Contents
2.1 Introduction
The latest released version can be obtained from the Python Package Index (PyPI). To extract and install the module
system-wide run
The sqlparse module provides three simple functions at module level to achieve some common tasks when working
with SQL statements. This section shows some simple usage examples of these functions.
Let’s get started with splitting a string containing one or more SQL statements into a list of single statements using
split():
The end of a statement is identified by the occurrence of a semicolon. Semicolons within certain SQL constructs like
BEGIN ... END blocks are handled correctly by the splitting mechanism.
SQL statements can be beautified by using the format() function.
In this case all keywords in the given SQL are uppercased and the indentation is changed to make it more readable.
Read Formatting of SQL Statements for a full reference of supported options given as keyword arguments to that
function.
Before proceeding with a closer look at the internal representation of SQL statements, you should be aware that this
SQL parser is intentionally non-validating. It assumes that the given input is at least some kind of SQL and then it
tries to analyze as much as possible without making too many assumptions about the concrete dialect or the actual
statement. Ultimately, it’s up to the user of this API to interpret the results correctly.
When using the parse() function a tuple of Statement instances is returned:
Each item of the tuple is a single statement as identified by the above-mentioned split() function. So let’s grab the
only element from that tuple and have a look at its tokens attribute. Sub-tokens are stored in this attribute.
Details of the returned objects are described in Analyzing the Parsed Statement.
sqlparse is currently tested under Python 3.5+ and PyPy. Tests are automatically run on each commit and for each
pull request on Travis: https://travis-ci.org/andialbrecht/sqlparse
Make sure to run the test suite before sending a pull request by running
$ tox
It’s OK if tox doesn’t find all interpreters listed above. Ideally, at least one supported Python 3 version should be tested
locally.
Please file bug reports and feature requests on the project site at https://github.com/andialbrecht/sqlparse/issues/new.
truncate_strings If truncate_strings is a positive integer, string literals longer than the given value will
be truncated.
truncate_char (default: “[...]”) If long string literals are truncated (see above) this value will be appended to the
truncated string.
reindent If True the indentations of the statements are changed.
reindent_aligned If True the indentations of the statements are changed, and statements are aligned by
keywords.
use_space_around_operators If True spaces are used around all operators.
indent_tabs If True tabs instead of spaces are used for indentation.
indent_width The width of the indentation, defaults to 2.
wrap_after The column limit (in characters) for wrapping comma-separated lists. If unspecified, it puts every
item in the list on its own line.
output_format If given the output is additionally formatted to be used as a variable in a programming language.
Allowed values are “python” and “php”.
comma_first If True comma-first notation for column names is used.
When the parse() function is called the returned value is a tree-ish representation of the analyzed statements. The
returned objects can be used by applications to retrieve further information about the parsed SQL.
All returned objects inherit from these base classes. The Token class represents a single token and the TokenList class
represents a group of tokens. The latter provides methods for inspecting its child tokens.
class sqlparse.sql.Token(ttype, value)
Base class for all other classes in this module.
It represents a single token and has two instance attributes: value is the unchanged value of the token and
ttype is the type of the token.
flatten()
Resolve subgroups.
has_ancestor(other)
Returns True if other is in this token’s ancestry.
is_child_of(other)
Returns True if this token is a direct child of other.
match(ttype, values, regex=False)
Checks whether the token matches the given arguments.
ttype is a token type. If this token doesn’t match the given token type, False is returned. values is a list of possible values for
this token. The values are OR’ed together, so True is returned if at least one of the values matches. Except
for keyword tokens, the comparison is case-sensitive. For convenience it’s OK to pass in a single string. If
regex is True (default is False), the given values are treated as regular expressions.
within(group_cls)
Returns True if this token is within group_cls.
Use this method for example to check if an identifier is within a function: t.within(sql.Function).
class sqlparse.sql.TokenList(tokens=None)
A group of tokens.
It has an additional instance attribute tokens which holds a list of child-tokens.
flatten()
Generator yielding ungrouped tokens.
This method is recursively called for all child tokens.
get_alias()
Returns the alias for this identifier or None.
get_name()
Returns the name of this identifier.
This is either its alias or its real name. The returned value can be considered as the name under which
the object corresponding to this identifier is known within the current statement.
get_parent_name()
Return name of the parent object if any.
A parent object is identified by the first occurring dot.
get_real_name()
Returns the real name (object name) of this identifier.
get_token_at_offset(offset)
Returns the token that is on position offset.
group_tokens(grp_cls, start, end, include_end=True, extend=False)
Replace tokens by an instance of grp_cls.
has_alias()
Returns True if an alias is present.
insert_after(where, token, skip_ws=True)
Inserts token after where.
insert_before(where, token)
Inserts token before where.
token_first(skip_ws=True, skip_cm=False)
Returns the first child token.
If skip_ws is True (the default), whitespace tokens are ignored.
If skip_cm is True (default: False), comments are ignored too.
token_index(token, start=0)
Return list index of token.
token_next(idx, skip_ws=True, skip_cm=False, _reverse=False)
Returns the next token relative to idx.
If skip_ws is True (the default) whitespace tokens are ignored. If skip_cm is True comments are ignored.
None is returned if there’s no next token.
class sqlparse.sql.If(tokens=None)
An ‘if’ clause with possible ‘else if’ or ‘else’ parts.
class sqlparse.sql.For(tokens=None)
A ‘FOR’ loop.
class sqlparse.sql.Assignment(tokens=None)
An assignment like ‘var := val;’
class sqlparse.sql.Comparison(tokens=None)
A comparison used for example in WHERE clauses.
sqlformat The sqlformat command line script is distributed with the module. Run sqlformat --help to
list available options and for usage hints.
sqlformat.appspot.com An example Google App Engine application that exposes the formatting features
using a web front-end. See https://sqlformat.org/ for details. The source for this application is available from a
source code check out of the sqlparse module (see extras/appengine).
2.5.2 Changelog
Development Version
Nothing yet.
Enhancements
• Add support for DIV operator (pr664, by chezou).
• Add support for additional SPARK keywords (pr643, by mrmasterplan).
• Avoid tokens copy (pr622, by living180).
• Add REGEXP as a comparison (pr647, by PeterSandwich).
• Add DISTINCTROW keyword for MS Access (issue677).
• Improve parsing of CREATE TABLE AS SELECT (pr662, by chezou).
Bug Fixes
• Fix spelling of INDICATOR keyword (pr653, by ptld).
• Fix formatting error in EXTRACT function (issue562, issue670, pr676, by ecederstrand).
• Fix bad parsing of create table statements that use lower case (issue217, pr642, by mrmasterplan).
• Handle backtick as valid quote char (issue628, pr629, by codenamelxl).
• Allow any unicode character as valid identifier name (issue641).
Other
• Update github actions to test on Python 3.10 as well (pr661, by cclaus).
Notable Changes
• IMPORTANT: This release fixes a security vulnerability in the strip comments filter. In this filter a regular
expression that was vulnerable to ReDoS (Regular Expression Denial of Service) was used. See the security
advisory for details: https://github.com/andialbrecht/sqlparse/security/advisories/GHSA-p5w8-wqhj-9hhf The
vulnerability was discovered by @erik-krogh and @yoff from GitHub Security Lab (GHSL). Thanks for reporting!
Enhancements
• Add ELSIF as keyword (issue584).
• Add CONFLICT and ON_ERROR_STOP keywords (pr595, by j-martin).
Bug Fixes
• Fix parsing of backticks (issue588).
• Fix parsing of scientific number (issue399).
Bug Fixes
• Just removed a debug print statement, sorry...
Notable Changes
• Remove support for end-of-life Python 2.7 and 3.4. Python 3.5+ is now required.
• Remaining strings that only consist of whitespace are not treated as statements anymore. Code that ignored
the last element from sqlparse.split() should be updated accordingly since that function now doesn’t return an
empty string as the last element in some cases (issue496).
Enhancements
• Add WINDOW keyword (pr579 by ali-tny).
• Add RLIKE keyword (pr582 by wjones1).
Bug Fixes
• Improved parsing of IN(...) statements (issue566, pr567 by hurcy).
• Preserve line breaks when removing comments (issue484).
• Fix parsing error when using square bracket notation (issue583).
• Fix splitting when using DECLARE ... HANDLER (issue581).
Enhancements
• Add HQL keywords (pr475, by matwalk).
• Add support for time zone casts (issue489).
• Enhance formatting of AS keyword (issue507, by john-bodley).
• Stabilize grouping engine when parsing invalid SQL statements.
Bug Fixes
• Fix splitting of SQL with multiple statements inside parentheses (issue485, pr486 by win39).
• Correctly identify NULLS FIRST / NULLS LAST as keywords (issue487).
• Fix splitting of SQL statements that contain dollar signs in identifiers (issue491).
• Remove support for parsing double slash comments introduced in 0.3.0 (issue456) as it had some side-effects
with other dialects and doesn’t seem to be widely used (issue476).
• Restrict detection of alias names to objects that actually could have an alias (issue455, adopted some parts of
pr509 by john-bodley).
• Fix parsing of date/time literals (issue438, by vashek).
• Fix initialization of TokenList (issue499, pr505 by john-bodley).
• Fix parsing of LIKE (issue493, pr525 by dbczumar).
• Improve parsing of identifiers (pr527 by liulk).
Notable Changes
• Remove support for Python 3.3.
Enhancements
• New formatting option “--indent_after_first” (pr345, by johshoff).
• New formatting option “--indent_columns” (pr393, by digitalarbeiter).
• Add UPSERT keyword (issue408).
• Strip multiple whitespace within parentheses (issue473, by john-bodley).
• Support double slash (//) comments (issue456, by theianrobertson).
• Support for Calcite temporal keywords (pr468, by john-bodley).
Bug Fixes
• Fix occasional IndexError (pr390, by circld, issue313).
• Fix incorrect splitting of strings containing new lines (pr396, by fredyw).
Enhancements
• Add more keywords for MySQL table options (pr328, pr333, by phdru).
• Add more PL/pgSQL keywords (pr357, by Demetrio92).
• Improve parsing of floats (pr330, by atronah).
Bug Fixes
• Fix parsing of MySQL table names starting with digits (issue337).
• Fix detection of identifiers using comparisons (issue327).
• Fix parsing of UNION ALL after WHERE (issue349).
• Fix handling of semicolon in assignments (issue359, issue358).
Enhancements
• New command line option “--encoding” (by twang2218, pr317).
• Support CONCURRENTLY keyword (issue322, by rowanseymour).
Bug Fixes
• Fix some edge-cases when parsing invalid SQL statements.
• Fix indentation of LIMIT (by romainr, issue320).
• Fix parsing of INTO keyword (issue324).
Internal Changes
• Several improvements regarding encodings.
Enhancements
• Add comma_first option: When splitting list “comma first” notation is used (issue141).
Bug Fixes
• Fix parsing of incomplete AS (issue284, by vmuriart).
• Fix parsing of Oracle names containing dollars (issue291).
• Fix parsing of UNION ALL (issue294).
• Fix grouping of identifiers containing typecasts (issue297).
• Add Changelog to sdist again (issue302).
Internal Changes
• is_whitespace and is_group changed into properties
Notable Changes
• PostgreSQL: Function bodies are parsed as literal strings. Previously sqlparse assumed that all function bodies are
parsable psql strings (see issue277).
Bug Fixes
• Fix a regression to parse streams again (issue273, reported and test case by gmccreight).
• Improve Python 2/3 compatibility when using parsestream (issue190, by phdru).
• Improve splitting of PostgreSQL functions (issue277).
IMPORTANT: The supported Python versions have changed with this release. sqlparse 0.2.x supports Python 2.7 and
Python >= 3.3.
Thanks to the many contributors for writing bug reports and working on pull requests who made this version possible!
Internal Changes
• sqlparse.SQLParseError was removed from top-level module and moved to sqlparse.exceptions.
• sqlparse.sql.Token.to_unicode was removed.
• The signature of a filter’s process method has changed from process(stack, stream) to process(stream). Stack
was never used at all.
• Lots of code cleanups and modernization (thanks esp. to vmuriart!).
• Improved grouping performance. (sjoerdjob)
Enhancements
• Support WHILE loops (issue215, by shenlongxing).
• Better support for CTEs (issue217, by Andrew Tipton).
• Recognize USING as a keyword more consistently (issue236, by koljonen).
Bug Fixes
• Fix IndexError when statement contains WITH clauses (issue205).
Bug Fixes
• Remove universal wheel support, added in 0.1.17 by mistake.
Enhancements
• Speed up parsing of large SQL statements (pull request: issue201, fixes the following issues: issue199, issue135,
issue62, issue41, by Ryan Wooden).
Bug Fixes
• Fix another splitter bug regarding DECLARE (issue194).
Misc
• Packages on PyPI are signed from now on.
Bug Fixes
• Fix a regression in get_alias() introduced in 0.1.15 (issue185).
• Fix a bug in the splitter regarding DECLARE (issue193).
• sqlformat command line tool doesn’t duplicate newlines anymore (issue191).
• Don’t mix up MySQL comments starting with hash and MSSQL temp tables (issue192).
• Statement.get_type() now ignores comments at the beginning of a statement (issue186).
Bug Fixes
• Fix a regression for identifiers with square bracket notation (issue153, by darikg).
• Add missing SQL types (issue154, issue155, issue156, by jukebox).
• Fix parsing of multi-line comments (issue172, by JacekPliszka).
• Fix parsing of escaped backslashes (issue174, by caseyching).
• Fix parsing of identifiers starting with underscore (issue175).
• Fix misinterpretation of IN keyword (issue183).
Enhancements
• Improve formatting of HAVING statements.
• Improve parsing of inline comments (issue163).
• Group comments to parent object (issue128, issue160).
• Add double precision builtin (issue169, by darikg).
• Add support for square bracket array indexing (issue170, issue176, issue177 by darikg).
• Improve grouping of aliased elements (issue167, by darikg).
• Support comments starting with ‘#’ character (issue178).
Bug Fixes
• Floats in UPDATE statements are now handled correctly (issue145).
• Properly handle string literals in comparisons (issue148, change proposed by aadis).
• Fix indentation when using tabs (issue146).
Enhancements
• Improved formatting in list when newlines precede commas (issue140).
Bug Fixes
• Fix a regression in handling of NULL keywords introduced in 0.1.12.
Bug Fixes
• Fix handling of NULL keywords in aliased identifiers.
• Fix SerializerUnicode to split unquoted newlines (issue131, by Michael Schuller).
• Fix handling of modulo operators without spaces (by gavinwahl).
Enhancements
• Improve parsing of identifier lists containing placeholders.
• Speed up query parsing of unquoted lines (by Michael Schuller).
Bug Fixes
• Fix incorrect parsing of string literals containing line breaks (issue118).
• Fix typo in keywords, add MERGE, COLLECT keywords (issue122/124, by Cristian Orellana).
• Improve parsing of string literals in columns.
• Fix parsing and formatting of statements containing EXCEPT keyword.
• Fix Function.get_parameters() (issue126/127, by spigwitmer).
Enhancements
• Classify DML keywords (issue116, by Victor Hahn).
• Add missing FOREACH keyword.
• Grouping of BEGIN/END blocks.
Other
• Python 2.5 isn’t automatically tested anymore; neither Travis nor Tox supports it out of the box.
Bug Fixes
• Removed buffered reading again, it obviously causes wrong parsing in some rare cases (issue114).
• Fix regression in setup.py introduced 10 months ago (issue115).
Enhancements
• Improved support for JOINs, by Alexander Beedie.
Bug Fixes
• Fix a regression introduced in 0.1.5 where sqlparse didn’t properly distinguish between single and double
quoted strings when tagging identifiers (issue111).
Enhancements
• New option to truncate long string literals when formatting.
Bug Fixes
• Whitespaces within certain keywords are now allowed (issue97, patch proposed by xcombelle).
Enhancements
• Improve parsing of assignments in UPDATE statements (issue90).
• Add STRAIGHT_JOIN statement (by Yago Riveiro).
• Function.get_parameters() now returns the parameter if only one parameter is given (issue94, by wayne.wuw).
• sqlparse.split() now removes leading and trailing whitespaces from split statements.
• Add USE as keyword token (by mulos).
• Improve parsing of PEP249-style placeholders (issue103).
Bug Fixes
• Fix Python 3 compatibility of sqlformat script (by Pi Delport).
• Fix parsing of SQL statements that contain binary data (by Alexey Malyshev).
• Fix a bug where keywords were identified as aliased identifiers in invalid SQL statements.
• Fix parsing of identifier lists where identifiers are keywords too (issue10).
Enhancements
• Top-level API functions now accept an encoding keyword to parse statements in certain encodings more reliably
(issue20).
• Improve parsing speed when SQL contains CLOBs or BLOBs (issue86).
• Improve formatting of ORDER BY clauses (issue89).
• Formatter now tries to detect runaway indentations caused by parsing errors or invalid SQL statements. When
re-indenting such statements the formatter flips back to column 0 before going crazy.
Other
• Documentation updates.
sqlparse is now compatible with Python 3 without any patches. The Python 3 version is generated during install by
2to3. You’ll need distribute to install sqlparse for Python 3.
Bug Fixes
• Fix parsing error with dollar-quoted procedure bodies (issue83).
Other
• Documentation updates.
Bug Fixes
• Improve handling of quoted identifiers (issue78).
• Improve grouping and formatting of identifiers with operators (issue53).
• Improve grouping and formatting of concatenated strings (issue53).
• Improve handling of varchar() (by Mike Amy).
• Clean up handling of various SQL elements.
• Switch to pytest and clean up tests.
• Several minor fixes.
Other
• Deprecate sqlparse.SQLParseError. Please use sqlparse.exceptions.SQLParseError instead.
• Add caching to speed up processing.
• Add experimental filters for token processing.
• Add sqlformat.parsestream (by quest).
Bug Fixes
• Avoid “stair case” effects when identifiers, functions, placeholders or keywords are mixed in identifier lists
(issue45, issue49, issue52) and when asterisks are used as operators (issue58).
• Make keyword detection stricter (issue47).
• Improve handling of CASE statements (issue46).
• Fix statement splitting when parsing recursive statements (issue57, thanks to piranna).
• Fix for negative numbers (issue56, thanks to kevinjqiu).
• Pretty format comments in identifier lists (issue59).
• Several minor bug fixes and improvements.
Bug Fixes
• Improve parsing of floats (thanks to Kris).
• When formatting a statement a space before LIMIT was removed (issue35).
• Fix strip_comments flag (issue38, reported by ooberm...@gmail.com).
Bug Fixes
• Fixed incorrect detection of keyword fragments embedded in names (issue7, reported and initial patch by
andyboyko).
• Stricter detection of identifier aliases (issue8, reported by estama).
• WHERE grouping consumed closing parenthesis (issue9, reported by estama).
• Fixed an issue with trailing whitespaces (reported by Kris).
• Better detection of escaped single quotes (issue13, reported by Martin Brochhaus, patch by bluemaro with test
case by Dan Carley).
• Ignore identifier in double-quotes when changing cases (issue 21).
• Lots of minor fixes targeting encoding, indentation, statement parsing and more (issues 12, 14, 15, 16, 18, 19).
• Code cleanup with a pinch of refactoring.
Bug Fixes
• Lexer preserves original line breaks (issue1).
• Improved identifier parsing: backtick quotes, wildcards, T-SQL variables prefixed with @.
• Improved parsing of identifier lists (issue2).
• Recursive recognition of AS (issue4) and CASE.
• Improved support for UPDATE statements.
Other
• Code cleanup and better test coverage.
Initial release.
2.6 License
2.7 Indices and tables
• genindex
• modindex
• search
CHAPTER 3
Resources
Python Module Index
s
sqlparse
Index
A
Assignment (class in sqlparse.sql)
C
Case (class in sqlparse.sql)
Comment (class in sqlparse.sql)
Comparison (class in sqlparse.sql)
F
flatten() (sqlparse.sql.Token method)
flatten() (sqlparse.sql.TokenList method)
For (class in sqlparse.sql)
format() (in module sqlparse)
G
get_alias() (sqlparse.sql.TokenList method)
get_array_indices() (sqlparse.sql.Identifier method)
get_cases() (sqlparse.sql.Case method)
get_identifiers() (sqlparse.sql.IdentifierList method)
get_name() (sqlparse.sql.TokenList method)
get_ordering() (sqlparse.sql.Identifier method)
get_parent_name() (sqlparse.sql.TokenList method)
get_real_name() (sqlparse.sql.TokenList method)
get_token_at_offset() (sqlparse.sql.TokenList method)
get_type() (sqlparse.sql.Statement method)
get_typecast() (sqlparse.sql.Identifier method)
group_tokens() (sqlparse.sql.TokenList method)
H
has_alias() (sqlparse.sql.TokenList method)
has_ancestor() (sqlparse.sql.Token method)
I
Identifier (class in sqlparse.sql)
IdentifierList (class in sqlparse.sql)
If (class in sqlparse.sql)
insert_after() (sqlparse.sql.TokenList method)
insert_before() (sqlparse.sql.TokenList method)
is_child_of() (sqlparse.sql.Token method)
is_wildcard() (sqlparse.sql.Identifier method)
M
match() (sqlparse.sql.Token method)
P
Parenthesis (class in sqlparse.sql)
parse() (in module sqlparse)
S
split() (in module sqlparse)
sqlparse (module)
Statement (class in sqlparse.sql)
T
Token (class in sqlparse.sql)
token_first() (sqlparse.sql.TokenList method)
token_index() (sqlparse.sql.TokenList method)
token_next() (sqlparse.sql.TokenList method)
token_prev() (sqlparse.sql.TokenList method)
TokenList (class in sqlparse.sql)
W
Where (class in sqlparse.sql)
within() (sqlparse.sql.Token method)