While working on any large-ish Django project you are bound to come across a slow query that’s
perhaps missing an index or doing something else expensive. My workflow for diagnosing this was to
get the query that is being executed (
str(queryset.query)) and paste it into a database shell,
prefixing it with
EXPLAIN. The database would then give you the query plan which reveals the most
expensive parts and can be a great help in optimizing it. This can be quite cumbersome though as
you often have to fix up parts of the query to make it successfully execute, and it involves copy-pasting
into another tool.
The latest release of Django (2.1) includes a feature that I implemented which should automate all of this for you in a way that works with whatever database you are using: Queryset.explain().
Quite simply, given a
Queryset you can run
print(queryset.explain()) and it will execute the
query with the appropriate prefix for your database and return the query plan.
For example, from the Django documentation:
>>> print(Blog.objects.filter(title='My Blog').explain()) Seq Scan on blog (cost=0.00..35.50 rows=10 width=12) Filter: (title = 'My Blog'::bpchar)
This is obviously a pretty simple query, but if you’re using
select_related() or doing other
complex joins it will show you the complete plan.
If you’re using Postgres or MySQL you can add keywords to the call to get more detailed information:
>>> print(Blog.objects.filter(title='My Blog').explain(verbose=True)) Seq Scan on public.blog (cost=0.00..35.50 rows=10 width=12) (actual time=0.004..0.004 rows=10 loops=1) Output: id, title Filter: (blog.title = 'My Blog'::bpchar) Planning time: 0.064 ms Execution time: 0.058 ms
The code changes needed to implement this in Django where not too bad, compared to other database related work that I have done in the past.
All backends define a
explain_prefix string, and while executing the
QuerySet we check if the query is an
’explained’ query (
.explain() has been called on it). If so we prefix the query with this string and execute the query.
The results of this is different for each backend. With Postgres you get a nice human-readable string back, but with MySQL you get a table result that we coerce into a string before returning.
There is some extra logic to handle special database-specific parameters (
verbose in Postgres for example), but that
is pretty much it.
I’m glad you asked. The documentation says that:
explain() is supported by all built-in database backends except Oracle because an implementation there isn’t straightforward.
This is a huge understatement. During my work on this feature I was amazed at how complex this was to implement on Oracle, versus the simplicity of other databases. Here is how you get the query execution plan:
EXPLAIN PLAN FOR
EXPLAIN PLAN FORa specific UID per plan.
So this is a pretty complex workflow to just get a query plan, but it’s doable. I had this feature working in my branch and it seemed OK. Until I hit a blocker:
Yeah. So… that’s a blocker. All queries in Django are parameterized to prevent SQL injections, so rather than executing
SELECT * FROM table WHERE id = 1;
Django would execute:
SELECT * FROM table WHERE id = ?;
1 as the parameter (denoted as
? in the query above). With oracle you cannot do this, all parameters must
be filled in.
So that’s why we cannot support Oracle with this, which kind of sucks.