Drafting a POC JSONField for SQLite
There's this neat little python package called django-jsonfallback. It's made by one of my mentors. To quote the description, this package allows you to
Use PostgreSQL's and MySQL's JSONField, but fall back to a TextField implementation on other databases.
Since my project is to bring a cross-DB JSONField, this project is very useful for me as a starting point. To get a better understanding of how that package works (and how Django model fields work in general), I did a good read on these wonderful Django docs about custom model fields and custom lookups. I also found this interesting Custom Database Backends presentation from Django Under the Hood 2016 event.
As explained in the docs and presentation, you can create
as_sqlite, etc. to create different implementations of
lookups and transforms (it also applies for any class that extends
BaseExpression, if I'm not mistaken). I began looking at the code in
django-jsonfallback, and I found that it was made with some
to check for the database backend. I tried replacing the
as_sql methods with
as_mysql as you can see in this pull request, and it worked pretty well.
I also tried to replace the conditionals used in
process_rhs methods, and it worked for the
when I tried to do the same for
tests failed. I tried to print the compiled query by adding:
and it shows:
SELECT `testapp_book`.`id`, `testapp_book`.`data` FROM `testapp_book` WHERE LOWER(JSON_EXTRACT(`testapp_book`.`data`, $.title)) LIKE LOWER(harry potter)
If I print the query before I replaced the
process_rhs method, it shows:
SELECT `testapp_book`.`id`, `testapp_book`.`data` FROM `testapp_book` WHERE LOWER(JSON_EXTRACT(`testapp_book`.`data`, $.title)) LIKE LOWER("harry potter")
As you can see, the quotes around
harry potter went missing. I don't know
why. I have spent hours trying to debug this but I still haven't found a clue.
Anyway, I decided to stop refactoring and tried to do something new: implementing the lookups for SQLite.
The JSON1 extension provides some functions to handle JSON data in SQLite. These functions are quite simple and somewhat limited compared to the functions in other database backends such as PostgreSQL or MySQL. However, if you can come up with some neat tricks, you can create some useful lookups with these functions.
The existing PostgreSQL JSONField already has some lookups and transforms
implemented. I made a simple summary of them in this Google Sheets document.
Looking at the JSON1 functions, I found
be useful in implementing
How would that work?
json_extract(X,P1,P2,...) function extracts and returns one or more
values from the well-formed JSON string
X that are selected by the paths
P2, .... If the selected path doesn't exist, it will return
That means, if we would like to do a
has_key lookup such as:
We could imagine it being an SQL query like this:
SELECT * FROM myapp_book WHERE json_extract(myapp_book.data, '$.title') IS NOT NULL
That would work, right?
Not quite. The
json_extract function also returns
NULL if the path exists
and the value is actually JSON
null. So if the JSON object has the specified
key but has
null as the value, our query would not return such rows. This
isn't what we want!
Let's look for an alternative:
To quote the JSON1 documentation,
json_type(X,P)function returns the "type" of the element in
Xthat is selected by path P. The "type" returned by json_type() is one of the following an SQL text values:
As you can see, it returns SQL text values of the JSON type of the value.
json_extract(), it also returns
NULL if the path doesn't exist.
null values, it returns
'null'. See the difference?
Therefore, we can still use the same SQL formula, replacing
json_type, like so:
SELECT * FROM myapp_book WHERE json_type(myapp_book.data, '$.title') IS NOT NULL
Pretty clever, don't you think? We just used a function that checks the
datatype of a JSON value to implement
Now, you might already see it coming, but implementing
has_keys lookups are very easy now that we already know what function to use.
has_any_keys, we just need to chain multiple
json_type() IS NOT NULL
OR for each key specified in the lookup. To illustrate, the
Book.objects.filter(data__has_any_keys=['title', 'foo', 'bar'])
would translate into the following SQL query:
SELECT * FROM myapp_book WHERE ( json_type(myapp_book.data, '$.title') IS NOT NULL OR json_type(myapp_book.data, '$.foo') IS NOT NULL OR json_type(myapp_book.data, '$.bar') IS NOT NULL )
has_keys, it's pretty much the same. We just need to chain it with
AND instead of
I've made some commits implementing those lookups on my sqlite-lookups
branch. I also deleted
@xfail decorators for those lookups' tests. So now,
running the tests with SQLite should expect those lookups to work correctly.
Now, I haven't figured out the best way to implement the remaining lookups (and transforms). Also, I'm not really sure if what I did is the best approach to implement those three lookups. If you have any thoughts about this, please let me know on the comments below!