We will leverage the built-in FTS5 plugin to search through messages in a simple chat app.
This blog post is inspired by Fly.io’s SQLite3 Full Text Search With Phoenix but instead of creating a new virtual table from scratch we keep our existing messages table and add a new virtual table for our search functionality on top.
This has some advantages:
-
The new search functionality and the old data & code are more separated.
-
We can keep our Ecto abstractions and can easily add fields to our messages.
-
In many cases adding full text search retroactively might be the only way.
One disadvantage is that we have to keep both tables in sync. We will use SQLite triggers whenever we update our messages data.
Table of Contents
To see how everything works, let’s start to creates a basic app that saves chat messages:
Creating our chat app
First, we start a new app with a SQLite3 database.
|
|
In our app directory, we generate a context Chat and a Message model with an Ecto schema and database migration:
|
|
Our migration file in ./priv/repo/migrations
should look like this:
|
|
Next, we run the migration:
|
|
Let’s insert some messages so we can add full text search to some existing data later:
|
|
It works! Now it’s time to add full text search without any external dependencies:
Implementing our search
I recommend taking a look at Fly.io’s SQLite3 Full Text Search With Phoenix post that explains how the FTS5-plugin uses virtual tables to search through our data.
Instead of creating a FTS5 virtual table from scratch that keeps our search index AND the data in one place we will make use of the External Content Tables functionality.
How do External Content Tables work?
This feature creates a FTS5 table that stores only FTS index entries. Whenever we search for some content in our messages, the FTS5 plugin queries the data from our existing messages table.
As mentioned above, this leaves our messages untouched and we can keep using all the conveniences provided by Ecto.
For example, I first tried the from-scratch-migration from Fly.io’s post but got an error when I added another column to the messages table via ecto.gen.migration
. I would have to work with manual SQL commands whenever I wanted to change something.
The approach in this post let’s me use messages in the traditional Ecto way.
Furthermore, since we don’t store any data in our FTS5 virtual table, we keep our database small.
Adding a FTS5 virtual table to our Phoenix app
Let’s generate a new migration:
|
|
The ecto_sqlite3 adapter doesn’t have any convenience functions to add a virtual table - we have to do it by hand. So let’s replace the generated code in the migration with this:
|
|
Note: I’ve done my best to implement the steps found in the FTS5 docs in our migration. I’m by no means an SQLite expert, so be careful to blindly copy these commands.
Let’s go through our manual migration:
We have up/0
and down/0
functions to migrate our database or to roll it back. We could put everything in one change() function, but I like to separate the steps (to make it easier to grasp what is going on).
The execute/1 command allows us to use several SQL commands:
The virtual table for the FTS5 search index
|
|
In the first step we create a new virtual table named messages_search with a column body.
This part content='messages', content_rowid='id'
tells FTS5 to get the data from our existing messages table with the primary_key id.
We could also add other fields like the title - as long as they exist as a column in the queried table messages.
Okay - we have an FTS5 virtual table with an empty index. Now it’s time to sync it with our data.
Keeping the index up-to-date with SQLite triggers
|
|
This command creates a trigger that fires whenever we insert data in our old messages table. Whenever a new message gets inserted, SQLite automatically updates our FTS5 virtual table so it knows where to find the message id and the message body.
We also create 2 other triggers:
|
|
They update the FTS5 index each time a message gets deleted or updated.
With these commands we can create the virtual table and the triggers to keep it synced.
Rollback the database
|
|
In the down() function we instruct Ecto to delete all triggers and drop the messages_search table when we choose to rollback the database.
Syncing our tables manually
Let’s migrate our database with
|
|
We now have all necessary tables and triggers.
Let’s check our database with DB Browser for SQLite:
We can see our messages table created by Ecto and also multiple messages_search-tables created by our migration (and the FTS5 plugin).
We can also see our triggers which keep our index in sync.
Instead of triggers, we could also opt for a simpler solution like syncing the tables periodically. In fact we have to do it once to index our two existing messages in our database.
Let’s open iex to update our index manually:
|
|
We’ll use a single SQL command that builds our FTS5 search index for our existing messages:
|
|
Ecto schema for our FTS5 table
Let’s add a schema in .lib/chatter/search/messages_search.ex
to use our messages_search table with Ecto:
|
|
We map our schema’s primary key to :id (based on the :rowid in the database that references our messages :id) so we can use it with Ecto.
FTS5 also creates a :rank column that lets us order our results, the virtual field in our schema makes sure we can use it in our queries.
Querying the Full Text Search index
Now with everything in place, we can search within our messages. Let’s add a Search context:
|
|
fragment()
searches through our FTS5 index and returns a list of matching messages (or an empty list if nothing matches our query string).
|
|
Everything works! We can search for strings in our messages and our FTS5 index will get updated automatically.
Other things to try out
Add more fields to our index. You could include the :title column in the migration. You could also try to add another column in a completely new migration (that executes manual SQL commands).
Our implementation only queries full strings. For substrings like “virt” (instead of “virtual”). It’s also case-insensitive. From looking at the FTS5 docs, you could leverage the built-in trigram tokenizer to add this functionality.
That’s it! I hope this little project was helpful. Please share this post with others and let me know how you would improve or build upon this code.