Fulltext search in SQLite and Django app
Since version about 3.5.9 SQLite contains full-text search module called FTS3 (older releases may have FTS1, FTS2). Using this module we can easily add fast full text search to a Django application (or any other using SQLite). During my tests I got FTS3 only on Python 2.6. On Python 2.5 and using pysqlite may be hard to get FTS3 (try recompiling with -DSQLITE_ENABLE_FTS3=1 flag).
Creating virtual table
Here is an example table:
FTS1 and FTS2 can be used exactly in the same way as FTS3. More on sqlite.org.
Adding data to a virtual table
We can start with importing data from existing table. With this script (executed from Django project folder) we can import the data:
# -*- coding: utf-8 -*-
import sys
import urllib2
from os import environ
environ['DJANGO_SETTINGS_MODULE'] = 'settings'
from settings import *
from django.contrib.sessions.models import *
from django.db import connection, transaction
from MYAPP.models import *
cursor = connection.cursor()
j = MY_SOME_MODEL.objects.all()
iterr = 1
for i in j:
print iterr
txt = i.some_txt + ' ' + i.more_txt + ' ' + i.city_of_something
# txt should be stripped from HTML, stop words etc. to get smaller size of the database
cursor.execute("INSERT INTO my_search (slug, body) VALUES (%s, %s)", (i.slug, txt))
iterr += 1
transaction.commit_unless_managed()
Indexing new entries we can handle in Django with signals. For example in models.py add:
from django.db.models import signals
#....
def update_index(sender, instance, created, **kwargs):
cursor = connection.cursor()
txt = instance.some_txt + ' ' + instance.more_txt + ' ' + instance.city_of_something
# txt should be stripped from HTML, stop words etc. to get smaller size of the database
txt = clean_to_search(txt)
if created:
# add if object is created, not updated
cursor.execute("INSERT INTO my_search (slug, body) VALUES (%s, %s)", (instance.slug, txt))
transaction.commit_unless_managed()
signals.post_save.connect(update_index, sender=MY_SOME_MODEL)
Full-text in SQLite
The query looks like this:
cursor = connection.cursor()
cursor.execute("SELECT slug FROM my_search WHERE body MATCH %s", (query,))
results = cursor.fetchall()
I'm testing this search solution on my JobMaster - job offer searcher hosted on megiteam.pl. As for now no problems occurred with this (except switch from Python 2.5 to Python 2.6). It doesn't leak memory (Nginx+FastCGI), and it doesn't seem to be slow (it index entries faster than Whoosh, but no number on it). It's simple, easy to setup, and no whoosh. solr, xapian needed, so it's cool way to add full text search to SQLite powered websites.
Comment article