django-postgres-searchindex

A bit like django-haystack, but everything lives in Postgres and is accessible via the Django ORM, using Postgres full-text search capabilities. The goal is to ease setup and maintenance for small and medium-sized projects, without dependencies on search technology like Elasticsearch, Solr or Whoosh.

During conception, I considered developing a backend for django-haystack, but decided against it, so that I could build from the ground up and keep things as simple as possible. The project could still provide a haystack backend one day, but that was not my priority.

Features

  • Search index in PostgreSQL
  • No dependencies besides Django and PostgreSQL
  • contrib.djangocms, for easy indexing of django-cms sites

Quickstart

Describe, index, search.

Define index(es) in django settings

Default value, simplest possible configuration:

POSTGRES_SEARCHINDEX = {
    "default": {},
}

Example for a multilanguage setup:

POSTGRES_SEARCHINDEX = {
    "de": {
        "kwargs": {
            "language": "de",
        }
    },
    "fr": {
        "kwargs": {
            "language": "fr",
        }
    },
}

More complex configurations could include Django's SITE_ID or other relevant information in the search index key and kwargs.
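As a rough sketch of such a configuration, the index dictionary can be generated per site and language. The key format and the "site_id" kwarg below are assumptions made for illustration; only "language" appears in the examples above, so check what your index sources actually accept.

```python
# settings.py (sketch, not a confirmed configuration)
SITE_ID = 1
LANGUAGE_CODES = ["de", "fr"]  # hypothetical helper list for this example

# One index per language, with the site id encoded in the key.
POSTGRES_SEARCHINDEX = {
    f"site{SITE_ID}-{code}": {
        "kwargs": {
            "language": code,
            "site_id": SITE_ID,  # hypothetical kwarg, not confirmed by the package
        }
    }
    for code in LANGUAGE_CODES
}
```

With keys like these, any search code that filters on index_key has to build the same key, rather than using the plain language code.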

Define sources

An example, hopefully self-explanatory. Subclass either IndexSource or MultiLanguageIndexSource.

import html

from django.utils.html import strip_tags
from postgres_searchindex.base import IndexSource  # or MultiLanguageIndexSource
from postgres_searchindex.source_pool import source_pool

from news.models import News

@source_pool.register
class NewsIndexSource(IndexSource):  # or MultiLanguageIndexSource
    model = News

    def get_title(self, obj):
        return strip_tags(obj.description)

    def get_content(self, obj):
        return html.unescape(strip_tags(obj.description))

    def get_queryset(self):
        return self.model.objects.published()

Place this code in index_sources.py of your app, and it will be autodiscovered.
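To see why get_content combines strip_tags with html.unescape, here is a small standalone sketch. It uses a minimal HTMLParser-based stand-in for Django's strip_tags (an assumption made so the snippet runs without Django); like the real function, it drops tags but leaves entities such as &amp;Uuml; untouched, which is why the unescape step is needed before indexing.

```python
import html
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Simplified stand-in for django.utils.html.strip_tags."""

    def __init__(self):
        # Keep entities as written instead of decoding them while parsing.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def clean_for_index(markup):
    """Drop tags, then turn HTML entities into real characters."""
    stripper = TagStripper()
    stripper.feed(markup)
    stripper.close()
    return html.unescape("".join(stripper.parts))


print(clean_for_index("<p>&Uuml;berhaupt <b>nicht</b> &amp; mehr</p>"))
# prints: Überhaupt nicht & mehr
```

Without the unescape step, the index would contain the literal text "&Uuml;berhaupt", which a search for "überhaupt" would not match.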

Populate the index

Run ./manage.py postgres_searchindex_update to update/build the index.

» ./manage.py postgres_searchindex_update
====================================
Updating index "de" with kwargs {'language': 'de'}
Person. Indexing 5 entries
> Done. Removed from index: 0
Project. Indexing 66 entries
> Done. Removed from index: 0
Media. Indexing 36 entries
> Done. Removed from index: 2
====================================
Updating index "fr" with kwargs {'language': 'fr'}
Person. Indexing 5 entries
> Done. Removed from index: 0
Project. Indexing 66 entries
> Done. Removed from index: 0
Media. Indexing 36 entries
> Done. Removed from index: 2

If you want to check how things were indexed, you can inspect your IndexEntry instances in the Django admin.

Search!

You can now search in your index. You are free to use Django's built-in full-text search features as you like, either as in the following example or in a far more advanced manner.

from django.contrib.postgres.search import SearchVector
from postgres_searchindex.models import IndexEntry

# this will return entries containing "überhaupt" and "uberhaupt"
IndexEntry.objects.annotate(
    search=SearchVector("content", "title", config="german")
).filter(index_key=self.request.LANGUAGE_CODE, search="uberhaupt")
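The example above hard-codes config="german" while filtering by the request's language code. In a multilanguage setup, one possible approach is a small lookup from Django language codes to PostgreSQL text search configuration names. The mapping and the helper name below are hypothetical and not part of the package; "german", "french", "english" and "simple" are configurations that ship with PostgreSQL.

```python
# Hypothetical mapping from Django language codes to PostgreSQL
# text search configuration names; extend it for your languages.
PG_SEARCH_CONFIGS = {
    "de": "german",
    "fr": "french",
    "en": "english",
}


def search_config_for(language_code, default="simple"):
    """Return the PostgreSQL text search config for a Django language code."""
    # Regional variants such as "de-ch" resolve to their base language.
    base = language_code.lower().split("-")[0]
    return PG_SEARCH_CONFIGS.get(base, default)
```

The result could then be passed as the config argument of SearchVector, for example config=search_config_for(self.request.LANGUAGE_CODE).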

There is a full example in the source: views.py and urls.py will give you an idea.

To be done: a |highlight:query template filter, to highlight the search query in the search result text.

Keep the index fresh

Either run ./manage.py postgres_searchindex_update regularly, or implement a realtime or near-realtime solution with signals, through the POSTGRES_SEARCHINDEX_SIGNAL_PROCESSOR setting.

Built-in processors are not available yet. The following two are foreseen:

  • postgres_searchindex.signal_processors.RealtimeSyncedSignalProcessor
  • postgres_searchindex.signal_processors.RealtimeCelerySignalProcessor

The async (Celery) signal processor will require you to have Celery configured.

contrib.djangocms

A few tools to speed up indexing of django-cms sites.

AppHook

Add postgres_searchindex.contrib.djangocms to settings.INSTALLED_APPS. Configure one of your cms pages to use the app hook "Search Form (postgres_searchindex)". It will provide a very basic search form, and you can override the template postgres_searchindex/search.html if you want.

Indexing of cms pages

Add postgres_searchindex.contrib.djangocms to settings.INSTALLED_APPS.
And set settings.POSTGRES_SEARCHINDEX_USE_CMS_INDEX = True to have your django-cms pages indexed automagically (with the next call of ./manage.py postgres_searchindex_rebuild).
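Put together, the relevant settings might look like the following fragment. Listing the base "postgres_searchindex" app alongside the contrib app is an assumption here; the other two values are taken from the text above.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    # ... your other apps, including django-cms ...
    "postgres_searchindex",  # assumed: the base app itself
    "postgres_searchindex.contrib.djangocms",
]

# Have django-cms pages indexed on the next index run.
POSTGRES_SEARCHINDEX_USE_CMS_INDEX = True
```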

Indexing models with a PlaceholderField

Example Event model, with a PlaceholderField called "content":

import html

from django.utils.html import strip_tags
from postgres_searchindex.base import MultiLanguageIndexSource
from postgres_searchindex.contrib.djangocms.base import PlaceholderIndexSourceMixin
from postgres_searchindex.source_pool import source_pool

from .models import Event

@source_pool.register
class EventIndexSource(PlaceholderIndexSourceMixin, MultiLanguageIndexSource):
    model = Event
    placeholder_field_name = "content"

    def get_content(self, obj):
        c = strip_tags(obj.description)  # prepend with preview/description
        c += super().get_content(obj)  # render placeholder
        c = html.unescape(c)  # convert HTML entities, e.g. &amp; to &
        return c

    def get_queryset(self):
        return self.model.objects.published()

Inspired by haystack

I used django-haystack for a decade, and I really like the concept. Building my first index, though, was quite time-intensive. Since development of haystack and some of its backends has stalled at times, I regularly thought about writing my own search index, using PostgreSQL only.

TODO

See open issues.

Contributors

benzkji, wullerot