tinyfts

A very small standalone full-text search HTTP/SCGI server.

[Screenshot: what the unofficial tinyfts search service for the Tcler's Wiki looked like]

Dependencies

Server

  • Tcl 8.6
  • tclsqlite3 with FTS5

Building, tools, and tests

Everything listed under Server, plus

  • Tcllib
  • kill(1), make(1), sqlite3(1)
  • tDOM and file(1) to run tools/dir2json

On recent Debian and Ubuntu, install the dependencies with

sudo apt install libsqlite3-tcl make sqlite3 tcl tcllib tdom

On FreeBSD, if you have sudo installed, install the dependencies with

sudo pkg install sqlite3 tcl-sqlite3 tcl86 tcllib tdom
cd /usr/local/bin
sudo ln -s tclsh8.6 tclsh

Usage

Usage:
    tinyfts --db-file path [option ...] [wapp-arg ...]
Options:
    --css-file ''
    --credits <HTML>
    --header <HTML>
    --footer <HTML>
    --title tinyfts
    --subtitle <HTML>
    --table tinyfts
    --rate-limit 60
    --result-limit 100
    --log 'access bad-request error rate'
    --behind-reverse-proxy false
    --snippet-size 20
    --title-weight 1000.0
    --query-min-length 2
    --query-syntax web

The basic usage is

tools/import json example.jsonl example.sqlite3
# Local server
./tinyfts --db-file example.sqlite3 --local 8080
# Server available over the network
./tinyfts --db-file example.sqlite3 --server 8080

Query syntax

Default or "web"

The default full-text search query syntax in tinyfts resembles that of a Web search engine. It can handle the following types of expressions.

  • foo — search for the word foo.
  • "foo bar" — search for the phrase foo bar.
  • foo AND bar, foo OR bar, NOT foo — search for both foo and bar, at least one of foo and bar, documents without foo respectively. foo AND bar is identical to foo bar. The operators AND, OR, and NOT must be in all caps.
  • -foo, -"foo bar" — the same as NOT foo, NOT "foo bar".
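The rules above can be sketched as a small tokenizer. This is not the tinyfts implementation (tinyfts is written in Tcl); it is a minimal Python illustration, and the function name `parse_web_query`, the token kinds, and the treatment of NOT as a unary prefix are assumptions made for the example.

```python
import re

# A token is a quoted phrase (optionally prefixed with "-")
# or a run of non-space characters.
TOKEN_RE = re.compile(r'-?"[^"]*"|\S+')
OPERATORS = {"AND", "OR", "NOT"}


def parse_web_query(query):
    """Return a list of (kind, text) pairs.

    kind is "op", "word", or "phrase". This is a hypothetical helper,
    not part of tinyfts.
    """
    result = []
    for token in TOKEN_RE.findall(query):
        if token in OPERATORS:
            # Operators are recognized only in all caps.
            # A NOT that follows an operand gets an implicit AND before it.
            if token == "NOT" and result and result[-1][0] != "op":
                result.append(("op", "AND"))
            result.append(("op", token))
            continue
        # A leading "-" negates the word or phrase that follows it.
        negated = len(token) > 1 and token.startswith("-")
        if negated:
            token = token[1:]
        if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
            item = ("phrase", token[1:-1])
        else:
            item = ("word", token)
        # Two operands in a row mean an implicit AND.
        if result and result[-1][0] != "op":
            result.append(("op", "AND"))
        if negated:
            result.append(("op", "NOT"))
        result.append(item)
    return result
```

With this sketch, `foo bar` and `foo AND bar` produce the same token list, `-"foo bar"` produces the same list as `NOT "foo bar"`, and a lowercase `and` is treated as an ordinary search word.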

FTS5

You can allow your users to write full FTS5 queries with the command line option --query-syntax fts5. FTS5 queries are more powerful but expose technical details of the underlying database, such as the column names. Users who are unfamiliar with the FTS5 syntax may find it surprising and run into errors when they do not quote a word that has a special meaning.

Setup

Tinyfts searches the contents of an SQLite database table with a particular schema. The bundled import tool tools/import can import serialized data (text files with one JSON object or Tcl dictionary per line) and wiki pages from a Wikit/Nikit database into a tinyfts database.
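As an illustration of the "one JSON object per line" input format, the following Python sketch writes a small file that could be passed to tools/import json. The field names `url`, `title`, `timestamp`, and `content`, the sample documents, and the helper names are assumptions made for this example; check tools/import for the keys it actually expects.

```python
import json

# Hypothetical sample documents. The field names are assumed,
# not taken from the tinyfts documentation.
DOCUMENTS = [
    {
        "url": "https://example.com/hello",
        "title": "Hello",
        "timestamp": 1700000000,
        "content": "Hello, world!\nA second line of text.",
    },
    {
        "url": "https://example.com/goodbye",
        "title": "Goodbye",
        "timestamp": 1700000600,
        "content": "Goodbye, world!",
    },
]


def to_json_lines(docs):
    """Serialize each document as one JSON object per line."""
    # json.dumps escapes newlines inside strings, so each object
    # stays on a single line.
    return "".join(json.dumps(doc) + "\n" for doc in docs)


def write_json_lines(path, docs):
    """Write the documents to a JSON Lines file at the given path."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_json_lines(docs))
```

Calling `write_json_lines("example.jsonl", DOCUMENTS)` would produce a file in the shape that the basic usage example passes to tools/import json, assuming the field names match what the import tool expects.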

Example

This example shows how to set up search for a backup copy of the Tcler's Wiki. The instructions should work on most Linux distributions and FreeBSD with the dependencies and Git installed.

1. Go to https://sourceforge.net/project/showfiles.php?group_id=211498. Download and extract the last Wikit database snapshot of the Tcler's Wiki. Currently that is wikit-20141112.zip. Let's assume you have extracted the database file to ~/Downloads/wikit.tkd.

2. Download, build, and test tinyfts. In this example we use Git to get the latest development version.

git clone https://github.com/dbohdan/tinyfts
cd tinyfts
make

3. Create a tinyfts search database from the Tcler's Wiki database. The repository includes an import tool that supports Wikit databases. Depending on your hardware, this may take up to several minutes with an input database size in the hundreds of megabytes.

./tools/import wikit ~/Downloads/wikit.tkd /tmp/fts.sqlite3

4. Start tinyfts on http://localhost:8080. The server URL should open automatically in your browser. Try searching.

./tinyfts --db-file /tmp/fts.sqlite3 --title 'tinyfts demo' --local 8080

Operating notes

  • If you put tinyfts behind a reverse proxy, remember to start it with the command line option --behind-reverse-proxy true. It is necessary for correct client IP address detection, which rate limiting depends on. Do not enable --behind-reverse-proxy if tinyfts is not behind a reverse proxy. Otherwise clients can spoof their IP address with the header X-Real-IP or X-Forwarded-For, which lets them evade rate limiting and get other clients rate-limited.
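The reasoning behind this note can be sketched in a few lines of Python. This is not the tinyfts code. The function name `client_ip`, the order in which the headers are checked, and the choice of the first X-Forwarded-For entry are assumptions made for illustration.

```python
def client_ip(remote_addr, headers, behind_reverse_proxy=False):
    """Pick the address to use as the rate-limiting key (illustrative sketch).

    remote_addr is the peer address of the TCP connection.
    headers is a dict of request headers.
    """
    if not behind_reverse_proxy:
        # Clients connect directly, so the headers are client-controlled
        # and must be ignored. Only the connection address can be trusted.
        return remote_addr
    # Behind a proxy, remote_addr is the proxy itself, so the
    # address has to come from a header that the proxy sets.
    real_ip = headers.get("X-Real-IP")
    if real_ip:
        return real_ip.strip()
    forwarded = headers.get("X-Forwarded-For")
    if forwarded:
        # Assumption: the first entry is the original client.
        return forwarded.split(",")[0].strip()
    return remote_addr
```

In this sketch, a direct client that sends `X-Real-IP: 198.51.100.7` is still keyed by its connection address when `behind_reverse_proxy` is false. If the flag were true without a proxy in front, the same client could pick any key it liked, which is the spoofing problem described above.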

License

MIT. Wapp is copyright (c) 2017-2022 D. Richard Hipp and is distributed under the Simplified BSD License. Tacit is copyright (c) 2015-2020 Yegor Bugayenko and is distributed under the MIT license.

Contributors

dbohdan

tinyfts's Issues

Prioritize search terms in title

When searching for ABC, it would be nice to list the pages that have ABC in the title at the top, before pages that contain ABC only in the content.
