sphinxit's Introduction

Meet Sphinxit!

I no longer support this project. If you need fixes or new features, please fork it and experiment. Snaql with native SphinxQL is recommended now.

Sphinxit is a light and powerful SphinxQL query constructor. It is also a set of helpers that make Sphinx search easy to use in any kind of Python project, such as Django or Flask. Sphinxit is an independent Python library, and you are free to use it anywhere you need powerful Sphinx-based search.

Make full-text queries with filtering, ordering, grouping and aggregations. Forget about the deprecated Sphinx API or unpredictable, untested batteries. Sphinxit is just better.

Documentation and examples are on RTD: http://sphinxit.readthedocs.org/

sphinxit's People

Contributors

coagulant, se7ge

sphinxit's Issues

Grouping by MVA

Grouping by an MVA currently behaves incorrectly. For example, take a query like this:

q = Sphinxit('index').cluster('mva_attr')

where 'mva_attr' is an ordinary MVA attribute. I expect to get a count for each distinct value, not for each unique group of values.
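The difference between the two behaviours can be illustrated in plain Python, without Sphinx. The rows and attribute values below are made up for the illustration, and the sketch says nothing about how Sphinxit builds the query.

```python
from collections import Counter

# Hypothetical documents, each with a multi-value attribute (MVA).
rows = [
    {"id": 1, "mva_attr": (1, 2)},
    {"id": 2, "mva_attr": (1, 2)},
    {"id": 3, "mva_attr": (2, 3)},
]

# Current behaviour as reported: one group per unique set of values.
by_group = Counter(row["mva_attr"] for row in rows)

# Expected behaviour: one group per distinct value, with its count.
by_value = Counter(value for row in rows for value in row["mva_attr"])
```

Here `by_group` has two entries, one per combination, while `by_value` counts the values 1, 2 and 3 separately.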

Problem with CALL SNIPPETS and escaping of characters

Sphinxit config:

class SphinxitConfig(object):
    DEBUG = True
    WITH_META = True
    WITH_STATUS = False
    POOL_SIZE = 5
    SEARCHD_CONNECTION = {
        'host': '127.0.0.1',
        'port': 9306,
    }

No indexes need to be created at all to reproduce the bug.
When the following code is executed,

(Snippet(index='nonexistent_index', config=SphinxitConfig)
    .for_query("nonexistent_query")
    .from_data("'", 'test')
    .ask())

Sphinx cannot execute the query, because it contains an unescaped single quote.

sphinxql: syntax error, unexpected $end, expecting ')' or ',' near ...
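A minimal sketch of the kind of escaping that would avoid this error, assuming string literals in SphinxQL are single-quoted and that backslashes and single quotes are escaped with a backslash. The helper names are hypothetical and are not part of Sphinxit.

```python
def escape_sphinxql_string(value):
    """Escape backslashes first, then single quotes (hypothetical helper)."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def quote_sphinxql_string(value):
    """Wrap an escaped value in single quotes to form a string literal."""
    return "'" + escape_sphinxql_string(value) + "'"


def build_call_snippets(data, index, query):
    """Build a CALL SNIPPETS statement for a single data string."""
    args = ", ".join(quote_sphinxql_string(v) for v in (data, index, query))
    return "CALL SNIPPETS(%s)" % args


statement = build_call_snippets("'", "nonexistent_index", "nonexistent_query")
```

With the data string from the report, the lone quote is emitted as `\'`, so the literal is properly terminated.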

Unicode problem

When I try a full-text search with a Unicode keyword:

from sphinxit.core.processor import Search
search_query = Search(**myconfig)
search_query = search_query.match(u'молоко')
search_query.ask()  # returns empty result items, although matching documents exist

The solution is:

lex = search_query.lex().encode('utf8')

It can be used in Search.ask(), for example.

Alternatively, the Search class could be subclassed like this:

class UnicodeSearch(Search):
    def lex(self):
        # Encode the generated query to UTF-8 before it is sent
        return super(UnicodeSearch, self).lex().encode('utf8')

Disconnect after invalid query

Sphinxit keeps the connection after an invalid query (for example, when the index does not exist). That is a problem: we have to destroy the connector first before we can see the changes after a fix.
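One possible shape for the requested behaviour, as a sketch only: a connector that discards its connection whenever a query fails. `ResettingConnector` is a hypothetical class, not Sphinxit's connector, and sqlite3 stands in for searchd purely to keep the example runnable.

```python
import sqlite3


class ResettingConnector(object):
    """Hypothetical connector that discards its connection after a failed query."""

    def __init__(self, connect):
        self._connect = connect   # any callable returning a DB-API connection
        self.connection = None

    def execute(self, statement):
        if self.connection is None:
            self.connection = self._connect()
        try:
            cursor = self.connection.cursor()
            cursor.execute(statement)
            return cursor.fetchall()
        except Exception:
            # Drop the connection so the next call starts from a clean state.
            self.close()
            raise

    def close(self):
        if self.connection is not None:
            self.connection.close()
            self.connection = None


# sqlite3 is only a stand-in for the real server in this sketch.
connector = ResettingConnector(lambda: sqlite3.connect(":memory:"))
try:
    connector.execute("SELECT * FROM missing_index")
except sqlite3.OperationalError:
    pass
rows = connector.execute("SELECT 1")  # reconnects transparently
```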

Make the Snippets API cleaner

Sphinxit.Snippets('index', data=['only good news'], query='Good News').process() sucks. I think it's possible to make a cleaner interface for that.

error: (1040, 'Too many connections', None)

Hello,
when I use it, it always shows the error (1040, 'Too many connections', None).
I think maybe you don't close the MySQL connection after a query.
Would you please check it again?
Thank you for this great project!
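If the guess above is right, the usual remedy is to make sure every connection and cursor is closed even when a query raises. The sketch below shows that pattern with `contextlib.closing`. `run_query` is a hypothetical helper, and sqlite3 is used only as a runnable stand-in for the MySQL-protocol driver.

```python
import sqlite3
from contextlib import closing


def run_query(connect, statement):
    """Open a connection, run one statement, and always close everything."""
    with closing(connect()) as connection:
        with closing(connection.cursor()) as cursor:
            cursor.execute(statement)
            return cursor.fetchall()


rows = run_query(lambda: sqlite3.connect(":memory:"), "SELECT 1")
```

Opening a connection per query is the simplest way to avoid leaks. A pool that returns connections in a `finally` block would serve the same purpose with less overhead.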

Implicit attempt to make proper values types from improper

Take the limit() method, for example. Proper values are integers, but I don't want to convert strings by hand when the values come from GET parameters. limit('10', '100') should become limit(10, 100) automatically. I want the same behaviour in other methods too.
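A minimal sketch of such a conversion, assuming that anything `int()` accepts should pass and everything else should fail with a clear message. `to_int` and `normalize_limit` are hypothetical names, not Sphinxit functions.

```python
def to_int(value, name="value"):
    """Coerce a value with int(); raise ValueError with a readable message."""
    try:
        return int(value)
    except (TypeError, ValueError):
        raise ValueError("%s must be an integer, got %r" % (name, value))


def normalize_limit(*values):
    """Turn limit('10', '100') style arguments into a tuple of ints."""
    return tuple(to_int(v, "limit argument") for v in values)


limits = normalize_limit('10', '100')
```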

Strict Mode On/Off

Sphinxit is very strict about values and raises exceptions if something is wrong. That's good for debugging but annoying in production. We have to check values and their types manually in our views, and that sucks. It would be nice to have a 'strict' or 'debug' mode that switches exception raising on or off. If values are improper, just ignore them and go on.
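The proposal could look roughly like this: a `strict` flag that decides whether a bad value raises or is silently dropped. This is a sketch of the idea with a hypothetical `clean_limit` function, not an existing Sphinxit option.

```python
def clean_limit(value, strict=True):
    """Return value as a non-negative int, or None if it is unusable."""
    try:
        number = int(value)
        if number < 0:
            raise ValueError("limit must not be negative: %r" % (value,))
        return number
    except (TypeError, ValueError):
        if strict:
            raise      # debug behaviour: surface the problem
        return None    # production behaviour: ignore the bad value
```

A caller in non-strict mode would then skip the clause entirely when `None` comes back.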

Expression ranker

Does the implementation allow rankers in expression mode?

I see that there is a check for the 'expr' ranker in https://github.com/semirook/sphinxit/blob/master/sphinxit/core/convertors.py#L364, but I can't find any way to pass an expression to the ranker. How can I do that?

I was able to pass my expression to the ranker by hacking the line https://github.com/semirook/sphinxit/blob/master/sphinxit/core/convertors.py#L364, changing it from
if not ranker in valid_rankers:
to
if not (ranker in valid_rankers or ranker[:4]=='expr'):
and the client code:

searcher = Search(indexes=[index], config=SphinxitConfig)
options = {
        'ranker': "expr('sum(lcs*user_weight)*1000+bm25')",
        'max_matches': 1,
}
search_query = searcher.match(query).options(**options)

The code works, but I suppose there is a more direct way to do that. Is there?
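A slightly stricter variant of the hack above, as a sketch: accept either a known ranker name or a value of the form expr('...'). The list of names is illustrative and may not match the `valid_rankers` actually defined in convertors.py, and `is_valid_ranker` is a hypothetical function.

```python
import re

# Illustrative set of ranker names; the real list lives in convertors.py.
VALID_RANKERS = frozenset([
    'proximity_bm25', 'bm25', 'none', 'wordcount',
    'proximity', 'matchany', 'fieldmask', 'sph04',
])

# Matches expr('<anything>') and nothing else.
EXPR_RANKER = re.compile(r"^expr\('.*'\)$", re.DOTALL)


def is_valid_ranker(ranker):
    """Accept a known ranker name or an expr('...') ranker."""
    return ranker in VALID_RANKERS or bool(EXPR_RANKER.match(ranker))
```

Unlike the `ranker[:4]=='expr'` check, this rejects strings that merely start with "expr".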

Q expressions work incorrectly

I encountered several bugs with Q objects in Sphinxit.

  • In complex expressions there are no parentheses around groups, and also some parts just get lost. For example:

.filter((Q(a__eq=1) | Q(b__eq=2)) & (Q(c__eq=3) | (Q(d__eq=4))))

The generated query:

(a=1) OR (b=2) AND (d=4) AS cnd

The c attribute is gone.

  • Negation doesn't work at all.

Expressions .filter(~Q(a__eq=1)) and

.filter(Q(a__eq=1))

have the same result:

(a=1) AS cnd

Your tests are wrong too, as far as I can see:
https://github.com/semirook/sphinxit/blob/master/sphinxit/tests/test_core.py#L200, https://github.com/semirook/sphinxit/blob/master/sphinxit/tests/test_core.py#L224
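For comparison, here is a minimal sketch of how Q-style objects can keep every operand, parenthesise groups, and honour negation. This `Q` class is a stand-in written for the illustration. It supports only the `__eq` lookup and is not Sphinxit's implementation.

```python
class Q(object):
    """Minimal stand-in for a Q object; supports only the __eq lookup."""

    def __init__(self, **condition):
        if len(condition) != 1:
            raise ValueError("this sketch expects exactly one condition")
        (key, value), = condition.items()
        name, _, lookup = key.partition("__")
        if lookup != "eq":
            raise ValueError("only __eq is supported in this sketch")
        self.sql = "%s=%d" % (name, value)

    @classmethod
    def _from_sql(cls, sql):
        # Build a combined node without going through __init__.
        node = cls.__new__(cls)
        node.sql = sql
        return node

    def __or__(self, other):
        return self._from_sql("(%s OR %s)" % (self.sql, other.sql))

    def __and__(self, other):
        return self._from_sql("(%s AND %s)" % (self.sql, other.sql))

    def __invert__(self):
        return self._from_sql("NOT (%s)" % self.sql)


combined = (Q(a__eq=1) | Q(b__eq=2)) & (Q(c__eq=3) | Q(d__eq=4))
negated = ~Q(a__eq=1)
```

Because each operator wraps its result in parentheses, nesting never changes precedence and no operand can be dropped.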

Update for Sphinx 2.19

Sphinxit is not working for me on my CentOS system with Sphinx 2.19. I suspect that Sphinx 2.19 introduced some changes that broke it.

>>> from sphinxit.core.helpers import BaseSearchConfig
>>> from sphinxit.core.processor import Search
>>> search_query = Search(indexes=['sourcecode'], config=BaseSearchConfig)
>>> search_query.ask()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sphinxit/core/processor.py", line 287, in ask
    return self.connector.execute(query_batch)
  File "sphinxit/core/connector.py", line 149, in execute
    raise SphinxQLDriverException(e)
sphinxit.core.exceptions.SphinxQLDriverException: u'Variable_name'

Are batch queries really batched?

Hi, I have a performance problem. I used the Python sphinxapi before, and sometimes I had strange problems with huge batch queries (unpack exception). Still, it worked pretty fast, in about 2 s. Now I use Sphinxit, and the same queries take 10 s. I have only one question: is it really a batch?

for sub_ql_pair in sxql_batch:
    subresult = {}
    sub_ql, sub_alias = sub_ql_pair
    cursor_exec(sub_ql)
    subresult['items'] = [r for r in cursor]

The profiler showed something interesting:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      670    8.266    0.012    8.266    0.012 {method 'execute' of 'oursql.Cursor' objects}

Order_by conditions sorting

If I create a query with these conditions:

query = Search().match(term + '*').order_by('somefield2', 'desc').order_by('somefield', 'asc')

I want to get this ordering: "ORDER BY somefield2 DESC, somefield ASC"

But I always have this: "ORDER BY somefield ASC, somefield2 DESC"

This happens because query._nodes.OrderBy.orderings is a set, which does not preserve insertion order.

I temporarily solved the problem by explicitly making OrderBy.orderings a list. Can you fix this?
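The workaround amounts to something like the sketch below, where orderings are stored in a list so that they come out in the order they were added. This `OrderBy` class is a simplified stand-in, not the node class from Sphinxit.

```python
class OrderBy(object):
    """Simplified stand-in for an ORDER BY node."""

    def __init__(self):
        # A list keeps insertion order; a set gives no such guarantee.
        self.orderings = []

    def add(self, field, direction):
        self.orderings.append("%s %s" % (field, direction.upper()))
        return self  # allow chaining, as in the query from the report

    def lex(self):
        return "ORDER BY " + ", ".join(self.orderings)


clause = OrderBy().add('somefield2', 'desc').add('somefield', 'asc').lex()
```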

Update + oursql is not working correctly.

Example query:

search_query = Search(indexes=['test'], config=SphinxitConfig)
search_query = search_query.update(broken = 1).filter(id__in=[1, 2])
search_query.ask()

ends with:

sphinxit.core.exceptions.SphinxQLDriverException: no results available

It's failing on line 117 in connector.py

subresult['items'] = [r for r in cursor]

To be exact, the real exception is:

  File "/usr/lib/python2.6/site-packages/sphinxit/core/connector.py", line 118, in _execute_batch
    subresult['items'] = [r for r in cursor]
  File "cursor.pyx", line 183, in oursql.Cursor.fetchone (oursqlx/oursql.c:17532)
  File "cursor.pyx", line 161, in oursql.Cursor._check_statements (oursqlx/oursql.c:17369)
ProgrammingError: (None, 'no results available', None)
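A possible guard, sketched under the assumption that the driver follows PEP 249, where `cursor.description` is `None` for statements that return no rows: only iterate over the cursor when a result set exists. `fetch_items` is a hypothetical helper, and sqlite3 stands in for oursql so that the example runs on its own. Whether oursql behaves the same way would need to be checked.

```python
import sqlite3


def fetch_items(cursor):
    """Return rows only when the last statement produced a result set."""
    # PEP 249: description is None for operations that return no rows.
    if cursor.description is None:
        return []
    return cursor.fetchall()


connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE test (id INTEGER, broken INTEGER)")
cursor.execute("INSERT INTO test VALUES (1, 0)")

cursor.execute("UPDATE test SET broken = 1 WHERE id IN (1, 2)")
update_items = fetch_items(cursor)   # no result set, so no fetch is attempted

cursor.execute("SELECT broken FROM test")
select_items = fetch_items(cursor)
```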

Search via sql_attr_multi attribute

Hello,

Could you advise how to use Sphinxit to search via sql_attr_multi attributes? I need to search text with specified tags and also rank the results by the number of matched tags. Could you provide an example of how this can be implemented?

Thank you.
