
search_index's Introduction

Search Index

Description

Search Index provides an easy way to implement high performance fulltext searching on your Symphony site. By setting filters for each Section in your site you control which entries are indexed and therefore searchable. Frontend search can be implemented either using the Search Index field that allows keyword filtering in data sources, or the included Search Index data source for searching multiple sections at once.

Usage

  1. Add the search_index folder to your Extensions directory
  2. Enable the extension from the Extensions page
  3. Configure indexes from Search Index > Indexes

1. Configuring section indexes

After installation navigate to Search Index > Indexes whereupon you will see a list of all sections in your site. Click on a name to configure the indexing criteria for that section. The index editor works the same as the data source editor:

  • Only values of fields selected from the Included Elements list are used for searching
  • Index Filters work exactly like data source filters. Use these to ensure only desired entries are indexed

Once saved the "Index" column will display "0 entries" on the Search Indexes page. Select the row and choose "Re-index Entries" from the With Selected menu. When the page reloads you will see the index being rebuilt, one page of results at a time.

Multiple sections can be selected at once for re-indexing.

The page size and speed of refresh can be modified by editing the re-index-per-page and re-index-refresh-rate variables in your Symphony config.php.
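For illustration, the relevant part of config.php might look like the sketch below. The search_index group and the quoted string values mirror the config dump quoted in one of the issues further down this page; the values 50 and 1 are arbitrary examples rather than recommendations.

```php
<?php
// Sketch of the relevant part of Symphony's config.php (other groups omitted).
// The values are illustrative assumptions, not the defaults.
$settings = array(
    'search_index' => array(
        're-index-per-page' => '50',     // entries indexed per page refresh (default 20)
        're-index-refresh-rate' => '1',  // pause in seconds between refreshes (default 0.5)
    ),
);
```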

The index will be automatically updated whenever an entry within the indexed section is created, edited or deleted.

2. Fulltext search in a data source (single section)

Adding a keyword search to an existing data source is extremely easy. Start by adding the Search Index field to your section. This allows you to add a filter on this field when building a data source. For example:

  • add the Search Index field to your section
  • modify your data source to filter this field with a filter value of {$url-keywords}
  • attach the data source to a page and access like /my-page/?keywords=foo+bar

3. Fulltext search across multiple Sections

A full-site search can be achieved using the custom Search Index data source included with this extension. Attach this data source to a page and invoke it using the following GET parameters:

  • keywords the string to search on e.g. foo bar
  • sort (default score) either id (entry ID), date (entry creation date), score (relevance) or score-recency (relevance with a higher weighting for newer entries)
  • direction (default desc) either asc or desc
  • per-page (default 20) number of results per page
  • page the results page number
  • sections a comma-delimited list of section handles to search within (only those with indexes will work) e.g. articles,comments

Your search form might look like this:

<form action="/search/" method="get">
	<label>Search <input type="text" name="keywords" /></label>
	<input type="hidden" name="sort" value="score-recency" />
	<input type="hidden" name="per-page" value="10" />
	<input type="hidden" name="sections" value="articles,comments,categories" />
</form>

Note that all of these variables (except for keywords) have defaults in config.php. So if you would rather not include these on your URLs, modify the defaults there and omit them from your HTML.

If you want to change the names of these variables, they can be modified in your Symphony config.php. If you are using Form Controls to post these variables from a form, your variable names may be in the form fields[...]. If so, add fields to the get-param-prefix variable in your Symphony config.php. For an example of renaming variables, see the "Configuration" section of this README.

Using Symphony URL Parameters

The default is to use GET parameters such as /search/?keywords=foo+bar&page=2 but if you prefer to use URL Parameters such as /search/foo+bar/2/, set the get-param-prefix variable to a value of param_pool in your config.php and the extension will look at the Param Pool rather than the $_GET array for its values.
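A minimal sketch of that setting, assuming the same config.php layout as the config dump quoted in the issues on this page (only the relevant key is shown):

```php
<?php
// Sketch only: other search_index keys and other config groups are omitted.
$settings = array(
    'search_index' => array(
        // Read keywords, page, etc. from the Param Pool instead of $_GET.
        'get-param-prefix' => 'param_pool',
    ),
);
```

This README does not spell out how the page's URL Parameter names map onto the search variables. A reasonable assumption is that they should match the get-param-* names (keywords, page, and so on by default), but treat that as unverified.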

Example XML

The XML returned from this data source looks like this:

<search keywords="foo+bar+symfony" sort="score" direction="desc">
	<alternative-keywords>
		<keyword original="foo" alternative="food" distance="1" />
		<keyword original="symfony" alternative="symphony" distance="2" />
	</alternative-keywords>
	<pagination total-entries="5" total-pages="1" entries-per-page="20" current-page="1" />
	<sections>
		<section id="1" handle="articles">Articles</section>
		<section id="2" handle="comments">Comments</section>
	</sections>
	<entry id="3" section="comments">
		<excerpt>...</excerpt>
	</entry>
	<entry id="5" section="articles">
		<excerpt>...</excerpt>
	</entry>
	<entry id="2" section="articles">
		<excerpt>...</excerpt>
	</entry>
	<entry id="1" section="comments">
		<excerpt>...</excerpt>
	</entry>
	<entry id="3" section="comments">
		<excerpt>...</excerpt>
	</entry>
</search>

This in itself is not enough to render a results page. To do so, use the $ds-search Output Parameter created by this data source to filter by System ID in other data sources. In the example above you would create a new data source each for Articles and Comments, filtering System ID by the $ds-search parameter. Use XSLT to iterate over the <entry ... /> elements above, and cross-reference with the matching entries from the Articles and Comments data sources.

(But if you're very lazy and don't give two-hoots about performance, see the build-entries config option explained later.)

Weighting

We all know that all sections are equal, only some are more equal than others ;-) You can give higher or lower weighting to results from certain sections, by issuing them a weighting when you configure their Search Index. The default is Medium (no weighting), but if you want more chance of entries from your section appearing higher up the search results, choose High; or for even more prominence Highest. The opposite is true: to bury entries lower down the results then choose Low or Lowest. This weighting has the effect of doubling/quadrupling or halving/quartering the original "relevance" score calculated by the search.
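As a rough sketch of the arithmetic described above: the function name and the label-to-multiplier table are hypothetical, and the extension may store and apply weightings differently internally.

```php
<?php
// Hypothetical helper mirroring the description above: High/Highest double or
// quadruple the relevance score, Low/Lowest halve or quarter it.
function apply_section_weighting($score, $weighting)
{
    $multipliers = array(
        'highest' => 4.0,
        'high'    => 2.0,
        'medium'  => 1.0,   // default: no weighting
        'low'     => 0.5,
        'lowest'  => 0.25,
    );
    // Unknown labels fall back to Medium (no weighting).
    $factor = isset($multipliers[$weighting]) ? $multipliers[$weighting] : 1.0;
    return $score * $factor;
}

echo apply_section_weighting(3.0, 'high'), "\n";   // prints 6
echo apply_section_weighting(3.0, 'lowest'), "\n"; // prints 0.75
```

So an entry with a raw score of 3 in a High section would outrank an entry with a raw score of 5 in a Medium section.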

Configuration

The common configuration options are discussed above. This is a full list of the variables you should see in your config.php. If some are missing it is because you have previously installed an earlier version of the extension. You can add these variables manually to make use of them.

re-index-per-page

Defaults to 20. When manually re-indexing sections in the backend (Search Index > Indexes, highlight rows and select "Re-index" from the With Selected dropdown) this is the number of entries per "page" that will be re-indexed at once. If you have 100 entries and re-index-per-page is 20 then you will have 5 pages of entries that will index, one after the other.

re-index-refresh-rate

Defaults to 0.5 seconds. This is the "pause" between each cycle of indexing when manually re-indexing sections. If you have a high traffic site (or slow server) and you are worried that many consecutive page refreshes will use too much server power, then choose a higher number and there will be a longer pause between each page of indexing. The larger the number, the longer you have to wait during re-indexing. Set to 0 for super-quick times.

min-word-length

The smallest length of word to index. Words shorter than this will be ignored. If your site is technical and you need to index abbreviations such as CSS then make sure min-word-length is set to 3 to allow for these!

max-word-length

The longest length of word to index. Words longer than this will be ignored. The maximum value this variable can be is limited by the database column size (currently varchar(255)).
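The effect of min-word-length and max-word-length together can be pictured with a small sketch. The helper name is hypothetical and this is not the extension's own code; it only illustrates which words would survive the two limits.

```php
<?php
// Hypothetical helper illustrating min-word-length / max-word-length.
// Uses strlen(), so lengths are counted in bytes rather than characters.
function filter_indexable_words(array $words, $min, $max)
{
    $kept = array();
    foreach ($words as $word) {
        $length = strlen($word);
        // Words shorter than $min or longer than $max are ignored.
        if ($length >= $min && $length <= $max) {
            $kept[] = $word;
        }
    }
    return $kept;
}

print_r(filter_indexable_words(array('a', 'CSS', 'symphony'), 3, 30));
// keeps "CSS" and "symphony"; "a" is too short
```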

stem-words

Allow word stems to be included in searches. This usually results in more matches. The popular Porter Stemmer algorithm is used. Examples:

  • summary, summarise => summar
  • filters, filtering => filter

Note: I found a few oddities, namely words ending in y which are shortened to end in i. For example symphony and entry become symphoni and entri respectively. This is obviously incorrect, therefore the Porter algorithm is recommended for English-language sites only.

mode

Three query modes are supported:

  • like uses LIKE '%...%' syntax to match whole and partial words
  • regexp uses REGEXP [[:<:]]...[[:>:]] syntax to match whole words only
  • fulltext uses MATCH(...) AGAINST(...) syntax for MySQL's own fulltext binary search

Changing this variable changes the query mode for all searches made by this extension, both the Search Index data source and filtering on the Search Index field. Mode switching was introduced because of the limitations of fulltext binary search: while very fast, it has a word length limitation and doesn't work well with short indexed strings or small data sets.

like is the default as this seems to provide the best compromise between performance, in-word matching, and narrowness of results returned.

Both like and regexp modes correctly handle boolean operators in search keywords:

  • prefix a keyword with + to make it required
  • prefix a keyword with - to make it forbidden
  • surround a phrase with "..." to match the whole phrase
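To make the three operators concrete, here is an illustrative parser that sorts a keyword string into required, forbidden, optional and phrase terms. It is a sketch under the rules listed above, not the extension's actual implementation, and the function name is made up for the example.

```php
<?php
// Illustrative only: classifies the parts of a query by the operators above.
function parse_boolean_keywords($query)
{
    $terms = array(
        'required'  => array(),
        'forbidden' => array(),
        'optional'  => array(),
        'phrases'   => array(),
    );

    // Pull out "quoted phrases" first so their words are not split apart.
    if (preg_match_all('/"([^"]+)"/', $query, $matches)) {
        $terms['phrases'] = $matches[1];
        $query = preg_replace('/"[^"]+"/', ' ', $query);
    }

    // Classify the remaining whitespace-separated words by their prefix.
    foreach (preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if ($word[0] === '+' && strlen($word) > 1) {
            $terms['required'][] = substr($word, 1);
        } elseif ($word[0] === '-' && strlen($word) > 1) {
            $terms['forbidden'][] = substr($word, 1);
        } else {
            $terms['optional'][] = $word;
        }
    }
    return $terms;
}

print_r(parse_boolean_keywords('+symphony -drupal "search index" cms'));
```

When keywords are passed in a URL query string, a literal + is decoded as a space, so a required-term prefix needs to be sent as %2B (for example ?keywords=%2Bsymphony+-drupal).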

excerpt-length

When using the Search Index data source, each matched entry will include an excerpt with search keywords highlighted in the text. The default length of this string is 250 characters, but modify it to suit your design.

build-entries

By default the Search Index data source will only return an <entry /> stub for each entry found. It is the developer's job to add additional data sources that filter using the search output parameter, in order to provide extra fields to build search results fully.

However, for the lazy amongst you, set this variable to yes and the entries will be built in their entirety in the data source. This has the benefit that you need only a single data source, but if your entries have many fields, then this will likely have a performance hit as you are adding fields to your XML that you don't need. With great power comes great responsibility, my son.

default-sections

A comma-separated string of section handles to include in the search by default. If you would rather not pass these via a GET parameter to the search data source (e.g. /search/?sections=articles,comments) then add these to the config and omit them from the URL. Defaults to none.

default-per-page

Default number of entries to show per page. Passing this value as a GET parameter to the search data source (e.g. /search/?per-page=10) overrides this default. Defaults to 20.

default-sort

Default field to sort results by. Passing this value as a GET parameter to the search data source (e.g. /search/?sort=date) overrides this default. Defaults to score.

default-direction

Default direction to sort results by. Passing this value as a GET parameter to the search data source (e.g. /search/?direction=asc) overrides this default. Defaults to desc.
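Taken together, the four default-* options might be set as in the sketch below. The layout follows the config dump quoted in the issues on this page; the chosen values are examples only.

```php
<?php
// Sketch of the default-* options in config.php (other keys omitted).
// Values are illustrative; the shipped defaults are none, 20, score and desc.
$settings = array(
    'search_index' => array(
        'default-sections' => 'articles,comments',
        'default-per-page' => '10',
        'default-sort' => 'score-recency',
        'default-direction' => 'desc',
    ),
);
```

With defaults like these, a URL as short as /search/?keywords=foo+bar should behave like the longer form that passes sections, per-page, sort and direction explicitly.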

log-keywords

When enabled, each unique search will be logged and be visible under Search Index > Logs.

get-param-*

These variables store the name of the GET parameter that the Search Index data source looks for. Change these if you don't like my choice of GET parameter names, or if you want them in your own language. For example:

'get-param-keywords' => 'term',
'get-param-per-page' => 'limit',
'get-param-sort' => 'order-by',
'get-param-direction' => 'order-direction',
'get-param-sections' => 'in',
'get-param-page' => 'p',

This would mean you'd create your search URL as:

/?term=foo+bar&limit=20&order-by=id&order-direction=asc&in=articles,comments&p=2

The get-param-prefix variable is explained above in "Using Symphony URL Parameters".

indexes and synonyms

These serialised arrays are created by saving settings from Search Index > Indexes and Search Index > Synonyms. Please don't edit them here, or bad things will happen to you.

Synonyms

This allows you to configure word replacements so that commonly mis-spelt terms are automatically fixed, or terms with many alternative spellings or variations can be normalised to a single spelling. An example:

  • Replacement word United Kingdom
  • Synonyms: uk, great britain, GB, united kingdoms

When a user searches for any of the synonym words, they will be replaced by the replacement word. So if a user searches for countries in the UK their search will actually use the phrase countries in the United Kingdom.

Synonym matches are not case-sensitive.
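The replacement behaviour can be sketched as follows. This is an assumption-laden illustration (whole-word, case-insensitive matching with a hypothetical helper name), not the extension's own code.

```php
<?php
// Hypothetical helper: swaps any synonym for its replacement word,
// matching whole words only and ignoring case.
function replace_synonyms($keywords, array $synonyms)
{
    foreach ($synonyms as $replacement => $variants) {
        foreach ($variants as $variant) {
            $pattern = '/\b' . preg_quote($variant, '/') . '\b/i';
            $keywords = preg_replace($pattern, $replacement, $keywords);
        }
    }
    return $keywords;
}

$synonyms = array(
    'United Kingdom' => array('uk', 'great britain', 'GB', 'united kingdoms'),
);

echo replace_synonyms('countries in the UK', $synonyms), "\n";
// prints: countries in the United Kingdom
```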

Auto-complete/auto-suggest

There is a "Search Index Suggestions" data source which can be used for auto-complete search inputs. Attach this data source to a page and pass the following GET parameters:

  • keywords the keywords to search for (the start of words is matched; keywords of fewer than 3 characters are ignored)
  • sort (optional) defaults to alphabetical but pass frequency to order words by the frequency in which they occur in your index
  • sections (optional) a comma-delimited list of section handles to return keywords for (only those with indexes will work) e.g. articles,comments. If omitted all indexed sections are used.

This extension does not provide the JavaScript "glue" to build the auto-suggest or auto-complete functionality. There are plenty of jQuery plugins to do this for you, and each expect slightly different XML/JSON/plain text, so I have not attempted to implement this for you. Sorry, old chum.

Log viewer

You can see what your users have searched for on the Search Index > Logs page. When logging is enabled, every search made through the Search Index data source will be stored. However the log viewer only displays unique searches — if in one session a user searches using the same keywords four times, it will only display in the log viewer once.

Column descriptions:

  • Date is the time of the search. If a user has searched multiple times, this is the time of the first search
  • Keywords is the raw keyword phrase the user used
  • Adjusted Keywords shows the keyword phrase if it was modified by synonym expansion
  • Results is the number of matched entries the search yielded
  • Depth is the maximum number of search results pages the user clicked through

Known issues

  • you can not order results by relevance score when using a single data source. This is only available when using the custom Search Index data source
  • if you hit the word-length limitations using boolean fulltext searching, try an alternative mode (like or regexp).

search_index's People

Contributors

alexbirukov, animaux, brendo, bzerangue, kirkstrobeck, klaftertief, nathanhornby, nickdunn, nils-werner, nilshoerrmann


search_index's Issues

Missing argument 3 for SearchIndex::substr()

When searching for a term that has a - in the string, it returns the error Missing argument 3 for SearchIndex::substr(), called in /usr/home/sites/clampline.com.au/public_html/extensions/search_index/lib/class.search_index.php on line 663 and defined

For example, if the url params are like this:
?keywords=Narva+Auto+Fuses+-+Bulk&per-page=8&page=1&sections=products
it returns an error (note the hyphen between fuses and bulk)

But if the keywords param doesn't have the hyphen, like ?keywords=Narva+Auto+Fuses+Bulk&per-page=8&page=1&sections=products then there is no error.

Has this been encountered before? Is there an easy solution or am I doing something stupid? If you need more info just let me know. You can go to clampline.com.au and search for narva globes and the first 2 should throw a hissy.

count(): Parameter must be an array or an object that implements Countable

Raised this error on installing on 2.7.10, Search Index > Indexes on the admin.

if ($index) {
    $col_name->appendChild(Widget::Input("items[{$section->get('id')}]", null, 'checkbox'));
}

if ($index && isset($index['fields']) && count($index['fields'] > 0)) {  // highlighted row ------
    $section_fields = $section->fetchFields();
    $fields = $this->_indexes[$section->get('id')]['fields'];
    $fields_list = '';
    foreach($section_fields as $section_field) {

Plugin v 0.9.5 on php 7.2.22

Thx

sym_ prefix hardcoded.

Datasource data.search.php contains hardcoded sym_ prefix in SQL query at lines 106/107.

Using MATCH ... AGAINST is not always working well

I have a database with names. Some of the names are like "Henk", "Loes" and "Frederick", which cause no problems, but there are also names like "an", "wim", "bil" and "jo". This means that the MySQL ft_min_word_len variable must be set to 3 or 2, and as far as I know there is no way to do this at index level. Or is there?

The only way I found to get around this is to set the mentioned variable in the MySQL config file to 2, but I have no access to this file at the webhost. Furthermore, I have some other fulltext indexes where I do not want this min_word_len: news articles containing words like "the", "or" and "if".

An alternative would be to use the less performant LIKE, but only in circumstances like those mentioned above! So, in short, we should be able to choose between matching algorithms on a per-index basis. This also means that when we perform a search over several sections, the sections that are configured to use the LIKE method would run separately.

I have no idea whether this is the way to go, just thinking out loud!

Datasource returns no results on 2.6.5 and 2.6.4

I have installations running Symphony 2.6.4 and 2.6.5 where my data source output doesn't contain my searchindex field. The field has been added, but it doesn't appear when debugging. I use version 0.9.4 of Search Index.
I have indexed 10 items. I have it running on other sites (2.6.4) where it works perfectly.

Uninstalled the extension, readded the field, reindex, readded to the datasource, but nothing.

I have noticed 1 difference in my config.php compared to other working sites:
no single quotes for the numbers
're-index-per-page' => 20,
So i have added them, but no difference.

Where should i start to debug?

###### SEARCH_INDEX ######
    'search_index' => array(
        're-index-per-page' => '20',
        're-index-refresh-rate' => '0.5',
        'get-param-prefix' => null,
        'get-param-keywords' => 'keywords',
        'get-param-per-page' => 'per-page',
        'get-param-sort' => 'sort',
        'get-param-direction' => 'direction',
        'get-param-sections' => 'sections',
        'get-param-page' => 'page',
        'default-sections' => null,
        'default-per-page' => '20',
        'default-sort' => 'score',
        'default-direction' => 'desc',
        'excerpt-length' => '250',
        'min-word-length' => '3',
        'max-word-length' => '30',
        'stem-words' => 'yes',
        'build-entries' => 'no',
        'mode' => 'like',
        'log-keywords' => 'yes',
        'indexes' => 'a:1:{i:60;a:3:{s:6:"fields";a:1:{i:0;s:3:"sku";}s:9:"weighting";s:1:"0";s:7:"filters";a:0:{}}}'
    ),

Error while saving through frontend form

Might be a problem with another extension?

  1. August 2014 10:28 > Fatal Error: GenericExceptionHandler 1: Call to undefined method FieldCheckbox::buildDSRetrivalSQL() on line 103 of file /…/extensions/search_index/lib/class.search_index.php

2.6 Compatibility/Issues

After updating to 2.6: New entries are not indexed and I cannot select a section to reindex.

Are these fixes possibly applicable to search index too?

preg_replace /e flag necessary?

Is the deprecated preg_replace /e flag necessary in this line?

I have not got an error yet, but I’m not sure when exactly this line is run. I wonder if the /e flag is needed here, since there is no retrieved stored value in the replacement.

Unescaped data in log view.

When entering JavaScript into a search field, the resulting search results page correctly shows the escaped version of the text (due to the XSS Filter Extension).

However, when viewing the log of performed searches the JavaScript is successfully executed, opening an XSS vulnerability.

Using Symphony URL Parameters Full Example?

Is there a full example of how to do the "/search/foo+bar/2/" url scheme? I've tried for days to get it to work, but I keep getting a "Invalid search sections" error in my XML. I've edited the config and manually added my section name and tried adding every variant of the param_pool field.

I have a page called "Search" and I have my data source that I want to search against as well as the Index DS attached. I've added a url param called "site-search" to my page.

I've added 'site-search' and tried every variant (site-search | '$site-search' | etc.) to the prefix variable in the config.

I must be doing something terribly wrong.

Truncation isn't UTF-8 compatible

<excerpt><p> img_0335_2-4ea5b0c3834e0.jpg 1163 2011-10-24 Schminkanleitung: Step 1: Zeichnen Sie mit dem Kinder-Schminkstift Grün eine ovale Form über das ganze Gesicht. Anschließend mischen Sie aus den Tiegel-Farben Grün, Gelb und Weiß ein blasses gelb-gr�&#8230;</p></excerpt>

Having issues with German characters being broken into non characters,
my guess is a UTF-8 substring error somewhere ..

Taken from ..

Zeichnen Sie mit dem Kinder-Schminkstift Grün eine ovale Form über das ganze Gesicht. Anschließend mischen Sie aus den Tiegel-Farben Grün, Gelb und Weiß ein blasses gelb-grün zusammen und malen damit die Fläche mit dem breiten Pinsel aus.

I checked through the code and saw your mb_substring fix and the call to self, so I'm stumped :(

The XSLT error page displays

loadXML(): Input is not proper UTF-8, indicate encoding ! Bytes: 0xC3 0x26 0x23 0x38 in Entity, line: 760

Entry Manager not found

I'm using Symphony 2.2.1 with your latest Search Index extension.
When I set build-entries to yes I get :

Fatal error: Class 'EntryManager' not found in C:\dev\apache2\htdocs\extensions\search_index\data-sources\data.search.php on line 386

Search Index Suggestions

It would be great if the extension was recoded slightly to include the section id/name in the index keywords or index entry keywords tables.

Use case: I have a search form that can be globally searched, which can use the Search Index Suggestions datasource for auto-complete, but I also have advanced search boxes, and for those auto-completes I would like to use the Search Index Suggestions datasource as well. To do this I would have a separate page that would filter by section (param) and also have the keywords (url param) which would filter the Search Index Suggestions datasource. I would like this datasource to also output the section name in its XML so I can use the section param to filter the results that I show in my auto-complete.

Does that make sense?

charset error

Hi, I had some issues with the charset in search results (see the attached screenshot). The database and database connection use the 'utf-8' charset.

[Screenshot: 2014-01-25, 2:44 pm]

I changed line 414 in data.search.php from:

$excerpt = utf8_encode($excerpt);

to:

$excerpt = iconv('UTF-8', 'UTF-8//IGNORE', $excerpt);

And now I have the right symbols in search results.

Suggestion: Config options

Rather than have serialised data in the config file it should be in the database.

IMO, and as best practice, the data stored by the extension should be in the DB not the config file. Config files can change between servers (as mine do) and lose all the pertinent information as to what is indexed in the tables that are in the DB.

It took me ages to realise what had happened here, as I assumed (as I shouldn't, I know) that there was a problem with my SQL export/import. I thought only the Preferences page would write to the config file and not the extension. My fault for not checking, but it cost me 2 hours of dumbstruck dev time.

What are your thoughts on this Nick? If you're ok for this idea, I'll have a look at implementing it. I ask for so many changes here, but never contribute ;o)

Error "SELECT list is not in GROUP BY" clicking on Log

When clicking on the log menu item I get this error:

Symphony Fatal Database Error: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'sym_symphony3.sym_search_index_logs.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

An error occurred while attempting to execute the following query

SELECT SQL_CACHE id, keywords, keywords_manipulated, date, sections, results, MAX(page) as `depth`, session_id FROM `sym_search_index_logs` GROUP BY keywords, session_id ORDER BY date desc LIMIT 0, 20

I'm using Symphony 2.7.10 on php 7.1.13, mysql 5.7.26
The plugin seems to be working, but I still get this error on the admin side.

Experimental: Version Number and Installer

It seems like the installer is missing a few tables or columns:

  • sym_search_index_stopwords
  • sym_search_index_synonyms
  • Unknown column user_agent in field list

Nick, would you mind updating the installer and increasing the version number? Thanks a lot!

Suggestion: Fields in autocomplete

You're going to hate me...

Now we have sections, would it be at all possible to include which field the keyword came from? Either that or allow multiple indexes per section?

explode() expects parameter 2 to be string, array given

Hi Nick, I hit this error tonight updating an older site:

explode() expects parameter 2 to be string, array given
/Users/tonyarnold/Sites/tonyarnold.com/extensions/search_index/data-sources/data.search.php line 94

89          } else {
90              $param_sections = array();
91          }
92          
93          $sections = array();
94          foreach(array_map('trim', explode(',', $param_sections)) as $handle) {
95              $section = Symphony::Database()->fetchRow(0,
96                  sprintf(
97                      "SELECT `id`, `name` FROM `tbl_sections` WHERE handle = '%s' LIMIT 1",
98                      Symphony::Database()->cleanValue($handle)

Either/Or Search

Is there a way to do a search like in Google when you use the OR operator? Is there a certain syntax to use or is it not possible?

XSS Vulnerability

I've discovered an XSS vulnerability in this extension.

When I search for something like <script type="text/javascript">alert('XSS Attack!');</script>, I get a parse error (which is a bug in itself). But... when I go in the backend and look into the logs, the JavaScript gets executed!!!

So I could for example search for <script type="text/javascript"> --- ajax call to evil-server.com/evil-script.php?parameter=session ID or something --- </script>. Then when the administrator logs in and goes looking in their log, this script is executed!

Very bad indeed...

search-suggestions Datasource

@brendo @nathanhornby

I get an empty search-suggestions node in its DS. An error is logged:

13. August 2014 13:53 > UNKNOWN: SymphonyErrorPage 0 -  on line 685 of /…/symphony/lib/core/class.symphony.php

Has anyone else tried search-suggestions in 2.4+?

Problems with multi-byte UTF-8 characters

The extension does not handle multibyte UTF-8 characters properly.

  • When searching for a word containing multibyte characters (e.g. "döner"), results can be found. But in the excerpts the keyword will not be enclosed in a <strong/> tag.
  • Furthermore there will be no search-suggestions if the search keyword contains a multibyte character. Searching for döner will not show any suggestions even if your field values contain döner or superdöner.
  • In the search suggestions, multi-byte characters are displayed wrong. For example, when you search for superd and you have fields containing superdöner, the suggestion superdöner will actually be superdÃ¶ner (which is a typical wrong multibyte representation).

So I guess the extension needs some utf-encoding/-decoding functions in the right places so that PHP can work with the strings correctly. Unfortunately I don't know enough about the inner workings to fix this myself, but I will do testing! (I really think that this extension is a big step forward for Symphony as a whole.)

DB contains deleted info

I have deleted an index, yet the keywords still remain in the database. Also, entries which have been deleted still remain in the keyword indexes.

I have re-indexed the relevant sections, yet the entries remain.

Clicking on re-index seems to add to the index, rather than replace.

Untitled

After installing the extension and setting everything as described in the readme i'm getting the following error:

/extensions/search_index/data-sources/data.search.php around line 253:
array_walk() [function.array-walk]: Unable to call _search_excerpt_replace() - function does not exist

This only occurs when i search on something that actually matches an entry.

Word Issues

There are several word issues that the PorterStemmer was never able to resolve
(from what I've noticed).

Words ending with the following (these are just some I've noticed; I wish we had a solution):

  • 'er' (this is a tricky one because the 'er' can be removed or the 'e' kept)
    • counter => counter (should equal 'count')
    • crusher => crusher (should equal 'crush')
  • 'le'
    • puzzle => puzzl (should equal 'puzzle')
    • rumble => rumbl (should equal 'rumble')
  • 'y'
    • rocky => rocki (should equal 'rock')
    • communities => communiti (this one is a well-known issue)
    • plays => plai (I'm surprised at this one)

For some, you could run a spell-check program such as the 'pspell' PHP extension and check whether the result is the correct spelling of a 'real' word.

I wonder if, for some, you could find a pattern for why some have an 'e' and some don't. Though these algorithms are old, and this is possibly all they are able to do without a 'speech' program.

Code for pagination

I'd like to paginate results. While I see a pagination node in the data, I cannot figure out how to write a chunk of code for enabling this feature. Any source?

Thanks

Call to member fetch() error

Migrated a 2.2.5 install to 2.3.2 and have the latest version of search_index. I re-indexed all of my search indexes but keep getting the following error in my PHP logs:

[26-Jun-2013 23:45:58 UTC] PHP Fatal error: Call to a member function fetch() on a non-object in /Users/jdsimcoe/Sites/churchdeploy/workspace/data-sources/data.search.php on line 425

Any idea how to rectify this?

ds-search still being created?

in Symphony 2.5beta2 I get an error in a custom datasource complaining about ds-search being null. However I can’t really check if it’s there since the error overrides the ?debug console.

Symphony Warning: Use of undefined constant DS_FILTER_AND

  • Sym: 2.7.10
  • PHP: 7.3
  • Search Index: 0.9.5

Steps to reproduce:

Create or save an entry in a Section that is configured in Search Index's indexes.

If I roll back to PHP 7.1 all is well.

Symphony Warning: Use of undefined constant DS_FILTER_AND - assumed 'DS_FILTER_AND' (this will throw an Error in a future version of PHP)

An error occurred in /path-redacted/extensions/search_index/lib/class.search_index.php around line 81

76    if((is_array($filter) && empty($filter)) || trim($filter) == '') continue;
77    
78    if(!is_array($filter)){
79        $filter_type = DataSource::__determineFilterType($filter);
80
81        $value = preg_split('/'.($filter_type == DS_FILTER_AND ? '\+' : '(?<!\\\\),').'\s*/', $filter, -1, PREG_SPLIT_NO_EMPTY);            
82        $value = array_map('trim', $value);
83
84        $value = array_map(array('Datasource', 'removeEscapedCommas'), $value);
85    } 
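
The warning suggests that `DS_FILTER_AND` is not defined when line 81 runs. PHP 7.2 raised the "undefined constant" diagnostic from a notice to a warning, which would explain why PHP 7.1 appears fine. Below is a minimal standalone sketch of the same split, with the filter type compared against a constant that is actually defined. `FILTER_AND`, `FILTER_OR` and `split_filter()` are hypothetical stand-ins for illustration; check Symphony's own `Datasource` class to see which constant it exposes.

```php
<?php
// Hypothetical stand-ins for Symphony's filter-type constants (names are assumptions).
const FILTER_AND = 1;
const FILTER_OR = 2;

/**
 * Split a filter string into trimmed values.
 * AND filters are separated by "+", OR filters by commas not escaped with "\".
 */
function split_filter(string $filter, int $filter_type): array
{
    // Same separators as the snippet above, selected via a defined constant.
    $separator = ($filter_type === FILTER_AND) ? '\+' : '(?<!\\\\),';
    $values = preg_split('/' . $separator . '\s*/', $filter, -1, PREG_SPLIT_NO_EMPTY);

    return array_map('trim', $values);
}
```

Calling `split_filter('red + blue', FILTER_AND)` splits on the plus sign, while `FILTER_OR` splits on unescaped commas only.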

Backtrace

[/path-redacted/extensions/search_index/lib/class.search_index.php:81]
    GenericErrorHandler::handler();
[/path-redacted/extensions/search_index/extension.driver.php:232]
    SearchIndex::indexEntry();
[/path-redacted/symphony/lib/toolkit/class.extensionmanager.php:702]
    Extension_Search_Index->indexEntry();
[/path-redacted/symphony/content/content.publish.php:1159]
    ExtensionManager::notifyMembers();
[/path-redacted/symphony/content/content.publish.php:356]
    contentPublish->__actionNew();
[/path-redacted/symphony/content/content.publish.php:332]
    contentPublish->__switchboard();
[/path-redacted/symphony/lib/toolkit/class.administrationpage.php:465]
    contentPublish->action();
[/path-redacted/symphony/content/content.publish.php:327]
    AdministrationPage->build();
[/path-redacted/symphony/lib/core/class.administration.php:205]
    contentPublish->build();
[/path-redacted/symphony/lib/core/class.administration.php:483]
    Administration->__buildPage();
[/path-redacted/symphony/lib/boot/func.utilities.php:253]
    Administration->display();
[/path-redacted/symphony/lib/boot/func.utilities.php:235]
    symphony_launcher();
[/path-redacted/index.php:19]
    symphony();

Database Query Log

[0.0004] SET character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8';
[0.0001] SET CHARACTER SET 'utf8';
[0.0001] SET time_zone = '+00:00';
[0.0007] SELECT SQL_CACHE t1.name, t2.page, t2.delegate, t2.callback FROM `sym_extensions` as t1 INNER JOIN `sym_extensions_delegates` as t2 ON t1.id = t2.extension_id WHERE t1.status = 'enabled' ORDER BY t2.delegate, t1.name;
[0.0002] SELECT SQL_CACHE `session_data` FROM `sym_sessions` WHERE `session` = 'session-redacted' LIMIT 1;
[0.0002] SELECT SQL_CACHE a.* FROM `sym_authors` AS `a` WHERE `username` = 'user-redacted' ORDER BY a.id ASC LIMIT 1;
[0.0002] UPDATE sym_authors SET `last_seen` = '2020-03-02 16:31:40' WHERE `id` = 1;
[0.0002] SELECT SQL_CACHE `name` FROM `sym_extensions` WHERE `status` = 'enabled';
[0.0002] SELECT SQL_CACHE * FROM `sym_extensions`;
[0.0002] SELECT SQL_CACHE `id` FROM `sym_sections` WHERE `handle` = 'general' LIMIT 1;
[0.0003] SELECT SQL_CACHE * FROM `sym_sections_association` AS `sa`, `sym_sections` AS `s` WHERE `sa`.`child_section_id` = 2 AND `s`.`id` = `sa`.`parent_section_id` ORDER BY `s`.`sortorder` ASC;
[0.0003] SELECT SQL_CACHE * FROM `sym_sections_association` AS `sa`, `sym_sections` AS `s` WHERE `sa`.`parent_section_id` = 2 AND `s`.`id` = `sa`.`child_section_id` ORDER BY `s`.`sortorder` ASC;
[0.0003] SELECT SQL_CACHE `s`.* FROM `sym_sections` AS `s` ORDER BY `s`.`sortorder` asc;
[0.0002] SELECT SQL_CACHE `id` FROM `sym_sections` WHERE `handle` = 'general' LIMIT 1;
[0.0002] SELECT SQL_CACHE `id` FROM `sym_sections` WHERE `handle` = 'general' LIMIT 1;
[0.0002] SELECT SQL_CACHE `id`, `element_name`, `type`, `location` FROM `sym_fields` WHERE `parent_section` = 2 ORDER BY `sortorder` ASC;
[0.0002] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(18);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_input` WHERE `field_id` IN (18);
[0.0001] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(19);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_textarea` WHERE `field_id` IN (19);
[0.0002] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(21);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_select` WHERE `field_id` IN (21);
[0.0002] SELECT SQL_CACHE CASE hide_association WHEN "no" THEN "yes" ELSE "no" END as show_association FROM `sym_sections_association` WHERE `child_section_field_id` = 21;
[0.0002] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(20);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_upload` WHERE `field_id` IN (20);
[0.0001] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(180);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_input` WHERE `field_id` IN (180);
[0.0001] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(22);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_input` WHERE `field_id` IN (22);
[0.0001] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(172);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_textarea` WHERE `field_id` IN (172);
[0.0001] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`id` IN(178);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_checkbox` WHERE `field_id` IN (178);
[0.0002] INSERT INTO `sym_entries` (`author_id`, `section_id`, `creation_date`, `modification_date`, `modification_date_gmt`, `creation_date_gmt`, `modification_author_id`) VALUES ('1', '2', '2020-03-02 16:31:40', '2020-03-02 16:31:40', '2020-03-02 16:31:40', '2020-03-02 16:31:40', '1');
[0.0002] SELECT SQL_CACHE `id`, `element_name`, `type`, `location` FROM `sym_fields` WHERE `parent_section` = 2 ORDER BY `sortorder` ASC;
[0.0002] SELECT SQL_CACHE `file`, `mimetype`, `size`, `meta` FROM `sym_entries_data_20` WHERE `entry_id` = 3796 LIMIT 1;
[0.0002] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`parent_section` = '2' ORDER BY t1.`sortorder` ASC;
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_input` WHERE `field_id` IN (18,180,22);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_textarea` WHERE `field_id` IN (19,172);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_select` WHERE `field_id` IN (21);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_upload` WHERE `field_id` IN (20);
[0.0001] SELECT SQL_CACHE * FROM `sym_fields_checkbox` WHERE `field_id` IN (178);
[0.0001] UPDATE sym_entries SET `modification_author_id` = '1', `modification_date` = '2020-03-02 16:31:40', `modification_date_gmt` = '2020-03-02 16:31:40' WHERE `id` = 3796;
[0.0005] SHOW TABLES LIKE 'sym_entries_data_18';
[0.0001] LOCK TABLES `sym_entries_data_18` WRITE;
[0.0001] DELETE FROM `sym_entries_data_18` WHERE `entry_id` = 3796;
[0.0001] INSERT INTO `sym_entries_data_18` (`entry_id`, `value`, `handle`) VALUES ('3796', 'test', 'test');
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_19';
[0.0001] LOCK TABLES `sym_entries_data_19` WRITE;
[0.0001] DELETE FROM `sym_entries_data_19` WHERE `entry_id` = 3796;
[0.0002] INSERT INTO `sym_entries_data_19` (`entry_id`, `value`, `value_formatted`) VALUES ('3796', 'test', '<p>test</p>\n');
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_21';
[0.0001] LOCK TABLES `sym_entries_data_21` WRITE;
[0.0001] DELETE FROM `sym_entries_data_21` WHERE `entry_id` = 3796;
[0.0001] INSERT INTO `sym_entries_data_21` (`entry_id`, `value`, `handle`) VALUES ('3796', 'White', 'white');
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_20';
[0.0001] LOCK TABLES `sym_entries_data_20` WRITE;
[0.0001] DELETE FROM `sym_entries_data_20` WHERE `entry_id` = 3796;
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_180';
[0.0001] LOCK TABLES `sym_entries_data_180` WRITE;
[0.0001] DELETE FROM `sym_entries_data_180` WHERE `entry_id` = 3796;
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_22';
[0.0001] LOCK TABLES `sym_entries_data_22` WRITE;
[0.0001] DELETE FROM `sym_entries_data_22` WHERE `entry_id` = 3796;
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_172';
[0.0001] LOCK TABLES `sym_entries_data_172` WRITE;
[0.0001] DELETE FROM `sym_entries_data_172` WHERE `entry_id` = 3796;
[0.0001] UNLOCK TABLES;
[0.0004] SHOW TABLES LIKE 'sym_entries_data_178';
[0.0001] LOCK TABLES `sym_entries_data_178` WRITE;
[0.0001] DELETE FROM `sym_entries_data_178` WHERE `entry_id` = 3796;
[0.0001] INSERT INTO `sym_entries_data_178` (`entry_id`, `value`) VALUES ('3796', 'no');
[0.0001] UNLOCK TABLES;
[0.0002] SELECT SQL_CACHE t1.* FROM sym_fields AS `t1` WHERE 1 AND t1.`type` = 'reflection' AND t1.`parent_section` = '2' ORDER BY t1.`sortorder` ASC; 

PHP 7.2+ compatibility issue line 302 count() parameter must be an array

  • Sym: 2.7.10
  • PHP: 7.3
  • Search Index: 0.9.5

Steps to reproduce:

Visit /symphony/extension/search_index/indexes/ via the Search Index > Indexes menu item.

From php.net: Since PHP 7.2.0: count() will now yield a warning on invalid countable types passed to the array_or_countable parameter.

If I roll back to PHP 7.1 all is well.

Symphony Warning: count(): Parameter must be an array or an object that implements Countable

An error occurred in /path-redacted/extensions/search_index/content/content.indexes.php around line 302

    297  
    298  if ($index) {
    299      $col_name->appendChild(Widget::Input("items[{$section->get('id')}]", null, 'checkbox'));
    300  }
    301  
    302  if ($index && isset($index['fields']) && count($index['fields'] > 0)) {
    303      $section_fields = $section->fetchFields();
    304      $fields = $this->_indexes[$section->get('id')]['fields'];
    305      $fields_list = '';
    306      foreach($section_fields as $section_field) { 
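
The warning on line 302 appears to come from a misplaced parenthesis: `count($index['fields'] > 0)` passes the boolean result of the comparison to `count()`, which PHP 7.2+ reports. Below is a minimal sketch of the check that was presumably intended. `has_indexed_fields()` is a hypothetical helper, not part of the extension.

```php
<?php
/**
 * Return true when the index configuration lists at least one field.
 * Hypothetical helper; $index mirrors the array shape used in the snippet above.
 */
function has_indexed_fields($index): bool
{
    return is_array($index)
        && isset($index['fields'])
        && is_array($index['fields'])
        // The comparison sits outside count(), so count() always receives an array.
        && count($index['fields']) > 0;
}
```

In the original condition, the equivalent change would be moving the closing parenthesis so that it reads `count($index['fields']) > 0`.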

Backtrace

    [/path-redacted/extensions/search_index/content/content.indexes.php:302]
        GenericErrorHandler::handler();
    [/path-redacted/symphony/lib/toolkit/class.administrationpage.php:801]
        contentExtensionSearch_IndexIndexes->__viewIndex();
    [/path-redacted/symphony/lib/toolkit/class.administrationpage.php:751]
        AdministrationPage->__switchboard();
    [/path-redacted/symphony/lib/toolkit/class.administrationpage.php:496]
        AdministrationPage->view();
    [/path-redacted/extensions/search_index/content/content.indexes.php:36]
        AdministrationPage->build();
    [/path-redacted/symphony/lib/core/class.administration.php:205]
        contentExtensionSearch_IndexIndexes->build();
    [/path-redacted/symphony/lib/core/class.administration.php:483]
        Administration->__buildPage();
    [/path-redacted/symphony/lib/boot/func.utilities.php:253]
        Administration->display();
    [/path-redacted/symphony/lib/boot/func.utilities.php:235]
        symphony_launcher();
    [/path-redacted/index.php:19]
        symphony();

Database Query Log

    [0.0001] SET character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8';
    [0.0000] SET CHARACTER SET 'utf8';
    [0.0001] SET time_zone = '+01:00';
    [0.0005] SELECT SQL_CACHE t1.name, t2.page, t2.delegate, t2.callback FROM `sym_extensions` as t1 INNER JOIN `sym_extensions_delegates` as t2 ON t1.id = t2.extension_id WHERE t1.status = 'enabled' ORDER BY t2.delegate, t1.name;
    [0.0002] SELECT SQL_CACHE `session_data` FROM `sym_sessions` WHERE `session` = 'session-token-redacted' LIMIT 1;
    [0.0002] SELECT SQL_CACHE a.* FROM `sym_authors` AS `a` WHERE `username` = 'user-redacted' ORDER BY a.id ASC LIMIT 1;
    [0.0002] UPDATE sym_authors SET `last_seen` = 'time-redacted' WHERE `id` = 1;
    [0.0003] SELECT SQL_CACHE `s`.* FROM `sym_sections` AS `s` ORDER BY `s`.`name` ASC;
    [0.0002] SELECT SQL_CACHE `name` FROM `sym_extensions` WHERE `status` = 'enabled';
    [0.0001] SELECT SQL_CACHE * FROM `sym_extensions`;
    [0.0003] SELECT SQL_CACHE `s`.* FROM `sym_sections` AS `s` ORDER BY `s`.`sortorder` asc; 

Search - Strange Excerpt

I am not sure whether I have set something up incorrectly, but I think this one qualifies as a bug.

Once the search index is created properly and set to pull from multiple sources (using the custom data source), searching for the word 'search' produces something odd in the excerpt itself. (I noticed this because the first word I happened to enter was 'search'.)

SEARCH_INDEX_ELIPSIS

The excerpt appears enclosed within a pair of the above. I assume this is some automatically generated text. It is not a big issue, but if it has gone unnoticed so far I think it is worth reporting, since people will be puzzled if all their results contain that string. (I just installed the latest master; the rest of the process went quite smoothly.)

Possible XSS vulnerability?

I implemented Search Index in a site recently and already notice XSS attack attempts popping up in the logs.

While I don't think there are serious issues, one keyword does result in an XSLT error:

loadXML(): attributes construct error in Entity, line: 275
loadXML(): Couldn't find end of Start Tag keyword line 275 in Entity, line: 275

I am hesitant to post the triggering keyword publicly, but I could email you more details privately.
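
The `loadXML()` errors suggest that a keyword containing markup characters reached the XML unescaped. Below is a minimal sketch of escaping a keyword before it is written into an attribute, assuming the keyword is emitted as an attribute value. `keyword_element()` and the `keywords` element name are hypothetical and only illustrate the escaping step.

```php
<?php
/**
 * Build an XML element carrying the user-supplied keyword as an attribute.
 * Hypothetical helper; the element and attribute names are assumptions.
 */
function keyword_element(string $keyword): string
{
    // Escape &, <, > and both quote styles using XML 1.0 entities;
    // invalid UTF-8 sequences are substituted instead of blanking the string.
    $escaped = htmlspecialchars($keyword, ENT_QUOTES | ENT_XML1 | ENT_SUBSTITUTE, 'UTF-8');

    return '<keywords value="' . $escaped . '" />';
}
```

With this in place, a keyword such as `"><b>` stays inside the attribute value rather than terminating it.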

Problem with search_index/lib/class.search_index.php

Hey Nick. I installed the Search Index extension, and it installed almost without issue. The one exception: the Search Index Logs item gives a fatal error if your database uses a table name prefix other than 'sym_'.

In the search_index/lib/class.search_index.php file, the table name is hard-coded as 'sym_search_index_logs'; changing it to 'tbl_search_index_logs' fixes the error.
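
As far as I understand, Symphony rewrites a `tbl_` placeholder to the configured table prefix when a query runs, which would explain why `tbl_search_index_logs` works regardless of the prefix. Below is a rough standalone sketch of that kind of substitution. `apply_table_prefix()` is a hypothetical helper for illustration, not Symphony's own implementation.

```php
<?php
/**
 * Replace the generic "tbl_" placeholder in a query with the site's table prefix.
 * Hypothetical helper; Symphony's database class does its own substitution.
 */
function apply_table_prefix(string $query, string $prefix): string
{
    return preg_replace_callback(
        '/\btbl_(\w+)/',
        // A callback avoids treating "$" or "\" in the prefix as replacement syntax.
        function (array $m) use ($prefix): string {
            return $prefix . $m[1];
        },
        $query
    );
}
```

For example, with a prefix of `abc_`, the placeholder table name `tbl_search_index_logs` becomes `abc_search_index_logs`.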

Logs not working correctly

If I change get-param-keywords in config.php, I can't get the logs working.

For example, this does not work, even though I do use "search" as a GET variable in the URL (?search=something):
'get-param-keywords' => 'search',

But it worked when it was the default "keywords" and I used ?keywords=something

The search is working as expected, just the keywords are not logged.

Edit:

The setting for logging keywords is set to yes: 'log-keywords' => 'yes',
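
One possible cause is that the logging code reads a hard-coded `keywords` parameter instead of the configured name. Below is a minimal sketch of resolving the keywords through the configured parameter name. `keywords_from_request()` and the array shapes are assumptions for illustration, not the extension's actual code.

```php
<?php
/**
 * Read the search keywords from the query string using the configured
 * parameter name, falling back to the default "keywords".
 * Hypothetical helper; $config mimics the search_index settings array.
 */
function keywords_from_request(array $config, array $get): string
{
    $param = (isset($config['get-param-keywords']) && $config['get-param-keywords'] !== '')
        ? $config['get-param-keywords']
        : 'keywords';

    // Ignore missing or non-string values (e.g. ?search[]=x).
    return (isset($get[$param]) && is_string($get[$param])) ? trim($get[$param]) : '';
}
```

If both searching and logging resolve the parameter through the same function, a custom name such as `search` should be picked up in both places.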

Missing latest release?

According to #52, the 0.9.3 release should provide Symphony 2.6.x compatibility. The current master has a 0.9.3 release entry in its extension.meta.xml, but it does not work as described in #52, and the latest GitHub tag I can find is 0.9.1.

Have the latest updates gone missing somehow?
