Code Monkey home page Code Monkey logo

Comments (44)

evenfrost avatar evenfrost commented on June 12, 2024 7

Thanks for the helper @Pysis868, I took the liberty of modifying it a little bit and made a function to format the fuse.search output right away. It returns input objects with modified highlighted values (using TypeScript):

const highlight = (fuseSearchResult: any, highlightClassName: string = 'highlight') => {
  const set = (obj: object, path: string, value: any) => {
      const pathValue = path.split('.');
      let i;

      for (i = 0; i < pathValue.length - 1; i++) {
        obj = obj[pathValue[i]];
      }

      obj[pathValue[i]] = value;
  };

  const generateHighlightedText = (inputText: string, regions: number[] = []) => {
    let content = '';
    let nextUnhighlightedRegionStartingIndex = 0;

    regions.forEach(region => {
      const lastRegionNextIndex = region[1] + 1;

      content += [
        inputText.substring(nextUnhighlightedRegionStartingIndex, region[0]),
        `<span class="${highlightClassName}">`,
        inputText.substring(region[0], lastRegionNextIndex),
        '</span>',
      ].join('');

      nextUnhighlightedRegionStartingIndex = lastRegionNextIndex;
    });

    content += inputText.substring(nextUnhighlightedRegionStartingIndex);

    return content;
  };

  return fuseSearchResult
    .filter(({ matches }: any) => matches && matches.length)
    .map(({ item, matches }: any) => {
      const highlightedItem = { ...item };

      matches.forEach((match: any) => {
        _.set(highlightedItem, match.key, generateHighlightedText(match.value, match.indices));
      });

      return highlightedItem;
    });
};

// usage:

const res = highlight(fuse.search(text)); // array of items with highlighted fields

Also, here is the gist.

from fuse.

Pysis868 avatar Pysis868 commented on June 12, 2024 5

Saw krisk's sample highlighting method. It was useful as a reference! I made a slightly different one that uses substring functions instead of going by each character:

// Does not account for overlapping highlighted regions, if that exists at all O_o..
function generateHighlightedText(text, regions) {
  if(!regions) return text;

  var content = '', nextUnhighlightedRegionStartingIndex = 0;

  regions.forEach(function(region) {
    content += '' +
      text.substring(nextUnhighlightedRegionStartingIndex, region[0]) +
      '<span class="highlight">' +
        text.substring(region[0], region[1]) +
      '</span>' +
    '';
    nextUnhighlightedRegionStartingIndex = region[1];
  });

  content += text.substring(nextUnhighlightedRegionStartingIndex);

  return content;
};

Added a comment thinking it could be more complex, but not sure if anybody needs overlapping highlights yet. Not sure how to implement it that way, but a different matches data structure would probably be needed, like having a sort of tree array.

Also, I know Krisk mentioned this is not really a feature for the core library, but as I was making some other methods, maybe some formatters inserted to be run at certain event/hook points could be in order. Maybe that could be how one would implement a plugin system for the library anyway, but even generally anyone could use that feature immediately if they wanted too.

from fuse.

krisk avatar krisk commented on June 12, 2024 3

For reference, once you have the matched indices, you can then do a simple string builder to format the text according to those indices:

var text = 'hello world' // assume this is the text 
var result = []
var matches = [[1, 3], [6, 8]] // assume these are the matched indices
var pair = matches.shift()
// Build the formatted string
for (var i = 0; i < text.length; i++) {
  var char = text.charAt(i)
  if (pair && i == pair[0]) {
    result.push('<b>')
  }
  result.push(char)
  if (pair && i == pair[1]) {
    result.push('</b>')
    pair = matches.shift()
  }
}
console.log(result.join(''))
// h<b>ell</b>o <b>wor</b>ld

from fuse.

farzher avatar farzher commented on June 12, 2024 3

Sometimes pair[0] is greater than pair[1], which messes up the loop that inserts wrapper html.

Idk too many details about this, I just wanted to make you aware that this happens sometimes

from fuse.

krisk avatar krisk commented on June 12, 2024 2

Ah, that is a highly desirable feature. However, there's one problem with it: Fuse is DOM agnostic.

Specifically, remember that Fuse operates on data objects only - in very simple terms, it iterates over text data and performs a fuzzy match algorithm. Ergo, the highlighting would have to be handled by something other than Fuse.

Nonetheless, what Fuse could do, however, is return all indices of the matches in the form of [index, length]:

Search: rown

  • dan b rown --> [[6, 4]]
  • ro b gra n t --> [[0, 2], [9, 1]]

With these indices, you can do a simple substr to highlight the text as you see fit.

I will think about this some more. Thanks!

from fuse.

titanism avatar titanism commented on June 12, 2024 1

https://github.com/tricinel/highlight-words

from fuse.

Abhinav1217 avatar Abhinav1217 commented on June 12, 2024

Thanks.. substring worked after a few tries. There is a visible performance difference so now it is clear why you didn't have this feature. But this is, as you said, is a highly desirable feature. I have chained the output to the display function like var result = displayResult(f.search('brwn')).
I am not a good plugin writer but i have seen lots of developer, writing there libraries allowing for extending it via helper plugins( eg Taffy db). Is there some scope to create plugins of similar desirable features for this library. As to keep the main library fast and efficient, but adding extra functionality for those who want it.
I have (highly unprofessional) code for highlighting the output and for sorting the output based on its rank. Currently i use it like this.

var a = f.search("bir");
var b = sort(a); //this is my own sort function
var result = displayResult(b);

Using them together(or individually) sure has a visible performance deterioration, but I am recently graduated student and so are my codes. And this adds a little extra spice my application needed.

Do think about extending with extra plugins. It will add shines to this already excellent masterpiece.

from fuse.

janraasch avatar janraasch commented on June 12, 2024

return all indices of the matches in the form of [index, length].

+1

from fuse.

fxck avatar fxck commented on June 12, 2024

still planning on doing this?

from fuse.

krisk avatar krisk commented on June 12, 2024

@Abhinav1217, @fxck, I'm working on this right now. Pardon the delay.

from fuse.

fxck avatar fxck commented on June 12, 2024

Great!

from fuse.

smithamax avatar smithamax commented on June 12, 2024

👍

from fuse.

fxck avatar fxck commented on June 12, 2024

Hey @krisk, has it proven to be too difficult, or you are just too busy at the moment?

from fuse.

KarlosF13 avatar KarlosF13 commented on June 12, 2024

When's this comming?

from fuse.

FiloSottile avatar FiloSottile commented on June 12, 2024

+1

from fuse.

krisk avatar krisk commented on June 12, 2024

@fxck, @KarlosF13, @FiloSottile, @janraasch, and everyone else...

Sorry fellas, I've being extremely busy with everything else. The good news is that I have taken the entire month of December off work, so that should give me plenty of time to work on other things - this feature among them.

In the meantime, if anyone wants to have a go at it, feel free to submit a pull request!

from fuse.

fxck avatar fxck commented on June 12, 2024

Still looking forward to this :)

from fuse.

yeroshek avatar yeroshek commented on June 12, 2024

+1

from fuse.

hmvs avatar hmvs commented on June 12, 2024

+1 to return indices

from fuse.

mhkeller avatar mhkeller commented on June 12, 2024

In the meantime, for those looking to implement in a UI, this might help https://github.com/jharding/bearhug

from fuse.

fxck avatar fxck commented on June 12, 2024

That itself is not a problem. Problem is that

all indices of the matches in the form of [index, length].

this looks promising, it's a part of atom, made for nodejs, but could probably be used on client side as well, after few modifications

from fuse.

janraasch avatar janraasch commented on June 12, 2024

@fxck That is a very neat idea, thanks. Seems like the only hard nodejs dependency is the path module for require('path').sep. Should be easy to bundle up with browserifyor even simpler by providing a mock path module to be required.

Of course one has to be okay with the use case:

This library is used by Atom and so its focus will be on scoring and filtering paths, methods, and other things common when writing code. It therefore will specialize in handling common patterns in these types of strings such as characters like /, -, and _, and also handling of camel cased text.

from fuse.

fxck avatar fxck commented on June 12, 2024

any progress @krisk ? fusejs is still the only viable fuzzy search around, hightlight is the only thing it is missing :)

from fuse.

krisk avatar krisk commented on June 12, 2024

Wow. Long hiatus here. Feature was suggested almost 2 years ago. Mea culpa. Confession: still didn't get around to doing this, but clearly there's still some interest in this feature. Anyone interested in tackling this problem? As it stands, the underlying bitap algorithm would most likely have to change to support this.

from fuse.

Abhinav1217 avatar Abhinav1217 commented on June 12, 2024

This really still is a desirable feature. I haven't worked with fuse since the time I asked the question. As far as I remember, I chained the outputs to another function which used a regex script I found on gist, and modified it to insert <b></b> tag around the matched string, Before I displayed it.
That itself have a significant performance degradetion, but it was usable for project at thrashold of 0.0.

I would re-quote from my last reply here,

Do think about extending it with extra plugins. It will add shines to this already excellent masterpiece.

from fuse.

dalgard avatar dalgard commented on June 12, 2024

+1

What @fxck said – the library is great, highlighting is the only thing that's really missing.

from fuse.

AdamGaskins avatar AdamGaskins commented on June 12, 2024

I totally support this! This is the one thing this library is missing.

from fuse.

AdamGaskins avatar AdamGaskins commented on June 12, 2024

I ended up modifying fuse.js to get it to work the way I needed, although I'm using a custom searcher. I'm super busy, but maybe I'll fork and recreate the changes if I get a spare moment.

Basically what I did in the searcher is this:

    return {
        isMatch: true,
        score: totalScore,
        highlight: calculatedHighlightedText,
    };

Then I added an additional option to Fuse:

        search = new Fuse(books, {
            keys: k,
            searchFn: values,
            include: ['score', 'highlight']
        });

Very simple to implement, and I was only digging in the code for a few minutes. Then of course the main task would be to implement the highlighting efficiently in BitapSearcher, which I didn't really look at (since I'm using a custom searcher).

Hope that helps someone! If not, like I said, I may come along and implement this myself at some point.

from fuse.

mapsam avatar mapsam commented on June 12, 2024

+1 thanks!

from fuse.

marcoala avatar marcoala commented on June 12, 2024

Any news?

from fuse.

marcoala avatar marcoala commented on June 12, 2024

I digged a little bit into the code and I've something working, still need some test and better name for variables, but the basic idea is to have a charmask (not sure if is a thing) that represent the match, so:

marcoala@af05432

Search: rown
dan brown --> '000001111'
rob grant --> '11000010'

0 for not match, and 1 for match. Maybe the final can be represented with something in the form of [index, length] as pointed before, but I didn't figure out a way to do it without passing for the mask first.
So @kriskwhat do you think? If you like the approach I can proceed with writing test and submitting an actual pull request.

from fuse.

krisk avatar krisk commented on June 12, 2024

@marcoala, thanks for looking into this 👍

Instead of a character mask, I've made some changes to the codebase to use a match mask array, which simplifies the computation, and returns the matched indices:

var items = ['FH Mannheim', 'University Mannheim']
var fuse = new Fuse(items, {
  include: ['score', 'matchedIndices']
})
var result = fuse.search('Uni Mannheim')

// output
[{
  item: 1,
  matchedIndices: [ [ 0, 2 ], [ 4, 4 ], [ 7, 7 ], [ 10, 18 ] ],
  score: 0.32
}, {
  item: 0,
  matchedIndices: [ [ 1, 10 ] ],
  score: 0.25
}]

The matched indices are simply of the for [start, end], One can then use the indices as follow:

[ [ 0, 2 ], [ 4, 4 ], [ 7, 7 ], [ 10, 18 ] ] --> University Mannheim
[ [ 1, 10 ] ] --> FH Mannheim

Another approach could be to use [start, length]. I'm unsure as to which one is preferable. If anyone is particularly enamored with one approach, speak now :)

I've published a beta version. Thoughts?

Also to consider: the matched indices would be useless if there's no way to tell for which key did the searched object and indices belong. For example:

var items = [{
  name: 'FH Mannheim',
  title: 'a Uni'
}, {
  name: 'University Mannheim',
  title: 'b Uni'
}]

var fuse = new Fuse(items, {
  keys: ['name', 'title'],
  include: ['score', 'matchedIndices']
})

Working on this right now...

from fuse.

krisk avatar krisk commented on June 12, 2024

Ok, I'm thinking this instead:

Input:

var items = [{
  name: 'FH Mannheim',
  title: 's Uni'
}, {
  name: 'University Mannheim',
  title: 'd Uni'
}]
var fuse = new Fuse(items, {
  verbose: true,
  keys: ['name', 'title'],
  include: ['score', 'matches']
})
var result = fuse.search('Uni')

Output:

[{
  item: {
    name: 'University Mannheim',
    title: 'd Uni',
    author: {
      more: 'un'
    }
  },
  matches: [{
    key: 'name',
    indices: [[0, 2]]
  }, {
    key: 'title',
    indices: [[2, 4]]
  }],
  score: 0.11811111111111111
}, {
  item: {
    name: 'FH Mannheim',
    title: 's Uni',
    author: {
      more: 'un'
    }
  },
  matches: [{
    key: 'title',
    indices: [[2, 4]]
  }],
  score: 0.17666666666666667
}]

This way you'll know to which keys the matched indices belong.

from fuse.

janraasch avatar janraasch commented on June 12, 2024

Sounds great. Is that part of 2.2.0-beta?

from fuse.

krisk avatar krisk commented on June 12, 2024

@janraasch yes, I just updated it. Have not published to npm yet. Nonetheless, try it out. Let me know what you think. I still need to create a list of tests against it.

from fuse.

janraasch avatar janraasch commented on June 12, 2024

Alright, cool. I'll check it out.

from fuse.

janraasch avatar janraasch commented on June 12, 2024

Works great.

from fuse.

krisk avatar krisk commented on June 12, 2024

@janraasch glad to hear. I'll run through some tests, and perhaps intermittently there will be some feedback from this thread. If all goes well, we can expect version 2.2.0 before the end of the week.

from fuse.

janraasch avatar janraasch commented on June 12, 2024

I'd say, we wait for 29th, then this thread will be 3 years old. 🎯 👯

😉 (I am kidding of course, I am actually really glad, that this library is still under development and even older threads like this are alive and kickin!)

from fuse.

marcoala avatar marcoala commented on June 12, 2024

@krisk I like your of approach, mine was just a rude attempt to hack a solution, I just spent an hour in an unknown code.

As final user the couples [start,end] give me the opportunity to do nice things like "highlight only group of at least 3 characters" or "highlight at most 3 group of match", to avoid things like 'example string'.

EDIT: can't test this weekend, I'll try to give you some feedback next week.

from fuse.

krisk avatar krisk commented on June 12, 2024

@janraasch yes, 3 years later 😄. Mea culpa. It took @marcoala to kickstart it, so many thanks to him!

from fuse.

Abhinav1217 avatar Abhinav1217 commented on June 12, 2024

Wow. Works great. so far. I did get some unwanted results. Will post some results after I find some definite pattern.

from fuse.

krisk avatar krisk commented on June 12, 2024

@Abhinav1217 any word on the unwanted results? Thanks! 😄

from fuse.

krisk avatar krisk commented on June 12, 2024

Also, does anyone want to volunteer to create more test cases, and adding documentation to the README? I'd be forever indebted 😅

from fuse.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.