Speed up sorting / reduce overhead of sorting (carmen-cache, open issue)

mapbox commented on August 16, 2024

Speed up sorting / reduce overhead of sorting

from carmen-cache.

Comments (1)

springmeyer commented on August 16, 2024

Update on this ticket: after landing #126 (and after the concern that the prefiltering in #126 might be problematic faded away, see #131), I think the bottleneck described above is likely somewhat mitigated. We saw that contextSortByRelev was a major hotspot when, I presume, we were sorting hundreds or thousands of contexts. After the prefiltering in #126 I think we'll be sorting fewer. However, this should all be tested. Someone interested in performance could:

- profile with perf in production to see whether contextSortByRelev is still a bottleneck (https://jvns.ca/perf-zine.pdf)
- if it is, print out (or debug with lldb/gdb) how many context objects we're sorting now. Are we still sorting an obscene amount for some queries? If so, perhaps we could be even more aggressive than I was in #126 by doing something like:

diff --git a/src/coalesce.cpp b/src/coalesce.cpp
index 9f517d3..c6e2ae5 100644
--- a/src/coalesce.cpp
+++ b/src/coalesce.cpp
@@ -312,7 +312,7 @@ inline std::vector<Context> coalesceMulti(std::vector<PhrasematchSubq>& stack, c
                 } else if (covers[0].mask > covers[1].mask) {
                     context_relev -= 0.01;
                 }
-                if (maxrelev - context_relev < .25) {
+                if (maxrelev - context_relev < .15) {
                     contexts.emplace_back(std::move(covers), context_mask, context_relev);
                 }
             } else if (first || covers.size() > 1) {
@@ -333,7 +333,7 @@ inline std::vector<Context> coalesceMulti(std::vector<PhrasematchSubq>& stack, c
     // append coalesced to contexts by moving memory
     for (auto&& matched : coalesced) {
         for (auto&& context : matched.second) {
-            if (maxrelev - context.relev < .25) {
+            if (maxrelev - context.relev < .15) {
                 contexts.emplace_back(std::move(context));
             }
         }
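
To get a feel for how much a tighter cutoff would shrink the sort input before touching coalesce.cpp, something like the following standalone sketch might help. It is not the carmen-cache code: `Ctx`, `prefilter` and `sortAndCount` are hypothetical names, `Ctx` only carries a `relev` field, the comparator is a simplified stand-in for contextSortByRelev, and `maxrelev` is computed in a separate pass up front (an assumption made to keep the sketch short).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for carmen-cache's Context; only relev is modelled.
struct Ctx {
    double relev;
};

// Keep only contexts whose relev is within `cutoff` of the best relev.
// Mirrors the `maxrelev - context.relev < cutoff` test in the diff above.
// Assumption: maxrelev is found in a separate pass over the input.
inline std::vector<Ctx> prefilter(std::vector<Ctx> const& in, double cutoff) {
    double maxrelev = 0.0;
    for (auto const& c : in) {
        maxrelev = std::max(maxrelev, c.relev);
    }
    std::vector<Ctx> out;
    out.reserve(in.size());
    for (auto const& c : in) {
        if (maxrelev - c.relev < cutoff) {
            out.push_back(c);
        }
    }
    return out;
}

// Sort by relev, highest first (simplified stand-in for contextSortByRelev),
// and report on stderr how many contexts went through the sort.
inline std::size_t sortAndCount(std::vector<Ctx>& contexts) {
    std::sort(contexts.begin(), contexts.end(),
              [](Ctx const& a, Ctx const& b) { return a.relev > b.relev; });
    std::fprintf(stderr, "sorted %zu contexts\n", contexts.size());
    return contexts.size();
}
```

For example, with relev values 0.8, 0.5, 1.0, 0.7 and 0.9, a cutoff of .25 keeps three contexts and a cutoff of .15 keeps two. Logging the same count per query in a debug build would show whether some queries still sort an obscene number of contexts.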