Comments (5)
- There is no result for two of ten queries. Did I miss anything in the parameter settings?
I do not have any idea about that at this moment.
index.search(sc);
Could you exchange the line above for the line below and run again?
index.linearSearch(sc);
linearSearch performs an exhaustive linear search without using the index. If it also returns no result, the cause is invalid queries. If it does return results, the missing results are caused by the index.
- Is there a way that we can reuse the objects to avoid allocating memory for wide-range and high-frequency queries?
You can move the line below to the outside of the for statement.
NGT::ObjectDistances objects;
The line below might be needed within the for statement.
objects.clear();
- Are there other parameters or compilation options that I can try to improve the CPU performance?
- You may want to use ONNG as below.
https://github.com/yahoojapan/NGT/tree/master/bin/ngt#onng
- You may want to open the index with a read only flag as below to improve the search performance.
https://github.com/yahoojapan/NGT/blob/master/lib/NGT/Command.cpp#L402
from ngt.
Thank you, @masajiro! It is very helpful.
- There is no result for two of ten queries. Did I miss anything in the parameter settings?
I do not have any idea about that at this moment.
index.search(sc);
Could you exchange the line above for the line below and run again?
index.linearSearch(sc);
linearSearch performs an exhaustive linear search without using the index. If it also returns no result, the cause is invalid queries. If it does return results, the missing results are caused by the index.
For the first issue, I found it is actually related to invalid query data. Thanks for your suggestions on debugging this situation.
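For anyone hitting the same symptom, a pre-check on the query data can catch this before searching. The thread does not say what made the queries invalid, so the sketch below only assumes two common causes: a wrong dimension or non-finite values. isValidQuery is a hypothetical helper, not part of NGT.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of NGT: returns true when the query has the
// expected dimension and every component is a finite number (no NaN or inf).
bool isValidQuery(const std::vector<float>& query, std::size_t expectedDimension) {
  if (query.size() != expectedDimension) {
    return false;
  }
  for (float value : query) {
    if (!std::isfinite(value)) {
      return false;
    }
  }
  return true;
}

// Small self-check illustrating the intended behavior.
void checkExamples() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(isValidQuery({0.1f, 0.2f, 0.3f}, 3));   // well-formed query
  assert(!isValidQuery({0.1f, 0.2f}, 3));        // wrong dimension
  assert(!isValidQuery({0.1f, nan, 0.3f}, 3));   // contains NaN
}
```

Calling such a check before index.search(sc) and logging the rejected queries would separate data problems from index problems.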
- Is there a way that we can reuse the objects to avoid allocating memory for wide-range and high-frequency queries?
You can move the line below to the outside of the for statement.
NGT::ObjectDistances objects;
The line below might be needed within the for statement.
objects.clear();
I tried your suggestions. I noticed that ObjectDistances inherits from vector<ObjectDistance>, and clearing the vector incurs additional deallocation cost across all results; it is substantial when we have tens of thousands of results. On the other hand, it seems ObjectDistance could be reused, given that it only holds an id and distance pair.
If reusing ObjectDistance to reduce memory cost seems reasonable to you, I can help with a pull request.
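As a point of reference for the reuse question, here is a minimal standard-library sketch of what clear() does to a plain std::vector. The ObjectDistance struct below is a hypothetical stand-in with just an id and a distance, not the NGT class, so this says nothing definite about what was measured inside NGT. In standard C++, clear() sets the size to zero and leaves the capacity unchanged, so refilling to the same size reuses the same buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for NGT::ObjectDistance: an (id, distance) pair.
struct ObjectDistance {
  unsigned int id;
  float distance;
};

// Fills `results` with `count` dummy entries, reusing whatever storage it has.
void fillResults(std::vector<ObjectDistance>& results, std::size_t count) {
  results.clear();  // size becomes 0; capacity is left unchanged
  for (std::size_t i = 0; i < count; ++i) {
    results.push_back({static_cast<unsigned int>(i), static_cast<float>(i)});
  }
  assert(results.size() == count);
}

// Returns true if a second fill of the same size reuses the existing buffer.
bool reusesBuffer(std::size_t count) {
  std::vector<ObjectDistance> results;
  fillResults(results, count);
  const ObjectDistance* firstBuffer = results.data();
  const std::size_t firstCapacity = results.capacity();
  fillResults(results, count);  // no reallocation expected here
  return results.data() == firstBuffer && results.capacity() == firstCapacity;
}
```

If a profile still shows allocation cost per query, it may come from somewhere other than the clear() call itself, for example from a results container that is constructed inside the loop.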
- Are there other parameters or compilation options that I can try to improve the CPU performance?
- You may want to use ONNG as below.
https://github.com/yahoojapan/NGT/tree/master/bin/ngt#onng
- You may want to open the index with a read only flag as below to improve the search performance.
https://github.com/yahoojapan/NGT/blob/master/lib/NGT/Command.cpp#L402
I will further explore, thanks.
At the same time, I noticed two small issues:
- Compiler warnings: some variable declarations shadow variables in the enclosing scope. This triggers GCC warnings in verbose mode, and it would be great if we could clean them up as well.
- I noticed that we use iostream to output logs outside of the Command file, and I have not found a flag to turn off the logging. It slows things down and is distracting. Did I miss anything? Or could we add a flag so that we can disable it when needed?
Did I miss anything? Or could we add a flag so that we can disable it when needed?
There is no flag to turn off the logging. I am going to add a flag to control the logging in the near future.
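Until a library-level switch is available, one generic C++ workaround is to redirect the stream while the noisy call runs. The sketch below assumes the messages go to std::cerr, which is not verified against the NGT sources; StreamRedirect and noisyCall are hypothetical names, not NGT APIs. It hides the output but does not remove the cost of formatting the messages.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <streambuf>
#include <string>

// RAII guard: points a stream at another buffer for the guard's lifetime
// and restores the original buffer on destruction.
class StreamRedirect {
 public:
  StreamRedirect(std::ostream& stream, std::streambuf* target)
      : stream_(stream), saved_(stream.rdbuf(target)) {}
  ~StreamRedirect() { stream_.rdbuf(saved_); }
  StreamRedirect(const StreamRedirect&) = delete;
  StreamRedirect& operator=(const StreamRedirect&) = delete;

 private:
  std::ostream& stream_;
  std::streambuf* saved_;
};

// Hypothetical stand-in for a library call that logs to std::cerr.
void noisyCall() { std::cerr << "building index..." << std::endl; }

// Runs noisyCall() with std::cerr captured and returns what it wrote.
std::string captureNoisyCall() {
  std::ostringstream sink;
  {
    StreamRedirect guard(std::cerr, sink.rdbuf());
    noisyCall();
  }
  // The guard has gone out of scope, so std::cerr is back on its own buffer.
  assert(std::cerr.rdbuf() != sink.rdbuf());
  return sink.str();
}
```

Because std::cerr is process-wide, this is only safe when no other thread writes to it during the guarded call.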
- I noticed that we use iostream to output logs outside of the Command file, and I have not found a flag to turn off the logging. It slows things down and is distracting. Did I miss anything? Or could we add a flag so that we can disable it when needed?
I have added arguments to Index and Optimizer to disable the logging.
- Compiler warnings: some variable declarations shadow variables in the enclosing scope. This triggers GCC warnings in verbose mode, and it would be great if we could clean them up as well.
I removed most of the warnings that I found.