Code Monkey home page Code Monkey logo

k2nn's Introduction

Fastest CPU implementation of both a brute-force and a custom Multi-Index Hash Table accelerator system for matching 512-bit binary descriptors in 2NN mode, i.e., a match is returned if the best match between a query vector and a training vector is more than a certain threshold number of bits better than the second-best match.

Yes, that means the DIFFERENCE in popcounts is used for thresholding, NOT the ratio. This is the CORRECT approach for binary descriptors.

Both 8-bit and 16-bit MIH tables are supported. I currently recommend 16-bit.

All functionality is contained in the files K2NN.h and twiddle_table.h. 'main.cpp' is simply a sample test harness with example usage and performance testing.

Example initialization of Matcher class Matcher m(tvecs, size, qvecs, size, threshold, max_twiddles);

Options:

Brute-force complete (exact) match: m.bruteMatch();

Single twiddle pass for a very fast partial match, with no false positives (i.e. if a match is returned, it's truly the best match): m.fastApproxMatch();

Multi-index hash (MIH) complete (exact) match, with fall-back to brute force after max_twiddles passes: m.exactMatch();

Match until complete or until 'n' passes elapse (partial): m.approxMatchToNTwiddles(n);

Afterward, the finalized matches are waiting in the vector 'm.matches'.

k2nn's People

Contributors

komrad36 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

k2nn's Issues

Please document 'size' parameter

// Matcher<false> m(tvecs, size, qvecs, size, threshold, max_twiddles);

And for extra points, if you could kindly clarify threshold, is that the popcount difference (in %?) or the sum of hamming distance difference between any two 512bit binary descriptors?

Having trouble validating matches using opencv

Hi, and thank you very much for this code! I am using KORAL to get features and descriptors, and trying to match using K2NN. Using the below code:


	// ------------- KORAL ------------
	KORAL koral(scale_factor, scale_levels);
	KORAL koral2(scale_factor, scale_levels);

	int64 t0 = cv::getTickCount();

	koral.go(image.data, image.cols, image.rows, KFAST_thresh);
	koral2.go(image2.data, image2.cols, image2.rows, KFAST_thresh);

	int64 t1 = cv::getTickCount();
	double secs = (t1 - t0) / cv::getTickFrequency();
	std::cout << "features took: " << secs * 1000 << std::endl;


	cv::Mat image_with_kps, image_with_kps2;
	std::vector<cv::KeyPoint> converted_kps;
	std::vector<cv::KeyPoint> converted_kps2;

	for (const auto& kp : koral.kps) {
	
		const float scale = static_cast<float>(std::pow(scale_factor, kp.scale));
		converted_kps.emplace_back(scale*static_cast<float>(kp.x), scale*static_cast<float>(kp.y), 7.0f*scale, 180.0f / 3.1415926535f * kp.angle, static_cast<float>(kp.score));
	
	}
	for (const auto& kp : koral2.kps) {
	
		const float scale = static_cast<float>(std::pow(scale_factor, kp.scale));
		converted_kps2.emplace_back(scale*static_cast<float>(kp.x), scale*static_cast<float>(kp.y), 7.0f*scale, 180.0f / 3.1415926535f * kp.angle, static_cast<float>(kp.score));
	}


constexpr int runs = 50;
	constexpr int threshold = 5;
	constexpr int max_twiddles = 20;
	constexpr bool first_order = false;
	// --------------------------------
	
	Matcher<false> m(&koral.desc[0], static_cast<int>(koral.kps.size()), &koral2.desc[0], static_cast<int>(koral2.kps.size()), threshold, max_twiddles);
	std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();

	m.exactMatch();

	
	std::chrono::high_resolution_clock::time_point end = std::chrono::high_resolution_clock::now();

	const double sec = static_cast<double>(std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()) * 1e-9 / static_cast<double>(runs);
	std::cout << std::endl << "Brute force K2NN found " << m.matches.size() << " matches in " << sec * 1e3 << " ms" << std::endl;



//	std::vector<cv::DMatch> dmatches;
	std::vector<cv::KeyPoint> _kps1;
	std::vector<cv::KeyPoint> _kps2;

	for (int i = 0; i < m.matches.size(); ++i) {
				
	//		dmatches.emplace_back(m.matches[i].q, m.matches[i].t, 0.0f);
			_kps1.push_back(converted_kps[m.matches[i].q]);
			_kps2.push_back(converted_kps2[m.matches[i].t]);
	}
	
	
	cv::drawMarker(image, _kps1[0].pt, cv::Scalar(255, 0, 0), cv::MARKER_CROSS, 8, 1, 8);
	cv::drawMarker(image2, _kps2[0].pt, cv::Scalar(255, 0, 0), cv::MARKER_CROSS, 8, 1, 8);

	cv::imshow("1", image);
	cv::imshow("2", image2);

I get 800+ matches, in a very short time.

What i am having trouble with, is validating the matches. The last section of the code above I would expect to show me two markers, on the two images, in the same place, but they are far apart instead. Am I doing something wrong here? or do i need to filter to get 'good' matches after running K2NN? Thank you again!

Returning entire <Match> results of training data set, not just the query set

Finally (I hope), in the examples you have say 500 descriptors in frame A and 500 descriptors in frame B = fairly straight forwards, works perfect, fast.

but..

Assuming that std::vector<uint8_t> tvecs(64 * 10000); and std::vector<uint8_t> qvecs(64 * 200);

(Way more training vectors than query vectors)

I see that the following

    Matcher<false> m(tvecs.data(), 10000, qvecs.data(), 200, threshold, max_twiddles);
    m.bruteMatch(true);

would only return at most qvecs.size() (query) of queries, never a complete search across the entire tvecs (training), am I approaching this correctly? My goal is to compare a single set of descriptors across perhaps hundreds of thousands of descriptors and find descriptors that match according to a threshold in that huge set

fastApprox match always gives same feedback as bruteForce() match

Not quite sure if I'm misunderstanding the threshold settings, so at https://github.com/dgtlmoon/ORB_LATCH_K2NN I have a pipeline of ORB / LATCH / K2NN, my results are that when I --search using fast-approx I was expecting that I would get more (or atleast some) descriptors that match that are not in the same set that I searched for.

My app scans images in a directory and appends all of their descriptors to a training.dat file for a K2NN search

I'm seeing that ONLY the same collection of descriptors in my query exists in the training and nothing else... even tho a lot of the images should have atleast a couple of similar descriptors.

also suspicious is that with fastApprox() all of the .t's (training vector index) is contiguous.. I would have expected that atleast some might be partial (threshold) matches

so hmm.. not sure how I would rank images by "similarity" (or by their K2NN ranking) here

When qvecs and tvecs are the same, there are no results

I set the byte vector to 255 (b11111111)

diff --git a/main.cpp b/main.cpp
index 8e80285..574ab25 100644
--- a/main.cpp
+++ b/main.cpp
@@ -76,8 +76,8 @@ int main() {
        void *qvecs = malloc(64 * size), *tvecs = malloc(64 * size);
        srand(36);
        for (int i = 0; i < 64 * size; ++i) {
-               reinterpret_cast<uint8_t*>(qvecs)[i] = static_cast<uint8_t>(rand());
-               reinterpret_cast<uint8_t*>(tvecs)[i] = static_cast<uint8_t>(rand());
+               reinterpret_cast<uint8_t*>(qvecs)[i] = 255;
+               reinterpret_cast<uint8_t*>(tvecs)[i] = 255;
        }
        // --------------------------------

Brute force K2NN found 0 matches in 100.563 ms
Throughput: 0.994399 billion comparisons/second.

time problem and right match size

hello! I have run your code,when I use KNN exactMatch metchod and bruteMatch.And I found time using exactMatch is more slower than using bruteMatch .the other problem is when I use fastApproxMatch to match,and result is show have no match.Thank you

Questions

  1. What is "size" in this context? is it that there are 10,000; rows/vectors with 64 bit values in each one? (each of the 10,000 has 64 0 or 1 values?) I'm confused how you mention that it's suitable for 512 but the 512 value is not used here (Or I really don't understand C++)
	// ------------- Generation of Random Data ------------
	// obviously, this is not representative of real data;
	// it doesn't matter for brute-force matching
	// but the MIH methods will be much faster
	// on real data
	void *qvecs = malloc(64 * size), *tvecs = malloc(64 * size);
	srand(36);
	for (int i = 0; i < 64 * size; ++i) {
		reinterpret_cast<uint8_t*>(qvecs)[i] = static_cast<uint8_t>(rand());
		reinterpret_cast<uint8_t*>(tvecs)[i] = static_cast<uint8_t>(rand());
	}
	// --------------------------------

  1. After matching, the m.matches contains an array of Match, how does that Match relate back to the tvecs ?

std::cout << std::endl << m.matches[34].q << std::endl; gives me the output of 242 (the 34 just a random number)

Warming up...
Testing...

242

Brute force K2NN found 1334 matches in 100.403 ms
Throughput: 0.995982 billion comparisons/second.

So what is that 242? is it an index so that like it would be tvecs[242] ?

So for example, if I loaded 2 sets of binary descriptors with 200 descriptors per set - it would have to scan 400 binary descriptors, 242 would mean its the 42nd descriptor of the 2nd set?

vector index going out of range in function exactmatch

Hi,
You have a really great set of tools here with LATCH and binary matcher implementations. I have tried k2NN matcher with LATCH, and it works. But I found the opencv bruteforce matcher faster than the regular bruteforce matcher from you. Since I wanted to try the MIH matcher to increase speed, I tried with the function exactmatch(). But the problem is, somehow the one async call to the function _remainderBruteMatch() drives the index of vector named "match_idxs" out of bounds. The exception occurs exactly on line 534 in the file K2NN.h.
Could it be possible that I'm doing something wrong? I tried varying the size of descriptor arrays, and also the number of twiddle tries before it goes to bruteforce. But the results are the same. Could you please check it out?
Thanks a lot.

Build error for fastApproxMatch()

Build error trying to use fastApproxMatch

diff --git a/main.cpp b/main.cpp
index 8e80285..b917772 100644
--- a/main.cpp
+++ b/main.cpp
@@ -85,13 +85,14 @@ int main() {
        Matcher<false> m(tvecs, size, qvecs, size, threshold, max_twiddles);
 
        std::cout << std::endl << "Warming up..." << std::endl;
-       for (int i = 0; i < warmups; ++i) m.bruteMatch();
+       for (int i = 0; i < warmups; ++i) m.fastApproxMatch();^M
        std::cout << "Testing..." << std::endl;
        high_resolution_clock::time_point start = high_resolution_clock::now();
-       for (int i = 0; i < runs; ++i) m.bruteMatch();
+       for (int i = 0; i < runs; ++i) m.fastApproxMatch();^M
        high_resolution_clock::time_point end = high_resolution_clock::now();
 
        const double sec = static_cast<double>(duration_cast<nanoseconds>(end - start).count()) * 1e-9 / static_cast<double>(runs);
        std::cout << std::endl << "Brute force K2NN found " << m.matches.size() << " matches in " << sec * 1e3 << " ms" << std::endl;
        std::cout << "Throughput: " << static_cast<double>(size)*static_cast<double>(size) / sec * 1e-9 << " billion comparisons/second." << std::endl << std::endl;
+^M
 }

g++ -c  -Wall -Wextra -Werror -Wshadow -pedantic -Ofast -std=gnu++17 -fomit-frame-pointer -mavx2 -march=native -mfma -flto -funroll-all-loops -fpeel-loops -ftracer -ftree-vectorize  main.cpp -o main.o
In file included from main.cpp:53:0:
K2NN.h: In instantiation of ‘void Matcher<eightbit>::_tabulate(int, int) [with bool eightbit = false]’:
K2NN.h:184:44:   required from ‘void Matcher<eightbit>::tabulate() [with bool eightbit = false]’
K2NN.h:378:11:   required from ‘void Matcher<eightbit>::fastApproxMatch() [with bool eightbit = false]’
main.cpp:88:54:   required from here
K2NN.h:191:30: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
   for (uint64_t i = start; i < start + count; ++i) {
                              ^
cc1plus: all warnings being treated as errors
Makefile:17: recipe for target 'main.o' failed
make: *** [main.o] Error 1

fastApproxMatch no results found

Well, atleast it's fast 💃

Warming up...
Testing...

Brute force K2NN found 0 matches in 11.2606 ms
Throughput: 8.88049 billion comparisons/second.

also, in this case you've switched the methods, so it should not say Brute Force..

std::cout << std::endl << "Brute force K2NN found " << m.matches.size() << " matches in " << sec * 1e3 << " ms" << std::endl;

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.