tzaeschke / phtree Goto Github PK

View Code? Open in Web Editor NEW

118.0 13.0 21.0 3.58 MB

PH-Tree

License: Apache License 2.0

Java 100.00%

index multidimensional-index nearest-neighbor nearest-neighbor-search ph-tree phtree spatial spatial-index

phtree's People

Contributors

Stargazers

Watchers

phtree's Issues

More tests for IEEE converter

Add more tests for IEEE converter

Add alternative multimap implementation

What's the best strategy for use in concurrent applications?

First of all, thanks for sharing the source code of your research work, which I think demonstrates, in combination with your papers, an exemplary attitude for conducting and publishing research work!

My question: As far as I understand, this implementation is not thread-safe. Is this correct? If so, what's the best way to use this library in concurrent applications, e.g., global lock or is this data structure kind of reactive and enables transactional modifications?

Further, is the library's implementation of iterators thread-safe? If iterators are thread-safe, I would use a fine-grained synchronization like this (pseudo)-code:

public Set<T> query(...) {
    Iterator<T> it = null;
    Set<T> result = new HashSet<T>();
    synchronized (tree) {
        it = tree.query(...);
    }
    while (it.hasNext()) {
        result.add(it.next());
    }
    return result;
}

... otherwise this: (... which must, by the way, be similar to what would be required in the implementation of the iterators for storing a snapshot of the query result, if they are thread-safe but the data structure is not. The questions is just, what can I presume?)

public class ThreadsafeTree {
    private PHTree tree = new PHTree(...);

    public synchronized Set<T> query(...) {
        Iterator<T> it = tree.query(...);
        Set<T> result = new HashSet<T>();
        while (it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }
    ...
}

Partial dimension queries?

Hi there, I apologize if this has an obvious answer, but I couldn't find it from reading the code or searching the papers. Is it possible to do a near-neighbours query in a PH-tree that omits some number of dimensions, but can still do a pseudo-table scan in the dimensions that are provided?

The APIs take longs/doubles so I guess the first thing I'd try would be to pass infinities for the doubles, or min/max values for longs. Do you think this would work? :)

Investigate excessive GC on update()

When running TinSpin, the update() test appears to cause a LOT of garbage collections.

kNN in 2.0.0 sometimes not finding the nearest neighbor

See TinSpin tests using v16

PH-tree - Project at National scientific computing laboratory - Brazil

Hello Mr. Zäschke,
I am a student of Information Technology and Communication in Brazil and currently work on a project titled Metric Indexing in Space Junctions at LNCC with Doctor Fábio Porto. In my research I worked on comparisons between data indexing methods like PH-tree, Slim-tree, Quatree. In these comparisons I observed the tree creation time and the search times to the neighbors in Range Queries. In short, I could see that PH-tree does not return all neighbors in their queries. I performed precision and recall calculations, and Slim-tree did better. Can you tell if the fact that PH-tree does not return all the neighbors is something normal of PH-tree's structure? It has something to do with the 'worst cases' mentioned in your article in '3.4 Spacial efficiency' or '3.5 Query efficiency'.

Thank you in advance.

André Muniz Demori - LNCC

Fix spurious error "array too large"

Fix spurious error messages:

[ERROR] Array too large: 7 - 4 = 3
[ERROR] Array too large: 7 - 4 = 3

Implement update() for 16HD

Consider using Java 17 Vector API (JEP 414)

Look at the new API and see whether it may improve performance.
Java already vectorizes code, so there may or may not be any benefit from using the new API.

PH-tree C++ version

Hi, i am trying to compile PH-tree C++ version for implement in another project. I'm trying to run the project on Clion. Do you have the CMakeList file? How you compile that? I'm trying to compile just the main file on terminal, but it's not working.

Best regards.

How PH-tree works with negative values?

Hello, another question I am studying that I did not find in the article is how binary conversion occurs on PH-tree with negative numbers. In the case of my project, I work with spatial positions, being right ascension and declination. Declination can have negative values. Can this affect something in the insertion and search in the PH-tree or does it come naturally with bit insertion?

Thank you in advance.

Am I using the best method to get distance between two 3d point clouds?

Great library, thank you!

I've got 2 3d point clouds of a few thousand points each. There are no correlated "pairs of points", so to find the minimum offset between the two clouds I have to run through 1 cloud, get the single nearest neighbor from the other cloud, and sum up all the distances.

This is slower than expected, I think partially because the phtree search is optimized for returning a series of nodes sorted by distance, rather than "the one closest output node for each input node"

My question: Is there a better way to call this library that reduces overhead, and get the closest node for a whole bunch of starting location nodes?

My code:

fun averageNeighborDistance(other: OrientedCloud) = getNearestNeighbors(other).entries.sumByDouble {
    it.key.distance(it.value)
} / size

fun getNearestNeighbors(other: OrientedCloud): Map<Vector3D, Vector3D> {
    val result = mutableMapOf<Vector3D, Vector3D>()
    oriented.forEach { myPoint ->
        result[myPoint] = other.tree.nearestNeighbour(1, *myPoint.toArray()).next()!!
    }
    return result
}

Those repeated calls to nearestNeighbour(1, is where my profiler says most of the time is spent. My uneducated guess is that some bulk function of "here is a a list of points from cloud0, please return the single nearest neighbor in cloud1 for each one" would possibly be faster.

However, maybe I am doing it the best way, and what looks like overhead with the NodeIteratorFullToList.init call is where the sort-by-distance happens, and isn't something that can be optimized around.

Improve update() for new multimap

The update() function for the new multimap could be more efficient.

Vectorization friendly w-query

Consider porting the algorithm for min/max mask in window queries from phtree-cpp. The algorithm is more vectorization friendly.

tzaeschke / phtree Goto Github PK

phtree's People

Contributors

Stargazers

Watchers

Forkers

phtree's Issues

Recommend Projects

Recommend Topics

Recommend Org