
Comments (6)

xz-liu avatar xz-liu commented on September 7, 2024

I believe it is common practice to set a range of matchable entities using a test set before computing similarity, which, in my view, relies on a 1-to-1 mapping assumption. This evaluation setting is widely used in other repositories as well. When running the evaluation code, the embeddings are typically sorted and filtered based on the test pairs before proceeding with the evaluation.
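As a rough illustration of the filtered evaluation described above, the following sketch restricts the candidates to entities that appear in the test pairs and then computes Hits@k. It is not the ClusterEA code: the function name `filtered_hits_at_k`, the cosine similarity, and the toy embeddings are all assumptions made for the example.

```python
import numpy as np

def filtered_hits_at_k(src_emb, tgt_emb, test_pairs, k=1):
    """Hits@k with candidates limited to the entities in test_pairs.

    Hypothetical helper for illustration; assumes each entity appears
    in at most one test pair (the 1-to-1 mapping assumption).
    """
    pairs = np.asarray(test_pairs)
    src_ids, tgt_ids = pairs[:, 0], pairs[:, 1]
    # Keep only test entities, ordered so that row i should match column i.
    s = src_emb[src_ids]
    t = tgt_emb[tgt_ids]
    # Cosine similarity between the filtered source and target embeddings.
    s = s / np.linalg.norm(s, axis=1, keepdims=True)
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    sim = s @ t.T
    # Rank of the true counterpart = number of candidates scoring strictly higher.
    true_scores = np.diag(sim)
    ranks = (sim > true_scores[:, None]).sum(axis=1)
    return float((ranks < k).mean())

# Toy data: 4 entities per graph, but only 2 test pairs are evaluated.
src_emb = np.eye(4)
tgt_emb = np.eye(4)[[2, 0, 1, 3]]
test_pairs = [(0, 1), (1, 2)]
score = filtered_hits_at_k(src_emb, tgt_emb, test_pairs, k=1)
```

Because the candidate set contains only the targets listed in `test_pairs`, entities outside the test set can never be ranked above the true counterpart, which is why this setting depends on the 1-to-1 assumption.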

from clusterea.

AdFiFi avatar AdFiFi commented on September 7, 2024

Using ground-truth counterparts as candidates?


xz-liu avatar xz-liu commented on September 7, 2024

Yes. In almost all papers, only the test pairs are considered when calculating the embeddings. Our paper introduces small blocks to allow for scalability, and this filtering process is implemented within these small blocks. This is equivalent to filtering globally during the evaluation.

You can find similar implementations in OpenEA and DualAMN. I believe this approach adheres to the assumption of 1-to-1 mapping.

If you are interested in exploring beyond the 1-to-1 mapping assumption, you may want to look into the paper on knowledge graph alignment with dangling cases.

Thank you so much for your interest in our work. We are open to questions at any time.


AdFiFi avatar AdFiFi commented on September 7, 2024

But why aren't global_matrix and global_matrix_t in main.py square matrices? The size is the number of nodes in the source graph and the target graph, right?


xz-liu avatar xz-liu commented on September 7, 2024

Yes. I recall that when I implemented that, I used a sparse matrix so that the similarity matrix would not include the filtered entries. This allowed for filtered evaluation even though the matrix size remains the full size. This was the most convenient way to implement it since we need sparse matrices to store the similarity between a large number of items anyway, and the matrix size is just metadata, not reflecting the actual size of the data.
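The idea of a full-size sparse matrix that stores only filtered entries can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in main.py: `sparse_filtered_similarity` is a hypothetical name, the dot-product similarity is a placeholder, and the sketch assumes each entity appears in at most one test pair.

```python
import numpy as np
from scipy.sparse import coo_matrix

def sparse_filtered_similarity(src_emb, tgt_emb, test_pairs):
    """Similarity matrix of shape (n_src, n_tgt) that stores scores only
    for entities appearing in test_pairs (hypothetical helper)."""
    n_src, n_tgt = src_emb.shape[0], tgt_emb.shape[0]
    pairs = np.asarray(test_pairs)
    src_ids, tgt_ids = pairs[:, 0], pairs[:, 1]
    # Dense scores between test entities only (dot product as a placeholder).
    block = src_emb[src_ids] @ tgt_emb[tgt_ids].T
    # Map each block entry (i, j) back to its global (row, column) index.
    rows = np.repeat(src_ids, len(tgt_ids))
    cols = np.tile(tgt_ids, len(src_ids))
    # The shape is the full graph size, but only the filtered entries are stored.
    return coo_matrix((block.ravel(), (rows, cols)), shape=(n_src, n_tgt)).tocsr()

# Toy data: 5 source entities, 7 target entities, 3 test pairs.
src_emb = np.arange(1, 16, dtype=float).reshape(5, 3)
tgt_emb = np.arange(1, 22, dtype=float).reshape(7, 3)
test_pairs = [(0, 3), (2, 5), (4, 1)]
sim = sparse_filtered_similarity(src_emb, tgt_emb, test_pairs)
```

Here `sim.shape` is `(5, 7)`, which is not square, yet only the 3 x 3 scores between test entities are stored. Rows and columns of entities outside the test set hold no entries, so they cannot affect a ranking computed over the stored values.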

You could help me check whether this implementation is correct. If not, by fixing it, you would probably achieve a better score than mine.


AdFiFi avatar AdFiFi commented on September 7, 2024

Thank you for your answer and for sharing these details; they helped me understand this work better.

