
Comments (4)

achirkin commented on May 24, 2024

Does cugraph run something in multiple processes there? The registry_ is a static, global per-process variable, and the error seems to be happening in __run_exit_handlers, which runs at program exit.
A wild guess: could it be related to the order of destruction between the registry_ and the mutex_? What if we swap these two lines?

/** Global registry of thread-local cancellation stores. */
static inline std::unordered_map<std::thread::id, std::weak_ptr<interruptible>> registry_;
/** Protect the access to the registry. */
static inline std::mutex mutex_;

(update: no, this didn't work)

from raft.

cjnolet commented on May 24, 2024

@achirkin, I'm wondering if we could use something at the global level or non-member thread-local level, like a std::atomic<bool> flag that is flipped to true when the deleter is called on the registry_. Can you think of any other way we might check for this state so we can handle it gracefully?

cc @jrhemstad for thoughts as well. It looks like we have a race condition in the callback for interruptible, where it tries to use registry_ after it has already been deleted. Judging from the behavior @seunghwak has been seeing, the order of destruction for static members seems to be inconsistent.
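To make the flag idea concrete, here is a minimal sketch, not the raft code. All names (registry_destroyed, guarded_registry, unregister_thread) are hypothetical. It relies on std::atomic<bool> being constant-initialized and trivially destructible, so reading the flag stays valid even after the registry object is gone.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical global flag. It is constant-initialized and has a trivial
// destructor, so it can still be read during static destruction.
inline std::atomic<bool> registry_destroyed{false};

// Hypothetical stand-in for the registry plus its mutex.
struct guarded_registry {
  std::mutex mutex;
  std::unordered_map<std::thread::id, int> entries;

  // Flip the flag first; the members are destroyed only after this body runs.
  ~guarded_registry() { registry_destroyed.store(true, std::memory_order_release); }
};

// Erase the entry for `id` unless the registry has been destroyed.
// Returns true if the registry was touched.
inline bool unregister_thread(guarded_registry& reg, std::thread::id id)
{
  if (registry_destroyed.load(std::memory_order_acquire)) { return false; }
  std::lock_guard<std::mutex> guard(reg.mutex);
  reg.entries.erase(id);
  return true;
}
```

Note that the flag alone leaves a window between the check and taking the lock, so it narrows the problem rather than removing it.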


seunghwak commented on May 24, 2024

As I mentioned in the Slack thread,

This is due to the arbitrary destruction order between s and registry_.
https://github.com/rapidsai/raft/blob/branch-23.02/cpp/include/raft/core/interruptible.hpp#L134
https://github.com/rapidsai/raft/blob/branch-23.02/cpp/include/raft/core/interruptible.hpp#L182

It seems like there is no guaranteed destruction order between a thread local static variable defined inside a member function and static class member variables.

If s is destroyed after registry_ is destroyed, registry_ will be accessed after destruction.

I confirmed this is indeed happening by defining

#include <iostream>

class MyClass {
 public:
  MyClass() { std::cout << "MyClass() called" << std::endl; }
  ~MyClass() { std::cout << "~MyClass() called." << std::endl; }
};

and adding static inline MyClass my_class_; after registry_ and mutex_ (https://github.com/rapidsai/raft/blob/branch-23.02/cpp/include/raft/core/interruptible.hpp#L184).

I also added a print statement to the custom deleter (https://github.com/rapidsai/raft/blob/branch-23.02/cpp/include/raft/core/interruptible.hpp#L213); this custom deleter is called when s is destroyed.

In testing, the print statement from ~MyClass sometimes appears before the print statement from the custom deleter, meaning the custom deleter can access registry_ after it has been destroyed.
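The instrumentation technique can be sketched in isolation as follows. This is an illustration under assumptions, not the raft code: tracer plays the role of MyClass, events() is a hypothetical log used instead of std::cout, and the bad order is forced deterministically with a local scope rather than reproduced with static and thread-local objects.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical event log, used instead of std::cout so the order can be inspected.
inline std::vector<std::string>& events()
{
  static std::vector<std::string> log;
  return log;
}

// Tracer mirroring MyClass: records when it is constructed and destroyed.
class tracer {
 public:
  tracer() { events().push_back("tracer()"); }
  ~tracer() { events().push_back("~tracer()"); }
};

// Runs one scope in which the tracer dies before the shared_ptr's deleter fires,
// which is the problematic order described above.
inline std::vector<std::string> run_traced_scope()
{
  events().clear();
  {
    // Declared first, so destroyed last: the custom deleter logs after ~tracer().
    std::shared_ptr<int> s(new int(0), [](int* p) {
      events().push_back("deleter");
      delete p;
    });
    tracer t;  // declared last, so destroyed first
  }
  return events();
}
```

If "~tracer()" is logged before "deleter", anything the deleter does with the traced object's neighbours happens after their destruction.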


achirkin commented on May 24, 2024

Here's a rather ugly, but seemingly working, fix: #1229. It wraps the registry_ in a shared pointer and uses weak_ptr.lock() to avoid accessing it after it has been deleted.
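A minimal sketch of that idea, assuming a simplified stand-in class; this is not the code from #1229. The names token, state, make_registered, registered_count and drop_registry_for_demo are hypothetical. Keeping the mutex inside the shared state is a choice of this sketch, so that the deleter never touches a separately destroyed mutex.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for raft::interruptible; only the registry logic is sketched.
class token {
 public:
  // Create a token and register it under the calling thread's id.
  static std::shared_ptr<token> make_registered()
  {
    const auto id = std::this_thread::get_id();
    // The deleter keeps only a weak reference, so it never extends the
    // registry's lifetime and can detect that the registry is already gone.
    std::weak_ptr<state> weak = registry_;
    std::shared_ptr<token> tok(new token, [weak, id](token* p) {
      if (auto reg = weak.lock()) {  // registry still alive: safe to touch it
        std::lock_guard<std::mutex> guard(reg->mutex);
        reg->map.erase(id);
      }  // otherwise the registry was destroyed first: skip the cleanup
      delete p;
    });
    if (auto reg = registry_) {
      std::lock_guard<std::mutex> guard(reg->mutex);
      reg->map[id] = tok;
    }
    return tok;
  }

  // Per-thread token, analogous to the function-local thread_local variable s.
  static std::shared_ptr<token> get()
  {
    thread_local std::shared_ptr<token> s = make_registered();
    return s;
  }

  // Number of registered tokens, or 0 if the registry no longer exists.
  static std::size_t registered_count()
  {
    auto reg = registry_;
    if (!reg) { return 0; }
    std::lock_guard<std::mutex> guard(reg->mutex);
    return reg->map.size();
  }

  // Demo only: mimics the registry being destroyed before a token.
  static void drop_registry_for_demo() { registry_.reset(); }

 private:
  token() = default;

  struct state {
    std::mutex mutex;
    std::unordered_map<std::thread::id, std::weak_ptr<token>> map;
  };
  static inline std::shared_ptr<state> registry_ = std::make_shared<state>();
};
```

With this layout, a token that outlives the registry simply skips the erase, because weak.lock() returns an empty pointer once the last owner of the state is gone.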

