Over and over I am faced with a similar problem: I have to perform two actions that are mostly unrelated, except that they need to share a mutex lock, at least for a moment. For example:

void action() {
    doStuff();
    cleanup();
}

//Somewhere far away:
std::mutex p;
void doStuff() {
    std::unique_lock lock(p);
    ...
}

//In some other, distant corner of code:
std::mutex q;
void cleanup() {
    std::unique_lock lock(q);
    ...
}

Now, when this is all compiled, it produces a sequence of operations like this:

void action() {
    lock p
    ... //doStuff
    unlock p
    // <-- bad things can happen here
    lock q
    ... //cleanup
    unlock q
}

The problem is the gap between unlocking p and locking q. It is unlikely, but something bad could happen right in between. I would like those two operations to happen in the reverse order: first lock q and only then unlock p. Is there a way to organize the code, a pattern perhaps, that achieves this while keeping doStuff and cleanup as decoupled as possible?

Edit: you can assume that doStuff and cleanup are never called in the reverse order. In the actual code that cannot happen.


To give a concrete example of this happening: I have a class managing a file-based cache system, and it needs to “catch” a mutex/atomic from within the implementation of std::shared_ptr:

(possible implementation in std)
template<class T>
class shared_ptr_control_block {
    std::atomic<int> refCount;
    std::atomic<int> weakRefCount;
    T* value;
    std::function<void(T*)> deleter;
    
    void decrement() { //invoked by shared_ptr destructor
        if (--refCount == 0) {  //this is my (implicit) p lock
             deleter(value);
        }
        ...
    }
};

/////
class Cache {
public:
    std::shared_ptr<Data> get(std::string filename) {
         std::unique_lock lock(m);
         std::shared_ptr<Data> result = data_[filename].lock();
         if(!result) {
             Data* ptr = new Data;
             openActualFileAndReadIntoDataPtr(filename, ptr);
             result = std::shared_ptr<Data>(ptr, [this,filename](Data* ptr){
                 std::unique_lock lock(m);  //this is my q lock
                 storeDataPtrBackToFile(filename, ptr);
                 delete ptr;
             });
             data_[filename] = result;
         }
         return result;
    }
private:
    std::mutex m;
    std::unordered_map<std::string, std::weak_ptr<Data>> data_;
};

Cache is supposed to be thread-safe: Multiple threads can request the Data under the same name. One thread will open the actual file, and the second thread will reuse the same data pointer. Either thread may modify Data (it is Data’s job to maintain its thread-safety) and when the last thread destroys its Data shared pointer, it is again saved to a file.

However, there is a subtle gap: when refCount is decremented to 0, but before shared_ptr’s deleter is called, another thread may request the same file. What may end up happening is that the corresponding weak_ptr reports itself as expired, and Cache rereads the old version of the file, before the deleter has managed to save the newer version.

A straightforward solution in this particular case is to convert the refCount atomic into a {mutex, int} pair and have the deleter invoked while that mutex is still held. As soon as Cache::m is locked, the refCount mutex could be released, but to achieve that I would have to pass the lock from shared_ptr into the deleter, so that it could be unlocked there.
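
Amending the control-block sketch above, that would look roughly like this (refMutex and the extra deleter parameter are hypothetical, just to illustrate the idea):

    std::mutex refMutex;   // replaces the atomic refCount
    int refCount;
    std::function<void(T*, std::unique_lock<std::mutex>)> deleter;

    void decrement() {
        std::unique_lock lock(refMutex);        // the "p" lock, now explicit
        if (--refCount == 0) {
            deleter(value, std::move(lock));    // deleter acquires Cache::m, then releases this lock
            return;
        }
    }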

This straightforward solution practically requires reimplementing the whole shared_ptr. Not too difficult, but time-consuming and head-scratching, with a lingering question: “isn’t there a better way?”

You’re asking multiple questions.

For the general case of calling multiple decoupled functions with correct mutual exclusion, the solution is to

  1. Expose the mutex associated with each function
  2. Split the function into
    • an outer part, which takes the std::unique_lock as it does now, and calls
    • an inner part, which performs the work
  3. Have your compound operation use std::scoped_lock to acquire all the locks
  4. Call the inner functions while all the locks are held, as in the sketch below
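
For the p and q from the question, that split could look like this (the *Locked names are just for illustration):

std::mutex p;
std::mutex q;

// Inner functions: do the work, assuming the caller already holds the right lock.
void doStuffLocked() { /* ... */ }
void cleanupLocked() { /* ... */ }

// Outer functions: existing call sites keep working exactly as before.
void doStuff()  { std::unique_lock lock(p); doStuffLocked(); }
void cleanup()  { std::unique_lock lock(q); cleanupLocked(); }

// Compound operation: acquire both mutexes up front (deadlock-free),
// then call the inner functions while both locks are held, so there is no gap.
void action() {
    std::scoped_lock lock(p, q);
    doStuffLocked();
    cleanupLocked();
}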

If you’re doing this a lot, you can do a neat job of bundling these functions into Lockable objects with CRTP and automating the pattern.
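
One possible shape for that automation (a sketch only; Lockable, locked() and the component names here are made up):

template <class Derived>
class Lockable {
public:
    std::mutex& mutex() { return m_; }

    // Run f on the derived object while holding its mutex.
    template <class F>
    void locked(F&& f) {
        std::unique_lock lock(m_);
        f(static_cast<Derived&>(*this));
    }

private:
    std::mutex m_;
};

struct Stuff   : Lockable<Stuff>   { void doStuffUnlocked() { /* ... */ } };
struct Cleaner : Lockable<Cleaner> { void cleanupUnlocked() { /* ... */ } };

// Single-component callers use locked(); compound operations take both
// mutexes with std::scoped_lock and call the unlocked parts directly.
void action(Stuff& s, Cleaner& c) {
    std::scoped_lock lock(s.mutex(), c.mutex());
    s.doStuffUnlocked();
    c.cleanupUnlocked();
}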


Your second question, about the cache specifically, is more easily solved with a small change: have the deleter remove the weak_ptr from the map. Now you have three easily distinguished cases:

  1. Key isn’t in the map at all, so open the file and add it

  2. Key is in the map, and locking the weak pointer succeeds: no change

  3. Key is in the map but locking the weak pointer fails: the file is about to be written out in another thread and you can’t do anything to stop it (you can’t prevent the control block being deleted, for one thing).

    That means you need to temporarily release the mutex (so the deleter can run), wait for it to finish, and then reacquire the mutex and reopen the file as for case 1.

    This is exactly the pattern of waiting on a condition variable. So, add one of those and have the deleter notify it after removing the map entry.
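
A sketch of how that could look, assuming a std::condition_variable member cv_ is added to Cache next to m (the rest is the code from the question):

std::shared_ptr<Data> Cache::get(std::string filename) {
    std::unique_lock lock(m);
    for (;;) {
        auto it = data_.find(filename);
        if (it == data_.end()) break;                         // case 1: not in the map
        if (auto result = it->second.lock()) return result;   // case 2: still alive
        cv_.wait(lock);                                       // case 3: a deleter is writing; wait, then re-check
    }
    Data* ptr = new Data;
    openActualFileAndReadIntoDataPtr(filename, ptr);
    std::shared_ptr<Data> result(ptr, [this, filename](Data* p) {
        std::unique_lock lock(m);
        storeDataPtrBackToFile(filename, p);
        delete p;
        data_.erase(filename);   // remove the expired weak_ptr from the map
        cv_.notify_all();        // wake any get() waiting in case 3
    });
    data_[filename] = result;
    return result;
}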


Actually, the easier solution is probably to change your cache into an LRU cache and write files out not when the last concurrent user finishes, but when the next new file is requested. Since both happen under the same lock, this is sufficient to fix the race condition.

However, this changes the behaviour of your cache, while the solution above only fixes the specific problem.
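
For reference, a rough sketch of that direction (capacity limits, eviction order and a final write-back at shutdown are left out; it only shows writing old entries back on the next request, under the same lock):

class Cache {
public:
    std::shared_ptr<Data> get(const std::string& filename) {
        std::unique_lock lock(m);
        auto it = data_.find(filename);
        if (it != data_.end())
            return it->second;                  // cached; the map keeps it alive

        writeBackUnusedEntries();               // flush old entries before adding a new one

        auto result = std::make_shared<Data>();
        openActualFileAndReadIntoDataPtr(filename, result.get());
        data_[filename] = result;
        return result;
    }

private:
    void writeBackUnusedEntries() {             // called with m held
        for (auto it = data_.begin(); it != data_.end(); ) {
            if (it->second.use_count() == 1) {  // only the cache still references it
                storeDataPtrBackToFile(it->first, it->second.get());
                it = data_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::mutex m;
    std::unordered_map<std::string, std::shared_ptr<Data>> data_;
};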


The general question from the first part is solved by using a third mutex that is always acquired before working with any of the lower-level components. This ensures a consistent locking order and safety without invasive changes to the lower-level components.
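
For the code from the question, that could look roughly like this (the name g is illustrative):

std::mutex g;   // third, outer mutex: always acquired before using either component

void action() {
    std::unique_lock outer(g);  // held across both calls, so nothing can slip into the gap
    doStuff();                  // still locks p internally, unchanged
    cleanup();                  // still locks q internally, unchanged
}

// Every other caller of doStuff() or cleanup() also takes g first, which keeps
// the lock order g -> p and g -> q consistent everywhere.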

Surprisingly, the example problem from the second part has nothing to do with concurrency!

The abstraction leak you are experiencing is caused by exposing an implementation detail (the shared_ptr) to clients of the component.

Instead of a shared_ptr, return a wrapper object that knows about Cache’s mutex and handles the weak dereferencing (acquiring the mutex before disposal or dereference). Note that in your context you are unlikely to need the full power of shared_ptr in client code, so expose the minimal interface the client needs and keep the implementation simple.
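
A sketch of the disposal side of such a handle (the names are made up; it assumes Cache befriends the handle, and that the shared_ptr’s deleter no longer locks m itself, since m is now taken here):

class CacheHandle {
public:
    Data& operator*()  const { return *data_; }
    Data* operator->() const { return data_.get(); }

    ~CacheHandle() {
        if (data_) {
            std::unique_lock lock(cache_->m);  // take Cache::m before dropping the reference
            data_.reset();                     // if this was the last reference, the write-back
        }                                      // now runs while m is still held, so get() cannot race it
    }

private:
    friend class Cache;                        // Cache::get constructs and returns these
    CacheHandle(Cache& cache, std::shared_ptr<Data> data)
        : cache_(&cache), data_(std::move(data)) {}

    Cache* cache_;
    std::shared_ptr<Data> data_;
};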
