Does it make sense to have a user-specified thread limit?


I’m developing a C++14 application and would like to take advantage of the new multithreading features, in particular std::async. I have seen a number of applications which allow the user to specify the maximum number of software threads that can be used for the duration of the program run. However, the recommended usage of std::async is the default launch policy, which implies no control over the number of software threads that are actually created.
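
For context, here is a minimal sketch of the difference between the default policy and explicitly requesting a thread (heavy_work is just a placeholder):

    #include <future>
    #include <iostream>

    // Placeholder for some expensive computation.
    int heavy_work(int x) { return x * x; }

    int main() {
        // Default policy: the implementation may run the task on a new thread
        // (std::launch::async) or defer it until get() is called on the
        // calling thread (std::launch::deferred); the caller does not control which.
        auto deferred_or_threaded = std::async(heavy_work, 21);

        // Explicit policy: always runs on a new thread, so the program, not the
        // implementation, decides how many threads exist.
        auto always_threaded = std::async(std::launch::async, heavy_work, 21);

        std::cout << deferred_or_threaded.get() + always_threaded.get() << '\n';
    }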

I assume the idea of explicit thread limits is to try to control the number of cores used by the program, but I believe this to be misleading, as the number of cores used will ultimately be determined by the OS. Is there any good reason to allow the program user to explicitly limit the number of software threads created by a program, or is it always preferable to let the implementation and OS handle threading?

Rather than allow the user to specify a thread limit, my intended solution was to just have a flag that enables or disables threading. Does this kind of user control even offer any advantage?


It is not generally sensible to limit the number of threads if those threads are used only for concurrency, i.e. to manage blocking operations or to increase responsiveness; aside from the extra resource use, spawning them is fine. A good example is a web crawler that wants to download many small resources and uses multiple threads to hide the latency of the requests. However, explicitly asynchronous approaches such as async I/O or event loops give the same advantages and should be preferred where possible.
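
As an illustration, here is a minimal sketch of that latency-hiding pattern with std::async; fetch is a stand-in that just sleeps where a real crawler would call into a networking library:

    #include <chrono>
    #include <future>
    #include <iostream>
    #include <string>
    #include <thread>
    #include <vector>

    // Stand-in for a blocking download; a real crawler would call a networking
    // library (or better, use asynchronous I/O) here.
    std::string fetch(const std::string& url) {
        std::this_thread::sleep_for(std::chrono::milliseconds(200));  // simulated latency
        return "contents of " + url;
    }

    int main() {
        std::vector<std::string> urls = {"a.example", "b.example", "c.example"};
        std::vector<std::future<std::string>> pending;

        // Launch every request eagerly; these threads spend almost all of their
        // time blocked, so having more of them than cores costs little.
        for (const auto& url : urls)
            pending.push_back(std::async(std::launch::async, fetch, url));

        for (auto& f : pending)
            std::cout << f.get() << '\n';
    }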

If your application uses parallelism to speed up CPU-intensive computations, then limiting the number of threads is very useful. You will see no advantage from spawning more threads than there are processors available. However, your program is not the only one running, and other CPU-intensive programs might be running in parallel.
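
For reference, the standard library exposes the processor count via std::thread::hardware_concurrency(); note that the value is only a hint and may be 0:

    #include <iostream>
    #include <thread>

    int main() {
        // hardware_concurrency() may return 0 when the value cannot be
        // determined, so fall back to a sane default.
        unsigned hint = std::thread::hardware_concurrency();
        unsigned workers = hint != 0 ? hint : 1;
        std::cout << "would use " << workers << " worker threads\n";
    }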

You could be very clever and adapt the number of worker threads to the current load of the system, but that would lead to unpredictable performance. Or you could let the user select the number of threads, which is safer. In particular, you will at least want an option to use only a single worker thread, which makes debugging much easier. In such a case, do what make does: a single thread is a good default, but some users need the full power of make -j$(nproc).
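
A minimal sketch of that make-like behavior, assuming the worker count is simply taken from the first command-line argument and sum_range stands in for the real computation:

    #include <cstdlib>
    #include <iostream>
    #include <thread>
    #include <vector>

    // Placeholder computation: sum the integers in [begin, end).
    long long sum_range(long long begin, long long end) {
        long long s = 0;
        for (long long i = begin; i < end; ++i) s += i;
        return s;
    }

    int main(int argc, char** argv) {
        // Default to a single worker; the user can opt in to more,
        // e.g. "./app 8" or "./app $(nproc)".
        int requested = (argc > 1) ? std::atoi(argv[1]) : 1;
        unsigned workers = requested > 0 ? static_cast<unsigned>(requested) : 1;

        const long long n = 100000000;
        std::vector<std::thread> pool;
        std::vector<long long> partial(workers, 0);

        // Split the range into one chunk per worker thread.
        for (unsigned w = 0; w < workers; ++w) {
            long long begin = n / workers * w;
            long long end = (w + 1 == workers) ? n : n / workers * (w + 1);
            pool.emplace_back([&partial, w, begin, end] { partial[w] = sum_range(begin, end); });
        }
        for (auto& t : pool) t.join();

        long long total = 0;
        for (long long p : partial) total += p;
        std::cout << total << '\n';
    }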

IMO threads can be split into three main categories.

Long-running threads that monitor one thing per thread. Putting a limit on these will limit the number of things you can monitor, which is likely not what you want. If you do put a limit on them, it should be very high and serve mostly as a sanity check.

Threads that spend most of their time on computation. You generally want roughly as many of these as there are CPU cores.

Threads that process requests, one request per thread at a time; after a request is completed, the thread is either destroyed or waits for another request. Processing a request involves some CPU use but also a lot of waiting on disk, network, and so on. Here you have a balancing act: too many threads mean resource thrashing, while too few mean your request-processing rate is limited by latency and that requests which could be served from the disk cache get queued up behind requests that actually need to hit the platters.
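
For this third category, a minimal sketch of capping the number of requests in flight; it uses a small condition-variable gate because std::counting_semaphore only arrives in C++20, and handle_request is just a placeholder for the real work:

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <string>
    #include <thread>
    #include <vector>

    // Small counting gate limiting how many threads run at once.
    class concurrency_limit {
    public:
        explicit concurrency_limit(unsigned max_in_flight) : slots_(max_in_flight) {}
        void acquire() {
            std::unique_lock<std::mutex> lock(m_);
            cv_.wait(lock, [this] { return slots_ > 0; });
            --slots_;
        }
        void release() {
            { std::lock_guard<std::mutex> lock(m_); ++slots_; }
            cv_.notify_one();
        }
    private:
        std::mutex m_;
        std::condition_variable cv_;
        unsigned slots_;
    };

    // Placeholder handler; real code would read a socket, hit the disk cache, etc.
    void handle_request(int id) {
        std::cout << "handled request " + std::to_string(id) + "\n";
    }

    int main() {
        concurrency_limit limit(4);  // user-tunable cap on requests in flight
        std::vector<std::thread> workers;

        for (int id = 0; id < 16; ++id) {
            limit.acquire();  // blocks once 4 requests are already being processed
            workers.emplace_back([&limit, id] {
                handle_request(id);
                limit.release();
            });
        }
        for (auto& t : workers) t.join();
    }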
