In std.parallelism, I have a simple need to block until all tasks in a pool have completed. (In my use case tasks don't return a value, and errors within a task are discarded.) Currently the only way to block is to manually maintain a collection of every task added to the pool and wait on each one.
I propose that TaskPool.finish() take a backwards-compatible option to block:
finish(bool block=false)
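For illustration, here is a minimal sketch of how the proposed overload might be used (the pool size and the work() function are just placeholders), assuming a fire-and-forget workload where results and exceptions are ignored:

import std.parallelism;

void work(int id)
{
    // Task body: returns nothing; any exception thrown here is simply
    // discarded, which matches the use case above.
}

void main()
{
    auto pool = new TaskPool(4);        // 4 worker threads

    foreach (i; 0 .. 100)
        pool.put(task!work(i));         // fire-and-forget tasks

    // Proposed: signal termination and block until every queued task has
    // run and the workers have shut down.
    pool.finish(true);
}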
Comment #1 by john — 2012-06-09T20:01:48Z
Looking at the source, I see the original version included TaskPool.join(), which has the desired behavior. However, the method definition is disabled:
https://github.com/D-Programming-Language/phobos/blob/master/std/parallelism.d#L3097
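For anyone unfamiliar with the mechanism: the disabling is done with a version(none) block. Paraphrased below (not the literal Phobos source; in the file join() is a TaskPool method):

version(none)
void join()
{
    // The body is still present in std/parallelism.d, but version(none)
    // means it is never compiled, so user code and the docs never see it.
}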
In 2011 dsimcha mentioned the join function [1]:
"One thing Andrei mentioned that I'm really not sure about is what to do
with TaskPool.join(). My example for it is still terrible, because I
think it's an evolutionary artifact. It was useful in earlier designs
that were never released and didn't have high-level data parallelism
primitives. I never use it, don't have any good use cases for it and
would be inclined to remove it entirely. Andrei seems to have some good
use cases in mind, but he has not detailed any that I believe are
reasonably implementable and I'm not sure whether they could be solved
better using the higher-level data parallelism primitives."
and
"I put TaskPool.join() in a version(none) statement and removed all
references to it from the documentation. If anyone has a compelling
reason why this is useful (preferably with a good, short example), I'll
add it back. Otherwise, it's gone, since IMHO it's an evolutionary
artifact that makes sense in theory but that I can't think of a use for
anymore in practice."
Well, I have a use case :) I'm writing a D implementation of luaproc, the Lua share-nothing message-passing framework described in "Exploring Lua for Concurrent Programming" [2]. The std.parallelism module nicely replaces several hundred lines of low-level multithreading code in the original C implementation. The design requires a single queue of tasks serviced by multiple threads, where the tasks do not yield a return value (they only optionally schedule more tasks), and any errors within a task are discarded (i.e. exceptions don't need to be propagated to parent tasks). The design specifies a function which blocks on the completion of all tasks, and this is the only thing std.parallelism is missing. A sketch of the pattern follows the references below.
[1] http://forum.dlang.org/thread/[email protected]
[2] http://www.inf.puc-rio.br/~roberto/docs/ry08-05.pdf
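To make the pattern concrete, here is a rough sketch of the kind of code I mean (the names schedule/waitForAll are mine, not luaproc's): fire-and-forget tasks that may schedule further tasks, with the main thread blocking until the whole cascade has drained. Without a blocking finish() or join(), an outstanding-task counter is one way to approximate that:

import core.atomic;
import core.thread : Thread;
import core.time : msecs;
import std.parallelism;

shared long outstanding;      // tasks scheduled but not yet finished
__gshared TaskPool pool;      // visible to all worker threads

void schedule(void delegate() job)
{
    atomicOp!"+="(outstanding, 1);
    pool.put(task({
        scope (exit) atomicOp!"-="(outstanding, 1);
        try
            job();            // errors within a task are discarded
        catch (Exception) {}
    }));
}

void waitForAll()
{
    // Blocks until every task, including tasks scheduled by other tasks,
    // has completed.
    while (atomicLoad(outstanding) != 0)
        Thread.sleep(1.msecs);
}

void main()
{
    pool = new TaskPool(4);

    schedule({
        // A task that schedules two more tasks.
        schedule({ /* leaf work */ });
        schedule({ /* leaf work */ });
    });

    waitForAll();
    pool.finish();            // non-blocking is fine: nothing is left to run
}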
Comment #2 by john — 2012-06-09T20:39:07Z
I do prefer this being an option to finish() rather than a separate join() function. We all understand what finish() does, and it's not a stretch at all to accept that there are blocking and non-blocking invocations (as evidenced by the current doc needing to clearly state that it's non-blocking).
Also, I don't think the docs need a big example for the blocking case. It should suffice to say that a blocking invocation only makes sense when task results aren't needed (otherwise the corresponding Task.*Force() calls would provide the blocking).
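For what it's worth, a sketch of that contrast, assuming the proposed blocking overload (compute() is just a placeholder):

import std.parallelism;

int compute(int x) { return x * x; }

void main()
{
    auto pool = new TaskPool();

    // Result needed: yieldForce() already blocks until this particular
    // task has finished, and returns its result (or rethrows its exception).
    auto t = task!compute(21);
    pool.put(t);
    int r = t.yieldForce();

    // No result needed: there is nothing to force, so the proposed blocking
    // finish(true) is the natural way to wait for the remaining tasks
    // before shutting the pool down.
    pool.put(task!compute(7));
    pool.finish(true);
}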
The std.parallelism tests are no longer passing reliably (though they aren't failing consistently either) on most platforms after this was committed.
freebsd/32, linux/*, osx/64/32:
[email protected](4061): unittest failure
So far, no win/32 or freebsd/64 failures, but I suspect it's only a matter of time.
@braddr: would you kindly add me to the whitelist for the pull auto-tester?
Comment #8 by braddr — 2012-06-17T12:03:29Z
You were already. Both this pull and the previous one ran through the auto tester, and show(ed) failures. You're not done fixing things yet based on the current test results for this pull request.
Comment #9 by john — 2012-06-17T12:24:28Z
Ack. I've disabled the failing assert as I don't have a good solution offhand.
I don't feel too bad about disabling this particular assert since this was a test I added for existing library behavior.