Bug 5660 – yield syntax sugar

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2011-02-27T17:55:00Z
Last change time
2017-07-21T05:32:08Z
Assigned to
nobody
Creator
bearophile_hugs

Comments

Comment #0 by bearophile_hugs — 2011-02-27T17:55:27Z
Ranges are flexible and useful, but my practice with Python has shown me that many times you just want something simple that yields items lazily. To do this in D there is opApply() (which currently doesn't play well with most of Phobos; only std.array.array() and a few other things are able to use it), but its syntax is awful, and even after years of usage I can't remember it and I need to look for an example of opApply() usage to copy & modify. opApply() requires a significant amount of boilerplate code that makes the code considerably longer, obfuscates its purpose, and is bug-prone.

This is a simple example (D2 code, works with dmd 2.052):

    // Program #1
    import std.stdio;

    /// Sequence of moves to solve Towers of Hanoi OEIS A001511
    immutable final class hanoiTower {
        static opCall() {
            return new typeof(this)();
        }

        int opApply(int delegate(ref int) dg) {
            int result;
            int y = 1;
            result = dg(y);
            if (result) return result;
            foreach (x; hanoiTower()) {
                y = x + 1;
                result = dg(y);
                if (result) return result;
                y = 1;
                result = dg(y);
                if (result) return result;
            }
            return result;
        }

        int opApply(int delegate(ref int, ref int) dg) {
            int result;
            int i, y = 1;
            result = dg(i, y); i++;
            if (result) return result;
            foreach (x; hanoiTower()) {
                y = x + 1;
                result = dg(i, y); i++;
                if (result) return result;
                y = 1;
                result = dg(i, y); i++;
                if (result) return result;
            }
            return result;
        }
    }

    class genChar {
        char stop;

        this(char stop_) {
            stop = stop_;
        }

        static opCall(char stop_) {
            return new typeof(this)(stop_);
        }

        int opApply(int delegate(ref char) dg) {
            int result;
            for (char c = 'a'; c < stop; c++) {
                result = dg(c);
                if (result) return result;
            }
            return result;
        }

        int opApply(int delegate(ref int, ref char) dg) {
            int result, i;
            for (char c = 'a'; c < stop; c++) {
                result = dg(i, c); i++;
                if (result) return result;
            }
            return result;
        }
    }

    void main() {
        foreach (i, move; hanoiTower()) {
            writeln(i, " ", move);
            if (i >= 20) break;
        }
        writeln();
        foreach (i, c; genChar('h'))
            writeln(i, " ", c);
    }

So I suggest a syntax like this, that's just syntax sugar for the preceding code:

    // Program #2
    import std.stdio;

    yield(int) hanoiTower() {
        yield(1);
        foreach (x; hanoiTower()) {
            yield(x + 1);
            yield(1);
        }
    }

    yield(auto) genChar(char stop) {
        for (char c = 'a'; c < stop; c++)
            yield(c);
    }

    void main() {
        foreach (i, move; hanoiTower()) {
            writeln(i, " ", move);
            if (i >= 20) break;
        }
        writeln();
        foreach (i, c; genChar('h'))
            writeln(i, " ", c);
    }

Several good, widely used modern languages like Python, C# and Scala have a clean yield syntax; they show the way (if necessary I can translate this program #2 to those other three languages).

The normal syntax of opApply() is of course kept in D. This is just an additive change that acts at the syntax level (so just a lowering is needed to implement it).

I suggest that generators with yield be classes that define a "static opCall()", because their semantics is more flexible than that of structs. This causes a heap allocation, but situations where performance is that important are less common, and in such situations the programmer may just write the normal struct code with opApply(), or use a struct with the more inlinable range protocol methods, or use something lower level.

The syntax uses yield(...) as the return type to denote such iterable classes. yield(auto) is supported too, and it acts like an auto return type for functions (all yields in a generator must yield the same type).

Some disadvantages:
- "yield" may need to become a kind of keyword.
- Currently opApply() doesn't work well with std.algorithm and Phobos in general. This is true for the normal current usage of opApply(), but this proposal makes its usage more attractive and probably more widespread.
Comment #1 by dsimcha — 2011-02-27T18:08:32Z
opApply plus fibers should do what you need. It's inefficient, but so is pretty much any coroutine/yield-based way of doing things, including Python's. I was thinking a while back that something like this belonged in std.range, but I wanted to handle ref properly, so Bug 2443 got in my way.

Example (Warning: Untested.):

    import core.thread : Fiber;
    import std.traits : ForeachType;

    /**
    This must be a class or otherwise have reference/heap-allocated semantics.
    */
    class OpApplyToRange(Iterable) {
        Fiber fiber;
        ForeachType!Iterable _front;
        bool _empty;
        Iterable iterable;

        // Runs inside the fiber: suspends after storing each element.
        void doLoop() {
            foreach(elem; iterable) {
                _front = elem;
                Fiber.yield();
            }
            _empty = true;
        }

        this(Iterable iterable) {
            this.iterable = iterable;
            fiber = new Fiber(&doLoop);
            fiber.call();  // run up to the first element
        }

        void popFront() {
            fiber.call();
        }

        ForeachType!Iterable front() @property {
            return _front;
        }

        bool empty() @property {
            return _empty;
        }
    }
Comment #2 by bearophile_hugs — 2011-08-03T04:48:48Z
More programming in D shows me that syntax sugar like this one will be very useful to me:

    yield(int) foo() {
        yield(1);
    }

So, is it possible to automatically convert this kind of code into a Range (instead of opApply and fibers)?
Comment #3 by bearophile_hugs — 2011-08-03T05:08:13Z
Just as an example, this is Python2.6 code:

    def process((i, j), a, b):
        if i == 0: yield (a, j)
        if j == 0: yield (i, b)
        if i == a: yield (0, j)
        if j == b: yield (i, 0)
        if j != b: yield (i + j - b, b) if (b < i + j) else (0, i + j)
        if i != a: yield (a, i + j - a) if (a < i + j) else (i + j, 0)

And this is how ShedSkin compiles it to C++ (a bit edited):

    class __gen_process : public __iter<tuple2<int, int> *> {
    public:
        tuple2<int, int> *__2, *__3;
        int a, b, i, j;
        int __last_yield;

        __gen_process(tuple2<int, int> *__2,int a,int b) {
            this->__2 = __2;
            this->a = a;
            this->b = b;
            __last_yield = -1;
        }

        tuple2<int, int> * __get_next() {
            switch(__last_yield) {
                case 0: goto __after_yield_0;
                case 1: goto __after_yield_1;
                case 2: goto __after_yield_2;
                case 3: goto __after_yield_3;
                case 4: goto __after_yield_4;
                case 5: goto __after_yield_5;
                default: break;
            }
            if ((i==0)) {
                __last_yield = 0;
                __result = (new tuple2<int, int>(2,a,j));
                return __result;
                __after_yield_0:;
            }
            if ((j==0)) {
                __last_yield = 1;
                __result = (new tuple2<int, int>(2,i,b));
                return __result;
                __after_yield_1:;
            }
            if ((i==a)) {
                __last_yield = 2;
                __result = (new tuple2<int, int>(2,0,j));
                return __result;
                __after_yield_2:;
            }
            if ((j==b)) {
                __last_yield = 3;
                __result = (new tuple2<int, int>(2,i,0));
                return __result;
                __after_yield_3:;
            }
            if ((j!=b)) {
                __last_yield = 4;
                __result = (((b<(i+j)))?((new tuple2<int, int>(2,((i+j)-b),b))):((new tuple2<int, int>(2,0,(i+j)))));
                return __result;
                __after_yield_4:;
            }
            if ((i!=a)) {
                __last_yield = 5;
                __result = (((a<(i+j)))?((new tuple2<int, int>(2,a,((i+j)-a)))):((new tuple2<int, int>(2,(i+j),0))));
                return __result;
                __after_yield_5:;
            }
            __stop_iteration = true;
        }
    };

    __iter<tuple2<int, int> *> *process(tuple2<int, int> *__2, int a, int b) {
        return new __gen_process(__2,a,b);
    }
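For comparison, here is a minimal hand-written sketch of the same state-machine idea expressed as a D input range, roughly what a compiler lowering of yield to a Range might produce. The names ProcessRange and process are hypothetical, and the tuple parameter of the Python version is flattened into two ints; this is an illustration of the technique, not a proposed implementation.

```d
import std.array : array;
import std.stdio : writeln;
import std.typecons : Tuple, tuple;

/// Hypothetical hand-lowered generator: each "yield" point becomes a case.
struct ProcessRange {
    private int i, j, a, b;
    private int state;          // index of the next yield point to try
    Tuple!(int, int) front;     // last yielded value
    bool empty;                 // set once every yield point has been passed

    this(int i, int j, int a, int b) {
        this.i = i; this.j = j; this.a = a; this.b = b;
        popFront();             // run up to the first yield
    }

    void popFront() {
        // Resume after the last yield: try the remaining yield points in order.
        while (state < 6) {
            switch (state++) {
            case 0: if (i == 0) { front = tuple(a, j); return; } break;
            case 1: if (j == 0) { front = tuple(i, b); return; } break;
            case 2: if (i == a) { front = tuple(0, j); return; } break;
            case 3: if (j == b) { front = tuple(i, 0); return; } break;
            case 4:
                if (j != b) {
                    front = (b < i + j) ? tuple(i + j - b, b) : tuple(0, i + j);
                    return;
                }
                break;
            case 5:
                if (i != a) {
                    front = (a < i + j) ? tuple(a, i + j - a) : tuple(i + j, 0);
                    return;
                }
                break;
            default: assert(0);
            }
        }
        empty = true;
    }
}

/// Hypothetical convenience wrapper mirroring the Python function name.
ProcessRange process(int i, int j, int a, int b) {
    return ProcessRange(i, j, a, b);
}

void main() {
    writeln(process(0, 0, 3, 5).array);
}
```

Because this is an ordinary struct range, it needs no heap allocation or fiber and works with std.algorithm directly. The cost is the boilerplate that the proposed sugar is meant to hide.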
Comment #4 by bearophile_hugs — 2011-09-12T04:58:31Z
Alternative syntax (here inside the generator yield is used like the "return" statement):

    yield(int) hanoiTower() {
        yield(1);
        foreach (x; hanoiTower()) {
            yield x + 1;
            yield 1;
        }
    }
Comment #5 by bearophile_hugs — 2011-09-12T04:59:15Z
Alternative syntax (here inside the generator yield is used like the "return" statement):

    yield(int) hanoiTower() {
        yield 1;
        foreach (x; hanoiTower()) {
            yield x + 1;
            yield 1;
        }
    }
Comment #6 by witold.baryluk+d — 2012-02-14T21:18:51Z
I have a library for advanced iterators using Fibers. Take a look here: https://github.com/baryluk/iterators_2 (it is not up to date, but I use it very often, and locally I have many updates to it that are just not committed). The library was created before D ranges were introduced, so a few things should be improved, but it still works. But anyway, no need for syntax sugar.
Comment #7 by bearophile_hugs — 2012-02-15T04:24:02Z
(In reply to comment #6)
> But any way - no need to syntax sugars.

This was a request for syntax sugar for opApply. No need for Fibers here (fibers are useful, but here I am not asking for them).
Comment #8 by destructionator — 2012-07-29T07:51:55Z
    // this is your sugar!
    string yield(string what) {
        return `if(auto result = dg(`~what~`)) return result;`;
    }

Now, every time you would write

    result = dg(y);
    if (result) return result;

instead write:

    mixin(yield("y"));

or

    mixin(yield("i, y"));

if you have multiple arguments, and the boilerplate is way down. Since you keep result around, you'll want to make it just if(result) instead of if(auto result), but the idea is the same.

We might be able to make a class template or a mixin for yield(int) hanoiTower() too, but writing int opApply(...) isn't too bad.
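A small self-contained sketch of how that string mixin might be used in practice, applied to the genChar example from comment #0. GenChar is written here as a struct purely for brevity (an assumption, not what the original program does), and each opApply returns 0 directly instead of keeping a result variable around.

```d
import std.stdio : writeln;

/// The helper from the comment above: expands to the usual opApply boilerplate.
string yield(string what) {
    return `if (auto result = dg(` ~ what ~ `)) return result;`;
}

/// Hypothetical struct version of genChar from comment #0.
struct GenChar {
    char stop;

    int opApply(int delegate(ref char) dg) {
        for (char c = 'a'; c < stop; c++) {
            mixin(yield("c"));
        }
        return 0;   // loop ran to completion
    }

    int opApply(int delegate(ref int, ref char) dg) {
        int i;
        for (char c = 'a'; c < stop; c++) {
            mixin(yield("i, c"));
            i++;
        }
        return 0;
    }
}

void main() {
    foreach (i, c; GenChar('h'))
        writeln(i, " ", c);
}
```

The mixin string refers to the delegate parameter by name, so every opApply using it has to call that parameter dg.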
Comment #9 by adrian — 2012-10-12T05:46:18Z
Upvoted! Sometimes the yield syntax is the most natural and gives the most readable code. And it should be lowered to an input range, as this is often the only kind of iteration accepted by Phobos.
Comment #10 by bearophile_hugs — 2014-01-07T18:32:09Z
I think a "yield" with a usage syntax like this (semantically similar to the Python "yield") that's syntax sugar for a ForwardRange is a good and handy addition to the language:

    yield(int) foo() {
        yield 1;
        yield 2;
    }

    void main() {
        import std.stdio, std.algorithm;
        foo.map!(x => x).writeln;
    }

See also, for an improvement: http://www.python.org/dev/peps/pep-0380/

That explains the desire for a "yield foreach" as well (this is not a replacement for "yield", it's an addition):

    yield(int) bar() {
        yield foreach foo();
    }
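For reference, a rough library-level approximation of foo and bar using std.concurrency.Generator from Phobos. This is only a sketch of the semantics being asked for: Generator is a fiber-backed InputRange, not the ForwardRange proposed above, and the explicit foreach loop in bar stands in for the hypothetical "yield foreach". The extra yield(3) is added only to show that bar can continue after delegating.

```d
import std.algorithm : map;
import std.array : array;
import std.concurrency : Generator, yield;
import std.stdio : writeln;

Generator!int foo() {
    return new Generator!int({
        yield(1);
        yield(2);
    });
}

Generator!int bar() {
    return new Generator!int({
        // What "yield foreach foo();" would abbreviate: re-yield every item.
        foreach (x; foo())
            yield(x);
        yield(3);
    });
}

void main() {
    bar().map!(x => x * 10).writeln;
}
```

Note that yield(value) looks up the active generator by the exact type of value, so the yielded type has to match the Generator's element type.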
Comment #11 by bearophile_hugs — 2014-01-07T18:39:04Z
See also Issue 11880 for reserving the keyword "yield".
Comment #12 by dlang-bugzilla — 2017-07-21T05:32:08Z
I believe it has become an accepted fact that such constructs are done in library code (either using std.concurrency, or Vibe's implementation of fibers), and said implementations are generally satisfactory. In either case, I believe that today enhancement requests to the language itself need to be presented as a D Improvement Proposal: https://github.com/dlang/DIPs If you think this proposal still has merit today, please file a DIP. The current DIP manager can assist you through the process.
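As a rough illustration of the library route mentioned in the closing comment, Program #2 from comment #0 might be approximated with std.concurrency.Generator as below. This is a sketch, not a drop-in equivalent of the proposed lowering: every generator runs in its own fiber, so the recursive hanoiTower nests one fiber per recursion level, and the foreach index is a size_t supplied by Generator.

```d
import std.array : array;
import std.concurrency : Generator, yield;
import std.range : take;
import std.stdio : writeln;

/// Sequence of moves to solve Towers of Hanoi, OEIS A001511.
Generator!int hanoiTower() {
    return new Generator!int({
        yield(1);
        foreach (x; hanoiTower()) {
            yield(x + 1);
            yield(1);
        }
    });
}

Generator!char genChar(char stop) {
    return new Generator!char({
        for (char c = 'a'; c < stop; c++)
            yield(c);
    });
}

void main() {
    foreach (i, move; hanoiTower()) {
        writeln(i, " ", move);
        if (i >= 20) break;
    }
    writeln();
    foreach (i, c; genChar('h'))
        writeln(i, " ", c);

    // Generators are input ranges, so they compose with Phobos.
    writeln(hanoiTower().take(8).array);
}
```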