Comment #0 by bearophile_hugs — 2011-02-08T17:01:39Z
I suggest to add an enumerate() to Phobos std.range module. An alternative name is count().
enumerate() is a simple function, it takes one iterable and returns an iterable of (index, item) pairs:
>>> list(enumerate("abcd"))
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]
A second optional argument allows to define a start value different from zero:
>>> list(enumerate("abcd", 3))
[(3, 'a'), (4, 'b'), (5, 'c'), (6, 'd')]
This is an example usage of Python2.6 enumerate():
>>> comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
>>> [i for i,b in enumerate(comp) if not b]
[2, 3, 5, 7, 11, 13, 17, 19]
>>> comp = comp[2:]
>>> comp
[0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
>>> [i for i,b in enumerate(comp, 2) if not b]
[2, 3, 5, 7, 11, 13, 17, 19]
Here the input 'comp' is a list of bits (booleans) produced by a Sieve of Eratosthenes, that represents the composite numbers. The desired output (a sequence of primes) is a list of indexes where the value b is false, so it's prime.
This is a D2 translation in procedural style:
import std.stdio;
void main() {
auto comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];
int[] result;
foreach (i, p; comp)
if (!p)
result ~= i;
writeln(result);
comp = comp[2..$];
result.length = 0;
foreach (i, p; comp)
if (!p)
result ~= i + 2;
writeln(result);
}
A translation in functional style:
import std.stdio, std.algorithm, std.range;
void main() {
auto comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];
auto result1 = map!q{a[0]}(filter!q{!a[1]}(zip(iota(comp.length), comp)));
writeln(result1);
comp = comp[2..$];
auto result2 = map!q{a[0]+2}(filter!q{!a[1]}(zip(iota(comp.length), comp)));
writeln(result2);
}
An enumerate() allows to write simpler code:
import std.stdio, std.algorithm, std.range;
void main() {
auto comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];
auto result1 = map!q{a[0]}(filter!q{!a[1]}(enumerate(comp)));
writeln(result1);
comp = comp[2..$];
auto result2 = map!q{a[0]}(filter!q{!a[1]}(enumerate(comp, 2)));
writeln(result2);
}
zip(iota(foo.length), foo) becomes even worse when the iterable foo is lazy and doesn't define a length:
zip(iota(walkLength(foo)), foo)
Comment #1 by bearophile_hugs — 2012-03-26T15:40:12Z
This is a basic implementation of enumerate() (it's only an InputRange, but probably a richer range too should be supported).
The demo code in main() shows the task of processing the Sieve of Eratosthenes flags without and with an enumerate(). The version with enumerate() is shorter and simpler to understand than the version with zip.
import std.stdio, std.algorithm, std.range, std.typecons, std.traits, std.array;
struct Enumerate(R) {
R r;
int i;
alias r this;
@property Tuple!(typeof(this.i), typeof(R.init.front)) front() {
return typeof(return)(i, this.r.front);
}
void popFront() {
this.r.popFront();
this.i++;
}
}
Enumerate!R enumerate(R)(R range, int start=0) if (isInputRange!R) {
return Enumerate!R(range, start);
}
void main() {
// not prime flags, from a Sieve of Eratosthenes.
// 0 = prime number, 1 = not prime number. starts from index 2.
auto flags = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];
// not using enumerate
iota(2, int.max).zip(flags).filter!q{!a[1]}().map!q{a[0]}().writeln();
iota(int.max).zip(flags).filter!q{!a[1]}().map!q{a[0] + 2}().writeln();
// using enumerate
//flags.enumerate(2).filter!q{!a[1]}().map!q{a[0]}().writeln(); // error
filter!q{!a[1]}(flags.enumerate(2)).map!q{a[0]}().writeln();
}
Note: this produces a compilation error, I don't know why yet, maybe it's a bad interaction with alias this:
flags.enumerate(2).filter!q{!a[1]}().map!q{a[0]}().writeln();
Comment #2 by bearophile_hugs — 2012-03-26T17:20:14Z
The problem was caused by the alias this. This avoids the problem:
import std.stdio, std.algorithm, std.range, std.typecons, std.traits, std.array;
struct Enumerate(R) {
R r;
int i;
@property bool empty() {
return this.r.empty;
}
@property Tuple!(typeof(this.i), typeof(R.init.front)) front() {
return typeof(return)(i, this.r.front);
}
void popFront() {
this.r.popFront();
this.i++;
}
}
Enumerate!R enumerate(R)(R range, int start=0) if (isInputRange!R) {
return Enumerate!R(range, start);
}
void main() {
auto flags = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];
flags.enumerate(2).filter!q{!a[1]}().map!q{a[0]}().writeln();
}
The last line is noisy but it's readable.
This is less noisy but longer:
flags.enumerate(2).filter!(t => !t[1])().map!(t => t[0])().writeln();
Comment #3 by bearophile_hugs — 2012-06-25T04:21:59Z
See also Issue 8155
----------
enumerate and AA.byPair() are also handy when you need to count associative array key-value pairs:
foreach (i, k, v; associativeArray.byPair.enumerate()) {...}
Comment #4 by bearophile_hugs — 2012-07-13T14:54:58Z
A well implemented enumerate() seems useful to avoid one of my recurrent D bugs. This is a simplified example to show the bug. I have to create a matrix like this:
0 10 20 30
0 11 21 31
0 12 22 32
There are many ways to inizialize a matrix like that, this is one way, but it's not complete:
import std.stdio;
void main() {
auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
foreach (r, row; M)
foreach (c, ref item; row)
item = c * 10 + r;
writefln("%(%s\n%)", M);
}
It outputs:
[0, 10, 20, 30]
[1, 11, 21, 31]
[2, 12, 22, 32]
To complete the algorithm, that is to not touch the first column, I can use just a slice in the second foreach, to not touch the first column:
import std.stdio;
void main() {
auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
foreach (r, row; M)
foreach (c, ref item; row[1 .. $])
item = c * 10 + r;
writefln("%(%s\n%)", M);
}
But this introduces the bug:
[0, 0, 10, 20]
[0, 1, 11, 21]
[0, 2, 12, 22]
Slicing 'row' from the second item I avoid to write on the first cells of each row, but the 'c' index doesn't get sliced, it starts from zero still. One correct version needs to increment c by one in the formula:
import std.stdio;
void main() {
auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
foreach (r, row; M)
foreach (c, ref item; row[1 .. $])
item = (c + 1) * 10 + r;
writefln("%(%s\n%)", M);
}
Another solution is to ignore the c==0:
foreach (r, row; M)
foreach (c, ref item; row)
if (c != 0)
item = c * 10 + r;
A well implemented enumerate() is one way to avoid that problem (other solutions are possible), now 'c' gets sliced, so it doesn't start from zero:
import std.stdio;
void main() {
auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
foreach (r, row; M)
foreach (c, ref item; enumerate(row)[1 .. $])
item = c * 10 + r;
writefln("%(%s\n%)", M);
}
Comment #5 by bearophile_hugs — 2012-07-16T04:15:47Z
A comment by Christophe Travert:
> enumerate could be useful with retro too. You may want to change the
> order of the enumeration, but not the order of the indices.
Comment #6 by bioinfornatics — 2012-11-20T05:05:05Z
I agree this is really an important feature when using Phobos range.
My need it is to get an enumerate isntance who lzazy evaluate (index, nth elements) to avoid copy/construct the new range which cost cpu time with a big range
Comment #7 by bearophile_hugs — 2013-03-04T17:48:13Z
An alternative idea is to introduce the "imap" and "ifilter" functions, similar to the "mapi" function of F# language:
import std.stdio: writeln;
import std.algorithm: imap;
void main() {
"abcd".imap!(t => t).writeln;
}
Should print:
[Tuple!(uint, dchar)(0, 'a'), Tuple!(uint, dchar)(1, 'b'), Tuple!(uint, dchar)(2, 'c'), Tuple!(uint, dchar)(3, 'd')]
This means that imap gives to the mapping function a tuple that contains the index and the item. ifilter works similarly (unfortunately D doesn't yet have a syntax for tuple unpacking in function signatures).
This is nice and handy, but enumerate() can be used in more situations. On the other hand imap is a little shorter:
"abcd".imap!(t => t).writeln;
"abcd".enumerate.imap!(t => t).writeln;
In the end I prefer enumerate.
Comment #8 by bearophile_hugs — 2013-03-30T14:07:25Z
(In reply to comment #7)
> In the end I prefer enumerate.
"enumerate.map" is more composable than "imap". But iFilter is handy. See Issue 9841
Comment #9 by jakobovrum — 2014-01-16T07:38:00Z
(In reply to comment #0)
> I suggest to add an enumerate() to Phobos std.range module. An alternative name
> is count().
Posted a pull request for a library implementation. The name `count` is already taken to mean something else.
https://github.com/D-Programming-Language/phobos/pull/1866
Comment #10 by bearophile_hugs — 2014-01-16T08:13:27Z
(In reply to comment #9)
> Posted a pull request for a library implementation. [...]
>
> https://github.com/D-Programming-Language/phobos/pull/1866
Seems nice.
A comment from the pull request:
> This could also be implemented in the language (though
> potentially ambiguous with automatic foreach unpacking),
> but here's a library solution for review.
In some cases I like language-level solutions, but in this case I think the library solution is good enough and safer.
Regarding your implementation, I suggest to add an optional "start" argument, as in Python enumerate():
auto data = [10, 20];
foreach (i; x; data.enumerate(3))
write(i, " ");
==>
3 4
Comment #11 by jakobovrum — 2014-01-16T09:32:06Z
(In reply to comment #10)
> > This could also be implemented in the language (though
> > potentially ambiguous with automatic foreach unpacking),
> > but here's a library solution for review.
>
> In some cases I like language-level solutions, but in this case I think the
> library solution is good enough and safer.
Review of a PR is probably best done on Github.
> Regarding your implementation, I suggest to add an optional "start" argument,
> as in Python enumerate():
>
> auto data = [10, 20];
> foreach (i; x; data.enumerate(3))
> write(i, " ");
>
> ==>
>
> 3 4
Thanks, I'll add it.
Comment #12 by bearophile_hugs — 2014-01-17T01:12:05Z
This should go in the unittests of enumerate():
void main() {
auto arr = [0];
foreach (immutable i, ref x; arr.enumerate)
x++;
assert(arr[0] == 1);
foreach (immutable i, ref x; arr.enumerate(10))
x++;
assert(arr[0] == 2);
}
And in the "documentation unittests" of enumerate I think it's a good idea to add (I explain this use case in the #Comment4):
void main() {
import std.stdio, std.range;
auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
foreach (immutable r, row; M)
foreach (immutable c, ref item; row.enumerate[1 .. $])
item = c * 10 + r;
writefln("%(%s\n%)", M);
}
With its output:
[0, 10, 20, 30]
[0, 11, 21, 31]
[0, 12, 22, 32]
Comment #13 by jakobovrum — 2014-01-17T01:34:19Z
(In reply to comment #12)
> This should go in the unittests of enumerate():
>
>
> void main() {
> auto arr = [0];
> foreach (immutable i, ref x; arr.enumerate)
> x++;
> assert(arr[0] == 1);
> foreach (immutable i, ref x; arr.enumerate(10))
> x++;
> assert(arr[0] == 2);
> }
>
As far as I can tell, this cannot be implemented; it is an error that the compiler accepts use of `ref` in foreach over a range that doesn't have assignable elements. It is a problem shared with many other higher order ranges.
> And in the "documentation unittests" of enumerate I think it's a good idea to
> add (I explain this use case in the #Comment4):
>
> void main() {
> import std.stdio, std.range;
>
> auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
> foreach (immutable r, row; M)
> foreach (immutable c, ref item; row.enumerate[1 .. $])
> item = c * 10 + r;
> writefln("%(%s\n%)", M);
> }
>
>
> With its output:
>
> [0, 10, 20, 30]
> [0, 11, 21, 31]
> [0, 12, 22, 32]
It's a good example, but I think it looks a little bit ugly without `ref`. I would also change [1 .. $] to .drop(1), but it is a minor stylistic issue.
Also, to be a documented unittest, it needs to use assertions instead of pipe I/O.
I still recommend you post on Github, it's likely that it will get more review and comments there. Also, it's only right that implementation comments go to the implementation proposed on Github, with the issue tracker dedicated to the problem statement in general...
Comment #14 by bearophile_hugs — 2014-01-17T01:59:48Z
(In reply to comment #13)
> As far as I can tell, this cannot be implemented; it is an error that the
> compiler accepts use of `ref` in foreach over a range that doesn't have
> assignable elements. It is a problem shared with many other higher order
> ranges.
I see. Instead of disallowing this quite useful feature, are there plans to improve the language to allow this feature in library code?
> I would also change [1 .. $] to .drop(1), but it is a minor stylistic issue.
Yes, in my code in a situation like that I'd like to use .dropOne, but I have used a simpler slicing because it's "simpler", it doesn't introduce more ranges from Phobos :-) When you explain one function in the documentation, you want to minimize the number of other functions used.
> Also, to be a documented unittest, it needs to use assertions instead of pipe
> I/O.
Yeah.
> I still recommend you post on Github, it's likely that it will get more review
> and comments there. Also, it's only right that implementation comments go to
> the implementation proposed on Github, with the issue tracker dedicated to the
> problem statement in general...
From my limited experience in commenting on GitHub, I've seen that currently the linear threads in those GitHub pages are even less handy than the linear threads here in Bugzilla, especially when the comments become longer than 2-3 lines of normal text. Here you can also add attaches.
Comment #15 by jakobovrum — 2014-01-17T02:09:51Z
(In reply to comment #14)
> (In reply to comment #13)
>
> > As far as I can tell, this cannot be implemented; it is an error that the
> > compiler accepts use of `ref` in foreach over a range that doesn't have
> > assignable elements. It is a problem shared with many other higher order
> > ranges.
>
> I see. Instead of disallowing this quite useful feature, are there plans to
> improve the language to allow this feature in library code?
I have no idea. Maybe it's possible in the current language to implement a `RefTuple` or something, but it's outside the scope of `enumerate`.
> > I would also change [1 .. $] to .drop(1), but it is a minor stylistic issue.
>
> Yes, in my code in a situation like that I'd like to use .dropOne, but I have
> used a simpler slicing because it's "simpler", it doesn't introduce more ranges
> from Phobos :-) When you explain one function in the documentation, you want to
> minimize the number of other functions used.
That is a good point, but `dropOne` is in the same module and a fairly basic algorithm, so I think it's more important to propagate idiomatic code. It should be fairly intuitive to understand even without foreknowledge of what `dropOne` does. Note that DDoc underlines `enumerate` in the examples, so it tends to be fairly readable even when "noise" such as other algorithms are used. Perhaps DDoc supports hyperlink referencing in example code to further ease the issue.
> > I still recommend you post on Github, it's likely that it will get more review
> > and comments there. Also, it's only right that implementation comments go to
> > the implementation proposed on Github, with the issue tracker dedicated to the
> > problem statement in general...
>
> From my limited experience in commenting on GitHub, I've seen that currently
> the linear threads in those GitHub pages are even less handy than the linear
> threads here in Bugzilla, especially when the comments become longer than 2-3
> lines of normal text. Here you can also add attaches.
Github line comments are a pretty good substitution for threaded comments.
Comment #16 by bearophile_hugs — 2014-01-17T02:19:38Z
(In reply to comment #15)
> Maybe it's possible in the current language to implement a
> `RefTuple` or something, but it's outside the scope of `enumerate`.
Even if a RefTuple will be implemented in Phobos, I think you will not be able to retrofit enumerate() to use it, because that will change the API of enumerate() :-(
Comment #17 by jakobovrum — 2014-01-17T02:22:20Z
(In reply to comment #16)
> Even if a RefTuple will be implemented in Phobos, I think you will not be able
> to retrofit enumerate() to use it, because that will change the API of
> enumerate() :-(
Depending on how compatible `Tuple` and `RefTuple` are, it may just be an additive change.
However, many other ranges share this problem, so fixing it just for `enumerate` is not a big win.
Comment #18 by github-bugzilla — 2014-09-20T12:14:04Z