Bug 9599 – File.byLine doesn't function properly with take
Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2013-02-26T23:46:00Z
Last change time
2013-08-19T11:22:08Z
Keywords
pull
Assigned to
monarchdodra
Creator
zshazz
Comments
Comment #0 by zshazz — 2013-02-26T23:46:15Z
Using 2.062, Regarding the following code:
---
import std.stdio, std.range;
void main() {
auto file = File.tmpfile();
file.write("1\n2\n3\n");
file.rewind();
auto fbl = file.byLine();
foreach(line; fbl.take(1)) writeln(line);
foreach(line; fbl) writeln(line);
}
---
The expected output for this would be:
---
1
2
3
---
but actual output:
---
1
3
---
Generalized observation: When take is used on a ByLine range, it takes the appropriate number of elements and then consumes one additional element preventing anything else from using it.
Comment #1 by monarchdodra — 2013-02-27T03:35:11Z
(In reply to comment #0)
> Using 2.062, Regarding the following code:
The bug is actually inside byLine itself, so we can remove take from the equation. The problem is that byLine is over-eager:
1) Creating a front element eagerly pops that element.
2) poping an element eagerlly parses the next, effectivelly popping it off too if it is never read:
Reduced test showing this:
//----
import std.stdio;
void main()
{
auto file = File.tmpfile();
file.write("1\n2\n3\n4\n5");
file.rewind();
auto fbl1 = file.byLine();
writeln(fbl1.front); //prints 1.
auto fbl2 = file.byLine();
writeln(fbl2.front); //prints 2... Wait. Who popped off 1?
fbl2.popFront(); //pops off 2, and consumes 3.
auto fbl3 = file.byLine();
writeln(fbl3.front); //prints 4.
}
//----
Ideally, byLine should be reworked to be a little more lazy, and better preserve the integrity of its underlying stream:
- "front means do NOT modify the referenced container"
- "pop means remove the CURRENT element, and stop there"
byLine is obviously not doing that. The fact that it is *just* an input range does not mean it gets to bypass standard rules.
Comment #2 by monarchdodra — 2013-02-27T10:09:30Z
(In reply to comment #1)
> (In reply to comment #0)
> > Using 2.062, Regarding the following code:
>
> The bug is actually inside byLine itself
byChunk is subject to the exact same issue.
The fact that they don't behave according to normal range semantics could be a potentially serious problems when not used linearly.
Comment #3 by monarchdodra — 2013-03-13T00:20:19Z
(In reply to comment #1)
> Ideally, byLine should be reworked to be a little more lazy, and better
> preserve the integrity of its underlying stream:
> - "front means do NOT modify the referenced container"
> - "pop means remove the CURRENT element, and stop there"
Either that, or take the
Comment #4 by nick — 2013-07-22T08:32:06Z
(In reply to comment #1)
> The bug is actually inside byLine itself, so we can remove take from the
> equation. The problem is that byLine is over-eager:
> 1) Creating a front element eagerly pops that element.
> 2) poping an element eagerlly parses the next, effectivelly popping it off too
> if it is never read:
https://github.com/D-Programming-Language/phobos/pull/1433
> auto file = File.tmpfile();
> file.write("1\n2\n3\n4\n5");
> file.rewind();
>
> auto fbl1 = file.byLine();
> writeln(fbl1.front); //prints 1.
>
> auto fbl2 = file.byLine();
> writeln(fbl2.front); //prints 2... Wait. Who popped off 1?
I think the above behaviour is understandable for a range like ByLine.
Comment #5 by github-bugzilla — 2013-08-19T10:50:18Z