Comment #0 by andrej.mitrovich — 2011-08-14T12:12:05Z
I can use readText, but if I want to use the byLine range with a call to array() to save the lines of a text I end up getting garbage (I know why):
auto file = File("foo.txt", "r");
auto lineBuff = array(file.byLine());
I don't know if it can be fixed, I mean array() is not to blame (it only exhausts a range), and byLine reuses a char[] buffer to stay efficient. But it looks like something that could be thought about.
As a workaround I can do:
lineBuff = readText(r"file.txt").splitLines;
So it's not a big issue, but I ran into it and hence this report.
Comment #1 by dlang-bugzilla — 2011-08-14T12:14:15Z
Another workaround:
array(map!`a.idup`(file.byLine()))
Comment #2 by andrej.mitrovich — 2011-08-14T12:33:03Z
(In reply to comment #1)
> Another workaround:
>
> array(map!`a.idup`(file.byLine()))
Thanks, first I tried .idup on the right side but that didn't work.
I think this falls under "suble things you should be aware of", not really a bug.
Comment #3 by bearophile_hugs — 2012-10-16T13:28:26Z
One solution to avoid this problem is to modify byLine to perform a copy on default and not perform it on request:
auto file = File("foo.txt", "r");
auto lineBuff = array(file.byLine());
==> good
auto file = File("foo.txt", "r");
auto lineBuff = array(file.byLine!false());
==> does't copy, lineBuff contains garbage
This makes the code safe (but a little slower) on default, when you don't know what you are doing, and offers a simple way to have faster code when you know what you are doing and you know you don't want a copy (this ides also follows the general safety philosophy of D).
Comment #4 by hsteoh — 2013-07-31T22:17:25Z
I like the idea of making byLine non-transient by default, but allowing a compile-time option to make it transient when you want to optimize your code (so the user is responsible for making sure transience doesn't break his code before turning it on).
Comment #5 by hsteoh — 2013-07-31T22:18:45Z
I don't like the use of a boolean flag, though. Maybe byLine!CopyBuffer() (default) vs. byLine!ReuseBuffer()?
Comment #6 by andrej.mitrovich — 2013-08-01T06:33:14Z
(In reply to comment #4)
> I like the idea of making byLine non-transient
This "non-transient" term is being used often lately but it has no definition on the website, do you think you could write up a pull for dlang.org so we can officially document it? Otherwise we're lost in the conversation. :)
Comment #7 by hsteoh — 2013-08-01T09:58:04Z
Transient just means the value of .front changes when you call .popFront(), so you cannot simply assign .front to a variable and expect it to retain its value after you popFront(). Non-transient just means that you can assign .front to a variable and rest assured that popFront() will not mutate its value from under you.
I'm not sure we want to put this on the website just yet -- it *is* a term that we coined for the sake of distinguishing between, say, the current byLine(), which reuses the .front buffer, and the proposed change where it calls .dup on every line returned.
Comment #8 by monarchdodra — 2013-08-01T10:15:05Z
(In reply to comment #7)
> Transient just means the value of .front changes when you call .popFront(), so
> you cannot simply assign .front to a variable and expect it to retain its value
> after you popFront(). Non-transient just means that you can assign .front to a
> variable and rest assured that popFront() will not mutate its value from under
> you.
>
> I'm not sure we want to put this on the website just yet -- it *is* a term that
> we coined for the sake of distinguishing between, say, the current byLine(),
> which reuses the .front buffer, and the proposed change where it calls .dup on
> every line returned.
Not *quite* as simple as that. If front returns a "non reference type" (eg an int) but by reference, is it transient?
struct S
{
int front;
void popFront()
{++front;}
enum empty = false;
}
int a = myRange.front; //Non transient
int* b = &myRange.front; //Transient
myRange.popFront();
assert(*b == a); //Fails
Is that range transient? Is the range "reference transient"?
Comment #9 by hsteoh — 2013-08-01T10:50:33Z
No, transience is a property of the *range*, not what you do with the data returned by the range. Taking the address of .front technically undefined, since you can't guarantee that that's even possible for arbitrary range types (in general, it's not possible, since .front may be implemented as a @property function). If .front were a ref function, then you may have a point.
Comment #10 by monarchdodra — 2013-08-01T11:16:36Z
(In reply to comment #9)
> Taking the address of .front technically undefined,
> since you can't guarantee that that's even possible for arbitrary range types
> (in general, it's not possible, since .front may be implemented as a @property
> function). If .front were a ref function, then you may have a point.
That's BS. That's like saying indexing is undefined because an arbitrary range isn't indexable. We have the `hasLValueElements` trait specifically for this. Ranges that offer references to their internal exist, and are a class of their own. Also, the "@property" thing is a non-argument: It is just buggy behaviour; The new implementation (as explained by Kenji) is that &propFun should and *will* return the address of what is returned after having called propFun.
That said, I'll *agree* that there is not a single trait that guarantees references remain valid after a pop, nor does *anybody* rely on it, so we can indeed safely say that my point was, well, pointless :)
Comment #11 by b2.temp — 2017-08-26T18:15:25Z
Isn't .byLineCopy fix this issue ?
Comment #12 by schveiguy — 2017-09-13T15:47:52Z
(In reply to b2.temp from comment #11)
> Isn't .byLineCopy fix this issue ?
Yes. Marking as WONTFIX as I think that's the closest to what happened.