Comment #0 by bearophile_hugs — 2010-12-26T07:11:51Z
This is the signature of File.byLine:
ByLine!(Char,Terminator) byLine(Terminator = char, Char = char)
(KeepTerminator keepTerminator = KeepTerminator.no, Terminator terminator = '\x0a');
But on Windows the line terminators are 2 chars long (CR+LF), see:
http://en.wikipedia.org/wiki/Newline#Representations
So I think the second argument of byLine() needs to be a string.
This is code I expected to use, that currently is not accepted:
import std.stdio;
void main() {
    auto lines = File("test.txt").byLine(File.KeepTerminator.no, "\r\n");
}
----------------
After that bug report, a small enhancement request: on Windows I usually open files with Windows-style line terminators, and on Linux I open files with Unix-style line terminators. So, if possible, a better default for the second argument of byLine() would be a string constant that changes according to the operating system.
----------------
A workaround is to open the file in text mode, but I don't know if this works well if you want to open a Windows-style file on Linux:
import std.stdio;
void main() {
    auto lines = File("test.txt", "r").byLine();
}
Comment #1 by andrei — 2013-01-08T01:11:43Z
This is by design. The length name is special and defined to return size_t compulsively. You may want to choose a different name instead.
Comment #2 by andrei — 2013-01-08T01:12:15Z
Oops, wrong window.
Comment #3 by dlang-bugzilla — 2013-03-16T01:40:03Z
Would it be acceptable if we special-cased byLine to strip a trailing \r if the terminator is \n?
Often, the programmer doesn't know beforehand if the line terminator of a text file will be \r\n or \n. A behavior close to that of splitLines would be more useful than forcing the programmer to choose an exact terminator sequence.
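For illustration, the special case can be approximated on the caller's side today. This is only a minimal sketch: stripTrailingCR is a hypothetical helper name (not part of Phobos), and byline_demo.txt is a scratch file created just for the demo. It keeps the default '\n' terminator and drops one leftover '\r' from each line.

```d
import std.stdio : File, writeln;
static import std.file;

// Hypothetical helper (not part of Phobos): drop a single trailing '\r', if present.
const(char)[] stripTrailingCR(const(char)[] line)
{
    if (line.length > 0 && line[$ - 1] == '\r')
        return line[0 .. $ - 1];
    return line;
}

void main()
{
    // Scratch file name is an assumption made for this demo only.
    enum name = "byline_demo.txt";

    // Write bytes exactly as given, mixing "\r\n" and "\n" terminators.
    std.file.write(name, "one\r\ntwo\nthree\r\n");
    scope (exit) std.file.remove(name);

    // Open in binary mode so the C runtime does not translate "\r\n" itself.
    foreach (line; File(name, "rb").byLine())
        writeln(stripTrailingCR(line));
}
```

With this approach the caller does not need to know in advance whether the file uses \r\n or \n, but a lone \r used as a terminator is still not recognised.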
Comment #4 by bearophile_hugs — 2013-03-17T17:44:06Z
(In reply to comment #3)
> Would it be acceptable if we special-cased byLine to strip a trailing \r if the
> terminator is \n?
Probably the problem presented in this issue has various solutions.
Comment #5 by andrej.mitrovich — 2013-03-18T12:11:08Z
*** Issue 9750 has been marked as a duplicate of this issue. ***
The default behaviour is unexpected, at least on Windows, where most files contain lines ending in \r\n.
Using the lineSeparator enum for cross-platform development does not guarantee that the file you are processing contains only the lineSeparator terminator.
If I ask for byLine, I expect to get a line and nothing else, not a line that still ends with another line terminator. What if my file contains some lines ending in \r and other lines ending in \r\n?
The default behaviour must strip terminators and must consume all known line separators; there is no point in discriminating between them:
- 0x0d
- 0x0d 0x0a
- 0x0a
- Unicode categories Zl, Zp.
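As a point of comparison, std.string.splitLines already treats all of the separators listed above as line breaks when working on text that is already in memory. The sketch below uses a made-up sample string and only shows that behaviour; it is not a proposal for how byLine itself should be implemented.

```d
import std.string : splitLines;

void main()
{
    // Sample text (invented for this example) mixing "\r", "\r\n", "\n",
    // U+2028 (category Zl) and U+2029 (category Zp).
    string text = "one\rtwo\r\nthree\nfour\u2028five\u2029six";

    // splitLines drops the terminators by default.
    auto lines = text.splitLines();

    assert(lines == ["one", "two", "three", "four", "five", "six"]);
}
```

The trade-off is that splitLines needs the whole text loaded first, whereas byLine reads lazily from the file.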
Comment #9 by robert.schadek — 2024-12-01T16:13:45Z