Comment #0 by bearophile_hugs — 2011-05-10T16:30:55Z
This D2 program seems to go into an infinite loop (dmd 2.053beta):
import std.string;
void main() {
    split("a test", "");
}
------------------------
My suggestion is to add code like this in std.array.split():
if (delim.length == 0)
    return split(s);
This means that an empty splitting string is treated like splitting on generic whitespace. This is useful in code like:
auto foo(string txt, string delim="") {
    return txt.split(delim);
}
This means that calling foo with only txt splits it on whitespace, and calling it with a delimiter splits on the given string. This allows using both forms of split in foo() without if conditions. Python does the same, with None used instead of an empty string.
The modified split is something like this (there is an isSomeString!S2 check because strings are special: they aren't generic arrays, and splitting on whitespace is meaningful for strings only):
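A caller can get the same effect without any change to split(). The following is only a minimal sketch of that idea; splitOrWhitespace is a hypothetical helper name, not a Phobos function:

```d
import std.array : split;

// Hypothetical helper (not part of Phobos): an empty delim means
// "split on whitespace"; any other delim is used as the separator.
string[] splitOrWhitespace(string txt, string delim = "")
{
    return delim.length == 0 ? txt.split() : txt.split(delim);
}

void main()
{
    // No delimiter given: runs of whitespace separate the words.
    assert(splitOrWhitespace("a  test\tcase") == ["a", "test", "case"]);
    // Explicit delimiter: split on that string.
    assert(splitOrWhitespace("a,b,c", ",") == ["a", "b", "c"]);
}
```

This keeps the whitespace default local to the caller, so split() itself does not have to treat strings differently from other ranges.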
Unqual!(S1)[] split(S1, S2)(S1 s, S2 delim)
if (isForwardRange!(Unqual!S1) && isForwardRange!S2)
{
    Unqual!S1 us = s;
    if (isSomeString!S2 && delim.length == 0)
    {
        return split(s);
    }
    else
    {
        auto app = appender!(Unqual!(S1)[])();
        foreach (word; std.algorithm.splitter(us, delim))
        {
            app.put(word);
        }
        return app.data;
    }
}
Besides this change, I presume std.algorithm.splitter() also needs to test for an empty delim.
Comment #1 by bearophile_hugs — 2011-09-25T08:16:21Z
Alternative: throw an ArgumentError("delim argument is empty") exception if delim is empty.
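As a rough illustration of this alternative, a wrapper can reject the empty delimiter before calling split(). This is a sketch under assumptions: splitNonEmpty is a hypothetical name, and a plain Exception raised through std.exception.enforce stands in for the ArgumentError mentioned above:

```d
import std.array : split;
import std.exception : assertThrown, enforce;

// Hypothetical wrapper (not part of Phobos): refuse an empty delimiter
// instead of guessing what the caller meant.
string[] splitNonEmpty(string s, string delim)
{
    // enforce throws a plain Exception here; a dedicated error type
    // would be an equally valid choice.
    enforce(delim.length != 0, "delim argument is empty");
    return split(s, delim);
}

void main()
{
    assert(splitNonEmpty("a test", " ") == ["a", "test"]);
    assertThrown!Exception(splitNonEmpty("a test", ""));
}
```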
Comment #2 by monarchdodra — 2012-10-22T02:42:42Z
*** Issue 8551 has been marked as a duplicate of this issue. ***
Comment #3 by monarchdodra — 2012-10-22T02:52:16Z
(In reply to comment #0)
> This D2 program seems to go into an infinite loop (dmd 2.053beta):
>
>
> import std.string;
> void main() {
> split("a test", "");
> }
>
> ------------------------
>
> My suggestion is to add code like this in std.array.split():
>
> if (delim.length == 0)
> return split(s);
>
> This means that an empty splitting string is like splitting on generic
> whitespace. This is useful in code like:
>
> auto foo(string txt, string delim="") {
> return txt.split(delim);
> }
I think it is a bad idea on two counts:
1. If the user wanted that behavior, he'd have written it as such. If the user
actually passed a separator that is an empty range, he probably didn't mean for it to split by spaces.
2. I think it would also bring a deviation of behavior between strings and
non-strings. Supposing r is empty:
* "hello world".split(""); //Ok, split white
* [1, 2].split(r); //Derp.
(In reply to comment #1)
> Alternative: throw an ArgumentError("delim argument is empty") exception if
> delim is empty.
I *really* think that is a *much* saner approach. Splitting with an empty
separator is just not logical. Trying to force a default behavior in that scenario is wishful thinking (IMO).
I think it should throw an error. I'll implement this.
Comment #4 by hsteoh — 2013-01-03T20:28:42Z
FWIW, in Perl, splitting on an empty string simply returns an array of characters. I think that better reflects the symmetry with join("", array).
Comment #5 by bearophile_hugs — 2013-11-18T02:46:27Z
After this pull:
https://github.com/D-Programming-Language/phobos/pull/1502
This program:
void main() {
    import std.string, std.stdio;
    auto r = split("a test", "");
    pragma(msg, typeof(r));
    r.writeln;
}
Gives:
string[]
["a", " ", "t", "e", "s", "t"]
And this program:
void main() {
    import std.algorithm, std.stdio;
    auto r = splitter("a test", "");
    r.writeln;
}
Gives the same output:
["a", " ", "t", "e", "s", "t"]
It's different from what Python does:
>>> "a test".split("")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: empty separator
But it's much better than an infinite loop, it can often be useful, and I think it's acceptable, so I'm closing the issue.
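The symmetry with join raised in comment #4 can be checked against this behavior. The sketch below assumes the output reported in comment #5 for an ASCII input; other Phobos versions or non-ASCII strings may behave differently:

```d
import std.array : join, split;

void main()
{
    // Assumes the post-pull behavior reported in comment #5:
    // an empty separator yields one element per character.
    auto parts = split("a test", "");
    assert(parts == ["a", " ", "t", "e", "s", "t"]);

    // Joining with the same empty separator restores the input.
    assert(join(parts, "") == "a test");
}
```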