Bug 9493 – std.algorithm.canFind returns true for empty string in array of integers
Status: RESOLVED
Resolution: WONTFIX
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: x86_64
OS: Windows
Creation time: 2013-02-09T23:58:00Z
Last change time: 2014-02-09T14:35:45Z
Assigned to: nobody
Creator: monkeyworks12
Comments
Comment #0 by monkeyworks12 — 2013-02-09T23:58:48Z
It seems that std.algorithm.canFind returns true when checking for the empty string within an array of integers. This seems like unintended behaviour to me, so I'm reporting it as a bug. This bug is present on at least DMD 2.061.
Example:
import std.algorithm;

void main()
{
    // This assertion should fail, but doesn't
    assert(canFind([1, 2, 3, 4], ""));
}
Comment #1 by monarchdodra — 2013-02-10T03:34:19Z
(In reply to comment #0)
> It seems that std.algorithm.canFind returns true when checking for the empty
> string within an array of integers. This seems like unintended behaviour to me,
> so I'm reporting it as a bug. This bug is present on at least DMD 2.061.
>
> Example:
> import std.algorithm;
>
> void main()
> {
> //This assertion should fail, but doesn't
> assert(canFind([1, 2, 3, 4], ""));
> }
I think this is intended behavior. You are basically looking for instances of *nothing*, which, by definition, can be found inside everything (*). I'd expect true to be returned here, and this would be consistent with the rest of the finds (AFAIK).
Are you getting different behavior for, say, arrays? E.g. "canFind([1, 2, 3], (int[]).init)". I don't have access to my compiler, so that's an actual question. If you *are* getting different behavior, then I'd argue *that's* a bug.
*: The only ambiguous case I see is if the haystack is empty, in which case I could see it going both ways, but I'd still lean towards "true", since "empty is empty", so "empty can be found inside empty".
Comment #2 by issues.dlang — 2013-02-10T03:46:42Z
I think that there's a good chance that it's intended behavior that looking for an empty range with canFind always returns true. However, the fact that canFind is accepting a _string_ as the needle when the haystack is int[] seems very wrong.
Comment #3 by peter.alexander.au — 2014-02-09T10:06:12Z
int is comparable with dchar, so this is working as intended.
Note: you can also do this:
assert(canFind([1, 2, 3, 4], "\x02\x03")); // success
assert(canFind([1, 2, 3, 4], "\x02\x05")); // fail
Remember that the "needle" can be a range as well.
If there are no further issues, I'll resolve this as invalid.
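To make the two points above concrete, here is a minimal sketch of how the needle is interpreted. It assumes a reasonably recent DMD and uses only `canFind` from `std.algorithm`; the variable name `haystack` is just for illustration.

```d
import std.algorithm : canFind;

void main()
{
    // int and character types compare by numeric value.
    dchar c = 2;
    assert(2 == c);

    int[] haystack = [1, 2, 3, 4];

    // A single value is treated as an element needle.
    assert(canFind(haystack, 3));
    assert(!canFind(haystack, 7));

    // A string is a range of dchar, so it is treated as a sub-sequence needle.
    assert(canFind(haystack, "\x02\x03"));   // [2, 3] occurs in the haystack
    assert(!canFind(haystack, "\x02\x05"));  // [2, 5] does not

    // The empty string is an empty sub-sequence, which is found in a
    // non-empty haystack.
    assert(canFind(haystack, ""));
}
```

Seen this way, the original report is the sub-sequence case with a needle of length zero.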
Comment #4 by monkeyworks12 — 2014-02-09T13:04:11Z
I'm not particularly satisfied with that, as I filed this as the result of a bug this behaviour caused, but I can't think of a good solution, either. Thoughts?
Comment #5 by peter.alexander.au — 2014-02-09T13:12:47Z
(In reply to comment #4)
> I'm not particularly satisfied with that, as I filed this as the result of a
> bug this behaviour caused, but I can't think of a good solution, either.
> Thoughts?
My thoughts:
1. We cannot change the fact that int is comparable with dchar. It would break too much code.
2. We cannot change the fact that canFind works with two ranges of comparable elements. It was part of the design, and would break too much code anyway.
I don't see any way around this.
I admit that at first glance it doesn't look like it should compile, but it does make sense once you consider what's happening.
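For code that wants to rule this out locally, one possible user-side guard is a thin wrapper with a stricter template constraint. This is only a sketch: `strictCanFind` is a hypothetical helper, not part of Phobos, and it covers only the range-needle form of `canFind`.

```d
import std.algorithm : canFind;
import std.range : ElementType, isForwardRange;
import std.traits : Unqual;

// Hypothetical helper, not part of Phobos: accept a range needle only when
// its element type matches the haystack's element type (ignoring qualifiers).
bool strictCanFind(R1, R2)(R1 haystack, R2 needle)
if (isForwardRange!R1 && isForwardRange!R2 &&
    is(Unqual!(ElementType!R1) == Unqual!(ElementType!R2)))
{
    return canFind(haystack, needle);
}

void main()
{
    // Same element type on both sides: behaves like canFind.
    assert(strictCanFind([1, 2, 3, 4], [2, 3]));
    assert(!strictCanFind([1, 2, 3, 4], [2, 5]));
    assert(strictCanFind("hello", "ell"));

    // int[] haystack with a string needle is now rejected at compile time.
    static assert(!__traits(compiles, strictCanFind([1, 2, 3, 4], "")));
}
```

This does not change `canFind` itself; it only moves the mismatch from a silent runtime result to a compile error at the call sites that opt in.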
Comment #6 by monkeyworks12 — 2014-02-09T13:26:30Z
(In reply to comment #5)
> My thoughts:
>
> 1. We cannot change the fact that int is comparable with dchar. It would break
> too much code.
> 2. We cannot change the fact that canFind works with two ranges of comparable
> elements. It was part of the design, and would break too much code anyway.
>
> I don't see any way around this.
>
> I admit that at a first glance it doesn't look like it should compile, but it
> does make sense once you consider what's happening.
I suppose, though maybe it should say in the documentation that such edge cases may occur, using this as an example.
Comment #7 by peter.alexander.au — 2014-02-09T13:42:32Z
(In reply to comment #6)
> I suppose, though maybe it should say in the documentation that such edge cases
> may occur, using this as an example.
Where do you draw the line with such things though? ints comparing with dchars is part of the language. If you add a warning about it in canFind, then do you add it in every function that does a comparison?
All of the following compile and the assertions hold.
// Note: 32 == ' ' in ASCII
assert(count([0, 16, 32, 64], ' ') == 1);
assert(commonPrefix([32, 32, 30], " x") == " ");
assert(equal([32], " "));
assert([32].filter!(x => x != ' ').empty);
There are many more situations where this can come up. I can't see the benefit of warning about it in the documentation. D programmers just need to be aware that characters are a form of integer and that they can be compared.
Comment #8 by monkeyworks12 — 2014-02-09T14:32:22Z
(In reply to comment #7)
> assert(commonPrefix([32, 32, 30], " x") == " ");
> assert(equal([32], " "));
> assert([32].filter!(x => x != ' ').empty);
This is bad and we should all feel bad that this C anachronism still persists in any language. I've never liked the fact that characters are really just numbers. Anyway, go ahead and close it.
Comment #9 by peter.alexander.au — 2014-02-09T14:35:45Z
(In reply to comment #8)
> (In reply to comment #7)
> > assert(commonPrefix([32, 32, 30], " x") == " ");
> > assert(equal([32], " "));
> > assert([32].filter!(x => x != ' ').empty);
>
> This is bad and we should all feel bad that this C anachronism still persists
> in any language. I've never liked the fact that characters are really just
> numbers. Anyway, go ahead and close it.
Maybe in D3? :-)