Bug 7716 – Add an indexed overload to countUntil

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-03-15T09:55:00Z
Last change time
2012-12-11T05:30:54Z
Assigned to
nobody
Creator
andrej.mitrovich

Comments

Comment #0 by andrej.mitrovich — 2012-03-15T09:55:27Z
Sometimes I want to find the index of a substring, but not necessarily the index of the first occurence. E.g. in the string "foo x foo y" I want to get the index of the second foo. Here's a simple implementation based on the existing countUntil: import std.algorithm : startsWith; import std.array; import std.traits; sizediff_t countUntil(alias pred = "a == b", R, N)(R haystack, N needle, size_t index) if (is(typeof(startsWith!pred(haystack, needle)))) { size_t count; static if (isNarrowString!R) { // Narrow strings are handled a bit differently auto length = haystack.length; for (; !haystack.empty; haystack.popFront()) { if (startsWith!pred(haystack, needle)) { if (count == index) return length - haystack.length; else count++; } } } else { typeof(return) result; for (; !haystack.empty; ++result, haystack.popFront()) { if (startsWith!pred(haystack, needle)) { if (count == index) return result; else count++; } } } return -1; } void main() { string s = "foo x foo y"; auto idx1 = s.countUntil("foo", 0); auto idx2 = s.countUntil("foo", 1); auto idx3 = s.countUntil("foo", 2); assert(idx1 == 0); assert(idx2 == 6); assert(idx3 == -1); } Equivalent functionality could be added to std.string.indexOf to allow case-unsensitive searches.
Comment #1 by andrej.mitrovich — 2012-12-10T14:51:28Z
This niche feature belongs in user-space libraries, not Phobos.
Comment #2 by bearophile_hugs — 2012-12-10T15:28:19Z
This belongs in Phobos. See Python string functions, that have optional start-end: >>> help(str.find) Help on method_descriptor: find(...) S.find(sub [,start [,end]]) -> int Return the lowest index in S where substring sub is found, such that sub is contained within s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure. >>> help(str.index) Help on method_descriptor: index(...) S.index(sub [,start [,end]]) -> int Like S.find() but raise ValueError when the substring is not found.
Comment #3 by monarchdodra — 2012-12-11T05:30:54Z
(In reply to comment #0) > Sometimes I want to find the index of a substring, but not necessarily the > index of the first occurence. E.g. in the string "foo x foo y" I want to get > the index of the second foo. This would conflict with this other request: http://d.puremagic.com/issues/show_bug.cgi?id=5507 which would allow counting until multiple needles, eg: //---- string s = "foo x bar y"; size_t index = s.countUntil("bar", 'x'); assert(index == 4); //---- So what you are asking for may have to be implemented as "countUntilN" or something: //----- string s = "foo x bar y"; size_t index = s.countUntilN(1, "bar", 'x'); assert(index == 6); //First found 'x', then found "bar". //----- > Here's a simple implementation based on the > existing countUntil: Just want to point out that that implementation is based on an old and buggy implementation of countUntil. 2 problems: 1. No support for reference type ranges. 2. Incorrect support of unicode. The issue 1 is not that big of an issue, but is unacceptable of a standard library. Issue 2 depends on what you want. Do you want the *index* in the string, or the logical *range_position*: "日本語".countUntil(本); //1 popFront once "日本語".countUntil(本); //3 slice index: "日本語"[3 .. $] If you plan on using countUntil(N), then I'd recommend basing your implementation on https://github.com/D-Programming-Language/phobos/pull/951 Finally, keep in mind that the standard function "count" skips over overlaps: "ababab".count("abab"); //produces 1. Only 1 match. However, your countUntil implementation would find that second match: "ababab".countUntilN(1, "abab"); //Produces 2 This would be kind of weird: "Given a range which holds 1 instance of a needle, countUntilN is capable of finding the second occurrence of said needle :/"