Bug 10717 – std.ascii.toLower and toUpper should return char instead of dchar, so that I can avoid a bad cast(char)
Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2013-07-26T05:06:00Z
Last change time
2013-09-22T12:43:57Z
Keywords
pull
Assigned to
nobody
Creator
bearophile_hugs
Comments
Comment #0 by bearophile_hugs — 2013-07-26T05:06:14Z
If I use functions from std.ascii I am stating that I am working with ASCII chars.
Also std.ascii docs say: "All of the functions in std.ascii accept unicode characters but effectively ignore them."
So if I use std.ascii.toLower on an ASCII char, most of the time (every time so far) I want the result to be a char. If std.ascii.toLower returns a dchar, it forces me to use a cast(char) in most (or all) cases. Such a cast doesn't increase safety at all; it actually decreases it.
So I'd like std.ascii.toLower and std.ascii.toUpper to return char.
(This has a severity 'normal' instead of 'enhancement' because I regard that as a bug in the design of those two functions.)
Comment #1 by issues.dlang — 2013-07-26T05:17:49Z
I really don't think that this is a bug. By taking dchar, they work with ranges of dchar and allow you to operate on strings that contain Unicode in cases where all you care about are particular ASCII characters (like when you only care about ASCII whitespace in a string - not Unicode whitespace - but the string contains Unicode characters). And as soon as the functions accept dchar, they must return dchar, or they'll destroy any Unicode characters that get passed in.
A reasonable enhancement request would be to overload these functions with overloads which take char and return char, but there is actual value in having them accept and return dchar, and it would break existing code if they stopped. So, the functions that are there are there to stay, but you may get overloads which operate specifically on char.
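To make the truncation concern concrete, here is a minimal sketch of what a char-returning function taking dchar would do to non-ASCII input. The helper `toLowerTruncating` is hypothetical and not part of Phobos; it only mimics the narrowing cast under discussion.

```d
import std.ascii : toLower;

// Hypothetical helper (not in Phobos): lowercases ASCII letters and then
// narrows the result to char, as a char-returning toLower(dchar) would.
char toLowerTruncating(dchar c)
{
    return cast(char) toLower(c);
}

unittest
{
    // ASCII input survives the narrowing cast.
    assert(toLowerTruncating('A') == 'a');

    // std.ascii.toLower itself passes non-ASCII code points through unchanged.
    dchar euro = '\u20AC';
    assert(toLower(euro) == euro);

    // Narrowing U+20AC to 8 bits keeps only the low byte, 0xAC,
    // so the original code point is lost.
    assert(toLowerTruncating(euro) == 0xAC);
}
```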
Comment #2 by bearophile_hugs — 2013-07-26T07:33:37Z
(In reply to comment #1)
> So, the functions that are there are there to stay, but you may get overloads
> which operate specifically on char.
OK. Thank you for your comments and corrections, Jonathan.
(In reply to comment #4)
> (In reply to comment #3)
> > https://github.com/D-Programming-Language/phobos/pull/1436
>
> Thank you again Jonathan :-)
The new semantics are:
//----
If $(D c) is an uppercase ASCII character, then its corresponding lowercase
letter is returned. Otherwise, $(D c) is returned.
$(D C) can be any type which implicitly converts to $(D dchar). In the case
where it's a built-in type, $(D Unqual!C) is returned, whereas if it's a
user-defined type, $(D dchar) is returned.
//----
Does this fit the bill, or do you see anything else that needs to be addressed?
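As a rough illustration of the documented semantics, the following sketch checks the return types for built-in character types. It assumes a Phobos version that includes the change from the pull request above; the user-defined-type case is not exercised here.

```d
import std.ascii : toLower, toUpper;

unittest
{
    char c = 'A';
    wchar w = 'A';
    dchar d = 'A';
    immutable char ic = 'A';

    // For built-in types, the unqualified argument type is returned.
    static assert(is(typeof(toLower(c)) == char));
    static assert(is(typeof(toLower(w)) == wchar));
    static assert(is(typeof(toLower(d)) == dchar));
    static assert(is(typeof(toLower(ic)) == char));

    // The result can be stored in a char without a cast.
    char lower = toLower(c);
    assert(lower == 'a');

    // Characters that are not uppercase ASCII letters come back unchanged.
    assert(toLower('1') == '1');
    assert(toUpper('z') == 'Z');
}
```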
Comment #6 by bearophile_hugs — 2013-07-31T05:15:58Z
(In reply to comment #5)
> Does this fit the bill, or do you see anything else that needs to be addressed?
I have tried it, and the semantics seem OK.
But the implementation is:
auto toLower(C)(C c) if(is(C : dchar)) {
    static if(isScalarType!C)
        return isUpper(c) ? cast(Unqual!C)(c + 'a' - 'A') : cast(Unqual!C)c;
    else
        return toLower!dchar(c);
}
So a call to toLower can cause two function calls. toLower() is a tiny function that could be called many many times, so perhaps in debug mode (without inlining) doubling the number of function calls slows down the user code a bit.
Comment #7 by monarchdodra — 2013-07-31T12:26:58Z
(In reply to comment #6)
> So a call to toLower can cause two function calls. toLower() is a tiny function
> that could be called many many times, so perhaps in debug mode (without
> inlining) doubling the number of function calls slows down the user code a bit.
Well, it can cause two function calls only for user-defined types, so that case should not be common. Doing it this way also means you'll only end up instantiating `toLower!dchar` for all user-defined types, which is a plus.
One of the implementations I had suggested was:
////--------
//Public template that only filters and changes return type
auto toUpper(C)(C c) @safe pure nothrow
    if (is(C : dchar))
out(result)
{
    assert(!isLower(result));
}
body
{
    alias Ret = Select!(isScalarType!C, C, dchar);
    return cast(Ret) toUpperImpl(c);
}
//Non template that does actual job.
private dchar toUpperImpl(dchar c) @safe pure nothrow
{
    return isLower(c) ? cast(dchar)(c - ('a' - 'A')) : c;
}
////--------
Comment #8 by github-bugzilla — 2013-08-02T01:49:55Z