Bug 11837 – String literals should convert to const(void)*

Status: RESOLVED
Resolution: WONTFIX
Severity: enhancement
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2013-12-28T08:08:00Z
Last change time: 2016-02-06T20:34:34Z
Keywords: pull
Assigned to: nobody
Creator: yebblies

Comments

Comment #0 by yebblies — 2013-12-28T08:08:31Z
Code like this is perfectly valid:

    memcmp(ptr, "abc");

But it fails in D because, although string literals convert to const(char)* and const(char)* converts to const(void)*, string literals do not convert to const(void)*.
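
The three conversions described in this report can be probed at compile time. The following is a minimal sketch, not taken from the thread: the enum names are made up for illustration, and the third result is printed rather than asserted because it depends on the compiler version in use.

```d
import std.stdio : writeln;

// Step 1: a string literal converts to const(char)*.
enum literalToCharPtr = __traits(compiles, { const(char)* x = "abc"; });

// Step 2: a const(char)* converts to const(void)*.
enum charPtrToVoidPtr = __traits(compiles, {
    const(char)* x = "abc";
    const(void)* y = x;
});

// Step 3: the direct conversion this report asks for.
enum literalToVoidPtr = __traits(compiles, { const(void)* y = "abc"; });

void main()
{
    writeln("literal -> const(char)*:      ", literalToCharPtr);
    writeln("const(char)* -> const(void)*: ", charPtrToVoidPtr);
    writeln("literal -> const(void)*:      ", literalToVoidPtr);
}
```

If the third line prints false while the first two print true, the compiler shows the gap described above.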
Comment #1 by bearophile_hugs — 2013-12-28T08:33:27Z
(In reply to comment #0)
> Code like this is perfectly valid:
>
> memcmp(ptr, "abc");
>
> But it fails in D because although string literals convert to const(char)*, and
> const(char)* converts to const(void)*, string literals do not convert to
> const(void)*

What are the advantages, disadvantages and possible risks of this change?
Comment #2 by yebblies — 2013-12-28T08:37:19Z
Code like this will compile: memcmp(ptr, "abc", 3);
Comment #3 by monarchdodra — 2013-12-28T12:54:42Z
(In reply to comment #2)
> Code like this will compile:
>
> memcmp(ptr, "abc", 3);

What's wrong with `memcmp(ptr, "abc".ptr, 3)`?

I seem to remember there is an issue with null termination in this kind of usage?
Comment #4 by yebblies — 2013-12-28T19:53:09Z
(In reply to comment #3)
> (In reply to comment #2)
> > Code like this will compile:
> >
> > memcmp(ptr, "abc", 3);
>
> What's wrong with `memcmp(ptr, "abc".ptr, 3)`?

Adding .ptr loses the guarantee that the string will be 0-terminated, e.g.:

    // enum x = "abc";
    // immutable x = "abc";
    auto x = "abc";

    memcmp(ptr, x.ptr, 4); // oops, no guarantee x is 0-terminated, but the compiler has no way to know that's what you wanted.

> I seem to remember there is an issue with null termination in this kind of
> usage?

...?

The fact that string literals don't convert to const(void)* is IMO an annoying special case.

This works:

    const(char)* x = "askjldfg";
    const(void)* y = x;

But this doesn't:

    const(void)* y = "askjldfg";

Unless there's a good reason this has to be prevented...
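
To make the .ptr concern concrete, here is a small sketch (not from the thread) of a string that is not 0-terminated at its own length. The helpers `firstThree` and `cLength` are hypothetical names introduced for illustration; `strlen` and `std.string.toStringz` are the standard functions.

```d
import core.stdc.string : strlen;
import std.stdio : writeln;
import std.string : toStringz;

// Hypothetical helper: returns a slice that shares memory with its argument.
string firstThree(string s)
{
    return s[0 .. 3];
}

// Hypothetical helper: length as C code would see it, up to the first 0 byte.
size_t cLength(const(char)* p)
{
    return strlen(p);
}

void main()
{
    string x = firstThree("abcdef");
    assert(x.length == 3);

    // x.ptr points into the literal "abcdef", so C code keeps reading past
    // the end of the slice until it reaches the literal's own terminator.
    writeln(cLength(x.ptr));

    // toStringz returns a pointer that is 0-terminated at x.length.
    writeln(cLength(x.toStringz));
}
```

The compiler accepts `x.ptr` for any string, which is why it cannot tell whether the caller relied on a terminator being present.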
Comment #5 by yebblies — 2013-12-28T21:24:04Z
Comment #6 by monarchdodra — 2013-12-29T09:33:33Z
(In reply to comment #4)
> (In reply to comment #3)
> > I seem to remember there is an issue with null termination in this kind of
> > usage?
>
> ...?

What I meant here is what you explained just above:

> > What's wrong with `memcmp(ptr, "abc".ptr, 3)`?
>
> Adding .ptr loses the guarantee that the string will be 0-terminated.
>
> eg
> // enum x = "abc";
> // immutable x = "abc";
> auto x = "abc";
>
> memcmp(ptr, x.ptr, 4); // oops, no guarantee x is 0-terminated, but the
> compiler has no way to know that's what you wanted.

This may be a bit off topic, but what is the rationale behind this behavior? Why can't *all* string literals be 0 terminated, even if you explicitly extract a pointer out of them with ".ptr"?

> The fact that string literals don't convert to const(void)* is IMO an annoying
> special case.
>
> This works:
> const(char)* x = "askjldfg";
> const(void)* y = x;
>
> But this doesn't:
> const(void)* y = "askjldfg";
>
> Unless there's a good reason this has to be prevented...

If a string literal implicitly casts to "const(char)*", then it absolutely 100% must be implicitly castable to "const(void)*". It only makes sense.

Though personally, I find the fact that you can *implicitly* extract any pointer from a string literal suboptimal :/
Comment #7 by yebblies — 2013-12-29T09:55:36Z
(In reply to comment #6)
> > memcmp(ptr, x.ptr, 4); // oops, no guarantee x is 0-terminated, but the
> > compiler has no way to know that's what you wanted.
>
> This may be a bit off topic, but what is the rationale behind this behavior?
> Why can't *all* string literals be 0 terminated, even if you explicitly extract
> a pointer out of them with ".ptr" ?

All string literals are guaranteed to be 0 terminated, even if you use .ptr on them. The thing is, manifest constants that expand to string literals also behave like this, so if this compiles you know it is safe:

    printf(formatstr, ...);

But in this case, you can't tell:

    printf(formatstr.ptr, ...); // was it really a string literal?

> If a string literal implicitly casts to "const(char)*", then it absolutely 100%
> must be implicitly castable to "const(void)*". It only makes sense.

Ok, good. This is pretty much just a convenience for porting C/C++ code, and removes what I see as an unnecessary limitation.

> Though personally, I find that the fact that you can *implicitly* extract any
> pointer from a string literal to be suboptimal :/

If it's safe, I don't see the harm.
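
The distinction drawn here between manifest constants and ordinary string variables can be checked with a short sketch. Everything named below (`takesCString`, `manifestFmt`, `runtimeFmt`) is hypothetical and stands in for printf and its format string; it is an illustration, not code from the thread.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a C function such as printf.
void takesCString(const(char)* s) {}

enum manifestFmt = "value: %d\n";   // manifest constant, expands to a literal
string runtimeFmt = "value: %d\n";  // ordinary variable of type string

// The manifest constant is accepted where a const(char)* is expected.
enum manifestConverts = __traits(compiles, takesCString(manifestFmt));

// A string variable is not accepted implicitly...
enum variableConverts = __traits(compiles, takesCString(runtimeFmt));

// ...but it is accepted with .ptr, with no check that a terminator exists.
enum variablePtrConverts = __traits(compiles, takesCString(runtimeFmt.ptr));

void main()
{
    writeln("manifest constant:  ", manifestConverts);
    writeln("string variable:    ", variableConverts);
    writeln("string variable.ptr: ", variablePtrConverts);
}
```

So a call that compiles without .ptr carries the 0-termination guarantee, while a call that needed .ptr does not.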
Comment #8 by bugzilla — 2013-12-29T23:09:38Z
There's code in dmd to specifically disallow this. I believe the reason is because of function and template overloading. I'm not content with this change simply passing the existing test suite. It's a more subtle, substantive change than that.
Comment #9 by yebblies — 2013-12-29T23:37:33Z
(In reply to comment #8)
> There's code in dmd to specifically disallow this. I believe the reason is
> because of function and template overloading.

Do you know what the actual problem is/was? The code in dmd that disallows this is ancient, and may well address a problem that no longer exists. Can you remember why you disabled it in the first place? Did you document this anywhere?

As for overloading, this code works as expected, because the conversion to const(char)* is preferred.

    import core.stdc.stdio;

    void call(const(char)* str) { printf("const(char)*\n"); }
    void call(const(void)* str) { printf("const(void)*\n"); }
    void call(const(int)[] arr) { printf("const(int)[]\n"); }
    void call(const(void)[] arr) { printf("const(void)[]\n"); }

    void main()
    {
        call("abc"); // prints const(char)*
        call([1, 2, 3]); // prints const(int)[]
    }

> I'm not content with this change
> simply passing the existing test suite. It's a more subtle, substantive change
> than that.

What _would_ you be content with? Unless someone can come up with an actual problem, putting this on hold is simply a waste of time. If this does turn out to cause a regression, it can trivially be rolled back.
Comment #10 by bugzilla — 2014-01-12T11:35:38Z
Changing the way overloading works can have far reaching consequences, including issues like template matching, virtual functions, covariance, contravariance, and __traits(compiles). I am not at all comfortable with just throwing it in with the idea that it can be backed out. This proposal has not received much of any discussion. I also don't see memcmp usage as a compelling must-have use case.
Comment #11 by yebblies — 2014-01-12T19:27:10Z
(In reply to comment #10)
> Changing the way overloading works can have far reaching consequences,
> including issues like template matching, virtual functions, covariance,
> contravariance, and __traits(compiles).

Irrelevant, as I'm not changing the way overloading works. Listing parts of the compiler is not the same as pointing out an actual problem.

> I am not at all comfortable with just
> throwing it in with the idea that it can be backed out.

Your response consisted of "there could be problems" without specifying any actual problems. In the face of unspecified and potentially non-existent problems, putting the code in the compiler and waiting for feedback seems completely reasonable to me.

> This proposal has not
> received much of any discussion.

That's what we're doing now...

> I also don't see memcmp usage as a compelling must-have use case.

Given that A converts to B, and B converts to C, why doesn't A convert to C? memcmp is a symptom of this strange limitation.
Comment #12 by bugzilla — 2014-02-24T01:52:41Z
(In reply to comment #11)
> Irrelevant as I'm not changing the way overloading works. Listing parts of the
> compiler is not the same as pointing out an actual problem.

You are changing the way overloading works. Any time the implicit conversion rules are changed, and this is an implicit conversion rule change, the overloading changes, because overloading is all about implicit conversions.

> Your response consisted of "there could be problems" without specifying any
> actual problems. In the face of unspecified and potentially non-existent
> problems, putting the code in the compiler and waiting for feedback seems
> completely reasonable to me.
>
> > This proposal has not
> > received much of any discussion.
>
> That's what we're doing now...

It's pretty much just you and me, hardly representative.

> > I also don't see memcmp usage as a compelling must-have use case.
>
> Given that A converts to B, and B converts to C, why doesn't A convert to C?
> memcmp is a symptom of this strange limitation.

Changing overloading rules can have unexpected and far reaching consequences. It is not very knowable in advance. I have severe reservations about doing this just for memcmp(). Need a better reason.
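
One way to picture the concern about __traits(compiles) is code that probes for a conversion and branches on the answer. This sketch is an illustration only, with hypothetical names (`store`, `describe`); it does not come from the thread and does not show that any real code depends on the current behavior.

```d
import std.stdio : writeln;

// Hypothetical API with a single untyped-pointer entry point.
void store(const(void)* p) {}

// Generic code sometimes probes for a conversion and picks a branch from it.
// Which branch is taken here depends on whether the compiler lets a string
// literal convert to const(void)*.
string describe()
{
    static if (__traits(compiles, store("abc")))
        return "literal accepted as const(void)*";
    else
        return "literal rejected, fallback branch taken";
}

void main()
{
    writeln(describe());
}
```

Changing the conversion rule would silently switch such code from one branch to the other, without any compile error to flag it.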
Comment #13 by yebblies — 2014-02-24T03:04:19Z
(In reply to comment #12)
> > > I also don't see memcmp usage as a compelling must-have use case.
> >
> > Given that A converts to B, and B converts to C, why doesn't A convert to C?
> > memcmp is a symptom of this strange limitation.
>
> [snip] I have severe reservations about doing this
> just for memcmp(). Need a better reason.

I gave you a reason; in fact, you quoted it. A converts to B, and B converts to C, but A doesn't convert to C. Why shouldn't A convert to C?????

I'm not proposing a new special case, I'm trying to remove one that was introduced for reasons forgotten.

> Changing overloading rules can have unexpected and far reaching consequences.
> It is not very knowable in advance.

The same reasoning could be used to block every change to the compiler. Every non-trivial change could potentially affect something unintended. The only way forward is to do your best to identify problems, then implement it and wait for regression reports.
Comment #14 by andrei — 2014-03-08T21:22:18Z
I agree it's an exception that "str" converts to const(char)* but not subsequently to const(void)*. However, the conversion to char* is already a known concession for the sake of C string APIs. I don't think we need to go all the way into the rabbit hole. (Also the example is obscure.) @yebblies sorry I'll close this and the pull request. Feel free to reopen if you feel strongly about this.
Comment #15 by yebblies — 2014-03-08T23:46:08Z
(In reply to comment #14)
> I agree it's an exception that "str" converts to const(char)* but not
> subsequently to const(void)*. However, the conversion to char* is already a
> known concession for the sake of C string APIs. I don't think we need to go all
> the way into the rabbit hole. (Also the example is obscure.)

You seem to be saying it's not worth the effort to fix, if I understand correctly. We've already spent a lot more time arguing about it than I spent fixing it, so I'd really like to know why you think preventing the fix is worth all this effort.

> @yebblies sorry I'll close this and the pull request. Feel free to reopen if
> you feel strongly about this.

The usual arguments for rejecting an enhancement are that it breaks existing code, or it complicates the language. So far it seems this does neither; in fact it simplifies the language.

Just like Walter, you've failed to provide a single reason why this special case should exist.

I just can't accept that - it doesn't make any sense.

I appreciate you taking the time to look at this, but without any evidence that this is a bad change I think you are drawing the wrong conclusion.
Comment #16 by ibuclaw — 2014-03-09T04:37:16Z
The way I see it, relying on implicit conversion should be avoided where possible. And where implicit conversion is allowed, enforce that only one path can be taken. In this example, that means string -> const(char*), or string -> const(void*), but not both.

Assuming this is for DDMD, I'd suggest either grinning and bearing it (this kind of code will be cleaned up), or using strcmp, which IIRC takes a const(char*) as its parameters. If the operation is comparing a (void*) with a (char*), then explicitly cast the (void*) up.

As for Walter and Andrei's reasoning, I am not of that opinion, but I would suggest that you prove that this change doesn't break e.g.:

    int foo(in void*);
    int foo(in char*);

And if it does break overloading, provide good reasoning why this should be invalid.
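
A check along the lines suggested above could start from a sketch like the following. It is an adaptation, not the code from the comment: the parameters are spelled const(void)* and const(char)* instead of using `in`, and the return values 1 and 2 are arbitrary markers chosen for illustration.

```d
import std.stdio : writeln;

// Marker return values show which overload was selected.
int foo(const(void)* p) { return 1; }
int foo(const(char)* p) { return 2; }

void main()
{
    const(char)* typed = "abc";
    const(void)* untyped = typed;

    // A string literal selects the const(char)* overload.
    writeln(foo("abc"));

    // A const(char)* argument is an exact match for the second overload.
    writeln(foo(typed));

    // A const(void)* argument can only match the first overload.
    writeln(foo(untyped));
}
```

Running the same file before and after a change to the conversion rules would show whether the literal call keeps selecting the const(char)* overload.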
Comment #17 by andrei — 2014-03-09T09:48:56Z
(In reply to comment #15)
> (In reply to comment #14)
> > I agree it's an exception that "str" converts to const(char)* but not
> > subsequently to const(void)*. However, the conversion to char* is already a
> > known concession for the sake of C string APIs. I don't think we need to go all
> > the way into the rabbit hole. (Also the example is obscure.)
>
> You seem to be saying it's not worth the effort to fix, if I understand
> correctly. We've already spent a lot more time arguing about it than I spent
> fixing it, so I'd really like to know why you think preventing the fix is worth
> all this effort?

I'm saying we shouldn't have a compromise force others after it. Conversion to untyped pointers is bad and should be avoided. So I'm arguing against what I believe is a bad thing. Also the supporting examples are specious and non-idiomatic D.

> > @yebblies sorry I'll close this and the pull request. Feel free to reopen if
> > you feel strongly about this.
>
> The usual arguments for rejecting an enhancement are that it breaks existing
> code, or it complicates the language. So far it seems this does neither, in
> fact it simplifies the language.

It makes the language worse.

> Just like Walter, you've failed to provide a single reason why this special
> case should exist.
>
> I just can't accept that - it doesn't make any sense.
>
> I appreciate you taking the time to look at this, but without any evidence that
> this is a bad change I think you are drawing the wrong conclusion.

Untyped pointers are bad. We are providing conversion to immutable(char)* as a compromise. Please let's not make nice string literals implicitly convert all the way down to void*. I'll leave it to you to close this.
Comment #18 by andrei — 2014-03-09T10:03:44Z
A few more thoughts. Conversion of string literals to const(char)* is irregular and inconsistent. No other array literals convert that way, and variables, enums etc also don't convert that way. It's a kludge - albeit a clever one - that we accept as a convenience, given that we must occasionally interface with C strings and it's nice not to need to add the .ptr.

Now in invoking language consistency "post-kludge" we are in fact being consistent with the kludge more than with the language rules that it disobeys.
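
The point that other array literals have no such conversion can be illustrated with a compile-time probe. This is a minimal sketch with made-up enum names; it only covers string literals and int array literals, and does not test the claim about enums, on which comment #7 takes a different view.

```d
import std.stdio : writeln;

// A string literal converts implicitly to a pointer to its first character.
enum stringLiteralConverts = __traits(compiles, { const(char)* p = "abc"; });

// An int array literal has no matching implicit conversion to a pointer.
enum intArrayLiteralConverts = __traits(compiles, { const(int)* p = [1, 2, 3]; });

// With an explicit .ptr the array literal case is accepted.
enum intArrayPtrConverts = __traits(compiles, { const(int)* p = [1, 2, 3].ptr; });

void main()
{
    writeln("string literal -> const(char)*:   ", stringLiteralConverts);
    writeln("int array literal -> const(int)*: ", intArrayLiteralConverts);
    writeln("int array literal .ptr:           ", intArrayPtrConverts);
}
```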
Comment #19 by andrei — 2016-02-06T20:34:02Z
Related discussion: https://github.com/D-Programming-Language/dmd/pull/5411. I'll close this now.