Bug 9498 – Range violation using AA

Status
RESOLVED
Resolution
INVALID
Severity
regression
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2013-02-12T11:26:00Z
Last change time
2013-07-08T21:52:32Z
Assigned to
nobody
Creator
timvol

Comments

Comment #0 by timvol — 2013-02-12T11:26:48Z
The following worked in DMD 2.060 (and previous versions): string[string] myValues; ref string getValue(string v) { return myValues[v]; } void main() { getValue("myValue") = "myString"; } But in 2.061 it throws a range violation.
Comment #1 by maxim — 2013-02-12T11:49:32Z
This is neither a bug nor a regression. An element is accessed in return statement, which makes it an rvalue, so allocation if absence does not applies here. It works as clarified in TDPL. From TDPL (p. 115): "To read a value off an associative array given a key, just read aa [ key] . (The compiler distinguishes reads from writes and invokes slightly different functions.)" - it is implemented in druntime as aaGetValueX and aaGetRvalueX or something like that. Further: "If you try to read the value for a key not found in the associative array, a range violation exception is thrown."
Comment #2 by yebblies — 2013-06-29T20:32:45Z
Note the ref return from getValue, which makes 'myValues[v]' an lvalue. It could still be invalid, but not for the reasons stated.
Comment #3 by maxim — 2013-06-29T21:59:13Z
(In reply to comment #2) > Note the ref return from getValue, which makes 'myValues[v]' an lvalue. It > could still be invalid, but not for the reasons stated. This is irrelevant. The fact that function returns by ref indicates that value should not be copied but returned by ref. If "myValue" was present that would mean that pointer to the string is returned. Irrespective of how value is returned, myValues[v] is invalid value here.
Comment #4 by yebblies — 2013-06-29T22:15:17Z
(In reply to comment #3) > (In reply to comment #2) > > Note the ref return from getValue, which makes 'myValues[v]' an lvalue. It > > could still be invalid, but not for the reasons stated. > > This is irrelevant. The fact that function returns by ref indicates that value > should not be copied but returned by ref. If "myValue" was present that would > mean that pointer to the string is returned. Irrespective of how value is > returned, myValues[v] is invalid value here. This code works because indexing an AA with a key that does not exist is valid in an lvalue context: int[string] aa; aa["asd"] += 2; The following code also indexes an AA with a missing key in an lvalue context: int[string] aa; ref int get() { return aa["asd"]; } get() += 2; The assumption here is that indexing an AA with a non-existing key in a context where an lvalue is expected does not result in a range violation. Do you know of any reason why this is an incorrect assumption?
Comment #5 by maxim — 2013-06-29T22:34:57Z
(In reply to comment #4) > This code works because indexing an AA with a key that does not exist is valid > in an lvalue context: > > int[string] aa; > aa["asd"] += 2; > > The following code also indexes an AA with a missing key in an lvalue context: > > int[string] aa; > ref int get() { return aa["asd"]; } > get() += 2; > > The assumption here is that indexing an AA with a non-existing key in a context > where an lvalue is expected does not result in a range violation. > > Do you know of any reason why this is an incorrect assumption? Actually D does not have notion of lvalue context - TDPL says about read and write operations. According to it, returning from function is a read operation. And function refness does not guarantee that original lvalue will be modified. The use case could be: int a = get(); which means refness is wiped out and operation is effectively a read operation. Situation could be follows: ref int foo() { return aa["asd"]; } int bar() { return aa["asd"]; } not only this is confusing and inconsistent, but it defeats the purpose of having aaGetRvalueX - anyone who want allocation would be just putting ref to function declaration to make things works.
Comment #6 by yebblies — 2013-06-29T22:59:48Z
(In reply to comment #5) > > Actually D does not have notion of lvalue context - TDPL says about read and > write operations. According to it, returning from function is a read operation. > TDPL does not completely define the language. Unfortunately neither does the spec, it is defined through a combination of TDPL, the spec, and DMD. When these disagree none are a definitive source. The compiler most certainly has a notion of lvalue context for index expressions. > And function refness does not guarantee that original lvalue will be modified. > The use case could be: > > int a = get(); > > which means refness is wiped out and operation is effectively a read operation. > Situation could be follows: > > ref int foo() { return aa["asd"]; } > int bar() { return aa["asd"]; } > > not only this is confusing and inconsistent, but it defeats the purpose of > having aaGetRvalueX - anyone who want allocation would be just putting ref to > function declaration to make things works. I'm not sure how any of this is relevant.
Comment #7 by maxim — 2013-06-29T23:43:51Z
(In reply to comment #6) > > And function refness does not guarantee that original lvalue will be modified. > > The use case could be: > > > > int a = get(); > > > > which means refness is wiped out and operation is effectively a read operation. > > Situation could be follows: > > > > ref int foo() { return aa["asd"]; } > > int bar() { return aa["asd"]; } > > > > not only this is confusing and inconsistent, but it defeats the purpose of > > having aaGetRvalueX - anyone who want allocation would be just putting ref to > > function declaration to make things works. > > I'm not sure how any of this is relevant. Than you should read again if you don't see the relevance. Function returning AA value as discussed here is a case which should throw according to TDPL (spec is silent unfortunately) and current D implementation in case of non-ref functions. Presence of ref attribute does not matter because function performs read operation in the first place. How value is returned (by ref or not) is of secondary importance.
Comment #8 by yebblies — 2013-06-29T23:49:40Z
(In reply to comment #7) > Than you should read again if you don't see the relevance. Function returning > AA value as discussed here is a case which should throw according to TDPL (spec > is silent unfortunately) and current D implementation in case of non-ref > functions. Presence of ref attribute does not matter because function performs > read operation in the first place. How value is returned (by ref or not) is of > secondary importance. TDPL is not an authoritative specification, so the fact this is in line with the way TDPL presents AAs does not conclusively prove this is correct behavior.
Comment #9 by k.hara.pg — 2013-07-08T21:52:32Z
The old behavior was wrong-code bug, and fixed in 2.061. Introduced commit/codeline: https://github.com/D-Programming-Language/dmd/commit/9553d0c66337b477375cb77743defcdd79b8064d#L4L3995 AA indexing + assign syntax is a special form in D. If the acquired payload from AA is _immediately_ set by assignment, it is treated as an insertion for non-existing key and won't throw RangeError. Otherwise, AA indexing should be always treated as a read of existing key - even if it is on Lvalue-context. I'd like to show more specific example. struct S { int value = 42; } S[string] myValues; ref S getValue(string v) { return myValues[v]; } void main() { import std.stdio; writeln(getValue("myValue").value); // what will be displayed? } In 2.060 and earlier, myValues[v] had returned a reference to *uninitialized* memory. From 2.061, it correctly throws RangeError. That's @safe behavior.