Comment #0 by bearophile_hugs — 2010-10-06T14:43:49Z
Using global values (or values from outer scopes; D has nested functions, so names may also come from an enclosing function) is generally not good practice. But passing everything as arguments increases stack usage and may slow the code a little where high performance is very important.
So in some situations the programmer may need to use global/outer names. But letting functions freely access the global scope, as in C, may lead to bugs, because there is no control over the flow of information between the subsystems of the program, and also because accidental masking of an outer name is allowed:
int x = 100;
int foo(int y) {
    int x = 5;
    return x + y; // silently uses local x
}
void main() {
    assert(foo(10) == 15);
}
For this (and for other purposes) D has introduced the 'pure' attribute for functions, which disallows access to mutable outer state. But 'pure' is a blunt tool, and in some situations it can't be used. To avoid bugs caused by unwanted usage of outer state in such situations, a new attribute could be defined, perhaps named "@outer".
The purpose of the (optional) @outer attribute is similar to the 'global' annotation in the SPARK language:
--# global in out CallCount;
A D function that is annotated with @outer must list all the global variables it uses, and state whether each of them is only read (in), only written (out), or both (inout).
An example of its possible syntax:
int x = 100;
int y = 200;
@outer(in x, inout y)
int foo(int z) {
    y = x + z;
    return y;
}
Here the compiler enforces that foo() uses only the outer variables x and y, that x is only read, and that y is both read and written inside foo(). This tidies up the flow of information.
The @outer attribute is optional, and you can skip it in small script-like D programs. But in situations where the D code must be very reliable, a simple automatic code review tool may require @outer on all functions/methods.
The @outer(...) annotation needs to be shown in the output of both the -D (documentation) and -X (JSON) dmd compilation switches.
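As a rough illustration only, the documentation half of this idea can be approximated in today's D with a user-defined attribute. The sketch below assumes a hypothetical struct named Outer that just carries the declared names as strings; the compiler does not enforce anything here, which is exactly the part the proposal asks for.

```d
import std.meta : AliasSeq;

// Hypothetical stand-in for the proposed attribute: it only records the
// declared flow of information, it does not enforce it.
struct Outer {
    string[] spec;
}

int x = 100;
int y = 200;

@Outer(["in x", "inout y"])
int foo(int z) {
    y = x + z; // reads x, writes y, as declared above
    return y;
}

void main() {
    // A tool (or a static assert) can read the declaration back.
    alias attrs = AliasSeq!(__traits(getAttributes, foo));
    static assert(attrs.length == 1);
    static assert(attrs[0].spec == ["in x", "inout y"]);

    assert(foo(5) == 105);
    assert(y == 105);
}
```

A review tool could inspect such attributes, but checking that the function body really respects them would still need compiler support or a separate analysis.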
Comment #1 by bus_dbugzilla — 2010-10-06T18:34:38Z
I like the general idea, but why specify the globals you're going to use? Why not something like this:
--------------------
module foo;
int globalVar;
class Foo
{
    int instanceVar;
    static int classVar;
    @explicitLookup // Name subject to change
    void bar()
    {
        int globalVar; // Error
        int instanceVar; // Error
        int classVar; // Error
        globalVar = 1; // Error
        instanceVar = 1; // Error
        classVar = 1; // Error
        .globalVar = 1; // Ok
        this.instanceVar = 1; // Ok
        Foo.classVar = 1; // Ok
    }
}
--------------------
And, of course, let it also be used like this:
--------------------
module foo;
@explicitLookup: // Applies to all code below
int globalVar;
class Foo
{
    int instanceVar;
    static int classVar;
    void bar()
    {
        globalVar = 1; // Error
        instanceVar = 1; // Error
        classVar = 1; // Error
        .globalVar = 1; // Ok
        this.instanceVar = 1; // Ok
        Foo.classVar = 1; // Ok
    }
}
--------------------
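For reference, the three qualified forms marked "Ok" above are already valid in current D; only the enforcement (rejecting the unqualified forms) would be new. A minimal sketch, reusing the names from the example and dropping the hypothetical @explicitLookup attribute:

```d
int globalVar;

class Foo {
    int instanceVar;
    static int classVar;

    void bar() {
        int globalVar = 7;     // a local that shadows the module-level name
        .globalVar = 1;        // leading dot: look up at module scope
        this.instanceVar = 2;  // explicit instance member
        Foo.classVar = 3;      // explicit static member
        assert(globalVar == 7); // the local is untouched
    }
}

void main() {
    auto f = new Foo;
    f.bar();
    assert(.globalVar == 1);
    assert(f.instanceVar == 2);
    assert(Foo.classVar == 3);
}
```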
Comment #2 by bearophile_hugs — 2010-10-06T19:24:03Z
(In reply to comment #1)
> but why specify the globals you're going to use?
It's like Contract Programming, where your contracts state the conditions on the function inputs and outputs (and once "old" is available, contracts will also be able to specify some of the changes at a high level).
@outer is like a contract that specifies the allowed flow of information into and out of a function. Reducing unwanted and unforeseen flows of information between subsystems is a very good way to reduce the complexity of the whole design.
So @outer() is similar to a second signature for the function. Besides the normal signature, which states the types and in/out/ref nature of the explicit function arguments, @outer() lets you specify the names and in/out/inout nature of the implicit names (those from outer scopes) used by the function.
> this.instanceVar = 1; // Ok
Many programmers don't like this (even though it's the way Python code is written).
> Foo.classVar = 1; // Ok
The need to prefix static members with the class/struct name is something I'd like to be enforced by default.
Comment #3 by bearophile_hugs — 2011-06-24T17:16:06Z
Comment #5 by bearophile_hugs — 2014-02-19T15:07:18Z
It seems this idea of mine isn't so crazy. This is from the SPARK 2014 sublanguage:
http://people.cs.kuleuven.be/~dirk.craeynest/ada-belgium/events/14/140201-fosdem/03-ada-spark.pdf
Clarify access to global variables:
with Global => null; -- No reference to global items
with Global => V; -- V is an input of the subprogram
with Global => (X, Y, Z); -- X, Y and Z are inputs of the subprogram
with Global => (Input => V); -- V is an input of the subprogram.
with Global => (Input => (X, Y, Z)); -- X, Y and Z are inputs of the subprogram
with Global => (Output => (A, B, C)); -- A, B and C are outputs of the subprogram
with Global => (In_Out => (D, E, F)); -- D, E and F are both inputs and outputs of
-- the subprogram
with Global => (Proof_In => (G, H)); -- G and H are only used in assertion
-- expressions within the subprogram
with Global => (Input => (X, Y, Z),
Output => (A, B, C),
In_Out => (P, Q, R),
Proof_In => (T, U));
-- A global aspect with all types of global specification
Clarify information flow:
procedure P (X, Y, Z : in Integer; A, B, C : in out Integer; D, E : out Integer)
with Depends => ((A, B) =>+ (A, X, Y),
C =>+ null,
D => Z,
E => null);
-- The "+" sign attached to the arrow indicates self-dependency
-- The exit value of A depends on the entry value of A as well as the entry
-- values of X and Y.
-- Similarly, the exit value of B depends on the entry value of B as well as
-- the entry values of A, X and Y.
-- The exit value of C depends only on the entry value of C.
-- The exit value of D depends on the entry value of Z.
-- The exit value of E does not depend on any input value.
Comments:
- SPARK is even stricter, but this is expected, because it has to prove the code formally.
- For most uses, an @outer() as explained above, where the listed outer-scope variables are "in" (not mutable) by default, is probably enough, and it's easy to remember and use.
- Over the years I have had many bugs caused by unwanted interactions with global variables.
- The usage of @outer() is optional. There are many kinds of D code: small script-like programs, GUIs, videogames, heavy numeric software, large network applications, and so on. Some of these kinds of code don't need many contracts and don't need @outer. But in larger programs, or in programs where you want partial integrity (the kind of code for which you could also use Ada), it's useful.
Comment #6 by bearophile_hugs — 2014-03-09T05:41:50Z
The purpose of the "with Global" and "with Depends" annotations of SPARK 2014 is to help mathematically prove that a function is correct, while the lighter and optional @outer() annotation I have suggested for D is useful for writing unit tests and reasoning informally about code. If you have a function that reads and/or writes data from global (or outer) scope, it's not easy to set up the global state so that unit tests are reproducible, so writing unit tests can be hard. If such D functions are annotated with @outer(), writing unit tests becomes faster, safer and simpler.
Example: if you need to translate some BCPL code like the following to D, full of global (untyped) variables, and you want (or need) to add unit tests to make sure the translation is correct, you very quickly learn to appreciate an annotation like @outer() that enforces what each function accesses from the global scope. Later, when the D code works, you can refactor it, moving most of those global variables into structs, putting them inside functions, passing them as arguments, etc. But it's not wise to make such changes as a first step, because this could easily break the code:
GLOBAL {
xupb : 200
yupb : 201
spacev : 202
spacet : 203
spacep : 204
boardv : 205
knownv : 206
xdatav : 207
ydatav : 208
xfreedomv: 209
yfreedomv: 210
change : 211
tracing : 212
rowbits : 213
known : 214
orsets : 215
andsets : 216
count : 217
debug : 218
}
AND blobs(v, upb) = VALOF
{ LET res = 0
FOR i = 0 TO upb DO
{ LET p = v!i
UNTIL !p=0 DO { res := res+!p; p := p+1 }
}
RESULTIS res
}
AND freedom(p, upb) = VALOF
{ IF !p=0 RESULTIS 0
upb := upb - !p
{ p := p+1
IF !p=0 RESULTIS upb+1
upb := upb - !p - 1
} REPEAT
}
AND allsolutions() BE
{ UNLESS solve() RETURN // no solutions can be found from here
{ LET b = VEC 31
LET k = VEC 31
LET pos, bit = 0, 0
// save current state
FOR i = 0 TO 31 DO b!i, k!i := boardv!i, knownv!i
FOR i = 0 TO yupb DO
{ LET bits = NOT knownv!i
UNLESS bits=0 DO
{ pos, bit := i, bits & -bits
BREAK
}
}
If you read D code written by another person that uses a few global variables, you will be glad if the code is annotated with @outer(), because the annotation makes the code easier and faster to understand and modify.
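The unit-testing difficulty described in this comment can be sketched in current D as follows. The globals callCount and scale and the function scaled are hypothetical names chosen for illustration. The test has to know, without any help from the signature, which globals the function touches, then save, set and restore them by hand; an @outer(in scale, inout callCount) annotation would state that list explicitly.

```d
int callCount;  // hypothetical global, read and written by scaled()
int scale = 2;  // hypothetical global, only read by scaled()

int scaled(int v) {
    ++callCount;
    return v * scale;
}

// Reproducible check: put the globals in a known state, restore them on exit.
void checkScaled() {
    const savedCount = callCount;
    const savedScale = scale;
    scope (exit) {
        callCount = savedCount;
        scale = savedScale;
    }
    callCount = 0;
    scale = 3;
    assert(scaled(4) == 12);
    assert(callCount == 1);
}

unittest {
    checkScaled();
}

void main() {
    checkScaled();
    // The global state is back to what it was before the check.
    assert(callCount == 0 && scale == 2);
}
```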
Comment #7 by bearophile_hugs — 2014-06-08T21:05:25Z
If you have to refactor and clean up some old C/D code, you can use a tool (like an IDE) that tags every function with the appropriate @outer(), specifying what every function reads, writes, or both reads and writes from outer scopes. With this information it's much simpler to understand what every function does, to pass some of those globals as function arguments, to move some globals inside functions, etc. For performance-critical functions you sometimes don't want to pass all the data a function uses, but in most cases you can remove globals, pass values down through arguments, make them constant, turn them into class/struct instance members, etc.
So @outer() is a tool to increase code readability, to help refactor code, to make code safer, and to keep some globals for efficiency in a safer way. Not all code is a fit for @outer(): you probably don't want to use it for small script-like D programs or in other situations. But for some situations, like when you need higher-integrity code or need to refactor legacy code, it seems a useful improvement for D. And it's a pure addition: it breaks no existing D code.
Optionally, some kind of annotation or switch could be used to require all functions and nested functions of a module or package to have an @outer annotation.
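One of the refactoring end states mentioned above, turning globals into struct instance members, might look like this in current D. Counters and record are hypothetical names used only for illustration; this is a sketch of the target of the refactoring, not of @outer itself.

```d
// Former globals gathered into one struct (hypothetical names).
struct Counters {
    int hits;
    int misses;
}

// The state is now an explicit parameter, so the function can be pure.
void record(ref Counters c, bool hit) pure nothrow @nogc @safe {
    if (hit)
        ++c.hits;
    else
        ++c.misses;
}

void main() {
    Counters c;
    record(c, true);
    record(c, true);
    record(c, false);
    assert(c.hits == 2 && c.misses == 1);
}
```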
Comment #8 by bearophile_hugs — 2014-09-27T23:28:04Z
Walter Bright has commented on @outer():
> I suggest using 'pure' and passing the globals you actually
> need as ref parameters.
This is what I usually do in D, but it has some small disadvantages:
- It increases the number of function arguments (they are 8 bytes
each), increasing the size of the function, adding some stack
management instructions. This slows the function a little, if the
function is performance-critical.
- Sometimes you can't use "pure", for various reasons, like
Phobos functions not yet pure, I/O action in your function, or
other causes.
- If your pure function foo receives two globals as out and ref
arguments, the function bar that calls it needs access to those
global variables, or it too needs the two ref/out arguments. The
current design of @outer() is not transitive, so only foo needs
to state what global variables are in/out/inout.
- @outer is more DRY, because you don't need to specify the type
of the global variable received by ref, you just need to know its
name.
- With @outer you can tighten some old code without changing the
signature of a function. If you have an old D module (or a C
function converted to D) you often can't (or don't want to)
change the function signature to add the global arguments passed
by ref. With @outer() the function signature doesn't change, so
you can improve your legacy code. It allows a simpler refactoring
of code.
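For comparison, the 'pure' plus ref-parameter approach suggested by Walter can be sketched like this, reusing x, y and foo from the first comment. The parameter names xIn and yInOut and the caller bar are illustrative assumptions.

```d
int x = 100;
int y = 200;

// The globals are passed explicitly, so foo can be pure; the price is a
// longer signature that repeats the types of the globals.
int foo(in int xIn, ref int yInOut, int z) pure nothrow @nogc @safe {
    yInOut = xIn + z;
    return yInOut;
}

// The caller now needs access to the globals itself (as here), or it must
// take them as parameters too and push the problem one level up.
int bar(int z) {
    return foo(x, y, z);
}

void main() {
    assert(bar(5) == 105);
    assert(y == 105);
}
```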
More notes:
- SPARK language has added a feature similar to @outer, but more
verbose. See comment 5.
- @outer() is optional, and it's fiddly because it's not meant
for small script-like D programs; it's meant as a help for
medium-integrity D programs (where you might think about using
the Ada language instead).
Comment #9 by robert.schadek — 2024-12-13T17:53:47Z