Bug 15086 – import doesn't verify module declaration

Status
REOPENED
Severity
critical
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2015-09-18T19:18:11Z
Last change time
2024-12-13T18:44:49Z
Keywords
accepts-invalid
Assigned to
No Owner
Creator
ag0aep6g
See also
https://issues.dlang.org/show_bug.cgi?id=18517
Moved to GitHub: dmd#19045 →

Comments

Comment #0 by ag0aep6g — 2015-09-18T19:18:11Z
foo.d: ---- import bar; pragma(msg, bar.name); ---- bar.d: ---- module baz; /* not bar */ enum name = __MODULE__; ---- `dmd -c foo.d` prints "baz". It should fail with "Error: module baz from file bar.d must be imported with 'import baz;'" as happens with `dmd -c foo.d bar.d`.
Comment #1 by dlang-bugzilla — 2015-10-15T17:10:28Z
I think this is a dupe of some ancient bug... IIRC, the gist of the situation is that it would preclude object_.d from compiling. Oh, but object_.d is now called object.d, so this might be relevant again.
Comment #2 by timothee.cour2 — 2018-01-26T06:24:48Z
just ran into this again, this is a serious hole in the D module system. Here was the example i was gonna report: ```dmd -I. main.d main bar // BUG! was imported as foo.bar (inconsistent) foo.bar2 asdf.wrong // BUG! was imported as foo.bar (inconsistent) ``` ```pragma(msg, __MODULE__); import foo.bar; import foo.bar2; import foo.bar3; void main(){} cat foo/bar.d pragma(msg, __MODULE__); cat foo/bar2.d cat foo/bar2.d module foo.bar2; pragma(msg, __MODULE__); cat foo/bar3.d module asdf.wrong; pragma(msg, __MODULE__); ``` LINKS: https://github.com/dlang/dmd/pull/7778 Bug fix: dmd allows importing modules with wrong name #7778
Comment #3 by andrei — 2018-02-12T04:11:35Z
(In reply to Timothee Cour from comment #2) > just ran into this again, this is a serious hole in the D module system. > > Here was the example i was gonna report: > > > ```dmd -I. main.d > main > bar // BUG! was imported as foo.bar (inconsistent) > foo.bar2 > asdf.wrong // BUG! was imported as foo.bar (inconsistent) > ``` > > ```pragma(msg, __MODULE__); > import foo.bar; > import foo.bar2; > import foo.bar3; > void main(){} > > cat foo/bar.d > pragma(msg, __MODULE__); > cat foo/bar2.d > > cat foo/bar2.d > module foo.bar2; > pragma(msg, __MODULE__); > > cat foo/bar3.d > module asdf.wrong; > pragma(msg, __MODULE__); > ``` > > > > LINKS: > https://github.com/dlang/dmd/pull/7778 Bug fix: dmd allows importing modules > with wrong name #7778 Can you please reorganize the examples? It is difficult to understand which files have which content. Thanks!
Comment #4 by bugzilla — 2018-02-12T19:42:11Z
Why is this "critical"? I'm not even convinced it's a bug at all. The language does, after all, allow module names to be different from file names.
Comment #5 by issues.dlang — 2018-02-12T20:07:46Z
(In reply to Walter Bright from comment #4) > Why is this "critical"? I'm not even convinced it's a bug at all. The > language does, after all, allow module names to be different from file names. Well, it's allowing you to import a module with the wrong name. So, I don't see how it can really be argued that it isn't a bug, but I agree that it isn't particularly critical. Honestly, I think that it was a mistake to allow the module declaration to not match the file name, but that can be argued, and we're stuck with it at this point, since it's doubtful that changing it would be worth the breakage. Either way, being allowed to import the same module with two different names doesn't seem like it improves things, and the module declaration is supposed to be giving the module its name, so as long as it's allowed to not match the file name, I would fully expect importing using the file name to fail if it doesn't match the module declaration.
Comment #6 by johnnymarler — 2018-02-12T23:09:55Z
Note that there are 3 names to consider for every import. 1. The import name 2. The file name 3. The module name This issue addresses the case where the import name matches the filename, but does not match the module name. > Note that it does not cover the case where the import matches the module name but does not match the filename (a case I think is useful). The problem is that this is an error when all the modules are compiled together, but the error disappears if you compile the modules seperately. This puts a coupling between the module system and how the source files are compiled, and adds support for an odd use case that has no utility.
Comment #7 by johnnymarler — 2018-02-13T18:25:06Z
Andrei and Waltered have rules this as not a bug: https://github.com/dlang/dmd/pull/7878#issuecomment-365354972 > The case for this PR rests on the notion there should be some invariance of build results with changing build conditions. E.g. "Either both cases should fail or both should pass" (and let me add: "... and create the same binary"). That actually doesn't seem to be a strong argument...
Comment #8 by andrei — 2018-02-13T18:32:58Z
I think at the least we can do a vastly better job at issuing error messages. Per the discussion in : === This error occurs becase the module foo/bar.d is being loaded twice with 2 different names. The first time it is loaded because it is given on the command line, and since it has no module declaration, the compiler gives it the name bar. But then the module gets imported with the name foo.bar, however, the compiler cannot change the name of the module and all existing reference to it so an error must be asserted. [...] [this bug is allowing] weird abuses which in my opinion have no utility and only serve to add confusion to the module system. === I do agree matters can be confusing especially during project organization and reorganization. I've done my share of head scratching. I think we'd go a long way with better error messages (e.g. explaining the matter with two access paths above), and better online documentation for the module directive.
Comment #9 by andrei — 2018-02-13T18:33:32Z
Meant to write: Per the discussion in https://github.com/dlang/dmd/pull/7778
Comment #10 by johnnymarler — 2018-02-13T18:43:53Z
> I think at the least we can do a vastly better job at issuing error messages. I'm limiting focus to this issue alone. Do you have an idea on a way to improve the error message for this case? CURRENT BEHAVIOR ---------------------------- dmd -c foo.d bar.d Error: module baz from file bar.d must be imported with 'import baz;' PROPOSED BEHAVIOR ---------------------------- dmd -c foo.d bar.d Error: module baz from file bar.d must be imported with 'import baz;' or must be compiled separately
Comment #11 by andrei — 2018-02-13T18:59:03Z
(In reply to Jonathan Marler from comment #10) > > I think at the least we can do a vastly better job at issuing error messages. > > I'm limiting focus to this issue alone. Do you have an idea on a way to > improve the error message for this case? > > CURRENT BEHAVIOR > ---------------------------- > dmd -c foo.d bar.d > Error: module baz from file bar.d must be imported with 'import baz;' > > PROPOSED BEHAVIOR > ---------------------------- > dmd -c foo.d bar.d > Error: module baz from file bar.d must be imported with 'import baz;' or > must be compiled separately To make clear what the use case is, I'll paste the files again: foo.d: ---- import bar; ---- bar.d: ---- module baz; /* not bar */ ---- In this case, attempting to compile them together would be mistaken. I suggest: Error: file bar.d masquerades as module `baz`, so it must be compiled separately or imported with `import baz;` This way we clarify where the discrepancy is and even mildly disapprove of it. It's also clear that if no mismatch, this problem would disappear.
Comment #12 by andrei — 2018-02-13T19:01:02Z
Also enclosing the file name in double quotes might be nice.
Comment #13 by timothee.cour2 — 2018-02-13T19:55:37Z
I don't see what's controversial about the fact that this is broken and breaks the module system. Hopefully this other example will convince you, where all I do is change the order of imports and goes from CT error to compiling. ./main.d pragma(msg, __FILE__," ",__MODULE__); import foo.bar; import foo.bar2; version(A){ import foo.bar4; import foo.bar3; } else version(B){ import foo.bar3; import foo.bar4; } void main(){} ./foo/bar2.d module foo.bar2; pragma(msg, __FILE__," ",__MODULE__); ./foo/bar4.d module foo.bar4; pragma(msg, __FILE__," ",__MODULE__); import asdf.wrong; ./foo/bar.d pragma(msg, __FILE__," ",__MODULE__); ./foo/bar3.d module asdf.wrong; pragma(msg, __FILE__," ",__MODULE__); ./asdf/wrong.d pragma(msg, __FILE__," ",__MODULE__); ``` dmd -version=A -run main.d main.d(7): Error: module `asdf.wrong` from file foo/bar3.d conflicts with another module wrong from file asdf/wrong.d dmd -version=B -run main.d main.d main foo/bar.d bar foo/bar2.d foo.bar2 foo/bar3.d asdf.wrong foo/bar4.d foo.bar4 ``` This causes many other weirdnesses: * seperate compilation vs all-at-once compilation is broken * this makes `./foo/bar.d` and `foo/bar.d` appear as different paths: ``` dmd foo/bar.d -run main.d main.d(2): Error: module `bar` from file foo/bar.d must be imported with 'import bar;' dmd ./foo/bar.d -run main.d main.d(2): Error: module `bar` from file foo/bar.d conflicts with another module bar from file ./foo/bar.d ```
Comment #14 by ag0aep6g — 2018-02-13T20:10:14Z
Reopening. INVALID is not an acceptable resolution for this. Per the spec, a module must be imported by its module name [1]. And the module name is set by the module declaration, if there is one [2]. Importing a module by its file name is not in the spec. Of course, I may be missing how the spec allows DMD's current behavior. If so, someone please point it out. If I'm not missing anything in the spec, this issue (like all compiler bugs) can be resolved by changing the compiler to match the spec, or by changing the spec to match the compiler. It could also be closed as WONTFIX, I guess. But that would be disappointing. DMD and the spec should agree on the rules. [1] https://dlang.org/spec/module.html#import-declaration [2] https://dlang.org/spec/module.html#module_declaration
Comment #15 by timothee.cour2 — 2018-02-13T20:17:01Z
the example I just posted breaks this spec: > The order in which ImportDeclarations occur has no significance.
Comment #16 by andrei — 2018-02-13T20:38:02Z
The example in which the order of imports makes or breaks the build is compelling. Even better would be a bug whereby the project builds both ways but does different things. That would be the smoking gun. OK to leave this open. At the minimum we must fix the order dependency and make the spec clearer. Thanks!
Comment #17 by johnnymarler — 2018-02-13T20:58:50Z
I was able to reproduce Timothee's example. The error only occurs if the modules are specified in a particular order. I also tested this using PR (https://github.com/dlang/dmd/pull/7878#issuecomment-365364824) and confirmed that it will assert an error in both cases regardless of the order the imports appear in.
Comment #18 by timothee.cour2 — 2018-02-13T21:13:09Z
> Even better would be a bug whereby the project builds both ways but does different things. That would be the smoking gun. could shorten but this shows it: ./asdf/wrong.d pragma(msg, __FILE__," ",__MODULE__); extern(C) int baz(){ return 42; } ./main.d static if(__traits(compiles, {import foo.bar6;})) { import foo.bar6; } extern(C) int baz(); void main(){ import std.stdio; writeln("baz:", baz); } ./foo/bar6.d module foo.bar6; version(A){ import foo.bar4; import foo.bar3; } else version(B){ import foo.bar3; import foo.bar4; } ./foo/bar2.d module foo.bar2; pragma(msg, __FILE__," ",__MODULE__); ./foo/bar5.d module foo.bar5; pragma(msg, __FILE__," ",__MODULE__); import asdf.wrong; ./foo/bar4.d module foo.bar4; pragma(msg, __FILE__," ",__MODULE__); import asdf.wrong; ./foo/bar.d pragma(msg, __FILE__," ",__MODULE__); ./foo/bar3.d module asdf.wrong; pragma(msg, __FILE__," ",__MODULE__); extern(C) int baz(){ return 41; } dmd -i -version=A -run main.d foo/bar4.d foo.bar4 asdf/wrong.d wrong baz:42 dmd -i -version=B -run main.d foo/bar3.d asdf.wrong foo/bar4.d foo.bar4 baz:41
Comment #19 by ag0aep6g — 2018-02-13T21:17:18Z
(In reply to Andrei Alexandrescu from comment #16) > The example in which the order of imports makes or breaks the build is > compelling. Even better would be a bug whereby the project builds both ways > but does different things. That would be the smoking gun. My take on it: foo.d: ---- version (A) { static import baz; static import bar; } version (B) { static import bar; static import baz; } void main() { import std.stdio; writeln(bar.thing); writeln(baz.thing); } ---- bar.d: ---- module baz; enum thing = "this is baz from bar.d"; ---- baz.d: ---- module bar; enum thing = "this is bar from baz.d"; ---- `dmd foo.d -version=A && ./foo`: ---- this is bar from baz.d this is bar from baz.d ---- `dmd foo.d -version=B && ./foo`: ---- this is baz from bar.d this is baz from bar.d ----
Comment #20 by johnnymarler — 2018-02-13T21:43:16Z
Confirmed ag0aep6g's example. Also confirmed the proposed PR treats the example as an error. version=A foo.d(3): Deprecation: module `bar` from file baz.d must be imported with 'import bar;' version=B foo.d(8): Deprecation: module `baz` from file bar.d must be imported with 'import baz;'
Comment #21 by bugzilla — 2018-02-14T01:04:49Z
(In reply to ag0aep6g from comment #19) Thank you for providing a simple and clear test case to illustrate the issue. It is indeed a problem that the order of imports is mattering, as this breaks a fundamental tenet of D's import system. I have not thought about this thoroughly, but I suspect the core of a fix can be along the lines of detecting that an explicit import of bar.d is done using two different names in the same module. This is a lot more focused in effect than the PR.
Comment #22 by timothee.cour2 — 2018-02-14T02:46:05Z
I don't understand why the current behavior is desired. Is there any single use case that can't be achieved without existing option such as -mv? -mv=<package.module>=<filespec> use <filespec> as source file for <package.module>
Comment #23 by johnnymarler — 2018-02-14T03:52:49Z
> I have not thought about this thoroughly, but I suspect the core of a fix can be along the lines of detecting that an explicit import of bar.d is done using two different names in the same module. I think you'll find that this is "easier said than done". The name that was used to import a module is no longer available once the import is processed. The compiler just adds the module to the proper location in thy symbol table and moves on. Essentially, you need a way to reverse a symbol lookup. You need to be able to take a Module object and lookup the name that was used to import it. This will require either a new list field for all imports, or a new list field for all "importees", or you will need to walk the entire symbol table tree every time you want to check if there is a conflict. The place in the code where you would perform this check is in `dimport.d` in the load function. Near the top of the funciton there is a branch `if (s.isModule()), and you can insert the check right there. However, I don't know why we would go through all the trouble to support this. You'd be adding a fair amount of runtime overhead for every import statement or more memory to cache the reverse lookups. Not to mention the extra complexity this is going to introduce. And the use case where this actually works is so narrow, it only works if the module are not compiled in the same compiler invocation...
Comment #24 by johnnymarler — 2018-02-14T05:18:22Z
> I have not thought about this thoroughly, but I suspect the core of a fix can be along the lines of detecting that an explicit import of bar.d is done using two different names in the same module. After having a bit of a think I realized this still doesn't solve the problem. I can change the example to show that module order also changes compilation...observe: --- main.d import ibar, ibaz; void main() { ibarfunc(); ibazfunc(); } --- ibar.d import bar, std.stdio; void ibarfunc() { writeln(thing); } --- ibaz.d import baz, std.stdio; void ibazfunc() { writeln(thing); } --- bar.d module baz; enum thing = "this is baz from bar.d"; --- baz.d module bar; enum thing = "this is bar from baz.d"; dmd ibar.d ibaz.d -run main.d OUTPUT: this is baz from bar.d this is baz from bar.d dmd ibaz.d ibar.d -run main.d OUTPUT: this is bar from baz.d this is bar from baz.d Note that in the case the same module is not imported with different names in the same file. So fixing that won't fix this issue.
Comment #25 by ag0aep6g — 2018-02-14T13:30:56Z
(In reply to Walter Bright from comment #21) > I suspect the core of a fix > can be along the lines of detecting that an explicit import of bar.d is done > using two different names in the same module. That would fix a problem, but not this Bugzilla issue. If you want to attack the issue at hand in the spec, then I suggest we get to that before fixing related-but-not-identical cases. An attempt at amending the spec is likely to reveal more interesting cases. Also, so far we focused on accepts-invalid cases where we'd like DMD to throw an error instead, but the lack of verification can also lead to wrong code when the wrong module is accepted before the right one is attempted. Example: pkg/hello_world.d: ---- module pkg.hello_world; import std.stdio; /* Should import Phobos's std.stdio. */ void main() { writeln("Hello, world!"); /* Should print "Hello, world!". */ } ---- pkg/std/stdio.d: ---- module pkg.std.stdio; /* NOTE: Not std.stdio. */ void writeln()(string s) /* Template makes for a shorter example. */ { import core.stdc.stdio; printf("Goodbye, cruel world!\n"); } ---- `cd pkg; dmd hello_world.d && ./hello_world; cd ..`: ---- Goodbye, cruel world! ---- Note that I have no custom module std.stdio. DMD incorrectly picks up pkg.std.stdio as std.stdio, because it ignores the module declaration. Instead, it should look at the module declaration, see that pkg.std.stdio is not std.stdio, and then continue the search for the real std.stdio. Maybe this too can be fixed without addressing the core problem of missing verification (e.g. with a "best match" kind of thing). But I'm afraid this way leads down a rabbit hole of special cases. In my opinion, it would be better to have a clean cut, maybe with a deprecation period.
Comment #26 by johnnymarler — 2018-02-14T16:45:24Z
After more pondering I believe I've been able to pin down the fundamental problem. TLDR: The current import semantics fundamentally break import order and module order invariance. To demonstrate let me describe current import semantics. import foo; When the compiler sees this, it first checks if a module has been loaded whose "module name" is `foo`. The important part is "if a module has been loaded". If this check fails, then it falls back to searching the fileystem for a file matching the import name. Note that when this occurs, the file that ends up matching the import may not have a "module name" matching the name used to import it. And this is the fundamental problem. What this means is that based on what modules have previously been loaded (either via previous modules or previous imports in the same file) the semantics of the import change. In one case, the import causes a previously loaded module to be imported, and in the other case, it will query the file system for the module. Since module names don't have to match filenames, this means that the same import can result in loading different modules. The important distinction is that this change isn't caused by different command line arguments, it's cause by changing the import order or the module order, which breaks order invariance.
Comment #27 by johnnymarler — 2018-02-16T00:52:06Z
Ok. Walter/Andrei have decided they don't want to drop support to allow an import to match the filename without matching the module name. Luckily, we have found a way to detect when order invariance is violated while still supporting existing semantics. So the fix to this issue is to change the spec. An import name may represent a module name, or a filename, whose module name may not match the file it is contained in. The solution for the order invariance problem is to detect and assert an error when a module is imported with different names.
Comment #28 by timothee.cour2 — 2018-02-16T01:05:37Z
Re-asking since no-one answered this: Is there any single use case that current (broken) behavior allows that couldn't be done with existing option such as -mv? ``` -mv=<package.module>=<filespec> use <filespec> as source file for <package.module> ``` If the answer is 'no but we don't want to break code that relied on this broken behavior', could we at least start deprecation process for this or issue a warning to use -mv instead?
Comment #29 by destructionator — 2022-12-24T23:11:28Z
A rule I've written before is any module you import MUST have a module declaration. The language doesn't require this but it really should, the implicit module name hack has never worked well.
Comment #30 by bugzilla — 2022-12-25T08:17:19Z
Comment #31 by ag0aep6g — 2023-06-30T20:22:54Z
I just ran into this one again in the wild. Step 1: Start with a program that prints some data as an HTML table. Step 2: Add another module to print the same data in CSV format. Do this by copying html.d to csv.d and adjust as needed. Make sure you forget updating the module declaration. The resulting code (heavily simplified, of course): --- main.d module main; void main() { const mode = "html"; if (mode == "csv") { import csv; print(); } else if (mode == "html") { import html; print(); } } --- csv.d module html; /* oopsie daisy */ void print() { import std.stdio; writeln("foo,bar\nbaz,qux"); } --- html.d module html; void print() { import std.stdio; writeln("<table><tr><td>foo</td><td>bar</td></tr>" ~ "<tr><td>baz</td><td>qux</td></tr></table>"); } --- Step 3: Run with `dmd -i -run main.d`. Be surprised that it prints CSV even though `mode` is clearly set to "html". Scratch your head. Stare at the screen. Question your sanity. Finally, figure out that you forgot to update the module declaration. Wonder why the compiler didn't complain. Remember that bug you filed seven(!) years(!) ago.
Comment #32 by robert.schadek — 2024-12-13T18:44:49Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/19045 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB