Bug 11365 – Allow D source file names to have no extension (or an arbitrary extension) when -run is used

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2013-10-26T14:10:00Z
Last change time
2013-11-17T17:48:05Z
Keywords
pull
Assigned to
nobody
Creator
bugzilla

Comments

Comment #0 by bugzilla — 2013-10-26T14:10:21Z
---- eles writes ----

This forces scripts to bear the .d extension. For example, if you write a script on Linux named "git-test" and you put at the top:

#!rdmd

rdmd will pass its name to dmd, and dmd will try to compile "git-test.d", which does not exist. Now you have either to rename "git-test" to "git-test.d", or to create a hardlink named "git-test.d" that points to "git-test", so that dmd's hunger for ".d" is finally satisfied.

The hardlink solution carries the well-known burden of redundancy, not to mention that it is idiotic and makes backing up a mess.

OTOH, renaming the original script to "git-test.d" has an undesirable effect with respect to git. git has a nice convention: you can extend its command list by writing your own "git-command1", "git-command2" scripts, and git invokes them automatically when you type "git command1" (this will invoke "git-command1"), etc.

The problem with being forced to rename "git-command1" to "git-command1.d" is that, afterwards, you have to type "git command1.d" to have "git-command1.d" invoked, because "git-command1" simply does not exist or, if it did exist, dmd would be blind to it.

So you cannot type "git command1" and have a "git-command1" script invoked, because git won't search for "git-command1.d", while dmd won't compile "git-command1". You need both "git-command1" and "git-command1.d" doing the same thing just to be able to type "git command1" (not to mention that this also allows you to invoke "git command1.d", which is ugly and undesired redundancy).

Now, imagine yourself having to type:

"git checkout.d ."
"git commit.d"
"git log.d"

instead of

"git checkout ."
"git commit"
"git log"

and tell me that ".d" is not an issue.

----------------------

To that end, I propose that for:

dmd foo

dmd will treat 'foo' as the source file if it does not find foo.d or foo.di.
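The lookup order proposed above can be sketched as follows. This is only an illustration in Python, not dmd's actual code: the function name `resolve_source` and the `exists` callback are hypothetical, and the sketch ignores import paths and names that already carry an extension.

```python
from typing import Callable, Optional


def resolve_source(name: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the file that would be compiled for `dmd name`, or None.

    Hypothetical helper: models only the rule proposed in this comment.
    """
    # Current behaviour: look for the name with a D extension appended.
    for candidate in (name + ".d", name + ".di"):
        if exists(candidate):
            return candidate
    # Proposed fallback: treat the bare name itself as the source file.
    if exists(name):
        return name
    return None


# A directory that contains only the extensionless script.
files = {"git-test"}
print(resolve_source("git-test", files.__contains__))
```

With only "git-test" present, the fallback branch is the one that fires; if "git-test.d" also existed, it would still win, which is the precedence the proposal describes.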
Comment #1 by dlang-bugzilla — 2013-10-26T15:04:00Z
I should note that "auto-correcting" file names has security implications.

Let's suppose that there exists an upload script file, written in D, called "upload", in the root of a web server's public directory. The upload script goes like this:

#!rdmd
(code follows)

The upload script allows users to upload files with any name to the same directory. Naturally, for security reasons, none of the uploaded files can be executable, and it's not possible to overwrite the upload script by uploading a file with the same name.

Now, what happens if someone uploads a file called "upload.d"? The webserver runs "upload", which runs "rdmd upload", which runs "dmd upload", which compiles the file "upload.d", and not "upload". The uploader successfully got their code running on the server.

Possible solutions:

1) deprecate then remove all name auto-correction features from dmd and rdmd
2) forbid compilation if an ambiguity exists due to name auto-correction (although now this turns from an RCE vulnerability into a DoS vulnerability)
3) remove auto-correction features from rdmd; make rdmd pass a flag to dmd that disables name auto-correction

---------------------------------------------------

Another problem with this suggestion:

echo 'void main(){}' > foo.d
dmd foo
rm foo.d
dmd foo

The last command will now make dmd try to parse the compiled executable "foo" as D source.
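Solution 2 from the list above can be sketched like this. It is a minimal illustration in Python under stated assumptions, not something dmd or rdmd implements: the names `candidates_for` and `resolve_unambiguous` are hypothetical, and only the ".d" and ".di" suffixes mentioned in this thread are considered.

```python
from typing import Callable, List, Optional

D_EXTENSIONS = (".d", ".di")


def candidates_for(name: str, exists: Callable[[str], bool]) -> List[str]:
    """List every existing file that name auto-correction could pick."""
    names = [name] + [name + ext for ext in D_EXTENSIONS]
    return [n for n in names if exists(n)]


def resolve_unambiguous(name: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Refuse to compile when more than one candidate exists (hypothetical rule)."""
    found = candidates_for(name, exists)
    if len(found) > 1:
        raise ValueError(f"ambiguous source name {name!r}: {found}")
    return found[0] if found else None


# The upload scenario: an attacker has dropped "upload.d" next to "upload".
try:
    resolve_unambiguous("upload", {"upload", "upload.d"}.__contains__)
except ValueError as err:
    print(err)
```

As the comment already points out, refusing to compile trades code execution for denial of service: the attacker can no longer run code, but can still stop the script from running.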
Comment #2 by dlang-bugzilla — 2013-10-26T15:06:08Z
One thing I forgot to mention regarding name auto-correction. Perhaps, the most famous security problem caused by such a mis-feature, is the "MultiViews" feature in the Apache web server. When enabled, a request for foo.php could execute foo.php.txt if foo.php was not found. This allowed bypassing upload script validation checks. Search the web for "MultiViews vulnerability" for more details.
Comment #3 by doob — 2013-10-27T03:12:32Z
(In reply to comment #1)
> 3) remove auto-correction features from rdmd; make rdmd pass a flag to dmd that
> disable name auto-correction

That won't fix the problem if one is using "dmd -run".
Comment #4 by leandro.lucarella — 2013-10-27T05:05:52Z
I just updated the title of the issue; arbitrary extension names should be allowed for the same reason. I also agree with Vladimir: the compiler shouldn't add any extension when the file is not found.
Comment #5 by andrej.mitrovich — 2013-10-30T08:34:33Z
Btw, no extensions might be fine, but I'm totally against D sources having arbitrary extensions. People will start doing the same thing C++ programmers do and start inventing 20 different extensions for D sources, so you end up with extensions like:

.cpp
.cxx
.cp
.cc
.c++

See also: http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Overall-Options.html

This would just make creating software that deals with D files harder, with no benefits.
Comment #6 by andrej.mitrovich — 2013-10-30T08:36:07Z
(In reply to comment #4)
> I just updated the title of the issue, arbitrary extension names should be
> allowed for the same reason.

So you're the one adding this. What benefit do you see with arbitrary extensions?
Comment #7 by leandro.lucarella — 2013-10-30T08:41:42Z
(In reply to comment #6)
> (In reply to comment #4)
> > I just updated the title of the issue, arbitrary extension names should be
> > allowed for the same reason.
>
> So you're the one adding this. What benefit do you see with arbitrary
> extensions?

First, it is worth mentioning that the extension problem happened to C++ only for historical reasons. There is no reason to think it will happen to D, as it doesn't happen in any other language that is flexible in terms of naming files.

The reason for allowing an arbitrary file NAME (the extension is just an artificial separation of a file name) is the same as the one mentioned in the issue description. The compiler has no reason to limit how I can name files. What if I want to create a script that's called "dlang.org"? For some reason I might have a system to fetch stuff from websites and name the scripts after the host name.

The moment D pretends it can be used for scripting is the moment D loses its right to place limitations on file naming. It's that simple.
Comment #8 by bearophile_hugs — 2013-10-30T08:53:19Z
(In reply to comment #5)
> Btw, no extensions might be fine, but I'm totally against D sources having
> arbitrary extensions. People will start doing the same thing C++ programmers do
> and start inventing 20 different extensions for D sources, so you end up with
> extensions like:
>
> .cpp
> .cxx
> .cp
> .cc
> .c++

+1. If you offer programmers some freedom, someone will inevitably use it, often with chaotic/confusing results (I see it every day in D.learn). Giving freedom should be done only where there is a large advantage in doing so.
Comment #9 by bearophile_hugs — 2013-10-31T07:42:41Z
Having a standard extension for D code is useful for programs like "cloc" that count lines of code, for editors that open .d files with correct D colorization, and for my scripts that select files with a .d suffix to test incompatibilities across different compiler versions. I have testing scripts that look through directories and test .d files differently from .py files. And it's not just a matter of my own code; it also matters for D libraries from other people.
Comment #10 by public — 2013-10-31T07:48:57Z
As I have already mentioned in NG, the very idea that file extension should have any relation with its content is just plain wrong and needs to be discouraged, as well as any arbitrary limitations that may impose.
Comment #11 by public — 2013-10-31T07:50:47Z
In other words, it is not so much a problem of the DMD codebase that it uses ".c" for C++ code; it is a problem of IDEs/tools that assume it is C code without providing any convenient way to override that assumption.
Comment #12 by pro.mathias.lang — 2013-10-31T10:08:55Z
Why should we enforce this? We enforce things to prevent obvious mistakes. The D language plays well in this field. It ensures what it is sure needs to be ensured, and gives you the tools to build your own rules, with the least burden.

It's not a mistake to have a source file with an arbitrary extension, or no extension at all. DMD will still know its argument is a source file, whatever its name is. And there are some valid use cases where you would not want a .d[i] extension, as eles noticed in the quoted comment above, and in the NG.

(In reply to comment #9)
> Having a standard extension for D code is useful for programs like "cloc" that
> count lines of code, with editors that open .d files with correct D
> colorization, for my scripts that select files with .d suffix to test
> incompatibilities across different compiler versions. I have testing scripts
> that test .d files differently from .py files looking in directories. And it's
> not just a matter of my own code, it also matters for D libraries from other
> people.

As you point out, there are also some use cases where a tool requires a specific extension. But that's none of our business; the tool should ensure it, not DMD. The real problem for those tools is to know what the file holds. We don't have such problems with DMD.

For the record, good editors solve the problem easily, like vim or emacs:

# vim: syntax=d ts=4 sw=4 sts=4 sr noet
# -*- d-mode -*-
Comment #13 by leandro.lucarella — 2013-10-31T10:50:57Z
I quickly tried to implement this by only disabling the extension checks when the `-run` option is passed, but I failed miserably because the automatic extension appending is so deep in the module code, and then object.d isn't found because the .d isn't added to the module name.
Comment #14 by leandro.lucarella — 2013-10-31T11:38:36Z
Comment #15 by andrej.mitrovich — 2013-10-31T12:08:58Z
(In reply to comment #12)
> Why should we enforce this ? We enforce things to prevent obvious mistakes. D
> language plays well in this field. It ensures what it is sure needs to be
> ensured, and give you the tools to build your own rules, with the least
> burdens.
>
> It's not a mistake to have a source file with an arbitrary extension, or no
> extension at all. DMD will still know its argument is a source file, whatever
> its name is.

That's not true. It can't have a .lib extension, or an .obj/.o extension.

Arbitrary extensions mean import switches will not work; the compiler won't know which files it has to inspect to find D code. So this feature will be useful for scripts and in cases where you're explicitly passing all files to DMD.

(In reply to comment #10)
> As I have already mentioned in NG, the very idea that file extension should
> have any relation with its content is just plain wrong and needs to be
> discouraged, as well as any arbitrary limitations that may impose.

That's exactly what happens when you allow arbitrary extensions: tools end up inventing their own semantics *based on* the extension.

http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Overall-Options.html:

file.c
    C source code that must be preprocessed.
file.i
    C source code that should not be preprocessed.
file.ii
    C++ source code that should not be preprocessed.
file.tcc
    C++ header file to be turned into a precompiled header or Ada spec.

See what I mean? If we allow only .d/.di, plus extensionless files, then we keep everything simple.

> In other words, it is not a as much of a problem of DMD codebase that is uses
> ".c" for C++ code, it is a problem of IDE's/tools that assume it is a C code
> without providing any convenient way to override that assumption.

So now every tool in existence has to do heuristics on text files? The benefit of having known and defined extensions is to make it easier to figure out what a file is without having to open it, and to make it easier to filter through a directory of files and organize them based on their extension.

Using .c for C++ files is Walter's fault and nobody else's. There are no excuses here.
Comment #16 by andrej.mitrovich — 2013-10-31T12:14:39Z
(In reply to comment #12)
> For the record, good editors solve the problem easily, like vim or emacs:
> # vim: syntax=d ts=4 sw=4 sts=4 sr noet
> # -*- d-mode -*-

You call that a solution? Arbitrary tools adding an arbitrary amount of HEADER information they've invented? So then other tools have to be able to interpret these lines too. This doesn't scale. It's not a solution.
Comment #17 by public — 2013-10-31T13:27:13Z
(In reply to comment #15)
> That's not true. It can't have a .lib extension, or an .obj/.o extension.

This is purely a problem of how the DMD argument list is designed, not a meaningful limitation. And yet another example of what apps shouldn't do.

> Arbitrary extensions means import switches will not work, the compiler won't
> know which files it has to inspect to find D code. So this feature will be
> useful for scripts and in cases where you're explicitly passing all files to
> DMD.

Exactly. And someone who wants to use arbitrary extensions will be aware that he is stepping aside from the common naming convention and thus losing some convenience offered by the compiler. It is perfectly expected.

> (In reply to comment #10)
> > As I have already mentioned in NG, the very idea that file extension should
> > have any relation with its content is just plain wrong and needs to be
> > discouraged, as well as any arbitrary limitations that may impose.
>
> That's exactly what happens when you allow arbitrary extensions, tools end up
> inventing their own semantics *based on* the extension:
> ...
> See what I mean?

It is exactly what happens when _someone_ (compiler, tools, whatever) decides to strictly couple some behavior exclusively to the extension. See what I mean? :)

> > In other words, it is not a as much of a problem of DMD codebase that is uses
> > ".c" for C++ code, it is a problem of IDE's/tools that assume it is a C code
> > without providing any convenient way to override that assumption.
>
> So now every tool in existence has to do heuristics on text files?

Yes, if it is important (there are standard tools for that, like the famous "file" command). In most cases, though, a tool should just try to interpret the input as if it were a legal file and fail in the process if it contains garbage, similar to how it will fail if you put garbage into a .d file. And the context of interpretation should be defined by compiler switches, configuration files or some other external thing.

Using a default interpretation defined by a convention like the file extension is also fine if it can be overridden with a manual option.

> The benefit
> of having known and defined extensions is to make it easier to figure out what
> a file is without having to open it, to make it easier to filter through a
> directory of files and organize them based on their extension.

As I have said, crazy DOS legacy. Luckily, most Linux file managers don't do this and actually explore file metadata.

> Using .c for C++ files is Walter's fault and nobody else's. There are no
> excuses here.

There are no excuses, but there is also no disaster. It is bad to break common practice, but any sane IDE will allow you to trivially configure a mapping of .c files to C++ semantics. Just as they should.
Comment #18 by leandro.lucarella — 2013-11-01T07:56:44Z
(In reply to comment #16)
> (In reply to comment #12)
> > For the record, good editors solve the problem easily, like vim or emacs:
> > # vim: syntax=d ts=4 sw=4 sts=4 sr noet
> > # -*- d-mode -*-
>
> You call that a solution? Arbitrary tools adding an arbitrary amount of HEADER
> information they've invented? So then other tools have to be able to interpret
> these lines too. This doesn't scale. It's not a solution.

Just a comment about this, even though it is irrelevant to my proposed solution: you only need to add extra information when you depart from the default. It is like D itself. Do you need to write all your code in ASM? No, but when you need it, it is there. It will be painful and you won't get lots of features, but you can do it. You are a grown-up and know what's best for you.
Comment #19 by bugzilla — 2013-11-09T11:18:25Z
Some points:

1. dmd foo
will compile 'foo.d' and create 'foo.exe' on Windows. So if 'foo' is allowed to be a D source file, this works fine. But on Linux, compiling source code 'foo' and creating executable file 'foo' will overwrite the source file. You'll have to specify -of, too.

2. If you wish to name a script file 'foo.bar', D uses the file name as the module identifier. D does not allow non-identifier characters in the module identifier. This means you also cannot have scripts named '7foo(!)%^bar'.
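Point 2 can be illustrated with a small check. This is a sketch in Python under a loud assumption: it uses an ASCII-only approximation of an identifier (a letter or underscore followed by letters, digits or underscores), ignores Unicode letters and keyword clashes, and does not model how dmd actually derives the module name from a path. The function name `is_module_identifier` is hypothetical.

```python
import re

# ASCII-only approximation of an identifier (assumption: no Unicode
# letters and no keyword check, unlike a real D lexer).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_module_identifier(name: str) -> bool:
    """Return True if `name` could be used verbatim as a module identifier."""
    return _IDENTIFIER.fullmatch(name) is not None


for name in ("foo", "foo.bar", "7foo(!)%^bar", "git-test"):
    print(f"{name!r}: {is_module_identifier(name)}")
```

Under this approximation only 'foo' passes. Note that 'git-test', the script name from the original report, fails as well because of the hyphen.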
Comment #20 by bugzilla — 2013-11-09T14:18:36Z
Comment #21 by leandro.lucarella — 2013-11-11T03:09:03Z
(In reply to comment #19)
> Some points:
>
> 1. dmd foo
> will compile 'foo.d' and create 'foo.exe' on Windows. So if 'foo' is allowed to
> be a D source file, this works fine. But on Linux, compiling source code 'foo'
> and creating executable file 'foo' will overwrite the source file. You'll have
> to specify -of, too.

This is another (critical!) issue to fix before claiming DMD can replace scripting languages: https://d.puremagic.com/issues/show_bug.cgi?id=5243

> 2. If you wish to name a script file 'foo.bar', D uses the file name as the
> module identifier. D does not allow non-identifier characters in the module
> identifier. This means you cannot also have scripts named '7foo(!)%^bar'
> either.

This is another problem. What about creating a random module identifier for the file being passed to DMD when -run is used (and no module name has been set explicitly)?

As I said in the pull requests, for DMD to fully support the scripting domain, it needs to be able to accept arbitrary names. A script is a binary, and DMD should not limit how you can name binaries; it is like limiting what you can pass to -of.
Comment #22 by github-bugzilla — 2013-11-17T17:20:05Z
Commits pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/f43d021d1ad23782a57e75162d3c8e7744b83ab4
fix Issue 11365 - Allow D source file names to have no extension (or an arbitrary extension)

https://github.com/D-Programming-Language/dmd/commit/237362e61ffc7502f24eae018de5fff906386238
Merge pull request #2731 from WalterBright/fix11365

fix Issue 11365 - Allow D source file names to have no extension (or an arbitrary extension)