Created attachment 1203
repro case
Here is a list of all things wrong with export:
32 & 64 bit issues:
1) Exporting a global variable leads to a linker error
2) Exporting thread local variables should be an error (at least it is in C++)
3) The module info should be exported as soon as the module has any exported
symbols
4) __gshared members of a class are not exported
5) The TypeInfo Object of the TestClass is not exported
6) The TypeInfo Object of TestStruct is not exported
See attached repro case. If this repro case actually compiles and runs, DLL
support should be sufficient to support a shared runtime.
Comment #1 by code — 2013-04-06T19:05:59Z
(In reply to comment #0)
> Created an attachment (id=1203) [details]
> repro case
>
> Here is a list of all things wrong with export:
>
> 32 & 64 bit issues:
> 1) Exporting a global variable leads to a linker error
What error?
> 2) Exporting thread local variables should be an error (at least it is in c++)
Yes.
> 3) The module info should be exported as soon as the module has any exported
> symbols
> 4) __gshared members of a class are not exported
Bug.
> 5) The TypeInfo Object of the TestClass is not exported
> 6) The TypeInfo Object of TestStruct is not exported
>
Any compiler generated symbol referenced by an exported symbol should be exported too, e.g. ModuleInfos, TypeInfos, opEquals and such.
> See attached repro case. If this repro case actually compiles and runs dll
> support should be sufficient to support a shared runtime.
Nice.
Comment #2 by code — 2013-04-06T21:12:36Z
(In reply to comment #1)
> (In reply to comment #0)
> > 2) Exporting thread local variables should be an error (at least it is in c++)
> Yes.
Actually no, at least ELF supports linking/accessing a TLS value of another shared library.
AFAIK Windows also uses an index per DLL for TLS access so it should be possible to link/relocate against them.
Comment #3 by code — 2013-04-07T21:32:10Z
(In reply to comment #0)
> 1) Exporting a global variable leads to a linker error
The problem seems to be that export uses C-like rules to distinguish declarations and definitions. For declarations export means __declspec(dllimport) and for definitions __declspec(dllexport).
export void foo(); // this is treated as an import
export void foo() {} // this is treated as an export
For a global variable to be treated as an export it needs an initializer.
export __gshared int bar; // this is treated as an import
export __gshared int bar = 0; // this is treated as an export
I don't think that rule makes sense, as the first form actually defines a variable, which is default initialized.
To mark a declaration-only symbol, extern should be used.
Furthermore this behavior always requires a header file with only declarations for any importing module.
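A minimal sketch of the point above, assuming a compiler in which an export variable without an initializer is treated as a definition (the behavior argued for here). The names bar, baz, qux and bothAreDefinitions are illustrative only, and the program is built as an ordinary executable, so no DLL import is involved:

```d
// Both variables are definitions: D default-initializes globals,
// so the missing initializer on `bar` does not make it a declaration.
export __gshared int bar;      // default-initialized to int.init (0)
export __gshared int baz = 0;  // explicitly initialized

// A declaration-only form would be spelled with `extern`, e.g.
//   export extern __gshared int qux;  // defined in another object file or DLL
// (left commented out because nothing in this sketch defines it).

bool bothAreDefinitions()
{
    // Distinct addresses and readable values require two real definitions.
    return &bar !is &baz && bar == 0 && baz == 0;
}

void main()
{
    assert(bothAreDefinitions());
}
```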
Comment #4 by code — 2013-05-28T09:31:31Z
> 1) Exporting a global variable leads to a linker error
fixed by bug 10059
> 2) Exporting thread local variables should be an error (at least it is in c++)
Can't you link against a DLL containing the definition of a thread local variable?
> 3) The module info should be exported as soon as the module has any exported
> symbols
Makes sense to me.
> 4) __gshared members of a class are not exported
fixed by bug 10059
> 5) The TypeInfo Object of the TestClass is not exported
> 6) The TypeInfo Object of TestStruct is not exported
If a user defined type is exported all compiler-generated metadata, vtables, TypeInfo, RTInfo!Type etc., should be exported too.
I'm not positive whether exporting a UDT should imply that all its members are exported too.
To add export annotations in druntime/phobos we need to fix bug 922.
Comment #5 by code — 2013-05-29T06:21:17Z
> Can't you link against a DLL containing the definition of a thread local
> variable?
I can. But I thought it is better to make something that is hard to implement an error and maybe allow it later, than to allow it and not implement it.
Comment #6 by code — 2013-05-29T06:22:40Z
Sorry, I wanted to say that I'm not sure if it is possible to make thread local variables work across DLL boundaries, so I think we should make it an error first and might be able to allow it later.
Comment #7 by r.sagitario — 2013-08-27T00:07:48Z
A few random comments:
- I think in a situation where you want to use the same code for static and dynamic linking, "export" is not usable. I'd propose to export every public symbol at module granularity depending on a compile switch.
- but then, there is no easy way to tell whether symbols in a module are imported from a static or dynamic library. This distinction is necessary though, as the code is different for both situations. Maybe a versioned pragma at module level can control this.
- Using TLS variables from other DLLs is possible, but probably not with the current tool chain. What needs to be exported is the offset of the symbol in the TLS section and the address of the _tls_index variable in the DLL that exports the symbol. The code to read the variable could then look like this
mov EAX,[_imp_variable.tls_index]; // read address of tls_index in import table
mov EAX,[EAX]; // read tls_index of DLL
mov EBX,FS:[2C]; // tls_array
mov EBX,[EBX+4*EAX]; // tls_start of DLL
mov ECX,[_imp_variable.offset]; // read offset of variable in TLS of DLL
mov EDX,[EBX+ECX]; // read variable
If the offset is not exportable, _tls_start is also needed for the DLL.
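The lookup in the assembly above can be modelled in plain D. This is only a toy sketch: ordinary arrays stand in for the per-thread TLS array and for the import table entries, and the names TlsImport and readImportedTls are made up for illustration, not part of any runtime.

```d
// Toy model only: plain arrays stand in for the per-thread TLS array
// and for the two values read from the import table.
struct TlsImport
{
    const(uint)* tlsIndex; // address of the exporting DLL's _tls_index
    size_t offset;         // byte offset of the variable in that DLL's TLS block
}

int readImportedTls(const ubyte[][] tlsArray, TlsImport imp)
{
    const uint index = *imp.tlsIndex;        // read tls_index of the DLL
    const(ubyte)[] block = tlsArray[index];  // TLS block of that DLL
    return *cast(const(int)*)(block.ptr + imp.offset); // read the variable
}

void main()
{
    // Two "DLLs", each with a 16-byte TLS block for the current thread.
    auto blocks = [new ubyte[16], new ubyte[16]];
    *cast(int*)(blocks[1].ptr + 8) = 1234; // variable at offset 8 in DLL #1

    uint tlsIndexOfDll1 = 1;
    auto imp = TlsImport(&tlsIndexOfDll1, 8);
    assert(readImportedTls(blocks, imp) == 1234);
}
```

The two import table reads correspond to the two fields of TlsImport, which is why both the index address and the offset would have to be exported.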
Comment #8 by code — 2013-08-27T09:41:35Z
Wouldn't it be easier to implement a getter function for each TLS variable which just returns the address of the variable value?
Comment #9 by r.sagitario — 2013-08-28T23:29:36Z
You're right, a function would be simpler. It might be a little less efficient because of the indirect jump, but it avoids the two indirect data accesses through the import table.
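A minimal sketch of the getter idea, run inside a single executable rather than across a real DLL boundary. The names counter and counterAddress are hypothetical; the assumption is that a DLL would export only the function and keep the TLS variable itself private:

```d
import core.thread : Thread;

int counter; // module-level variables are thread-local by default in D

// Hypothetical accessor a DLL could export instead of the variable itself.
export int* counterAddress() nothrow @nogc
{
    return &counter; // address of the calling thread's copy
}

void main()
{
    *counterAddress() = 42;

    // A second thread gets its own, default-initialized copy.
    int seenInOtherThread = -1;
    auto t = new Thread({
        seenInOtherThread = *counterAddress();
        *counterAddress() = 7;
    });
    t.start();
    t.join();

    assert(seenInOtherThread == 0);
    assert(counter == 42); // the main thread's copy is untouched
}
```

Because the address is computed inside the module that owns the variable, the importing side never needs the TLS index or the offset.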
Comment #10 by code — 2013-08-29T07:49:41Z
(In reply to comment #7)
> A few random comments:
>
> - I think in a situation where you want to use the same code for static and
> dynamic linking, "export" is not usable. I'd propose to export every public
> symbol at module granularity depending on a compile switch.
I'm not a big fan of conflating protection and export. For example you could split a library in two DLLs, in which case you might need to export/import private and package protected symbols.
> - but then, there is no easy way to tell whether symbols in a module are
> imported from a static or dynamic library. This distinction is necessary
> though, as the code is different for both situations.
Can you tell a bit more about this?
Comment #11 by code — 2013-08-29T08:28:06Z
When a variable that is linked in through a static library is accessed, the compiler generates a direct access. If it is linked in through a dynamic library, however, the compiler needs to generate another level of indirection through the import table. This is done by referencing the _imp_ symbol instead of the original symbol directly. That's why the compiler has to know if the symbol is imported from a dynamic or static library.
Nice blog post. I have implemented something similar to "auto-import" by adding some additional relocation data for data accesses. At program start the addresses that are relocated to the import table are patched.
Unfortunately this does not work for 64-bit applications, because relocations inside a dmd generated binary are 32-bit pc-relative only. You cannot put the address of a variable inside another DLL there, as it might be anywhere in the 64-bit address space and out of reach for the 32-bit relative address.
Comment #14 by r.sagitario — 2013-08-29T11:12:17Z
(In reply to comment #10)
> I'm not a big fan of conflating protection and export. For example you could split
> a library in two DLLs in which case you might need to export/import private and
> package protected symbols.
I guess you are referring to "public symbols" only. Actually anything is fine for me, but I guess it doesn't really make sense for private symbols as you are not allowed to access them from another module anyway.
Comment #15 by code — 2013-08-29T11:38:04Z
(In reply to comment #11)
> When a variable is accessed which is linked in through a static library the
> compiler generates a direct access. If it is linked in through a dynamic
> library however the compiler needs to generate another level of indirection
> through the import table. Which is done by referencing _imp_ symbol instead of
> the original symbol directly. That's why the compiler has to know if the symbol
> is imported from a dynamic or static library.
Maybe my understanding of the Windows mechanism is wrong, but I don't think that the exported symbol has to be different. All the runtime binding happens in the import library, which is similar to the PLT for ELF. Code that refers to an imported symbol will then statically link against the import library.
Comment #16 by code — 2013-08-29T11:44:50Z
> Maybe my understanding of the Windows mechanism is wrong, but I don't think
> that the exported symbol has to be different. All the runtime binding happens
> in the import library, which is similar to the PLT for ELF. Code that refers
> to an imported symbol will then statically link against the import library.
Yes, for function symbols this is true, but data symbols require special handling as described before. What are you trying to suggest with this statement?
Comment #17 by code — 2013-08-29T11:54:03Z
(In reply to comment #16)
> Yes, for function symbols this is true, but data symbols require special
> handling as described before. What are you trying to suggest with this
> statement?
Well, the question is, whether we can annotate symbols with "export" and still create static libraries.
Comment #18 by code — 2013-08-29T11:58:13Z
> Nice blog post. I have implemented something similar to "auto-import" by adding
> some additional relocation data for data accesses. At program start the
> addresses that are relocated to the import table are patched.
>
> Unfortunately this does not work for 64-bit applications, because relocations
> inside a dmd generated binary are 32-bit pc-relative only. You cannot put the
> address of a variable inside another DLL there as it might be anywhere in the
> 64-bit address space and out of reach for the 32-bit relative address.
An optimization like this would be nice, but Walter already stated that he prefers the classical way (another level of indirection). Maybe we should first concentrate on the other issues at hand before discussing optimizations.
The most important being:
- When does export mean dllimport and when dllexport? Newsgroup discussion brought up that we could enable dllimport/dllexport per module (including all submodules) via a command line switch.
- Do we want an "export all public symbols" feature? (Discussion on the newsgroup brought up that C++ is trying to move away from this; maybe we should too.)
- Should exporting of entire classes / structs be possible? E.g. marking a class as export exports its vtable, typeinfo, all protected / public functions, static members, etc.
- Which information about modules needs to be exported? Will it be automatically exported as soon as the module has a single exported function / class / variable?
> Well, the question is, whether we can annotate symbols with "export" and still
> create static libraries.
At the moment: no. But we should create a solution where this very case will work. Proposed solution: 'export' is always a no-op unless you specify otherwise using a command line switch to the compiler.
Comment #19 by code — 2013-08-29T15:05:57Z
(In reply to comment #18)
> - when does export mean dllimport and when dllexport. Newsgroup discussion
> brought up that we could enable dllimport/dllexport per module (including all
> submodules) via a command line switch.
There are some problems with the current implementation.
export void foo() {} // definition => dllexport
export void foo(); // declaration => dllimport
export int a = 0; // definition => dllexport
export int a; // declaration => dllimport // fails because it's actually a definition
export extern int a; // declaration => dllimport
It would be great if we could avoid extra .di headers.
> - do we want a export all public symbols feature (discussion on the newsgroup
> brought up that c++ is trying to move away from this, maybe we should too)
Please let's try to go into the other direction on Unix too.
You can find more about the reasoning here.
http://gcc.gnu.org/wiki/Visibility
http://people.redhat.com/drepper/dsohowto.pdf
> - Should exporting of entire classes / structs be possible? E.g. marking a
> class as export, exports its vtable, typeinfo, all protected / public
> functions,
> static members, etc.
Yes, because there is no way to annotate compiler generated data.
Once again please stay away from abusing protection for export because it creates too many problems for future language extensions, e.g. maintaining private symbols for ABI compatibility, module definitions in multiple files/objects, partial classes. Linking and symbol protection are fundamentally different concepts and we should offer orthogonal control.
> - Which informations about modules need to be exported? Will they be
> automatically exported as soon as the module has a single exported function /
> class / variable?
Yes, it's hidden compiler data and you might need to link against the ModuleInfo and some other symbols.
>
> > Well, the question is, whether we can annotate symbols with "export" and still
> > create static libraries.
>
> At the moment: no. But we should create a solution where this very case will
> work. Proposed solution: 'export' is always a no-op unless you specify
> otherwise using a command line switch to the compiler.
That sounds like a reasonable solution when we can't do better.
Also see the alias discussion on the newsgroup http://forum.dlang.org/post/[email protected].
Comment #20 by code — 2013-08-29T15:37:26Z
(In reply to comment #19)
> > > Well, the question is, whether we can annotate symbols with "export" and still
> > > create static libraries.
> >
> > At the moment: no. But we should create a solution where this very case will
> > work. Proposed solution: 'export' is always a no-op unless you specify
> > otherwise using a command line switch to the compiler.
>
> That sounds like a reasonable solution when we can't do better.
> Also see the alias discussion on the newsgroup
> http://forum.dlang.org/post/[email protected].
That wouldn't work in the case where you create a DLL that both exports symbols and imports symbols from another DLL.
Comment #21 by code — 2013-08-29T19:29:11Z
To summarize the alias proposal.
For every exported function definition we also emit an alias symbol _imp_funcname.
For every exported data definition we emit a weakly linked read-only pointer T* _imp_var = &var.
Whenever an exported symbol is called or accessed this is done using the _imp_* symbol.
When such code gets linked with an import library it will correctly work with a DLL.
When such code gets linked with a static library it will reference the correct definitions.
The simple export attribute is sufficient for all use-cases, no worries about dllimport/dllexport/no-op.
If we were able to use whole program optimization the linker could optimize away the additional data access indirection when linking statically.
I don't think the last point is too critical because exporting data is rarely done and rather a bad practice.
Also this only applies to the API boundary which shouldn't be a performance hotspot.
For ELF export should simply make a symbol visible, otherwise symbols should be hidden by default.
Any ideas about Mach-O? Same as ELF?
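A minimal sketch of how the alias proposal would look for data, written by hand in D. The pointer name imp_var is a stand-in chosen for illustration; under the proposal the compiler itself would emit the _imp_ symbol, and weak linkage is not modelled here:

```d
// Stands in for an exported data definition inside the library.
export __gshared int var = 5;

// What the proposal would have the compiler emit next to the definition
// (hypothetical name; the real symbol would be the generated _imp_ alias).
__gshared int* imp_var = &var;

// Client code always goes through the pointer, whether `var` ends up
// in a static library or behind a DLL import table.
int readVar()
{
    return *imp_var;
}

void main()
{
    assert(readVar() == 5);
    var = 9;
    assert(readVar() == 9); // the pointer tracks the single definition
}
```

When linking statically the pointer resolves to the local definition, which is the extra data indirection that whole program optimization could remove.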
Comment #22 by r.sagitario — 2013-08-29T23:40:47Z
(In reply to comment #19)
> > - do we want a export all public symbols feature (discussion on the newsgroup
> > brought up that c++ is trying to move away from this, maybe we should too)
>
> Please let's try to go into the other direction on Unix too.
> You can find more about the reasoning here.
Do you want to annotate all of phobos and druntime with "export" to build a shared version? I think we are already in annotation hell with nothrow, pure, @safe, etc...
Comment #23 by code — 2013-08-29T23:59:07Z
(In reply to comment #22)
> Do you want to annotate all of phobos and druntime with "export" to build a
> shared version? I think we are already in annotation hell with nothrow, pure,
> @safe, etc...
Yes, that's what I intend. Doing this manually is important, because exporting a symbol means committing to ABI stability, which should be a long-term goal.
Because exported symbols are the known entry points of a library this reopens the question whether we could use much more inference for non-exported symbols.
Comment #24 by code — 2013-08-30T00:32:50Z
> I don't think the last point is too critical because exporting data is rarely
> done and rather a bad practice.
> Also this only applies to the API boundary which shouldn't be a performance
> hotspot.
The Digital Mars linker is not capable of doing so. Also, as stated on the newsgroup, we will most likely not be able to use LTO of the Microsoft linker.
Also this will not only affect API boundaries, it will affect all accesses to global data: __gshared variables, shared variables, and all compiler internal global data symbols like module info, type info, vtables, etc. This is because without knowing what symbols are imported from a DLL we have to add the indirection to all of them. If we decide to use the alias solution we would have to accept the additional level of indirection for all global data accesses and keep in mind that this is most likely going to stay that way for a long time.
> Yes, that's what I intend. Doing this manually is important, because exporting
> a symbol means committing to ABI stability, which should be a long-term goal.
> Because exported symbols are the known entry points of a library this reopens
> the question whether we could use much more inference for non-exported symbols.
I fully agree here. We might still want to provide a -exportall switch for convenience.
Comment #25 by code — 2013-08-30T00:45:28Z
> That wouldn't work in the case where you create a DLL that both exports symbols
> and imports symbols from another DLL.
It would work, because the command line switch to the compiler would specify a module name, so it won't turn all 'export' into dllexport. E.g. if you build phobos you would add "-export std" to the command line of the compiler. This would turn all 'export' inside any of the std modules into dllexport. So this would work well for DLLs that both export and import symbols. For dllimport there would be an equivalent command line switch, e.g. "-import std", which needs to be specified when linking against a shared phobos library. These command line switches could then be added to the default sc.ini so users don't have to specify them manually.
Comment #26 by code — 2013-08-30T05:11:45Z
(In reply to comment #24)
> > I don't think the last point is too critical because exporting data is rarely
> > done and rather a bad practice.
> > Also this only applies to the API boundary which shouldn't be a performance
> > hotspot.
>
> The Digital Mars linker is not capable of doing so. Also as stated on the
> newsgroup we will most likely not be able to use LTO of the Microsoft linker.
> Also this will not only affect API boundaries, it will affect all accesses
> to global data: __gshared variables, shared variables, and all compiler
> internal global data symbols like module info, type info, vtables, etc. This is
> because without knowing what symbols are imported from a DLL we have to add the
> indirection to all of them.
No! This only applies to data that is marked as export. So you do know quite well what could be imported.
Comment #27 by code — 2013-08-30T05:26:36Z
And the metadata of exported UDTs (vtable, rtti) and moduleinfos for modules with exported members. Also note that because Windows doesn't support symbol interposition it's safe to access the data directly from within a module.
Comment #28 by code — 2013-08-30T06:07:52Z
(In reply to comment #24)
> I fully agree here. We might still want to provide a -exportall switch for
> convenience.
A compiler switch makes sense for the case where you have the source
code but don't want to modify it, e.g. a downloaded package that was
not annotated. Here it might make sense to offer -export=public,package,private
because nobody maintains the ABI anyhow.
(In reply to comment #27)
> And the metadata of exported UDTs (vtable, rtti)
Actually not vtables because they are indirectly addressed through the class instance already.
Comment #29 by code — 2013-08-30T07:49:43Z
> No! This only applies to data that is marked as export. So you do know quite
> well what could be imported.
You are correct, I didn't take into account that we can leverage export in that way.
The only question remaining is how big the performance impact is when doing cross-DLL calls. When the compiler knows that the function is located inside a DLL it can directly generate a jmp. If it doesn't know, it has to call the function stub, which itself does the jmp. That means that with this aliasing every cross-DLL function call would have an additional call instruction overhead. If you use an object-oriented API this overhead should be minimal because most function calls go through vtables anyway. But how big is it for functional libraries, like phobos?
Comment #30 by code — 2013-08-30T08:16:09Z
(In reply to comment #29)
> The only question remaining is, how big the performance impact is when doing
> cross DLL calls. When the compiler knows that the function is located inside a
> DLL it can directly generate a jmp. If it doesn't know it, it has to call the
> function stub which itself does the jmp. That means using this aliasing every
> cross-DLL function call would have an additional call instruction overhead. If
> you use a object oriented API this overhead should be minimal because most
> function calls go through vtables anyway. But how big is it for functional
> libraries, like phobos?
The situation is much worse on Linux because ALL symbols are exported by default and interposition forces indirection through the PLT or GOT even within the same shared library.
IIRC Andrei quoted a figure of 5-10% performance penalty for such PIC code.
If we can shift to only making exported symbols visible we might reduce that penalty significantly and be on par with Windows, so it's likely a few percent.
Comment #31 by code — 2013-09-01T02:50:44Z
I updated the DIP with all discussed changes and implementation details. Please give feedback in the newsgroup.
http://wiki.dlang.org/DIP45
Martin Kinkelin said that "export everything in a module" runs into a problem that only 64K exported symbols are allowed, and naturally people hit this limit.
Comment #34 by bugzilla — 2023-06-05T07:05:35Z
> export int a; // declaration => dllimport // fails because it's actually a definition
That's been fixed.
Comment #35 by robert.schadek — 2024-12-13T18:05:23Z