Bug 9655 – Two functions with identical implementations are allowed to have the same address

Status
NEW
Severity
enhancement
Priority
P4
Component
dlang.org
Product
D
Version
D2
Platform
All
OS
All
Creation time
2013-03-05T21:54:35Z
Last change time
2024-12-15T15:22:03Z
Keywords
pull
Assigned to
No Owner
Creator
Walter Bright
Moved to GitHub: dlang.org#3937

Comments

Comment #0 by bugzilla — 2013-03-05T21:54:35Z
In regards to:

http://d.puremagic.com/issues/show_bug.cgi?id=9623

Related C++ article "Can Two Functions Have the Same Address?"
http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=561

This quote pretty much sells me:

--
Additionally, Google's compiler team have experimented with Identical Code Folding (ICF) and reported that "[d]etailed experiments on the x86 platform show that ICF can reduce the text size [the program section in which functions' code is stored, DK] of some Google binaries, whose average text size is 50 MB, by up to 7%."
--

We should settle the issue by updating the D spec to explicitly allow functions to have the same address.
Comment #1 by bearophile_hugs — 2013-03-06T02:05:52Z
(In reply to comment #0)

1) What are the downsides of such folding in D?

2) Since recently, LLVM (used by LDC) folds identical functions if you use a compiler switch. If you use that switch, then where the compiler sees two identical functions, it replaces one of them with just a jump to the other. So their addresses are distinct, but the amount of wasted space in the binary is minimal.

3) Often templates generate not just identical functions, but functions that differ only in a small part, by only a few asm instructions. So a good D compiler could try to split those functions into virtual chunks (maybe if the parts are not inside a loop) and keep only one copy of the shared part. I presume this is not easy to do in general.
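As a rough source-level analogy for point 2 (a sketch in C, not taken from the thread or from any compiler's actual output), the "replace one copy with a jump" idea can be pictured as one function that keeps the real body and a second that only forwards to it. The names `twice` and `doubled` are hypothetical; the real transformation happens inside the compiler or linker, not in user code.

```c
#include <assert.h>

/* The copy whose body is kept. */
static int twice(int x) { return x + x; }

/* Originally this had the same body as twice(); after the
   transformation it only forwards to the kept copy, so it stays a
   separate entry point. An optimizer may still inline the call. */
static int doubled(int x) { return twice(x); }

/* Both entry points must behave identically for every input checked. */
static void check_forwarding(void) {
    for (int x = -3; x <= 3; ++x)
        assert(twice(x) == doubled(x));
}
```

The forwarder costs one extra jump or call per invocation, which is the trade-off against sharing a single address outright.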
Comment #2 by hsteoh — 2014-09-19T18:39:02Z
Ping.

I fully support this proposal. If two functions compile to identical code, there is no reason to expect them to have different addresses. As already mentioned, this is an important part of reducing template bloat.

So, the question is, where on the website should this be documented?
Comment #3 by yebblies — 2014-10-06T14:50:16Z
(In reply to hsteoh from comment #2)
> So, the question is, where on the website should this be documented?

A note in http://dlang.org/function.html should be fine.
Comment #4 by hsteoh — 2014-10-30T17:29:02Z
Comment #5 by github-bugzilla — 2014-10-31T09:12:45Z
Commit pushed to master at https://github.com/D-Programming-Language/dlang.org

https://github.com/D-Programming-Language/dlang.org/commit/e5d39c811d080ad8aae8903e96711f7f7715ca99
Merge pull request #684 from quickfur/issue9655

Issue 9655: Functions with identical bodies are allowed to be merged by compiler.
Comment #6 by bearophile_hugs — 2014-10-31T10:29:46Z
(In reply to github-bugzilla from comment #5)
> Issue 9655: Functions with identical bodies are allowed to be merged by
> compiler.

This is not enough. What do you have to do if you want to be certain of getting distinct D function pointers even if the function bodies may or may not be the same? (There is C code out there that relies on this guarantee, like some evolutionary algorithm that breeds functions.) Are D functions tagged with extern(C) exempt from this optimization?
Comment #7 by yebblies — 2014-10-31T10:38:54Z
(In reply to bearophile_hugs from comment #6)
> This is not enough. What do you have to do if you want to be certain to have
> distinct D function pointers even if the function body may or may not be
> the same? (There is C code out there that relies on this guarantee, like
> some evolutionary algorithm that breeds functions). Are D functions tagged
> with extern(C) exempt from this optimization?

No, you shouldn't rely on this ever.
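One way to follow that advice is to carry identity in an explicit tag instead of in the function's address. The sketch below is in C and is only an illustration: `grow`, `mutate`, `op_id`, `find_op` and `apply_op` are hypothetical names, not part of any D or C library.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*step_fn)(int);

/* Two hypothetical operations whose bodies happen to be identical,
   so a folding toolchain might give them one address. */
static int grow(int x)   { return x + 1; }
static int mutate(int x) { return x + 1; }

/* Identity lives in an explicit tag, not in the code address. */
enum op_id { OP_GROW, OP_MUTATE };

struct op {
    enum op_id id;
    step_fn    fn;
};

static const struct op ops[] = {
    { OP_GROW,   grow   },
    { OP_MUTATE, mutate },
};

/* Look an operation up by tag; returns NULL for an unknown tag. */
static const struct op *find_op(enum op_id id) {
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; ++i)
        if (ops[i].id == id)
            return &ops[i];
    return NULL;
}

/* Apply the operation registered under the given tag. */
static int apply_op(enum op_id id, int x) {
    const struct op *op = find_op(id);
    assert(op != NULL);
    return op->fn(x);
}
```

Because the table entries are distinct objects, `find_op(OP_GROW)` and `find_op(OP_MUTATE)` stay distinguishable even if `grow` and `mutate` were folded into one address.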
Comment #8 by ketmar — 2014-10-31T10:44:39Z
(In reply to bearophile_hugs from comment #6)
> This is not enough. What do you have to do if you want to be certain to have
> distinct D function pointers even if the function body may or may not be
> the same? (There is C code out there that relies on this guarantee, like
> some evolutionary algorithm that breeds functions). Are D functions tagged
> with extern(C) exempt from this optimization?

that code is foobared. please, don't use it.
Comment #9 by bearophile_hugs — 2014-10-31T12:27:47Z
(In reply to yebblies from comment #7)
> No, you shouldn't rely on this ever.

Why? I think the C standard requires those functions to have different addresses. So I think that C code is correct. (And indeed, as far as I know, GCC replaces equal function implementations with a jump, to keep addresses distinct.)
Comment #10 by yebblies — 2014-10-31T12:36:01Z
(In reply to bearophile_hugs from comment #9)
> (In reply to yebblies from comment #7)
> > No, you shouldn't rely on this ever.
>
> Why?

Because in D it's specified that the functions are not guaranteed to have distinct addresses.

> I think the C standard requires those functions to have different
> addresses. So I think that C code is correct. (And indeed as far as I know
> GCC replaces equal function implementations with a jump, to keep addresses
> distinct).

In a language without templates, code folding is much less useful.
Comment #11 by schveiguy — 2014-10-31T13:14:21Z
(In reply to bearophile_hugs from comment #9)
> Why? I think the C standard requires those functions to have different
> addresses.

I don't think this is true.

(In reply to bearophile_hugs from comment #9)
> (In reply to yebblies from comment #7)
> > No, you shouldn't rely on this ever.
>
> Why? I think the C standard requires those functions to have different
> addresses. So I think that C code is correct. (And indeed as far as I know
> GCC replaces equal function implementations with a jump, to keep addresses
> distinct).

From the C standard:

Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

So it appears, from the "if and only if", that bearophile is right.

But D does not have to follow C rules. Even if we define an extern(C) function in D, it does not mean we have to follow those rules.

I would say the chance of issues arising from this change is astronomically small. Consider that a piece of code that depends on distinct functions having distinct addresses may still work just fine even with ICF.

However, it should be noted in the spec that we deviate from those requirements. It currently does not address this point, from what I could find.
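To make the distinction concrete, here is a small C sketch (an illustration only; `negate_a`, `negate_b` and `same_address` are hypothetical names). It asserts only what holds under both readings: a function compares equal to itself, and two functions with identical bodies behave identically. Whether `negate_a` and `negate_b` compare unequal is the point under discussion, so the sketch deliberately leaves that comparison unasserted.

```c
#include <assert.h>
#include <stdbool.h>

typedef int (*unary_fn)(int);

/* Two distinct functions with identical bodies: candidates for folding. */
static int negate_a(int x) { return -x; }
static int negate_b(int x) { return -x; }

/* Plain function-pointer equality, as described in the quoted text. */
static bool same_address(unary_fn p, unary_fn q) { return p == q; }

static void check_guaranteed(void) {
    /* Pointers to the same function always compare equal. */
    assert(same_address(negate_a, negate_a));
    assert(same_address(negate_b, negate_b));

    /* Behaviour is identical whether or not the bodies are folded. */
    assert(negate_a(5) == negate_b(5));

    /* Deliberately not asserted: !same_address(negate_a, negate_b).
       The quoted C wording implies it, but a toolchain that folds
       identical code, or a language that permits folding as proposed
       here for D, may report the two as equal. */
}
```

Code that only needs the first two properties is unaffected by folding, which matches the observation that most programs would keep working with ICF enabled.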
Comment #12 by schveiguy — 2014-10-31T13:26:23Z
(In reply to Steven Schveighoffer from comment #11)
> I don't think this is true.

I was supposed to delete this part of the comment when I found the spec quote. Sorry :)
Comment #13 by robert.schadek — 2024-12-15T15:22:03Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dlang.org/issues/3937 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT.