Per this thread http://forum.dlang.org/post/[email protected] , it was decided to try and go the path of removing opEquals, opCmp, toHash, and toString from Object, as they don't need to be there, and the fact that they're there is causing issues with attributes such as const ( issue# 1824 ).
This issue is specifically for toHash so that any work towards making it unnecessary and removing it can be referenced here, whereas the other 3 have separate bugzilla entries, as the work for them isn't necessarily related to each other. opEquals, opCmp, and toHash in particular may require substantial work to be done on the built-in AAs before they can actually be removed from Object, as the AA implementation is not currently templated like it should be.
This probably can't be implemented until we replace AA's with a library template, since otherwise it would make it impossible for e.g. classes to be used as AA keys.
Once we have the library type, though, we can probably make use of UFCS to provide default implementations of toHash for all types.
Comment #3 by issues.dlang — 2013-03-22T19:09:41Z
> This probably can't be implemented until we replace AA's with a library
> template
I figured that that was probably the case, and that will certainly slow this down, but it still needs to be done.
> Once we have the library type, though, we can probably make use of UFCS to
> provide default implementations of toHash for all types.
I'd be very much against providing a UFCS toHash function which used compile-time reflection to figure out how to generate a hash for an arbitrary object. That's just begging for trouble, particularly since the behavior of opEquals and toHash need to match. They need to be controlled by the object itself and not 3rd party code. Providing a mixin to which does that would definitely be useful for avoiding boilerplate, but then it's properly under the control of the person writing the class or struct.
But providing toHash functions for built-in types via CTFE would certainly be good.
Comment #4 by hsteoh — 2013-03-22T19:28:25Z
I wouldn't go as far as using introspection to make a default toHash, but I think there's value in providing a default toHash for classes that just hashes the object's address (I think that's what the current toHash does).
But if we're going to be forcing users to manually define opEquals and toHash, then we should make rt.util.hash available to user code. The only thing worse than having a wrong default opEquals/toHash is a morass of duplicated user code implementing poorly-designed hashing algorithms for each user-defined type.
Comment #5 by hsteoh — 2013-03-22T19:30:38Z
Also, if the default behaviour of == for structs is to do bitwise comparison, then I think it justifies defining toHash for structs that just hashes the bit representation of the struct. It certainly beats having to define toHash for every single little helper struct that you write, just so you can use it in an AA. (That's what really turned me off in C++11's unordered_map -- so much verbosity for something that's supposed to be simple.)
Comment #6 by hsteoh — 2013-03-22T19:32:00Z
P.S. I meant *default* toHash for structs, of course. Obviously the user should be able to override the default behaviour (hence the usefulness of using UFCS for toHash, since a member function always takes precedence over a UFCS function).
Comment #7 by issues.dlang — 2013-03-22T20:29:03Z
Whether it makes sense or not to use the bit representation for hashing depends on the struct, as according TDPL, bitwise comparison is _not_ supposed to be the default for structs (though as bug# 3789 indicates, that's currently buggy). However, it wouldn't be all that hard to come up with a default toHash which corresponded to the default opEquals by having it simply use the toHash methods of the member variables to generate an overall hash. So, we could definitely make it so that toHash is generated for structs as long as opEquals hasn't been defined.
Comment #8 by doob — 2013-03-23T07:34:51Z
(In reply to comment #4)
> I wouldn't go as far as using introspection to make a default toHash, but I
> think there's value in providing a default toHash for classes that just hashes
> the object's address (I think that's what the current toHash does).
I think that will conflict with having a moving garbage collector.
Comment #9 by hsteoh — 2013-03-23T08:20:16Z
(In reply to comment #8)
> I think that will conflict with having a moving garbage collector.
Good point, maybe the correct approach is to hash the respective hashes of each member of the class?
@Jonathan: I agree, the default toHash should not be generated if the class/struct defines opEquals. It's the user's responsibility to keep the two consistent. So basically, the default toHash should correspond to whatever == corresponds with by default. If == is bitwise comparison, then toHash also should be bitwise hash, etc..
Comment #10 by doob — 2013-03-23T08:33:35Z
(In reply to comment #9)
> Good point, maybe the correct approach is to hash the respective hashes of each
> member of the class?
Jonathan didn't seem to like that idea.
Comment #11 by issues.dlang — 2013-03-23T12:09:46Z
> Jonathan didn't seem to like that idea.
Given that the default opEquals is supposed to compare each member variable using ==, I think that it would make good sense for the default toHash to call toHash on each member variable and generate a new hash from those. What doesn't make sense is blindly using reflection to go over all of member variables and generating hashes for them regardless of how they're implemented. Some folks in the newsgroup liked the idea of using external functions to use reflection to go through a type's member variables recursively to generate a hash, and given how opEquals and toHash are related, I think that that's an extremely bad idea, and that's the main thing that I'm objecting to. But if we're talking about a default-generated toHash that's only generated when opEquals is default-generated, then we know some things about what opEquals looks like and can generate a toHash which correctly corresponds to it, which would generally mean calling toHash on each member variable and generating a new hash from that - though if the compiler decided that it could do a bitwise comparison of the struct (e.g. because all of the member variables were integral types), then generating a hash in a similar manner would make sense.
Comment #12 by doob — 2013-03-23T15:29:36Z
(In reply to comment #11)
> Given that the default opEquals is supposed to compare each member variable
> using ==, I think that it would make good sense for the default toHash to call
> toHash on each member variable and generate a new hash from those. What doesn't
> make sense is blindly using reflection to go over all of member variables and
> generating hashes for them regardless of how they're implemented. Some folks in
> the newsgroup liked the idea of using external functions to use reflection to
> go through a type's member variables recursively to generate a hash, and given
> how opEquals and toHash are related, I think that that's an extremely bad idea,
> and that's the main thing that I'm objecting to. But if we're talking about a
> default-generated toHash that's only generated when opEquals is
> default-generated, then we know some things about what opEquals looks like and
> can generate a toHash which correctly corresponds to it, which would generally
> mean calling toHash on each member variable and generating a new hash from that
> - though if the compiler decided that it could do a bitwise comparison of the
> struct (e.g. because all of the member variables were integral types), then
> generating a hash in a similar manner would make sense.
I see, so only when opEquals is default generated.
Comment #13 by issues.dlang — 2013-03-23T19:12:53Z
> I see, so only when opEquals is default generated.
Yes. We could provide helper functions or mixins to make it easier to write hash functions if you define opEquals (or on classes which won't have any kind of default opEquals - let alone the useless one that they have now), but as soon as opEquals has been defined by the programmer, it has to be up to the programmer to define toHash correctly, as in order for hashing to work correctly, every object which is equal must have the same hash. And if the programmer defined opEquals, then the compiler will no longer have enough information to define a toHash which guarantees that all equal objects will have the same hash.
Comment #14 by code — 2013-06-03T08:15:21Z
(In reply to comment #2)
> This probably can't be implemented until we replace AA's with a library
> template, since otherwise it would make it impossible for e.g. classes to be
> used as AA keys.
That very simple to solve, let the compiler pass two functions (hash, equals) to the AA implementation instead of RTTI.
> Once we have the library type, though, we can probably make use of UFCS to
> provide default implementations of toHash for all types.
I would not want to bet that this will happen.
The current situation where the AA is partly implemented in the compiler and in the runtime is unfortunate to say the least.
Also performance-wise a generic AA implementation will never be on par with one that's suited for a special purpose. So it makes a lot of sense to have a lean builtin implementation and provide more configurable ones in std.container.
Comment #15 by issues.dlang — 2013-10-13T16:17:28Z
Relevant discussion on how to transition away from having these functions on Object (the thread got broken up a bit - probably by the mailman bug):
http://forum.dlang.org/post/[email protected]http://forum.dlang.org/post/[email protected]
The key suggestion is to change the compiler so that those 4 functions can be free functions in object_.d, similar to how the compiler already does special stuff for == rather than simply calling Object's opEquals.
Comment #16 by robert.schadek — 2024-12-07T13:32:29Z