When time.d queries the POSIX API for the clock resolution, it assumes that any reported resolution of 1 microsecond or coarser is wrong, and substitutes a hard-coded granularity of 1 nanosecond instead:
https://github.com/dlang/dmd/blob/26a4e395e8853de8f83c7c56341066f98e6a8d4f/druntime/src/core/time.d#L2580
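Paraphrased, the logic there is roughly the following (a sketch, not a verbatim quote; posixTicksPerSecond and the assert are stand-ins for druntime's actual structure):

import core.sys.posix.time;  // clock_getres, clockid_t, timespec

long posixTicksPerSecond(clockid_t clockId)
{
    timespec ts;
    if (clock_getres(clockId, &ts) != 0)
        assert(0, "clock_getres failed");

    // A reported resolution of 1 microsecond or coarser (tv_nsec >= 1000)
    // is assumed to be wrong, and a hard-coded 1 nanosecond granularity
    // is used instead.
    return ts.tv_nsec >= 1000
        ? 1_000_000_000L                // one tick per nanosecond
        : 1_000_000_000L / ts.tv_nsec;  // otherwise trust the report
}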
The comment on that code claims it's to tame systems that report resolutions of 1 millisecond or worse while updating the clock more frequently than that. However:
- It activates on reported resolutions >= 1us, a thousand times finer than the claimed 1ms (a worked example follows this list).
- It performs no test to validate its assumption that the reported resolution is wrong.
- It gives calling code no indication that the returned value is a lie.
- It is most likely to activate on coarse clocks, yet substitutes a hard-coded value representing a very fine resolution, making the result not merely a lie, but an egregious one.
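To make the scale concrete: on the system measured below, clock_getres() reports a resolution of 4_000_000 ns (4 ms, i.e. 250 ticks/sec). Since 4_000_000 >= 1000, the init code discards that value and reports 1_000_000_000 ticks/sec instead, overstating the clock's resolution by a factor of four million.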
I wrote a program that prints MonoTimeImpl!(ClockType.coarse).currTime.ticks() at regular intervals, along with the clock resolutions reported by both POSIX clock_getres() and D's ticksPerSecond().
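A minimal sketch of that program, assuming Linux (CLOCK_MONOTONIC_COARSE comes from core.sys.linux.time); the actual test code attached in comment #1 may differ in details such as the sampling cadence:

import core.sys.linux.time;  // CLOCK_MONOTONIC_COARSE (Linux-specific)
import core.sys.posix.time;  // clock_getres, timespec
import core.thread : Thread;
import core.time : ClockType, MonoTimeImpl, msecs;
import std.stdio : writefln, writeln;

void main()
{
    alias Coarse = MonoTimeImpl!(ClockType.coarse);

    writeln("ticks/sec nsecs/tick");

    // Resolution of the coarse clock as reported by the C library/kernel.
    timespec ts;
    if (clock_getres(CLOCK_MONOTONIC_COARSE, &ts) != 0)
        assert(0, "clock_getres failed");
    immutable libcNsecs = ts.tv_sec * 1_000_000_000L + ts.tv_nsec;
    writefln("Libc: %s %s", 1_000_000_000L / libcNsecs, libcNsecs);

    // Resolution of the same clock as reported by druntime.
    immutable dlibTicks = Coarse.ticksPerSecond;
    writefln("Dlib: %s %s", dlibTicks, 1_000_000_000L / dlibTicks);

    writeln("Sampling clock to see how often it actually changes...");
    immutable start = Coarse.currTime.ticks;
    long prev = 0;
    foreach (i; 0 .. 18)
    {
        // Label each sample with its nominal elapsed time, and print the
        // coarse clock's tick count relative to the first sample.
        immutable elapsed = Coarse.currTime.ticks - start;
        writefln("%s nsecs: %s%s", i * 1_000_000L, elapsed,
                 elapsed == prev ? "" : " (changed)");
        prev = elapsed;
        Thread.sleep(1.msecs);
    }
}

Here is its output on a Ryzen 7000-series CPU running Linux 6.7.9: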
ticks/sec nsecs/tick
Libc: 250 4000000
Dlib: 1000000000 1
Sampling clock to see how often it actually changes...
0 nsecs: 0
1000000 nsecs: 0
2000000 nsecs: 3999934 (changed)
3000000 nsecs: 3999934
4000000 nsecs: 3999934
5000000 nsecs: 3999934
6000000 nsecs: 7999869 (changed)
7000000 nsecs: 7999869
8000000 nsecs: 7999869
9000000 nsecs: 7999869
10000000 nsecs: 11999804 (changed)
11000000 nsecs: 11999804
12000000 nsecs: 11999804
13000000 nsecs: 11999804
14000000 nsecs: 15999737 (changed)
15000000 nsecs: 15999737
16000000 nsecs: 15999737
17000000 nsecs: 15999737
As we can see, the clock advances once every 4 ms, exactly as often as the POSIX call claims, yet D's clock init code ignores that and reports an arbitrary resolution instead.
I wonder:
What circumstance led the author of that init code to second-guess the system? I don't see a problematic OS, libc, or architecture mentioned in the comments. Can that circumstance be reproduced anywhere today?
Why is it applied to all POSIX systems, instead of being a special case for known-bad systems?
Was the init code's test for (ts.tv_nsec >= 1000) meant to be (ts.tv_nsec >= 1000000) as stated in the comment above it?
Is having the D runtime silently replace values reported by the system really a good idea? After all, if the OS or its core libs are misbehaving, the bug will presumably be fixed in a future version, and until then, application code (which knows its own timing needs) is better prepared to decide how to handle it.
Comment #1 by forestix — 2024-03-21T04:10:05Z
Created attachment 1912
test code
Comment #2 by robert.schadek — 2024-12-07T13:43:22Z