This is the main reason why CTFE is so slow.
int bug6498(int x)
{
    int n = 0;
    while (n < x)
        ++n;
    return n;
}
static assert(bug6498(10_000_000) == 10_000_000);
--> Fails with an 'out of memory' error.
Comment #1 by clugdbug — 2012-11-26T07:14:57Z
Upgrading severity. I've done several commits to move towards a solution, but I still need to do more restructuring to properly fix this.
Don: Is there a GitHub PR or branch for your changes, or are these things normally kept secret because this issue has a bounty?
Comment #4 by ibuclaw — 2014-06-28T16:08:41Z
FYI, all PRs have been merged in.
I won't bother listing them all (a lot was done over 2012/2013). There has been no work on this since June 2013, IIRC.
https://github.com/D-Programming-Language/dmd/pull/1778#issuecomment-19964496
What should be focused on (thanks to Walter's idea of allocating but not freeing memory) is limiting how much memory CTFE allocates, possibly by finding ways to reuse rather than reallocate memory, or by giving CTFE its own allocator (it is a backend in its own right, after all).
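To make the "own allocator" idea concrete, here is a minimal sketch of a region (bump) allocator, assuming a fixed-capacity buffer and a mark/release discipline. The names `Region`, `allocate`, `mark` and `release` are hypothetical and are not taken from DMD's source; this only illustrates how an interpreter loop could reuse the same memory on each iteration instead of allocating fresh memory every time.

```d
// Hypothetical sketch of a region ("bump") allocator; the names Region,
// allocate, mark and release are illustrative, not taken from DMD.
struct Region
{
    private ubyte[] buffer; // backing storage, allocated once up front
    private size_t used;    // number of bytes handed out so far

    this(size_t capacity)
    {
        buffer = new ubyte[](capacity);
    }

    // Returns `size` bytes starting at a 16-byte-aligned offset,
    // or null if the region is exhausted.
    void[] allocate(size_t size)
    {
        const start = (used + 15) & ~cast(size_t) 15;
        if (start > buffer.length || size > buffer.length - start)
            return null;
        used = start + size;
        return buffer[start .. used];
    }

    // Remembers the current position so it can be restored later.
    size_t mark() const
    {
        return used;
    }

    // Discards everything allocated since `m` in a single step.
    void release(size_t m)
    {
        assert(m <= used);
        used = m;
    }
}

unittest
{
    auto region = Region(1024);
    const m = region.mark();
    foreach (i; 0 .. 10_000)
    {
        // Each iteration reuses the same bytes instead of growing the heap.
        auto tmp = region.allocate(64);
        assert(tmp.length == 64);
        region.release(m);
    }
    assert(region.mark() == 0);
}
```

With this discipline, a loop like the one in bug6498 would only ever need the memory for one iteration's temporaries, provided nothing allocated inside the loop body has to outlive it.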
Comment #5 by razvan.nitu1305 — 2022-06-09T14:27:33Z
This seems to have been fixed. On my machine it takes 5 seconds to run this and it appears to use 2-3% of my 16 GB RAM. Should we close this?
Comment #6 by maxhaton — 2022-06-10T05:16:30Z
The memory usage has improved a lot, but this is still ridiculously slow.
Compare with the soon-to-be-upstreamed -preview=newCTFE: https://asciinema.org/a/zTHuVmXbsZ4ryWGfCd2bXoJG5 (roughly 10x faster).
SDC does this in about 0.04 sec on my machine, so 50x to 80x faster.
Comment #7 by ibuclaw — 2022-06-10T11:45:45Z
Metrics for the code in this report, compiled with v2.080:
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 6.44
System time (seconds): 0.29
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.75
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1104116
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 274715
Voluntary context switches: 1
Involuntary context switches: 256
Swaps: 0
File system inputs: 246
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---
As of v2.085.0 - when most of dinterpret had been converted over to returning UnionExp on the stack.
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 6.64
System time (seconds): 0.19
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.84
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 636044
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 157878
Voluntary context switches: 1
Involuntary context switches: 231
Swaps: 0
File system inputs: 386
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---
As of v2.089.0 - when a ctfeRegion allocator was introduced to free memory after exiting an interpret "scope".
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 6.88
System time (seconds): 0.14
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.03
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 637204
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 158019
Voluntary context switches: 1
Involuntary context switches: 17
Swaps: 0
File system inputs: 474
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---
As of v2.100.0
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 7.13
System time (seconds): 0.07
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.22
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 482504
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 119238
Voluntary context switches: 1
Involuntary context switches: 223
Swaps: 0
File system inputs: 833
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---
With -lowmem.
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c -lowmem"
User time (seconds): 7.64
System time (seconds): 0.05
Percent of CPU this job got: 103%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.42
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 28760
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 5679
Voluntary context switches: 2376
Involuntary context switches: 774
Swaps: 0
File system inputs: 833
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---
Comment #8 by ibuclaw — 2022-06-10T11:53:21Z
(In reply to Iain Buclaw from comment #7)
> v2.080:
> Maximum resident set size (kbytes): 1104116
> v2.085.0:
> Maximum resident set size (kbytes): 636044
> v2.089.0:
> Maximum resident set size (kbytes): 637204
> v2.100.0:
> Maximum resident set size (kbytes): 482504
> -lowmem (as of v2.090):
> Maximum resident set size (kbytes): 28760
It's still nearly 500MB, so only about 2x better than where we were 4 years ago, and still a far cry from the 30MB we could be managing with instead.
I also note that the compiler has slowed down by roughly half a second since v2.080 as well (6.75 s to 7.22 s elapsed), so CTFE is not getting faster at all...
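For reference, the ratios behind these remarks can be checked directly from the figures quoted above. The sketch below only restates the numbers from the `time -v` output in this thread; the constant names are made up for illustration.

```d
// Figures copied from the `time -v` output quoted in this thread.
enum rssV2080  = 1_104_116; // max RSS in kbytes, v2.080
enum rssV2100  =   482_504; // max RSS in kbytes, v2.100.0
enum rssLowmem =    28_760; // max RSS in kbytes, v2.100.0 with -lowmem

enum double wallV2080 = 6.75; // elapsed seconds, v2.080
enum double wallV2100 = 7.22; // elapsed seconds, v2.100.0

// How much peak memory dropped between v2.080 and v2.100.0.
enum double rssImprovement = cast(double) rssV2080 / rssV2100;

// How far v2.100.0 without -lowmem is from the -lowmem figure.
enum double lowmemGap = cast(double) rssV2100 / rssLowmem;

// Wall-clock difference between the two releases.
enum double slowdown = wallV2100 - wallV2080;

static assert(rssImprovement > 2.2 && rssImprovement < 2.3);
static assert(lowmemGap > 16.7 && lowmemGap < 16.8);
static assert(slowdown > 0.4 && slowdown < 0.5);
```

So on these runs peak memory improved by a bit more than 2x, the default configuration still uses roughly 17x the memory of the -lowmem run, and the elapsed time grew by a little under half a second.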
Comment #9 by robert.schadek — 2024-12-13T17:56:02Z