Bug 18017 – [External] [DMC] File.size() uses a 32-bit signed integer for size internally (gives wrong results for files over ≈2.1 GB)

Status
NEW
Severity
normal
Priority
P3
Component
phobos
Product
D
Version
D2
Platform
x86
OS
Windows
Creation time
2017-11-28T15:00:15Z
Last change time
2024-12-01T16:31:08Z
Assigned to
No Owner
Creator
krzaq
Moved to GitHub: phobos#10271

Comments

Comment #0 by issues.dlang.org.kq.ajsx — 2017-11-28T15:00:15Z
I think the code should say it all:

void main(string[] args)
{
    import std.stdio;
    // assume C:\code\foo.raw is 2147483648 bytes big (INT_MAX+1), for example
    // dd if=/dev/zero of=/cygdrive/c/code/foo.raw bs=65536 count=32768
    auto f = File("C:\\code\\foo.raw");
    assert(f.size == 18446744071562067968u);
}
Comment #1 by schveiguy — 2017-11-28T15:32:29Z
The issue here is that DMC's 32-bit ftell returns a 32-bit signed value, which wraps to int.min for this file. Phobos then sign-extends that to an unsigned long (64-bit), producing the huge value in the assert. The workaround is simply to use the 64-bit C runtime (dmd -m64), which should work properly. But until DMC's C library can support a 64-bit ftell, D can't do anything about this. Sure, we could treat values from int.min to -2 as unsigned, but that doesn't help with 5 GB files, for instance. One thing we *could* do is throw an error, but I'm not sure that's a "solution". Nor am I sure that this workaround would work for files that are larger than uint.max.
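The arithmetic in the comment above can be checked without a large file. What follows is a minimal sketch: the helper names `widenSigned`, `reinterpretUnsigned`, and `truncateTo32` are hypothetical, and the 5 GiB case assumes the 32-bit position simply keeps the low 32 bits of the true offset, which the thread does not confirm for DMC's runtime.

```d
/// Hypothetical helper: widens a signed 32-bit position to ulong.
/// The cast sign-extends, which matches the value reported in comment #0.
ulong widenSigned(int pos32)
{
    return cast(ulong) pos32;
}

/// Hypothetical helper: reinterprets the same 32 bits as unsigned before widening.
ulong reinterpretUnsigned(int pos32)
{
    return cast(uint) pos32;
}

/// Assumption: a 32-bit position keeps only the low 32 bits of the true offset.
int truncateTo32(ulong trueOffset)
{
    return cast(int) trueOffset;
}

unittest
{
    // 2147483648-byte file: the position wraps to int.min.
    assert(truncateTo32(2_147_483_648UL) == int.min);
    assert(widenSigned(int.min) == 18_446_744_071_562_067_968UL);
    assert(reinterpretUnsigned(int.min) == 2_147_483_648UL);

    // 5 GiB file: the low 32 bits look like a plausible 1 GiB size,
    // so neither interpretation can recover the real length.
    enum ulong fiveGiB = 5UL * 1024 * 1024 * 1024;
    assert(truncateTo32(fiveGiB) == 1_073_741_824);
    assert(reinterpretUnsigned(truncateTo32(fiveGiB)) == 1_073_741_824UL);
}
```

Compiling with `dmd -unittest -main` runs the checks. Under the stated assumption, the second case also suggests why throwing on a negative position would not catch every oversized file.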
Comment #2 by issues.dlang.org.kq.ajsx — 2017-11-28T17:42:16Z
Getting -m64 to work requires non-zero effort and even then isn't hassle-free. I used this as a workaround instead:

ulong getFileSize(const string name)
{
    import std.utf;
    import core.sys.windows.windows;
    //import core.stdc.

    HANDLE hFile = CreateFile(name.toUTF16z, GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        return -1; // error condition, could call GetLastError to find out more
    }

    LARGE_INTEGER size;
    if (!GetFileSizeEx(hFile, &size))
    {
        CloseHandle(hFile);
        return -1; // error condition, could call GetLastError to find out more
    }

    CloseHandle(hFile);
    return size.QuadPart;
}
Comment #3 by schveiguy — 2017-11-28T18:04:59Z
Nevertheless, it's still a bug, as File.size using ulong as its return seems to suggest it can handle it. Note, there's also std.file.getSize: https://dlang.org/phobos/std_file.html#getSize if you aren't actually reading anything in the file.
Comment #4 by issues.dlang.org.kq.ajsx — 2017-11-28T18:09:38Z
std.file.getSize works correctly in my case. The thing is, I read this big file (using struct File and then byChunk) and I am getting all the data correctly - only the size returned by File.size isn't correct - that's why I suggested using raw winapi for this (at least on win32), but std.file.getSize might work as well.
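The combination described above (size from std.file.getSize, data from File.byChunk) might look roughly like the sketch below. The helper name `countBytesRead` and the 64 KiB chunk size are assumptions for illustration only; they do not come from the reporter's code.

```d
import std.file : getSize;
import std.stdio : File;

/// Hypothetical helper: reads the whole file with byChunk and returns
/// the number of bytes actually delivered, without calling File.size.
ulong countBytesRead(string path, size_t chunkSize = 64 * 1024)
{
    ulong total = 0;
    foreach (ubyte[] chunk; File(path, "rb").byChunk(chunkSize))
        total += chunk.length; // the last chunk may be shorter than chunkSize
    return total;
}

/// Hypothetical check: the size reported by std.file.getSize should match
/// the number of bytes read through std.stdio.File.
bool sizeMatchesData(string path)
{
    return getSize(path) == countBytesRead(path);
}
```

Reading the whole file just to confirm its size is wasteful for multi-gigabyte inputs, so this is only meant as a sanity check while debugging, not as a replacement for File.size.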
Comment #5 by schveiguy — 2017-11-28T18:30:50Z
std.file.getSize works because it *does* use WinAPI directly. std.stdio.File is based completely on libc's FILE * structure. It can only support whatever that supports, and that isn't very much. In the case of 32-bit windows, the library it uses is Digital Mars' C runtime, which has some difficult limitations, this being one of them. A potential fix here is to get the handle directly from the FILE * and query it using WinAPI. But this doesn't fix File.tell(), which is going to use the libc version.
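As a rough illustration of the "get the handle and query it using WinAPI" idea, the sketch below uses File.windowsHandle together with GetFileSizeEx. The function name `sizeViaHandle` is hypothetical, this is not the change Phobos made, and it has not been verified against the Digital Mars runtime discussed here.

```d
version (Windows)
{
    import core.sys.windows.windows;
    import std.exception : enforce;
    import std.stdio : File;

    /// Hypothetical helper: asks the OS for the size of an already open File,
    /// bypassing the C runtime's ftell.
    ulong sizeViaHandle(File f)
    {
        LARGE_INTEGER size;
        // GetFileSizeEx returns zero on failure.
        enforce(GetFileSizeEx(f.windowsHandle, &size) != 0,
                "GetFileSizeEx failed");
        return cast(ulong) size.QuadPart;
    }
}
```

On Windows this could be called as `sizeViaHandle(File("C:\\code\\foo.raw"))`. As the comment notes, a helper like this would leave File.tell() untouched, since that still goes through libc.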
Comment #6 by kinke — 2017-11-28T21:25:46Z
(In reply to Steven Schveighoffer from comment #5)
> std.stdio.File is based completely on libc's FILE * structure. It can only
> support whatever that supports, and that isn't very much. In the case of
> 32-bit windows, the library it uses is Digital Mars' C runtime, which has
> some difficult limitations, this being one of them.
>
> A potential fix here is to get the handle directly from the FILE * and query
> it using WinAPI. But this doesn't fix File.tell(), which is going to use the
> libc version.

Then `-m32mscoff` is another option for Win32.
Comment #7 by schveiguy — 2017-11-28T21:50:34Z
(In reply to kinke from comment #6)
> Then `-m32mscoff` is another option for Win32.

Yeah, if that defines CRuntime_Microsoft, then it should work. I admit I'm not too familiar with Windows dmd development, and haven't really tried all of these options. Appropriate version switch is here: https://github.com/dlang/phobos/blob/master/std/stdio.d#L1142
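One way to check which C runtime a given set of compiler switches selects is a small compile-time probe like the sketch below. The `cRuntimeName` constant is a hypothetical name; `CRuntime_DigitalMars` and `CRuntime_Microsoft` are the predefined version identifiers.

```d
// Compile-time probe: reports which C runtime this build targets.
version (CRuntime_DigitalMars)
    enum string cRuntimeName = "Digital Mars (DMC)";
else version (CRuntime_Microsoft)
    enum string cRuntimeName = "Microsoft";
else
    enum string cRuntimeName = "other";

// Printed by the compiler during compilation, not at run time.
pragma(msg, "C runtime: ", cRuntimeName);
```

Compiling the same file once with the default 32-bit switch and once with `-m32mscoff` or `-m64` should show whether the Digital Mars runtime, and with it the 32-bit ftell, is in play.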
Comment #8 by robert.schadek — 2024-12-01T16:31:08Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/phobos/issues/10271 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB