Bug 24132 – ImportC: Add support for wchar_t, char16_t, char32_t

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
Windows
Creation time
2023-09-02T21:07:00Z
Last change time
2023-11-21T03:35:58Z
Keywords
ImportC, pull, rejects-valid
Assigned to
Walter Bright
Creator
Adam Wilson

Comments

Comment #0 by flyboynw — 2023-09-02T21:07:00Z
Currently ImportC handles wchar_t as a ushort, which works but is painful to use in D code that expects wchar* instead of ushort* for string pointers (example: the return value of toUTF16z).
Comment #1 by ryuukk.dev — 2023-09-03T00:14:31Z
Sounds like an easy PR to do, add the define here: https://github.com/dlang/dmd/blob/master/druntime/src/importc.h
Comment #2 by flyboynw — 2023-09-03T14:45:11Z
(In reply to ryuukk_ from comment #1)
> Sounds like an easy PR to do, add the define here:
>
> https://github.com/dlang/dmd/blob/master/druntime/src/importc.h
It's not *quite* that simple. wchar_t is defined as an unsigned short in C, and as either an unsigned short *or* an intrinsic type in C++. This means that when the preprocessor runs, it emits an unsigned short. Most C preprocessors have a switch that treats wchar_t as an intrinsic type instead of "typedef unsigned short wchar_t". However, the current ImportC implementation does not support this and vomits up errors when you try to use it. What we need is the ability for ImportC to recognize wchar_t, char16_t, and char32_t *after* the preprocessor has run so that ImportC can emit the appropriate char/wchar/dchar types.
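A minimal C11 sketch of why the type information is lost, assuming a hosted compiler with _Generic support: to C, wchar_t is only an alias for some integer type, and a wide string is only an array of that integer. The helper names wchar_underlying_type and wide_length are made up for illustration and are not part of ImportC or druntime.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* wchar_t, size_t, NULL */

/* Hypothetical helper: names the integer type that wchar_t aliases on
   this platform, or "?" if it is none of the candidates listed here. */
const char *wchar_underlying_type(void)
{
    return _Generic((wchar_t)0,
        unsigned short: "unsigned short",
        int:            "int",
        unsigned int:   "unsigned int",
        long:           "long",
        unsigned long:  "unsigned long",
        default:        "?");
}

/* Hypothetical helper: counts the elements of a wide string. Nothing in
   the body is character-specific; it walks an array of integers. */
size_t wide_length(const wchar_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

Calling wchar_underlying_type() from a small main reports whichever integer type the platform uses (unsigned short in the Windows case discussed here, typically a 32-bit integer on Linux), which is all that is left for ImportC to see unless it treats the name specially.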
Comment #3 by bugzilla — 2023-09-12T00:54:48Z
C11 defines char32_t as uint_least32_t, which is specified to be a typedef, not a macro or a keyword. Preprocessors usually key off the existence of __cplusplus to turn C++ semantics on and off. ImportC currently does not do that. I suggest putting: typedef unsigned short wchar_t; in your copy of importc.h and seeing how far that gets you.
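The C11 typedef relationship can be checked with a short sketch, assuming a C11 compiler and a libc that ships <uchar.h>. The function names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint_least16_t, uint_least32_t */
#include <uchar.h>    /* char16_t, char32_t */

/* Compile-time sanity check: char32_t can hold at least 32 bits. */
static_assert(sizeof(char32_t) * CHAR_BIT >= 32,
              "char32_t must hold at least 32 bits");

/* Hypothetical helper: 1 if char16_t is the same type as uint_least16_t. */
int char16_is_uint_least16(void)
{
    return _Generic((char16_t)0, uint_least16_t: 1, default: 0);
}

/* Hypothetical helper: 1 if char32_t is the same type as uint_least32_t. */
int char32_is_uint_least32(void)
{
    return _Generic((char32_t)0, uint_least32_t: 1, default: 0);
}
```

Because _Generic selects on the type itself, both functions returning 1 means the C type system cannot tell char32_t apart from uint_least32_t, so a C front end has no way to recover the "character" intent from the type alone.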
Comment #4 by flyboynw — 2023-10-28T09:48:02Z
So I was poking around the compiler source today and noticed that in lexer.d at line 80 there is a reference to wchar_t in ImportC-specific code, so it appears to know about wide chars in C. Then I discovered Ckeywords in tokens.d. So all we have to do is add wchar_ and dchar_ to that list and add the following to importc.h:
#define wchar_t wchar
#define char16_t wchar
#define char32_t dchar
A little hacky maybe, but for ImportC it would work, and it would allow us to use D-style strings natively, which is the semantically correct outcome. IIRC, the wchar and dchar types are unsigned short and unsigned int respectively when using extern(C), so that should function as normal.
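The effect of those three #defines on the preprocessor output can be sketched in plain C by stringifying a few declarations after macro expansion. This is only an illustration of the token rewrite, not of ImportC itself: the STR/XSTR macros and the helper functions are hypothetical, and a real C compiler would reject the rewritten declarations because wchar and dchar are not C types.

```c
#include <assert.h>   /* assert */
#include <string.h>   /* strncmp, strlen */

/* System headers are included first so that their own wchar_t
   declarations are not rewritten by the macros below. */

/* The three defines proposed for importc.h. */
#define wchar_t  wchar
#define char16_t wchar
#define char32_t dchar

/* Two-level stringification: the argument is macro-expanded before '#'
   turns it into a string literal. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Declarations as they would look once the preprocessor has run. */
static const char *const rewritten[] = {
    XSTR(wchar_t *s),
    XSTR(char16_t c),
    XSTR(char32_t c),
};

/* Undo the defines so that nothing after this point is affected. */
#undef wchar_t
#undef char16_t
#undef char32_t

/* Hypothetical helper: returns the i-th rewritten declaration. */
const char *rewritten_decl(int i)
{
    assert(i >= 0 && i < 3);
    return rewritten[i];
}

/* Hypothetical helper: 1 if the i-th declaration starts with d_type. */
int starts_with_d_type(int i, const char *d_type)
{
    return strncmp(rewritten_decl(i), d_type, strlen(d_type)) == 0;
}
```

Here rewritten_decl(0) yields "wchar *s", so the lexer would be handed the identifier wchar after preprocessing, which is why the proposal also needs wchar_ and dchar_ added to the C keyword list.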
Comment #5 by dlang-bot — 2023-10-31T03:45:55Z
@LightBender created dlang/dmd pull request #15757 "Issue 24132 - ImportC: Add support for wide-chars." fixing this issue:
- Fix Issue 24132. Add wchar/dchar to C Keywords list. Use #defines to convert C wide-chars to wchar/dchar.
https://github.com/dlang/dmd/pull/15757
Comment #6 by flyboynw — 2023-11-21T03:35:58Z
I'm going to close this as WON'T FIX. I'll try to attack my problem with a post-processor script.