Bug 5904 – std.json parseString doesn't handle chars outside the BMP

Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: Other
OS: All
Creation time: 2011-04-28T12:24:48Z
Last change time: 2018-01-05T13:29:31Z
Keywords: pull
Assigned to: No Owner
Creator: Sean Kelly
See also: https://issues.dlang.org/show_bug.cgi?id=17556

Comments

Comment #0 by sean — 2011-04-28T12:24:48Z

According to RFC 4627, characters outside the Basic Multilingual Plane (i.e. those that need more than one 16-bit code unit in UTF-16) are encoded as a surrogate pair in JSON strings. In effect, you have to test whether a "\uXXXX" value is >= 0xD800 and <= 0xDBFF. If so, the next value should be another "\uXXXX" escape representing the low surrogate. To verify this, check that it is >= 0xDC00 and <= 0xDFFF. If it isn't, skip the preceding "\uXXXX" value (the high surrogate) as invalid and decode the following "\uXXXX" value as a standalone Unicode code point (the RFC is actually unclear on this point, but this seems the most reasonable failure mode). Given a valid high and low surrogate, stick them into a wchar[2] and convert to UTF-8.
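The steps described above can be sketched roughly as follows. This is only an illustration, not the code from the eventual Phobos fix: the helper names (isHighSurrogate, isLowSurrogate, combineSurrogates, decodeEscapes) are hypothetical, the input is assumed to be the already-parsed numeric values of consecutive "\uXXXX" escapes, and the code point is computed arithmetically and passed to std.utf.encode instead of going through a wchar[2]. Dropping a lone low surrogate is an extra assumption that the comment does not address.

```d
import std.utf : encode;

/// Hypothetical helper: true for a high (leading) surrogate.
bool isHighSurrogate(uint u) { return u >= 0xD800 && u <= 0xDBFF; }

/// Hypothetical helper: true for a low (trailing) surrogate.
bool isLowSurrogate(uint u) { return u >= 0xDC00 && u <= 0xDFFF; }

/// Combine a valid high/low surrogate pair into a single code point.
dchar combineSurrogates(uint hi, uint lo)
{
    assert(isHighSurrogate(hi) && isLowSurrogate(lo));
    return cast(dchar)(0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00));
}

/// Decode a run of \uXXXX values (given as numbers) into UTF-8.
string decodeEscapes(const uint[] units)
{
    string result;
    char[4] buf;
    for (size_t i = 0; i < units.length; ++i)
    {
        immutable u = units[i];
        if (isHighSurrogate(u))
        {
            if (i + 1 < units.length && isLowSurrogate(units[i + 1]))
            {
                // Valid pair: consume both values and emit one code point.
                immutable n = encode(buf, combineSurrogates(u, units[i + 1]));
                result ~= buf[0 .. n].idup;
                ++i;
            }
            // Otherwise the high surrogate is skipped as invalid; the next
            // value is decoded on its own by the following loop iteration.
            continue;
        }
        if (isLowSurrogate(u))
            continue; // assumption: a lone low surrogate is dropped as well
        immutable n = encode(buf, cast(dchar) u);
        result ~= buf[0 .. n].idup;
    }
    return result;
}

unittest
{
    // U+1D11E (musical G clef) written as the pair \uD834\uDD1E.
    assert(decodeEscapes([0xD834u, 0xDD1Eu]) == "\U0001D11E");
    // An unpaired high surrogate is skipped; the following 'A' survives.
    assert(decodeEscapes([0xD834u, 0x0041u]) == "A");
}
```

Compiling with -unittest -main runs the two checks; the first uses the same code point as the test case in comment #1.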

Comment #1 by dlang-bugzilla — 2017-06-25T16:42:53Z

Test case:

///////////// test.d /////////////
import std.json;

void main()
{
    string s = `"\uD834\uDD1E"`;
    auto j = parseJSON(s);
    assert(j.str == "\U0001D11E");
}
//////////////////////////////////

Comment #2 by dlang-bugzilla — 2017-06-26T10:03:48Z

https://github.com/dlang/phobos/pull/5511

Comment #3 by github-bugzilla — 2017-07-03T09:07:45Z

Commit pushed to stable at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/b23e7a4107cc2eb3275e022cb46f7270e586ca29
Fix Issue 5904 - std.json parseString doesn't handle chars outside the BMP

Comment #4 by github-bugzilla — 2017-07-08T17:09:24Z

Commit pushed to master at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/b23e7a4107cc2eb3275e022cb46f7270e586ca29
Fix Issue 5904 - std.json parseString doesn't handle chars outside the BMP

Comment #5 by github-bugzilla — 2018-01-05T13:29:31Z

Commit pushed to dmd-cxx at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/b23e7a4107cc2eb3275e022cb46f7270e586ca29
Fix Issue 5904 - std.json parseString doesn't handle chars outside the BMP