Bug 12897 – std.json.toJSON doesn't translate unicode chars(>=0x80) to "\uXXXX"

Status: RESOLVED
Resolution: FIXED
Severity: critical
Priority: P1
Component: phobos
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2014-06-12T11:48:10Z
Last change time: 2020-03-21T03:56:36Z
Assigned to: Basile-z
Creator: egustc

Attachments

ID	Filename	Summary	Content-Type	Size
1362	foo.d	json bug	text/x-csrc	175

Comments

Comment #0 by egustc — 2014-06-12T11:48:10Z

Created attachment 1362 json bug

As the attachment shows, Unicode characters at or above 0x80 (for example, Chinese or Japanese) should be converted to "\uXXXX" in JSON, but Phobos does not do this. This causes problems when transmitting JSON from D to other languages. The following implementation of std.json.appendJSONChar can fix this bug:

    private void appendJSONChar(Appender!string* dst, wchar c)
    {
        if (isControl(c) || c >= 0x80)
            dst.put("\\u%04x".format(c));
        else
            dst.put(c);
    }
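The escaping requested in comment #0 can be sketched as a standalone function. This is only an illustration under stated assumptions: `escapeNonAscii` is a hypothetical helper name, not a Phobos symbol, and it covers only the control-character and non-ASCII cases (it does not escape `"` or `\`, which a real JSON writer must also handle). Iterating the string as `wchar` yields UTF-16 code units, so characters outside the Basic Multilingual Plane come out as a surrogate pair of two `\uXXXX` escapes.

```d
import std.array : appender;
import std.ascii : isControl;
import std.format : format;

// Hypothetical helper (not part of Phobos): writes control characters and
// every code unit >= 0x80 as \uXXXX, and copies other ASCII through as-is.
string escapeNonAscii(string s)
{
    auto dst = appender!string();
    // foreach with wchar decodes the UTF-8 input into UTF-16 code units.
    foreach (wchar c; s)
    {
        if (c >= 0x80 || isControl(c))
            dst.put(format("\\u%04x", cast(uint) c));
        else
            dst.put(cast(char) c);
    }
    return dst.data;
}
```

For instance, passing a string made of `a` followed by U+4E2D should give `a\u4e2d`, which any JSON parser reads back as the original two characters.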

Comment #1 by justin — 2014-07-11T18:13:16Z

Looking at the spec (http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf) it appears that while strings _may_ encode characters using the escape sequence, they are not _required_ to for any range of characters. On the face of it it seems that std.json is conformant and other languages are not. Which parsers are unable to handle the raw UTF-8?

Comment #2 by egustc — 2014-07-12T01:13:01Z

OK... I used Python but didn't decode first and got a problem.

(In reply to Justin Whear from comment #1)
> Looking at the spec
> (http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf)
> it appears that while strings _may_ encode characters using the escape
> sequence, they are not _required_ to for any range of characters. On the
> face of it it seems that std.json is conformant and other languages are not.
> Which parsers are unable to handle the raw UTF-8?

Comment #3 by b2.temp — 2016-03-22T17:22:50Z

(In reply to egustc from comment #2)
> OK... I used Python but didn't decode first and got a problem.
>
> (In reply to Justin Whear from comment #1)
> > Looking at the spec
> > (http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf)
> > it appears that while strings _may_ encode characters using the escape
> > sequence, they are not _required_ to for any range of characters. On the
> > face of it it seems that std.json is conformant and other languages are not.
> > Which parsers are unable to handle the raw UTF-8?

I have proposed a PR for this (https://github.com/D-Programming-Language/phobos/pull/4106), but it was not clear whether you consider the problem fixed or not. Maybe this issue can even be closed without any modification. Let's see what people say.

Comment #4 by github-bugzilla — 2016-04-10T08:23:38Z

Commit pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/b5cd354a05033ade13ae376377bec590bef62212
Merge pull request #4106 from BBasile/issue-12897

fix issue 12897 - toJSON, add the escapeNonAsciiChars option
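A minimal sketch of how the option added by this commit might be used, assuming a recent Phobos in which `toJSON` takes the value by reference followed by a `pretty` flag and a `JSONOptions` argument (older releases had a different signature). The sample string is made up for illustration. Both outputs are valid JSON and parse back to the same string; the escaped form is simply pure ASCII.

```d
import std.json;
import std.stdio : writeln;

void main()
{
    // Sample value containing non-ASCII characters (illustrative only).
    JSONValue v = JSONValue("中文");

    // Default behaviour: non-ASCII characters are emitted as raw UTF-8.
    string raw = toJSON(v);

    // With the option from this fix, characters >= 0x80 become \uXXXX escapes.
    string escaped = toJSON(v, false, JSONOptions.escapeNonAsciiChars);

    writeln(raw);
    writeln(escaped);

    // Either form round-trips to the same string value.
    assert(parseJSON(raw).str == "中文");
    assert(parseJSON(escaped).str == "中文");
}
```

The default output stays as raw UTF-8, in line with the reading of the spec in comment #1; the escaped form is an opt-in for consumers that expect ASCII-only JSON.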