Bug 3218 – Performance of std.xml.encode must be improved

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
Other
OS
Linux
Creation time
2009-07-30T20:00:00Z
Last change time
2015-06-09T05:15:08Z
Keywords
performance
Assigned to
andrei
Creator
andrei

Comments

Comment #0 by andrei — 2009-07-30T20:00:38Z
I'm relaying this on behalf of zsxxsz <[email protected]>, see http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=94478:

=============================
Hi, below are the functions from Phobos and Tango with the same use, we can see why so many people like Tango more than Phobos.

>>> In Phobos:

    string encode(string s)
    {
        // The ifs are (temprarily, we hope) necessary, because
        // std.string.write.replace
        // does not do copy-on-write, but instead copies always.
        if (s.indexOf('&') != -1) s = replace(s,"&","&amp;");
        if (s.indexOf('"') != -1) s = replace(s,"\"","&quot;");
        if (s.indexOf("'") != -1) s = replace(s,"'","&apos;");
        if (s.indexOf('<') != -1) s = replace(s,"<","&lt;");
        if (s.indexOf('>') != -1) s = replace(s,">","&gt;");
        return s;
    }

>>> In Tango:

    T[] toEntity(T) (T[] src, T[] dst = null)
    {
        T[] entity;
        auto s = src.ptr;
        auto t = s;
        auto e = s + src.length;
        auto index = 0;

        while (s < e)
            switch (*s)
            {
                case '"':
                    entity = "&quot;";
                    goto common;
                case '>':
                    entity = "&gt;";
                    goto common;
                case '<':
                    entity = "&lt;";
                    goto common;
                case '&':
                    entity = "&amp;";
                    goto common;
                case '\'':
                    entity = "&apos;";
                    goto common;
                common:
                    auto len = s - t;
                    if (dst.length <= index + len + entity.length)
                        dst.length = (dst.length + len + entity.length) + dst.length / 2;
                    dst [index .. index + len] = t [0 .. len];
                    index += len;
                    dst [index .. index + entity.length] = entity;
                    index += entity.length;
                    t = ++s;
                    break;
                default:
                    ++s;
                    break;
            }

        // did we change anything?
        if (index)
        {
            // copy tail too
            auto len = e - t;
            if (dst.length <= index + len)
                dst.length = index + len;
            dst [index .. index + len] = t [0 .. len];
            return dst [0 .. index + len];
        }
        return src;
    }

We can see the function's performance from Tango is more high than which one from Phobos. This maybe not the only one function difference. :)
Comment #1 by andrei — 2009-08-28T09:54:16Z
I changed encode (which was indeed horrendous) to this:

    S encode(S)(S s, S buffer = null)
    {
        string r;
        size_t lastI;
        if (buffer) buffer.length = 0;
        auto result = Appender!(string)(&buffer);

        foreach (i, c; s)
        {
            switch (c)
            {
                case '&': r = "&amp;"; break;
                case '"': r = "&quot;"; break;
                case '\'': r = "&apos;"; break;
                case '<': r = "&lt;"; break;
                case '>': r = "&gt;"; break;
                default: continue;
            }
            // Replace with r
            result.put(s[lastI .. i]);
            result.put(r);
            lastI = i + 1;
        }

        if (!result.data) return s;
        result.put(s[lastI .. $]);
        return result.data;
    }
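
For readers on a current compiler, the same single-pass idea can be sketched roughly as follows. This is not the code that was committed to Phobos: the name encodeXml and the changed flag are made up for illustration, and the sketch assumes std.array.appender in its present-day form rather than the 2009 Appender constructor shown above. As in the comment above, the input is returned as-is when nothing needs escaping, so no allocation happens in that case.

```d
import std.array : appender;

/// Hypothetical single-pass XML escaper (illustrative name, not the Phobos API).
/// Returns the original slice when no character needs escaping.
string encodeXml(string s)
{
    auto result = appender!string();
    size_t last = 0;       // start of the run that has not been copied yet
    bool changed = false;  // set once the first entity is written

    foreach (i, char c; s)
    {
        string r;
        switch (c)
        {
            case '&':  r = "&amp;";  break;
            case '"':  r = "&quot;"; break;
            case '\'': r = "&apos;"; break;
            case '<':  r = "&lt;";   break;
            case '>':  r = "&gt;";   break;
            default:   continue;     // ordinary character: keep scanning
        }
        result.put(s[last .. i]);    // copy the unescaped run before c
        result.put(r);               // then the entity that replaces c
        last = i + 1;
        changed = true;
    }

    if (!changed) return s;          // nothing replaced: hand back the input
    result.put(s[last .. $]);        // copy the tail after the last entity
    return result.data;
}

unittest
{
    assert(encodeXml("a < b && c") == "a &lt; b &amp;&amp; c");
    auto plain = "no specials here";
    assert(encodeXml(plain) is plain); // same slice, no copy made
}
```

Compared with the original Phobos version quoted in comment #0, this scans the input once instead of up to five times and builds a single output buffer instead of one fresh string per replaced character class.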