Bug 3218 – Performance of std.xml.encode must be improved
Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
Other
OS
Linux
Creation time
2009-07-30T20:00:00Z
Last change time
2015-06-09T05:15:08Z
Keywords
performance
Assigned to
andrei
Creator
andrei
Comments
Comment #0 by andrei — 2009-07-30T20:00:38Z
I'm relaying this on behalf of zsxxsz <[email protected]>, see http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=94478:
=============================
Hi, below are the functions from Phobos and Tango that serve the same purpose;
we can see why so many people like Tango more than Phobos.
>>> In Phobos:
string encode(string s)
{
    // The ifs are (temporarily, we hope) necessary, because
    // std.string.write.replace
    // does not do copy-on-write, but instead copies always.
    if (s.indexOf('&') != -1) s = replace(s,"&","&amp;");
    if (s.indexOf('"') != -1) s = replace(s,"\"","&quot;");
    if (s.indexOf("'") != -1) s = replace(s,"'","&apos;");
    if (s.indexOf('<') != -1) s = replace(s,"<","&lt;");
    if (s.indexOf('>') != -1) s = replace(s,">","&gt;");
    return s;
}
>>>In Tango:
T[] toEntity(T) (T[] src, T[] dst = null)
{
    T[] entity;
    auto s = src.ptr;
    auto t = s;
    auto e = s + src.length;
    auto index = 0;
    while (s < e)
        switch (*s)
        {
            case '"':
                entity = "&quot;";
                goto common;
            case '>':
                entity = "&gt;";
                goto common;
            case '<':
                entity = "&lt;";
                goto common;
            case '&':
                entity = "&amp;";
                goto common;
            case '\'':
                entity = "&apos;";
                goto common;
            common:
                auto len = s - t;
                if (dst.length <= index + len + entity.length)
                    dst.length = (dst.length + len + entity.length)
                                 + dst.length / 2;
                dst [index .. index + len] = t [0 .. len];
                index += len;
                dst [index .. index + entity.length] = entity;
                index += entity.length;
                t = ++s;
                break;
            default:
                ++s;
                break;
        }
    // did we change anything?
    if (index)
    {
        // copy tail too
        auto len = e - t;
        if (dst.length <= index + len)
            dst.length = index + len;
        dst [index .. index + len] = t [0 .. len];
        return dst [0 .. index + len];
    }
    return src;
}
We can see that the Tango function performs better than the one from Phobos.
This may not be the only function where the two differ. :)
Comment #1 by andrei — 2009-08-28T09:54:16Z
I changed encode (which was indeed horrendous) to this:
S encode(S)(S s, S buffer = null)
{
    string r;
    size_t lastI;
    if (buffer) buffer.length = 0;
    auto result = Appender!(string)(&buffer);
    foreach (i, c; s)
    {
        switch (c)
        {
            case '&': r = "&amp;"; break;
            case '"': r = "&quot;"; break;
            case '\'': r = "&apos;"; break;
            case '<': r = "&lt;"; break;
            case '>': r = "&gt;"; break;
            default: continue;
        }
        // Replace with r
        result.put(s[lastI .. i]);
        result.put(r);
        lastI = i + 1;
    }
    if (!result.data) return s;
    result.put(s[lastI .. $]);
    return result.data;
}
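
For readers on a current compiler, the single-pass idea above can be sketched roughly as follows. This is only an illustration, not the code that was committed to Phobos: the name encodeXml is hypothetical, it is restricted to string for simplicity, and it assumes the present-day std.array.appender API rather than the 2009 Appender constructor that took a pointer to a buffer. An explicit flag records whether anything was escaped, so the input is returned unchanged (no allocation) when it contains no special characters.

```d
import std.array : appender;

/// Hypothetical helper (assumed name, not the Phobos function):
/// escapes the five predefined XML characters in one pass.
string encodeXml(string s)
{
    auto result = appender!string();
    size_t lastI = 0;     // start of the run not yet copied to result
    bool changed = false; // set once at least one character was escaped

    foreach (i, char c; s)
    {
        string r;
        switch (c)
        {
            case '&':  r = "&amp;";  break;
            case '"':  r = "&quot;"; break;
            case '\'': r = "&apos;"; break;
            case '<':  r = "&lt;";   break;
            case '>':  r = "&gt;";   break;
            default:   continue; // ordinary character: keep scanning
        }
        // Copy the untouched run before the special character, then the entity.
        result.put(s[lastI .. i]);
        result.put(r);
        lastI = i + 1;
        changed = true;
    }

    // Nothing to escape: hand back the original slice without copying.
    if (!changed) return s;

    // Copy the tail after the last escaped character.
    result.put(s[lastI .. $]);
    return result.data;
}
```

As with the Tango version, each input character is inspected once and the output buffer grows incrementally, instead of up to five scans and five full copies of the string.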