Comment #0 by bearophile_hugs — 2010-02-24T02:33:14Z
While programming in D I have seen that it is easy to forget that "byte" is signed, because I normally think of bytes as unsigned entities, and other people share the same idea. (This is similar, but not identical, to the situation of signed and unsigned chars in C.)
There are several ways to solve this small problem. One of the simpler ways I can think of is to deprecate the "byte" type name and introduce an "sbyte" type name that replaces it. With sbyte it's probably much easier to remember that the value is signed.
This introduces an inconsistency in the naming scheme of D integral types (they are currently symmetric: ubyte, byte, int, uint, etc.), but it can help avoid some bugs, especially from D newbies.
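To make the pitfall concrete, here is a minimal sketch of what happens when a bit pattern that "looks" unsigned ends up in a `byte`. The helper name `asSigned` is hypothetical and exists only for this illustration.

```d
// Hypothetical helper: reinterpret the bits of a ubyte as a (signed) byte.
byte asSigned(ubyte raw)
{
    return cast(byte) raw;
}

void main()
{
    // byte is a signed 8-bit type; ubyte is the unsigned one.
    static assert(byte.min == -128 && byte.max == 127);
    static assert(ubyte.min == 0 && ubyte.max == 255);

    byte b = asSigned(0xF0);  // bit pattern 1111_0000
    int widened = b;          // sign-extended when promoted to int
    assert(widened == -16);   // not 240, as one might expect from "a byte"

    // Masking recovers the unsigned interpretation of the same bits.
    assert((b & 0xFF) == 0xF0);
}
```

Declaring the variable as `ubyte` in the first place avoids the surprise, which is the habit the proposed renaming is meant to encourage.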
Comment #1 by bearophile_hugs — 2010-03-14T18:19:24Z
The signed/unsigned bytes in C# are:
- The sbyte type represents signed 8-bit integers with values between -128 and 127.
- The byte type represents unsigned 8-bit integers with values between 0 and 255.
Choosing ubyte/sbyte is acceptable too.
Comment #2 by andrej.mitrovich — 2012-10-21T19:52:53Z
Although I agree with you, I think it's way too late to fix this without breaking tons of code. You can always use an alias in your own code. Adding it to Phobos would probably be unwise too (people would ask what the difference between byte and sbyte is).
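A user-level alias along these lines might look like the following sketch. The name `sbyte` is not part of the language or Phobos; it is an assumption introduced purely in the user's own module.

```d
// Hypothetical user-defined alias; `sbyte` is not a built-in D type name.
alias sbyte = byte;

// The alias names exactly the same type, so nothing changes semantically.
static assert(is(sbyte == byte));
static assert(sbyte.min == -128 && sbyte.max == 127);

void main()
{
    sbyte delta = -1;  // reads unambiguously as a signed value
    ubyte raw = 255;   // the unsigned counterpart
    assert(cast(ubyte) delta == raw);  // same bit pattern, different sign
}
```

Because this is only an alias, the compiler still accepts `byte` everywhere, so the benefit is limited to readability in code that uses the sbyte/ubyte pair consistently.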
Comment #3 by clugdbug — 2012-10-22T02:02:18Z
This is not a newbie issue. I make this mistake myself, fairly often. *Walter* made this mistake once, in the header generation tool! My experience is that 90% of uses of "byte" should instead be "ubyte". It is really, really unusual to be using signed bytes.
I wish we could change this. (I would do it by changing the type to "sbyte" and then adding "alias byte = sbyte;" to object.d).
Comment #4 by andrej.mitrovich — 2012-10-22T08:58:02Z
(In reply to comment #3)
> This is not a newbie issue. I make this mistake myself, fairly often.
Absolutely, it happens to me all the time as well.
> I wish we could change this. (I would do it by changing the type to "sbyte" and
> then adding "alias byte = sbyte;" to object.d).
That still won't prevent you from making the mistake of typing 'byte' instead of 'ubyte' though. :)
Comment #5 by bearophile_hugs — 2012-10-22T09:52:06Z
(In reply to comment #4)
> That still won't prevent you from making the mistake of typing 'byte' instead
> of 'ubyte' though. :)
If you have sbyte and ubyte, and you keep using them consistently, I think this alone helps reduce mistakes a little.
And once a few years have passed, using "byte" is considered a bad idiom, and D programs in the wild use "byte" less and less, we can even consider deprecating it.
There is a ton of C++ code that represents null as "0", yet C++11 has nullptr, and G++ from version 4.7 has a warning (-Wzero-as-null-pointer-constant) that helps find uses of "0" as a null pointer.
The most important thing is the desire to improve the situation, then some slow deprecation paths exist.
Comment #6 by clugdbug — 2012-10-23T03:15:33Z
>> I wish we could change this. (I would do it by changing the type to "sbyte"
>> and then adding "alias byte = sbyte;" to object.d).
> That still won't prevent you from making the mistake of typing 'byte' instead
> of 'ubyte' though. :)
By itself, no, but anybody can modify their local copy of object.d to remove the alias...
A very slow deprecation path is possible.
Comment #7 by kozzi11 — 2012-10-23T07:02:09Z
(In reply to comment #0)
> While programming in D I have seen that you can forget that the "byte" is
> signed. (Because normally I think of bytes as unsigned entities. Other people
> share the same idea). (It's similar but not equal to the situation of signed
> and unsigned chars in C).
>
> There are several ways to solve this small problem. One of the simpler ways I
> can think of is to deprecate the "byte" type name and introduce a "sbyte" type
> name (that replaces the "byte" type name). Using a sbyte it's probably much
> easier to not forget that it's a signed value.
>
> This introduces an inconsistency in the naming scheme of D integral values (they
> are now symmetric, ubyte, byte, int, uint, etc), but it can help avoid some
> bugs, especially from D newbies.
I think byte should be unsigned by default. So I am for sbyte (signed byte; is there really anyone who needs it?) and byte (unsigned byte).
Comment #8 by bearophile_hugs — 2012-10-23T09:32:37Z
(In reply to comment #7)
> I think byte should be unsigned by default. So I am for sbyte(signed byte - Is
> there really anyone who need it?) and byte (unsigned byte)
Ideally I agree with you. In practice D's built-in types are prefixed with "u" when unsigned, so a more practical solution is the C# one, that is, using the "ubyte" and "sbyte" name pair.
Regarding the usefulness of signed bytes: small data types like ubyte, sbyte, short, ushort and even float are mostly useful in aggregates, like arrays and arrays of structs. They are not so useful if you need only one of them.
Recently I used an array of sbyte values to represent indexes into a short array (statically known to be shorter than 127 items). Using 1 byte instead of an int/uint/size_t saves space if you have many such indexes, and saving space means fewer cache misses. I used an sbyte instead of a ubyte for those indexes because I used -1 to represent "missing value".
sbyte values are not used often, but it's right to have them too in a system language.
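The index-array use case described above could be sketched roughly as follows. The names `sbyte` (a user alias for `byte`), `missing`, and `lookup` are assumptions made for this example, not code from the original program.

```d
// Hypothetical alias and sentinel, introduced only for this sketch.
alias sbyte = byte;
enum sbyte missing = -1;  // sentinel that a ubyte index could not express

// Resolve slot `i` to a name, or to a placeholder when the slot is empty.
string lookup(string[] names, sbyte[] slots, size_t i)
{
    sbyte s = slots[i];
    return s == missing ? "(none)" : names[s];
}

void main()
{
    string[] names = ["red", "green", "blue"];  // fewer than 127 items
    sbyte[] slots = [2, -1, 0, 1];              // one byte per index

    // Each index costs 1 byte instead of 4 (int) or 8 (size_t on 64-bit).
    static assert(sbyte.sizeof == 1 && int.sizeof == 4);

    assert(lookup(names, slots, 0) == "blue");
    assert(lookup(names, slots, 1) == "(none)");
    assert(lookup(names, slots, 3) == "green");
}
```

The sentinel check has to come before the indexing expression, since a negative `byte` used directly as an array index would be converted to a very large unsigned value.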
Comment #9 by bearophile_hugs — 2014-10-09T16:02:27Z
The principle of least surprise is utterly violated with this: byte is unsigned everywhere except for D.
A symmetric pair of names could be tiny/utiny (for tiny int).
(In reply to Andrej Mitrovic from comment #2)
> Although I agree with you I think it's way too late to fix this without
> breaking tons of code.
Another easy task for dfix, BTW.
Comment #12 by samjnaa — 2015-10-17T10:53:26Z
If the default signedness of byte is going to be changed, then I support the request for tiny/utiny (a very nice choice) or some other <name>/u<name> pair.
byte would then be aliased to utiny/u<name> and ubyte slowly deprecated and removed.
I personally haven't had much problem with byte/ubyte, but I can see where it would be a problem for others.
Comment #13 by dmitry.olsh — 2018-05-16T14:53:53Z
Like it or not, changing `byte` to be unsigned and `sbyte` to be signed, or some such, is a ton of _trivial_ breakage that gives us exactly 0 benefit.
It may appease C# programmers, but I believe the name of the signed byte type is the least of their problems. Java has a signed `byte` too, so D's choice is not without precedent.