Bug 3813 – Bad writeln of arrays

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
x86
OS
Windows
Creation time
2010-02-18T09:47:00Z
Last change time
2015-06-09T01:27:40Z
Keywords
patch
Assigned to
andrei
Creator
bearophile_hugs

Comments

Comment #0 by bearophile_hugs — 2010-02-18T09:47:14Z
import std.stdio: writeln;
void main() {
    auto a = ["a", "bc"];
    writeln(a);
}

This writeln prints an ugly and misleading output:

a bc

instead of something much more realistic and useful, such as:

["a", "bc"]

(This is my first bug report in this bug tracker. I will probably add some more bugs here; some of them will probably be duplicates of bugs already present. Most of them will be bugs; I try to limit the true enhancement requests to very few and small ones, because D2 is now feature frozen. All my bug reports will be relative to D2/Phobos, but many of them may be present in D1 too. If I am doing something wrong, please tell me.)
Comment #1 by andrei — 2010-02-18T10:33:23Z
This is by design. The default array bounds were changed from "[" and "]" to "" and "", and the default separator has been changed from ", " to " ".
Comment #2 by bearophile_hugs — 2010-02-18T10:56:31Z
(In reply to comment #1)
> This is by design. The default array bounds were changed from "[" and "]" to ""
> and "", and the default separator has been changed from ", " to " ".

It's a bad design, much less useful than the design I have shown. A print like:

["a", "bc"]

tells you it's probably an array, how long it is, how long its contents are, and the strings themselves.
Comment #3 by bearophile_hugs — 2010-08-09T07:31:25Z
In dmd 2.048beta the situation is improved, but it's not good enough yet. This D2 program:

import std.stdio, std.typetuple, std.typecons;
void main() {
    auto a1 = ["10", "20"];
    writeln(a1);
    auto a2 = [10, 20];
    writeln(a2);
    char[] a3 = ['5', '7', '9'];
    writeln(a3);
    auto t1 = TypeTuple!(10, "20", '7');
    writeln(t1);
    auto t2 = tuple(10, "20");
    writeln(t2);
}

Prints:

[10, 20]
[10, 20]
579
10207
Tuple!(int,string)(10, 20)

From that output there is no way, in both the array and the TypeTuple, to tell apart strings, numbers and chars. This doesn't help D2 debugging, and it doesn't help at all when you write quick scripts that often use a simple writeln() for their output. In both cases being able to tell apart numbers and strings in the output is quite important.

So my warm suggestion is to put "" around strings and '' around chars when they are printed inside collections (inside string literals, special chars need to be escaped). So my expected output is:

["10", "20"]
[10, 20]
579
10"20"'7'
Tuple!(int,string)(10, "20")

A possible alternative output:

["10", "20"]
[10, 20]
['5', '7', '9']
10"20"'7'
Tuple!(int, string)(10, "20")

A similar Python 2.7 program:

from collections import namedtuple
a1 = ["10", "20"]
print a1
a2 = [10, 20]
print a2
a3 = ['5', '7', '9']
print a3
t1 = (10, "20", '7')
print t1
t2 = namedtuple('Two', 'a b')(10, "20")
print t2

Prints:

['10', '20']
[10, 20]
['5', '7', '9']
(10, '20', '7')
Two(a=10, b='20')

(In Python strings can be delimited by '' too.)

In D I presume writeln() can't tell that the input is a TypeTuple, so it can't print something like the () as in Python.
Comment #4 by bearophile_hugs — 2010-08-09T08:41:44Z
Sorry, the expected results are wrong because writeln() can't see TypeTuple items as inside a collection, so the expected output is:

["10", "20"]
[10, 20]
579
10207
Tuple!(int,string)(10, "20")

And the possible alternative output is:

["10", "20"]
[10, 20]
['5', '7', '9']
10207
Tuple!(int, string)(10, "20")
Comment #5 by bearophile_hugs — 2010-08-09T08:49:59Z
See also bug 4605
Comment #6 by bearophile_hugs — 2010-08-17T05:57:11Z
See also bug 4660
Comment #7 by bearophile_hugs — 2010-10-11T16:20:06Z
See also bug 5043
Comment #8 by bearophile_hugs — 2010-10-29T17:28:53Z
import std.stdio, std.range;
void main() {
    writeln(iota(5));
}

With DMD 2.050 that program prints:

[0, 1, 2, 3, 4]

But that's not an array, it's a lazy sequence, and I'd like to be able to tell apart an array from a lazy sequence in a printout. A possible simple way to tell them apart is to print that lazy range like this, like an array, but with semicolons instead of commas (in some languages this syntax is used to tell apart linked lists from arrays, but in D lazy ranges are probably more common than lists):

[0; 1; 2; 3; 4]
Comment #9 by denis.spir — 2010-12-16T00:13:28Z
(In reply to comment #4)
> Sorry, the expected results are wrong because writeln() can't see TypeTuple
> items as inside a collection, so the expected output is:
>
>
> ["10", "20"]
> [10, 20]
> 579
> 10207
> Tuple!(int,string)(10, "20")
>
>
> And the possible alternative output is:
>
> ["10", "20"]
> [10, 20]
> ['5', '7', '9']
> 10207
> Tuple!(int, string)(10, "20")

I support this enhancement request.

About char[], I think using a string format rather than an array format is better, to respect the semantics of "char" (as opposed to ubyte[]). Especially for debugging: char[] is often used temporarily to manipulate a string, thus we want to be able to (visually) compare it to a string.

Denis
Comment #10 by bearophile_hugs — 2011-01-08T03:30:24Z
This Python2 program:

array1 = [1, 2]
print array1
array2 = [1.0, 2.0]
print array2

Prints:

[1, 2]
[1.0, 2.0]

This similar D2 program:

import std.stdio;
void main() {
    int[2] array1 = [1, 2];
    writeln(array1);
    double[2] array2 = [1.0, 2.0];
    writeln(array2);
}

With DMD 2.051 prints:

[1, 2]
[1, 2]

A print like the one shown by Python is better because it makes it easier to tell apart visually the types of the two arrays:

[1, 2]
[1.0, 2.0]
Comment #11 by denis.spir — 2011-01-28T06:08:53Z
More generally, why don't writeln / formatValue / whatever builtin funcs used for output simply recurse to format elements of collections? This is how things work in all languages I know that provide default output forms (and this default is equal, or very similar, to literal notation). Isn't this the only sensible choice? I think we agree the default output format is primarily for programmer's feedback.

Side-point: I would love default formats for composite things like struct & class objects as well. Probably another enhancement request. Currently, the code copied below writes out:

S
modulename.C

Great! Very helpful ;-) I wish we would get, as default, an output similar to the notation needed to create the objects:

S(1, 1.1, '1', "1.1", S.Sub(1))
C(1, 1.1, '1', "1.1", new C.Sub(1))

(Except for members with default values, possibly not provided in creation code, but listed on output.)

At least we can write a toString... but it's a bit annoying to be forced to do it, esp. for quickly written pieces of code, when a default would do the job (probably most cases by far).

Denis

Code:

import std.stdio;

struct S {
    struct Sub {
        int j;
        this (int j) {
            this.j = j;
        }
    }
    int i;
    float f;
    char c;
    string s;
    Sub sub;
}
class C {
    static class Sub {
        int j;
        this (int j) {
            this.j = j;
        }
    }
    int i;
    float f;
    char c;
    string s;
    Sub sub;
    this (int i, float f, char c, string s, Sub sub) {
        this.i = i;
        this.f = f;
        this.c = c;
        this.s = s;
        this.sub = sub;
    }
}
unittest {
    S s = S(1, 1.1, '1', "1.1", S.Sub(1));
    writeln(s);
    C c = new C(1, 1.1, '1', "1.1", new C.Sub(1));
    writeln(c);
}
Comment #12 by k.hara.pg — 2011-09-02T12:04:18Z
https://github.com/D-Programming-Language/phobos/pull/126

All ranges are formatted like "[elem1, elem2, ...]".
Comment #13 by bearophile_hugs — 2011-09-02T15:11:43Z
(In reply to comment #12)
> https://github.com/D-Programming-Language/phobos/pull/126
>
> All of ranges are formatted like "[elem1, elem2, ...]".

I appreciate the work you are doing to improve D textual Input/Output.

Regarding lazy ranges, generally I prefer the textual output to give me hints of what I have printed. So I'd like some difference between the textual representation of this array:

[0, 1, 2]

And this range:

iota(3)

I think a simple way to tell them apart is to use a different separator. Functional languages sometimes use the semicolon to separate list items, so I think it's nice to print iota(3) like this:

[0; 1; 2]
Comment #14 by k.hara.pg — 2011-09-02T16:00:43Z
(In reply to comment #13)
> (In reply to comment #12)
> > https://github.com/D-Programming-Language/phobos/pull/126
> >
> > All of ranges are formatted like "[elem1, elem2, ...]".
>
> I appreciate the work you are doing to improve D textual Input/Output.
>
> Regarding lazy ranges, generally I prefer the textual output to give me hints
> of what I have printed. So I'd like some difference between the textual
> representation of this array:
>
> [0, 1, 2]
>
> And this range:
>
> iota(3)
>
> I think a simple way to tell them apart is to use a different separator.
> Functional languages sometimes use the semicolon to separate list items, so I
> think it's nice to print iota(3) like this:
>
> [0; 1; 2]

I think it is nonsense to distinguish the formatting of an eager range (like an array) from that of a lazy range (like iota), for two reasons:

(1) Today we cannot restore a lazy range from already formatted output. We don't have a common interface to rebuild a range from a formatted string. Maybe OutputRange is that interface (call put(r, e0), put(r, e1), ... against the range), but at present almost no lazy ranges are OutputRanges. And D is a strictly typed language, so when you want to unformat a range, you have to give the range type first. So the separator will not be an issue.

(2) I think that output like "[e0, e1, ..., eN]" is *range formatting*, not array formatting, even if it looks like an array. And an array is a kind of range, so we should format them with *one* formatting.

For these reasons, it seems to me that there is no need to distinguish range formattings.
Comment #15 by bearophile_hugs — 2011-09-02T16:38:20Z
(In reply to comment #14)
> (2) I think that the output like "[e0, e1, ..., eN]" is *range formatting*, not
> array formatting, even if it looks like array. And an array is a kind of
> ranges, so we should format them with *one* formatting.

Then an alternative idea is to use [,,,,] for all ranges that support random access (std.range.isRandomAccessRange), and [;;;;] for all the other kinds of ranges.
Comment #16 by andrei — 2011-09-04T12:38:55Z
Let's use [,,,] for all ranges. Thanks Kenji for your work.
Comment #17 by bearophile_hugs — 2011-09-05T02:53:08Z
(In reply to comment #16)
> Let's use [,,,] for all ranges. Thanks Kenji for your work.

What's your rationale for this decision?

---------------------------

In 2.055beta the printing situation is improved a lot, thanks to your work. I see two things where I'd like further improvement:

This code:

import std.stdio;
void main() {
    int[2] array1 = [1, 2];
    writeln(array1);
    double[2] array2 = [1.0, 2.0];
    writeln(array2);
}

With DMD 2.055beta prints:

[1, 2]
[1, 2]

This is not good, because from the text you can't tell FP point numbers from integer ones. In a similar situation Python prints this, that I think is a better output:

[1.0, 2.0]
[1.0, 2.0]

------------------------------

This code:

import std.stdio, std.typecons;
void main () {
    writeln(tuple(1, "xx"));
}

Prints:

Tuple!(int,string)(1, xx)

But I'd like:

tuple(1, "xx")

That's closer to the array representation, because D tuples are not TypeTuples; they have different semantics (with a TypeTuple it is probably better not to print the "" around strings).

Python does something similar (Python uses ' instead of " but this is not relevant):

>>> (1, "xx")
(1, 'xx')

--------------------------

This is a less important thing. This code:

import std.stdio;
void main () {
    writeln([1:2, 3:4]);
}

Prints:

[1:2, 3:4]

While Python 2.6 prints:

>>> {1:2, 3:4}
{1: 2, 3: 4}

Python adds a space after the colon, probably to increase readability.
Comment #18 by bearophile_hugs — 2011-09-05T02:55:45Z
(In reply to comment #17)
> This is not good, because from the text you can't tell FP point numbers from
> integer ones. In a similar situation Python prints this, that I think is a
> better output:
>
> [1.0, 2.0]
> [1.0, 2.0]

Sorry, I meant:

[1, 2]
[1.0, 2.0]
Comment #19 by k.hara.pg — 2011-09-05T03:11:08Z
(In reply to comment #18)
> (In reply to comment #17)
>
> > This is not good, because from the text you can't tell FP point numbers from
> > integer ones. In a similar situation Python prints this, that I think is a
> > better output:
> >
> > [1.0, 2.0]
> > [1.0, 2.0]
>
> Sorry, I meant:
>
> [1, 2]
> [1.0, 2.0]

If you want that result, the following code will work:

writefln("[%(%.1f, %)]", array2); // prints [1.0, 2.0]
Comment #20 by bearophile_hugs — 2011-09-05T04:12:39Z
(In reply to comment #19)
> If you want that result, following code will work.
>
> writefln("[%(%.1f, %)]", array2); // prints [1.0, 2.0]

What I meant to say is that I'd like

writeln([1.0, 2.0])

to print this by default:

[1.0, 2.0]
Comment #21 by andrei — 2011-09-05T06:02:55Z
(In reply to comment #17)
> (In reply to comment #16)
> > Let's use [,,,] for all ranges. Thanks Kenji for your work.
>
> What's your rationale for this decision?

Absent other considerations, more consistency is better than less consistency. You keep on asking for arbitrary differentiation of kinds of ranges in their textual representation without any rationale. The burden is on you to justify the inconsistency, not on Kenji to justify consistency.

Plus, the requests you are making now are self-contradictory. First, you want the representation of the array to clarify its type:

> [1, 2]
> [1.0, 2.0]

This marginally differentiates floating point types from integral types. However it's an incomplete solution as it doesn't differentiate across float/double/real or short/int/long/uint etc. But let's note that here you are asking for a format that helps type differentiation.

One paragraph below (!) you are asking for the exact opposite: less type differentiation. You want to change

> Tuple!(int,string)(1, xx)

to

> tuple(1, "xx")

Both changes have pros and cons, but there's no guiding rationale behind them.
Comment #22 by bearophile_hugs — 2011-09-05T16:14:07Z
(In reply to comment #21)
> more consistency is better than less consistency.

Consistency is good when the situations don't change. But in our discussions the conditions weren't invariant, because the range types (random access or not) are different.

The different separators are not an important feature, so if you aren't interested I'll stop asking for it. It's just a cute little thing that I think gives a little help.

> You keep on asking for arbitrary differentiation of kinds of ranges in their
> textual representation without any rationale. The burden is on you to justify
> the inconsistency, not on Kenji to justify consistency.

The desire to tell apart lazy ranges from random access ranges comes from the desire to give more information to the person that reads the textual output. This is especially useful in dynamically typed languages, but it's useful in D too, because you use "auto" definition, template type values, and generally reading textual outputs it's not always easy to know what part of the printout is generated by a specific writeln.

> Plus, the requests you are making now are self-contradictory.

The moderate self-contradictory nature of what I've said comes from practical considerations. See below.

> First, you want the representation of the array to clarify its type:
>
> > [1, 2]
> > [1.0, 2.0]
>
> This marginally differentiates floating point types from integral types.
> However it's an incomplete solution as it doesn't differentiate across
> float/double/real or short/int/long/uint etc. But let's note that here you are
> asking for a format that helps type differentiation.

Currently D allows implicit casts between float/double/real. But you can't implicitly cast an int to float. So while float and double are two different types just as int and double are two different types, in D double and real are less different than a double and an int. So the D definition of "different type" is not fully binary, it's a bit fuzzy.

(It can be argued that a good textual representation of floats/reals has to contain a leading "F"/"L". I think this is a bit overkill, but it's not so bad.)

> One paragraph below (!) you are asking for the exact opposite: less type
> differentiation. You want to change
>
> > Tuple!(int,string)(1, xx)
>
> to
>
> > tuple(1, "xx")
>
> Both changes have pros and cons, but there's no guiding rationale behind them.

There is a rationale: I'd like the textual representation to be as informative as possible, and possibly to be "invertible" too (so "unprinting" it gives back the original value or data structure), unless this goes too much against other practical considerations (like output space, output readability, etc).

Writing a double by default as 2.0 instead of 2 is good because it tells me that I have printed a value of one of the floating point types (float/double/real or some other library defined FP value). In several situations it's useful to know if a value is integral or not.

Printing a tuple like:

Tuple!(int,string)(1, xx)

Is bad because the second argument is not what you see in the code, so if you "unprint" it you get an error (unless you add a special case for this situation, but this is not good and it doesn't scale).

(Doing the same for a TypeTuple is acceptable, in my opinion.)

So this is a better textual representation:

Tuple!(int,string)(1, "xx")

Currently if you print an array of strings you get:

["xx", "yy"]

Because this allows unprinting, is more readable, etc. For consistency with string array printing, I'd like tuples to do the same (while TypeTuples are very different from arrays).

Now think about an array of tuples, if you print one of them you get something like:

[Tuple!(int,string)(1, "aa"), Tuple!(int,string)(2, "bb"), Tuple!(int,string)(3, "cc"),
Tuple!(int,string)(4, "dd"), Tuple!(int,string)(1, "ee"), Tuple!(int,string)(1, "ff"),
Tuple!(int,string)(1, "gg"), ...]

Here in my opinion there is too much noise, most of the space is used by redundant things. This is why I have suggested to print tuples, when they are inside a collection, in a shorter way:

[tuple(1, "aa"), tuple(2, "bb"), tuple(3, "cc"),
tuple(4, "dd"), tuple(1, "ee"), tuple(1, "ff"),
tuple(1, "gg"), ...]

Or even (but this can't be unprinted, so this is probably excessive):

[(1, "aa"), (2, "bb"), (3, "cc"), (4, "dd"), (1, "ee"), (1, "ff"), (1, "gg"), ...]

On the other hand if I print a single tuple, outside of collections, then showing the full typing is usually acceptable:

Tuple!(int,string)(1, "xx")

In arrays of FP values the ".0" adds just two chars and only to numbers that don't already have one or more decimal digits. So the added ".0" doesn't add too much noise (unlike adding "Tuple!(int,string)").
Comment #23 by andrei — 2011-09-05T19:11:33Z
(In reply to comment #22)
> (In reply to comment #21)
>
> > more consistency is better than less consistency.
>
> Consistency is good when the situations don't change. But in our discussions
> the conditions weren't invariant, because the range types (random access or
> not) are different.

This doesn't make sense, so I wouldn't know how to answer. Anyhow, first you wanted ';' to denote lazy vs. eager ranges. Now you want it to denote random access vs. other ranges - all irrelevant features to printing ranges. This forces someone who e.g. wants to parse some range to special-case things because the emitter of the string used a random-access range or not. This, again, makes no sense. For all I can tell you'd be happy as long as ';' vs. ',' differentiates _some_ aspect of a range, it doesn't really matter which.

> The different separators are not an important feature, so if you aren't
> interested I'll stop asking for it. It's just a cute little thing that I think
> gives a little help.

Please don't frame me as the arbiter. I'm just pointing out some nonsense.

> > You keep on asking for arbitrary differentiation of kinds of ranges in their
> > textual representation without any rationale. The burden is on you to justify
> > the inconsistency, not on Kenji to justify consistency.
>
> The desire to tell apart lazy ranges from random access ranges comes from the
> desire to give more information to the person that reads the textual output.
> This is especially useful in dynamically typed languages, but it's useful in D
> too, because you use "auto" definition, template type values, and generally
> reading textual outputs it's not always easy to know what part of the printout
> is generated by a specific writeln.

So you want differentiation, it doesn't matter on what feature. Numeric vs. non-numeric seems fair game too.

> Currently D allows implicit casts between float/double/real. But you can't
> implicitly cast an int to float. So while float and double are two different
> types just as int and double are two different types, in D double and real are
> less different than a double and an int. So the D definition of "different
> type" is not fully binary, it's a bit fuzzy.

No, it's quite binary. Two types are identical or not.

> (It can be argued that a good textual representation of floats/reals has to
> contain a leading "F"/"L". I think this is a bit overkill, but it's not so
> bad.)
>
>
> > One paragraph below (!) you are asking for the exact opposite: less type
> > differentiation. You want to change
> >
> > > Tuple!(int,string)(1, xx)
> >
> > to
> >
> > > tuple(1, "xx")
> >
> > Both changes have pros and cons, but there's no guiding rationale behind them.
>
> There is a rationale: I'd like

What one likes is not rationale.

> the textual representation to be as informative
> as possible, and possibly to be "invertible" too (so "unprinting" it gives back
> the original value or data structure), unless this goes too much against other
> practical considerations (like output space, output readability, etc).

But in the same breath you propose tuple(1, "hello") instead of Tuple!(int, string) although it could originate from a Tuple!(short, char[]). Your own requests are dissonant with your own desiderata. To be frank, you quite literally don't know what you want. Why are we spending our valuable time on this?

> Writing a double by default as 2.0 instead of 2 is good because it tells me
> that I have printed a value of one of the floating point types
> (float/double/real or some other library defined FP value). In several
> situations it's useful to know if a value is integral or not.

It's also bad because it takes more space.

> Printing a tuple like:
>
> Tuple!(int,string)(1, xx)
>
> Is bad because the second argument is not what you see in the code, so if you
> "unprint" it you get an error (unless you add a special case for this
> situation, but this is not good and it doesn't scale).
>
> (Doing the same for a TypeTuple is acceptable, in my opinion.)
>
> So this is a better textual representation:
>
> Tuple!(int,string)(1, "xx")

I agree that printing a tuple should produce an unambiguous representation of its fields, but that's different from your request that the type of Tuple is replaced with the word "tuple".

> Currently if you print an array of strings you get:
>
> ["xx", "yy"]
>
> Because this allows unprinting, is more readable, etc. For consistency with
> string array printing, I'd like tuples to do the same (while TypeTuples are
> very different from arrays).

So now you list unprinting as a desideratum yet it's only seconds after you asked for making that gratuitously difficult by using ';' vs. ',' on some arbitrary criterion.

> Now think about an array of tuples, if you print one of them you get something
> like:
>
> [Tuple!(int,string)(1, "aa"), Tuple!(int,string)(2, "bb"),
> Tuple!(int,string)(3, "cc"),
> Tuple!(int,string)(4, "dd"), Tuple!(int,string)(1, "ee"),
> Tuple!(int,string)(1, "ff"),
> Tuple!(int,string)(1, "gg"), ...]
>
> Here in my opinion there is too much noise, most of the space is used by
> redundant things. This is why I have suggested to print tuples, when they are
> inside a collection, in a shorter way:
>
> [tuple(1, "aa"), tuple(2, "bb"), tuple(3, "cc"),
> tuple(4, "dd"), tuple(1, "ee"), tuple(1, "ff"),
> tuple(1, "gg"), ...]
>
> Or even (but this can't be unprinted, so this is probably excessive):
>
> [(1, "aa"), (2, "bb"), (3, "cc"), (4, "dd"), (1, "ee"), (1, "ff"), (1, "gg"),
> ...]
>
>
> On the other hand if I print a single tuple, outside of collections, then
> showing the full typing is usually acceptable:
>
> Tuple!(int,string)(1, "xx")
>
>
> In arrays of FP values the ".0" adds just two chars and only to numbers
> that don't already have one or more decimal digits. So the added ".0" doesn't
> add too much noise (unlike adding "Tuple!(int,string)").

Quite honestly I think the noise is in this discussion. I am closing this report. Sorry.
Comment #24 by bearophile_hugs — 2011-09-06T02:31:57Z
(In reply to comment #23)
> Why are we spending our valuable time on this?

Because I print tuples often, and I'd like to see this situation improved. See also:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=143987

> Quite honestly I think the noise is in this discussion.

It was a useful discussion already:
http://d.puremagic.com/issues/show_bug.cgi?id=6606