Comment #0 by csmith.ku2013 — 2015-12-23T02:07:29Z
This issue expands on Issue #14473, https://issues.dlang.org/show_bug.cgi?id=14473.
I've gone through and highlighted some issues with the generated markup.
Example source:
* view-source:http://dlang.org/spec/property.html
I went ahead and ran this page through an HTML validator (https://validator.w3.org/nu/?doc=http%3A%2F%2Fdlang.org%2Fspec%2Fproperty.html), and here's a brief summary:
* links need to be striped for anchor tags.
* center is obsolete, use CSS
* nesting code in an ordered list generates invalid HTML
Additional notes that pass the validator:
* Usage of the <b> tag instead of <strong>. This breaks the semantic layer html is supposed to provide. See http://www.w3.org/International/questions/qa-b-and-i-tags.
* Loading a stylesheet in the body, all stylesheets using rel should be in the head. http://www.w3.org/html/wg/drafts/html/master/single-page.html#the-link-element:attr-link-rel-4
* Unnecessary span tags. Example: `<td><span class="d_inlinecode donthyphenate notranslate">float.nan</span></td>` could be reduced to `<td class="d_inlinecode donthyphenate notranslate">float.nan</td>`
* Tons of unnecessary whitespace, and at the same time, insufficient whitespace for this to be readily editable after html generation.
Overall, all of these things can account for slower page loads, which can impact page ranking in search engines.
Comment #1 by destructionator — 2015-12-23T02:10:37Z
What do you mean by "links need to be striped for anchor tags." ?
Comment #2 by csmith.ku2013 — 2015-12-23T03:41:47Z
(In reply to Adam D. Ruppe from comment #1)
> What do you mean by "links need to be striped for anchor tags." ?
Poor phrasing, my mistake. I'm referring to the whitespace in the href, e.g. this anchor tag on that page:
<a href=' ../spec/intro.html'>Introduction</a>
I think the fix could be as simple as adding a std.string.stripLeft call, but I haven't investigated every occurrence.
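A minimal sketch of that kind of fix, applied as a post-processing step and written in Python purely for illustration. This is not how the dlang.org generator works: `strip_href_whitespace` is a hypothetical helper, it strips both ends of the value rather than only the left, and the regular expression assumes simple quoted `href` attributes like the one above.

```python
import re

# Matches href='...' or href="..." and captures the prefix, the quote and the value.
# Assumption: values are quoted and do not contain their own quote character.
_HREF = re.compile(r"""(href\s*=\s*)(['"])(.*?)\2""", re.IGNORECASE | re.DOTALL)


def strip_href_whitespace(html: str) -> str:
    """Return html with surrounding whitespace removed from every href value."""
    def fix(match: re.Match) -> str:
        prefix, quote, value = match.groups()
        return f"{prefix}{quote}{value.strip()}{quote}"

    return _HREF.sub(fix, html)


print(strip_href_whitespace("<a href=' ../spec/intro.html'>Introduction</a>"))
```

A regex pass like this only papers over the output; if the space comes from a macro expansion, the cleaner fix would be in the macro itself.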
Comment #3 by destructionator — 2015-12-23T03:51:37Z
Ah, yes indeed, though the whitespace is prolly generated accidentally by some macro which can't generically strip :(
Comment #4 by csmith.ku2013 — 2015-12-23T04:08:34Z
(In reply to Adam D. Ruppe from comment #3)
> Ah, yes indeed, though the whitespace is prolly generated accidentally by
> some macro which can't generically strip :(
That's what I was afraid of, and it's pretty unfortunate. This actually violates the standard, so nothing requires browsers to support what we're doing, and they could change at any time (not that they necessarily ever will, though).
Comment #5 by andrei — 2015-12-23T15:25:01Z
We can do some of these with relative ease but some others are likely to be difficult. In particular,
<td><span class="d_inlinecode donthyphenate notranslate">float.nan</span></td>
is an artifact of how generation occurs: first the code font is expanded, then the table tag is expanded. Merging those two is nontrivial with what we have.
I'll add that we generate a few tags that are in fact optional, see the list at http://www.w3.org/TR/2011/WD-html5-20110525/syntax.html#optional-tags.
All of these issues in aggregate, including whitespace, would add to some inefficiency. I think it may be unmeasurable or difficult to measure, and at best add only to a couple percent. Certainly at this point "all of these things can account for slower page loads, which can impact page ranking in search engines" is pure speculation.
Do we want nice tight generated HTML? Somewhat. Does fixing this issue help? Yes. Do we want fast-loading pages? Yes. Does fixing this issue help? Unlikely to make any sensible improvement.
Comment #6 by destructionator — 2015-12-23T15:26:46Z
Indeed, the load time of these things is probably irrelevant. Minification efforts, in general, are almost plain negligible after gzipping.
gzip though, does make a huge difference and we should ensure it is enabled.
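One way to check this locally is to compare sizes before and after gzip for a page with and without its inter-tag whitespace. The sketch below uses a synthetic table as a stand-in for real generated markup. `collapse_whitespace` is a crude, hypothetical minifier rather than a real tool, and whatever numbers it prints say nothing about dlang.org itself.

```python
import gzip
import re
from typing import Tuple


def collapse_whitespace(html: str) -> str:
    """Crude minifier: collapse whitespace runs, then drop whitespace between tags."""
    html = re.sub(r"\s+", " ", html)
    return re.sub(r">\s+<", "><", html).strip()


def sizes(html: str) -> Tuple[int, int]:
    """Return (raw byte count, gzipped byte count) for html encoded as UTF-8."""
    data = html.encode("utf-8")
    return len(data), len(gzip.compress(data))


# Synthetic stand-in for generated markup with generous indentation.
rows = "\n".join(
    f'        <tr>\n            <td><span class="d_inlinecode">value{i}</span></td>\n        </tr>'
    for i in range(200)
)
page = f"<html>\n    <body>\n    <table>\n{rows}\n    </table>\n    </body>\n</html>\n"

raw, raw_gz = sizes(page)
mini, mini_gz = sizes(collapse_whitespace(page))
print(f"original: {raw} bytes, gzipped {raw_gz} bytes")
print(f"minified: {mini} bytes, gzipped {mini_gz} bytes")
```

Running the same two functions over a saved copy of a real page would give a more meaningful comparison than the synthetic input.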
Comment #7 by andrei — 2015-12-23T15:39:53Z
I recall Jan has enabled compression a while ago. How do we check?
Comment #8 by destructionator — 2015-12-23T15:43:33Z
Just load it with an empty cache with network console open.
Yes, the homepage is gzipped. It comes in at about 8 KB on the wire, not bad at all.
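The same check can be scripted. A minimal sketch using only the Python standard library; the helper names are hypothetical, and the URL in the usage comment is simply the site under discussion. `urllib` does not decompress responses on its own, so the `Content-Encoding` header reflects what the server actually sent.

```python
from typing import Mapping
from urllib.request import Request, urlopen


def is_gzipped(headers: Mapping[str, str]) -> bool:
    """True if the response headers declare a gzip content encoding."""
    for name, value in headers.items():
        if name.lower() == "content-encoding":
            return "gzip" in value.lower()
    return False


def fetch_headers(url: str) -> dict:
    """Request url while advertising gzip support and return the response headers."""
    request = Request(url, headers={"Accept-Encoding": "gzip"})
    with urlopen(request) as response:
        return dict(response.headers.items())


# Usage (requires network access):
#   print(is_gzipped(fetch_headers("http://dlang.org/")))
```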
Comment #9 by csmith.ku2013 — 2015-12-23T16:27:04Z
(In reply to Andrei Alexandrescu from comment #5)
> We can do some of these with relative ease but some others are likely to be
> difficult. In particular,
>
> <td><span class="d_inlinecode donthyphenate
> notranslate">float.nan</span></td>
>
> is an artifact of how generation occurs, first the code font is expanded
> then the table tag is expanded. Merging those two is nontrivial with what we
> have.
This is what I figured, but I thought I'd point it out just the same. Unfortunately these kinds of things can make designing new templates a bit chaotic, since you end up with oddball things like:
div#search-box, span#search-query, span#search-dropdown, span#search-submit
{
    border: 0.1em none #aaa;
}
I should've explicitly listed this as a concern.
> All of these issues in aggregate, including whitespace, would add to some
> inefficiency. I think it may be unmeasurable or difficult to measure, and at
> best add only to a couple percent. Certainly at this point "all of these
> things can account for slower page loads, which can impact page ranking in
> search engines" is pure speculation.
Fair enough, it was a weak argument, and as Adam pointed out, gzipping makes this not even that. The only concern I was thinking of when pointing this out was more for search engine page ranking, which favors compact compressed sites, but I have no idea to what degree.
Nonconforming and deprecated HTML is also present, however, and that is perhaps a larger issue. I don't expect browsers to suddenly stop compensating for these mistakes the way they do now, so it might just be good to know for the future.
Comment #10 by dfj1esp02 — 2015-12-23T16:44:43Z
(In reply to Charles Smith from comment #0)
> * Unnecessary span tags. Example: `<td><span class="d_inlinecode
> donthyphenate notranslate">float.nan</span></td>` could be reduced to `<td
> class="d_inlinecode donthyphenate notranslate">float.nan</td>`
Not equivalent if you want the span to have a border.
Comment #11 by robert.schadek — 2024-12-15T15:23:09Z