j4 ([personal profile] j4) wrote 2007-03-20 09:16 am

This is an ex-HTML

Okay, I think I'm going mad. I put the following into our CMS:
<ul>
<li> Item 1
<ul>
<li> SubItem 1</li>
<li> SubItem 2</li>
</ul>
</li>
<li> Item 2</li>
</ul>
and it (silently, without any notification) 'corrected' it to the following:
<ul>
<li>Item 1
<ul></ul></li>
<li>SubItem 1</li>
<li>SubItem 2</li>
<li>Item 2</li></ul>
I pointed this out to the people who are setting up the new site for us, and they raised it as a support call with the CMS people, and got the following response:
"Could you please use the following schema:

<ul>
<li>Item 1</li>
<ul>
<li>SubItem 1</li>
<li>SubItem 2</li>
</ul>
<li>Item 2</li>
</ul>


Such syntax is formatted correctly."
If such syntax is formatted correctly, why doesn't it validate? I'm not even trying to be a validation Nazi about this (it's not as if anything that comes out of this CMS is ever going to validate anyway), it's more that I don't really want to have to 'correct' all our existing HTML to prevent it being 'corrected' by the CMS.

[identity profile] stephdairy.livejournal.com 2007-03-20 09:48 am (UTC)(link)
I don't see why their option shouldn't validate. Both it and your original version are valid HTML 4 (though only theirs is also valid XHTML).

Their CMS's behaviour is reminiscent of that of LJ's HTML "fixer" which spews out a load of close tags for any elements you may have forgotten to close at any point in the rest of the document...

I'd say this "miscorrection" is a bug in the CMS, unless the CMS is expecting to deal with XHTML.

(S)

[identity profile] bellinghman.livejournal.com 2007-03-20 09:53 am (UTC)(link)
Hmm, their example is valid according to what I see on the W3C site, but marked DEPRECATED.
  1. The </li> isn't required
  2. You can dump a sublist in without a <li>!
  3. lists will sometimes run backward!

[identity profile] barnacle.livejournal.com 2007-03-20 10:52 am (UTC)(link)
Their option is dead wrong. But your option may be failing because they might be doing DTD validation and I don't know enough DTD language to be sure that it's right. There's a condition that might mean that li elements can contain EITHER block content (div, p etc.) or inline content (plain text, span, label, b, i, em, a etc.) but not both.

http://www.w3.org/TR/html4/sgml/dtd.html

states:

<!ENTITY % flow "%block; | %inline;">
...
<!ELEMENT LI - O (%flow;)* ...

The dash and the O mean that the opening LI tag is required and the closing one is optional. The content model seems ambiguous depending on your SGML parser's behaviour and how it treats the binding there. As I say, I don't speak DTD or SGML well enough, but that could be interpreted as "any number of either-block-or-inline elements" or "either any number of block elements or any number of inline elements."
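
To make the two readings concrete, here is a rough sketch using regular expressions as a stand-in for the content model. The letters B and I, and the names MIXED, EXCLUSIVE and allowed, are made up for illustration and are not part of any DTD tooling. As far as I understand it, parameter entities are substituted textually, so the DTD line expands to (%block; | %inline;)*, which corresponds to the first reading.

```python
import re

# Toy analogy: 'B' stands for a block element, 'I' for an inline one.
# Reading 1: (%block; | %inline;)*  -> any mixture, in any order.
MIXED = re.compile(r"(?:B|I)*")
# Reading 2: (%block;)* | (%inline;)*  -> all block or all inline, never both.
EXCLUSIVE = re.compile(r"(?:B*|I*)")


def allowed(model, children):
    """Return True if the whole sequence of children matches the model."""
    return model.fullmatch(children) is not None


# An LI holding the text "Item 1" followed by a nested UL: inline, then block.
li_children = "IB"
print(allowed(MIXED, li_children))      # True
print(allowed(EXCLUSIVE, li_children))  # False
```

Under the first reading the original markup (text plus a nested UL inside one LI) is fine; only the second, stricter reading would reject it.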

The schema also may be tight: http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd creates a complexType called Flow which can be any one of four choices. I think choice means it has to be a single one of them, but then that one can occur multiple times. So multiple elements from the block or inline groups, but not a mixture.

Try wrapping everything within the LI in a DIV and see where that gets you.

[identity profile] jvvw.livejournal.com 2007-03-20 11:14 am (UTC)(link)
Sounds very much like a bug in the CMS, and like the CMS folk don't really know what they're talking about. I'd try to raise it as a bug through whatever channels there are, explaining that their syntax isn't valid HTML and citing the stuff you have here, and then just accept that you'll probably have to work around it. (What CMS are you using, out of curiosity?)

[identity profile] damned-colonial.livejournal.com 2007-03-20 11:56 am (UTC)(link)
Web 0.2! I have to use that! *writes in biro on back of hand*

[identity profile] ewx.livejournal.com 2007-03-20 11:58 am (UTC)(link)

You're right that their suggestion is broken; UL contains one or more LI but not another nested UL. (The same rule makes the empty UL illegal, too.) It sounds like their software has a completely broken model of UL.

Given the syntax they suggest is invalid, presumably browsers are allowed to do anything they like with it, so any possible choice of eventual displayed output would be "formatted correctly".
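
A minimal sketch of that rule, assuming Python's standard html.parser and checking only this one constraint (every direct child of UL or OL is an LI, and there is at least one). ListChecker and check are hypothetical names, and this is nowhere near a real validator.

```python
from html.parser import HTMLParser

LISTS = ("ul", "ol")


class ListChecker(HTMLParser):
    """Flag lists that are empty or have a direct child other than LI."""

    def __init__(self):
        super().__init__()
        self.stack = []    # open elements, innermost last
        self.counts = []   # LI children seen so far, one entry per open list
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # A new <li> implicitly closes an <li> that is still open.
        if tag == "li" and self.stack and self.stack[-1] == "li":
            self.stack.pop()
        parent = self.stack[-1] if self.stack else None
        if parent in LISTS:
            if tag == "li":
                self.counts[-1] += 1
            else:
                self.errors.append(f"<{tag}> directly inside <{parent}>")
        if tag in LISTS:
            self.counts.append(0)
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray close tag, ignore it
        # Pop up to and including the matching open element.
        while self.stack:
            top = self.stack.pop()
            if top in LISTS and self.counts.pop() == 0:
                self.errors.append(f"empty <{top}>")
            if top == tag:
                break


def check(html):
    """Return a list of list-structure problems found in the markup."""
    checker = ListChecker()
    checker.feed(html)
    checker.close()
    return checker.errors


original = ("<ul><li> Item 1<ul><li> SubItem 1</li><li> SubItem 2</li></ul>"
            "</li><li> Item 2</li></ul>")
cms_output = ("<ul><li>Item 1<ul></ul></li><li>SubItem 1</li>"
              "<li>SubItem 2</li><li>Item 2</li></ul>")
suggested = ("<ul><li>Item 1</li><ul><li>SubItem 1</li><li>SubItem 2</li></ul>"
             "<li>Item 2</li></ul>")

print(check(original))    # []
print(check(cms_output))  # ['empty <ul>']
print(check(suggested))   # ['<ul> directly inside <ul>']
```

Of the three snippets in the post, only the original passes this check.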

[personal profile] redbird 2007-03-20 12:07 pm (UTC)(link)
You could send them back a note saying, as politely as you can manage, that the syntax you're entering and the CMS is refusing is valid HTML, and that, as the validator shows, their suggested fix is not good HTML even though the CMS accepts it.

It may fit some local DTD, but HTML is not SGML, and you presumably are working on something that will be seen by outsiders and run through browsers that don't have your DTD.

[It's too early in the morning for me to find the politic phrasing; some mention of the fact that you know they didn't create the CMS might help.]