XHTML, custom attributes, and DTDs



beetle
12-11-2002, 11:10 PM
I was under the apparently mistaken impression that XHTML allowed (and would validate) custom attributes, such as

<li ignore="true">blah</li>

But it doesn't. I easily added this attribute to the XHTML1 DTD in the LI element's ATTLIST, but had to save the whole DTD to my domain just for this small change.

Is this how things are supposed to be?
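
For anyone wondering what such an ATTLIST change looks like, here is a minimal sketch rather than the actual XHTML DTD edit. It assumes a current PHP with the DOM extension, and it uses a tiny stand-in DTD (just ul and li) in the document's internal subset so the example stays self-contained. The helper name listValidates is made up for illustration. In the real case the same kind of ATTLIST line would go into a private copy of the XHTML 1.0 DTD, with the DOCTYPE pointing at that copy.

```php
<?php
// Hypothetical helper: builds a tiny list document whose internal DTD
// optionally declares the custom "ignore" attribute, then validates it.
function listValidates(bool $declareIgnore): bool
{
    // The declaration that makes <li ignore="true"> legal.
    $attlist = $declareIgnore
        ? '<!ATTLIST li ignore (true|false) #IMPLIED>'
        : '';

    $xml = '<?xml version="1.0"?>' . "\n"
         . '<!DOCTYPE ul [' . "\n"
         . '  <!ELEMENT ul (li+)>' . "\n"
         . '  <!ELEMENT li (#PCDATA)>' . "\n"
         . '  ' . $attlist . "\n"
         . ']>' . "\n"
         . '<ul><li ignore="true">blah</li></ul>';

    // Collect validity errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $valid = $doc->validate(); // validates against the document's own DTD

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $valid;
}

var_dump(listValidates(true));  // attribute declared in the DTD
var_dump(listValidates(false)); // attribute used but never declared
```

The point of the sketch is only that a validating parser accepts the custom attribute once the DTD declares it, and rejects it otherwise, which matches what the validator is doing with XHTML.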

krycek
12-12-2002, 12:30 AM
hmmm, interesting.

I have done a similar thing, i.e. use custom properties in my tags, however I have not implemented them on my latest sites - which use XHTML - hence I have not been aware of this validation problem before now.

I have no experience at all in writing DTDs, so could you post a quick tip on how to do it?

But I must say, thinking about it, it does make sense that you would have to add it to the DTD - after all, XHTML is supposed to be strict, and you are introducing an unknown. Therefore I would say that the validator is correct.

::] krycek [::

oracleguy
12-12-2002, 04:27 AM
Perhaps they can help you better in the XML forum. But as I recall you gotta use full-fledged XML to be able to do that. That is sorta what makes XML so cool, is that you can make custom attributes and tags.

brothercake
12-12-2002, 04:42 AM
I think that's basically it - when you add custom attributes to XHTML you have to think of it like XML, because that's what you're doing. Adding that attribute to a unique DTD on your server is, I think, exactly the right solution.

Although it does seem excessive that custom attributes can't be allowed, it makes sense conceptually.

Personally I tend to stick to either pure XML or good old HTML transitional

krycek
12-12-2002, 02:30 PM
Originally posted by brothercake
Personally I tend to stick to either pure XML or good old HTML transitional

Interesting, I was thinking about using XML recently, however as I understood it hardly any browser PROPERLY supports XML, e.g. with XSL and stuff.

So I just use XHTML instead.

Is it possible to make a standard webpage purely from XML and XSL (or CSS) and have it display correctly...? And would this be a good idea... or should I stick to XHTML.

I have used a little XML and XSLT but never really done much with DTDs. If the DTD beetle mentioned would just be an ordinary XML DTD then I reckon I could handle that, however although I agree with the W3C that the DTD should be necessary (see my previous post) I do think that it is a pity, too :)

::] krycek [::

beetle
12-12-2002, 02:35 PM
Here (http://www.lanwizards.com/test.xml) is a little test page I did that is just xml/xslt and like you said, doesn't have a lot of support in this form. IE6 renders it, Mozilla doesn't.

brothercake
12-12-2002, 04:16 PM
Originally posted by krycek
Is it possible to make a standard webpage purely from XML and XSL (or CSS) and have it display correctly...? And would this be a good idea... or should I stick to XHTML.

Client-side XSL transformation is a non-starter on the web, for exactly that compatibility reason.

I know of two useful approaches:

1 - for apache you can use PHP SAX to parse the XML and transform it into [X]HTML - that's what I've done here (http://www.mori.com/news.phtml) and it works very well.

2 - for IIS, you can link XML and XSL together in almost exactly the same way as you would with javascript, get the server to transform and then deliver [x]html to the client.


I've used the former approach partly because php/apache is already available to me, but mostly because it means I can parse/process and otherwise manipulate the XML data all in PHP - I don't know whether XPath has such sophisticated pre-processing functions.
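
As a rough illustration of the "transform on the server, deliver [X]HTML" idea, here is a minimal sketch. It is not the setup described above: it assumes a current PHP with the DOM and XSL extensions loaded (both postdate this thread), and the function name renderTitles, the sample data and the stylesheet are all made up for the example.

```php
<?php
// Sample data and stylesheet, inlined so the sketch is self-contained.
$xmlSource = <<<'XML'
<?xml version="1.0"?>
<pubinfo>
  <publication>
    <title>The Mood Of The Nation</title>
  </publication>
</pubinfo>
XML;

$xslSource = <<<'XSL'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/pubinfo">
    <ul>
      <xsl:for-each select="publication">
        <li><xsl:value-of select="title"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical helper: runs the stylesheet over the data on the server
// and returns the resulting markup as a string.
function renderTitles(string $xmlSource, string $xslSource): string
{
    $xml = new DOMDocument();
    $xml->loadXML($xmlSource);

    $xsl = new DOMDocument();
    $xsl->loadXML($xslSource);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($xsl);

    // transformToXml() returns the serialized result of the transformation.
    return (string) $processor->transformToXml($xml);
}

echo renderTitles($xmlSource, $xslSource);
```

The browser only ever sees the transformed markup, so client-side XSLT support stops mattering.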

jkd
12-12-2002, 04:37 PM
Originally posted by beetle
Here (http://www.lanwizards.com/test.xml) is a little test page I did that is just xml/xslt and like you said, doesn't have a lot of support in this form. IE6 renders it, Mozilla doesn't.

Perhaps you should use the correct type in the xml-stylesheet processing instruction?

<?xml-stylesheet type="text/xml" href="test.xsl"?>

IE's XSLT transformer is about on par with Mozilla's Transformiix module.

And only Mozilla supports the more useful abilities of XML, such as combining namespaces into a complex document. (MathML inside SVG inside XHTML for example).

beetle
12-12-2002, 04:42 PM
Well, PHP also supports Expat and Sablotron (if compiled to do so). Read about that here (http://www.php.net/manual/en/printwn/ref.xslt.php). I'm gonna recompile PHP sometime soon to mess around with these. Also, I'm pretty sure these are used by Randy (http://www.php-tools.de/site.php?&file=patXMLRendererOverview.xml), an interesting looking XML renderer from the pat people. As you can see, their entire site is in XML, as each page has a "Documented Source" link at the bottom that lets you view the XML. Pretty neat.

brothercake
12-12-2002, 05:23 PM
Expat is not very good IME - it can only handle small, simple xml documents.

Sablotron uses XSL, but my understanding is that it's not considered stable enough for production use.

beetle
12-12-2002, 05:46 PM
Thanks jkd, but I didn't really need a Mozilla 'commercial' :p

Ok, so text/xml is more proper? Does text/xsl only work for IE? I'm a bit confused....

Also, can someone give me the lowdown on Sax, expat, sablotron, xerces, and all this other hoopla? I've been told to use any number of these by different people for different reasons, and now it's all getting mixed in my head. Expat is too simple? Sablotron is not stable? If so, why use them? Lots of people (apparently) do.

And what about SAX, brothercake? I see only two SAX-related functions in PHP. Is there more to it than this? Got any source code you'd like to share? :D

Thank you all for your patience with my 'n00b-ness' on all this stuff :o

brothercake
12-12-2002, 06:42 PM
Well I'm passing on the gist of what I've read and heard over the last few months, as well as my own experience. I went with SAX in the end, because "Professional PHP 4" (Wrox press) recommended it over the others.

Btw - my previous post about Expat is wrong - expat is the parser library that comes bundled with apache (and is what PHP's XML functions are built on), rather than a separate approach. I was thinking of Prax.

I had a discussion with George about this quite recently http://www.codingforums.com/showthread.php?s=&threadid=9382

There are indeed only two parsing functions in SAX - to process open tags / close tags, and to process node data - but that's all you need. Here's a pretty comprehensive example:

I have an XML document of publications - it looks like this:



<publication cat="p" sub="p" sba="digest" sbax="political">
<title>The Mood Of The Nation</title>
<url>/polls/2002/t021119.shtml</url>
<date>25 November 2002</date>
<source>mori.com</source>
<author>Simon Atkinson</author>
</publication>


That's one example - there are several hundred of these <publication> nodes.

The attributes serve several purposes. Take the "sba" attribute as an example - there are also "sba2", "sba3" and so on, all of which are optional and used to associate publications with one or more business areas. For each business area publications list, I simply have to identify which category it belongs to, compare that with the attribute values, and output the ones which match.

Like this - in the actual .php page:



<?php

//category filter for this SBA
$catFilter = "advertising";

//require sba database creator
require ("../ssi/sba_database_creator.inc");

?>



And the 'database' creator (we use the word database in this context to mean "output list of publications") looks like this:



<?php


//special style to shift right cell of pdf articles
print "<style type=\"text/css\"><!--\n";
print ".shift { position:relative;left:-12px; }\n";
print "--></style>\n\n";



//output array
$output = array();
$keyCount = -1;



//remember which ones to show
$showOutput = array();



//process open tags
function processOpenTag($parser,$tagName,$attributes) {

global $catFilter,$currentTag,$output,$showOutput,$keyCount;

//remember current tag
$currentTag = $tagName;

//increase array key count
if($currentTag=="publication") {
$keyCount++;

//define optional attributes for sba category
if(!isset($attributes["sba"])) { $attributes["sba"]=""; }
if(!isset($attributes["sba2"])) { $attributes["sba2"]=""; }
if(!isset($attributes["sba3"])) { $attributes["sba3"]=""; }
if(!isset($attributes["sbax"])) { $attributes["sbax"]=""; }
if(!isset($attributes["sbax2"])) { $attributes["sbax2"]=""; }

//show output var
$showOutput[$keyCount] = false;

//if attribute value matches sba criteria
if (
$attributes["sba"] == $catFilter
||
$attributes["sba2"] == $catFilter
||
$attributes["sba3"] == $catFilter
||
$attributes["sbax"] == $catFilter
||
$attributes["sbax2"] == $catFilter
) {
$showOutput[$keyCount] = true;
}


}

}




//process closing tags
function processCloseTag($parser,$tagName) {

global $currentTag,$output,$showOutput,$keyCount;

//remember current tag
$currentTag = $tagName;

}





//process data between tags
function processTagData($parser,$tagData) {

global $currentTag,$output,$showOutput,$keyCount;

//strip tabs and line-breaks (preg_replace, because the ereg functions were removed in PHP 7)
$tagData = preg_replace("/[\r\n\t]+/","",$tagData);

//add tag data to output array if correct attribute digit was present
if($currentTag!="pubinfo" && $currentTag!="publication" && $showOutput[$keyCount]) {

//initialise array element
if(!isset($output[$keyCount][$currentTag])) {
$output[$keyCount][$currentTag]="";
}

//delimit author names (plain string replace, no regex needed)
if($currentTag=="author") {
$tagData = str_replace(",",", ",$tagData);
}

//add to array element
$output[$keyCount][$currentTag] .= $tagData;

}

}






//require xml parser
require("../ssi/xml_parser.inc");

//parse xml
parseXML("../pubinfo/articles.xml");




//sub-heading
print "<p class=\"mid-orange\"><br />Published papers and articles<br />\n\n";


//open table
print "<table cellpadding=\"0\" cellspacing=\"12\" border=\"0\">\n";


//list of pubinfo articles
foreach($output as $key => $data) {

//open row
print "<tr>";

//defaults for non-existent elements
if(!isset($data["title"])) { $data["title"] = ""; }
if(!isset($data["author"])) { $data["author"] = ""; }
if(!isset($data["source"])) { $data["source"] = ""; }
if(!isset($data["url"])) { $data["url"] = ""; }
if(!isset($data["date"])) { $data["date"] = ""; }
if(!isset($data["size"])) { $data["size"] = ""; }

//pdf icon cell
if($data["size"]!=""){
print "<td valign=\"top\"><img src=\"http://www.mori.com/pics/pdfsmall.gif\" vspace=\"1\" width=\"22\" height=\"20\" border=\"0\" align=\"left\" alt=\"pdf\"></td>";
print "<td valign=\"top\" class=\"shift\">";
}
else {
print "<td valign=\"top\" colspan=\"2\">";
}

//link and title
if($data["url"]!="") {
print "<a href=\"http://www.mori.com".$data["url"]."\"";
//pdf target
if($data["size"]!=""){
print " target=\"_blank\"";
}
print ">";
}
print $data["title"];
if($data["url"]!="") { print "</a>"; }

//date
if($data["date"]!="") { print " <span class=\"date\">[".$data["date"]."]</span>"; }

//pdf size
if($data["size"]!="") { print " <span class=\"pdf\">[pdf - ".$data["size"]."K]</span>"; }

//author
print " <span class=\"source\">".$data["author"]."</span>";

//close row
print "</td></tr>\n";


}


//close table
print "</table></p>\n";





?>



You'll see two more external dependencies in that - "articles.xml" is the data, while (here's the best bit) "xml_parser.inc" is just generic code which opens the file and associates the SAX callbacks with named functions.


<?php


//parse the XML file
function parseXML($fileName) {

//location of the xml file
$xmlFile = $fileName;

//initialise xml parser
$xmlParser = xml_parser_create();

//target encoding
$targetEncoding = xml_parser_get_option($xmlParser,XML_OPTION_TARGET_ENCODING);

//disable case folding
$caseFold = xml_parser_get_option($xmlParser,XML_OPTION_CASE_FOLDING);
if($caseFold == 1) {
xml_parser_set_option($xmlParser,XML_OPTION_CASE_FOLDING,false);
}

//set process functions
xml_set_element_handler($xmlParser,"processOpenTag","processCloseTag");
xml_set_character_data_handler($xmlParser,"processTagData");


//open xml file
if (!($fp = fopen($xmlFile,"r"))) {
//escape the page address before echoing it back
die ("<p><font color=red><b>Error: Cannot open XML file [$xmlFile]</b></font><p>If you are reading this message, please be kind enough to contact <a href=\"mailto:webmaster@mori.com\">webmaster@mori.com</a> and inform us of the error, including the address of this page: <i>http://www.mori.com".htmlspecialchars($_SERVER["PHP_SELF"]."?".$_SERVER["QUERY_STRING"])."</i></p>");
}

//read and parse data
while ($xmlData = fread($fp,4096)) {

if (!xml_parse($xmlParser,$xmlData,feof($fp))) {

//die on parsing error
//the page address is escaped and passed as an argument, so a stray % in it cannot corrupt the format string
die(
sprintf(
"<p><font color=red><b>XML error: %s at line %d</b></font>"
.
"<p>If you are reading this message, please be kind enough to contact <a href=\"mailto:webmaster@mori.com\">webmaster@mori.com</a> and inform us of the error, including the address of this page: <i>http://www.mori.com%s</i></p>",
xml_error_string(xml_get_error_code($xmlParser)),
xml_get_current_line_number($xmlParser),
htmlspecialchars($_SERVER["PHP_SELF"]."?".$_SERVER["QUERY_STRING"])
)
);
}

}
//close the file and free up the parser
fclose($fp);
xml_parser_free($xmlParser);

}




?>
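
For comparison, the same handler pattern can be condensed into a single function with no globals. This is only a sketch, assuming a current PHP with the XML extension. The function name titlesForCategory is hypothetical, and it collects only titles instead of building the full output table.

```php
<?php
// Hypothetical helper: returns the <title> text of every <publication>
// whose sba* attributes include the given category.
function titlesForCategory(string $xml, string $filter): array
{
    $titles  = [];
    $current = '';    // name of the element currently open
    $show    = false; // does the current publication match the filter?

    $parser = xml_parser_create();
    // Keep element and attribute names in their original case.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        // Open-tag callback
        function ($p, $name, $attrs) use (&$titles, &$current, &$show, $filter) {
            $current = $name;
            if ($name === 'publication') {
                $show = false;
                foreach ($attrs as $key => $value) {
                    // Only sba, sba2, sbax, ... count as business areas.
                    if (strncmp($key, 'sba', 3) === 0 && $value === $filter) {
                        $show = true;
                    }
                }
            } elseif ($name === 'title' && $show) {
                $titles[] = ''; // start a new entry
            }
        },
        // Close-tag callback
        function ($p, $name) use (&$current) {
            $current = '';
        }
    );

    // Character data can arrive in several chunks, so append each one.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$titles, &$current, &$show) {
            if ($show && $current === 'title') {
                $titles[count($titles) - 1] .= $data;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }

    return $titles;
}

$xml = '<pubinfo>'
     . '<publication cat="p" sba="digest" sbax="political">'
     . '<title>The Mood Of The Nation</title></publication>'
     . '<publication cat="p" sba="advertising">'
     . '<title>Ad &amp; Spend</title></publication>'
     . '</pubinfo>';

print_r(titlesForCategory($xml, 'political'));
```

Clearing the current tag name on close keeps whitespace between elements from being appended to the previous element's text.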



The end result of all this is here (http://www.mori.com/advertising/database.phtml) - it doesn't look like much, but there are another 25 odd pages - different business areas - which all share the same, single data source. There's even stuff on our intranet - which uses XSL to parse the XML - a completely different process on a different server platform - but the data still comes from the same document.

And .. I've written another thing which uses the same XML again, to build a search/sort application for all publications. That's here (http://www.mori.com/pubinfo/articles.phtml)

Sheesh! I'm giving away good stuff here ;)


