Style for the Web
CSS and XSLT
Henry S. Thompson
HCRC Language Technology Group
Division of Informatics
University of Edinburgh
© 2003 Henry S. Thompson

Overview of the material
- Why a style language?
- Style control for HTML
- Two approaches to style for XML:
  - CSS for simple cases
  - XSL for complex cases
- [note marker] When you see this, it means there’s accompanying information in the Additional Notes

Why a style language?
- Separating form from content
- Separating structure from appearance
- Single source, multiple delivery media

Three stages on the way
- Document Compilers: ASCII text with formatting instructions and body text intermixed
  - nroff, Scribe, TeX
- WYSIWYG Word Processors: Out-of-band formatting instructions change appearance on-screen; proprietary file formats.
  - Word, Word Perfect
- (Semi-)Structured Markup: Markup has either intrinsic or extrinsic rendering consequences.
  - SGML, HTML, XML

Is this progress?
The old document compilers
- had complex procedural semantics, which made debugging and maintenance very tricky for documents of any sophistication.
- made authoring and reading tedious, with obtrusive annotations everywhere.
The use of scoped annotations in Scribe and TeX was a big improvement over _roff, but the annotations were still resolutely about appearance, not structure.
LaTeX tried to fix this, but paid an unacceptable price in terms of complexity and fragility.

Is this progress?, cont'd
The WYSIWYG systems
- are lovely to look at, and there's no problem with obtrusive annotations.
- but even with the addition of paragraph and character styles, generalisation and consistency are hard to come by.
- and there's the built-in obsolescence of proprietary formats to worry about.

SGML . . .
- SGML solved the proprietary format problem
  - It's an ISO standard (8879)
  - It's human-readable (and understandable!)
- But for a long time there was no standard way of formatting SGML documents for printing or viewing

. . .
and HTML
- So HTML (nearly/post-hoc an SGML application), by mandating a rendering semantics for all its semi-structural markup, filled a real need.
- But it was
  - not extensible (fixed tag set)
  - not customisable (fixed appearance per tag)

Three Problems; Three Solutions: Electronic Style!
- Style standard for SGML?
  - DSSSL
- Customise HTML page appearance?
  - CSS
- Extend HTML tag-set and control style?
  - XML and CSS
  - XSL

Technology Appraisals, Henry S. Thompson, Style for XML, London 1998-11-25

Cascading Style Sheets
- Level 1 Accepted Recommendation per W3C, December 1996
- Level 2 Accepted Recommendation, May 1998
- Addresses the problems of:
  - customising the appearance of HTML documents
  - minimal styling for XML
- Initially driven by the need for site designers to differentiate the appearance of their pages from one another
- Focus accordingly is on controlling the colour, size and shape of regions and fonts
- A pretty CSS example

CSS
HTML example:
    <HTML>
    <HEAD>
    <TITLE>Example file</TITLE>
    </HEAD>
    <BODY>
    <H2>Example text</H2>
    <P>Here is some text. It's a paragraph of text in fact. But with very little content. Pretty boring if you ask me.
    <P>And some more text. Again with very little content. All marked up in vanilla HTML.
    </BODY>
    </HTML>

CSS
You can change the way HTML tags are rendered:
    <HTML>
    <HEAD>
    <TITLE>An example file</TITLE>
    <STYLE>
    H2 {font-family: Times, "Times New Roman", serif; color: blue; text-align: center;}
    P {font-family: Helvetica, sans-serif; font-style: italic; font-weight: lighter; font-size: medium; background-color: yellow; border-width: thin; border-style: solid; border-color: red}
    </STYLE>
    </HEAD>
    <BODY>
    ...

CSS rules
- CSS style rules associate properties with elements in your documents which match selectors
- The basic structure of a rule looks like this:
    selector[, selector ...]
    {pname: pvalue[; pname: pvalue ...]}
- Simple examples:
    verbatim {white-space: pre}
    H1 {text-align: center; font-variant: small-caps}
- The first would provide style for an XML document
- The second would change HTML's H1

CSS: Cascading Style Sheets
- Customising HTML: formatting <P> elements by means of a simple instruction:
    P {font-weight: bold; font-size: 14pt; font-family: sans-serif}
- Formatting XML: formatting <foobar> elements by means of a similar instruction:
    foobar {display: block; border-style: solid; background-color: green}

Associating rules with documents
- Contents of STYLE element in the HTML header
- Destination of an appropriate LINK element
- In STYLE attributes on any HTML element

CSS selectors
- Rules can have one or more selectors, separated with commas
- Simple names select elements by name
- In addition to element type names, other selector syntax includes
  - Space-separated lists, indicating (not necessarily immediate) ancestry
  - Qualification with period or hash, indicating class or id attribute matching
  - Qualification with colon (pseudo-classes), for link state and typographic sensitivity

CSS selectors: Vertical context
- Sometimes you need context-sensitive selectors
- For depth-sensitive rendering
    OL {list-style-type: lower-alpha}
    OL OL {list-style-type: lower-roman}
- For context-appropriate rendering
    H1 {font-weight: bold; font-size: large}
    H2 {font-weight: bold; font-style: italic}
    H3 {font-style: italic}
    H2 EM, H3 EM {font-style: normal}
- Note that in the last rule we have two selectors, separated by a comma, sharing the same result

CSS boxes
- CSS uses a nested-boxes rendering model, and every block element is rendered into a box
- Boxes all have margins, borders and padding (outside in)
- All four margins and paddings (left-, right-, top-, bottom-) have width properties, and a shorthand property for setting them all together

CSS borders
- Borders, in addition to widths, have colours and styles, plus shorthand properties for various combinations
- There are also float and clear properties to allow a modest amount of displacement and
  flow-around.
- CSS2 goes a lot further with this

    P { margin: 3ex; border-width: thin; border-style: solid; border-left: double; text-align: justify; border-color: blue; padding: 2ex 4ex }
gives the following for a sample paragraph
CSS box example

CSS property values
- Some are symbolic, e.g. font-style: italic
- URLs appear in a few places, e.g. background-image: url(http://www...)
- Most are
  - lengths, e.g. 3em, 2px
  - percentages, e.g. 110%
  - numbers
  - colours, e.g. red, #fd0

The 'Cascade' in CSS
- What happens when there is more than one rule which provides a value for a property on a given element?
  - The highest priority value assignment wins
- When no assignment is found, the value is either inherited or defaulted
  - This explains why our original H1 example was bold

CSS priority
- A number of things contribute to determining priority
  - Origin, in increasing order of importance
    - browser
    - user
    - author
  - Specificity, in increasing order of importance
    - Number of element types
    - Number of CLASS selectors
    - Number of ID selectors
  - Importance, marked with !important

CSS cascade example
The following are in increasing order of priority
    LI
    UL LI
    UL OL LI
    LI.special
    OL LI.special
    #hotone

CSS for XML for real
- In principle, it's easy
  - Just use your own element type names instead of HTML's
- In practice
  - IE5 and Netscape 6 support it
  - Lack of complete support will continue to be a problem
- Style sheet linkage is via a PI
    <?xml-stylesheet type="text/css" href="…"?>

CSS: Summary
- Easy to learn
- Also useful for HTML
- Works in Internet Explorer since IE5, Netscape since Netscape 6, Mozilla, Opera
- Limited by the single-tree constraint (see below)

What is DSSSL?
- An ISO standard (ISO 10179:1996)
- A style language
  - How do I format my SGML documents?
- A transformation language
  - How do I transform my SGML documents?
- A hopeless acronym
  - Document Style Semantics and Specification Language
- A lost opportunity!
  - Sunk by webhead round-paren allergies

XSL: Extensible Stylesheet Language
- A style language specifically for XML
- W3C Recommendation, Nov 1999 (XSLT 1.0)
- Synthesis of the best of CSS and DSSSL
  - DSSSL processing
    and formatting models
  - CSS properties
- XSL is XML
- A declarative specification of both the "pattern" and the "action" of template rules.
- More generic than CSS
  - style and rendering are just a special case of more general tree transformation processes
  - can be used for other transformations (XSLT)

XSL: Extensible Stylesheet Language
Main properties
- localised (template rules keyed to elements)
- scoped (inheritance of general characteristics)
- unbiased (with respect to writing direction, language,…)
  - “indent paragraph” will mean
    - “indent on the left” for English
    - “indent on the right” for Hebrew
  - “inline” will mean
    - left to right for English
    - top to bottom for some Japanese text

What is XSL for?
- Portable standard style specification
- Single source documents, multiple delivery media
  - Print
  - Presentation
  - Screen
- Multiple document types, single house style
- Just as much complexity as you need
  - Single continuous scroll for screen delivery
  - Multi-column pages with side bars etc. for books

What is XSL not for?
- Controlling filling and line breaking: left to a low-level formatting engine
- Page or line fidelity: use a photocopier!
- Carefully crafted page layout: emphasis is on automatable processes
- User interaction: ditto

XSLT process architecture
- CSS takes a document tree and decorates it with formatting properties
- XSL takes a document tree and builds a new document tree which it then decorates
- XSL is really two languages
  - a transformation language (XSLT)
  - a formatting language (XSL-FO)

XSLT Transformations
- XSLT style sheet: template rules
  - pattern which specifies which tree it applies to
  - result which specifies which tree it should output
- XSLT processor
  - reads XML document and XSL stylesheet
  - carries out the instructions in the stylesheet
  - outputs a new XML document

XSLT Transformations
- From XML to XML
  - not from/to PDF, TeX, Word, Postscript,…
  - you can go from XML to an intermediate language, and then use a different processor to get to TeX, Word,…
  - you can go from/to HTML or SGML if it’s well-formed XML

XSLT Processing
Three places it can happen
- Web browser (e.g.
  IE5) is handed an XML document and stylesheet, transforms the document and presents it to the user
- The server applies the style sheet to the document to create a different format (e.g. HTML) and sends that document to the client
  http://www.alphaworks.ibm.com/tech/LotusXSL
- A program (e.g. xt) transforms the XML document before it is placed on the server
  http://www.jclark.com

Key style concepts
- Modular
  - Structuring of specs at the XML level
- Localised
  - template rules keyed to elements
- Scoped
  - Generic characteristics are inherited
- Unbiased
  - No expectations regarding national language, writing direction or character set

Process architecture
- CSS takes a document tree and decorates it with formatting properties
- XSLT takes a (source) document tree and builds a new (result) document tree
- If the result tree's vocabulary defines appearance, then XSLT can be a style language
  - (X)HTML is an obvious candidate
  - XSL-FO is another, for high-quality print

XSLT is XML
- No parentheses!
- XSLT is notated with XML element types
- DSSSL semantics without DSSSL syntax
- But you can think of it more like specifying a transformation from one DTD (the source document) to another (for a formatting object specification document)

Template rules
- The main component of an XSL stylesheet is the template rule
- Each template rule contains
  - A pattern, identifying the source document elements which the rule should apply to
    - Like a CSS selector
  - A template, specifying what gets added to the result tree when this rule fires
    - Crucially, and unlike CSS, this typically involves elements

Simple rule example
    <xsl:template match='div/title'>
      <fo:block font-weight='bold'>
        <xsl:apply-templates/>
      </fo:block>
    </xsl:template>
[Figure: source tree div > title > "Thes. . ." alongside result Block [font-weight: bold] > "Thes. . ."; callouts on the rule read: Pattern; Restriction on match context; The element type to match; The formatting object to be created; The content of the formatting object: use the subordinate results]

XSL and CSS
- We could try to translate our example into CSS as follows:
    div title { font-weight: bold }
- But that would actually be wrong:
  - The interpretation of nesting is different: The XSL pattern matches title elements with div as parent, where the CSS pattern matches title elements with div as ancestor at any remove.
  - XSL does not require a one-to-one relation between source and destination

Richer patterns
- XSL can restrict matches based on
  - ancestry
  - descendants
  - position wrt siblings
  - attribute presence/absence/value
- These are expressed in the form of path expressions, which are shared with the draft XPointer proposal
- The common part is called XPath

XPath patterns
- / for (root's) children
- // for (root's) descendants
- .. for parent
- name for matching elements
- @name for matching attributes
- [. . .] for conditions
- =, != for (in)equality
- Numerical and boolean expressions
- String and number literals
- Special-purpose functions
  - not(…), position(), last()

Specificity
- With all these pattern variants, what happens if two rules match?
- Drawing on both DSSSL and CSS, there is a set of precedence rules
- Basically, the richer the pattern, the higher the precedence
- If all else fails, there is a numeric priority attribute

Iconic actions
- The 'action' part of a rule isn't much like an action at all
- It's more like a picture of what you want in the way of formatting objects
- Nesting is specified directly
- So you can build up quite detailed formatting object structures
- The special xsl:apply-templates element type determines where the formatting objects resulting from processing the children of the matched node should be plugged in

Action example
This 'action' builds a rich result structure
    <p>
      <span style='font-size: 150%'>
        <xsl:value-of select='@name'/>
        <xsl:text>. . . .</xsl:text>
      </span>
      <em>
        <xsl:apply-templates/>
      </em>
    </p>

Plugging in results
    <xsl:template match='demo'>
      <HTML>
        <BODY>
          <xsl:apply-templates/>
        </BODY>
      </HTML>
    </xsl:template>

    <xsl:template match='para'>
      <P>
        <xsl:apply-templates/>
      </P>
    </xsl:template>
[Figure: source tree demo > para ("Thef. . ."), para ("Thes. . ."); result tree HTML > BODY > P ("Thef. . ."), P ("Thes. . .")]

Combining XSLT and CSS
- Add a <style> element to the template for the root
    <xsl:template match="/">
      . . .
      <head>
        <style>
          SPAN.price {color: red}
        </style>
      . . .
    </xsl:template>

Selection
- You may not always want to just invoke processing on a node's children in the ordinary way
  - You can supply a select attribute on xsl:apply-templates to specify what you want processed
- If all you want is the text content of an element or attribute as such
  - Use the xsl:value-of element instead

Example of select
    <xsl:template match='/'>
      <HTML>
        <HEAD>
          <TITLE>
            <xsl:value-of select='doc/title'/>
          </TITLE>
        </HEAD>
        <BODY>
          <xsl:apply-templates/>
        </BODY>
      </HTML>
    </xsl:template>

Reordering using select
- You may not even want material to appear in the output in the same order it appears in the source, e.g.
  if the source was derived from a database
- select can be used to reorder by pulling out first one child type, then another, etc.
    <xsl:apply-templates select='a'/>
    <xsl:apply-templates select='b'/>
- All a's will end up before all b's, regardless of where they started
- xsl:sort provides more detailed control

Defaults
- XSL has two default rules, similar to DSSSL's
  - For character nodes, build a character formatting object
  - For all others, a sequence of the results of processing all children

XSLT for Transformation as such
- XML to XML can be very useful
  - DTDs change
  - Documents can be merged
- Sophisticated applications can be built by combining multiple XSLT-implemented transformations

The identity transformation
- The core of every serious transformation
    <xsl:template match="@*|*|comment()|processing-instruction()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>

Variables
- XSLT is a pure functional language
  - No state
  - No side-effects
- You can bind variables
    <xsl:variable name="currencySymbol"> £ </xsl:variable>
    <xsl:variable name="title" select="/catalog/title"/>
- And access them
    <xsl:value-of select="$currencySymbol"/>

Combining several documents
- The document function allows access within a stylesheet to named other documents
- If bound to a variable, can then be used as the starting point for a search
    <xsl:variable name="catalog" select="document('exa15.xml')/*"/>
    . . .
    <xsl:value-of select="$catalog/entry[number='E102']/price"/>

Implementations of XSL
- James Clark has implemented most of XSLT
- Lotus, IBM and others have done so as well
- The best Java implementation in my view is Michael Kay's Saxon (http://users.iclway.co.uk/mhkay/saxon/)
- IE5+ with the MSXML4 product (http://msdn.microsoft.com/xml/) supports the whole language
  - And it's much faster than the Java implementations
- Others are implementing subsets of the formatting semantics
- Usage is increasing very rapidly

Recall: XSL Transformations
Three places it can happen
- Web browser (e.g.
  IE5) is handed an XML document and stylesheet, transforms the document and presents it to the user
  - that is what we have been doing so far
- The server applies the style sheet to the document to create a different format (e.g. HTML) and sends that document to the client
  http://www.alphaworks.ibm.com/tech/
- A program (e.g. xt) transforms the XML document before it is placed on the server
  http://www.jclark.com/
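
To make the three options above concrete, here is a minimal, self-contained stylesheet sketch that any of them could run. It reuses the demo and para rules from the "Plugging in results" slide and the embedded style idea from "Combining XSLT and CSS". The input vocabulary (a demo root holding para children), the title text and the CSS rule are assumptions made for illustration, not taken from a real document type.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes a source document of the hypothetical form
     <demo><para>...</para><para>...</para></demo> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the processor to serialise the result tree as HTML -->
  <xsl:output method="html"/>

  <!-- demo element: build the HTML skeleton, embed CSS for the result,
       then plug in whatever processing the children produces -->
  <xsl:template match="demo">
    <HTML>
      <HEAD>
        <TITLE>Demo</TITLE>
        <!-- Illustrative CSS rule; braces in element content are literal text -->
        <STYLE>P {font-family: sans-serif}</STYLE>
      </HEAD>
      <BODY>
        <xsl:apply-templates/>
      </BODY>
    </HTML>
  </xsl:template>

  <!-- Each para becomes a P; the built-in rule for text nodes
       copies the characters through -->
  <xsl:template match="para">
    <P>
      <xsl:apply-templates/>
    </P>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a two-paragraph demo document, this should yield an HTML page with one P per para, styled by the embedded rule. The same stylesheet serves whether the transformation runs in the browser, on the server, or as a batch step before publishing; only the place where the processor is invoked changes.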