STC - Society for Technical Communication

Orlando Chapter STC
Professional Development

Notes from 50th International STC Conference
Dallas, Texas, May 18-21, 2003

XML for Technical Communicators

Neil Perlin
Hyper/Word Services

Neil Perlin, with Hyper/Word Services, is a "bleeding edge" expert in HTML, XML, and online publishing.

Session Description: This session was designed to familiarize technical communicators with XML and its capabilities, not to teach XML coding.

  • Preview: Overview, XML core components, support, resources.
  • Overview
     
    • XML = Extensible Markup Language: a metalanguage for creating other markup languages. Unlike in HTML, you create your own tags. XML derives from SGML, as HTML does. The ability to customize tags is a key differentiator for XML.
    • SGML = standard generalized markup language
    • HTML = hypertext markup language: the basis for the Web
  • Benefits of XML: Avoids HTML's problems and limitations
     
    • Stricter adherence to syntax: you choose how strict to be, from merely well-formed to valid against a DTD or schema
    • Extensibility lets you create custom tags
    • Content-focus ignores formatting or handles it elsewhere
    • More robust linking, eventually
    • Code can be reused and designed for reuse
    • Seems to be the path to single-sourcing
    • Theoretically, at least, XML is universally compatible with all platforms, via conversions known as transforms that turn it into HTML, Microsoft Reader, WML, etc.
  • XML examples
     
    • WML (wireless ML): XML'ified HDML: for cell phones
    • MathML: tag-based equation editing in place of MathCAD screen captures
    • WSDL (Web Services Description Language, published by the World Wide Web Consortium [W3C]): for describing the services listed in Web Services registries.
    • UDDI = universal description, discovery and integration of web services
    • MAML (MicroArray ML): stores DNA-array test data
    • NewsML: from Reuters
  • What about HTML?
     
    • Officially dead: now XHTML, an XML application. XHTML is HTML "done right"... following XML syntax requirements.
    • Realistically, very much alive: few current sites need XML's power; therefore, HTML will be used for years
  • XML core components
     
    • XML: content
    • DTD: syntax
    • CSS or XSL (rendering, formatting)
  • Content: element structure and tagging
     
    • Content must be structured to be effective
    • Content must be tagged
  • Elements
     
    • HTML tags act like on/off switches
    • XML "elements" are units consisting of a start tag, content, and an end tag
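To make the "start tag, content, end tag" idea concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The `step` element and its `number` attribute are invented for illustration and do not come from the session.

```python
import xml.etree.ElementTree as ET

# A hypothetical custom element: start tag, content, end tag.
source = '<step number="1">Open the File menu.</step>'

element = ET.fromstring(source)

print(element.tag)     # the element name taken from the start tag
print(element.attrib)  # attributes declared in the start tag
print(element.text)    # the content between the start and end tags
```

The parser treats the whole unit as one element, which is the difference from thinking of tags as independent on/off switches.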
  • Syntax standards: XML documents must meet one of two syntax standards:
     
    • Well-formed: the basic standard. Document must meet minimum standard criteria
    • Valid: document must be well-formed and adhere to a DTD or schema
  • Well-formed criteria include:
     
    • Proper element nesting (e.g., bold-on, italics-on, italics-off, bold-off; not bold-on, italics-on, bold-off, italics-off). Note: To display literal "<" and ">" signs without having them read as markup, you can use the numeric character references "&#60;" and "&#62;", use the entity references "&lt;" and "&gt;", or wrap the text in a CDATA section
    • All elements have a start and end tag with matching capitalization
    • Attribute values are in single or double quotes
    • Empty elements need either an end tag or a self-closing start tag (e.g., <br></br> or <br/>).
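The criteria above can be tried out with any XML parser. The following is a small sketch, assuming Python's standard xml.etree.ElementTree parser; the sample strings and the `is_well_formed` helper are illustrative, not part of the session.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "proper nesting": "<p><b><i>text</i></b></p>",
    "crossed nesting": "<p><b><i>text</b></i></p>",
    "mismatched case": "<P>text</p>",
    "unquoted attribute": "<p class=note>text</p>",
    "quoted attribute": "<p class='note'>text</p>",
    "unclosed empty element": "<p>line<br>line</p>",
    "closed empty element": "<p>line<br/>line</p>",
    "raw less-than sign": "<p>a < b</p>",
    # escape() rewrites "<" as "&lt;" so it is no longer read as markup
    "escaped less-than sign": "<p>" + escape("a < b") + "</p>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'NOT well-formed'}")
```

Each failing sample breaks exactly one of the listed rules, so the output doubles as a quick checklist.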
  • Why "well-formed" matters
     
    • Guarantees the document's syntax before sending it to an application
    • A clean syntax guarantee = less ambiguity to resolve = faster processing and smaller browsers (a "well-formed" violation is a fatal error)
    • XML parsers give you error messages that can be difficult to figure out, but they do generally give you a line # and a character #
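As an illustration of the line and character numbers a parser reports, here is a short sketch that again assumes Python's xml.etree.ElementTree. The memo fragment, with its deliberately mistyped end tag, is made up.

```python
import xml.etree.ElementTree as ET

# The end tag on the third line is mistyped ("form" instead of "from").
broken = "<memo>\n  <to>Staff</to>\n  <from>Editor</form>\n</memo>"

error = None
try:
    ET.fromstring(broken)
except ET.ParseError as exc:
    error = exc
    line, column = exc.position  # where the parser gave up
    print(f"Parse failed: {exc}")
    print(f"Look at line {line}, column {column}")
```

The message itself can be terse, but the position is usually enough to find the offending tag.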
  • Valid: to be valid, a document must be well-formed and adhere to a DTD or schema
     
    • DTD = document type definition.
    • DTD specifies elements in the document, their structural requirements, whether they're mandatory or optional.
    • A DTD effectively specifies the document's grammatical rules.
  • Example of DTD's structural model
     
    • A memo consists of a head and a body: the head consists of a date, from, to, cc, and title (the date consists of..., the from consists of..., etc.); the body consists of one or more paragraphs (a paragraph consists of...)
    • Notice the structure...
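The memo structure above might be written down roughly as follows. This is a sketch under assumptions: the element names (`memo`, `head`, `body`, `para`, and so on) are guesses at what such a DTD would use, the declarations for the lower-level elements are left out because the notes elide them, and the checker is a hand-rolled stand-in in Python, not a real DTD validator.

```python
import xml.etree.ElementTree as ET

# What the top of a memo DTD might look like (illustrative only; the
# declarations for date, from, to, cc, title, and para are omitted).
MEMO_DTD = """
<!ELEMENT memo (head, body)>
<!ELEMENT head (date, from, to, cc, title)>
<!ELEMENT body (para+)>
"""

# The same content model as plain Python data: parent -> required children,
# in order. A trailing "+" means "one or more", as in DTD syntax.
CONTENT_MODEL = {
    "memo": ["head", "body"],
    "head": ["date", "from", "to", "cc", "title"],
    "body": ["para+"],
}

def follows_model(element: ET.Element) -> bool:
    """Check an element and its descendants against CONTENT_MODEL."""
    model = CONTENT_MODEL.get(element.tag)
    if model is None:           # no rule recorded for this element: accept it
        return True
    children = [child.tag for child in element]
    position = 0
    for item in model:
        name = item.rstrip("+")
        if position >= len(children) or children[position] != name:
            return False        # a required child is missing or out of order
        position += 1
        if item.endswith("+"):  # consume any further repeats of this child
            while position < len(children) and children[position] == name:
                position += 1
    if position != len(children):
        return False            # extra children the model does not allow
    return all(follows_model(child) for child in element)

good = ET.fromstring(
    "<memo><head><date>2003-05-19</date><from>Editor</from><to>Staff</to>"
    "<cc>Files</cc><title>XML</title></head>"
    "<body><para>One.</para><para>Two.</para></body></memo>"
)
bad = ET.fromstring("<memo><body><para>One.</para></body></memo>")  # no head

print(follows_model(good), follows_model(bad))
```

A validating parser does this kind of check for you once it is given the DTD; the point here is only that the rules describe structure, not appearance.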
  • Hand coding is more "fun" and, with expertise, opens up more options, but WYSIWYG tools like RoboHelp, Dreamweaver, FrontPage, etc., avoid typos in code, resulting in far fewer code errors. WYSIWYG tools for XML are in the making. Use them.
  • XML Notepad is a free, basic XML writer. Good for practice; not sufficiently robust for actual documentation.
  • Why use a DTD?
     
    • "Well-formed" means the document meets a minimum standard set of rules
    • A DTD lets you define your own rules and languages and make sure the XML content adheres to them (WML, MAML, etc.)
  • In the DTD, you define what content each element must contain, not its format.
  • As you tunnel down through the content definitions, you eventually reach the remaining elements, which hold #PCDATA: parsed character data, which the processor checks for entities and markup characters.
  • Schemas
     
    • The next step beyond DTDs
    • More powerful and extensible than DTDs
    • Schemas are themselves XML documents, written in XML syntax, unlike DTDs. They support data types such as date and currency that don't apply to doc but are vital for e-commerce
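Because a schema is itself XML, it can be read with the same tools as any other XML document. The sketch below assumes Python's xml.etree.ElementTree and a made-up order schema; the element names are hypothetical, and the money amount is modeled as `xs:decimal`, since the W3C XML Schema built-in types include dates and decimals but no dedicated currency type.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A hypothetical schema for an order record; element names are illustrative.
schema_text = f"""
<xs:schema xmlns:xs="{XSD_NS}">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderDate" type="xs:date"/>
        <xs:element name="total" type="xs:decimal"/>
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(schema_text)  # parses like any other XML document

# Collect every element declaration that names a data type.
declared_types = {
    el.get("name"): el.get("type")
    for el in schema.iter(f"{{{XSD_NS}}}element")
    if el.get("type") is not None
}
print(declared_types)
```

Note that this only reads the schema; checking a document against it requires a validating tool, which the Python standard library does not provide.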
  • DTDs vs schemas for technical communication
     
    • Why use schemas?
       
      • More powerful than DTDs
      • The wave of the future: growing support by tools and by Office 2003
    • Why use DTDs?
       
      • Wider tool support
      • More examples available for use and reference
      • Wider pool of experienced developers
      • Well-suited for doc
  • Rendering (display)
     
    • HTML was designed to be displayed by a browser
    • But XML uses custom tags that a browser doesn't know how to display; legibility therefore requires applying styles (Cascading Style Sheets [CSS] or Extensible Stylesheet Language [XSL])
  • Cascading style sheets: work the same in XML as they do in HTML
     
    • HTML's equivalent of a Word template
    • "Cascading" refers to three types of style sheets that affect the display of the document:
       
      • Linked: a separate text file with a .css extension, linked to from each document
      • Embedded: stored in each document to which the styles apply
      • Inline: a "local" setting
    • CSS was designed for HTML but works fine under XML. Rather than create an XSL style sheet, you can create a simpler CSS and attach it to an XML document with an xml-stylesheet processing instruction
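Attaching a CSS to an XML document is done with an xml-stylesheet processing instruction placed near the top of the file. The sketch below, assuming Python's xml.dom.minidom, builds such a document and reads the instruction back; `memo.css` and the memo content are hypothetical.

```python
from xml.dom import minidom

# memo.css is a hypothetical style sheet file name.
document_text = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/css" href="memo.css"?>\n'
    '<memo><title>XML for Technical Communicators</title></memo>'
)

dom = minidom.parseString(document_text)

# Processing instructions sit at the top level, beside the root element.
stylesheet_links = [
    node.data
    for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    and node.target == "xml-stylesheet"
]
print(stylesheet_links)
```

A browser that supports CSS for XML reads the same instruction to find the style sheet.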
  • Extensible Stylesheet Language (XSL)
     
    • Comes from DSSSL, the SGML style language derived from LISP
    • An XSL sheet is well-formed XML
    • Supports a style sheet DTD for validation
    • Offers far greater processing ability than CSS: XML transforms (XSLT) can transform XML documents to other formats, such as HTML or WML. This is why XML appears to be the route to single-sourcing.
    • Browser support: pick up table
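The idea of a transform can be sketched without XSLT itself. The following Python example is a stand-in, not XSLT: it uses xml.etree.ElementTree to rename custom elements to HTML ones, with a made-up `TAG_MAP` and memo. A real XSLT style sheet would express the same mapping as template rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from custom XML elements to HTML elements.
TAG_MAP = {"memo": "body", "title": "h1", "para": "p"}

def to_html(source: ET.Element) -> ET.Element:
    """Copy the tree, renaming each element per TAG_MAP (default: div)."""
    target = ET.Element(TAG_MAP.get(source.tag, "div"))
    target.text = source.text
    target.tail = source.tail
    for child in source:
        target.append(to_html(child))
    return target

memo = ET.fromstring(
    "<memo><title>Release notes</title>"
    "<para>Version 2 ships Friday.</para></memo>"
)

html = ET.Element("html")
html.append(to_html(memo))
result = ET.tostring(html, encoding="unicode")
print(result)
```

Swapping in a different mapping (say, to WML elements) would produce a second output format from the same source, which is the single-sourcing argument in miniature.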
  • How XML will affect us
     
    • Help and Web work began at the code level and evolved to the GUI level
    • XML will follow the same path, but faster, because we now "know" the model
    • This suggests several things: most of us will write in XML, not code it; GUI tools will soon predominate; code skills will become less important but are always useful, especially for independents
  • XML Authoring Tools
     
    • XML Notepad, by Microsoft: available free; can be used to develop schemas but not DTDs.
    • XML Spy: Icon Information Systems ($349)
    • Turbo XML: Tibco ($270)
    • Stylus Studio: Excelon
    • XMetaL: SoftQuad ($499): good, but Corel, which bought SoftQuad, is in financial trouble
    • Epic: Arbortext
  • Word Processor Add-Ons
     
    • For Word: eXtyles (Inera) and S4/Text (i4i)
    • For FrameMaker: FrameMaker with XML
  • Online Resources
     
 
   
© 2007 Orlando Chapter STC