Society for Technical Communication
Orlando Chapter STC
Professional Development

Notes from 46th International STC Conference
Cincinnati, Ohio, May 16-19, 1999

SGML or XML: Which is Right for You?

Lori DeFurio
Adobe Systems, Inc.

Session Description: SGML, the standard generalized markup language, has received a lot of attention since it was first proposed in 1986. Many organizations that have already implemented SGML-based publishing systems are wondering how XML fits into the picture.

  • Target users
     
    • Anyone who requires cross-media publishing
    • Anyone who requires document content reusability
  • Target markets
     
    • Authoring environment
    • Publishing environment
    • Vertical integration through Document Type Definitions (DTDs)
  • SGML = Standard Generalized Markup Language, adopted by ISO in 1986.
     
    • A method for defining structure and organization of information... separating format from content
    • Syntax for identifying information components
    • Hardware and software platform neutral
  • Physically, SGML looks very similar to HTML... it uses tags enclosed in < >.
  • Plain ASCII text with markup, combined with a DTD (which defines the structure) and a style sheet (which defines the formatting), produces a finished product.
  • Specific example: the em dash. Netscape, Explorer, Word, and WordPerfect might each encode it differently. SGML's answer is to represent the character with a universal, ASCII-only entity reference (such as &mdash;) and let each application or browser map that reference to its own code for the character.
  • HTML = HyperText Markup Language... programmers developed a finite set of tags that would be readable by all browsers.
  • XML = eXtensible Markup Language. W3C standard... first drafted in 1996 and published as a W3C Recommendation in February 1998. Combined the best of both worlds: the power of SGML with the standardization and hyperlinking capability of HTML.
  • XML allows scalable vector-based graphics on the Web (as opposed to the limited resolution of JPGs and GIFs).
  • XML also accommodates the double-byte character requirements of Eastern languages, and is thus better suited to global Web-based product support and marketing.
  • XML can stand alone, without a DTD, because the structure is carried by the document's own tags. Every element must therefore be explicitly opened and closed (the document must be "well-formed"), so the parser can read it as it goes along.
  • DTD = Document Type Definition.
     
    • It provides a common vocabulary that all the documents within the SGML family can understand.
    • It provides built-in rules of organization.
    • XML is a subset of SGML.
    • SGML requires a DTD; XML can be used with or without a DTD.
    • XML is optimized for the Web.
    • XML retains the hierarchy of HTML.
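
The point above, that XML can be read with or without a DTD, can be sketched in a few lines of Python. This is a minimal illustration, not part of the session: it assumes the standard library's xml.etree.ElementTree parser, which checks well-formedness only and never validates against a DTD. The sample documents, the element names (memo, to, body), and the helpers is_well_formed and body_text are all hypothetical. The last sample revisits the em-dash example by declaring an mdash entity in an internal DTD subset, which is one assumed way of supplying that mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD-less document: the tags alone carry the structure.
WELL_FORMED = "<memo><to>Staff</to><body>Meeting at noon.</body></memo>"

# Hypothetical broken document: <to> is closed with the wrong tag.
NOT_WELL_FORMED = "<memo><to>Staff</body></memo>"

# Hypothetical document whose internal DTD subset declares an mdash entity.
WITH_ENTITY = (
    '<!DOCTYPE memo [<!ENTITY mdash "&#8212;">]>'
    "<memo><body>Noon&mdash;sharp.</body></memo>"
)


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text (no DTD validation is done)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def body_text(text: str) -> str:
    """Return the text of the first <body> child of the root element."""
    return ET.fromstring(text).findtext("body")


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED))
    print(is_well_formed(NOT_WELL_FORMED))
    # The entity reference is replaced by the character it was declared as.
    print(body_text(WITH_ENTITY) == "Noon\u2014sharp.")
```

Checking a document against the organization rules in a DTD (validity, as opposed to well-formedness) needs a validating parser, which this sketch deliberately leaves out.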
  • How to pick between SGML and XML.
     
    • DoD = SGML
    • Web = XML
    • E-commerce = XML
    • Data interchange = SGML or XML
  • Do you need a DTD? When is a DTD required?
     
    • Stand-alone, single document... DTD is not appropriate
    • Part of an information base, with reusable chunks that you do not want to reauthor... DTD is well-suited to handle this
    • DTD is required for some phases of workflow: authoring and revision, formatting
    • DTD is not required for delivery as PDF, HTML, or well-formed XML
    • DTD is required for valid XML (XML that is checked against a DTD)
    • DTD is required for archiving only if archiving is in SGML
  • FrameMaker + SGML does not import XML. You can export from FrameMaker + SGML into XML, however. Two files are produced: an .xml and a .css (cascading style sheet, providing the formatting needed for display on the Web).
  • Browsers don't need a DTD. It doesn't matter if the organization rules aren't followed. The browser works entirely with tags found within the document, which is why SGML does not work on the Web and XML does.
  • XML provides flexibility for DTD-less authoring for ad hoc documents and documents during the design phase. In other words, don't invest the effort to develop a firm DTD until the document is mature. Then add the rigidity and power implicit in SGML.
  • What tools are available?
     
    • SGML-based tools: Adobe FrameMaker + SGML, ArborText Adept Editor, Timelux
    • XML-based: SoftQuad XMetaL
    • New releases: See online resources, below.
  • Online resources
     
    • Check out Adobe on the potential XML offers for the Web.
    • Go for a browse on XMLU, which bills itself as "the premier source for information on XML, the new language for the Web!"
    • Find out what's hot at Graphic Communications Association (GCA), which, by self-proclamation, is the SGML/XML Association.
  • Adobe Acrobat 4.0 lets you go to a Web page and export directly as a PDF.
  • SGML and XML are both useful. Which you select depends on:
     
    • Type of document
    • Phase of the workflow
    • Delivery media/path
  • SGML is a powerful universal tool that is complex and costly to develop, but it has a large payoff for documents that will be reused extensively or exported to a variety of media. SGML can export to PDF, HTML, and XML, among others. However, if you commit to SGML too early in the document development process, the cost of changes will eat you alive. And if you use SGML for a single-use document, you are in overkill mode, like using a howitzer to kill a gnat: the ROI is insufficient.
  • Tool availability determines the implementation rate for any new software application.
  • XSL = Extensible Stylesheet Language. Microsoft has added proprietary tags to Internet Explorer to make sure Adobe and others can't play. However, market demand has forced them to read XML plus cascading style sheets as well.

    Note: Some believe XSL is a dangerous development for the Web. For more on this controversial new tool, including its relationship to cascading style sheets, check out XML.com.
  • On the "bleeding edge": brand-new concepts that offer high potential returns but which have not matured sufficiently to allow implementation without high risk.
  • On the "leading edge": new concepts with sufficient tools in place to allow risk-controlled implementation
  • Demo: FrameMaker + SGML looks like a very powerful and flexible tool to develop documents that can easily be repurposed into many different media.
  • FrameMaker + SGML can export "well-formed XML" plus a cascading style sheet that can then be posted on the Web. Explorer 5 supports XML; the next version of Netscape will support it. However, those with older versions of the browsers will still only be able to read HTML. Thus, there will be a transitional period where you have to do both to hit the widest audience.
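
The pairing of a well-formed XML file with a cascading style sheet, described in the last note, can be sketched as follows. This is an illustration under stated assumptions, not FrameMaker's actual output: the file name manual.css, the element names (manual, title, para), and the helper stylesheet_instructions are hypothetical. It assumes the XML file points at its style sheet through an xml-stylesheet processing instruction, and it uses Python's standard xml.dom.minidom to read that instruction back.

```python
from typing import List
from xml.dom import minidom

# Hypothetical exported XML file that points at a companion style sheet.
EXPORTED_XML = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="manual.css"?>'
    "<manual><title>User Guide</title>"
    "<para>Press the green button.</para></manual>"
)

# Hypothetical companion style sheet: CSS rules keyed on the XML element names.
EXPORTED_CSS = (
    "title { display: block; font-size: 150%; }\n"
    "para { display: block; }\n"
)


def stylesheet_instructions(xml_text: str) -> List[str]:
    """Return the data of every xml-stylesheet instruction before the root."""
    doc = minidom.parseString(xml_text)  # raises if the text is not well-formed
    return [
        node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


if __name__ == "__main__":
    for data in stylesheet_instructions(EXPORTED_XML):
        print(data)
```

Because the style rules live in a separate file, the same XML could be paired with a different style sheet for another medium without touching the content.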
 
   
© 2012 Orlando Chapter STC