Notes from 46th International STC Conference
Cincinnati, Ohio, May 16-19, 1999
SGML or XML: Which is Right for You?
Session Description: SGML, the Standard Generalized Markup Language,
has received a lot of attention since it was adopted as an ISO standard in 1986. Many organizations that
have already implemented SGML-based publishing systems are wondering how
XML fits into the picture.
- Target users
- Anyone who requires cross-media publishing
- Anyone who requires document content reusability
- Target markets
- Authoring environment
- Publishing environment
- Vertical integration through Document Type Definitions (DTDs)
- SGML = Standard Generalized Markup Language, adopted by ISO in 1986.
- A method for defining the structure and organization of information, separating format from content
- Syntax for identifying information components
- Hardware and software platform neutral
- Physically, SGML looks very similar to HTML... it uses tags enclosed in < >.
- Plain ASCII markup, governed by a DTD and married to a style sheet, produces a finished product.
- Specific example: em-dash.
Netscape, Explorer, Word, and WordPerfect might all have different internal codes
for it. SGML's answer is to represent the character with a universal, ASCII-only reference,
and let each application or browser apply its own code to generate the character.
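The em-dash point can be sketched in a few lines of Python. This is only an illustration: it assumes the ASCII-only reference is the named entity `&mdash;`, and it uses Python's HTML entity table as a stand-in for an SGML entity set. The sample sentence is made up.

```python
import html
from html.entities import name2codepoint

# The source text stays pure ASCII; the dash is written as a named entity.
source = "SGML &mdash; one source, many outputs"

# Each consuming application resolves the entity to its own representation.
# Here the "application" is Python, which maps it to a Unicode code point.
codepoint = name2codepoint["mdash"]
rendered = html.unescape(source)

print(source.isascii())   # the stored text contains no special characters
print(hex(codepoint))     # the code point this application uses for the dash
```

The same ASCII source could be handed to a different tool that maps `&mdash;` to whatever internal code it prefers, which is the portability argument made in the session.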
- HTML: HyperText Markup Language... programmers developed a finite set of
tags that would be readable by all browsers.
- XML = eXtensible Markup Language. W3C standard... XML 1.0 became a W3C Recommendation in February 1998.
Combines the best of both worlds: the power of SGML with the standardization and hyperlinking
capability of HTML.
- XML allows scalable vector-based graphics on the Web (as opposed to the
limited resolution of JPGs and GIFs).
- XML also accommodates the
double-byte character requirements of East Asian languages, and is thus
better suited to global Web-based product support and marketing.
- XML can stand alone, without a DTD, because the structure is carried entirely by the tags in the document.
Thus, every element must be explicitly opened and closed in the markup itself,
so the parser can read as it goes along.
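A minimal sketch of what "standing alone" means in practice, using Python's standard-library XML parser. The element names and the helper `is_well_formed` are hypothetical, chosen for illustration. No DTD is supplied; the parser relies only on the tags being properly opened, closed, and nested.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML with no DTD involved."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Every element is explicitly opened and closed, so no DTD is needed.
good = "<note><to>Editor</to><body>Draft is ready.</body></note>"

# Mismatched tags: the parser cannot infer the missing close tag.
bad = "<note><to>Editor</body></note>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

SGML, by contrast, allows a DTD to declare that certain end tags may be omitted, which is one reason an SGML parser needs the DTD in hand before it can read the document.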
- DTD = Document Type Definition.
- It provides a common vocabulary that all the documents within
the SGML family can understand.
- It provides built-in rules of organization.
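To make the "vocabulary" and "rules of organization" concrete, here is a small sketch with a DTD embedded in an XML document. The element names (`manual`, `title`, `section`) are invented for illustration. Python's standard library does not validate against a DTD, so the code only asks the expat parser to report the declarations it finds.

```python
import xml.parsers.expat

# A hypothetical document with an internal DTD subset.
DOC = """<?xml version="1.0"?>
<!DOCTYPE manual [
  <!ELEMENT manual (title, section+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT section (#PCDATA)>
]>
<manual><title>Pump Guide</title><section>Setup</section></manual>"""

declared = {}

def on_element_decl(name, model):
    # model is (type, quantifier, name, children); record the child element names.
    declared[name] = [child[2] for child in model[3]]

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)

# The vocabulary: which element names exist at all.
print(sorted(declared))
# The organization rule: a manual contains a title followed by sections.
print(declared["manual"])
```

A validating parser would go one step further and reject a `manual` whose children do not follow the declared order.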
- XML is a subset of SGML.
- SGML requires a DTD; XML can be used with or without a DTD.
- XML is optimized for the Web.
- XML retains the hierarchy of HTML.
- How to pick between SGML and XML.
- DoD = SGML
- Web = XML
- E-commerce = XML
- Data interchange = SGML or XML
- Do you need a DTD? When is a DTD required?
- Stand-alone, single document... DTD is not appropriate
- Part of an information base, with reusable chunks that you do not want to reauthor...
DTD is well-suited to handle this
- DTD is required for some phases of workflow: authoring and revision, formatting
- DTD is not required for delivery as PDF, HTML, or well-formed XML
- DTD is required for valid XML
- DTD is required for archiving only if archiving is in SGML
- FrameMaker + SGML does not
import XML. You can export from FrameMaker + SGML into XML, however.
Two files are produced: an .xml and a .css (cascading style sheet, providing
the formatting needed for display on the Web).
- Browsers don't need a DTD.
It doesn't matter if the organization rules aren't followed. The browser
works entirely with tags found within the document, which is why SGML
does not work on the Web and XML does.
- XML provides flexibility
for DTD-less authoring for ad hoc documents and documents during the
design phase. In other words, don't invest the effort to develop a firm
DTD until the document is mature. Then add the rigidity and power implicit in SGML.
- What tools are available?
- SGML-based tools: Adobe FrameMaker + SGML, ArborText Adept Editor, Timelux
- XML-based: SoftQuad XMetaL
- New releases: See online resources, below.
- Online resources
- Check out Adobe
on the potential XML offers for the Web.
- Go for a browse on XMLU, which bills itself
as "the premier source for information on XML, the new language for the Web!"
- Find out what's hot
at Graphic Communications Association
(GCA), which, by self-proclamation, is the SGML/XML Association.
- Adobe Acrobat 4.0 lets you go to a Web page and export directly as a PDF.
- SGML and XML are both useful. Which you select depends on:
- Type of document
- Phase of the workflow
- Delivery media/path
- SGML is a powerful universal
tool that is complex and costly to develop, but which has a large payoff
for documents that will be reused extensively and/or exported to a variety
of different media. SGML can export to PDF, HTML, and XML, among others.
However, if you commit to SGML too early in the document development
process, the cost of changes will eat you alive. And if you use SGML
for a single-use document, it is overkill, like
using a howitzer to kill a gnat: the ROI is insufficient.
- Tool availability determines
the implementation rate for any new software application.
- XSL = Extensible Stylesheet
Language. Microsoft has added proprietary tags to Internet Explorer
to make sure Adobe and others can't play. However, market demand has
forced them to read XML plus cascading style sheets as well.
Note: Some believe
XSL is a dangerous development for the Web. For more on this
controversial new tool, including its relationship to cascading style
sheets, check out XML.com.
- On the "bleeding edge":
brand-new concepts that offer high potential returns but which have
not matured sufficiently to allow implementation without high risk.
- On the "leading edge":
new concepts with sufficient tools in place to allow risk-controlled implementation.
- Demo: FrameMaker + SGML
looks like a very powerful and flexible tool to develop documents that
can easily be repurposed into many different media.
- FrameMaker + SGML can export
"well-formed XML" plus a cascading style sheet that can then be posted
on the Web. Internet Explorer 5 supports XML; the next version of Netscape will
support it. However, those with older versions of the browsers will
still only be able to read HTML. Thus, there will be a transitional
period during which you have to publish both XML and HTML to reach the widest audience.
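The XML-plus-style-sheet pairing can be sketched as follows. This is not FrameMaker's actual output: the file name `manual.css` and the element names are hypothetical, and it assumes the two files are linked with the W3C `xml-stylesheet` processing instruction.

```python
import xml.etree.ElementTree as ET

# Hypothetical style sheet name, for illustration only.
CSS_NAME = "manual.css"

# Build a tiny well-formed document with made-up element names.
root = ET.Element("manual")
ET.SubElement(root, "title").text = "Pump Guide"
ET.SubElement(root, "para").text = "Check the seals before each use."

# XML elements carry no built-in appearance, so the style sheet
# tells the browser how to display each one.
css_text = (
    "title { display: block; font-size: 150%; font-weight: bold; }\n"
    "para { display: block; margin: 1em 0; }\n"
)

# The processing instruction points the browser at the style sheet.
xml_text = (
    '<?xml version="1.0"?>\n'
    f'<?xml-stylesheet type="text/css" href="{CSS_NAME}"?>\n'
    + ET.tostring(root, encoding="unicode")
)

print(xml_text)
print(css_text)
```

Saving `xml_text` and `css_text` side by side as an .xml and a .css file gives the kind of two-file package the notes describe. Older browsers would still need a separate HTML version.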