Perl CGI Programming: No experience required.
(Publisher: Sybex, Inc.)
Author(s): Erik Strom
ISBN: 0782121578
Publication Date: 11/01/97



Skill 10
The Language of the Web

  Introducing SGML: the Standard Generalized Markup Language
  Defining document types
  How HTML fits into the World Wide Web
  Extending HTML with applets and frames

By now, you’ve gained no small amount of knowledge about how Perl and the Common Gateway Interface can be used to make your Web site both dynamic and attractive.

Perl and CGI are wonderful tools, but what they do happens for the most part in the background. You’ve probably already discovered that the bulk of your Web work has been and will be done in the Hypertext Markup Language, or HTML. Even many of your CGI applications will be Perl programs that write HTML documents; it’s what you’ve been doing through much of this book.

HTML is truly the lingua franca of the World Wide Web. Where did it come from? What led to this sometimes-strange collection of tags and commands that, fed into a Web browser, puts nicely formatted text and pictures on your visitors’ screens?

Introducing SGML: the Basis of HTML

SGML, or the Standard Generalized Markup Language, was published as an international standard (ISO 8879) in 1986 as a formal method of describing the structure of electronic text. It is called a metalanguage because it provides a framework for writing other markup languages that behave independently of hardware and software systems. SGML determines how a document should look only indirectly, by describing the purpose of each element in the document; the nuts and bolts of creating the look are left up to the device that does the display, which in our case is a Web browser.
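
The idea that markup names each element’s purpose and leaves the appearance to the display device can be sketched in a few lines. The example uses Python for brevity rather than Perl. The DOCUMENT list, the element names, and both render functions are hypothetical, invented for illustration; none of them is part of SGML.

```python
# Hypothetical structural document: each entry names an element's purpose,
# not its appearance.
DOCUMENT = [
    ("title", "The Language of the Web"),
    ("paragraph", "HTML is the lingua franca of the World Wide Web."),
]


def render_terminal(document):
    """One 'device': upper-case titles with an underline, plain paragraphs."""
    lines = []
    for element, text in document:
        if element == "title":
            lines.append(text.upper())
            lines.append("=" * len(text))
        else:
            lines.append(text)
    return "\n".join(lines)


def render_page(document, width=60):
    """Another 'device': centered titles, indented paragraphs."""
    lines = []
    for element, text in document:
        if element == "title":
            lines.append(text.center(width))
        else:
            lines.append("    " + text)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_terminal(DOCUMENT))
    print()
    print(render_page(DOCUMENT))
```

The same DOCUMENT goes to both functions unchanged. Only the device-specific code decides what a title or a paragraph looks like.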

HTML was derived from SGML; strictly speaking, it is an application of the older language, that is, one particular markup language defined using SGML’s rules. Your understanding of the language you use most often in your Web work can be enhanced a great deal by an examination of its parent: SGML.

Marking Up Documents

Both HTML and SGML are formally classified as markup languages. The concept of “marking up” text is a bit murky to people who aren’t involved in the printing trades, but it’s really not that difficult to understand.

In the dark ages before computers got cheap (and therefore popular), as a writer you would send typed sheets of paper to a human printer. You would scribble annotations in the margins telling the printer how to go about typesetting the document and, if you were lucky, your annotations were clear enough for the printer to produce the look you were trying to achieve.

This was called markup, and the term has stayed with us all the way through a near-total automation of the printing process. It’s all done with computers now, but the concept of marked-up text is still very real.

Any document created by a word processor is filled with commands that tell the word processor and the electronic printer how the document should look. The commands usually are binary codes that translate to gibberish if you look at them, but the word processor understands them perfectly.

The commands can be somewhat understandable, too, as in the case of a Rich Text Format (RTF) file. Figure 10.1 shows the first part of this skill formatted in RTF.


Figure 10.1:  The first portion of Skill 10 in Rich Text Format

Rich Text Format is a markup language. As you can see from the illustration in Figure 10.1, it’s pretty big and intimidating, but the commands are written in semi-English and, with a modicum of study, you could take them apart and interpret them yourself. This is not something you could do with, say, a Microsoft Word document that you simply called up in anything other than Microsoft Word. At best, in that case, you would see a completely unintelligible stream of hieroglyphics; at worst, the display would be so strange that it would crash your text editor.
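
As a rough illustration of taking RTF commands apart, the sketch below (in Python, for brevity) pulls the control words out of a small hand-written RTF fragment. The fragment is an assumption made for this example and is not the content of Figure 10.1. The names RTF_SAMPLE, CONTROL_WORD, and control_words are likewise hypothetical. The pattern relies only on the basic shape of an RTF control word: a backslash, lowercase letters, and an optional numeric parameter.

```python
import re

# Small hand-written RTF fragment (illustrative only, not Figure 10.1).
RTF_SAMPLE = r"{\rtf1\ansi{\b Skill 10}\par The Language of the Web\par}"

# A control word: backslash, lowercase letters, optional numeric parameter.
CONTROL_WORD = re.compile(r"\\([a-z]+)(-?\d+)?")


def control_words(rtf):
    """Return (name, parameter) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in CONTROL_WORD.finditer(rtf)]


if __name__ == "__main__":
    for name, parameter in control_words(RTF_SAMPLE):
        print(name, parameter)
```

Here `\b` switches on bold for the group it appears in and `\par` ends a paragraph. A real RTF reader must also handle control symbols, escaped characters, and nested groups, which this sketch ignores.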

Still, Word files are marked up, too. It’s just that the markup language is virtually indecipherable to anything but Word.

The thing to remember about markup is that it is included directly in the document. It is usually set off by its own delimiters, which distinguish actual text from markup commands: the backslash character (\) in RTF, for example, or the less-than and greater-than characters (<>) in HTML. Markup is intended to describe to some other entity, such as a printer or a Web browser, how the formatted text is supposed to look when it’s displayed or printed.
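
A minimal sketch of how delimiters let a program tell markup from text, written in Python for brevity: the function below treats anything between a less-than and a greater-than character as markup and everything else as text. The names TAG, SAMPLE, and split_markup are hypothetical, and the regular expression is a deliberate simplification. Real HTML, with its comments, quoted attribute values, and scripts, calls for a proper parser.

```python
import re

# Anything between < and > counts as markup in this simplified view.
TAG = re.compile(r"<[^<>]+>")

SAMPLE = "<h1>Skill 10</h1> <p>The Language of the <em>Web</em></p>"


def split_markup(document):
    """Return (list of markup tags, remaining text) for an HTML-like string."""
    tags = TAG.findall(document)
    text = TAG.sub("", document)
    return tags, text


if __name__ == "__main__":
    tags, text = split_markup(SAMPLE)
    print(tags)
    print(text)
```

The markup and the text travel together in one string. Only the delimiters tell the program which is which.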



Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.