Perl CGI Programming: No Experience Required.
(Publisher: Sybex, Inc.)
Author(s): Erik Strom
ISBN: 0782121578
Publication Date: 11/01/97
Skill 10 The Language of the Web
- Introducing SGML: the Standard Generalized Markup Language
- Defining document types
- How HTML fits into the World Wide Web
- Extending HTML with applets and frames
By now, you've gained no small amount of knowledge about how Perl and the Common Gateway Interface can be used to make your Web site both dynamic and attractive.
Perl and CGI are wonderful tools, but what they do happens for the most part in the background. You've probably already discovered that the bulk of your Web work has been and will be done in the Hypertext Markup Language, or HTML. Even many of your CGI applications will be Perl programs that write HTML documents; it's what you've been doing through much of this book.
HTML is truly the lingua franca of the World Wide Web. Where did it come from? What led to this sometimes-strange collection of tags and commands that, fed into a Web browser, puts nicely formatted text and pictures on your visitors' screens?
Introducing SGML: the Basis of HTML
SGML, or the Standard Generalized Markup Language, was developed more than 10 years ago as a formal method of describing how electronic text should look. It is called a metalanguage because it provides a framework for writing other markup languages that will behave in a way that is independent of hardware and software systems. SGML describes how a document should look by describing the purpose of each element in the document; the nuts and bolts of creating the look are left up to the device that does the display, which in our case is a Web browser.
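One way to picture this separation of purpose from appearance is a short Perl sketch. Everything in it is hypothetical: the element names title and paragraph, the two display styles, and the render subroutine are assumptions made for illustration, not part of SGML or of any real browser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A document described by the purpose of each element, not by its look.
# The element names here are made up for this sketch.
my @doc = (
    [ title     => 'The Language of the Web' ],
    [ paragraph => 'HTML was derived from SGML.' ],
);

# Two hypothetical display devices. Each one decides for itself how an
# element with a given purpose should appear.
my %plain = (
    title     => sub { uc($_[0]) . "\n" },
    paragraph => sub { "$_[0]\n" },
);
my %html = (
    title     => sub { "<H1>$_[0]</H1>\n" },
    paragraph => sub { "<P>$_[0]</P>\n" },
);

# Render the same list of elements with whichever style is passed in.
sub render {
    my ($style, @elements) = @_;
    return join '', map { $style->{ $_->[0] }->( $_->[1] ) } @elements;
}

print render(\%plain, @doc);
print render(\%html,  @doc);
```

The document itself never changes; only the table of display rules does, which is roughly the division of labor between a marked-up document and the device that displays it.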
HTML was derived from SGML; strictly speaking, it is an application of the older language, a markup language defined according to SGML's rules. Examining its parent, SGML, can greatly improve your understanding of the language you use most often in your Web work.
Marking Up Documents
Both HTML and SGML are formally named as markup languages. The concept of marking up text is a bit murky to people who aren't involved in the printing trades, but it's really not that difficult to understand.
In the dark ages before computers got cheap (and therefore popular), as a writer you would send typed sheets of paper to a human printer. You would scribble annotations in the margins telling the printer how to go about typesetting the document and, if you were lucky, your annotations were clear enough for the printer to produce the look you were trying to achieve.
This was called markup, and the term has stayed with us all the way through a near-total automation of the printing process. It's all done with computers now, but the concept of marked-up text is still very real.
Any document created by a word processor is filled with commands that tell the word processor and the electronic printer how the document should look. The commands usually are binary codes that translate to gibberish if you look at them, but the word processor understands them perfectly.
The commands can be somewhat understandable, too, as in the case of a Rich Text Format (RTF) file. Figure 10.1 shows the first part of this skill formatted in RTF.
Figure 10.1: The first portion of Skill 10 in Rich Text Format
Rich Text Format is a markup language. As you can see from the illustration in Figure 10.1, it's pretty big and intimidating, but the commands are written in semi-English and, with a modicum of study, you could take them apart and interpret them yourself. This is not something you could do with, say, a Microsoft Word document that you simply called up in anything other than Microsoft Word. At best, in that case, you would see a completely unintelligible stream of hieroglyphics; at worst, the display would be so strange that it would crash your text editor.
Still, Word files are marked up, too. It's just that the markup language is virtually indecipherable to anything but Word.
The thing to remember about markup is that it is included directly in the document. It is usually set off by its own delimiters, such as the backslash character (\) in RTF or the less-than and greater-than characters (<>) in HTML, that distinguish actual text from markup commands. Markup is intended to describe to some other entity, such as a printer or a Web browser, how the formatted text is supposed to look when it's displayed or printed.
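To see how delimiters let a program tell markup from text, here is a minimal Perl sketch that pulls the two apart using the less-than and greater-than characters. The split_markup subroutine and the sample string are assumptions made for illustration. A real HTML parser has to cope with comments, quoted attribute values, and other cases this sketch ignores.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Separate a marked-up string into markup commands and actual text,
# treating anything between < and > as markup.
sub split_markup {
    my ($source) = @_;
    my (@markup, @text);

    # The capturing parentheses make split return the tags as well as
    # the text that lies between them.
    for my $piece (split /(<[^>]*>)/, $source) {
        next if $piece eq '';
        if ($piece =~ /^<[^>]*>$/) {
            push @markup, $piece;
        }
        else {
            push @text, $piece;
        }
    }
    return (\@markup, \@text);
}

my ($markup, $text) =
    split_markup('<P>Markup is <B>included</B> in the document.</P>');

print "Markup: @$markup\n";
print "Text:   ", join('', @$text), "\n";
```

Run on the sample string, the sketch lists the four tags on the first line and the sentence "Markup is included in the document." on the second, with the markup stripped out.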