InformIT

Upgrading & Repairing PCs, Eighth Edition


Author: Scott Mueller
Retail Price: $49.99
Publisher: Que
ISBN: 0789712954
Publication Date: 9/16/97
Pages: 1168


Chapter 14 - Hard Disk Drives

Gain understanding of the hard disk drive from a physical, mechanical, and electrical point of view

 
 

 

To most users, the hard disk drive is the most important, yet most mysterious, part of a computer system. A hard disk drive is a sealed unit that holds the data in a system. When the hard disk fails, the consequences usually are very serious. To maintain, service, and expand a PC system properly, you must fully understand the hard disk unit.

Most computer users want to know how hard disk drives work and what to do when a problem occurs. Few books about hard disks, however, cover the detail necessary for the PC technician or sophisticated user. This chapter corrects that situation.

This chapter thoroughly describes the hard disk drive from a physical, mechanical, and electrical point of view. In particular, this chapter examines the construction and operation of a hard disk drive in a practical sense.

Definition of a Hard Disk

A hard disk drive contains rigid, disk-shaped platters usually constructed of aluminum or glass. Unlike floppy disks, the platters cannot bend or flex--hence the term hard disk. In most hard disk drives, the platters cannot be removed; for that reason, IBM calls them fixed disk drives. Removable hard disk drives do exist, and one has become very popular of late: the Jaz drive by Iomega. Unlike its smaller brother, the Zip drive, the Jaz drive uses removable media composed of the same rigid platters found in any fixed disk drive.

Hard disk drives used to be called Winchester drives. This term dates back to the 1960s, when IBM developed a high-speed hard disk drive that had 30M of fixed-platter storage and 30M of removable-platter storage. The drive had platters that spun at high speeds and heads that floated over the platters while they spun in a sealed environment. That drive, the 30-30 drive, soon received the nickname Winchester after the famous Winchester 30-30 rifle. After that time, drives that used a high-speed spinning platter with a floating head also became known as Winchester drives. The term has no technical or scientific meaning; it is a slang term, and is considered synonymous with hard disk.

Capacity Measurements

To eliminate confusion in capacity measurements, I will be using the abbreviation M in this section. The true industry standard abbreviations for these figures are shown in Table 14.1.

Table 14.1  Standard Abbreviations and Meanings

Abbreviation  Description  Decimal Meaning      Binary Meaning
Kbit          Kilobit      1,000                1,024
K             Kilobyte     1,000                1,024
Mbit          Megabit      1,000,000            1,048,576
M             Megabyte     1,000,000            1,048,576
Gbit          Gigabit      1,000,000,000        1,073,741,824
G             Gigabyte     1,000,000,000        1,073,741,824
Tbit          Terabit      1,000,000,000,000    1,099,511,627,776
T             Terabyte     1,000,000,000,000    1,099,511,627,776

Unfortunately, the abbreviations are the same whether they indicate metric (decimal) or binary values. In other words, M can be used to indicate both "millions of bytes" and megabytes. In general, memory values are always computed using the binary-derived meanings, while disk capacity goes either way. This often leads to confusion in reporting disk capacities.
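To make the size of the discrepancy concrete, here is a minimal Python sketch that reports the same byte count under both meanings of G from Table 14.1. The function name and the 10-billion-byte example drive are assumptions chosen purely for illustration.

```python
# Both meanings of "G" from Table 14.1.
DECIMAL_G = 1_000_000_000   # "billions of bytes"
BINARY_G = 1_073_741_824    # 2**30 bytes


def capacity_both_ways(total_bytes):
    """Return (decimal gigabytes, binary gigabytes) for a byte count."""
    return total_bytes / DECIMAL_G, total_bytes / BINARY_G


# Hypothetical drive holding exactly 10 billion bytes.
decimal_g, binary_g = capacity_both_ways(10_000_000_000)
print(f"{decimal_g:.2f}G (decimal) vs. {binary_g:.2f}G (binary)")
```

The same drive can therefore be reported honestly as either figure, which is where the confusion in reported disk capacities comes from.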

Hard Drive Advancements

In the 15 or more years that hard disks have commonly been used in PC systems, they have undergone tremendous changes. To give you an idea of how far hard drives have come in that time, following are some of the most profound changes in PC hard disk storage:

  • Maximum storage capacities have increased from the 10M 5 1/4-inch full-height drives available in 1982 to 10G or more for small 3 1/2-inch half-height drives, and 3G or more for notebook 2 1/2-inch drives.

  • Data transfer rates from the media have increased from 85 to 102K/sec for the original IBM XT in 1983 to nearly 10M/sec for some of the fastest drives today.

  • Average seek times have decreased from more than 85 ms (milliseconds) for the 10M XT hard disk in 1983 to fewer than 8 ms for some of the fastest drives today.

  • In 1982, a 10M drive cost more than $1,500 ($150 per megabyte). Today, the cost of hard drives has dropped to 10 cents per megabyte or less.

Areal Density

Areal density has been used as a primary technology-growth-rate indicator for the hard disk drive industry. Areal density is defined as the product of the linear bits per inch (BPI), measured along the length of the tracks around the disk, and the number of tracks per inch (TPI), measured radially on the disk. The results are expressed in units of Mbit per square inch (Mbit/sq-inch) and are used as a measure of efficiency in drive recording technology. Current high-end 2 1/2-inch drives record at areal densities of about 1.5Gbit per square inch (Gbit/sq-inch). Prototype drives with densities as high as 10Gbit/sq-inch have been constructed, allowing for capacities of more than 20G on a single 2 1/2-inch platter for notebook drives.
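As a quick worked example of the definition, the following Python sketch multiplies the linear density and track density listed for the Seagate ST-12550N in Table 14.2. The function name is hypothetical, and treating Mbit as one million bits (the decimal meaning) is an assumption.

```python
def areal_density_bits_per_sq_inch(bpi, tpi):
    """Areal density = linear density (BPI) x track density (TPI)."""
    return bpi * tpi


# Figures taken from Table 14.2 (Seagate ST-12550N Barracuda 2).
density = areal_density_bits_per_sq_inch(bpi=52_187, tpi=3_047)

# Assumption: Mbit is taken in its decimal sense (1,000,000 bits).
density_mbit = density / 1_000_000
print(f"{density_mbit:.1f} Mbit/sq-inch")
```

By this arithmetic the Barracuda 2 records at roughly 159 Mbit/sq-inch, about a tenth of the 1.5Gbit/sq-inch figure quoted for current high-end drives.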

Areal density (and, therefore, drive capacity) has been doubling approximately every two to three years, and production disk drives are likely to reach areal densities of 10+Gbit/sq-inch before the year 2000. A drive built with this technology would be capable of storing more than 10G of data on a single 2 1/2-inch platter, allowing 20 or 30G drives to be constructed that fit in the palm of your hand. New media and head technologies, such as ceramic or glass platters, MR (Magneto-Resistive) heads, pseudo-contact recording, and PRML (Partial Response Maximum Likelihood) electronics, are being developed to support these higher areal densities. The primary challenge in achieving higher densities is manufacturing drive heads and disks to operate at closer tolerances.

It seems almost incredible that computer technology improves by doubling performance or capacity every two to three years--if only other industries could match that growth and improvement rate!

Hard Disk Drive Operation

The basic physical operation of a hard disk drive is similar to that of a floppy disk drive: A hard drive uses spinning disks with heads that move over the disks and store data in tracks and sectors. A track is a concentric ring of information, which is divided into individual sectors that normally store 512 bytes each. In many other ways, however, hard disk drives are different from floppy disk drives.

Hard disks usually have multiple platters, each with two sides on which data can be stored. Most drives have at least two or three platters, resulting in four or six sides, and some drives have up to 11 or more platters. The identically positioned tracks on each side of every platter together make up a cylinder. A hard disk drive normally has one head per platter side, and all the heads are mounted on a common carrier device, or rack. The heads move in and out across the disk in unison; they cannot move independently because they are mounted on the same rack.

Hard disks operate much faster than floppy drives. Most hard disks originally spun at 3,600 RPM--approximately 10 times faster than a floppy drive. Until recently, 3,600 RPM was pretty much a constant among hard drives. Now, however, quite a few hard drives spin even faster. The Toshiba 3.3G drive in my notebook computer spins at 4,852 RPM; other drives spin as fast as 5,400, 5,600, 6,400, 7,200 and even 10,000 RPM. High rotational speed combined with a fast head-positioning mechanism and more sectors per track make one hard disk faster than another, and all these features combine to make hard drives much faster than floppy drives in storing and retrieving data.

The heads in most hard disks do not (and should not!) touch the platters during normal operation. When the drive is powered off, however, the heads land on the platters as the platters stop spinning. While the drive is on, a very thin cushion of air keeps each head suspended a short distance above or below the platter. If the air cushion is disturbed by a particle of dust or a shock, the head may come into contact with the platter spinning at full speed. When contact with the spinning platters is forceful enough to do damage, the event is called a head crash. The result of a head crash may be anything from a few lost bytes of data to a totally trashed drive. Most drives have special lubricants on the platters and hardened surfaces that can withstand the daily "takeoffs and landings" as well as more severe abuse.

Because the platter assemblies are sealed and non-removable, track densities can be very high. Many drives have 3,000 or more TPI of media. Head Disk Assemblies (HDAs), which contain the platters, are assembled and sealed in clean rooms under absolutely sanitary conditions. Because few companies repair HDAs, repair or replacement of items inside a sealed HDA can be expensive. Every hard disk ever made will eventually fail. The only questions are when the hard disk will fail and whether your data is backed up.

Many PC users think that hard disks are fragile, and generally, they are one of the most fragile components in your PC. In my weekly PC Hardware and Troubleshooting or Data Recovery seminars, however, I have run various hard disks for days with the lids off, and have even removed and installed the covers while the drives were operating. Those drives continue to store data perfectly to this day with the lids either on or off. Of course, I do not recommend that you try this test with your own drives; neither would I use this test on my larger, more expensive drives.

The Ultimate Hard Disk Drive Analogy

I'm sure that you have heard the traditional analogy that compares the interaction of the head and media in a typical hard disk as being similar in scale to a 747 flying a few feet off the ground at cruising speed (500+ mph). I have heard this analogy used over and over again for years, and I've even used it in my seminars many times without checking to see whether the analogy is technically accurate with respect to modern hard drives.

One highly inaccurate aspect of the 747 analogy has always bothered me--the use of an airplane of any type to describe the head-and-platter interaction. This analogy implies that the heads fly very low over the surface of the disk--but technically, this is not true. The heads do not fly at all, in the traditional aerodynamic sense; instead, they float on a cushion of air that's dragged around by the platters.

A much better analogy would use a hovercraft instead of an airplane; the action of a hovercraft much more closely emulates the action of the heads in a hard disk drive. Like a hovercraft, the drive heads rely somewhat on the shape of the bottom of the head to capture and control the cushion of air that keeps them floating over the disk. By nature, the cushion of air on which the heads float forms only in very close proximity to the platter and is often called an air bearing by the disk drive industry.

I thought it was time to come up with a new analogy that more correctly describes the dimensions and speeds at which a hard disk operates today. I looked up the specifications on a specific hard disk drive, and then equally magnified and rescaled all the dimensions involved to make the head floating height equal to 1 inch. For my example, I used a Seagate model ST-12550N Barracuda 2 drive, which is a 2G (formatted capacity), 3 1/2-inch SCSI-2 drive. In fact, I originally intended to install this drive in the portable system on which I am writing this book, but the technology took another leap and I ended up installing an ST-15230N Hawk 4 drive (4G) instead! Table 14.2 shows the specifications of the Barracuda drive, as listed in the technical documentation.

Table 14.2  Seagate ST-12550N Barracuda 2, 3 1/2-inch, SCSI-2 Drive Specifications

Specification              Value    Unit of Measure
Linear density             52,187   Bits Per Inch (BPI)
Bit spacing                19.16    Micro-inches (u-in)
Track density              3,047    Tracks Per Inch (TPI)
Track spacing              328.19   Micro-inches (u-in)
Total tracks               2,707    Tracks
Rotational speed           7,200    Revolutions per minute (RPM)
Average head linear speed  53.55    Miles per hour (MPH)
Head slider length         0.08     Inches
Head slider height         0.02     Inches
Head floating height       5        Micro-inches (u-in)
Average seek time          8        Milliseconds (ms)

By interpreting these specifications, you can see that in this drive, the head sliders are about 0.08-inch long and 0.02-inch high. The heads float on a cushion of air about 5 u-in (millionths of an inch) from the surface of the disk while traveling at an average speed of 53.55 MPH (figuring an average track diameter of 2 1/2 inches). These heads read and write individual bits spaced only 19.16 u-in apart on tracks separated by only 328.19 u-in. The heads can move from one track to any other in only 8ms during an average seek operation.

To create my analogy, I simply magnified the scale to make the floating height equal to 1 inch. Because 1 inch is 200,000 times greater than 5 u-in, I scaled up everything else by the same amount.

The heads of this "typical" hard disk, magnified to such a scale, would be more than 1,300 feet long and 300 feet high (about the size of the Sears Tower, lying sideways!), traveling at a speed of more than 10.7 million MPH (2,975 miles per second!) only 1 inch above the ground, reading data bits spaced a mere 3.83 inches apart on tracks separated by only 5.47 feet.

Additionally, because the average seek of 8 ms (0.008 seconds) is defined as the time it takes to move the heads over one-third of the total tracks (about 902, in this case), each skyscraper-size head could move sideways to any track within a distance of 0.93 miles (902 tracks x 5.47 feet), which results in an average sideways velocity of more than 420,000 MPH (116 miles per second)!

The forward speed of this imaginary head is difficult to comprehend, so I'll elaborate. The diameter of the Earth at the equator is 7,926 miles, which means a circumference of about 24,900 miles. At 2,975 miles per second, this imaginary head would circle the Earth about once every 8 seconds!
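The arithmetic behind this analogy can be reproduced in a few lines. The Python sketch below starts from the Table 14.2 figures and the stated 2 1/2-inch average track diameter; the variable names are mine, and the only constants not taken from the text are the standard conversions (63,360 inches and 5,280 feet per mile).

```python
import math

# Scale factor: make the 5 u-in floating height equal to 1 inch.
FLOAT_HEIGHT_IN = 5e-6
SCALE = round(1 / FLOAT_HEIGHT_IN)          # 200,000

# Real-world figures from Table 14.2 and the surrounding text.
RPM = 7_200
AVG_TRACK_DIAMETER_IN = 2.5
INCHES_PER_MILE = 63_360
FEET_PER_MILE = 5_280

# Average head linear speed: track circumference x revolutions per hour.
head_speed_mph = math.pi * AVG_TRACK_DIAMETER_IN * RPM * 60 / INCHES_PER_MILE

# Everything below is magnified by SCALE.
scaled_speed_mps = head_speed_mph * SCALE / 3600       # miles per second
scaled_length_ft = 0.08 * SCALE / 12                   # slider length
scaled_bit_spacing_in = (1 / 52_187) * SCALE           # from BPI
scaled_track_spacing_ft = (1 / 3_047) * SCALE / 12     # from TPI

# Average seek: one-third of the 2,707 tracks in 8 ms.
seek_tracks = 2_707 // 3
seek_distance_miles = seek_tracks * scaled_track_spacing_ft / FEET_PER_MILE
seek_speed_mps = seek_distance_miles / 0.008

print(f"head speed        {head_speed_mph:.2f} MPH")
print(f"scaled speed      {scaled_speed_mps:,.0f} miles/sec")
print(f"scaled bit gap    {scaled_bit_spacing_in:.2f} inches")
print(f"scaled track gap  {scaled_track_spacing_ft:.2f} feet")
print(f"scaled seek speed {seek_speed_mps:.1f} miles/sec")
```

Working from the raw BPI and TPI values rather than the rounded spacing figures gives the same results to the precision quoted in the text.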

This analogy should give you a new appreciation of the technological marvel that the modern hard disk drive actually represents. It makes the 747 analogy look rather pathetic (not to mention totally inaccurate), doesn't it?

Magnetic Data Storage

Learning how magnetic data storage works will help you develop a feel for the way that your disk drives operate and can improve the way that you work with disk drives and disks.

Nearly all disk drives in personal computer systems operate on magnetic principles. Purely optical disk drives often are used as a secondary form of storage, but the computer to which they are connected is likely to use a magnetic storage medium for primary disk storage. Due to the high performance and density capabilities of magnetic storage, optical disk drives and media probably never will totally replace magnetic storage in PC systems.

Magnetic drives, such as floppy and hard disk drives, operate by using electromagnetism. This basic principle of physics states that as an electric current flows through a conductor, a magnetic field is generated around the conductor. This magnetic field then can influence magnetic material in the field. When the direction of the flow of electric current is reversed, the magnetic field's polarity also is reversed. An electric motor uses electromagnetism to exert pushing and pulling forces on magnets attached to a rotating shaft.

Another effect of electromagnetism is that if a conductor is passed through a changing magnetic field, an electrical current is generated. As the polarity of the magnetic field changes, so does the direction of the electric current flow. For example, a type of electrical generator used in automobiles, called an alternator, operates by rotating electromagnets past coils of wire conductors in which large amounts of electrical current can be induced. The two-way operation of electromagnetism makes it possible to record data on a disk and read that data back later.

The read/write heads in your disk drives (both floppy and hard disks) are U-shaped pieces of conductive material. This U-shaped object is wrapped with coils of wire, through which an electric current can flow. When the disk drive logic passes a current through these coils, it generates a magnetic field in the drive head. When the polarity of the electric current is reversed, the polarity of the field that is generated also changes. In essence, the heads are electromagnets whose voltage can be switched in polarity very quickly.

When a magnetic field is generated in the head, the field jumps the gap at the end of the U-shaped head. Because a magnetic field passes through a conductor much more easily than through the air, the field bends outward through the medium and actually uses the disk media directly below it as the path of least resistance to the other side of the gap. As the field passes through the media directly under the gap, it polarizes the magnetic particles through which it passes so that they are aligned with the field. The field's polarity--and, therefore, the polarity of the magnetic media--is based on the direction of the flow of electric current through the coils.

The disk consists of some form of substrate material (such as Mylar for floppy disks or aluminum or glass for hard disks) on which a layer of magnetizable material has been deposited. This material usually is a form of iron oxide with various other elements added. The polarities of the magnetic fields of the individual magnetic particles on an erased disk normally are in a state of random disarray. Because the fields of the individual particles point in random directions, each tiny magnetic field is canceled by one that points in the opposite direction, for a total effect of no observable or cumulative field polarity.

Particles in the area below the head gap are aligned in the same direction as the field emanating from the gap. When the individual magnetic domains are in alignment, they no longer cancel one another, and an observable magnetic field exists in that region of the disk. This local field is generated by the many magnetic particles that now are operating as a team to produce a detectable cumulative field with a unified direction.

The term flux describes a magnetic field that has a specific direction. As the disk surface rotates below the drive head, the head can lay a magnetic flux over a region of the disk. When the electric current flowing through the coils in the head is reversed, so is the magnetic-field polarity in the head gap. This reversal also causes the polarity of the flux being placed on the disk to reverse.

The flux reversal or flux transition is a change in polarity of the alignment of magnetic particles on the disk surface. A drive head places flux reversals on a disk to record data. For each data bit (or bits) written, a pattern of flux reversals is placed on the disk in specific areas known as bit or transition cells. A bit cell or transition cell is a specific area of the disk controlled by the time and rotational speed in which flux reversals are placed by a drive head. The particular pattern of flux reversals within the transition cells used to store a given data bit or bits is called the encoding method. The drive logic or controller takes the data to be stored and encodes it as a series of flux reversals over a period of time, according to the encoding method used.

Modified Frequency Modulation (MFM) and Run Length Limited (RLL) are popular encoding methods. All floppy disk drives use the MFM scheme. Hard disks use MFM or several variations of RLL encoding methods. These encoding methods are described in more detail in the section "MFM Encoding" later in this chapter.

During the write process, voltage is applied to the head, and as the polarity of this voltage changes, the polarity of the magnetic field being recorded also changes. The flux transitions are written precisely at the points where the recording polarity changes. Strange as it may seem, during the read process, a head does not output exactly the same signal that was written; instead, the head generates a voltage pulse or spike only when it crosses a flux transition. When the transition changes from positive to negative, the pulse that the head would detect is negative voltage. When the transition changes from negative to positive, the pulse would be a positive voltage spike.

In essence, while reading the disk the head becomes a flux transition detector, emitting voltage pulses whenever it crosses a transition. Areas of no transition generate no pulse. Figure 14.1 shows the relationship between the read and write waveforms and the flux transitions recorded on a disk.

FIG. 14.1  Magnetic write and read processes.

You can think of the write pattern as being a square waveform that is at a positive or negative voltage level and that continuously polarizes the disk media in one direction or another. Where the waveform transitions go from positive to negative voltage, or vice versa, the magnetic flux on the disk also changes polarity. During a read, the head senses the flux transitions and outputs a pulsed waveform. In other words, the signal is zero volts unless a positive or negative transition is being detected, in which case there is a positive or negative pulse. Pulses appear only when the head is passing over flux transitions on the disk media. By knowing the clock timing used, the drive or controller circuitry can determine whether a pulse (and therefore a flux transition) falls within a given transition cell.
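The relationship between the write waveform and the read pulses can be modeled very simply. The Python sketch below is an idealized illustration only: it treats the write waveform as a list of +1/-1 levels (one per time step) and ignores noise, amplification, and clock timing. The function name and the sample waveform are assumptions.

```python
def read_pulses(write_levels):
    """Model the read head as a flux transition detector.

    write_levels is a sequence of +1 / -1 values representing the
    polarity written at each time step. The result has a +1 pulse where
    the polarity switches from negative to positive, a -1 pulse where it
    switches from positive to negative, and 0 everywhere else.
    """
    pulses = [0]  # no transition can be seen at the very first step
    for prev, cur in zip(write_levels, write_levels[1:]):
        if cur > prev:
            pulses.append(+1)   # negative-to-positive transition
        elif cur < prev:
            pulses.append(-1)   # positive-to-negative transition
        else:
            pulses.append(0)    # same polarity: no pulse
    return pulses


# Hypothetical square write waveform.
written = [+1, +1, -1, -1, -1, +1]
print(read_pulses(written))
```

Note that the output is not a copy of the input: long runs of constant polarity read back as silence, which is why the encoding schemes discussed in this chapter care so much about where transitions fall.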

The electrical pulse currents generated in the head while it is passing over a disk in read mode are very weak and can contain significant noise. Sensitive electronics in the drive and controller assembly then can amplify the signal above the noise level and decode the train of weak pulse currents back into data that is (theoretically) identical to the data originally recorded.

So as you now can see, disks are both recorded and read by means of basic electromagnetic principles. Data is recorded on a disk by passing electrical currents through an electromagnet (the drive head) that generates a magnetic field stored on the disk. Data on a disk is read by passing the head back over the surface of the disk; as the head encounters changes in the stored magnetic field, it generates a weak electrical current that indicates the presence or absence of flux transitions in the originally recorded signal.

Data Encoding Schemes

Magnetic media essentially is an analog storage medium. The data that we store on it, however, is digital information--that is, ones and zeros. When digital information is applied to a magnetic recording head, the head creates magnetic domains on the disk media with specific polarities. When a positive voltage is applied to the write head, the magnetic domains are polarized in one direction; when a negative voltage is applied, the magnetic domains are polarized in the opposite direction. When the digital waveform that is recorded switches from a positive to a negative voltage, the polarity of the magnetic domains is reversed.

During a readback, the head actually generates no voltage signal when it encounters a group of magnetic domains with the same polarity, but it generates a voltage pulse every time it detects a switch in polarity. Each flux reversal generates a voltage pulse in the read head; it is these pulses that the drive detects when reading data. A read head does not generate the same waveform that was written; instead, it generates a series of pulses, each pulse appearing where a magnetic flux transition has occurred.

To optimize the placement of pulses during magnetic storage, the raw digital input data is passed through a device called an encoder/decoder (endec), which converts the raw binary information to a waveform that is more concerned with the optimum placement of the flux transitions (pulses). During a read operation, the endec reverses the process and decodes the pulse train back into the original binary data. Over the years, several different schemes for encoding data in this manner have been developed; some are better or more efficient than others.

In any consideration of binary information, the use of timing is important. When interpreting a read or write waveform, the timing of each voltage transition event is critical. If the timing is off, a given voltage transition may be recognized at the wrong time, and bits may be missed, added, or simply misinterpreted. To ensure that the timing is precise, the transmitting and receiving devices must be in sync. This synchronization can be accomplished by adding a separate line for timing, called a clock signal, between the two devices. The clock and data signals also can be combined and then transmitted on a single line. This combination of clock and data is used in most magnetic data encoding schemes.

When the clock information is added in with the data, timing accuracy in interpreting the individual bit cells is ensured between any two devices. Clock timing is used to determine the start and end of each bit cell. Each bit cell is bounded by two clock cells where the clock transitions can be sent. First there is a clock transition cell, and then the data transition cell, and finally the clock transition cell for the data that follows. By sending clock information along with the data, the clocks will remain in sync, even if a long string of 0 bits is transmitted. Unfortunately, all the transition cells that are used solely for clocking take up space on the media that otherwise could be used for data.

Because the number of flux transitions that can be recorded on a particular medium is limited by the disk media and head technology, disk drive engineers have been trying various ways of encoding the data into a minimum number of flux reversals, taking into consideration the fact that some flux reversals, used solely for clocking, are required. This method permits maximum use of a given drive hardware technology.

Although various encoding schemes have been tried, only a few are popular today. Over the years, these three basic types have been the most popular:

  • Frequency Modulation (FM)

  • Modified Frequency Modulation (MFM)

  • Run Length Limited (RLL)

The following sections examine these codes and discuss how they work, where they have been used, and what advantages or disadvantages apply to them.

FM Encoding

One of the earliest techniques for encoding data for magnetic storage is called Frequency Modulation (FM) encoding. This encoding scheme, sometimes called Single Density encoding, was used in the earliest floppy disk drives that were installed in PC systems. The original Osborne portable computer, for example, used these Single Density floppy drives, which stored about 80K of data on a single disk. Although it was popular until the late 1970s, FM encoding no longer is used today.

MFM Encoding

Modified Frequency Modulation (MFM) encoding was devised to reduce the number of flux reversals used in the original FM encoding scheme and, therefore, to pack more data onto the disk. In MFM encoding, the use of the clock transition cells is minimized, leaving more room for the data. Clock transitions are recorded only if a stored 0 bit is preceded by another 0 bit; in all other cases, a clock transition is not required. Because the use of the clock transitions has been minimized, the actual clock frequency can be doubled from FM encoding, resulting in twice as many data bits being stored in the same number of flux transitions as in FM.

Because it is twice as efficient as FM encoding, MFM encoding also has been called Double Density recording. MFM is used in virtually all PC floppy drives today and was used in nearly all PC hard disks for a number of years. Today, most hard disks use RLL (Run Length Limited) encoding, which provides even greater efficiency than MFM.

Because MFM encoding places twice as many data bits in the same number of flux reversals as FM, the clock speed of the data is doubled, so that the drive actually sees the same number of total flux reversals as with FM. This means that data is read and written at twice the speed in MFM encoding, even though the drive sees the flux reversals arriving at the same frequency as in FM. This method allows existing drive technology to store twice the data and deliver it twice as fast.

The only caveat is that MFM encoding requires improved disk controller and drive circuitry, because the timing of the flux reversals must be more precise than in FM. As it turned out, these improvements were not difficult to achieve, and MFM encoding became the most popular encoding scheme for many years.

Table 14.3 shows the data bit to flux reversal translation in MFM encoding.

Table 14.3  MFM Data to Flux Transition Encoding

Data Bit Value    Flux Encoding
1                 NT
0 preceded by 0   TN
0 preceded by 1   NN

T = Flux transition   N = No flux transition
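Table 14.3 translates directly into a few lines of code. The following Python sketch is a minimal illustration rather than a model of any real controller; the function name is hypothetical, and the `prev` parameter (the bit assumed to precede the first data bit) is an assumption, because the table does not say how the start of a stream is handled.

```python
def mfm_encode(bits, prev="0"):
    """Encode a string of '0'/'1' data bits per Table 14.3.

    Returns a string of transition cells: 'T' = flux transition,
    'N' = no flux transition. `prev` is the bit assumed to come just
    before the first data bit (an illustrative assumption).
    """
    cells = []
    for bit in bits:
        if bit == "1":
            cells.append("NT")      # a 1 is always a data transition
        elif prev == "0":
            cells.append("TN")      # 0 after 0: clock transition only
        else:
            cells.append("NN")      # 0 after 1: no transition at all
        prev = bit
    return "".join(cells)


print(mfm_encode("101100"))
```

Each data bit still occupies two transition cells, but no two transitions ever land in adjacent cells, which is what allows the clock rate to be doubled relative to FM.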

RLL Encoding

Today's most popular encoding scheme for hard disks, called RLL (Run Length Limited), packs up to 50 percent more information on a given disk than even MFM does and three times as much information as FM. In RLL encoding, groups of bits are taken as a unit and combined to generate specific patterns of flux reversals. By combining the clock and data in these patterns, the clock rate can be further increased while maintaining the same basic distance between the flux transitions on the disk.

IBM invented RLL encoding and first used the method in many of its mainframe disk drives. During the late 1980s, the PC hard disk industry began using RLL encoding schemes to increase the storage capabilities of PC hard disks. Today, virtually every drive on the market uses some form of RLL encoding.

Instead of encoding a single bit, RLL normally encodes a group of data bits at a time. The term Run Length Limited is derived from the two primary specifications of these codes, which are the minimum number (the run length) and maximum number (the run limit) of transition cells allowed between two actual flux transitions. Several schemes can be achieved by changing the length and limit parameters, but only two have achieved any real popularity: RLL 2,7 and RLL 1,7.

Even FM and MFM encoding can be expressed as a form of RLL. FM can be called RLL 0,1, because there can be as few as zero and as many as one transition cell separating two flux transitions. MFM can be called RLL 1,3, because as few as one and as many as three transition cells can separate two flux transitions. Although these codes can be expressed in RLL form, it is not common to do so.

RLL 2,7 initially was the most popular RLL variation because it offers a high-density ratio with a transition detection window that is the same relative size as that in MFM. This method allows for high storage density with fairly good reliability. In very high-capacity drives, however, RLL 2,7 did not prove to be reliable enough. Most of today's highest-capacity drives use RLL 1,7 encoding, which offers a density ratio 1.27 times that of MFM and a larger transition detection window relative to MFM. Because of the larger relative window size within which a transition can be detected, RLL 1,7 is a more forgiving and more reliable code, and forgiveness and reliability are required when media and head technology are being pushed to their limits.

Another little-used RLL variation called RLL 3,9--sometimes called ARLL (Advanced RLL)--allowed an even higher density ratio than RLL 2,7. Unfortunately, reliability suffered too greatly under the RLL 3,9 scheme; the method was used by only a few controller companies that have all but disappeared.

It is difficult to understand how RLL codes work without looking at an example. Because RLL 2,7 was the most popular form of RLL encoding used with older controllers, I will use it as an example. Even within a given RLL variation such as RLL 2,7 or 1,7, many different flux transition encoding tables can be constructed to show what groups of bits are encoded as what sets of flux transitions. For RLL 2,7 specifically, thousands of different translation tables could be constructed, but for my examples, I will use the endec table used by IBM because it is the most popular variation used.

According to the IBM conversion tables, specific groups of data bits two, three, and four bits long are translated into strings of flux transitions four, six, and eight transition cells long, respectively. The selected transitions coded for a particular bit sequence are designed to ensure that flux transitions do not occur too close together or too far apart.

It is necessary to limit how close two flux transitions can be because of the basically fixed resolution capabilities of the head and disk media. Limiting how far apart these transitions can be ensures that the clocks in the devices remain in sync.

Table 14.4 shows the IBM-developed encoding scheme for 2,7 RLL.

Table 14.4  RLL 2,7 (IBM Endec) Data to Flux Transition Encoding

Data Bit Values    Flux Encoding
10                 NTNN
11                 TNNN
000                NNNTNN
010                TNNTNN
011                NNTNNN
0010               NNTNNTNN
0011               NNNNTNNN

T = Flux transition
N = No flux transition
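The table lookup can be sketched in a few lines of Python. This is only an illustration of how Table 14.4 is applied, not a description of how any real endec is built; the names RLL_2_7 and rll27_encode are my own. Because no bit group in the table is the beginning of another group, the encoder can simply collect bits until the group it holds appears in the table.

```python
# Table 14.4 as a lookup: data bit group -> flux pattern
# (T = flux transition, N = no flux transition).
RLL_2_7 = {
    "10": "NTNN",
    "11": "TNNN",
    "000": "NNNTNN",
    "010": "TNNTNN",
    "011": "NNTNNN",
    "0010": "NNTNNTNN",
    "0011": "NNNNTNNN",
}


def rll27_encode(bits):
    """Encode a string of '0'/'1' data bits.

    Returns (flux_pattern, leftover_bits). leftover_bits is non-empty
    when the input ends in the middle of a group.
    """
    out = []
    group = ""
    for bit in bits:
        group += bit
        # No group is a prefix of another, so the first match is the only match.
        if group in RLL_2_7:
            out.append(RLL_2_7[group])
            group = ""
    return "".join(out), group


# The byte 01011000b splits into the groups 010, 11, and 000.
print(rll27_encode("01011000"))
# The byte 00000001b splits into 000 and 000, leaving 01 unfinished.
print(rll27_encode("00000001"))
```

The second call shows why a single byte cannot always be encoded by itself: the leftover bits have to be completed by whatever bits follow.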

In studying this table, you may think that encoding a byte such as 00000001b would be impossible because no combinations of data bit groups fit this byte. Encoding this type of byte is not a problem, however, because the controller does not transmit individual bytes; instead, the controller sends whole sectors, making it possible to encode such a byte simply by including some of the bits in the following byte. The only real problem occurs in the last byte of a sector if additional bits are needed to complete the final group sequence. In these cases, the endec in the controller simply adds excess bits to the end of the last byte. These excess bits are truncated during any reads so that the last byte always is decoded correctly.
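The end-of-sector case can be sketched as follows. The text says only that excess bits are added; the choice of 0 as the pad bit, and the function name rll27_encode_padded, are assumptions made for this illustration. Any pad value works, because every unfinished group can be completed into one of the groups in Table 14.4.

```python
# Table 14.4: data bit group -> flux pattern.
RLL_2_7 = {
    "10": "NTNN", "11": "TNNN",
    "000": "NNNTNN", "010": "TNNTNN", "011": "NNTNNN",
    "0010": "NNTNNTNN", "0011": "NNNNTNNN",
}


def rll27_encode_padded(bits, pad_bit="0"):
    """Encode data bits, padding the final group if it is incomplete.

    Returns (flux_pattern, pad_count). pad_bit is an assumed value;
    a reader would discard pad_count bits after decoding.
    """
    out = []
    group = ""
    for bit in bits:
        group += bit
        if group in RLL_2_7:
            out.append(RLL_2_7[group])
            group = ""
    pad_count = 0
    while group:                      # unfinished final group
        group += pad_bit
        pad_count += 1
        if group in RLL_2_7:
            out.append(RLL_2_7[group])
            group = ""
    return "".join(out), pad_count


# 00000001b as the last byte of a sector: 000, 000, then 01 plus one pad bit.
flux, pads = rll27_encode_padded("00000001")
print(flux, pads)
```

In this sketch the pad count is returned so that the excess bits can be truncated again on a read, as the paragraph above describes.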

Encoding Scheme Comparisons

Figure 14.2 shows an example of the waveform written to store the ASCII character X on a hard disk drive under three different encoding schemes.

FIG. 14.2  ASCII character "X" write waveforms using FM, MFM, and RLL 2,7 encoding.

In each of these encoding-scheme examples, the top line shows the individual data bits (01011000b) in their bit cells separated in time by the clock signal, which is shown as a period (.). Below that line is the actual write waveform, showing the positive and negative voltages as well as voltage transitions that result in the recording of flux transitions. The bottom line shows the transition cells, with T representing a transition cell that contains a flux transition and N representing a transition cell that is empty.

The FM encoding example is easy to explain. Each bit cell has two transition cells: one for the clock information and one for the data itself. All the clock transition cells contain flux transitions, and the data transition cells contain a flux transition only if the data is a 1 bit. No transition at all is used to represent a 0 bit. Starting from the left, the first data bit is 0, which is encoded as a flux transition pattern of TN. The next bit is a 1, which is encoded as TT. The next bit is 0, which is encoded as TN, and so on. Using Table 14.2, you easily can trace the FM encoding pattern to the end of the byte.
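The FM rule is simple enough to express directly. The following Python sketch (the function name fm_encode is my own) writes a clock transition for every bit and a data transition only for 1 bits, using the same T and N notation as the figure.

```python
def fm_encode(bits):
    """FM: each data bit becomes a clock cell (always T) plus a data cell."""
    cells = []
    for bit in bits:
        clock_cell = "T"                        # clock transition is always present
        data_cell = "T" if bit == "1" else "N"  # data transition only for a 1 bit
        cells.append(clock_cell + data_cell)
    return "".join(cells)


# ASCII "X" is 01011000b.
print(fm_encode("01011000"))  # TNTTTNTTTTTNTNTN
```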

The MFM encoding scheme also has clock and data transition cells for each data bit to be recorded. As you can see, however, the clock transition cells carry a flux transition only when a 0 bit is stored after another 0 bit. Starting from the left, the first bit is a 0, and the preceding bit is unknown (assume 0), so the flux transition pattern is TN for that bit. The next bit is a 1, which always is encoded as a transition-cell pattern of NT. The next bit is 0, which was preceded by 1, so the pattern stored is NN. Using Table 14.3, you can easily trace the MFM encoding pattern to the end of the byte. You can see that the minimum and maximum number of transition cells between any two flux transitions is one and three, respectively; hence, MFM encoding also can be called RLL 1,3.
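The same trace can be written as a short Python sketch. The names mfm_encode and empty_runs are my own, and the preceding bit is assumed to be 0, as in the example above. The second function counts the empty transition cells between consecutive flux transitions, which is how the 1,3 figures can be checked.

```python
def mfm_encode(bits, previous="0"):
    """MFM: 1 -> NT; 0 after 0 -> TN; 0 after 1 -> NN."""
    cells = []
    for bit in bits:
        if bit == "1":
            cells.append("NT")
        elif previous == "0":
            cells.append("TN")   # clock transition only between two 0 bits
        else:
            cells.append("NN")
        previous = bit
    return "".join(cells)


def empty_runs(flux):
    """Number of N cells between each pair of consecutive T cells."""
    positions = [i for i, cell in enumerate(flux) if cell == "T"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


flux = mfm_encode("01011000")          # ASCII "X"
print(flux)                            # TNNTNNNTNTNNTNTN
runs = empty_runs(flux)
print(min(runs), max(runs))            # 1 3
```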

The RLL 2,7 pattern is more difficult to see because it relies on encoding groups of bits rather than encoding each bit individually. Starting from the left, the first group that matches one of the groups listed in Table 14.4 is the first three bits, 010. These bits are translated into a flux transition pattern of TNNTNN. The next two bits, 11, are translated as a group to TNNN; and the final group, 000, is translated to NNNTNN to complete the byte. As you can see in this example, no additional bits were needed to finish the last group.

Notice that the minimum and maximum number of empty transition cells between any two flux transitions in this example are two and six, although a different example could show a maximum of seven empty transition cells. This is where the RLL 2,7 designation comes from. Because even fewer transitions are recorded than in MFM, the clock rate can be further increased to three times that of FM or 1.5 times that of MFM, allowing more data to be stored in the same space on the disk. Notice, however, that the resulting write waveform itself looks exactly like a typical FM or MFM waveform in terms of the number and separation of the flux transitions for a given physical portion of the disk. In other words, the physical minimum and maximum distances between any two flux transitions remain the same in all three of these encoding-scheme examples.
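The figures in this paragraph can be checked with a little arithmetic. The Python sketch below is my own illustration and rests on one stated assumption: the shortest physical distance allowed between two flux transitions is the same for every scheme. Under that assumption, a code that guarantees more empty cells between transitions can use shorter cells, and because FM, MFM, and RLL 2,7 all use two transition cells per data bit, the data density scales with the minimum spacing.

```python
def empty_runs(flux):
    """Number of N cells between each pair of consecutive T cells."""
    positions = [i for i, cell in enumerate(flux) if cell == "T"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


# RLL 2,7 pattern for ASCII "X" (010, 11, 000), taken from the trace above.
rll_x = "TNNTNN" + "TNNN" + "NNNTNN"
runs = empty_runs(rll_x)
print(min(runs), max(runs))  # 2 6

# Minimum empty cells between transitions for each scheme (from the text).
MIN_EMPTY = {"FM": 0, "MFM": 1, "RLL 2,7": 2}
CELLS_PER_BIT = 2  # all three schemes use two transition cells per data bit


def relative_density(scheme):
    """Data bits per minimum transition spacing (assumed fixed spacing)."""
    return (MIN_EMPTY[scheme] + 1) / CELLS_PER_BIT


print(relative_density("RLL 2,7") / relative_density("FM"))   # 3.0
print(relative_density("RLL 2,7") / relative_density("MFM"))  # 1.5
```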

Another new feature in high-end drives involves the disk read circuitry. Read channel circuits using Partial-Response, Maximum-Likelihood (PRML) technology allow disk drive manufacturers to increase the amount of data that can be stored on a disk platter by up to 40 percent. PRML replaces the standard "detect one peak at a time" approach of traditional analog peak-detect read/write channels with digital signal processing. In digital signal processing, noise can be digitally filtered out, allowing flux change pulses to be placed closer together on the platter, achieving greater densities.

I hope that the examinations of these different encoding schemes and how they work have taken some of the mystery out of the way data is recorded on a drive. You can see that although schemes such as MFM and RLL can store more data on a drive, the actual density of the flux transitions remains the same as far as the drive is concerned.

Sectors

A disk track is too large to manage effectively as a single storage unit. Many disk tracks can store 50,000 or more bytes of data, which would be very inefficient for storing small files. For that reason, a disk track is divided into several numbered divisions known as sectors. These sectors represent slices of the track.

Different types of disk drives and disks split tracks into different numbers of sectors, depending on the density of the tracks. For example, floppy disk formats use 8 to 36 sectors per track, whereas hard disks usually store data at a higher density and can use 17 to 100 or more sectors per track. Sectors created by standard formatting procedures on PC systems have a capacity of 512 bytes, but this capacity may change in the future.

Sectors are numbered on a track starting with 1, unlike the heads or cylinders, which are numbered starting with 0. For example, a 1.44M floppy disk contains 80 cylinders numbered from 0 to 79 and two heads numbered 0 and 1, and each track on each cylinder has 18 sectors numbered from 1 to 18.
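The numbering in the floppy example can be worked through in Python. This sketch uses the geometry given above together with the 512-byte sector size mentioned earlier; the function name chs_addresses is my own.

```python
CYLINDERS = 80          # numbered 0 to 79
HEADS = 2               # numbered 0 and 1
SECTORS_PER_TRACK = 18  # numbered 1 to 18
BYTES_PER_SECTOR = 512


def chs_addresses():
    """Yield every (cylinder, head, sector) address on the disk in order."""
    for cylinder in range(CYLINDERS):                    # starts at 0
        for head in range(HEADS):                        # starts at 0
            for sector in range(1, SECTORS_PER_TRACK + 1):  # starts at 1
                yield (cylinder, head, sector)


addresses = list(chs_addresses())
total_sectors = len(addresses)
total_bytes = total_sectors * BYTES_PER_SECTOR

print(addresses[0], addresses[-1])  # (0, 0, 1) (79, 1, 18)
print(total_sectors)                # 2880
print(total_bytes)                  # 1474560
print(total_bytes // 1024)          # 1440 (kilobytes, hence "1.44M")
```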

When a disk is formatted, additional ID areas are created on the disk for the disk controller to use for sector numbering and identifying the start and end of each sector. These areas precede and follow each sector's data area, which accounts for the difference between a disk's unformatted and formatted capacities. These sector headers, inter-sector gaps, and so on are independent of the operating system, file system, or files stored on the drive. For example, a 4M floppy disk (3 1/2-inch) has a capacity of 2.88M when it is formatted, a 2M floppy has a formatted capacity of 1.44M, and an older 38M hard disk has a capacity of only 32M when it is formatted. Modern IDE and SCSI hard drives are preformatted, so the manufacturers now only advertise formatted capacity. Even so, nearly all drives use some reserved space for managing the data that can be stored on the drive.

Although I have stated that each disk sector is 512 bytes in size, this statement technically is false. Each sector does allow for the storage of 512 bytes of data, but the data area is only a portion of the sector. Each sector on a disk typically occupies 571 bytes of the disk, of which only 512 bytes are usable for user data. The actual number of bytes required for the sector header and trailer can vary from drive to drive, but this figure is typical. A few modern drives now use an ID-less recording which virtually eliminates the storage overhead of the sector header information. In an ID-less recording, virtually all of the space on the track is occupied by data.
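To put the typical figures in perspective, the following Python arithmetic computes how much of each sector is usable. It uses only the 571-byte and 512-byte values given above and, for the per-track totals, the 17-sector track used as the example later in this section; the variable names are my own.

```python
SECTOR_TOTAL_BYTES = 571   # typical space one sector occupies on the disk
SECTOR_DATA_BYTES = 512    # portion usable for user data
SECTORS_PER_TRACK = 17     # example track layout

overhead = SECTOR_TOTAL_BYTES - SECTOR_DATA_BYTES
efficiency = SECTOR_DATA_BYTES / SECTOR_TOTAL_BYTES

print(overhead)                                  # 59 bytes of header, trailer, and gaps
print(round(efficiency * 100, 1))                # 89.7 percent usable
print(SECTORS_PER_TRACK * SECTOR_DATA_BYTES)     # 8704 data bytes per track
print(SECTORS_PER_TRACK * SECTOR_TOTAL_BYTES)    # 9707 bytes occupied by sectors
```

The per-track total counts only the sectors themselves; any gaps at the start or end of the track add to it.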

You may find it helpful to think of each sector as being a page in a book. In a book, each page contains text, but the entire page is not filled with text; rather, each page has top, bottom, left, and right margins. Information such as chapter titles (track and cylinder numbers) and page numbers (sector numbers) is placed in the margins. The "margin" areas of a sector are created and written to during the disk-formatting process. Formatting also fills the data area of each sector with dummy values. After the disk is formatted, the data area can be altered by normal writing to the disk. The sector header and trailer information cannot be altered during normal write operations unless you reformat the disk.

Each sector on a disk has a prefix portion, or header, that identifies the start of the sector and contains the sector number, as well as a suffix portion, or trailer, that contains a checksum (which helps ensure the integrity of the data contents). Each sector also contains 512 bytes of data. The data bytes normally are set to some specific value, such as F6h (hex), when the disk is physically (or low-level) formatted. (The following section explains low-level formatting.)

In many cases, a specific pattern of bytes that is considered difficult to write is written so as to flush out any marginal sectors. In addition to the gaps within the sectors, gaps exist between sectors on each track and also between tracks; none of these gaps contain usable data space. The prefix, suffix, and gaps account for the space lost between the unformatted capacity of a disk and the formatted capacity.

Table 14.5 shows the format for each track and sector on a typical hard disk with 17 sectors per track.

Table 14.5  Typical 17-Sector-per-Track Disk Sector Format

Bytes Name Description
16 POST INDEX GAP All 4Eh, at the track beginning after the Index mark.
The following sector data (shown between the lines in this table) is repeated 17 times for an MFM encoded track.
13 ID VFO LOCK All 00h; synchronizes the VFO for the sector ID.
1 SYNC BYTE A1h; notifies the controller that data follows.
1 ADDRESS MARK FEh; defines that ID field data follows.
2 CYLINDER NUMBER A value that defines the actuator position.
1 HEAD NUMBER A value that defines the head selected.
1 SECTOR NUMBER A value that defines the sector.
2 CRC Cyclic Redundancy Check to verify ID data.
3 WRITE TURN-ON GAP 00h written by format to isolate the ID from DATA.
13 DATA SYNC VFO LOCK All 00h; synchronizes the VFO for the DATA.
1 SYNC BYTE A1h; notifies the controller that data follows.
1 ADDRESS MARK F8h; defines that user DATA field follows.
512 DATA The area for user DATA.
2 CRC Cyclic Redundancy Ch