To most users, the hard disk drive is the most important,
yet most mysterious, part of a computer system. A hard disk
drive is a sealed unit that holds the data in a system.
When the hard disk fails, the consequences usually are very
serious. To maintain, service, and expand a PC system
properly, you must fully understand the hard disk unit.
Most computer users want to know how hard disk drives work
and what to do when a problem occurs. Few books about hard
disks, however, cover the detail necessary for the PC
technician or sophisticated user. This chapter corrects that
situation.
This chapter thoroughly describes the hard disk drive from
a physical, mechanical, and electrical point of view. In
particular, this chapter examines the construction and
operation of a hard disk drive in a practical sense.
Definition of a Hard Disk
A hard disk drive contains rigid, disk-shaped platters
usually constructed of aluminum or glass. Unlike floppy disks,
the platters cannot bend or flex--hence the term hard
disk. In most hard disk drives, the platters cannot be
removed; for that reason, IBM calls them fixed disk
drives. Although removable hard disk drives have become popular
of late, the Jaz drive by Iomega is unlike its smaller brother,
the Zip drive, in that the Jaz drive's removable media consists
of the same rigid platters found in any fixed disk drive.
Hard disk drives used to be called Winchester
drives. This term dates back to the early 1970s, when IBM
developed a high-speed hard disk drive that had 30M of
fixed-platter storage and 30M of removable-platter storage.
The drive had platters that spun at high speeds and heads that
floated over the platters while they spun in a sealed
environment. That drive, the 30-30 drive, soon received
the nickname Winchester after the famous Winchester
30-30 rifle. After that time, drives that used a high-speed
spinning platter with a floating head also became known as
Winchester drives. The term has no technical or
scientific meaning; it is a slang term, and is considered
synonymous with hard disk.
Capacity Measurements
To eliminate confusion in capacity measurements, I will be
using the abbreviation M in this section. The true industry
standard abbreviations for these figures are shown in Table
14.1.
Table 14.1 Standard Abbreviations and Meanings

Abbreviation | Description | Decimal Meaning   | Binary Meaning
Kbit         | Kilobit     | 1,000             | 1,024
K            | Kilobyte    | 1,000             | 1,024
Mbit         | Megabit     | 1,000,000         | 1,048,576
M            | Megabyte    | 1,000,000         | 1,048,576
Gbit         | Gigabit     | 1,000,000,000     | 1,073,741,824
G            | Gigabyte    | 1,000,000,000     | 1,073,741,824
Tbit         | Terabit     | 1,000,000,000,000 | 1,099,511,627,776
T            | Terabyte    | 1,000,000,000,000 | 1,099,511,627,776
Unfortunately, the abbreviations themselves do not distinguish
between metric (decimal) and binary values. In other words, M
can be used to indicate both "millions of bytes" and megabytes.
In general, memory values are always computed using the
binary-derived meanings, while disk capacity is quoted either
way, which often leads to confusion in reporting disk
capacities.
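To see how far apart the two meanings can drift, the following
short sketch in Python converts a capacity quoted in decimal
units into its binary equivalent. The 2.1G drive size is a
hypothetical figure chosen only for illustration:

    # Convert a capacity quoted in decimal (metric) units into binary units.
    decimal_bytes = 2.1 * 1_000_000_000    # a hypothetical "2.1G" drive as marketed
    print(f"{decimal_bytes:,.0f} bytes")
    print(f"= {decimal_bytes / 1_048_576:,.1f}M (binary)")     # about 2,002.7M
    print(f"= {decimal_bytes / 1_073_741_824:.2f}G (binary)")  # about 1.96G

The same drive therefore reports as roughly 1.96G when measured
in binary gigabytes, which explains many apparent capacity
discrepancies.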
Hard Drive Advancements
In the 15 or more years that hard disks have commonly been
used in PC systems, they have undergone tremendous changes. To
give you an idea of how far hard drives have come in that
time, following are some of the most profound changes in PC
hard disk storage:
- Maximum storage capacities have increased from the 10M 5
1/4-inch full-height drives available in 1982 to 10G or more
for small 3 1/2-inch half-height drives, and 3G or more for
notebook 2 1/2-inch drives.
- Data transfer rates from the media have increased from the
85K to 102K/sec of the original IBM XT's hard disk in 1983 to
nearly 10M/sec for some of the fastest drives today.
- Average seek times have decreased from more than 85 ms
(milliseconds) for the 10M XT hard disk in 1983 to less
than 8 ms for some of the fastest drives today.
- In 1982, a 10M drive cost more than $1,500 ($150 per
megabyte). Today, the cost of hard drives has dropped to 10
cents per megabyte or less.
Areal Density
Areal density has been used as a primary
technology-growth-rate indicator for the hard disk drive
industry. Areal density is defined as the product of
the linear bits per inch (BPI), measured along the length of
the tracks around the disk, multiplied by the number of tracks
per inch (TPI) measured radially on the disk. The results are
expressed in units of Mbit per square inch (Mbit/sq-inch) and
are used as a measure of efficiency in drive recording
technology. Current high-end 2.5-inch drives record at areal
densities of about 1.5Gbit per square inch (Gbit/sq-inch).
Prototype drives with densities as high as 10Gbit/sq-inch have
been constructed, allowing for capacities of more than 20G on
a single 2 1/2-inch platter for notebook drives.
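These figures are easy to verify with a quick calculation. The
sketch below multiplies the linear and track densities of the
Seagate Barracuda 2 drive described later in Table 14.2:

    # Areal density = linear density (BPI) x track density (TPI),
    # using the Seagate ST-12550N figures from Table 14.2.
    bpi = 52_187                  # bits per inch along the track
    tpi = 3_047                   # tracks per inch radially
    areal_density = bpi * tpi     # bits per square inch
    print(f"{areal_density / 1_000_000:.0f} Mbit/sq-inch")   # about 159 Mbit/sq-inch

The Barracuda 2 thus records at roughly 159 Mbit/sq-inch, an
order of magnitude below the 1.5Gbit/sq-inch figure quoted above
for current high-end 2.5-inch drives.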
Areal density (and, therefore, drive capacity) has been
doubling approximately every two to three years, and
production disk drives are likely to reach areal densities of
10+Gbit/sq-inch before the year 2000. A drive built with this
technology would be capable of storing more than 10G of data
on a single 2 1/2-inch platter, allowing 20 or 30G drives to
be constructed that fit in the palm of your hand. New media
and head technologies, such as ceramic or glass platters, MR
(Magneto-Resistive) heads, pseudo-contact recording, and PRML
(Partial Response Maximum Likelihood) electronics, are being
developed to support these higher areal densities. The primary
challenge in achieving higher densities is manufacturing drive
heads and disks to operate at closer tolerances.
It seems almost incredible that computer technology
improves by doubling performance or capacity every two to
three years--if only other industries could match that growth
and improvement rate!
Hard Disk Drive Operation
The basic physical operation of a hard disk drive is
similar to that of a floppy disk drive: A hard drive uses
spinning disks with heads that move over the disks and store
data in tracks and sectors. A track is a concentric
ring of information, which is divided into individual sectors
that normally store 512 bytes each. In many other ways,
however, hard disk drives are different from floppy disk
drives.
Hard disks usually have multiple platters, each with two
sides on which data can be stored. Most drives have at least
two or three platters, resulting in four or six sides, and
some drives have up to 11 or more platters. The identically
positioned tracks on each side of every platter together make
up a cylinder. A hard disk drive normally has one head
per platter side, and all the heads are mounted on a common
carrier device, or rack. The heads move in and out
across the disk in unison; they cannot move independently
because they are mounted on the same rack.
Hard disks operate much faster than floppy drives. Most
hard disks originally spun at 3,600 RPM--approximately 10
times faster than a floppy drive. Until recently, 3,600 RPM
was pretty much a constant among hard drives. Now, however,
quite a few hard drives spin even faster. The Toshiba 3.3G
drive in my notebook computer spins at 4,852 RPM; other drives
spin as fast as 5,400, 5,600, 6,400, 7,200 and even 10,000
RPM. High rotational speed combined with a fast
head-positioning mechanism and more sectors per track make one
hard disk faster than another, and all these features combine
to make hard drives much faster than floppy drives in storing
and retrieving data.
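Rotational speed also determines how long the drive must wait,
on average, for a requested sector to rotate under the heads
(the rotational latency, which averages half a revolution). A
minimal sketch of that relationship:

    # Average rotational latency = time for half a revolution.
    for rpm in (3_600, 5_400, 7_200, 10_000):
        latency_ms = 0.5 * 60_000 / rpm    # 60,000 milliseconds per minute
        print(f"{rpm:>6} RPM -> {latency_ms:.2f} ms average rotational latency")

Doubling the spindle speed from 3,600 to 7,200 RPM therefore
cuts the average rotational wait from about 8.3 ms to about
4.2 ms.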
The heads in most hard disks do not (and should not!) touch
the platters during normal operation. When the drive is powered
off, however, the heads land on the platters as the platters
stop spinning. While the drive is on, a very thin cushion of air
keeps each head suspended a short distance above or below the
platter. If the air cushion is disturbed by a particle of dust
or a shock, the head may come into contact with the platter
spinning at full speed. When contact with the spinning
platters is forceful enough to do damage, the event is called
a head crash. The result of a head crash may be
anything from a few lost bytes of data to a totally trashed
drive. Most drives have special lubricants on the platters and
hardened surfaces that can withstand the daily "takeoffs and
landings" as well as more severe abuse.
Because the platter assemblies are sealed and
non-removable, track densities can be very high. Many drives
have track densities of 3,000 TPI or more. Head Disk Assemblies
(HDAs), which contain the platters, are assembled and
sealed in clean rooms under absolutely sanitary conditions.
Because few companies repair HDAs, repair or replacement of
items inside a sealed HDA can be expensive. Every hard disk
ever made will eventually fail. The only questions are when
the hard disk will fail and whether your data is backed
up.
Many PC users think that hard disks are fragile, and
generally, they are one of the most fragile components in your
PC. In my weekly PC Hardware and Troubleshooting or Data
Recovery seminars, however, I have run various hard disks for
days with the lids off, and have even removed and installed
the covers while the drives were operating. Those drives
continue to store data perfectly to this day with the lids
either on or off. Of course, I do not recommend that you try
this test with your own drives; neither would I use this test
on my larger, more expensive drives.
The Ultimate Hard Disk Drive Analogy
I'm sure that you have heard the traditional analogy that
compares the interaction of the head and media in a typical
hard disk as being similar in scale to a 747 flying a few feet
off the ground at cruising speed (500+ mph). I have heard this
analogy used over and over again for years, and I've even used
it in my seminars many times without checking to see whether
the analogy is technically accurate with respect to modern
hard drives.
One highly inaccurate aspect of the 747 analogy has always
bothered me--the use of an airplane of any type to describe
the head-and-platter interaction. This analogy implies that
the heads fly very low over the surface of the disk--but
technically, this is not true. The heads do not fly at all, in
the traditional aerodynamic sense; instead, they float on a
cushion of air that's dragged around by the platters.
A much better analogy would use a hovercraft instead of an
airplane; the action of a hovercraft much more closely
emulates the action of the heads in a hard disk drive. Like a
hovercraft, the drive heads rely somewhat on the shape of the
bottom of the head to capture and control the cushion of air
that keeps them floating over the disk. By nature, the cushion
of air on which the heads float forms only in very close
proximity to the platter and is often called an air
bearing by the disk drive industry.
I thought it was time to come up with a new analogy that
more correctly describes the dimensions and speeds at which a
hard disk operates today. I looked up the specifications on a
specific hard disk drive, and then equally magnified and
rescaled all the dimensions involved to make the head floating
height equal to 1 inch. For my example, I used a Seagate model
ST-12550N Barracuda 2 drive, which is a 2G (formatted
capacity), 3 1/2-inch SCSI-2 drive. In fact, I originally
intended to install this drive in the portable system on which
I am writing this book, but the technology took another leap
and I ended up installing an ST-15230N Hawk 4 drive (4G)
instead! Table 14.2 shows the specifications of the Barracuda
drive, as listed in the technical documentation.
Table 14.2 Seagate ST-12550N Barracuda 2, 3 1/2-inch, SCSI-2 Drive Specifications

Specification             | Value  | Unit of Measure
Linear density            | 52,187 | Bits Per Inch (BPI)
Bit spacing               | 19.16  | Micro-inches (u-in)
Track density             | 3,047  | Tracks Per Inch (TPI)
Track spacing             | 328.19 | Micro-inches (u-in)
Total tracks              | 2,707  | Tracks
Rotational speed          | 7,200  | Revolutions per minute (RPM)
Average head linear speed | 53.55  | Miles per hour (MPH)
Head slider length        | 0.08   | Inches
Head slider height        | 0.02   | Inches
Head floating height      | 5      | Micro-inches (u-in)
Average seek time         | 8      | Milliseconds (ms)
By interpreting these specifications, you can see that in
this drive, the head sliders are about 0.08-inch long and
0.02-inch high. The heads float on a cushion of air about 5
u-in (millionths of an inch) from the surface of the disk
while traveling at an average speed of 53.55 MPH (figuring an
average track diameter of 2 1/2 inches). These heads read and
write individual bits spaced only 19.16 u-in apart on tracks
separated by only 328.19 u-in. The heads can move from one
track to any other in only 8ms during an average seek
operation.
To create my analogy, I simply magnified the scale to make
the floating height equal to 1 inch. Because 1 inch is 200,000
times greater than 5 u-in, I scaled up everything else by the
same amount.
The heads of this "typical" hard disk, magnified to such a
scale, would be more than 1,300 feet long and 300 feet high
(about the size of the Sears Tower, lying sideways!),
traveling at a speed of more than 10.7 million MPH (2,975
miles per second!) only 1 inch above the ground, reading data
bits spaced a mere 3.83 inches apart on tracks separated by
only 5.47 feet.
Additionally, because the average seek of 8ms (.008
seconds) is defined as the time it takes to move the heads
over one-third of the total tracks (about 902, in this case),
each skyscraper-size head could move sideways to any track
within a distance of 0.93 miles (902 tracks x 5.47 feet), which
results in an average sideways velocity of more than 420,000
MPH (116 miles per second)!
The forward speed of this imaginary head is difficult to
comprehend, so I'll elaborate. The diameter of the Earth at
the equator is 7,926 miles, which means a circumference of
about 24,900 miles. At 2,975 miles per second, this imaginary
head would circle the Earth about once every 8 seconds!
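The arithmetic behind this analogy is easy to reproduce. The
following sketch rescales the Barracuda figures from Table 14.2
by the same 200,000x factor (1 inch divided by 5 u-in) and
recomputes the numbers quoted above:

    # Rescale the Seagate ST-12550N figures from Table 14.2 so that the
    # 5 u-in floating height becomes 1 inch -- a 200,000x magnification.
    scale = 1 / 5e-6                               # 200,000

    slider_length_ft = 0.08 * scale / 12           # ~1,333 feet long
    slider_height_ft = 0.02 * scale / 12           # ~333 feet high
    speed_mph = 53.55 * scale                      # ~10.7 million MPH
    speed_mps = speed_mph / 3600                   # ~2,975 miles per second
    bit_spacing_in = 19.16e-6 * scale              # ~3.83 inches between bits
    track_spacing_ft = 328.19e-6 * scale / 12      # ~5.47 feet between tracks

    # Average seek: one-third of the 2,707 tracks (about 902) in 8 ms.
    seek_miles = 902 * track_spacing_ft / 5280     # ~0.93 miles sideways
    seek_mph = seek_miles / 0.008 * 3600           # ~420,000 MPH average

    # Time for the scaled-up head to circle the Earth (about 24,900 miles).
    print(f"{24_900 / speed_mps:.1f} seconds per trip around the Earth")   # ~8.4

Every figure in the analogy falls out of the same simple
multiplication, so you can plug in the specifications of any
other drive and scale it the same way.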
This analogy should give you a new appreciation of the
technological marvel that the modern hard disk drive actually
represents. It makes the 747 analogy look rather pathetic (not
to mention totally inaccurate), doesn't it?
Magnetic Data Storage
Learning how magnetic data storage works will help you
develop a feel for the way that your disk drives operate and
can improve the way that you work with disk drives and
disks.
Nearly all disk drives in personal computer systems operate
on magnetic principles. Purely optical disk drives often are
used as a secondary form of storage, but the computer to which
they are connected is likely to use a magnetic storage medium
for primary disk storage. Due to the high performance and
density capabilities of magnetic storage, optical disk drives
and media probably never will totally replace magnetic storage
in PC systems.
Magnetic drives, such as floppy and hard disk drives,
operate by using electromagnetism. This basic principle
of physics states that as an electric current flows through a
conductor, a magnetic field is generated around the conductor.
This magnetic field then can influence magnetic material in
the field. When the direction of the flow of electric current
is reversed, the magnetic field's polarity also is reversed.
An electric motor uses electromagnetism to exert pushing and
pulling forces on magnets attached to a rotating shaft.
Another effect of electromagnetism is that if a conductor
is passed through a changing magnetic field, an electrical
current is generated. As the polarity of the magnetic field
changes, so does the direction of the electric current flow.
For example, a type of electrical generator used in
automobiles, called an alternator, operates by rotating
electromagnets past coils of wire conductors in which large
amounts of electrical current can be induced. The two-way
operation of electromagnetism makes it possible to record data
on a disk and read that data back later.
The read/write heads in your disk drives (both floppy and
hard disks) are U-shaped pieces of conductive material. This
U-shaped object is wrapped with coils of wire, through which
an electric current can flow. When the disk drive logic passes
a current through these coils, it generates a magnetic field
in the drive head. When the polarity of the electric current
is reversed, the polarity of the field that is generated also
changes. In essence, the heads are electromagnets whose
voltage can be switched in polarity very quickly.
When a magnetic field is generated in the head, the field
jumps the gap at the end of the U-shaped head. Because a
magnetic field passes through a conductor much more easily
than through the air, the field bends outward through the
medium and actually uses the disk media directly below it as
the path of least resistance to the other side of the gap. As
the field passes through the media directly under the gap, it
polarizes the magnetic particles through which it passes so
that they are aligned with the field. The field's
polarity--and, therefore, the polarity of the magnetic
media--is based on the direction of the flow of electric
current through the coils.
The disk consists of some form of substrate material (such
as Mylar for floppy disks or aluminum or glass for hard disks)
on which a layer of magnetizable material has been deposited.
This material usually is a form of iron oxide with various
other elements added. The polarities of the magnetic fields of
the individual magnetic particles on an erased disk normally
are in a state of random disarray. Because the fields of the
individual particles point in random directions, each tiny
magnetic field is canceled by one that points in the opposite
direction, for a total effect of no observable or cumulative
field polarity.
Particles in the area below the head gap are aligned in the
same direction as the field emanating from the gap. When the
individual magnetic domains are in alignment, they no longer
cancel one another, and an observable magnetic field exists in
that region of the disk. This local field is generated by the
many magnetic particles that now are operating as a team to
produce a detectable cumulative field with a unified
direction.
The term flux describes a magnetic field that has a
specific direction. As the disk surface rotates below the
drive head, the head can lay a magnetic flux over a region of
the disk. When the electric current flowing through the coils
in the head is reversed, so is the magnetic-field polarity in
the head gap. This reversal also causes the polarity of the
flux being placed on the disk to reverse.
The flux reversal or flux transition is a
change in polarity of the alignment of magnetic particles on
the disk surface. A drive head places flux reversals on a disk
to record data. For each data bit (or bits) written, a pattern
of flux reversals is placed on the disk in specific areas
known as bit or transition cells. A bit cell or
transition cell is a specific area of the disk
controlled by the time and rotational speed in which flux
reversals are placed by a drive head. The particular pattern
of flux reversals within the transition cells used to store a
given data bit or bits is called the encoding method.
The drive logic or controller takes the data to be stored and
encodes it as a series of flux reversals over a period of
time, according to the encoding method used.
Modified Frequency Modulation (MFM) and Run
Length Limited (RLL) are popular encoding methods. All
floppy disk drives use the MFM scheme. Hard disks use MFM or
several variations of RLL encoding. These encoding methods are
described in more detail in the sections "MFM Encoding" and
"RLL Encoding" later in this chapter.
During the write process, voltage is applied to the head,
and as the polarity of this voltage changes, the polarity of
the magnetic field being recorded also changes. The flux
transitions are written precisely at the points where the
recording polarity changes. Strange as it may seem, during the
read process, a head does not output exactly the same signal
that was written; instead, the head generates a voltage pulse
or spike only when it crosses a flux transition. When the
transition changes from positive to negative, the pulse that
the head would detect is negative voltage. When the transition
changes from negative to positive, the pulse would be a
positive voltage spike.
In essence, while reading the disk the head becomes a flux
transition detector, emitting voltage pulses whenever it
crosses a transition. Areas of no transition generate no
pulse. Figure 14.1 shows the relationship between the read and
write waveforms and the flux transitions recorded on a
disk.
FIG. 14.1  Magnetic write and read processes.
You can think of the write pattern as being a square
waveform that is at a positive or negative voltage level and
that continuously polarizes the disk media in one direction or
another. Where the waveform transitions go from positive to
negative voltage, or vice versa, the magnetic flux on the disk
also changes polarity. During a read, the head senses the flux
transitions and outputs a pulsed waveform. In other words, the
signal is zero volts unless a positive or negative transition
is being detected, in which case there is a positive or
negative pulse. Pulses appear only when the head is passing
over flux transitions on the disk media. By knowing the clock
timing used, the drive or controller circuitry can determine
whether a pulse (and therefore a flux transition) falls within
a given transition cell.
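This relationship between write waveform, recorded flux, and
read pulses can be modeled in a few lines of code. The sketch
below is a deliberately simplified model (one polarity level per
transition cell, with an arbitrary example pattern), not actual
drive or controller logic:

    # Simplified model: the write waveform is a polarity level (+1 or -1) per
    # transition cell; a flux transition is recorded wherever the level changes,
    # and the read head emits a pulse only at a transition.
    write_levels = [+1, +1, -1, -1, -1, +1, -1, -1]   # arbitrary example pattern

    flux = ["T" if a != b else "N"
            for a, b in zip(write_levels, write_levels[1:])]
    read_pulses = [b - a for a, b in zip(write_levels, write_levels[1:])]

    print("Flux transitions:", "".join(flux))    # NTNNTTN
    print("Read pulses:", read_pulses)           # [0, -2, 0, 0, 2, -2, 0]

Wherever the write polarity holds steady, the read output is
zero; only the changes survive the round trip, which is exactly
why the data must be encoded as a pattern of transitions rather
than absolute levels.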
The electrical pulse currents generated in the head while
it is passing over a disk in read mode are very weak and can
contain significant noise. Sensitive electronics in the drive
and controller assembly then can amplify the signal above the
noise level and decode the train of weak pulse currents back
into data that is (theoretically) identical to the data
originally recorded.
So as you now can see, disks are both recorded and read by
means of basic electromagnetic principles. Data is recorded on
a disk by passing electrical currents through an electromagnet
(the drive head) that generates a magnetic field stored on the
disk. Data on a disk is read by passing the head back over the
surface of the disk; as the head encounters changes in the
stored magnetic field, it generates a weak electrical current
that indicates the presence or absence of flux transitions in
the originally recorded signal.
Data Encoding Schemes
Magnetic media essentially is an analog storage medium. The
data that we store on it, however, is digital
information--that is, ones and zeros. When digital information
is applied to a magnetic recording head, the head creates
magnetic domains on the disk media with specific polarities.
When a positive current is applied to the write head, the
magnetic domains are polarized in one direction; when negative
voltage is applied, the magnetic domains are polarized in the
opposite direction. When the digital waveform that is recorded
switches from a positive to a negative voltage, the polarity
of the magnetic domains is reversed.
During a readback, the head actually generates no voltage
signal when it encounters a group of magnetic domains with the
same polarity, but it generates a voltage pulse every time it
detects a switch in polarity. Each flux reversal
generates a voltage pulse in the read head; it is these pulses
that the drive detects when reading data. A read head does not
generate the same waveform that was written; instead, it
generates a series of pulses, each pulse appearing where a
magnetic flux transition has occurred.
To optimize the placement of pulses during magnetic
storage, the raw digital input data is passed through a device
called an encoder/decoder (endec), which converts the
raw binary information to a waveform that is more concerned
with the optimum placement of the flux transitions (pulses).
During a read operation, the endec reverses the process and
decodes the pulse train back into the original binary data.
Over the years, several different schemes for encoding data in
this manner have been developed; some are better or more
efficient than others.
In any consideration of binary information, the use of
timing is important. When interpreting a read or write
waveform, the timing of each voltage transition event is
critical. If the timing is off, a given voltage transition may
be recognized at the wrong time, and bits may be missed,
added, or simply misinterpreted. To ensure that the timing is
precise, the transmitting and receiving devices must be in
sync. This synchronization can be accomplished by adding a
separate line for timing, called a clock signal,
between the two devices. The clock and data signals also can
be combined and then transmitted on a single line. This
combination of clock and data is used in most magnetic data
encoding schemes.
When the clock information is added in with the data,
timing accuracy in interpreting the individual bit cells is
ensured between any two devices. Clock timing is used to
determine the start and end of each bit cell. Each bit cell is
bounded by two clock cells where the clock transitions can be
sent. First there is a clock transition cell, and then the
data transition cell, and finally the clock transition cell
for the data that follows. By sending clock information along
with the data, the clocks remain in sync, even if a long string
of 0 bits is transmitted. Unfortunately, all the
transition cells that are used solely for clocking take up
space on the media that otherwise could be used for data.
Because the number of flux transitions that can be recorded
on a particular medium is limited by the disk media and head
technology, disk drive engineers have been trying various ways
of encoding the data into a minimum number of flux reversals,
taking into consideration the fact that some flux reversals,
used solely for clocking, are required. This method permits
maximum use of a given drive hardware technology.
Although various encoding schemes have been tried, only a
few are popular today. Over the years, these three basic types
have been the most popular:
- Frequency Modulation (FM)
- Modified Frequency Modulation (MFM)
- Run Length Limited (RLL)
The following sections examine these codes, discuss how they
work and where they have been used, and note any advantages or
disadvantages that apply to them.
FM Encoding
One of the earliest techniques for encoding data for
magnetic storage is called Frequency Modulation (FM)
encoding. This encoding scheme, sometimes called Single
Density encoding, was used in the earliest floppy
disk drives that were installed in PC systems. The original
Osborne portable computer, for example, used these Single
Density floppy drives, which stored about 80K of data on a
single disk. Although it was popular until the late 1970s, FM
encoding no longer is used today.
MFM Encoding
Modified Frequency Modulation (MFM) encoding was
devised to reduce the number of flux reversals used in the
original FM encoding scheme and, therefore, to pack more data
onto the disk. In MFM encoding, the use of the clock
transition cells is minimized, leaving more room for the data.
Clock transitions are recorded only if a stored 0 bit is
preceded by another 0 bit; in all other cases, a clock
transition is not required. Because the use of the clock
transitions has been minimized, the actual clock frequency can
be doubled from FM encoding, resulting in twice as many data
bits being stored in the same number of flux transitions as in
FM.
Because it is twice as efficient as FM encoding, MFM
encoding also has been called Double Density
recording. MFM is used in virtually all PC floppy
drives today and was used in nearly all PC hard disks for a
number of years. Today, most hard disks use RLL (Run Length
Limited) encoding, which provides even greater efficiency than
MFM.
Because MFM encoding places twice as many data bits in the
same number of flux reversals as FM, the clock speed of the
data is doubled, so that the drive actually sees the same
number of total flux reversals as with FM. This means that
data is read and written at twice the speed in MFM encoding,
even though the drive sees the flux reversals arriving at the
same frequency as in FM. This method allows existing drive
technology to store twice the data and deliver it twice as
fast.
The only caveat is that MFM encoding requires improved disk
controller and drive circuitry, because the timing of the flux
reversals must be more precise than in FM. As it turned out,
these improvements were not difficult to achieve, and MFM
encoding became the most popular encoding scheme for many
years.
Table 14.3 shows the data bit to flux reversal translation
in MFM encoding.
Table 14.3 MFM Data to Flux Transition Encoding

Data Bit Value  | Flux Encoding
1               | NT
0 preceded by 0 | TN
0 preceded by 1 | NN

T = Flux transition; N = No flux transition
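As a rough illustration of how a controller's endec might apply
Table 14.3, here is a minimal sketch (not any vendor's actual
implementation) that converts a string of data bits into MFM
transition cells:

    # Minimal MFM encoder based on Table 14.3: each data bit becomes a
    # clock cell plus a data cell (two transition cells).
    def mfm_encode(bits, previous_bit=0):
        cells = []
        for bit in bits:
            if bit == 1:
                cells.append("NT")          # data transition only
            elif previous_bit == 0:
                cells.append("TN")          # clock transition only
            else:
                cells.append("NN")          # no transition at all
            previous_bit = bit
        return "".join(cells)

    # ASCII "X" = 01011000b, the byte used in Figure 14.2.
    print(mfm_encode([0, 1, 0, 1, 1, 0, 0, 0]))   # TN NT NN NT NT NN TN TN

Counting the cells in the output shows the RLL 1,3 property
discussed later: no two flux transitions are ever adjacent, and
no more than three empty cells ever separate them.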
RLL Encoding
Today's most popular encoding scheme for hard disks, called
RLL (Run Length Limited), packs up to 50 percent more
information on a given disk than even MFM does and three times
as much information as FM. In RLL encoding, groups of bits are
taken as a unit and combined to generate specific patterns of
flux reversals. By combining the clock and data in these
patterns, the clock rate can be further increased while
maintaining the same basic distance between the flux
transitions on the disk.
IBM invented RLL encoding and first used the method in many
of its mainframe disk drives. During the late 1980s, the PC
hard disk industry began using RLL encoding schemes to
increase the storage capabilities of PC hard disks. Today,
virtually every drive on the market uses some form of RLL
encoding.
Instead of encoding a single bit, RLL normally encodes a
group of data bits at a time. The term Run Length
Limited is derived from the two primary specifications of
these codes, which are the minimum number (the run length) and
the maximum number (the run limit) of transition cells allowed
between two actual flux transitions. Several schemes can be
achieved by changing the length and limit parameters, but only
two have achieved any real popularity: RLL 2,7 and RLL
1,7.
Even FM and MFM encoding can be expressed as a form of RLL.
FM can be called RLL 0,1, because there can be as few as zero
and as many as one transition cell separating two flux
transitions. MFM can be called RLL 1,3, because as few as one
and as many as three transition cells can separate two flux
transitions. Although these codes can be expressed in RLL
form, it is not common to do so.
RLL 2,7 initially was the most popular RLL variation
because it offers a high-density ratio with a transition
detection window that is the same relative size as that in
MFM. This method allows for high storage density with fairly
good reliability. In very high-capacity drives, however, RLL
2,7 did not prove to be reliable enough. Most of today's
highest-capacity drives use RLL 1,7 encoding, which offers a
density ratio 1.27 times that of MFM and a larger transition
detection window relative to MFM. Because of the larger
relative window size within which a transition can be
detected, RLL 1,7 is a more forgiving and more reliable code,
and forgiveness and reliability are required when media and
head technology are being pushed to their limits.
Another little-used RLL variation called RLL
3,9--sometimes called ARLL (Advanced RLL)--allowed
an even higher density ratio than RLL 2,7. Unfortunately,
reliability suffered too greatly under the RLL 3,9 scheme; the
method was used by only a few controller companies that have
all but disappeared.
It is difficult to understand how RLL codes work without
looking at an example. Because RLL 2,7 was the most popular
form of RLL encoding used with older controllers, I will use
it as an example. Even within a given RLL variation such as
RLL 2,7 or 1,7, many different flux transition encoding tables
can be constructed to show what groups of bits are encoded as
what sets of flux transitions. For RLL 2,7 specifically,
thousands of different translation tables could be constructed,
but for my examples, I will use the endec table developed by IBM
because it is the most popular variation.
According to the IBM conversion tables, specific groups of
data bits two, three, and four bits long are translated into
strings of flux transitions four, six, and eight transition
cells long, respectively. The selected transitions coded for a
particular bit sequence are designed to ensure that flux
transitions do not occur too close together or too far
apart.
It is necessary to limit how close two flux transitions can
be because of the basically fixed resolution capabilities of
the head and disk media. Limiting how far apart these
transitions can be ensures that the clocks in the devices
remain in sync.
Table 14.4 shows the IBM-developed encoding scheme for 2,7
RLL.
Table 14.4 RLL 2,7 (IBM Endec) Data to Flux Transition Encoding

Data Bit Values | Flux Encoding
10              | NTNN
11              | TNNN
000             | NNNTNN
010             | TNNTNN
011             | NNTNNN
0010            | NNTNNTNN
0011            | NNNNTNNN

T = Flux transition; N = No flux transition
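The following sketch shows one way the grouping in Table 14.4
could be applied in code. It is an illustration of the table
only, not IBM's actual endec logic; because the bit groups form
a prefix-free set, the encoder can simply try the two-, three-,
and four-bit groups in turn:

    # Minimal RLL 2,7 encoder built on the IBM endec table (Table 14.4).
    RLL_2_7 = {
        "10": "NTNN", "11": "TNNN",
        "000": "NNNTNN", "010": "TNNTNN", "011": "NNTNNN",
        "0010": "NNTNNTNN", "0011": "NNNNTNNN",
    }

    def rll27_encode(bits):
        cells, i = [], 0
        while i < len(bits):
            for length in (2, 3, 4):                 # the groups are prefix-free
                group = bits[i:i + length]
                if group in RLL_2_7:
                    cells.append(RLL_2_7[group])
                    i += length
                    break
            else:
                raise ValueError("trailing bits need padding from the next byte")
        return "".join(cells)

    # ASCII "X" = 01011000b, the byte used in Figure 14.2.
    print(rll27_encode("01011000"))   # TNNTNN TNNN NNNTNN

Note that the encoder raises an error when leftover bits do not
complete a group; as explained next, a real controller avoids
this by encoding whole sectors and borrowing or padding bits at
the boundaries.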
In studying this table, you may think that encoding a byte
such as 00000001b would be impossible because no combinations
of data bit groups fit this byte. Encoding this type of byte
is not a problem, however, because the controller does not
transmit individual bytes; instead, the controller sends whole
sectors, making it possible to encode such a byte simply by
including some of the bits in the following byte. The only
real problem occurs in the last byte of a sector if additional
bits are needed to complete the final group sequence. In these
cases, the endec in the controller simply adds excess bits to
the end of the last byte. These excess bits are truncated
during any reads so that the last byte always is decoded
correctly.
Encoding Scheme Comparisons
Figure 14.2 shows an example of the waveform written to
store an X ASCII character on a hard disk drive under three
different encoding schemes.
FIG. 14.2  ASCII character "X" write waveforms using FM, MFM, and RLL 2,7 encoding.
In each of these encoding-scheme examples, the top line
shows the individual data bits (01011000b) in their bit cells
separated in time by the clock signal, which is shown as a
period (.). Below that line is the actual write waveform,
showing the positive and negative voltages as well as voltage
transitions that result in the recording of flux transitions.
The bottom line shows the transition cells, with T
representing a transition cell that contains a flux transition
and N representing a transition cell that is empty.
The FM encoding example is easy to explain. Each bit cell
has two transition cells: one for the clock information and
one for the data itself. All the clock transition cells
contain flux transitions, and the data transition cells
contain a flux transition only if the data is a 1 bit. No
transition at all is used to represent a 0 bit. Starting from
the left, the first data bit is 0, which decodes as a flux
transition pattern of TN. The next bit is a 1, which decodes
as TT. The next bit is 0, which decodes as TN, and so on. You
easily can trace the FM encoding pattern to the end of the byte
in this fashion.
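The FM rule is simple enough to express directly in code. The
sketch below is an illustration only: it emits a clock
transition for every bit cell and a data transition only for 1
bits:

    # Minimal FM (Single Density) encoder: every bit cell is a clock cell
    # (always T) followed by a data cell (T for a 1 bit, N for a 0 bit).
    def fm_encode(bits):
        return "".join("TT" if bit == 1 else "TN" for bit in bits)

    # ASCII "X" = 01011000b, the byte used in Figure 14.2.
    print(fm_encode([0, 1, 0, 1, 1, 0, 0, 0]))   # TN TT TN TT TT TN TN TN

Half of the recorded transitions carry no data at all, which is
precisely the inefficiency that MFM was designed to eliminate.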
The MFM encoding scheme also has clock and data transition
cells for each data bit to be recorded. As you can see,
however, the clock transition cells carry a flux transition
only when a 0 bit is stored after another 0 bit. Starting from
the left, the first bit is a 0, and the preceding bit is
unknown (assume 0), so the flux transition pattern is TN for
that bit. The next bit is a 1, which always decodes to a
transition-cell pattern of NT. The next bit is 0, which was
preceded by 1, so the pattern stored is NN. Using Table 14.3,
you can easily trace the MFM encoding pattern to the end of
the byte. You can see that the minimum and maximum number of
transition cells between any two flux transitions is one and
three, respectively; hence, MFM encoding also can be called
RLL 1,3.
The RLL 2,7 pattern is more difficult to see because it
relies on encoding groups of bits rather than encoding each
bit individually. Starting from the left, the first group that
matches the groups listed in Table 14.4 is the first three
bits, 010. These bits are translated into a flux transition
pattern of TNNTNN. The next two bits, 11, are translated as a
group to TNNN; and the final group, 000, is translated to
NNNTNN to complete the byte. As you can see in this example,
no additional bits were needed to finish the last group.
Notice that the minimum and maximum number of empty
transition cells between any two flux transitions in this
example are two and six, although a different example could
show a maximum of seven empty transition cells. This is where
the RLL 2,7 designation comes from. Because even fewer
transitions are recorded than in MFM, the clock rate can be
further increased to three times that of FM or 1.5 times that
of MFM, allowing more data to be stored in the same space on
the disk. Notice, however, that the resulting write waveform
itself looks exactly like a typical FM or MFM waveform in
terms of the number and separation of the flux transitions for
a given physical portion of the disk. In other words, the
physical minimum and maximum distances between any two flux
transitions remain the same in all three of these
encoding-scheme examples.
Another new feature in high-end drives involves the disk
read circuitry. Read channel circuits using Partial-Response,
Maximum-Likelihood (PRML) technology allow disk drive
manufacturers to increase the amount of data that can be
stored on a disk platter by up to 40 percent. PRML replaces
the standard "detect one peak at a time" approach of
traditional analog peak-detect read/write channels with
digital signal processing. In digital signal processing, noise
can be digitally filtered out, allowing flux change pulses to
be placed closer together on the platter, achieving greater
densities.
I hope that the examinations of these different encoding
schemes and how they work have taken some of the mystery out
of the way data is recorded on a drive. You can see that
although schemes such as MFM and RLL can store more data on a
drive, the actual density of the flux transitions remains the
same as far as the drive is concerned.
Sectors
A disk track is too large to manage effectively as a single
storage unit. Many disk tracks can store 50,000 or more bytes
of data, which would be very inefficient for storing small
files. For that reason, a disk track is divided into several
numbered divisions known as sectors. These sectors
represent slices of the track.
Different types of disk drives and disks split tracks into
different numbers of sectors, depending on the density of the
tracks. For example, floppy disk formats use 8 to 36 sectors
per track, whereas hard disks usually store data at a higher
density and can use 17 to 100 or more sectors per track.
Sectors created by standard formatting procedures on PC
systems have a capacity of 512 bytes, but this capacity may
change in the future.
Sectors are numbered on a track starting with 1, unlike the
heads and cylinders, which are numbered starting with 0. For
example, a 1.44M floppy disk contains 80 cylinders numbered
from 0 to 79 and two heads numbered 0 and 1, and each track on
each cylinder has 18 sectors numbered from 1 to 18.
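Multiplying out that geometry yields the familiar capacity
figure, as this quick sketch shows:

    # Capacity of a 1.44M floppy disk computed from its geometry.
    cylinders, heads, sectors_per_track, bytes_per_sector = 80, 2, 18, 512
    total_bytes = cylinders * heads * sectors_per_track * bytes_per_sector
    print(total_bytes)                     # 1,474,560 bytes
    print(total_bytes / 1_048_576)         # about 1.41M in strictly binary terms
    print(total_bytes / (1_000 * 1_024))   # exactly 1.44 "M" as commonly quoted

The familiar "1.44M" figure actually mixes a decimal thousand
with a binary kilobyte, another example of the unit confusion
described earlier in this chapter.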
When a disk is formatted, additional ID areas are created
on the disk for the disk controller to use for sector
numbering and identifying the start and end of each sector.
These areas precede and follow each sector's data area, which
accounts for the difference between a disk's unformatted and
formatted capacities. These sector headers, inter-sector gaps,
and so on are independent of the operating system, file
system, or files stored on the drive. For example, a 4M floppy
disk (3 1/2-inch) has a capacity of 2.88M when it is
formatted, a 2M floppy has a formatted capacity of 1.44M, and
an older 38M hard disk has a capacity of only 32M when it is
formatted. Modern IDE and SCSI hard drives are preformatted,
so the manufacturers now only advertise formatted capacity.
Even so, nearly all drives use some reserved space for
managing the data that can be stored on the drive.
Although I have stated that each disk sector is 512 bytes
in size, this statement technically is false. Each sector does
allow for the storage of 512 bytes of data, but the data area
is only a portion of the sector. Each sector on a disk
typically occupies 571 bytes of the disk, of which only 512
bytes are usable for user data. The actual number of bytes
required for the sector header and trailer can vary from drive
to drive, but this figure is typical. A few modern drives now
use ID-less recording, which virtually eliminates the storage
overhead of the sector header information. With ID-less
recording, virtually all of the space on the track is occupied
by data.
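Using the typical figures quoted above, a short sketch makes the
overhead concrete:

    # Typical sector overhead: 512 data bytes out of roughly 571 bytes
    # actually occupied on the disk (the exact figure varies by drive).
    data_bytes, sector_bytes = 512, 571
    overhead_bytes = sector_bytes - data_bytes    # 59 bytes of header, trailer, and gaps
    print(f"{overhead_bytes} bytes of overhead per sector")
    print(f"{data_bytes / sector_bytes:.1%} of each sector holds user data")  # about 89.7%

In other words, roughly 10 percent of such a drive's raw
recording area is spent on sector bookkeeping, which is the
overhead that ID-less recording seeks to reclaim.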
You may find it helpful to think of each sector as being a
page in a book. In a book, each page contains text, but the
entire page is not filled with text; rather, each page has
top, bottom, left, and right margins. Information such as
chapter titles (track and cylinder numbers) and page numbers
(sector numbers) is placed in the margins. The "margin" areas
of a sector are created and written to during the
disk-formatting process. Formatting also fills the data area
of each sector with dummy values. After the disk is formatted,
the data area can be altered by normal writing to the disk.
The sector header and trailer information cannot be altered
during normal write operations unless you reformat the
disk.
Each sector on a disk has a prefix portion, or
header, that identifies the start of the sector and a sector
number, as well as a suffix portion, or trailer, that
contains a checksum (which helps ensure the integrity
of the data contents). Each sector also contains 512 bytes of
data. The data bytes normally are set to some specific value,
such as F6h (hex), when the disk is physically (or low-level)
formatted. (The following section explains low-level
formatting.)
In many cases, a specific pattern of bytes considered difficult
to write is used instead, to flush out any marginal sectors. In
addition to the gaps within the
sectors, gaps exist between sectors on each track and also
between tracks; none of these gaps contain usable data space.
The prefix, suffix, and gaps account for the lost space
between the unformatted capacity of a disk and the formatted
capacity.
Table 14.5 shows the format for each track and sector on a
typical hard disk with 17 sectors per track.
Table 14.5 Typical Sector Format for a Hard Disk with 17 Sectors per Track

Bytes | Name               | Description
16    | POST INDEX GAP     | All 4Eh, at the track beginning after the Index mark.

The following sector data is repeated 17 times for an MFM-encoded track:

13    | ID VFO LOCK        | All 00h; synchronizes the VFO for the sector ID.
1     | SYNC BYTE          | A1h; notifies the controller that data follows.
1     | ADDRESS MARK       | FEh; defines that ID field data follows.
2     | CYLINDER NUMBER    | A value that defines the actuator position.
1     | HEAD NUMBER        | A value that defines the head selected.
1     | SECTOR NUMBER      | A value that defines the sector.
2     | CRC                | Cyclic Redundancy Check to verify the ID data.
3     | WRITE TURN-ON GAP  | 00h written by format to isolate the ID from DATA.
13    | DATA SYNC VFO LOCK | All 00h; synchronizes the VFO for the DATA.
1     | SYNC BYTE          | A1h; notifies the controller that data follows.
1     | ADDRESS MARK       | F8h; defines that user DATA field follows.
512   | DATA               | The area for user DATA.
2     | CRC                | Cyclic Redundancy Check to verify the DATA.