As we've seen, authentication systems require some sort of credential. Although we usually associate credentials with some sort of document, that need not be the case. More broadly, credentials can be created using:
Something you know
Something you have
Something you are
Some combination of the three
These are known as authentication factors. In general, the more authentication factors that are present in an authentication system, the more secure it is. You'll hear the term "two-factor authentication," for example, meaning that the system incorporates two of these authentication factors. The remainder of this section will discuss some common authentication schemes and their authentication factors.
You may not have thought of cookies as an identity credential, but they are in fact the most prevalent form of identity credential on the Internet. The Hacker's Dictionary defines a cookie as a handle, transaction ID, or other token of agreement between cooperating programs. The claim check you get from a dry-cleaning shop is a perfect example of a cookie; the only thing it's useful for is making sure that you get your clothes back by relating two transactions that happen at different times.
On the Internet, cookies are exchanged between the browsers people use to access the Web and the servers the people visit. These cookies serve the same purpose as the claim check in the dry cleaning example: they tie transactions together that are otherwise difficult to connect.
To see how this works, suppose I use my browser to visit a web site I've never been to before; let's call it http://www.foo.com. The people who maintain http://www.foo.com configured the server to give a cookie to everyone who visits. So, on my first visit to the site, the server passes a cookie back to my browser. My browser dutifully stores the cookie in a file on my computer used especially for this purpose. The magic of cookies happens the next time I request a page from http://www.foo.com (which could happen in the same browsing session or days or even weeks later). When I tell my browser to retrieve a page from http://www.foo.com, my browser sends back the cookie it received on the last visit.
So, let's review what has happened:
The server asked my browser to store some information for it, knowing that it would be sent back verbatim when the browser made its next request.
The server chose the information it gave to my browser.
My browser didn't add any information to the cookie, just passed it back.
My browser stored the cookie in a special file that it chose, not one the server chose.
The cookie does not contain any information about me.
The cookie is not a program: it can't be executed and has no ability to access other information on my computer.
The next time I retrieve a page from that web site, the information is sent back.
That's really the long and short of cookies. There are some more details, but all cookies obey these rules.
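The exchange reviewed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cookie name `visitor`, the value `a1b2c3`, and the in-memory `jar` dictionary standing in for the browser's cookie file are all made up for the example, and a real browser applies many more rules (expiry, paths, security flags) than are shown here.

```python
from http.cookies import SimpleCookie

def server_first_response(visitor_id: str) -> str:
    """Server side: build the cookie text sent on a first visit."""
    cookie = SimpleCookie()
    cookie["visitor"] = visitor_id  # the server alone chooses the content
    return cookie["visitor"].OutputString()

def browser_store(jar: dict, site: str, cookie_text: str) -> None:
    """Browser side: keep the cookie, filed under the site that sent it."""
    parsed = SimpleCookie()
    parsed.load(cookie_text)
    jar[site] = {name: morsel.value for name, morsel in parsed.items()}

def browser_cookie_header(jar: dict, site: str) -> str:
    """Browser side: return the stored cookie verbatim on the next request."""
    return "; ".join(f"{name}={value}" for name, value in jar.get(site, {}).items())

jar = {}  # hypothetical stand-in for the browser's cookie file
browser_store(jar, "www.foo.com", server_first_response("a1b2c3"))
sent_back = browser_cookie_header(jar, "www.foo.com")
```

Here `sent_back` holds exactly what the server handed out, and a request to any other site carries nothing, because the jar is keyed by site.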
Cookies were added to servers and browsers to identify the same user over multiple sessions. Because HTTP is a stateless protocol, every request looks like a different session. This ability to link together time-separated transactions is what allows a web site to run a shopping cart, remember who you are, fill in forms for you, or remember your password between visits. Just as you wouldn't return to a dry cleaner who couldn't match up their customers with their clothes, web sites without the ability to link transactions across time are pretty boring.
Cookies aren't typically thought of as an authentication device, but they are a credential that is used to lay claim to a particular identity. Cookies have a single authentication factor: "something I have." Cookies are a weak form of authentication, because there's no way to ensure that the same entity is using the cookie from visit to visit. The browser might be on a shared computer. I can even move cookies from one browser to another, if I'm clever, and let multiple people use the same cookie. Even with these problems, cookie-based identities are useful in a variety of circumstances. Chances are, you've got hundreds of cookie-based identities that you're largely unaware of.
If you're like me, you have dozens of ID/password pairs on various computer systems around the Internet. By typing in an ID, we lay claim to an identity, and the password is used to authenticate that we are allowed to do so. The system uses the identity represented by the ID to associate attributes with the holder of the ID.
Strictly speaking, ID and password systems are a two-factor authentication system with the ID representing something I have and the password being something I know. The problem, of course, is that an ID is usually public and is easily duplicated. As a result, most ID and password systems are nearly as weak as a one-factor system.
The biggest advantage of ID and password systems is their simplicity and familiarity. The largest drawback is their reliance on passwords. In theory, because passwords are secret (something you know), they are secure, and only the entity with the secret can reveal it to the authentication system. In practice, passwords suffer from several serious limitations:
People can remember only a limited number (around eight) of items with perfect accuracy. Furthermore, they usually have multiple passwords that they are trying to remember. As a result, people create passwords that are short and easy to remember. They also tend to use the same password for multiple credentials.
Easy to remember passwords can be easily guessed by an attacker. Even passwords that have no relation to the entity that holds them can be effectively guessed if they are what are known as "dictionary words." The best passwords would be long, random strings of characters, but people can't remember long, random strings.
People (and even machines) can be tricked into revealing the secret password to an attacker. This could be done, for example, by creating fake login screens. Another common technique is known as "social engineering" where the attacker contacts the person and tricks him into revealing his password by posing as an administrator or someone else the person trusts.
People write passwords down. Passwords get stored in files on computers. This makes them vulnerable to theft and misuse.
These problems don't have easy solutions. For example, many IT departments institute a password aging policy that forces users to change their passwords on a periodic basis to mitigate loss or sharing. They also frequently enforce rules about password structure in an effort to make passwords less guessable. For example, the rules may disallow dictionary words, require passwords longer than six characters, or require passwords to contain a mixture of letters, numbers, and punctuation. Often, the result of these kinds of policies is that users give up trying to remember their passwords and simply write them down and paste them to their monitors or stick them in the pencil drawer.
When it's poorly implemented, password aging can be a detriment in other ways. Some password aging implementations, for example, allow a user to immediately change her password back to the previous password, defeating the purpose. Also, many password aging systems surprise the user with a requirement to select a new password when she logs in. Faced with the goal of getting something done and having to select a new password quickly, many users select easily guessable passwords.
Password management systems can be implemented so as to conduct extensive checking on user-selected passwords to weed out poor choices. These systems can be customized on a per-project or per-group basis to eliminate passwords that might be acceptable in a general population, but bear additional significance to the group members or project team.
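As a rough sketch of such checking, the function below applies the kinds of structural rules mentioned earlier (length, dictionary words, a mix of character types) plus an optional list of group-specific terms. The word list, the rule thresholds, and the function name are assumptions made for illustration; a production system would use a far larger dictionary and more careful matching.

```python
import string

# Tiny stand-in for a real dictionary of easily guessed words (assumption).
COMMON_WORDS = {"password", "letmein", "dragon", "monkey"}

def password_problems(password, group_words=frozenset()):
    """Return a list of reasons to reject the password (empty if acceptable)."""
    problems = []
    lowered = password.lower()
    if len(password) <= 6:
        problems.append("too short")
    if lowered in COMMON_WORDS:
        problems.append("dictionary word")
    # Terms that mean something to this particular project or group.
    if any(word.lower() in lowered for word in group_words):
        problems.append("contains a group-specific term")
    has_all_classes = (
        any(c.isalpha() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )
    if not has_all_classes:
        problems.append("needs letters, digits, and punctuation")
    return problems
```

A password such as `Apollo#2024x` passes the general rules, but it would be rejected for a team whose `group_words` include the hypothetical project name `apollo`.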
Managing passwords is a significant challenge in a large organization. Issues surrounding passwords can make up 25% of all calls to an IT help desk. With help desk calls costing as much as $35 per call, this can be an expensive proposition. To mitigate password management costs, many organizations have implemented self-service password reset systems. These systems allow a user to reset his password to something new whenever he wants, without knowing the old password.
Password reset services can work in various ways. One of the most common methods is to ask the user a question that she has previously answered and, if she gives the right answer, allow the password reset to proceed. The questions can be canned (e.g., "What is your mother's maiden name?") or free form, allowing the users to enter their own questions and answers when their accounts are originally established. Another common technique is to allow a user to create a new password and have it emailed to a previously specified account. This presumes that the user has access to her email without the password. Hybrid solutions that use both of these methods are also possible.
Challenge-response systems are related to password systems. In a challenge-response system, the server generates a random string of characters, and the user is required to manipulate the string according to some predefined, and secret, algorithm and send it back to the server. The algorithm may include the use of a secret key. Usually, a computing device of some sort performs the manipulation so that the algorithm can be made more complicated and thus more difficult to guess.
To see how this works, consider a simple problem. Suppose Alice and Bob agree to meet at the zoo to exchange a package. Alice and Bob do not know each other by sight and agree to verify their identities by way of a spoken password. We've seen this played out in spy movies. The two parties approach each other and one says something and the other responds with a predetermined response. This is essentially no different from how passwords work.
In a challenge-response system, Alice could give Bob a number, say, 5634. Bob has been told that he's to add 3133 to the number Alice gives him and repeat it back. He responds with 8767. Alice performs the same calculation and gets the same answer and knows that Bob is the party she's to meet. She could distinguish between multiple parties by the algorithm they use. Another party might have been told to reverse the number and add 3, for example.
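The arithmetic in this example is simple enough to write down directly. In the sketch below, the two functions are the secret algorithms from the text; the dictionary of known parties and the `identify` helper are hypothetical names added for illustration.

```python
def bob_algorithm(challenge: int) -> int:
    """Bob's secret rule: add 3133 to the challenge."""
    return challenge + 3133

def other_party_algorithm(challenge: int) -> int:
    """Another party's secret rule: reverse the digits, then add 3."""
    return int(str(challenge)[::-1]) + 3

# Alice knows which algorithm each party was given (hypothetical registry).
KNOWN_PARTIES = {"Bob": bob_algorithm, "other party": other_party_algorithm}

def identify(challenge: int, response: int):
    """Return the party whose algorithm yields this response, or None."""
    for name, algorithm in KNOWN_PARTIES.items():
        if algorithm(challenge) == response:
            return name
    return None

challenge = 5634
print(identify(challenge, bob_algorithm(challenge)))  # Bob answers 8767
```

Because Alice picks a different number each time, someone who overhears one exchange learns a single challenge and response rather than a reusable password. With algorithms this simple, of course, a few overheard exchanges would be enough to work out the rule.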
Challenge-response systems have an advantage over ID/password systems, because the algorithm can be made sufficiently complex so that it is nearly impossible to guess. The disadvantage is that the algorithm must be coded on a special-purpose device or on the user's computer, either of which is subject to loss or theft.
A variation on the challenge-response system gives the user a special-purpose computing device called a token, which calculates a series of pseudorandom numbers. The token and the server that is doing the authentication both run carefully synchronized clocks. The token shows a new pseudorandom number periodically (say, every 60 seconds).
This number can be used in several ways. As a challenge-response system, the server could issue a challenge and the user could use the pseudorandom number to manipulate the challenge phrase in some way. This is more secure than relying on a predetermined algorithm alone, because the correct response changes each time the token shows a new number.
In a second scenario, the user enters an ID, the number from the token, and a password when logging in. The server uses the ID to retrieve the expected number and password and verifies that the values entered match them. In this system, the server doesn't issue a challenge, but both the server and the token use a carefully synchronized, complex algorithm to calculate a shared secret.
The pseudorandom number token creates a two-factor authentication system where the user uses something that he has (the token) and something he knows (the algorithm or the password) to log in. An attacker would have to have both pieces to break into the system.
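The second scenario can be sketched as follows. The text doesn't say how a token derives its numbers, so this example assumes one plausible approach: the token and the server share a secret seed and each computes a keyed hash of the current 60-second time window. The seed, the account layout, and the function names are all hypothetical, and commercial tokens may work quite differently.

```python
import hashlib
import hmac

STEP_SECONDS = 60  # the token shows a new number every 60 seconds

def token_number(seed: bytes, now: float) -> str:
    """Derive the six-digit number for the time window containing `now`."""
    window = int(now // STEP_SECONDS)
    digest = hmac.new(seed, window.to_bytes(8, "big"), hashlib.sha256).digest()
    return f"{int.from_bytes(digest[:4], 'big') % 1_000_000:06d}"

def server_verify(accounts: dict, user_id: str, number: str,
                  password: str, now: float) -> bool:
    """Check both factors: the token's number and the user's password."""
    account = accounts.get(user_id)
    if account is None:
        return False
    expected = token_number(account["seed"], now)
    number_ok = hmac.compare_digest(expected, number)
    # Plain-text password storage keeps the sketch short; it is not a recommendation.
    password_ok = hmac.compare_digest(account["password"], password)
    return number_ok and password_ok

accounts = {"alice": {"seed": b"shared-seed", "password": "correct horse"}}
now = 1_000_000.0  # fixed clock reading so the example is repeatable
shown_on_token = token_number(b"shared-seed", now)
print(server_verify(accounts, "alice", shown_on_token, "correct horse", now))
```

Both sides get the same number only while their clocks fall in the same window, which is why the synchronization matters so much.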
Challenge-response systems can be combined with ID and password systems to avoid transmitting passwords across the network. In this scheme, both the client and the server know the password. The server generates a challenge phrase and sends it to the user. The user creates a message digest for the challenge phrase using the password and sends the result back to the server. The server compares the response with the message digest it calculates for the phrase with the user's password. If they match, the server knows that the client possesses the password without ever having to send it across the network. Care must be taken to ensure that the message digest algorithm and the way it's combined with the password are cryptographically sound or an attacker may be able to deduce the password from the network traffic.
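Here is a minimal sketch of that scheme. It assumes HMAC with SHA-256 as the way of combining the password with the challenge, which is one widely used sound construction but not necessarily the one any particular system uses; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Server: generate a fresh, unpredictable challenge phrase."""
    return secrets.token_bytes(16)

def respond(password: str, challenge: bytes) -> str:
    """Client: compute a keyed digest of the challenge using the password."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).hexdigest()

def server_check(known_password: str, challenge: bytes, response: str) -> bool:
    """Server: compute the same digest and compare in constant time."""
    expected = respond(known_password, challenge)
    return hmac.compare_digest(expected, response)

challenge = make_challenge()
print(server_check("s3cret", challenge, respond("s3cret", challenge)))  # True
print(server_check("s3cret", challenge, respond("guess", challenge)))   # False
```

Only the challenge and the digest cross the network. Because the challenge is new each time, a captured response can't simply be replayed later.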
We discussed digital certificates in detail in Chapter 6. Because certificates aggregate identity information and a public key that has been signed by a third party, digital certificates can easily be turned to the task of authentication.
The simplest way of using a digital certificate for authentication is in a challenge-response system. In this use, the digital-signing algorithm is substituted for the secret algorithm. The server generates a random string and sends it to the user. The subject digitally signs the random string and sends it back to the server. The server can verify that the random string is the same one that it generated and was in fact signed by the owner of the certificate. Once verified, the identifying information in the certificate can be assumed to belong to the subject.
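To make the signing step concrete, here is a deliberately toy sketch using textbook RSA built from two small, well-known primes. Everything about the key is an assumption chosen so the example runs with only the standard library: the key is far too short to be secure, there is no padding, and no certificate parsing or chain validation is shown. A real system would use a vetted cryptographic library and the public key carried in the subject's certificate.

```python
import hashlib
import secrets

# Toy key pair (assumption for illustration only; never use keys this small).
P = 2**61 - 1                       # a known Mersenne prime
Q = 2**89 - 1                       # another known Mersenne prime
N = P * Q                           # public modulus, would appear in the certificate
E = 65537                           # public exponent, would appear in the certificate
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, held only by the subject (Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it to a number smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Subject: sign the challenge with the private key."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Server: check the signature using only the public key."""
    return pow(signature, E, N) == digest_as_int(message)

challenge = secrets.token_bytes(16)   # server generates a random string
signature = sign(challenge)           # subject signs it and sends it back
print(verify(challenge, signature))   # server checks it against its own challenge
```

The server never sees the private key. It only confirms that whoever answered holds the key matching the certificate.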
Besides their cryptographic security advantages, digital certificates have other advantages over other authentication mechanisms. Multiple digital certificates can be associated with a single account so that a user does not need to hold multiple accounts to store different sets of attributes. Because the attributes can be stored in the certificate, the correct attributes are selected based on the certificate that the user presents. Furthermore, as we've seen, the certificate data structure can be extended so that it carries information specific to its use.
Even with their inherent security and other advantages, however, digital certificates have not had much success in authentication systems. Microsoft and others have invested millions of dollars building systems for creating, administering, and using client-side certificates for authentication. The primary problem is that digital certificates are quite a bit more complicated for users to manage. The concepts surrounding digital certificates are considerably more difficult to understand than those of simpler systems like ID and password systems.
A secondary problem is that commercially issued digital certificates cost money. They must be renewed regularly, and users frequently forget their access codes or lose the certificate entirely, leading to costly refreshes. This ongoing expense can deter organizations from adopting digital certificates as their primary authentication mechanism.
The final straw is that, except in certain circumstances, digital certificates don't provide a significant increase in the security of the authentication system. Digital certificates are usually stored on the user's machine and protected by a password. That password is subject to all the disadvantages of passwords that we saw earlier. Impersonators with access to the user's machine may be able to guess or crack the password being used to protect the digital certificate as easily as they can any other password.
All this doesn't mean that digital certificates have no place in an authentication strategy, but you should understand why you're using them and compare the return with the additional complexity and expense. Certificate-based authentication has found the most success in special-purpose applications where security is of particular concern. Even there, certificates are often misused. For example, we've seen that ignoring certificate-revocation lists is a common mistake.
Biometrics is the use of some physiological or behavioral trait to uniquely recognize a person. Physiological traits can include face characteristics, fingerprints, hand geometry, iris characteristics, or retinal maps. Behavioral traits are those that are learned or acquired like handwriting and signature characteristics or voiceprints. Biometric devices present the opportunity to build authentication systems that incorporate something you are as an authentication factor.
Biometric identification has long been the stuff of science fiction, but there is growing interest, primarily among government entities, in using biometrics to strengthen airport security, protect borders, and support other military and homeland security applications. You can even buy a PDA with a built-in fingerprint sensor.
As we've seen, most state-issued driver's licenses contain a biometric identification device called a picture. When the driver's license is presented as a credential, the integrated picture is used by the sophisticated face recognition technology we all carry around in our heads to ensure the credentials are being presented by the person pictured on the license. Similarly, when a biometric device is used to authenticate identity credentials, the user enters an ID or other identifying token and then, rather than entering a password, the user might touch a fingerprint recognition device or speak into a microphone.
Biometric devices have an advantage over other authentication systems, because they are tied to a person uniquely. This property allows biometrics to be used for identification, not just authentication of credentials. To understand the difference, consider the problem of ensuring that a person receives a government entitlement like food stamps only once. Crooks can fraudulently obtain credentials and apply for entitlements under different names. Biometrics can determine whether the person, rather than a set of credentials, is known to the system and whether that person has previously been enrolled. This is sometimes called negative identification, because its goal is to prove that someone isn't on a list.
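The difference between the two uses can be sketched in code. This example assumes, purely for illustration, that a biometric reading has already been reduced to a short list of numbers and that two readings match when they are close enough together. The feature values, the threshold, and the function names are all invented; real biometric matching is considerably more involved.

```python
import math

MATCH_THRESHOLD = 0.5  # hypothetical cutoff for "close enough"

def matches(template, sample) -> bool:
    """Treat two readings as the same person if their features are close."""
    return math.dist(template, sample) <= MATCH_THRESHOLD

def authenticate(enrolled: dict, claimed_id: str, sample) -> bool:
    """Authentication: compare the sample with the one claimed identity."""
    template = enrolled.get(claimed_id)
    return template is not None and matches(template, sample)

def find_existing_enrollment(enrolled: dict, sample):
    """Identification: search everyone enrolled, ignoring any claimed name."""
    for person_id, template in enrolled.items():
        if matches(template, sample):
            return person_id
    return None

enrolled = {"A. Smith": (0.1, 0.9, 0.3), "B. Jones": (0.8, 0.2, 0.7)}
applicant = (0.12, 0.88, 0.31)  # applying under a new name
print(find_existing_enrollment(enrolled, applicant))  # A. Smith
```

Negative identification succeeds when the search comes back empty, meaning the applicant isn't already on the list.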
This advantage also represents the most significant disadvantage of biometrics. While unlikely, biometric characteristics might be duplicated or cheated. Once a biometric identification characteristic is compromised, there is little recourse. You can't easily issue new irises or fingerprints to replace the compromised feature.
The other side of this problem is that there are individuals who don't fit into the standard patterns. For example, hand geometry devices cannot successfully identify the hand geometry of individuals with polydactylism (one or more extra fingers), yet this occurs in 0.2% of the general population. This can pose a difficult choice when system security depends on such a device. To see why, imagine a scenario in which a data center has installed hand geometry scanners at each of its doors and relies on them for perimeter security. If a polydactyl individual is hired, the company would be required to accommodate this disability, at least in the U.S. Most reasonable accommodations would create a special-purpose breach in the security wall or reduce the automation level of the authentication system.
Smart cards are credit card-sized devices that contain an embedded electronic chip and a simple interface for accessing the memory or processor on the chip. Most cards have a small set of contacts on the front, and when the card is inserted into a card reader, those contacts provide power and data interfaces to the electronics on the card.
Smart cards are something you have and can be used in authentication systems in a variety of scenarios. The most obvious scenario is to use the memory on the card to store a digital certificate. The advantage to storing the digital certificate on a smart card is its portability. On a card that contains both memory and a processor, the onboard processor can be used for encryption and decryption tasks and for verifying authorized access to the private key. Using the onboard processor in this way reduces the need to transport the keys off the card, decreasing the chance that they'll be compromised.
Smart cards can also be used to implement a challenge-response system. In this use, the set of algorithms used to manipulate the challenge phrase are encoded in the card. The card can be placed in a reader attached to the user's computer or the card can be inserted into a low-cost special-purpose device that allows the user to enter the challenge phrase and displays the result of the manipulation.
Some smart cards contain a built-in thumb-print reader, creating a portable biometric device. The biometric factor is used to ensure that the owner of the smart card is the one presenting or using it. This kind of system could potentially combine cryptographically strong digital certificates in a package that offers three-factor authentication: something you are, your thumb-print; something you have, the card; and something you know, a password.
Perhaps the most widespread use of smart cards for identity is in the cell phone industry. GSM phones rely on a Subscriber Identity Module (SIM) that provides an individual identity for each mobile user account. Removing the SIM from one phone and placing it in another transfers the user's identity, including her network information, phone number, and, in most cases, address and contact information to the new phone. The SIM is just a smart card that allows the electrical contacts and processor to be punched out of the larger card and inserted into a small receptacle in the phone.
Smart cards have been available for many years, but have, thus far, failed to be adopted for widespread use in identity management. The primary drawback is the additional hardware needed to read the cards. For example, deploying a smart card-based identity management system in a large organization would require outfitting each desktop workstation with a smart card reader. Beyond the capital costs, the readers have to be maintained and managed.