Principles for Identity Data

As you review, consolidate, and create identity data, keep several important principles in mind. Some of these are adapted from The Practical Guide to Enterprise Architecture by James McGovern et al. (Prentice Hall).

Don't replicate identities

Wherever possible, avoid replicating identity data; request it from its canonical source instead. For example, if an application needs an employee's Social Security number (SSN), it is better to retrieve it from the HR employee record with a SAML request or a database query than to keep a local copy.

Business requirements should drive identity replication

If you can't avoid replicating identity information, do so only because business requirements force the replication. Application developer convenience is never a good reason for replication. One of the primary reasons for this principle is that identity data is often subject to internal and external audit, security, and privacy requirements that the business unit must monitor and pay for. Each replica increases the cost of meeting these requirements, so that cost ought to be weighed against the business need for the replica.

Replicated identities should be read-only

Break this principle at your own peril. As soon as multiple systems can change replicated data, the copies are all but guaranteed to fall out of sync.
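One way to make a replica read-only in application code is to hand out views that reject writes. The following is a minimal sketch rather than a prescribed design: the class name IdentityReplica, its refresh and get methods, and the sample record are all hypothetical, and it assumes the canonical source can supply a complete snapshot.

```python
from types import MappingProxyType
from typing import Mapping


class IdentityReplica:
    """A local copy of identity records that are owned by another system."""

    def __init__(self) -> None:
        self._records = {}

    def refresh(self, snapshot: Mapping[str, Mapping[str, str]]) -> None:
        """Replace the whole replica with a snapshot from the canonical source."""
        # Copy each record, then wrap it so callers cannot modify it.
        self._records = {
            identity_id: MappingProxyType(dict(attributes))
            for identity_id, attributes in snapshot.items()
        }

    def get(self, identity_id: str) -> Mapping[str, str]:
        """Return a read-only view of one identity's attributes."""
        return self._records[identity_id]


# Hypothetical sample data, standing in for a snapshot from the HR system.
replica = IdentityReplica()
replica.refresh({"e1001": {"name": "Pat Lee", "department": "Finance"}})
record = replica.get("e1001")

try:
    record["department"] = "Sales"  # any local write is rejected
except TypeError:
    pass  # changes must be made at the canonical source, then refreshed
```

The only way to change the replica in this sketch is to refresh it from the canonical source, so local code cannot introduce a conflicting value.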

Identity data should be locationally transparent

Applications should be constructed so that they don't rely on the identity data being in a particular data source or on a particular machine. This increases flexibility when the application changes in the future. Using SAML, for example, to request properties of a particular identity over the network, rather than reading them with an SQL query over JDBC, makes it easy to replace the database with another system without breaking the application.
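The same idea can be illustrated inside a single program by having application code depend on a small interface instead of a specific store. This sketch is an illustration under assumptions, not a SAML implementation: the IdentitySource interface, the two source classes, the identity_attributes table, and the mailing_label function are all hypothetical names.

```python
import sqlite3
from typing import Protocol


class IdentitySource(Protocol):
    """Anything that can answer "what is attribute X of identity Y?"."""

    def get_attribute(self, identity_id: str, name: str) -> str: ...


class InMemorySource:
    def __init__(self, records: dict) -> None:
        self._records = records

    def get_attribute(self, identity_id: str, name: str) -> str:
        return self._records[identity_id][name]


class SqliteSource:
    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def get_attribute(self, identity_id: str, name: str) -> str:
        row = self._connection.execute(
            "SELECT value FROM identity_attributes"
            " WHERE identity_id = ? AND name = ?",
            (identity_id, name),
        ).fetchone()
        if row is None:
            raise KeyError((identity_id, name))
        return row[0]


def mailing_label(source: IdentitySource, identity_id: str) -> str:
    """Application code: it knows what it needs, not where it is stored."""
    name = source.get_attribute(identity_id, "name")
    department = source.get_attribute(identity_id, "department")
    return f"{name}, {department}"


# Hypothetical sample data loaded into two different stores.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE identity_attributes (identity_id TEXT, name TEXT, value TEXT)"
)
connection.executemany(
    "INSERT INTO identity_attributes VALUES (?, ?, ?)",
    [("e1001", "name", "Pat Lee"), ("e1001", "department", "Finance")],
)
records = {"e1001": {"name": "Pat Lee", "department": "Finance"}}

# The application call is identical; only the source object differs.
label_from_memory = mailing_label(InMemorySource(records), "e1001")
label_from_database = mailing_label(SqliteSource(connection), "e1001")
```

Because mailing_label never mentions a table, a file, or a host name, replacing the store means writing one new source class rather than touching the application.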

Enforce the consistency and integrity of identity data with policies, processes, and tools

The more you can build consistency into the infrastructure, the more likely it is that the data will remain consistent. We'll discuss the role of policy in this principle in the next chapter.

Don't rely on a single validation to check the integrity of identities

Whenever identity data will be collected and used in multiple locations, each system should perform validity checks on the identity data as part of its processing. These checks should ensure that the identities conform to the relevant schema and any business rules documented in the inventory or metadata repository.
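A check of this kind might look like the following sketch, which each system could run on identity records as it receives them. The field names, the employee ID format, and the deliberately loose email pattern are assumptions made for illustration; real rules would come from your own schema and metadata repository.

```python
import re

# Hypothetical schema: fields that every identity record must carry.
REQUIRED_FIELDS = ("employee_id", "name", "email")
# Hypothetical business rule: IDs are the letter E followed by four digits.
EMPLOYEE_ID_PATTERN = re.compile(r"E\d{4}")
# A loose shape check only, not a full email address validator.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_identity(record: dict) -> list:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")

    employee_id = record.get("employee_id") or ""
    if employee_id and not EMPLOYEE_ID_PATTERN.fullmatch(employee_id):
        problems.append("employee_id must look like E0000")

    email = record.get("email") or ""
    if email and not EMAIL_PATTERN.fullmatch(email):
        problems.append("email is not well formed")

    return problems


good = {"employee_id": "E1001", "name": "Pat Lee", "email": "pat.lee@example.com"}
bad = {"employee_id": "1001", "name": "", "email": "pat"}

good_problems = validate_identity(good)  # empty list
bad_problems = validate_identity(bad)    # one entry per failed check
```

Returning every problem, rather than stopping at the first, gives the receiving system enough detail to log or reject the record in a single pass.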

Use open standards rather than proprietary standards

As we've discussed, open standards have a number of advantages over proprietary standards.

Use encryption to protect sensitive identity elements

Sensitive identity data should never be stored unencrypted. Encrypting the entire record is usually too disruptive, because the record has to be decrypted each time any part of it is used. A better strategy is to encrypt just the data elements that are sensitive. As we saw in Chapter 6, the XML Encryption standard can be used to encrypt only certain elements in an XML record.