Teach Yourself Microsoft SQL Server 6.5 in 21 Days



Day 14

Data Integrity


On Day 13, "Indexing," you saw that indexing is critical for several reasons. You can use indexes to optimize selecting data from your SQL Server databases, as well as to optimize data-modification operations. You also can use indexes to enforce the uniqueness of rows in your tables. Clustered indexes physically sort your data, and nonclustered indexes maintain separate structures pointing to your data.

Today, you'll focus on data integrity. You'll look at two kinds of mechanisms to enforce data integrity: procedural and declarative. Declarative integrity gets most of today's attention, because it is declared as part of your tables' definitions rather than maintained in separate programmed objects.

Data Integrity

When you examine data integrity, you are trying to ensure that the data in your database is correct--both from a literal standpoint (without errors) and from a business standpoint. As just stated, you can enforce integrity procedurally or by using declarative integrity.

Procedural integrity means that you use programmatic structures and separate objects to enforce data integrity. Frequently, this is done at the application-program level. However, that means that if any data modifications occur outside of that program, your integrity rules are not enforced. Therefore, it makes sense to put much of this integrity enforcement at the database level, which you can do by using triggers and stored procedures. You will learn about these methods of enforcement on Day 15, "Views, Stored Procedures, and Triggers." The rest of today covers the traditional enforcement mechanisms of defaults and rules, as well as user-defined datatypes, and then the ANSI constraints.

The other approach is declarative integrity; here, the mechanisms to enforce integrity are declared as part of the definition of the objects (tables) in your database. The mechanisms become an integral part of these objects. You will spend most of your time examining these objects, because they provide critical functionality that otherwise would require significant amounts of programming to enforce.

Types of Integrity

Many components are used to enforce integrity in SQL Server 6.5. Some of them are more obvious than others. There are three types of integrity: domain, referential, and entity.

Domain Integrity

Datatypes, for example, help determine what values are valid for a particular column; this is known as domain integrity. The domain is simply the set of valid values for a particular column. Nullability is another option to determine which values are valid in the domain--in this case, whether the unknown value (NULL) is valid. You can put further restrictions on the domain of a column with user-defined datatypes, rules, and defaults if you want to use traditional SQL Server integrity objects. Otherwise, you can use ANSI constraints (default and check constraints) to enforce your domain integrity. Note that datatypes and nullability always are used; the other components are optional.

Referential Integrity

Referential integrity refers to the maintenance of relationships between tables and data. You can enforce referential integrity with declarative referential integrity (DRI) or programmatically with triggers, although the trigger approach takes quite a bit of work. SQL Server 6.5's DRI provides only delete restrict enforcement: if referential integrity is in place and invoices are outstanding for a customer, you cannot delete that customer. Another type of referential integrity enforcement is delete cascade. With this type of referential integrity in place, when you delete the customer, all related invoices also are removed from your database. To accomplish this, you must program it yourself (with triggers, stored procedures, or standard T-SQL statements).

Entity Integrity

The last type of integrity you should be concerned with is entity integrity, which means that you can uniquely identify every row in a table. You can do that with a unique index (as mentioned on Day 13) or with declarative integrity (primary keys or unique constraints), which you'll learn about today.

Traditional Methods of Integrity

Traditional methods of ensuring integrity include user-defined datatypes, defaults, and rules.

User-Defined Datatypes

User-defined datatypes help you enforce domain integrity. Normally, when you create a table, you define each column with a datatype and specify whether it allows nulls. You also can specify your own datatypes in SQL Server. However, you must define them in terms of system-supplied datatypes; you cannot create entirely new structures (such as records) as your own datatypes.

You use these user-defined datatypes when you want to translate logical datatypes from your data model into physical datatypes in SQL Server. If your data model uses postal codes on several tables, for example, a user-defined datatype might be appropriate: if postal codes are modeled as char(10), you can create a datatype called something like postal_code_datatype, defined as char(10).

To create this datatype, you run the sp_addtype system-stored procedure:

sp_addtype typename, phystype [, nulltype]

where typename is the name of the new datatype, phystype is the system datatype it is based on, and the optional nulltype ('null' or 'not null') sets the datatype's default nullability. For this example, you run

sp_addtype postal_code_datatype, 'char(10)'

Types are stored in the systypes system table in each database.

To create a state_code_type datatype of char(2) that does not allow nulls by default, for example, you can run this:

sp_addtype state_code_type, 'char(2)', 'not null'

To use it, run this:

Create Table mytable2
(col1 state_code_type)

col1 does not allow nulls, because you specified in the datatype that you don't want to allow nulls, regardless of whether the database is set up to allow nulls by default.

When you finish with a user-defined datatype (because you never used it or because you dropped all tables that use the datatype), you can drop it with the sp_droptype system-stored procedure. If a user-defined datatype is still in use anywhere, you receive an error if you try to drop it. You first must drop all tables that are using the datatype before you can eliminate it.

sp_droptype typename

where typename is the name of the user-defined datatype you want to drop.
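For example, after dropping any tables that use it, you could remove the state_code_type datatype created earlier:

Drop table mytable2
go
Exec sp_droptype state_code_type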

As you might expect, SQL Enterprise Manager provides an interface to support user-defined datatypes. Expand the database you want to work with (remember that datatypes are database specific), expand the Objects folder, and expand the User Defined Datatypes folder. You will see any datatypes you have created in this database. To create a new datatype, right-click the User Defined Datatypes folder and choose New UDDT from the pop-up menu. Or, you can highlight your database and choose Manage | User Defined Datatypes. Either action causes the Manage User-Defined Datatypes dialog box to appear, as shown in Figure 14.1.

To add a new user-defined datatype, simply type the name you want it to have, the owner of the object, the system datatype, and the length (if necessary). Then check the Allow Nulls column if you want the datatype to allow nulls by default. (You will learn more about the defaults and rules when you get to those objects later today.) Click Apply Now to add the user-defined datatype without exiting the dialog. To drop a datatype, simply highlight it and click Drop. Again, note that you cannot drop a datatype that is in use. To find out where your datatype is being used, highlight it and click Info; this causes a list of the tables and columns the datatype is used with to appear.

Figure 14.1. The Manage User-Defined Datatypes dialog box.

Defaults

Defaults specify the value placed in a column when you don't supply one explicitly during an insert. Two kinds of defaults exist in SQL Server: ANSI default constraints and standalone objects called defaults. In this section, you will examine the standalone objects called defaults; you'll examine ANSI defaults later in the day. You must have been granted permission in order to run the CREATE DEFAULT statement.

Because defaults are separate objects in a database, they must have names that are unique among all objects in the database. When created, defaults are recorded in the sysobjects system table in each database. The text of the CREATE DEFAULT statement is stored in the syscomments table.

As you might have guessed by now, you create defaults by using the CREATE DEFAULT statement.

CREATE DEFAULT [owner.]default_name
AS constant_expression

where default_name is the name of the default and constant_expression is an expression built only from constants and functions; it cannot reference columns or other database objects.

As you learned on Day 12, "Data Modification," you can use defaults during an insert by using the DEFAULT keyword or by simply not referencing the column in the column list.
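As a quick sketch (using the mytab table and the default bindings created later in this section), both of the following inserts put the bound default into char5col:

insert mytab (intcol, char5col, uddtcol) values (5, DEFAULT, $7.00)
insert mytab (intcol, uddtcol) values (5, $7.00)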

You have to address a couple of concerns when using defaults: the default's value must be compatible with the datatype of any column it is bound to, and it must not violate any rule bound to that column; otherwise, inserts fail when the default is applied.

So far, it has been implied that defaults apply to columns or user-defined datatypes. When you create a default, it is a standalone object in the database; it is not tied to any particular object. To actually use defaults, you must bind them to a column or user-defined datatype. To bind a default, you use the sp_bindefault system-stored procedure.

sp_bindefault defname, objname [, futureonly]

where defname is the name of the default, objname is either a user-defined datatype or a column named in the form tablename.columnname, and the optional futureonly parameter (valid only when binding to a user-defined datatype) prevents existing columns of that datatype from picking up the binding.

After the binding is complete, any user-defined datatype or column the default is bound to will have the default applied during an insert as appropriate.

To unbind the default, you run the sp_unbindefault system-stored procedure.

sp_unbindefault objname [, futureonly]

where objname is the column or user-defined datatype to unbind, and futureonly (again, valid only for user-defined datatypes) unbinds the default for future uses of the datatype while leaving existing columns bound.

After you unbind a default from a datatype or column, it no longer is used during an insert. You cannot drop a default until it is unbound from all datatypes and columns.

DROP DEFAULT [owner.]default_name [, [owner.]default_name...]

This one is pretty straightforward. You can drop as many defaults as you want in a single statement. If you are the owner of the default, you have the right to drop the default.
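For example (using names from the sample code that follows), you could unbind and then drop a default like this:

Exec sp_unbindefault 'mytab.intcol'
go
DROP DEFAULT intdefault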

Here is some sample code you can use to create a table and user-defined datatype and then apply some defaults:

Use pubs
go
Exec sp_addtype my_uddt_type, money
Go
Create default intdefault as 0
Go
Create default char5default as 'Hello'
Go
Create default moneydefault as $10.00
go
Create table mytab
(intcol int not null,
 char5col char(5) not null,
 uddtcol my_uddt_type not null)
go

Now bind the defaults to the datatype and to the columns:

Exec sp_bindefault moneydefault, 'my_uddt_type'
Exec sp_bindefault intdefault, 'mytab.intcol'
Exec sp_bindefault char5default, 'mytab.char5col'
Go

Now insert a default row and then select it:

INSERT mytab DEFAULT VALUES
Go
SELECT * FROM mytab
Go

You will see this return set, showing that the defaults were indeed used:

intcol     char5col    uddtcol
------     --------    -------
0          Hello       10.00
(1 row(s) affected)

Take a look at this through Enterprise Manager. Start Enterprise Manager if it's not already open, and expand your Pubs Database folder. You can choose Manage | Defaults or right-click the Defaults folder and choose New Default from the pop-up menu. Either action causes the Manage Defaults dialog box to appear, as shown in Figure 14.2.

Figure 14.2. The Manage Defaults dialog box.

On the Defaults tab, click the down arrow beside the Default field to view the drop-down list. Select <New Default> from the list, and simply type the name of the default you want to create. In the Description box, enter the character string, numeric value, money value, or binary value you want to give the default. When you begin typing in this window, the Add button becomes available. Click Add to add your default to the database.

After you create a default, you can view it by selecting it from the drop-down list in the Manage Defaults dialog box. To drop the default, select it and click Drop. To view where the default is used, click Bindings. A dialog box appears showing where your default is used (both in user-defined datatypes and columns).

If you select the Column Bindings tab, you see a drop-down list of tables you can choose from to bind with the currently selected default. In the Manage Defaults dialog box shown in Figure 14.3, you are binding a column in a table to a default (the char5_col of mytable(dbo)).

Figure 14.3. Binding a column with Enterprise Manager.

To configure bindings to user-defined datatypes, select the final tab--Datatype Bindings (see Figure 14.4). To select a binding, simply click on the appropriate column. To enable the futureonly option, simply click the Future Only box. To make your changes permanent, click Apply.

Click Close when you are ready to exit this dialog box. If you click Close without applying any changes you have made, you see a prompt asking whether you want to apply the changes.

Rules

Rules further enforce domain integrity by providing more sophisticated checking of valid values. Rules are typically used to ensure that values match a particular pattern, fall within a specified range, or appear in a list of acceptable values.

Figure 14.4. Binding a user-defined datatype with Enterprise Manager.

Rules, like defaults, are standalone objects that require special permissions to create. Rules are stored in the same system tables--sysobjects and syscomments--in each database. Rules are checked for violations when inserts and updates are performed (whenever the column the rule affects is referenced).

You create rules by using the CREATE RULE statement:

CREATE RULE [owner.]rule_name
AS condition_expression

where rule_name is the name of the rule and condition_expression is any expression valid in a WHERE clause. It must contain exactly one variable, prefixed with @, which stands in for the value being inserted or updated.

To create a rule for a part number column in which each value must start with a p or a t, for example, you can write a rule such as this:

CREATE RULE myrule AS @myvar like 'p%' OR @myvar like 't%'

Note that @myvar is named arbitrarily. You can call it @fredandethel if you want. Most people use a variable name that is similar to the name of the column(s) or datatype with which it's used, though.

Rules, just like defaults, also must be bound to a datatype or column. This is done with the sp_bindrule system-stored procedure:

sp_bindrule rulename, objname [, futureonly]

where rulename is the name of the rule, objname is either a user-defined datatype or a column named in the form tablename.columnname, and futureonly (valid only for user-defined datatypes) prevents existing columns of that datatype from inheriting the rule.

After the rules are bound, they are used just as defaults are. When you perform an insert, the data you insert is checked against the rule to make sure that it's valid. If it's not, you see a message similar to this:

Msg 513, Level 16, State 1
 A column insert or update conflicts with a rule imposed by
 a previous CREATE RULE command.
The command was aborted. The conflict occurred in database
 'pubs', table 't9', column 'col1'
Command has been aborted.

If you want to unbind the rule, you use the sp_unbindrule system-stored procedure:

sp_unbindrule objname [, futureonly]

where objname is the column or user-defined datatype to unbind, and futureonly behaves as it does for sp_unbindefault.

To drop a rule, you run the DROP RULE statement:

DROP RULE [owner.]rule_name [, [owner.]rule_name...]

where you simply specify the rule names you want to drop. As with the DROP DEFAULT statement, you can drop as many rules as you want with a single command. The rule(s) must not be bound to any columns or user-defined datatypes.
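For example, to drop the myrule rule created earlier (it was never bound, so no unbinding is needed):

DROP RULE myrule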

You can create a table just like the one used earlier to examine how rules work, as well as how they work together with defaults. Note that the code also binds the defaults created earlier to the new table's columns, so the DEFAULT VALUES insert that follows has defaults to draw on:

Create table myruletab
(intcol int not null,
 char5col char(5) not null,
 uddtcol my_uddt_type not null)
go
Create rule char5rule As @col like 'h%'
Go
Create rule intrule As @intval < 100
Go
Create rule moneyrule As @moneyval between $5.00 and $10.00
Go
Exec sp_bindrule char5rule, 'myruletab.char5col'
Exec sp_bindrule intrule, 'myruletab.intcol'
Exec sp_bindrule moneyrule, 'my_uddt_type'
Exec sp_bindefault intdefault, 'myruletab.intcol'
Exec sp_bindefault char5default, 'myruletab.char5col'
Go

Now insert a valid column (based on these rules):

insert myruletab values (90,'Howdy',$6.00)
go

Test that your defaults comply with the rules:

insert myruletab DEFAULT VALUES
go

In each case, you get the message (1 row(s) affected).

Now insert an invalid set of values (based on your rules):

insert myruletab values (101,'Ralph',$20.00)
go

This time, the insert fails with rule-violation messages like the Msg 513 error shown earlier. As you can see, rules and defaults together give you much finer control over the domain of valid values for a column.

Of course, you can again use SQL Enterprise Manager to perform all these operations. You can access the Manage Rules dialog by right-clicking the Rules folder and choosing New Rule from the context menu (see Figure 14.5). As before, you also can access this by choosing Manage | Rules.

The Manage Rules dialog works exactly like the Manage Defaults dialog box you looked at earlier.

Figure 14.5. The Manage Rules dialog box.

ANSI Constraints and Declarative-Integrity Mechanisms

You can use ANSI constraints and declarative-integrity mechanisms to ensure the accuracy of your data. These mechanisms include the Identity property and five kinds of constraints: default, check, primary key, foreign key, and unique.

The Identity Property

The Identity property was introduced in SQL Server 6.0. It enables you to use system-generated values in your tables. This is similar to the auto-number datatype in Microsoft Access and Microsoft FoxPro databases. You are allowed a single column in each table with the Identity property.

Typically, Identity columns are used to generate system-assigned keys. To enforce entity integrity, you need to have a way to uniquely identify every row in a table. If there is no natural column or set of columns that does this for you, you might want to create an Identity column.

You can use the Identity property only if the column it's being assigned to is an integer or is compatible with an integer. Therefore, you can use the following datatypes: int, smallint, tinyint, decimal, and numeric.

You can use numeric and decimal only if they have a scale of 0--such as numeric(12,0). Identity columns also must not allow nulls. You might prefer numeric or decimal because they can hold more digits than int; the Identity property doesn't re-use values by default and won't wrap around, so a larger datatype gives you more headroom.

Each time you perform an insert into a table with the Identity property enabled for a column, the next available value is inserted into that column.

IDENTITY [(seed, increment)]

where seed is the value assigned to the first row inserted and increment is the amount added to the previous Identity value for each subsequent row.

If you don't specify a seed or increment, each defaults to 1. Hence, the first row has a value of 1, the next 2, and so on.

The Identity property is used during a CREATE TABLE or ALTER TABLE statement. For example,

Create table mytable5
(col1 int not null IDENTITY(1,100),
 col2 char(5) not null)

results in a value of 1 for col1 when the first row is added, then 101, then 201, and so on. Because Identity columns don't wrap, you can see that you might run out. Because you can't change datatypes in SQL Server after you create a table, make sure that you select a datatype large enough to handle any conceivable values you might have for this column.

If you use IDENTITY without the optional parameters, it is set to (1,1).

You can refer to the column by using the keyword IDENTITYCOL instead of the proper column name. Because only one column in each table can have the Identity property set, SQL Server always can figure out which column you're referencing. In the preceding example, you could run

select identitycol from mytable5

and SQL Server would return only the data for col1.

To get information about an Identity column on a table, you can run the sp_help system-stored procedure, specifying the table name as a parameter. You also can use the system function IDENT_INCR or IDENT_SEED.

IDENT_SEED('tablename')
IDENT_INCR('tablename')

where tablename is the name of the table containing the Identity column; IDENT_SEED returns the seed value, and IDENT_INCR returns the increment.
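For example, for the mytable5 table created earlier, these functions return the seed (1) and the increment (100):

SELECT IDENT_SEED('mytable5'), IDENT_INCR('mytable5')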

From a programmatic perspective, an important question comes up right away: how do you know what value was inserted last? Enter the global variable @@IDENTITY. Every time you insert a row into a table that has an Identity column, the @@IDENTITY value is updated.

Try the code here to see how @@IDENTITY works; this example uses a new table, myidenttab, defined just like mytable5:

Create table myidenttab
(col1 int not null IDENTITY(1,100),
 col2 char(5) not null)
go
insert myidenttab (col2) values ('howdy')
select @@identity

(Note: It should be 1.)

Insert myidenttab(col2) values ('Movie')
Select @@identity

(Note: Now it should be 101.)

When you run the TRUNCATE TABLE statement, it resets the Identity value back to the initial seed. Ordinary deletes--even deletes of every row in the table--do not have this effect, however. Try it here:

Delete myidenttab
Go
Insert myidenttab (col2) values ('Zebra')
Select @@identity

(Note: It should be 201, even though it's the only row in the table.)

Truncate table myidenttab
Go
Insert myidenttab (col2) values ('howdy')
Select @@identity

(Note: It should be 1 again.)

Identity values are kept as a pool in memory, so it's not guaranteed that every value will be used; some transactions may be canceled, and server crashes sometimes occur. Also, at times (especially after a server crash), Identity values may be re-used. If you want the values in your Identity column to be unique, you need to create a unique index on that column.
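Here's a minimal sketch of such an index on the myidenttab table from the preceding examples (the index name is an assumption):

CREATE UNIQUE NONCLUSTERED INDEX ndx_myidenttab_col1
ON myidenttab (col1)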

Some people want to re-use Identity values. By default, however, you cannot manually insert into Identity columns. You can use the identity_insert option, however, to override that rule for a single table from a single session.

The following code turns on the capability to directly insert into a table's Identity column:

Set identity_insert [database.[owner.]]tablename ON|OFF

You must be the SA, DBO, or the table owner to turn on this option.

To use this insertion capability on myidenttab, run this T-SQL code. You must run it all from within the same ISQL/w window (or SQL Query Tool window) because of the session-specific SET statement.

set identity_insert myidenttab on
go
insert myidenttab (col1, col2) values (2,'jolly')
go
set identity_insert myidenttab off
go
select * from myidenttab

This code inserts the row you requested into the table. You must specify the column list, even if you are specifying a value for every column in the table.

ANSI Constraints

ANSI constraints are functionally very similar to the traditional objects you looked at earlier. However, they are not separate objects; they are part of the definition of the tables in your database. You can use ANSI constraints to enforce domain integrity with default and check constraints, as with defaults and rules, or referential integrity, with primary keys and foreign keys. You also can enforce entity integrity with unique constraints and/or primary keys.

Using ANSI constraints can be a major improvement over using defaults and rules. There's no need to keep a separate set of objects that are used with your tables and no need to keep track of bindings. Constraints are stored in the sysconstraints, sysreferences, syscomments, and sysobjects system tables in each database.

Now examine the syntax of constraints as an extension to the CREATE TABLE and ALTER TABLE statements.

CREATE TABLE [database.[owner].]table_name
({col_name column_properties [constraint [constraint
 [...constraint]]]| [[,] constraint]}[[,] {next_col_name |
 next_constraint}...])
[ON segment_name]

This syntax is the same syntax you worked with on Day 5, "Creating Tables," except now you will focus on where it says constraint. Constraints follow this form:

[CONSTRAINT <name>] <Type of Constraint> [<Constraint Options>]

ALTER TABLE [database.[owner.]]table_name [WITH {CHECK | NOCHECK}]
{{CHECK | NOCHECK} CONSTRAINT {constraint_name | ALL} |
 ADD {col_name column_properties [column_constraints] |
  [[,] table_constraint]}
  [, {next_col_name | next_table_constraint}]... |
 DROP CONSTRAINT constraint_name [, constraint_name2]...}

Constraints in the ALTER TABLE statement take the same form shown earlier for CREATE TABLE.

Two forms of constraints exist: column level and table level. Column-level constraints are applied as part of a column's definition in the CREATE TABLE statement, and table-level constraints are added as if they were additional columns. Examples are the easiest way to differentiate the two forms of constraints.

Column Level

Create table mytablea
(col1 int not null CONSTRAINT DF_a_col1 DEFAULT (0))

Table Level

Create table mytableb
(col1 int not null)
alter table mytableb ADD
CONSTRAINT DF_b_col1 DEFAULT (0) for col1

Default Constraints

Default constraints are very much like SQL Server defaults. However, default constraints apply only to columns--never to user-defined datatypes. You cannot apply default constraints to columns that are also Identity columns, nor to columns defined with the timestamp datatype. The difference here is that the default is part of the column's definition instead of being bound to the column. Default constraints are enforced only during inserts, just as SQL Server default objects are.

Column Level

 [CONSTRAINT constraint_name]
DEFAULT {constant_expression | niladic-function | NULL}

Table Level

[CONSTRAINT constraint_name]
DEFAULT {constant_expression | niladic-function | NULL}
FOR col_name

The CONSTRAINT constraint_name part of the syntax is optional. This is the part that specifies that you are adding a constraint explicitly (always a good idea from a documentation perspective), as well as giving the constraint a name. If you do not name a constraint, it ends up with a name like this:

DF__mytab__col1__6BC40900

Usually, therefore, it's a good idea to name your constraints; you'll soon see that you might want to run some operations that require you to name the constraint with which you're working.

The keyword DEFAULT is next, followed by a constant appropriate for the datatype, NULL, or a niladic function. Niladic functions include the following:

CURRENT_TIMESTAMP Gets the current date and time. Equivalent to select getdate().
SYSTEM_USER Gets the current login name. Equivalent to select suser_name().
CURRENT_USER Gets the current database user name. Equivalent to select user_name().
USER The same as CURRENT_USER.
SESSION_USER The same as CURRENT_USER.
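As a quick sketch (the table and column names here are hypothetical), niladic functions make handy defaults for audit columns:

CREATE TABLE audittab
( rowdata    char(10)    NOT NULL,
  entered_by varchar(30) NOT NULL DEFAULT USER,
  entered_on datetime    NOT NULL DEFAULT CURRENT_TIMESTAMP
)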

Functionally speaking, there is no difference here between table-level and column-level default constraints.

Here are a couple of examples to show default constraints:

CREATE TABLE defaulttab1
( intcol int NOT NULL CONSTRAINT df_intcol DEFAULT 0,
  char5col char(5) NOT NULL DEFAULT 'Hello',
  anumber numeric(10,0) NOT NULL
)

Note that the first constraint is named, but the second one is not; therefore, it has a system-assigned name:

Alter table defaulttab1
ADD moneycol money NULL CONSTRAINT df_moneycol DEFAULT $2.00,
CONSTRAINT df_anumber DEFAULT 100 FOR anumber
Go

Run sp_help to verify that the constraints are properly on the table, and you will see something like this in the constraint section of the report:

constraint_type            constraint_name               constraint_keys
------------------------------------------------------------------------
DEFAULT on column char5col DF__defaultta__char5__6F9499E4     ('Hello')
DEFAULT on column anumber  df_anumber                         (100)
DEFAULT on column intcol   df_intcol                          (0)
DEFAULT on column moneycol df_moneycol                        (2.00)

Notice the system-assigned name for the default constraint you didn't name.

Check Constraints

Check constraints function very much like rules. They provide a mechanism to enforce domain integrity for your columns. Unlike other ANSI constraints, you can have as many check constraints as you want on a single column. Check constraints carry many of the same restrictions as default constraints; for example, you cannot use them with columns that have the timestamp datatype or the Identity property. Check constraints are checked during inserts and updates, just as rules are.

Check constraints can do something that rules cannot do, however. Check constraints can refer to other columns as part of their enforcement of conditions. You can only do that with table-level constraints, however.

Column Level

[CONSTRAINT constraint_name]
CHECK [NOT FOR REPLICATION] (expression)

Table Level

[CONSTRAINT constraint_name]
CHECK [NOT FOR REPLICATION] (expression)

where expression is any expression that would be valid in a WHERE clause, and NOT FOR REPLICATION means the constraint is not enforced when data arrives by way of replication.

As noted previously, only table-level constraints can have references to multiple columns.

Examples are probably the best way to see how check constraints work:

Create table checktable
(col1 int not null CONSTRAINT ck_col1
     CHECK (col1 between 1 and 100),
 col2 char(5) null,
 zip_code char(5) null,
 col4 int not null,
 CONSTRAINT ck_col4 CHECK (col4 > col1),
 CONSTRAINT ck_zip_code CHECK
(zip_code like '[0-9][0-9][0-9][0-9][0-9]')
)
Alter table checktable
ADD CONSTRAINT ck_col2 CHECK (col2 like 'H%')
go

Note that the constraints now will be enforced. For example,

Insert checktable values (1,'Howdy','99901',2)

works, but

insert checktable values (2,'Howdy','8834A',3)

fails with the message

Msg 547, Level 16, State 2
INSERT statement conflicted with COLUMN
CHECK constraint 'ck_zip_code'. The conflict occurred in
 database 'pubs', table 'checktable', column 'zip_code'
Command has been aborted.

Primary-Key Constraints

Primary-key constraints are used for a combination of referential integrity and entity integrity. Every column used for a primary key must be defined with the NOT NULL attribute, and only one primary-key constraint may exist on a single table. The primary-key constraint may be referenced by foreign-key constraints. Some processes, such as replication or open database connectivity (ODBC) applications, may require declared ANSI primary keys.

Creating a primary-key constraint implicitly creates a unique index. By default, this index is clustered.

Column Level

 [CONSTRAINT constraint_name]PRIMARY KEY [CLUSTERED | NONCLUSTERED]
(col_name [, col_name2 [..., col_name16]])
[WITH FILLFACTOR = fillfactor][ON segment_name]

Table Level

[CONSTRAINT constraint_name]PRIMARY KEY [CLUSTERED | NONCLUSTERED]
(col_name [, col_name2 [..., col_name16]])
[WITH FILLFACTOR = fillfactor][ON segment_name]

where constraint_name names the constraint, CLUSTERED or NONCLUSTERED determines the type of unique index created (CLUSTERED is the default), fillfactor works just as it does in CREATE INDEX, and segment_name places the index on a particular segment.

There is no real functional difference here between column-level and table-level constraints. However, with a column-level constraint, the col_name parameter is optional. If you don't specify it, it's assumed to be the column on which you put the constraint.

You can create a primary key on a single column or on up to 16 columns. This code creates a unique clustered index on col1 of table pktable:

Create table pktable
(col1 int not null CONSTRAINT pk_col1 PRIMARY KEY,
 col2 char(5) null
)

This code creates a unique nonclustered index on col1 of table pktable2:

Create table pktable2
(col1 int not null CONSTRAINT pk2_col1 PRIMARY KEY nonclustered (col1),
 col2 char(5) null
) 

This code creates a unique clustered index on (col1, col2) of table pktable3:

Create table pktable3
(col1 int not null,
 col2 char(2) not null,
 col3 int null,
 CONSTRAINT pk3_col1col2 PRIMARY KEY (col1, col2)
)

In all instances, you can view the index, but you cannot manipulate it directly. If you attempt to drop the index on the last table, for example, you get an error such as this:

Msg 3723, Level 16, State 1
Explicit DROP INDEX not allowed on index 'pktable3.pk3_col1col2'.
 It is being used for PRIMARY KEY constraint enforcement.

Unique Constraints

Unique constraints allow you to create unique indexes, just as primary keys do, but with a bit more flexibility. You typically create unique constraints when more than one column or set of columns could serve as a valid primary key. Creating unique constraints serves two purposes: It documents the potential key choices, and it allows foreign keys on other tables to reference the unique constraints (in addition to allowing foreign keys to reference primary-key constraints).

You also can create unique constraints on columns that allow nulls (although only one row may actually be NULL, because NULLs count as duplicates of each other in a unique index). You can have more than one unique constraint on a table.

Column Level

[CONSTRAINT constraint_name] UNIQUE [CLUSTERED | NONCLUSTERED]
[(col_name)] [WITH FILLFACTOR = fillfactor] [ON segment_name]

Table Level

[CONSTRAINT constraint_name] UNIQUE [CLUSTERED | NONCLUSTERED]
(col_name [, col_name2 [..., col_name16]])
[WITH FILLFACTOR = fillfactor] [ON segment_name]

Just as before, the name of the unique constraint is optional. Also, just as with primary-key constraints, at the column level, you don't have to list any columns. It assumes you're referencing the column on which you create the constraint.

So, just a quick example of this:

Create table myuniquetable
(col1 int not null CONSTRAINT pk_myuniquetable PRIMARY KEY,
 col2 char(20) not null CONSTRAINT u_myuniquetable UNIQUE
)

This code creates a primary key as well as a unique constraint. Both are unique indexes on the table myuniquetable.
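Unique constraints also work on columns that allow nulls, as in this sketch (the table and column names are hypothetical, and at most one row may have a NULL phone):

Create table myphonetab
(emp_id int not null CONSTRAINT pk_myphonetab PRIMARY KEY,
 phone char(12) null CONSTRAINT u_phone UNIQUE
)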

Foreign-Key Constraints

Foreign-key constraints protect referential integrity between tables. You create a foreign key on a table that references another table's primary-key or unique constraint. This method restricts data modifications against the table with the primary key, as long as related rows exist in the tables with the foreign keys. This method also prevents data from being added (or updated) on the table with the foreign-key constraint that would not contain valid data from the referenced table(s).

Creating a foreign key does not create an index on the table; however, the foreign-key columns usually are good candidates for an index, so you typically follow the creation of tables with foreign keys with CREATE INDEX statements. Foreign-key constraints can refer only to tables in the same database. You must have the appropriate permissions (select or references) on the table you refer back to, and any single table can have a maximum of 31 foreign keys pointing back to it. There is no way to extend this limit.
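For example, a supporting index for the orders table created later in this section might look like this (the index name is an assumption):

CREATE INDEX ndx_orders_emp_id ON orders (emp_id)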

Column Level

[CONSTRAINT constraint_name] [FOREIGN KEY [(col_name)]]
REFERENCES [owner.]ref_table [(ref_col)]

Table Level

[CONSTRAINT constraint_name] FOREIGN KEY (col_name
 [, col_name2 [..., col_name16]]) REFERENCES [owner.]ref_table
 [(ref_col [, ref_col2 [..., ref_col16]])]

Again, the constraint name is optional. As with the other referential constraints, you don't have to have the column name referenced locally if it's a single-column constraint. Also, you don't have to name the column on the other table if the columns have the same name.

If you reference multiple-column primary-key/unique constraints, you must be careful to reference them in the same order between your column list in the FOREIGN KEY list and the REFERENCES list. Self-references are supported, so you can reference the table to itself (with another column).
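Here's a sketch of a multiple-column foreign key (the pkdetail table is hypothetical) referencing the pktable3 table created earlier; notice that the FOREIGN KEY and REFERENCES column lists appear in the same order:

Create table pkdetail
(col1 int not null,
 col2 char(2) not null,
 qty int null,
 CONSTRAINT fk_pkdetail FOREIGN KEY (col1, col2)
 REFERENCES pktable3 (col1, col2)
)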

In this example, you will create an employee table and an order table (which was entered by an employee). To verify that a valid employee entered the order, you can program the functionality or declare it with foreign keys. Then, when someone tries to delete an employee, he or she won't be allowed to do so as long as there are orders for that employee.

Create table emp
(emp_id int not null CONSTRAINT pk_emp PRIMARY KEY,
 emp_name char(30) not null)
go
create table orders
(order_id int not null CONSTRAINT pk_order PRIMARY KEY,
 emp_id int not null CONSTRAINT fk_order
FOREIGN KEY (emp_id) REFERENCES emp (emp_id)
)
go
insert emp values (1,'Joe Smith')
insert emp values (2,'Ann Jones')
insert orders values (1,1)
insert orders values (2,2)
go

All of this works fine so far. Now try to insert an order for an employee who doesn't exist:

Insert orders values (3,3)
go
Msg 547, Level 16, State 2
INSERT statement conflicted with COLUMN FOREIGN KEY constraint
 'fk_order'. The conflict occurred in database 'pubs',
table 'emp', column 'emp_id'
Command has been aborted.

Now try to delete an employee who has an order:

Delete emp where emp_id = 1
go
Msg 547, Level 16, State 2
DELETE statement conflicted with COLUMN REFERENCE
constraint 'fk_order'. The conflict occurred in database 'pubs',
 table 'orders', column 'emp_id'
Command has been aborted.

A use of the self-referencing behavior looks something like this:

Create table emp_manager
(emp_id int not null CONSTRAINT pk_emp_mgr PRIMARY KEY,
 mgr_id int not null CONSTRAINT fk_emp_mgr FOREIGN KEY
REFERENCES emp_manager (emp_id),
  emp_name char(30) not null)

This means that every manager also must be a valid employee:

Insert emp_manager values (1,1,'Ann Jones')
Insert emp_manager values (2,1,'Tom Smith')

This works fine, but now try to reference someone who doesn't exist yet:

Insert emp_manager values (3,4,'Bob Newett')

Here, you get a similar message as before--that the foreign-key constraint was violated. This can be very useful in many real-world scenarios.

As you can see, foreign keys are quite powerful. However, they force you to use a database in a particular fashion--the delete restrict functionality. Delete restrict means that deletions of primary-key (or unique constraint) rows are not allowed if any foreign keys point to them. You first must delete any rows on the tables with the foreign-key references before you can delete the primary-key rows.

Another way you might want to deal with data is by delete cascade functionality. This functionality implies that the DELETE from the emp table also would delete all related rows in the orders table, for example. It's somewhat unlikely that you would use this functionality in most business environments; it's far more likely that you would do something like set an inactive status flag for the employee and not delete all orders taken by that employee when he or she leaves the company. SQL Server's declarative referential integrity does not implement delete cascade; if you need it, you have to program it yourself.
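As a rough sketch (triggers are covered on Day 15), a cascading delete could look like the following. Note that this assumes the fk_order constraint has been dropped, because constraint checking would reject the delete before the trigger ever fires:

Create trigger trg_del_emp ON emp FOR DELETE
AS
DELETE orders
FROM orders, deleted
WHERE orders.emp_id = deleted.emp_id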

Dropping Constraints

You can drop a constraint with the ALTER TABLE statement. To drop the foreign-key constraint in the last example, you run

Alter table emp_manager DROP CONSTRAINT fk_emp_mgr

and the foreign-key constraint is dropped.

However, if you try to drop a primary-key constraint (or unique constraint) that still has foreign-key references, you will not be able to do so. With the emp_manager table, for example, if you try to drop the primary-key constraint (without having dropped the preceding foreign key) with the code

Alter table emp_manager DROP CONSTRAINT pk_emp_mgr

you get this error message:

Msg 3725, Level 16, State 0
Constraint 'pk_emp_mgr' is being referenced by
table 'emp_manager', foreign key constraint 'fk_emp_mgr'.
Msg 3727, Level 16, State 0
Unable to drop constraint. See previous errors.

SQL Enterprise Manager

All this functionality can be accomplished with Enterprise Manager. To access this dialog box, choose Manage | Tables (or right-click on a table and choose Edit from the context menu). The Manage Tables dialog box appears. Click the icon of the table with the green plus sign on it to access the advanced features, as shown in Figure 14.6.

Figure 14.6. Using the advanced options of the Manage Tables dialog box.

You can see tabs available for Primary Key/Identity, Foreign Keys, Unique Constraints, and Check Constraints. To step through setting up a new table with one of each constraint, select <new> from the list of tables, and then enter information in the columns, as shown here and in Figure 14.7:

Column Name  Datatype  Size  Nulls  Default
pkcol        int
fkcol        int                    3
chkcol       char      10
defcol       int                    0
altpkcol     int

Figure 14.7. The Manage Tables dialog box with your new table.

Now click the Advanced Features icon. Click the down arrow to access the Primary Key Column Name drop-down list and select pkcol, as shown in Figure 14.8. In the Type section, select Non-Clustered. Then, select pkcol from the Identity Column drop-down list. Now click Add to set the primary key and Identity properties.

Figure 14.8. The primary key and Identity properties.

Next, select the Foreign Keys tab, as shown in Figure 14.9. Select dbo.emp - pk_emp_1_10 as the referenced table, and select fkcol from the Foreign Key Columns drop-down list. Name the foreign key fk_ourtable and click Add.

Next, select the Unique Constraints tab, as shown in Figure 14.10. Add a new unique constraint named unique_ourtable, and select altpkcol as the column name. Leave the default of Non-Clustered selected in the Type section and click Add.

Finally, select the Check Constraints tab, as shown in Figure 14.11. Here, add a check constraint called ck_ourtable. Set the constraint to chkcol like 'h%' and then click Add.

Figure 14.9. The Foreign Keys tab.

Figure 14.10. The Unique Constraints tab.

Figure 14.11. The Check Constraints tab.

Click the Advanced Features icon again to hide the constraint dialogs, and click the floppy disk icon (to the left of Advanced Features) to save the table. When prompted, name the table ourtable.

Now, run sp_help on the table from the SQL Query Tool or ISQL/w and view the constraints.

Notice that you've created one of each kind of constraint on this table, and you've used the Identity property. The Enterprise Manager Manage Tables dialog allows quite a bit of functionality in a small amount of space.

Deferring Constraints

Foreign-key and check constraints can be deferred when they are created. Primary-key, unique, and default constraints cannot be deferred (because they create indexes in the case of primary-key and unique constraints, and defaults are never checked for previous data). When you alter a table, add one of these constraints, and specify the NOCHECK option, existing data is not validated. If you alter a table and add a check constraint, existing data is verified for validity, as shown in this example:

Create table mytesttable
(col1 char(5) not null
)
go
insert mytesttable values ('Howdy')
insert mytesttable values ('Grant')

Now try to create a constraint that will fail for some of the data:

alter table mytesttable
add constraint ck_mytest check (col1 like 'h%')

It fails, and you see this:

Msg 549, Level 16, State 2
data exists in table 'mytesttable', database 'pubs', that
violates CHECK constraint 'ck_mytest' being added.
ALTER command has been aborted.
Msg 1750, Level 16, State 0
Unable to create constraint. See previous errors.

However, if you create the same constraint with the NOCHECK option, it works:

alter table mytesttable
WITH NOCHECK
add constraint ck_mytest check (col1 like 'h%')

The same is true of foreign-key constraints. Be careful when using this option, because you are recognizing that you may have invalid data in your database.

Disabling Constraints

You can disable constraints temporarily so that you can add data that violates your check constraints or your foreign-key constraints. To do this, run the ALTER TABLE command with the NOCHECK CONSTRAINT <constraintname> option, or with NOCHECK CONSTRAINT ALL to disable all check and foreign-key constraints.

Continuing the preceding example, the following disables constraint checking:

alter table mytesttable
NOCHECK CONSTRAINT ALL

You then can add invalid data (data that violates the check constraint). You can re-enable the constraint by running

alter table mytesttable
CHECK CONSTRAINT ALL

Summary

You can enforce data integrity in several ways. Traditional integrity-enforcement options in SQL Server revolve around rules, defaults, and user-defined datatypes. You also can enforce data integrity, as well as referential integrity, by using ANSI constraints. The proper choice is entirely up to your individual application needs. ANSI constraints and declarative referential integrity are typically preferred, because they tend to perform better and are integrated into the definitions of objects. If you need some functionality, however, such as delete cascade referential integrity, you will not be able to use declarative referential integrity. You will need to use programming such as triggers or stored procedures.

You also learned about the Identity property. This property can help you create automatic key values when no natural primary key is available for a table. These values are extremely useful in most database designs.

Q&A

Q What options can I use to enforce entity integrity?

A Unique indexes, primary keys, and unique constraints.

Q What does the Identity property do to a column?

A It generates an automatic value for an integer column.

Q Can a rule refer to another column in a table?

A No. Only table-level check constraints can refer to another column.

Workshop

This Workshop provides quiz questions to help you solidify your understanding of the material covered. Try to understand the quiz before continuing on to the next day's lesson. Answers are provided in Appendix B, "Answers."

Quiz

1. Is the following a valid default definition?
	   Create default mydefault as 'UNKNOWN'
1A. Yes.

2. Can you drop a rule while it's still bound to a column, even if there's no data in the table?

2A. No; you first must unbind the rule from all columns and datatypes, regardless of whether the tables contain any data.


3. Is a numeric datatype allowed for Identity columns?

3A. Yes, as long as the scale is 0.


4. Can a foreign key refer to a table in a different database?

4A. No; foreign keys can refer only to tables in the same database.


5. Can a unique constraint be deferred or disabled?

5A. No; it creates a unique index that cannot be disabled.


6. Can you drop the indexes created by some constraints?

6A. Not directly; you must manage them by controlling the constraints.



