Special Edition Using Microsoft SQL Server 6.5 Second Edition

Previous chapterNext chapterContents


- 8 -
Adding, Changing, and Deleting Information in Tables

SQL Server permits you to change the column values of tables and add new columns to a table.
See how to complete standard add, change, and delete operations on your database tables.
You'll learn to combine data from multiple tables using relational joins and the UNION statement.
SQL Server permits you to remove selected rows from a table with the DELETE statement.

Understanding how to retrieve information from your database, as outlined in the last chapter, is only the start of making full use of SQL Server. As you saw in Chapter 4, "Data Modeling, Database Design, and the Client/Server Model," it takes some planning to implement a good database structure. By planning, you introduce some intricacies into how you work with the information in the database. This is because, in many cases, you must work with more than one table to retrieve or update the information for a given transaction.

Information that is stored in multiple tables is often combined in a single query, called a Join. A Join lets you combine rows logically across tables, producing a single output table.

The tables must be created with related columns of data, typically by creating a common key between the tables. You must be able to issue queries that not only combine but also eliminate values from multiple tables. This is where the restrictive clauses of the SELECT statement come into play. The goal is always that only the requisite rows appear in the resultant record set. The typical database operations, add, change, and delete, are accomplished with INSERT, UPDATE and DELETE statements, respectively. These statements can operate on one or more rows in your table and can be directed against a logical view across multiple tables.

In addition to changing the values contained in existing columns within table rows, you'll find that you may need to add additional columns of information to a database table. You use the ALTER TABLE statement to add one or more columns to an already existing table, as well as change other characteristics of the table.

In this chapter, you learn to write queries that retrieve rows of related information from multiple tables. You also learn to update and delete the rows of a database table and to change the characteristics of database tables by adding columns. n

Adding Rows

Of all the operations you perform on your database tables, the act of adding information to your database is probably the most basic. After all, you have to have the information in the system before you can write a really great client/server application.

To add information to your tables, you use the INSERT statement. The INSERT statement lets you indicate the table and columns you're inserting information into, and the values that are to be inserted. The syntax of the INSERT statement is as follows:

INSERT [into] [target] [(columns)] [DEFAULT | values | statement]

The into clause is optional, but you might want to consider including it because it makes it a bit easier to read. It's sort of like commenting your code. It doesn't impact performance positively or negatively, so the added clarification on the statement might prove useful at a later time.

The target parameter can refer to one of two things: Table--Indicates the name of the table into which you want to insert the values. This can also take the form, as indicated in Chapter 7, "Retrieving Data with Transact-SQL," of specifying the database.owner.tablename you want to insert into. This is probably the option you'll use most often. View--You can use views to insert information into the underlying tables. For more information about this technique of updating your tables, see Chapter 10, "Managing and Using Views." In most cases, you'll be calling out the name of the table you'll be inserting into, but you must indicate one of the two options, or SQL Server won't know where to store the information you're inserting.

Columns tells SQL Server into which columns you'll be inserting. More importantly, however, it specifies two additional items: order of the incoming information, and whether any columns are being excluded from the incoming data.

For example, if you have a table with Name, Address, and Phone as columns, your insert statement might begin as follows:

insert into addresses (Name, Phone)...

This example would let you insert two values, but would skip the address column.


CAUTION: If you call out the columns in your INSERT statement and do not provide information for a column that does not allow NULLs, the insert fails.

You can work around this by using the DEFAULT option, indicating a value that should be used if a value is not provided in the INSERT statement. The easier approach is to define a Default Constraint on the table. You can find out more about this option in Chapter 6 in the "Default Constraint" section and in Chapter 13, "Managing and Using Rules, Constraints, and Defaults."


The last option gives the values that should be placed in the database to SQL Server. For example, to insert a record with the partial field listing used in the previous code snippet, the statement should look like the following:

insert into addresses (Name, Phone) values ("Caitlin Wynkoop", "520-555-1212")

This statement results in the single row being inserted with the updated name and phone number . The Address column is null, or populated by the default value assigned to it.

If you want to insert a row that simply contains all of the default values that you've defined as DEFAULT constraints for the table, you can use the DEFAULT VALUES clause on the INSERT statement. When you do, the values for the columns in the table are all set up to contain the defaults or NULL, whichever is defined for the table with which you are working.

The final option for inserting information into your tables is to use a SELECT statement to gather the values to be inserted. This may seem a bit strange at first, but you'll use this often when you consider moving information from one table to another for backup purposes, structure changes, or simply so you can work with the data on a test table, rather than on the production tables.

Here's a simple example of this type of INSERT statement:

insert into addresses 
     (name, phone, address) 
     values (select name, phone, address 
     from prod_addresses)

In this example, the prod_addresses table contains the source information. Of course, it could also contain other columns, beyond those being pulled as the source for the addresses table, but, in this example, it's only required that you insert the three columns indicated.

What happens is that a single, new row is created in the addresses table for each row in the prod_addresses table. Note that the columns do not need to be specified in the physical order in which they appear on the addresses table.

The same is true of the prod_addresses table. The columns are indicated in the order in which they are needed to populate the first table. Keep in mind that the order, although not important in and of itself, must match between the tables. If it doesn't, you end up with one of two things happening. First, you have values from the source table showing up in the wrong columns in your target table. This problem can be a difficult one to track down because it generates no error indication from SQL Server.

Second, the insert fails because the data types are not correct from one table to the other. For example, if you try to insert a CHAR type from the second table into an INT in the target, SQL Server is unable to copy the information into the target.

In the next section, you see how you can use the UPDATE statement to make changes to values in your database tables.

Making Changes to Existing Information

An update or change can be performed in SQL Server in two ways. Under certain conditions, changes can be made directly to the rows of database tables. When a direct update of the table row can be made, the operation is done quickly and requires little overhead to perform the operation. An update directly to the rows of a table is referred to as an update in place.

A second way that you can change the rows of a table is an indirect or deferred update. In such an update, the change is made by deleting the row to be modified and then inserting the row as a new row with the new values in place. Although it typically still occurs quickly, a deferred update is slower because two operations are required to make a change in the row of a table: the delete, and the insertion.

The conditions under which a direct update can be performed are primarily determined by restrictions set on the database table. The following conditions must be met for a direct update to be performed on a table:

Also, a number of conditions must be met for an update in place to be performed on updates that change a single row:

The following set of conditions must be met for an update in place that changes multiple rows:


See Chapter 12, "Understanding Transactions and Locking," in the section titled "Understanding Locks," for more information.

See Chapter 11, "Managing and Using Indexes and Keys," in the section "Defining Indexes," for help about creating indexes.

See Chapter 15, "Creating and Managing Triggers," in the section titled "Using INSERT and UPDATE Triggers," to learn more about other factors that may come into play when you update your tables. For example, triggers can be set up to automatically start a process when you do an update. It is important to understand this functionality as you design your system.


If needed, perhaps because you'll be making many subsequent updates on your database tables, you can plan the table design so that all updates are direct. You can consider all the restrictions for direct updates to ensure that your updates are performed as quickly as possible.

You can use the query option, SHOWPLAN, to determine whether an update was direct or deferred. SHOWPLAN shows you exactly what SQL Server is doing behind the scenes when you execute a query. For example, for a simple query, Listing 8.1 is what SHOWPLAN indicates.

Listing 8.1 Output from SHOWPLAN Operation

STEP 1
The type of query is SELECT (into a worktable)
GROUP BY
Vector Aggregate
FROM TABLE
wwwlog 
Nested iteration
Table Scan
TO TABLE
Worktable 1
STEP 2
The type of query is SELECT
FROM TABLE
Worktable 1
Nested iteration

You turn on SHOWPLAN as a toggle. By issuing a SET SHOWPLAN ON command, all queries for that session with SQL Server will include the information like that shown in the previous listing. Key items of note include Table Scan entries. Table scans are almost never a good thing and can be downright crippling on large tables. What a table scan means is that SQL Server couldn't use any existing index to retrieve the information in the manner you requested. Instead, SQL Server read each and every row in the table to determine how it compared to your criteria. This is a much slower process than that of working an index on a table.

There will be cases where you cannot prevent a table scan, but if you can add an index that makes sense, you should consider doing so. The index will improve your performance, sometimes dramatically and make your users much happier with you.

The Process of Updating Rows

Obviously, it's likely that users will change the information in your database after it's initially entered. You can use an UPDATE statement to modify the existing column values of table rows. The simplified syntax of an UPDATE statement is as follows:

UPDATE table_name
SET column_name_1 = value,......column_name_n = value
WHERE column_name comparison operator value

The first thing you need to indicate is the table name; this can be specified with the database and owner as prefixes to the table name.

As with the SELECT statement, you use the WHERE clause to identify the rows to be changed. The WHERE clause, used as part of a SELECT statement, narrows the scope of your selection of rows that will be returned or affected by the query. In an UPDATE statement, the WHERE clause is used to identify the rows that are changed, instead of the rows to be displayed.


TIP: You can use the UPDATE statement to change erroneous entries or misspellings for the column values of existing rows of a table.

In the following example, the values for the department and badge columns of the Employees table are changed for the employee Bob Smith. If more than one row has an employee named Bob Smith, the department and badge number of each row is changed.

update employees
set department = `SALES', badge = 1232
where name = `Bob Smith'


NOTE: You can also UPDATE views as well as tables with the UPDATE statement. You simply use the name of the View in place of the table name in the UPDATE clause of the UPDATE statement. In many different operations, views are treated the same as tables. For more information on working with Views in this manner, see Chapter 10, "Managing and Using Views."

You can use UPDATE to change multiple rows that match the criteria specified by the WHERE clause. In the following example, all rows that contain the department SALES are changed to MARKETING:

update employees
set department = `MARKETING'
where department = `SALES'


CAUTION: You must be careful to specify only the rows you want changed. If you omit a WHERE clause from an UPDATE statement, the change specified in the SET clause is made to every row of the table. The following example shows a change that is made to all rows of a table:

update employees
set wageclass = `W0'

There are usually two reasons why an UPDATE statement doesn't contain a WHERE clause. First, where the WHERE was inadvertently omitted, or, second, because you purposely want to change a column for all rows of a table. For example, when you added a new column to a table with the ALTER TABLE command, you may have assigned a null value for the new column for all existing rows of the table. You can use an UPDATE statement without a WHERE clause to add a non-null value to the new column for all rows.




TIP: You can use a SELECT count(*) statement with the same criteria, specifically, your WHERE clause, that you plan to use in your UPDATE statement to learn the number of rows that will be subsequently changed by your UPDATE statement. By first determining the number of rows that will be affected by your UPDATE, you're more likely to notice any mistakes in your criteria.


See the section in Chapter 9, titled "Using COUNT," For more information on using the Count function.

SET Clause Options

You can also use an expression or the keywords DEFAULT and NULL in the SET clause of an UPDATE statement. If you use an expression in a SET clause instead of a constant value, the expression is first evaluated, and its result is assigned to the rows that are specified in the UPDATE statement.

In the following example, a raise in the hourly rate is given to all employees by updating the rate column of the pays table:

update pays
set rate=rate+2

You can use the keyword, J, to change the column value of the specified rows of a table to nulls. The table column that is to be assigned a null value must have been created with the NULL characteristic originally. In the following example, an employee who has been moved out of his current department, but not yet assigned to another department, has his department changed to a null:

update employees
set department=null
where name=`Bob Smith'

You can also use the UPDATE statement to assign a DEFAULT value if a default value has been associated with the table column. In the following example, the department for an employee is changed to the default value that was previously established and associated with the department column:

update employees
set department=default
where name ='Sally Springer'


See the sections in Chapter 13, titled "Creating Defaults" and "Binding Defaults." For more information on how you can create values that are used as defaults and put into place in your tables automatically by SQL Server.


NOTE: As indicated in the earlier section on using the Insert statement, if a default doesn't exist for the column and the column permits nulls, the column value is changed to a null.

Deleting Rows

Removing rows from a database table is another operation that you must be able to perform to maintain a database. Use a DELETE FROM statement to remove table rows. The syntax of a DELETE [FROM] statement is as follows:

DELETE [FROM] table_name
WHERE column_name = `value'


CAUTION: The DELETE statement is one to be taken seriously and cautiously. It's extremely easy to remove all rows from a table and, before you know it, you'll be looking for your most recent backup to restore your table values. There are few, if any, cases where you'll not want a very explicit WHERE clause with your DELETE statement, and you should always reread your statement prior to pressing the EXECUTE button.

You don't need to use the keyword FROM. It's optional in the DELETE statement. You can delete rows from tables, as well as update tables, through views.

In the following example, the operation of the DELETE statement removes all rows that match the criteria specified in the WHERE clause of the DELETE statement. In this case, we're removing all rows that contain the department "SALES."

delete from employees
where department = `SALES'


TIP: You can first use a COUNT function in a SELECT statement that has an identical WHERE clause to your DELETE statement to determine the number of rows that will be subsequently removed.

Keep in mind, when you are using the DELETE statement, that the sort order, which was selected when you installed SQL Server, is very important. If you find that upper- and lower-case specifics as part of the WHERE clause don't seem to have any effect on the criteria, it's likely that the sort order is case-insensitive. If you want all your subsequent DELETE and UPDATE statements to be case-sensitive, you might want to update SQL server. Use SQL setup from the SQL server program group to effectively reinstall SQL Server with a case-sensitive sort order specified.


See the section in Chapter 3, titled "Installing Server and Client Software," for more information on setting SQL Server options.

You can use a DELETE FROM statement to remove multiple rows and individual rows. However, use the DELETE FROM statement carefully. If you don't use a WHERE clause in a DELETE FROM statement, all table rows are removed, leaving you with an empty table. You'll receive no warning before the DELETE FROM statement is executed. In the following example, all rows of the specified table are deleted:

delete from employees


NOTE: The execution of a DELETE statement without a WHERE clause that removes all rows of a table is most often an accident. If you want to delete all rows of a table, but still keep the table intact, you should use the TRUNCATE statement. The syntax of TRUNCATE table is as follows:

Truncate table_name

The advantage of using a TRUNCATE TABLE statement is that the removal of rows is faster than with an equivalent DELETE statement. The TRUNCATE statement is faster because it removes pages of information that contain multiple table rows at a time whereas the DELETE statement removes individual rows at a time. However, you can't recover table rows with the TRUNCATE TABLE statement. Unlike the DELETE statement, the TRUNCATE statement does not maintain a copy of the deleted rows even if it's part of a defined transaction.

TRUNCATE TABLE and DELETE TABLE retain the database table. If you want to permanently remove a table, as well as all of the rows that it contains, you can use the DROP TABLE statement, which uses the following syntax:

ROP TABLE table_name

After you drop a table, you can't recover the rows that it contained except from a previously made backup copy of the table.




TIP: Another advantage of the TRUNCATE statement is that it doesn't log the removal of the information in the transaction log. If you have a situation where your transaction log has become full, you can still use the TRUNCATE statement to remove rows and free-up space in the database.

Adding Columns with ALTER TABLE

You primarily use the ALTER TABLE command to add more columns to an existing table. You're limited in the operations you can perform on existing columns. For example, you can't delete a column or change the size or datatype of an existing column.


NOTE: Using the ALTER TABLE statement you can implement changes to the datatype of existing columns. The ALTER TABLE statement in SQL Server and Transact-SQL, however, doesn't permit datatype changes to existing rows. In Microsoft SQL Server, you must create a new table, and then read the rows out of the old table into the new table to effect such a change.

The syntax of the ALTER TABLE statement is as shown in Listing 8.2.

Listing 8.2 Alter Table Database Syntax

ALTER TABLE [[<database.>]<owner.>]<table_name>
ADD <column_name> <datatype> NULL [constraint],
<column_name> <datatype> NULL...] [constraint]
[WITH NOCHECK]
[DROP [CONSTRAINT]
      constraint_name [..., constraint_name_n]]

When ALTER TABLE is executed, it doesn't expand existing rows. It changes only the internal description of the added columns in the system tables. Each time an existing row is read from the disk, SQL Server adds the additional null entry for the new column or columns before it's available to a user.

When a new row is written to the disk, SQL Server creates the new row with the additional column and its value. SQL Server writes the row with the additional column unless no value is specified for the new row and its value remains a null. In Listing 8.3, sp_help is used to display the existing characteristics of a table in which three columns are defined:

Listing 8.3 Database Specifics for the Employees3 Table sp_help employees3

Name                           Owner                       Type
------------------------------ --------------------------------------
employees3                     dbo                         user table
Data_located_on_segment        When_created
------------------------------ --------------------------
default                        Jul 5 1994 10:08PM
Column_name     Type            Length Nulls
Default_name    Rule_name
-------------- -------------- ------ ---- -------------- ------
name            char            30     0     (null)          (null)
department      char            30     0     (null)          (null)
badge           int             4      0     (null)          (null)
Object does not have any indexes.

No defined keys for this object.

ALTER TABLE is used to add a new column to the table. You use the sp_help procedure to verify that the new columns have been added to the table. In Listing 8.4, SELECT causes all rows of the new table, including nulls in the new column for all rows, to be displayed:

Listing 8.4 Alter Table and Query Table Output

alter table employees3
add wageclass char(2) null
sp_help employees3
Name                           Owner                          Type
------------------------------ ------------------------------ ----------------
employees3                     dbo                            user table
Data_located_on_segment        When_created
------------------------------ --------------------------
default                        Jul 5 1994 10:08PM
Column_name     Type            Length Nulls
Default_name    Rule_name
-------------- -------------- ------ ---- -------------- --------------
name            char            30     0     (null)          (null)
department      char            30     0     (null)          (null)
badge           int             4      0     (null)          (null)
wageclass       char            2      1     (null)          (null)
Object does not have any indexes.
No defined keys for this object.
select * from employees3
name                           department                     badge   wageclass
------------------------------ ------------------------------ ------ --------
Stan Humphries                 Field Service                  3211    (null)
Fred Stanhope                  Field Service                  6732    (null)
Sue Sommers                    Logistics                      4411    (null)
Lance Finepoint                Library                        5522    (null)
Mark McGuire                   Field Service        1997      (null)
Sally Springer                 Sales                9998      (null)
Ludmilla Valencia              Software             7773      (null)
Barbara Lint                   Field Service        8883      (null)
Jeffrey Vickers                Mailroom             8005      (null)
Jim Walker                     Unit Manager         7779      (null)
Bob Smith                      SALES                1234      (null)
(11 row(s) affected)

You can use an UPDATE statement to define values for new columns that are added to a table with ALTER TABLE.

The null values are inserted when a new column is added to the table with the ALTER TABLE statement. In Listing 8.5, all table rows have a new value added to the column that was added with an earlier UPDATE TABLE statement. A subsequent SELECT statement is used to display all rows of the table that include the new column values.

Listing 8.5 Update and Select Output

update employees3
set wageclass='w4'
(11 row(s) affected)
select * from employees3
name                      department            badge       wageclass
---------------------- ---------------------- ----------   --------
Stan Humphries            Field Service        3211        w4
Fred Stanhope             Field Service        6732        w4
Sue Sommers               Logistics            4411        w4
Lance Finepoint           Library              5522        w4
Mark McGuire              Field Service        1997        w4
Sally Springer            Sales                9998        w4
Ludmilla Valencia         Software             7773        w4
Barbara Lint              Field Service        8883        w4
Jeffrey Vickers           Mailroom             8005        w4
Jim Walker                Unit Manager         7779        w4
Bob Smith                 SALES                1234        w4

(11 row(s) affected)

You can also define a new column that you've added to a table with the identity characteristic. In Chapter 5 you learned that the identify characteristic permits you to define an initial value for the first row of the table, the seed, and a value that is added to each successive column to automatically generate a new column value, the increment.


NOTE: You can't assign the Identity characteristic to an existing column. Only new columns that are added to a table with the ALTER TABLE command can be defined with the identity characteristic. Also, if the value, automatically generated for a new column by the identity mechanism, exceeds the allowable values for the column's datatype, the ALTER TABLE statement fails, and an error is displayed.


See the section in Chapter 5, titled "identity Property," for more information on establishing an identity column.

In the following example, an additional column is added to table pays, which is defined with the identity characteristic and can be subsequently used as a row number:

Alter table pays
add row_number identity(1,1)

You can also add one or more columns to a table using the SQL Enterprise Manager. To add a column to an existing table through the SQL Enterprise Manager, follow these steps:

1. Right-click a selected table to which you want to add a column.

2. Choose Edit from the menu.

3. Enter one or more columns in the Manage Tables dialog box. You enter a column name, choose a datatype and a size for the datatype.

4. Click the Save Table tool on the toolbar to keep the additional columns that you've added to a table. In Figure 8.1, an additional column, wageclass, is added to the employees table.

FIG. 8.1
You can't deselect the null property on a column added to an existing table.


TIP: You can also double-click a table to open it and edit its structure.

Changing the Width of a Table Column

Remember, that you can't use ALTER TABLE to change the size of an existing table column or its datatype. You can use ALTER TABLE only to add a new column. You also can't change the size of the column of a table through the SQL Enterprise Manager. You can, however, drop a column or narrow its datatype by creating a new table with smaller, but compatible, datatypes and fewer columns.

In Listing 8.6, a new table is created in which the name and datatype of the first column are identical to the first column in an existing table, but the first column in the new table is smaller in size. In the new table, the second column is defined as VARCHAR instead of CHAR, as it's defined in the second column of the existing table. The new table's third column is defined as SMALLINT instead of INT, as it's defined in the corresponding Badge column in the existing table. The SMALLINT datatype uses half the storage space of INT.

After the new table is created, all rows of the older table are loaded into the new table with an INSERT statement. A SELECT statement is then used to display the rows of the new table.

Listing 8.6 Create Table and Select Output

create table employees4
(name char(15), department varchar(20),badge smallint)
insert into employees4
select name,department,badge from employees
(11 row(s) affected)
select * from employees4
name            department           badge
-------------- -------------------- ------
Stan Humphries  Field Service        3211
Fred Stanhope   Field Service        6732
Sue Sommers     Logistics            4411
Lance Finepoint Library              5522
Mark McGuire    Field Service        1997
Sally Springer  Sales                9998
L. Valencia     Software             7773
Barbara Lint    Field Service        8883
Jeffrey Vickers Mailroom             8005
Jim Walker      Unit Manager         7779
Bob Smith       SALES                1234
(11 row(s) affected)

The sp_help procedure in Listing 8.7 shows the difference between datatypes in the corresponding columns of the two tables used in the example. The example shows only the relevant parts of the display returned by sp_help.

Listing 8.7 Table Details for the Employees Table

sp_help employees
Name                           Owner                          Type
------------------------------ ------------------------------ ------
employees                      dbo                       user table
...
name            char            20     0     (null)          (null)
department      char            20     0     deptdefault     (null)
badge           int             4      0     (null)          (null)
...
sp_help employees4
Name                           Owner                          Type
------------------------------ ------------------------------ ------
employees4                     dbo                         user table
-------------- -------------- ------ ---- -------------- --------
...
name            char            15     0     (null)          (null)
department      varchar         20     0     (null)          (null)
badge           smallint        2      0     (null)          (null)
...

The INSERT table statement successfully completes because the data from the earlier table is compatible with the columns defined for the new table. If the data isn't compatible between the tables, you receive an error message. In Listing 8.8, a new table that defines a column as a character datatype is created. The attempted insertion of the corresponding column from one table results in an error message because the datatypes can't be implicitly converted.

Listing 8.8 Creating the Table and then Selecting the Results from the INSERT Operation

create table onecolumn
(badge char(4))
insert into onecolumn
select badge from employees
Msg 257, Level 16, State 1
Implicit conversion from datatype `int' to `char' is not allowed.
Use the CONVERT function to run this query.


NOTE: If you're transferring a large number of rows between tables, you can first set a database option called select into/bulkcopy. If the select into/bulkcopy option is set, your rows are copied into a new table faster because SQL Server keeps less information in its transaction logs about your operation. The lack of complete log information about your operation, which prevents an undo or rollback operation to be done later, is probably not important because you still have the rows in the original table intact should the need arise to undo any operations.
From an ISQL/W command line, the select into/bulk copy option can be set on or off by issuing the following command:

sp_dboption database_name, `select into/bulkcopy', TRUE|FALSE

For example, the following command turns on select into/bulkcopy for the database employees:

sp_dboption database_employees, `select into/bulkcopy', true






TIP: You can also change a database option using the graphical interface of the SQL Enterprise Manager rather than a command line.

Removing a Column from a Table

Although you can't remove a column from a table with the ALTER TABLE command, you can remove a column from a table through a series of operations. You also can't remove a column from a table with the SQL Enterprise Manager. First, create a new table that you define with all but one of the columns in an existing table. Then use an INSERT statement to copy rows from the original table to the new table, minus the column that you didn't define in the new table.

In the following example, a new table, which contains only two of the three columns defined in an existing table is defined; INSERT is used with a SELECT statement that references only two of the three columns of the original table in the SELECT clause:

create table employees5
(name char(20), badge int))
insert into employees5
select name,badge from employees

Adding Constraints with ALTER TABLE

You can also use the ALTER TABLE command to add, drop, apply, or bypass constraints or checks on existing tables. Constraints are defined to provide data integrity on added columns. The ALTER TABLE statement, like the CREATE TABLE statement, enables you to add a column to a table with primary and foreign key, unique, and check and default constraints. You can add or drop constraints to or from a table without adding a new column. The syntax for constraints is identical to the syntax used for defining constraints in the CREATE TABLE statement.


See the section in Chapter 13, titled "Creating and Using Constraints," for more information about defining constraints.

In the following example, a unique CONSTRAINT is added to the badge column for the table employees2:

ALTER TABLE employees2
ADD
CONSTRAINT badgeunc UNIQUE NONCLUSTERED (badge)


NOTE: Microsoft added a number of additional options to the ALTER TABLE statement in version 6 of SQL Server. All the additions were characteristics that were made to a table in other ways before version 6. You can continue to use the older and more direct ways of changing table characteristics. For example, an index, default, or rule can be defined and subsequently associated with a table using CREATE INDEX, CREATE RULE, or CREATE DEFAULT commands.
The changes that were made to the ALTER TABLE statement, as well as many other statements, allow Transact-SQL to meet the specifications of a standardized specification of SQL, ANSI SQL. The addi-tions for ANSI compatibility result in multiple ways of performing the same operations, sometimes using different keywords or syntax.

You can easily drop a constraint from a table using the DROP CONSTRAINT clause of the ALTER TABLE statement. You simply specify the name of the constraint to be removed from a table after the keywords DROP CONSTRAINT. For example, to remove a DEFAULT constraint on the department column for the employees table, enter the following statement:

alter table employees
drop constraint department_default

Using the WITH NOCHECK Clause

You can add a NOCHECK clause to an ALTER TABLE statement to specify that a CHECK or FOREIGN KEY constraint shouldn't be applied on the existing rows of a table. The constraints added with the ALTER TABLE statement that contain the WITH NOCHECK clause are in effect only for rows that are subsequently changed or inserted. You can use a NOCHECK clause in an ALTER TABLE statement when you're certain that the existing data doesn't violate the constraints to speed up the execution of the ALTER TABLE statement.

You can't use WITH NOCHECK to override the initial checking of PRIMARY KEY and UNIQUE constraints. By default, SQL Server applies the constraints to existing rows in the table, as well as new rows that are added or changed later. You'll receive an error message and the ALTER TABLE statement will fail if existing data violates your constraint.

You can also specify that a CHECK constraint, which is added to a table through the ALTER TABLE statement, isn't applied to the existing rows of a table through the NOT FOR REPLICATION clause. NOT FOR REPLICATION operates as though the WITH NOCHECK clause were added to the ALTER TABLE statement. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement for a different purpose than the WITH NOCHECK clause.

If you set up the automatic copying of a table and table rows from one server system to another, the actual work of ensuring the server system that receives a copy of the data is done by an intermediate server. The NOT FOR REPLICATION clause is added to an ALTER TABLE statement to prevent the table copy on the intermediate server from being checked--an unnecessary operation.


See the section in Chapter 17, titled "Setting Up and Managing Replication," for more information.

Adding Constraints Through the SQL Enterprise Manager

You can add table and column constraints through the SQL Enterprise Manager. To add a constraint to a table or column through the Enterprise Manager, follow these steps:

1. Right-click a selected table to which you want to add a constraint in the main window of the Server Manager.

2. Choose Edit from the menu. You can also double-click the table.

Click the Advanced Features tool on the toolbar. You can click the Primary Key/Identity, Foreign Keys, Unique Constraints, or Check Constraints tabs to create each type of constraint.

3. Enter the requisite information in the Constraint box that is displayed after you click a tab. For example, in Figure 8.2, a Check Constraint is entered on the department's column for the employees table to prevent any department that isn't one of three department values from being entered and stored.

FIG. 8.2
The
Not for Replication check box can also be checked when the Check Constraint option is defined.

4. Click the Save Table toolbar button to apply the constraint to the table.

Performing Relational Joins

You can combine and display the rows of different tables and operate on the data by using the same statements used in single tables. The rows of multiple tables can be combined in various ways. The first way is called an equi-join or natural join.

You perform a natural join or equi-join by matching equal values for rows in shared columns between multiple tables. You must define one of the two tables so that a column from one of the tables is duplicated in the second table. The column from the original table can be its primary key if it also is a column with values that make the rows of the table unique. The duplicate column that is added to a second table is referred to as a foreign key.

You define a foreign key to permit rows from different tables to be related. In a sense, the matching columns are used to form virtual rows that span a database table. Although each table is limited to 250 columns, matching columns to combine rows from multiple tables can result in an almost unlimited number of columns that can be combined across tables.

You use a standard SELECT statement with a WHERE clause to retrieve data from two or more tables. The syntax of a SELECT statement used to join two tables is as follows:

SELECT column_name_1,...column_name_n
FROM table_name_1, table_name_2
WHERE primary_key_column=
foreign_key_column

The CREATE TABLE and INSERT statements in Listing 8.9 are used to create a new table, to be used for subsequent relational join examples, in the same database. Each row that is added to the pays table matches one row of the employees table.

Listing 8.9 Populating the pays table

create table pays
(hours_worked int, rate int,badge int)
go
insert into pays
values (40,10,3211);
go
insert into pays
values (40,9,6732);
go
insert into pays
values (52,10,4411);
go
insert into pays
values (39,11,5522);
go
insert into pays
values (51,10,1997);
go
insert into pays
values (40,8,9998);
go
insert into pays
values (55,10,7773);
insert into pays
values (40,9,8883);
go
insert into pays
values (60,7,8005);
go
insert into pays
values (37,11,7779);
go

In Listing 8.10, three columns from two tables are displayed after the Badge column is used to combine the rows that have matching Badge numbers:

Listing 8.10 Selecting from the Related Tables

select name, department,hours_worked
from employees,pays
where employees.badge=pays.badge
name                 department
hours_worked
-------------------- -------------------- --
Stan Humphries       Field Service        40
Fred Stanhope        Field Service        40
Sue Sommers          Logistics            52
Lance Finepoint      Library              39
Mark McGuire         Field Service        51
Sally Springer       Sales                40
Ludmilla Valencia    Software             55
Barbara Lint         Field Service        40
Jeffrey Vickers      Mailroom             60
Jim Walker           Unit Manager         37
(10 row(s) affected)

An equi-join doesn't eliminate any of the table columns from the temporary tables that are formed by the joining of tables. You must use a WHERE clause to match the corresponding rows of the tables. You also shouldn't use the asterisk wildcard character to reference all columns of the combined tables. If you use an asterisk, the columns with matching values are displayed twice.

In Listing 8.11, the rows of two tables are accessed without using a WHERE clause. SQL Server forms a cross-product of the rows in both tables. If you don't try to combine the rows using matching columns, each row of the second table in the FROM clause is added to every row of the first table. The Badge column is displayed from both tables because the asterisk wildcard is used in the SELECT clause.

Listing 8.11 Selecting from the Related Tables Without the WHERE Clause

select *
from employees,pays
name               department      badge    hours_worked rate   badge
------------------ -------------- -------- ------------ ------  ----
Stan Humphries     Field Service   3211     40           9       6732
Stan Humphries     Field Service   3211     40           10      3211
Stan Humphries     Field Service   3211     52           10      4411
Stan Humphries     Field Service   3211     39           11      5522
Stan Humphries     Field Service   3211     51           10      1997
Stan Humphries     Field Service   3211     40           8       9998
Stan Humphries     Field Service   3211     55           10      7773
Stan Humphries     Field Service   3211     40           9       8883
Stan Humphries     Field Service   3211     60           7       8005
Stan Humphries     Field Service   3211     37           11      7779
Fred Stanhope      Field Service   6732     40           9       6732
...
(100 row(s) affected)

The combination of all rows of the second table with the first table results in a cross-product, also called a Cartesian Product, of the two tables. In the example, the employees table and the pays table each contain 10 rows, so the resultant cross-product creates 100 rows in the temporary table. However, only 10 of the 100 rows belong together. The badge numbers match in one out of every 10 rows between the two tables.


TIP: The SELECT statement operates on multiple tables whether or not they were designed to be combined with a relational join. It's important to always use a WHERE clause, which eliminates rows that don't have matching column values.

If you don't use a WHERE clause, you receive a temporary table that contains the cross-product of the number of rows in the first table multiplied by the number of rows in the second table. For example, two tables that each contains only 100 rows joined without a WHERE clause return 10,000 rows.


If you reference one of the columns used to match the rows across both tables, you must indicate the table in which the column is defined. Any time you reference a column that has the same name in multiple tables, you must somehow specify which column is from which table to prevent ambiguity. The following example displays an error because SQL Server doesn't know from which table to display the Badge column:

select badge
from employees,pays
where employees.badge=pays.badge
Msg 209, Level 16, State 1
Ambiguous column name badge

To avoid ambiguity, the table columns used for matching rows are preceded by the table in which they're defined and separated by a period (.). In the following example, the Badge column is displayed from the first table by preceding the name of the Badge column with its table name:

select employees.badge
from employees,pays
where employees.badge=pays.badge
badge
----------
3211
6732
4411
5522
1997
9998
7773
8883
8005
7779
(10 row(s) affected)

Using Range Variables

In the previous example, the name of the table is used to prevent ambiguity when referencing table columns in a SELECT statement when multiple tables are referenced. In fact, what appears in the examples to be the name of the table preceding the column name is actually a range variable.


NOTE: Other dialects of SQL refer to a range variable as an alias.

Range variables are symbolic references for tables that are specified in the FROM clause of a SELECT statement. You can use a range variable in a preceding clause, such as the SELECT clause, or in a clause that comes after the FROM clause, such as a WHERE clause. Define a range variable by specifying a character constant following the name of a table in the From clause of a SELECT statement, as in the following syntax:

...
From table_name_1 range_name_1, ...,table_name_n range_name_n
...

You can define a range variable for each table that is specified in the FROM clause. You can use as many as 30 characters, the limit for any permanent or temporary object in Transact-SQL, to define the range variable. A range variable can be defined to provide a shorter reference to a table in a SELECT statement. In Listing 8.12, a range variable is defined for each table. The range variables are used in both the SELECT and WHERE clauses:

Listing 8.12 Selecting Specific Rows from the Related Tables

select e.badge,p.id
from employees e,pays p
where e.badge=p.id
badge       id
---------- ----
3211        3211
3211        3211
3211        3211
3211        3211
6732        6732
4411        4411
5522        5522
1997        1997
9998        9998
7773        7773
8883        8883
8005        8005
7779        7779
3211        3211
3211        3211
3211        3211
3211        3211
(17 row(s) affected)

Range variables are so named because after they're defined, the symbolic reference applies to, or ranges through, the table. As in the previous example, you can define a range variable to be a single character and use it as a short nickname for a table.

Range variables can be quite handy because several tables with long names can be specified in a SELECT statement. You can combine rows from as many as 16 tables in SQL Server using Transact-SQL. If you don't explicitly define range variables in the FROM clause, they're implicitly created using the complete name of each table.

In the following example, the rows from three tables are combined using the Badge columns that are defined in all the tables. The range variables are implicitly defined to the table names and are used in both the SELECT and WHERE clauses. The WHERE clause first combines the first and second tables. The first and third tables use the AND Boolean operator:

select pays.badge, name,department, pay rate
from employees,pays, salaries
where employees.badge = pays.badge
and employees.badge=salaries.badge


NOTE: Transact-SQL automatically establishes range variables for use in queries. If you don't specify a range variable for the name of a table in the FROM clause, a range variable is created with the same name as the table, as in the following example:

select name
from employees

This snippet is internally rewritten as

select employees.name
from employees employees

Although it seems unnecessary to create range variables when only a single table is named in the query, they are mandatory when you reference tables that contain columns with the same names in the same query.


Using Many-to-One and One-to-Many Joins

You may not have tables that have only one corresponding row in each table. In previous examples, only a single row in the employees table matches the value of a single row in the pays table. It's possible that you'll have to create or work with tables in which more than one entry is a match for the entries in another table.

In the ensuing examples, rows have been added with identical badge numbers in the employees table. Three employees have been added, each with a last name of Smith and each with the same badge number. In the tables referenced in the following examples, badge isn't defined as a primary key; so duplicate badge numbers can be present. Three employees with last names of Humphries have also been added, each with the same badge number as the original employee, Stan Humphries. Listing 8.13 shows the rows of the employees table after the additional seven rows are added.


NOTE: Although you usually define a primary and foreign key by using the corresponding columns of tables that you subsequently want to combine rows from for display, you aren't required to define the columns as keys. SQL Server permits you to perform joins on tables that don't have primary or foreign key definitions. You should realize that the assignment of primary and foreign keys to a table isn't required, though it's often desirable, if you combine data from different tables.

Listing 8.13 Selecting Rows in a Sorted Order

select * from employees
order by badge
name                 department           badge
-------------------- -------------------- ----
Bob Smith            SALES                1234
Henry Smith          Logistics            1234
Susan Smith          Executive            1234
Mark McGuire         Field Service        1997
Gertie Humphries     Sales                3211
Stan Humphries       Field Service        3211
Stan Humphries Jr    Sales                3211
Winkie Humphries     Mailroom             3211
Sue Sommers          Logistics            4411
Lance Finepoint      Library              5522
Fred Stanhope        Field Service        6732
Ludmilla Valencia    Software             7773
Jim Walker           Unit Manager         7779
Jeffrey Vickers      Mailroom             8005
Barbara Lint         Field Service        8883
Sally Springer       Sales                9998
(16 row(s) affected)

The pays table is unaltered and contains only the original 10 rows, as shown in Listing 8.14.

Listing 8.14 Selecting Rows from the Pays Table in a Sorted Order

select * from pays
order by badge
hours_worked rate        badge
------------ ---------- ----
51           10          1997
40           10          3211
52           10          4411
39           11          5522
40           9           6732
55           10          7773
37           11          7779
60           7           8005
40           9           8883
40           8           9998
(10 row(s) affected)

You can combine tables that have an unequal number of matching rows. Listing 8.15 joins the employees table, in which two sets of entries match a single entry for the badge column in the pays table. The join of the employees table with the pays table is called many-to-one.

Listing 8.15 Creating the Join Across Tables

select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
name                 badge     hours_worked   rate
----                 ----     ------------   ----
Fred Stanhope        6732      40             9
Stan Humphries       3211      40             10
Gertie Humphries     3211      40             10
Stan Humphries Jr    3211      40             10
Winkie Humphries     3211      40             10
Sue Sommers          4411      52             10
Lance Finepoint      5522      39             11
Mark McGuire         1997      51             10
Sally Springer       9998      40             8
Ludmilla Valencia    7773      55             10
Barbara Lint         8883      40             9
Jeffrey Vickers      8005      60             7
Jim Walker           7779      37             11
(13 row(s) affected)

If you switch the order of the joined tables, it becomes a one-to-many join. Listing 8.16 returns the same rows that were returned in the previous example:

Listing 8.16 Creating the Join Across Tables by Employee Badge

select name,pays.badge,hours_worked,rate
from pays,employees
where pays.badge=employees.badge
name                 badge       hours_worked   rate
----                 ----       ------------   ----
Fred Stanhope        6732        40             9
Stan Humphries       3211        40             10
Gertie Humphries     3211        40             10
Stan Humphries Jr    3211        40             10
Winkie Humphries     3211        40             10
Sue Sommers          4411        52             10
Lance Finepoint      5522        39             11
Mark McGuire         1997        51             10
Sally Springer       9998        40             8
Ludmilla Valencia    7773        55             10
Barbara Lint         8883        40             9
Jeffrey Vickers      8005        60             7
Jim Walker           7779        37             11
(13 row(s) affected)

Using Many-to-Many Joins

You may also want to join tables where more than one row matches more than one row in a second table, which is referred to as a many-to-many join. In Listing 8.17, two tables are combined after one row is added to pays with a 3211 badge number, 73 hours_worked, and rate of 31.

Listing 8.17 Showing the New Row Added

select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
name                 badge       hours_worked  rate
----                 ----       ------------  ----
Fred Stanhope        6732        40            9
Stan Humphries       3211        40            10
Gertie Humphries     3211        40            10
Stan Humphries Jr    3211        40            10
Winkie Humphries     3211        40            10
Sue Sommers          4411        52            10
Lance Finepoint      5522        39            11
Mark McGuire         1997        51            10
Sally Springer       9998        40            8
Ludmilla Valencia    7773        55            10
Barbara Lint         8883        40            9
Jeffrey Vickers      8005        60            7
Jim Walker           7779        37            11
Stan Humphries       3211        73            31
Gertie Humphries     3211        73            31
Stan Humphries Jr    3211        73            31
Winkie Humphries     3211        73            31
(17 row(s) affected)

The additional row is added to the displayed temporary table . If the row value is restricted to only the badge number 3211, eight rows are returned.

Many-to-many queries are often not desirable and can produce results that are difficult to follow. In most cases, it's best to implement either a one-to-many or a many-to-one relationship, even if it entails adding an intermediary table (see Listing 8.18).


For more information on database design approaches, see Chapter 4, "Data Modeling, Database Design, and the Client/Server Model."

Listing 8.18 Showing the Many-to-Many Relationship

select name,pays.badge,hours_worked,rate
from employees,pays
where employees.badge=pays.badge
and pays.badge=3211
name                 badge       hours_worked  rate
----                 ----       ------------  ----
Stan Humphries       3211        40            10
Gertie Humphries     3211        40            10
Stan Humphries Jr    3211        40            10
Winkie Humphries     3211        40            10
Stan Humphries       3211        73            31
Gertie Humphries     3211        73            31
Stan Humphries Jr    3211        73            31
Winkie Humphries     3211        73            31
(8 row(s) affected)

Using Outer Joins

In the previous join examples, we excluded the rows in either of the two tables that didn't have corresponding or matching rows.

Previous examples, in which the rows of two tables were joined with a WHERE statement, included all rows of both tables. However, a query that includes all rows from both tables is probably never useful, except to understand the way in which SQL Server combines the rows. You can combine any two or more tables with a WHERE clause and receive a set of rows that were never meant to be combined and thus receive a meaningless result.

In practice, you'll combine only the rows from tables that have been created to be matched together. Tables that are designed to be combined have common columns of information so that a WHERE clause can be included in a query to eliminate the rows that don't belong together, such as those that have identical values.


NOTE: You must ensure that the information used to combine tables-- the corresponding values in common columns-- remains valid. If a one-to-many relationship exists, and the value in one table is changed, the corresponding identical value, or values, must also be updated in other tables.
Referential integrity involves ensuring that you have valid information in common columns across tables used to join tables. You'll read more about referential integrity in subsequent chapters. Chapter 15, "Creating and Managing Triggers," discusses the mechanism for maintaining referential integrity and Chapter 11, "Managing and Using Indexes and Keys," discusses the common table columns on which joins are based.

Use outer joins to return table rows that have both matching and nonmatching values. You may need to return the rows that don't contain matching values in the common table columns for either one table or the other tables specified in the SELECT statement.

If, for example, you join the employees table with the pays table used in the previous examples, you can specify the return of rows with matching values along with rows without matching values. The specification of the outer join is positional, which means that you use a special symbol that precedes or follows the comparison operator in the WHERE clause of a SELECT statement.

An outer join references one of the two tables joined using the table's position in the WHERE clause. A left-outer join specifies the table to the left of a comparison operator, and a right-outer join specifies the table to the right of a comparison operator. The following table shows the symbol combination used for outer joins.

Symbol Combination Join
*= Left-outer join
=* Right-outer join

A left-outer join (*=) retains non-matching rows for the table on the left of the symbol combination in a WHERE statement. A right-outer join (=*) retains non-matching rows for the table on the right of the symbol combination.

In Listing 8.19, a SELECT statement specifies a join in the WHERE clause to return only rows that contain matching values in a common column for the two tables:

Listing 8.19 Join Limited by a WHERE Clause SELECT *

from employees,pays
where employees.badge=pays.badge
name               department     badge     hours_worked rate     badge
----------------   -------------- -------- ------------ -------- ------
Stan Humphries     Field Service  3211      40           10       3211
Gertie Humphries   Sales          3211      40           10       3211
Stan Humphries Jr. Sales          3211      40           10       3211
Winkie Humphries   Mailroom       3211      40           10       3211
Fred Stanhope      Field Service  6732      40           9        6732
Sue Sommers        Logistics      4411      52           10       4411
Lance Finepoint    Library        5522      39           11       5522
Mark McGuire       Field Service  1997      51           10       1997
Sally Springer     Sales          9998      40           8        9998
Ludmilla Valencia  Software       7773      55           10       7773
Barbara Lint       Field Service  8883      40           9        8883
Jeffrey Vickers    Mailroom       8005      60           7        8005
Jim Walker         Unit Manager   7779      37           11       7779
Stan Humphries     Field Service  3211      73           31       3211
Gertie Humphries   Sales          3211      73           31       3211
Stan Humphries Jr. Sales          3211      73           31       3211
Winkie Humphries   Mailroom       3211      73           31       3211
(17 row(s) affected)

In Listing 8.20, a left-outer join is used in the WHERE clause of a SELECT statement to specify that both rows containing matching values for a common column and the rows from the left table, employees, are included in the rows returned. This join might be useful if you needed to find out what employees don't have a pay rate associated with them.

Before the following query was executed, additional rows were added to the employees table that don't have corresponding values in a common column in the pays table:

Listing 8.20 Left-Outer Join Limited by a WHERE Clause

select *
from employees,pays
where employees.badge*=pays.badge
name               department    badge     hours_worked rate     badge
------------------ ------------ -------- ------------ -------- ----
Stan Humphries     Field Service  3211      40           10       3211
Stan Humphries     Field Servic   3211      73           31       3211
Fred Stanhope      Field Service  6732      40           9        6732
Sue Sommers        Logistics      4411      52           10       4411
Lance Finepoint    Library        5522      39           11       5522
Mark McGuire       Field Service  1997      51           10       1997
Sally Springer     Sales          9998      40           8        9998
Ludmilla Valencia  Software       7773      55           10       7773
Barbara Lint       Field Service  8883      40           9        8883
Jeffrey Vickers    Mailroom       8005      60           7        8005
Jim Walker         Unit Manager   7779      37           11       7779
Bob Smith          SALES          1234      (null)     (null)   (null)
Bob Jones          Sales          2223      (null)     (null)   (null)
Gertie Humphries   Sales          3211      40           10       3211
Gertie Humphries   Sales          3211      73           31       3211
Stan Humphries Jr. Sales          3211      40           10       3211
Stan Humphries Jr. Sales          3211      73           31       3211
Winkie Humphries   Mailroom       3211      40           10       3211
Winkie Humphries   Mailroom       3211      73           31       3211
Susan Smith        Executive      1234      (null)     (null)   (null)
Henry Smith        Logistics      1234      (null)     (null)   (null)

(21 row(s) affected)


NOTE: Recall that null values don't match, so rows that contain nulls in the primary and foreign key columns display only with outer joins and not with equi-joins.

In Listing 8.21, a right-outer join is used in the WHERE clause of a SELECT statement to specify that both rows that contain matching values for a common column, and the rows from the right table, pays, are included in the rows returned. Two additional rows, which don't have corresponding values in a common column in the employees table, are first added to the pays table.

Listing 8.21 Right-Outer Join Limited by a WHERE Clause

insert into pays
values (40,10,5555)
insert into pays
values (40,10,5555)
select *
from employees,pays
where employees.badge=*pays.id
name               department     badge     hours_worked rate     id
------------------ ---------- -------- ------------ -------- ----
Stan Humphries      Field Service 3211      40           10       3211
Gertie Humphries    Sales         3211      40           10       3211
Stan Humphries Jr.  Sales         3211      40           10       3211
Winkie Humphries    Mailroom      3211      40           10       3211
Fred Stanhope       Field Service 6732      40           9        6732
Sue Sommers         Logistics     4411      52           10       4411
Lance Finepoint     Library       5522      39           11       5522
Mark McGuire        Field Service 1997      51           10       1997
Sally Springer      Sales         9998      40           8        9998
Ludmilla Valencia   Software      7773      55           10       7773
Barbara Lint        Field Service 8883      40           9        8883
Jeffrey Vickers     Mailroom      8005      60           7        8005
Jim Walker          Unit Manager  7779      37           11       7779
Stan Humphries      Field Service 3211      73           31       3211
Gertie Humphries    Sales         3211      73           31       3211
Stan Humphries Jr.  Sales         3211      73           31       3211
Winkie Humphries    Mailroom      3211      73           31       3211
(null)              (null)        (null)    40           10       5555
(null)              (null)        (null)    40           10       5555

(19 row(s) affected)


TIP: Left- and right-outer joins can be used to show rows that contain nulls that wouldn't have corresponding entries across tables, and would only be displayed with other non-matching entries.

Combining Query Results with UNION

Use a UNION to combine the results of two or more queries. A UNION merges the results of the first query with the results of a second query. UNION implicitly removes duplicate rows between the queries. A UNION returns a single results set that consists of all the rows that belong to the first table, the second table, or both tables.

You should define the queries that contain a UNION clause so that they're compatible. The queries should have the same number of columns and a common column defined for each table. You can't use a UNION within the definition of a view.

The syntax for queries that include a UNION clause is shown in Listing 8.22.

Listing 8.22 Using the UNION Clause

SELECT column_name_1, ..., column_name_n
FROM table_name_1, ... , table_name_n
WHERE column_name comparison_operator value
[GROUP BY...]
[HAVING ...
UNION
SELECT column_name_1, ..., column_name_n
FROM table_name_1, ... , table_name_n
WHERE column_name comparison_operator value
[GROUP BY...]
[HAVING...]
[ORDER BY...]
[COMPUTE...

In Listing 8.23, the badge numbers that are common to both tables are displayed using two Select statements that are bound with a UNION clause. The ORDER BY clause is used after the last query to order the final results. The ORDER BY clause appears only after the last SELECT statement. Recall that UNION implicitly removes duplicate rows, as defined by the query.

Listing 8.23 Using the UNION Clause with an Order By Clause

select badge from employees
union
select badge from pays
order by badge
badge
----------
1234
1997
3211
4411
5522
6732
7773
7779
8005
8883
9998

(11 row(s) affected)

In Listing 8.24, the same set of queries is used except the ALL keyword is added to the UNION clause. This retains query-defined duplicates, only the Badge column, in the resultant rows. The duplicate rows from both tables are retained.

Listing 8.24 Using the UNION Clause

select badge from employees
union all
select badge from pays
order by badge
badge
------
1234
1234
1234
1997
1997
3211
3211
3211
3211
3211
3211
4411
4411
5522
5522
6732
6732
7773
7773
7779
7779
8005
8005
8883
8883
9998
9998
(27 row(s) affected)

In Listing 8.25, the datatypes referenced in the query for one of the two columns aren't compatible; the execution of the example returns an error because of this:

Listing 8.25 Data Conversions in the Use of UNION

select name,badge from employees
union
select hours_worked,badge from pay
Msg 257, Level 16, State 1
Implicit conversion from datatype `char' to `int' is not allowed.

 Use the CONVERT function to run this query.

You can use a UNION clause with queries to combine the rows from two compatible tables and merge the rows into a new third table. To illustrate this merge, in which the results are kept in a permanent table, a new table is created that has the same datatypes as the existing employees table. Several rows are first inserted into the new table (see Listing 8.26).

Listing 8.26 Inserting Rows into the Test Table

create table employees2
(name char(20),department char(20),badge int)
go
insert into employees2
values (`Rod Gilbert','Sales',3339)
go
insert into employees2
values (`Jean Ratele','Sales',5551)
go
insert into employees2

values (`Eddie Giacomin','Sales',8888)

Each table now has rows that contain employees records. If you use a UNION clause to combine the SELECT statements along with INSERT INTO, the resultant rows can be retained in a new table.

The SELECT statement that references the employees table uses WHERE to restrict the rows returned to only those with the Sales department. All three rows of the employees2 table are in the Sales department; so no WHERE clause is necessary (see Listing 8.27).

Listing 8.27 Creating the Work Table Employees3

create table employees3
(name char(20),department char(20),badge int))
go
insert into employees3
select * from employees
where department='Sales'
union
select * from employees2
(6 row(s) affected)
select * from employees3
name                 department           badge
----                 ----------           ----
Eddie Giacomin       Sales                8888
Gertie Humphries     Sales                3211
Jean Ratele          Sales                5551
Rod Gilbert          Sales                3339
Sally Springer       Sales                9998
Stan Humphries Jr    Sales                3211

(6 row(s) affected)

As in the previous example, you could use UNION to combine multiple tables, or combinations of selected columns and rows from tables, into an existing or new table. You could have tables with identical columns at different office locations in which rows are added throughout the day.

At the end of the work day, you can use a set of SELECT statements with a UNION clause to add the separate collection tables to a master table at a central location. After the rows are copied to the master table, the rows of the collection tables can be removed using a DELETE FROM statement without a WHERE clause, or a TRUNCATE statement as mentioned earlier in this chapter.

You can combine up to 16 SELECT statements by adding additional UNION clauses between each set of SELECT statements. You can use parentheses to control the order of the UNIONs. The SELECT statements within parentheses are performed before those that are outside parentheses. You don't need to use parentheses if all the UNIONs are UNION ALL. You also don't need parentheses if none of the UNIONs are UNION ALL.

In Listing 8.28, three tables are combined by using a UNION clause. IN is used with the first table to specify multiple forms of one department. The second table doesn't use WHERE, and all rows are selected. The third table specifies only a single case-sensitive department name, assuming the sort order is case-sensitive.

You don't need parentheses to control the order of the merges because no UNIONs use ALL. The resultant rows are ordered by Badge and Name using an ORDER BY clause that can appear only after the last SELECT statement.

Listing 8.28 Creating Merged Output

select name,badge
from employees
where department in (`SALES','Sales','sales')
union
select name,badge
from employees2 
union 
select name,badge
from employees3
where department='Sales'
order by badge,name
name                 badge
----                 ----
Bob Smith            1234
Gertie Humphries     3211
Stan Humphries Jr    3211
Rod Gilbert          3339
Jean Ratele          5551
Eddie Giacomin       8888
Sally Springer       9998
(7 row(s) affected)

Reality Check

There are some techniques you can use to help safeguard your applications against the changes that you'll be implementing using Add, Change, Delete operations as outlined here. For example, you can do the old stand-by; you can back up the table to another table until you're certain all changes are appropriate and no applications have been changed.

You can also use Transactions if you're writing an application to complete these tasks. By putting a transaction around these types of update statements, you can roll back the transaction if you later determine that you don't need to make the changes after all. Of course the scope of the transaction is only good until a COMMIT is reached, but you do have some leeway.


See the section in Chapter 12 starting with "Using and Understanding Transactions," for additional information about transations.

One other thing that happens frequently is that you find out several users are using your database system from applications and systems you might not have been aware of. Remember, the whole point of ODBC and especially SQL Server as an intelligent server engine is to allow open access to your databases. That means that a user could be coming in to your database and using the structures there from Excel, Access, Visual Basic, or other application environments--all in addition to any applications specifically developed to use the database.

When you change the structure of a table, you run the risk of breaking the structures that users need in their tools. One way to avoid this disruption is to provide your users with Views on the database. Have them link their applications to these Views and you'll be able to provide a level of abstraction between the user and the underlying database tables. This is a simple way to keep from breaking applications just for making basic database table structure changes.


For more information about using Views, see Chapter 10, "Managing and Using Views."

In short, keep a close eye on who is using your database and how. You can do this by locking down the security and requiring logins that you control. You do this not so much because of a need to control, but to better understand user needs when you consider any changes.

From Here...

In this chapter, you learned to add and delete rows from a database table, as well as change the characteristics of a table. You also learned how to perform operations on multiple tables using relational joins and UNION statements. Additional related information can be found in the following chapters:


Previous chapterNext chapterContents


Macmillan Computer Publishing USA

© Copyright, Macmillan Computer Publishing. All rights reserved.