On Day 10, "Data Retrieval," and Day 11, "Data Retrieval (Advanced Topics)," you looked at retrieving data using the SELECT statement. You saw how you could retrieve only some of the columns in a table and use the WHERE clause to restrict the rows to be returned. You looked at manipulating the data with numeric, string, and date functions. You also learned how you could summarize the data returned by using the GROUP BY and COMPUTE aggregates. Finally, you looked at using the SELECT statement to retrieve data from more than one table using the JOIN operation or by writing subqueries.
Today's lesson focuses on modifying the data in your SQL Server tables using the INSERT, UPDATE, and SELECT statements.
The basic INSERT statement adds one row to a table at a time. Variations of the basic INSERT statement allow you to add multiple rows by selecting data from another table or by executing a stored procedure. In any of these cases, you must know something about the structure of the table into which you are inserting. The following can be useful:
You learn about constraints and identity columns on Day 14, "Data Integrity." Here is the syntax for the INSERT statement:
INSERT [INTO] {table_name | view_name} [(column_list)] EXECute { procedure_name | @procedure_name_var} [[@parameter_name=] {value | @variable [OUTPUT] | DEFAULT} [, [@parameter_name =] {value | @variable [OUTPUT] | DEFAULT}]...]
The simplest method for finding the number of columns, along with their names and datatypes, is to use the Manage Tables dialog box in the Enterprise Manager (see Figure 12.1). Follow these steps to open the Manage Tables dialog box:
Figure 12.1. The Manage Table dialog box.
If you don't have Enterprise Manager readily available, you can execute the sp_help procedure to get the same information. The following output shows only part of the information returned by sp_help.
Examine the publishers table's structure:
sp_help publishers
Column_name Type Length ---------- ------- ---------- pub_id char 4 pub_name varchar 40 city varchar 20 state char 2 country varchar 30
ANALYSIS: The output shows you the column name, the datatype, and the position of each column. You need this information when building your INSERT statements.
The simplest form of the INSERT statement requires a value for every column, in the order the columns were defined. This is the order as shown in the previous examples. You can execute the following command to insert a single row into the publishers table:
insert into publishers values(`9956', `A New Publisher', `Poulsbo', `WA', `USA')
If you have permission to INSERT into this table and you are not violating any constraints, then you should get the following message from the SQL Server:
(1 row(s) affected)
NOTE: Because the columns were all character datatypes, all values are enclosed in quotation marks. Numeric values are not enclosed in quotation marks.
The simplest form of the INSERT statement requires that you explicitly supply a value for each column; those values must be in the correct sequence. If you want to supply the values in a different order, or if you do not want to supply an explicit value for a column, you can use another variant of the INSERT statement.
The following INSERT statement has a list of column names before the VALUES clause, and that list includes only a subset of the column names in the table; the values list then only needs to have values for the columns listed.
insert into publishers(state, pub_id) values(`AK', `9932')
What happens to the columns that aren't mentioned in the column list? If you insert a new row, every column must have some value. If you don't supply a value, SQL Server must be able to determine one. In order for SQL Server to determine a value, every column not mentioned in the list of columns must meet one of the following criteria:
Defaults and identity columns are discussed on Day 14. NULLs and timestamp columns are discussed on Day 5, "Creating Tables."
There is not a value supplied for the publisher name, the city, or the country in the preceding INSERT statement. In the publishers table, the publisher name and city columns both allow NULLs, and the country has a default value of USA. After executing the INSERT statement, you can run the following query to see the row that you have just inserted.
select * from publishers where pub_id = `9932'
Notice the NULL values and the default value for country:
pub_id pub_name city state country ------ -------- --------- ---- ------ 9932 (null) (null) AK USA
If you tried to execute an INSERT statement, leaving out values for columns that did not meet one of the listed criteria, you get an error.
This INSERT does not supply a value for the pub_id column:
insert into publishers(pub_name, city, state) values(`The Best Books', `New Orleans', `LA') Msg 233, Level 16, State 2 The column pub_id in table publishers may not be null.
ANALYSIS: The error message is not entirely complete. The problem is not that the pub_id column doesn't allow NULLs, but that it also does not have a default value and it isn't an identity column.
There is one more variation of the simple, single row INSERT statement, which is used when you don't want to include a list of column names, but do want SQL Server to use default values where they exist (this includes NULLs and identity values). You can use the keyword DEFAULT in the actual values list, as a way of telling SQL Server that it should determine what value should be used. For example:
insert into publishers values(`9950', DEFAULT, DEFAULT,`AK', DEFAULT)
If every column in a table has some kind of default value that SQL Server can determine, there is one more variation you can use. You can simply tell SQL Server to use all default values by using the keywords DEFAULT VALUES, as in the following INSERT statement:
insert into publishers default values
The statement won't run either, because it violates the NOT NULL constraint on the pub_id column. See Day 5 for more information on the NOT NULL constraint.
NOTE: When client applications like Microsoft Visual Basic or PowerBuilder connect to a database and add data, they are using the INSERT/VALUES statements at which you just looked.In addition, unbound grid type controls used in the visual front-end tools do not automatically add data to the database. A developer would have to write some code to loop through all of the values changed in the grid and then INSERT/VALUES those rows into the database.
All of the preceding INSERT statements insert a single row into a table. If you want to insert more than one row at a time, you must have a source where those rows already exist. That source is typically another table, or a join between two or more other tables. In this form of the INSERT statement, you use a subquery to determine the rows of data for insertion. Subqueries are discussed in Day 11. The subquery's result set becomes the set of rows to be inserted. The number of columns in the subquery's result set must match the number of columns in the table, and the columns must have compatible datatypes. In this first example, you create a table to keep track of addresses.
create table address_list (name varchar(20) not null, address varchar(40) not null, city varchar(20) not null, state char(2) )
ANALYSIS: This command did not return data, and it did not return any rows.
This table has four character fields, so any SELECT statement you use to populate it must return four character columns. Here is an example:
insert into address_list select stor_name, stor_address, city, state from stores (6 row(s) affected)
NOTE: Unlike the subqueries you read about in yesterday's lesson, there are no parentheses around the subquery used with an INSERT statement.
The column names used in the subquery are ignored; the table already has column names associated with each field.
You can execute another INSERT statement to add more rows to your address_list table. Suppose you want to add names and addresses from the author's name. Instead of a single name, the authors table has a first name column (au_fname) and a last name column (au_lname). The address_list table is expecting just a single value for its name column, so you can concatenate the last and first names. You can include a comma and space in the concatenated result. For example:
insert into address_list select au_lname + `, ' + au_fname, address, city, state from authors (23 row(s) affected)
In this example, the concatenated last and first name fields is longer than the width of the name column in the new address_list table. SQL Server does not return an error message when you execute this statement. When trying to insert a character string of length L1 into a column defined with maximum length L2, if L1 is greater than L2, SQL Server only inserts the first L2 characters into the table. Take a look at the values in the address_list table:
select * from address_list name address city state -------------------- ---------------------- ------------------ ---- White, Johnson 10932 Bigge Rd. Menlo Park CA Green, Marjorie 309 63rd St. #411 Oakland CA Carson, Cheryl 589 Darwin Ln. Berkeley CA O'Leary, Michael 22 Cleveland Av. #14 San Jose CA Straight, Dean 5420 College Av. Oakland CA Smith, Meander 10 Mississippi Dr. Lawrence KS Bennet, Abraham 6223 Bateman St Berkeley CA Dull, Ann 3410 Blonde St Palo Alto CA Gringlesby, Burt PO Box 792 Covelo CA Locksley, Charlene 18 Broadway Av. San Francisco CA Greene, Morningstar 22 Graybar House Rd Nashville TN Blotchet-Halls, Regi 55 Hillsdale Bl Corvallis OR Yokomoto, Akiko 3 Silver Ct Walnut Creek CA del Castillo, Innes 2286 Cram Pl. #86 Ann Arbor MI DeFrance, Michel 3 Balding Pl Gary IN Stringer, Dirk 5420 Telegraph Av Oakland CA MacFeather, Stearns 44 Upland Hts Oakland CA Karsen, Livia 5720 McAuley St Oakland CA Panteley, Sylvia 1956 Arlington Pl Rockville MD Hunter, Sheryl 3410 Blonde St Palo Alto CA McBadden, Heather 301 Putnam Vacaville CA Ringer, Anne 67 Seventh Av Salt Lake City UT Ringer, Albert 67 Seventh Av Salt Lake City UT Eric the Read Books 788 Catamaugus Ave Seattle WA Barnum's 567 Pasadena Ave Tustin CA News & Brews 577 First St Los Gatos CA Doc-U-Mat: Quality L 24-A Avogadro Way Remulade WA Fricative Bookshop 89 Madison St Fremont CA Bookbeat 679 Carson St Portland OR (29 row(s) affected)
In this last example you create a table to keep track of the names of all the publishers and titles of all the books each publisher has published.
create table publisher_list (pub_name varchar(40) NULL, title varchar(80) NULL)
In order to populate this table, you need to join the publishers and titles tables. You should make it an outer join so those publishers who do not currently have any books published are included.
insert into publisher_list select pub_name, title from publishers left outer join titles on publishers.pub_id = titles.pub_id
SQL Server has one more option for inserting rows into a table. If a stored procedure returns a single result set and you know the number and type of columns that the result set contains, you can INSERT into a table using the results returned when calling that stored procedure.
You write your own stored procedures in Day 15, "Views, Stored Procedures, and Triggers," but for now you can use a system stored procedure. In Day 4, "Devices and Databases," you looked at the system procedure called sp_spaceused, which returns information about the space usage of a single table. You can create a table to hold the results of running this procedure, and INSERT a row into this table at regular intervals. With the latter, you can monitor the table's growth over time.
Take a look at the output of the sp_spaceused procedure (note that your results may vary slightly):
sp_spaceused publishers name rows reserved data index_size unused ----------- ------ ---------- ---- ----------- -------- publishers 11 32 KB 2 KB 4 KB 26 KB
ANALYSIS: This procedure returns six columns, which are all character strings. Although it looks like the second column is numeric, it really isn't. The way you could determine the type is to either examine the code for the sp_spaceused stored procedure (you see how to do that in Day 15), or to try to create the table with an integer column. Notice the error message you get when you try to insert a row.
The following table should be able to hold the results:
create table space_usage (table_name varchar(30) not null, rows varchar(9), reserved varchar(10), data varchar(10), index_size varchar(10), unused varchar(10) )
To insert into this table you can execute the stored procedure sp_spaceused:
insert into space_usage execute sp_spaceused publishers
WARNING: You must make sure that if the stored procedure returns more than one result set, all the results must have the same number of columns and return the same type of data in the corresponding columns.
One very nice extension to this capability to insert rows from a stored procedure is the capability to insert rows from a remote stored procedure. If you have a procedure on a remote server that selects all rows from a table, you can execute that procedure to copy all the rows from the remote server to the local one. You must specify the following in your remote procedure call to run a remote procedure:
For example, if you had a SQL Server named Wildlife with a database named Water, a procedure named Fish, and an owner of dbo, you could run the following query:
insert into local_table execute remote_Wildlife.Water.dbo.Swim
There are a few additional considerations when inserting data into a table with an identity column. These issues are covered in Day 14, when you learn more about data integrity. Day 14 also covers restrictions when attempting to insert values where there are check constraints on referential integrity constraints on the table.
The second data modification statement allows you to remove one or more rows from a table.
You use the DELETE statement to remove rows from a SQL Server table.
DELETE [FROM] {table_name | view_name} [WHERE clause]
NOTE: The word FROM is optional, as is the WHERE clause.
The following DELETE statement removes all rows from the sales table:
delete sales (21 row(s) affected)
To remove only a subset of rows in a table, the WHERE clause allows you to qualify the rows to be removed. The WHERE conditions can include any of the conditions that you learned about in Day 10, "Data Retrieval," including relational operators (<, >, and =), and the keywords IN, LIKE, BETWEEN, and so on.
The following DELETE statement removes all books with a pub_name of New Moon Books from the publisher_list table:
delete publisher_list where pub_name = `New Moon Books' (5 row(s) affected)
A single DELETE statement can only remove rows from a single table. However, SQL Server does allow you to include another table in your DELETE statement for use as a lookup table. The lookup table usually appears in a subquery. In the next example, remove all titles published by New Moon Books. The titles table is the one to be modified, but it only contains the publisher ID, not the publisher name. You need to look in the publishers table to find the publisher ID from New Moon Books, which then determines the rows in the titles table for removal.
delete publisher_list where pub_name = (select pub_name from publishers where pub_id = `9956') (1 row(s) affected)
ANALYSIS: The subquery accesses the publishers table and returns a single value for pub_name. That value is then used to determine which rows in publisher_list you are going to delete; that is, all rows with a pub_name equal to the returned value. Keep in mind that no more than one row will be returned, because pub_id is the primary key of the publishers table. If more than one row could be returned by the subquery, you would have to use IN instead of the equals symbol.
This next example uses a lookup table that returns more than one value. Remove all rows from the sales table that indicate the sale of business books. The sales table holds the title_id value of the book sold, but not its type. You must access the titles table to find which title_ids correspond to business books. Because you deleted the information in the sales table earlier in this chapter, you need to run the RePopSales.SQL script to repopulate the sales table before you run the following query.
NOTE: Before you delete all rows from the sales table, you are going to make a temporary copy of the sales table using the SELECT INTO statements. Before you can use the SELECT INTO statements, you must mark the database option SELECT INTO/Bulkcopy to true. Please enter all the code in the following examples. Execute your query after each GO statement. The code between the /* and */ is just a comment and doesn't need to be entered.
/* Use the pubs database. */
USE pubs GO
/* Set the database option to Select Into/Bulkcopy. */
EXEC sp_dboption pubs, `select into/bulkcopy', true GO
/* Create the temporary table to hold your sales information. */
SELECT * INTO tmpSales FROM sales GO
/* Use the DELETE statement to remove rows from the sales table. */
DELETE sales GO
/* Verify that there are no rows in the sales table. */
SELECT * FROM sales GO stor_id ord_num ord_date qty payterms title_id ------- ------- -------- --- -------- --------- (0 row(s) affected)
/* Reload the sales table from the tmpSales table. */
INSERT INTO sales SELECT * FROM tmpSales GO
/* Verify that your information has been recovered. */
SELECT * FROM sales GO
/* You should now see 16 rows of table information. */
/* Turn off the Select Into/Bulkcopy Database option. */
EXEC sp_dboption pubs, `select into/bulkcopy', false GO delete sales where title_id in (select title_id from titles where type = `business') (5 row(s) affected)
ANALYSIS: The subquery accesses the titles table and returns a list of title_id values. Those values are then used to determine which rows in sales you are going to delete--that is, all rows with a title_id equal to any of the returned values.
Transact-SQL has an extension that allows you to write DELETE statements using a FROM clause containing multiple tables. This makes the DELETE appear as a join operation, although only one table is having rows deleted. The functionality provided is the same as using subqueries. The second table is used only as a lookup table.
The following examples show how the DELETE statements use multiple tables in a FROM clause:
delete publisher_list where pub_name = (select pub_name from publishers where pub_id = `0877')
This can be rewritten as:
delete publisher_list from publisher_list, publishers where publisher_list.pub_name = publishers.pub_name and pub_id = `0877'
ANALYSIS: The choice of whether to use the subquery method or the join method depends mainly on personal preference. You may prefer the subquery method, because there is no confusion as to which table is being modified and which table is only being used as a lookup table. You should also be aware that the join method is non-ANSI standard.
In the beginning of this section you saw an example of a DELETE statement with no WHERE clause, which deletes every row in the table. There is another alternative if you really want to remove all data from a table while leaving the table structure intact. This statement is TRUNCATE TABLE.
truncate table sales
Unlike the DELETE statement, this statement does not return a message about the number of rows affected. There are some other differences between DELETE with no WHERE clause and TRUNCATE TABLE.
The third data modification statement you look at is the UPDATE statement, which allows you to change the value of columns within an existing row.
UPDATE {table_name | view_name} SET column_name1 = {expression1 | NULL | (select_statement)} [, column_name2 = {expression2 | NULL | (select_statement)}...] [WHERE search_conditions]
The SET clause, with which you specify the columns to be updated, is the part of the statement that is new.
As with the DELETE statement, the WHERE clause is optional.
The following update statement changes the ytd_sales (year-to-date sales) column in the titles table to 0 for every row. An example of what you might want to do at the start of every year follows:
update titles set ytd_sales = 0 (18 row(s) affected)
Without a WHERE clause, this statement changes the value of the ytd_sales column to 0 in every row in the table.
The following example updates the city column for the publisher Algodata Infosystems:
update publishers set city = `El Cerrito' where pub_name = `Algodata Infosystems'
An UPDATE statement can make the new value in the column dependent on the original value. The following example changes the price of all psychology books to 10 percent less than the current price:
update titles set price = price * 0.90 where type = `psychology'
An UPDATE statement can change more than one column. The word SET only occurs once, and the different columns to be changed are separated by commas. The following update statement increases the price of all popular computing books by 20 percent and appends the string `(price increase)' to the notes field of the same rows.
update titles set price = price * 1.2, notes = notes + ` (price increase)' where type = `popular_comp'
A single UPDATE statement can only change rows from a single table. However, SQL Server does allow you to include another table in your UPDATE statement to be used as a lookup table. The lookup table usually appears in a subquery. The subquery can appear in either the WHERE clause or the SET clause of the UPDATE statement. In the next example, change the publisher of all business books to New Moon Books.
update titles set pub_id = (select pub_id from publishers where pub_name = `New Moon Books') where type = `business' (4 row(s) affected)
ANALYSIS: The publisher name only appears in the publishers table, but it is the titles table that needs to be modified. The subquery accesses the publishers table and returns the publisher ID for New Moon Books. This value is used as the new value in the pub_id column of titles.
Just like for DELETE statement, Transact-SQL has an extension that allows you to write UPDATE statements using a FROM clause containing multiple tables. This makes the UPDATE appear as a join operation, although only one table is having rows modified. The functionality provided is the same as that used for subqueries; the second table is used only as a lookup table.
The following examples show how the previous UPDATE statement can be rewritten using multiple tables in a FROM clause:
update titles set pub_id = publishers.pub_id from titles, publishers where type = `business' and publishers.pub_name = `New Moon Books'
The choice of whether to use the subquery method or the join method depends mainly upon personal preference. Just as in the DELETE statement, the subquery method seems much clearer regarding which table is modified, what pub_id's new value will be, and what rows are changing. You should also remember that the join method is non-ANSI standard.
There are also some UDPATE statements that are more complicated to write using the join method. One such case is if the UPDATE statement uses subqueries for both the SET clause and the WHERE clause.
This next example changes the publisher of all psychology books published by New Moon Books to Binnet & Hardley.
update titles set pub_id = (select pub_id from publishers where pub_name = `Binnet & Hardley') where type = `psychology'and pub_id = (select pub_id from publishers where pub_name = `New Moon Books') (4 row(s) affected)
ANALYSIS: Again, the publisher name only appears in the publishers table, but it is the titles table that needs to be modified. The first subquery accesses the publishers table and returns the publisher ID for Binnet & Hardley. This pub_id value is used as the new value in the pub_id column of titles. The second subquery accesses the publishers table again, to return the pub_id value for New Moon Books. This pub_id is used to determine which rows in the titles table need to be updated.
Because the publishers table would need to appear twice--once for determining the new value of pub_id and once for determining the pub_id of the rows to be changed--this UPDATE statement would be much more difficult to write using the join method.
Today you learned the SQL Statements used to modify data in your SQL Server tables. You can add new rows to a table with the INSERT statement; you can do so either one row at time or by many rows coming from another table.
You can remove rows from a table with the DELETE statement, and you can change values in existing rows with the UPDATE statement.
Whenever you are performing data modifications, you need to remember the logging that SQL Server does. Every new row is written to the transaction log, every deleted row is written to the transaction log, and with most updates, two versions of the row are written to the log: the row before the changes are made, and the row after the changes are made. The only exceptions are when doing the SELECT INTO operation, which doesn't log the new rows in a table, and the TRUNCATE TABLE operation, which doesn't log the deleted rows. There is no magic switch to turn off logging for the server.
The Workshop provides quiz questions to help you solidify your understanding of the material covered, as well as exercises to provide you with experience in using what you've learned. Try to understand the quiz and exercise answers before continuing on to the next day's lesson. Answers are provided in Appendix B.
The exercises assume that your titles table in the pubs database is in its initial condition. If you have been modifying the table to practice the statements in this module, you have to rebuild the pubs database. To do this, please run the INSTPUB.SQL script in your \MSSQL\Install folder.
© Copyright, Macmillan Computer Publishing. All rights reserved.