Table of Contents

Chapter 13

Transferring Data

Certification Objectives *

Importing Data and Objects *

Exporting Data and Objects *

Tools for Transferring Data *

From the Classroom *

Monitoring Data Import and Export *

Using BCP *

INSERT Statement *

SELECT INTO Statement *

BULK INSERT *

Transfer Manager *

Certification Objectives

One of the most important areas of data management is transferring data. SQL Server provides many tools and utilities such as data transformation services, bulk copy program, and BULK INSERT to transfer data into and out to SQL Server. As the database gets larger, database administrators will often find a need to copy data from one database system to another. Often the administrator will find importing and exporting data a daunting task. This is because there are many tools available. Knowing which one to use can be very complex. Choosing the right tools is crucial to importing and exporting data since it often involves transferring data between SQL Server and a non-SQL Server database such as Oracle or IBM DB2. Transferring data enables companies to have heterogeneous database systems and lower their reliance on a single RDBMS. This chapter will review and compare the many tools available on SQL Server 7.0 in order to demystify data transfer. Once you master the art of copying data, data management will seem much less complicated, and your job will be much easier.

Transferring Data and Objects

The core of SQL Server is its data and subsequent objects. Often a database administrator will come upon a situation where he or she must transfer the data and objects to another system. In SQL Server, you are able to transfer objects, security, and data in and out of SQL Server. This is simplifies administrating the database network and the various relational database management systems.

Transferring data and objects can be categorized in two classes: importing and exporting. Importing is the transfer of data into SQL Server. Exporting is the transfer of data out of the SQL Server. With importing and exporting data, data does not have to "belong" to a specific SQL Server. Other client applications and relational database management systems can access it, therefore making the data much more versatile and flexible.

Importing Data and Objects

Importing of data and objects is often used to do one of two things: a one-time migration, or a constant updating of data. When a company is migrating from one database management system (such as Oracle or Access) to SQL Server, they must import the data and the other objects. Some examples of these objects would be schema and security. Since SQL Server will replace the older database, data can be imported so that SQL Server can be used for all data-related tasks immediately. This process implements a one-time import. Therefore, use of the old database management system is eliminated, and use of the SQL Server can start right away.

Another use for importing data is to manually upgrade the SQL Server data. For example, say you had a database that needs weekly updates of data from a non-SQL Server database that is updated with multiple client applications. With this situation, you would need to do weekly imports on the data. This situation is different from the one-time import but this is a requirement for many companies.

Exporting Data and Objects

Exporting data and objects is done less frequently than importing. This is because SQL Server allows client applications to access and edit the data directly within SQL Server. Data and object exports, however, still occur. One reason to export data from SQL Server is if a client application can only access data in an ASCII format. SQL Server can export the data so that the client application can read it. A different situation in which you would need to export data might be the following: if you wanted to convert some of the data into another format, such as an older spreadsheet format that does not have ODBC. You would export the data into a text file format and have the spreadsheet import it so that the user can view the data.

Tools for Transferring Data

Now that you know the scenarios in which you import and export data, you need to be aware of the available tools. SQL Server 7.0 comes with many tools for you to transfer data and objects. Knowing which tool to use can increase your effectiveness with managing data. The available tools are:

The rest of this chapter will review and compare these tools. It will also specify the optimal tool to use for every scenario. Selecting which tool you are going to use will depend on what your requirements are for copying data such as the following:

From the Classroom

Monitoring Data Import and Export

No matter which tool you use to import and export data in SQL 7.0, you will need to monitor the results of your actions. This is especially important for jobs that run automatically, as you will not be at the system to see error messages.

DTS includes two features that are very useful for monitoring your package to ensure it has run correctly. Each of these are outlined below:

BCP has the option "–o output_file" that records the status. Like DTS, you will be able to use this file to determine if the job has executed as planned.

David Smith, MCSE + Internet

Using Data Transformation Services (DTS)

Data Transformation Services (DTS), as shown in Figure 13-1, is a powerful new tool introduced in SQL Server 7.0. DTS is used to transfer data across heterogeneous sources using OLE DB. This gives SQL Server access to a variety of database sources from Microsoft Excel to Oracle. This is a major improvement over the older SQL Transfer Manager, which only allows transfer between SQL Server. Not all database sources have access to an OLE DB provider. Because of this, Microsoft implements the OLE DB Provider for ODBC driver. OLE DB Provider is implemented to make any ODBC data source an OLE DB provider.

fig13-1.gif (9362 bytes)

Figure 1: Data Transformation Wizard

You can use the Data Transformation Service Import and Export Wizards to transfer objects such as indexes, stored procedures, and referential integrity constraints across SQL Server 7.0 computers. One of the most helpful features in the DTS Import and Export Wizards is allowing you to map a destination column to a source column. This allows you to specify exactly to where you want an entire column to copy, as well as changing the destination column’s datatype.

The Data Transformation Wizards allows you to specify how to transform the data when you are copying to a destination table. You can use VBScript, Jscript, or the ActiveX scripting engine to write the data transformation scripts.

Once you have access to the data, you can create a DTS package to import and export data over ODBC and OLE DB. The DTS package is an object within DTS that saves the entire set of commands you specified. This save enables the entire package to be run again without having to go through the DTS Wizard prompts. Programmers can also edit the DTS package interface through an OLE Automation interface implemented in Dtspkg.dll. They can do this by using development languages that support object linking and embedding (OLE).

Another feature of the Data Transformation Services is the advanced transformation properties. They are used to define how data can be converted between the source and destination database. Before being copied, the DTS Data Pump checks to see if the copy can be done validly according to the transformation flags that you define in the advanced transformation properties. If any of the columns fail the check, the copy operation is aborted. This ensures that the copy is done specifically to your specification.

Using the Import and Export Wizards

DTS requires you to first use the Export Wizard to export the data, and then uses the import wizard to import the data. Exercise 13-1 and Exercise 13-2 will show you how to transfer data with Data Transformation Services Import and Export Wizards. To perform this exercise, you need to have Microsoft Access installed, as well as all the sample databases.

Exercise 13-1: Using DTS to export data

  1. Select Start | Programs | SQL Server 7.0 | SQL Enterprise Manager
  2. Expand the SQL Server group, and then expand the tree of the SQL Server you want to import your data into.
  3. Expand Database and click on your desired database. For this exercise, choose pubs. Select Action | Task | Import Wizard to start the Data Transformation Import Wizard.
  4. Click Next after you read the introduction screen.
  5. You should now see the Choose a Data Source screen as shown in Figure 13-2.
  6. fig13-2.gif (8055 bytes)

    Figure 2: Choose a Data Source

  7. Next to Source, you are able to choose the different file types you are able to import data from. In this exercise, choose Microsoft Access. Next to File Name, you must specify the path of the database. If you installed Microsoft Access to its default path, it should be C:\Program Files\Microsoft Office\Office\Samples\Northwind.mdb. If your database uses security, you can specify the User Name and Password in their corresponding boxes. For this exercise, you should leave both fields blank. Click Next.
  8. You should now see the Choose a Destination dialog box as shown in Figure 13-3. This dialog box prompts you to import the data.
  9. fig13-3.gif (8876 bytes)

    Figure 3: Choose a Destination

  10. Next to destination, you can select the data format. Choose Microsoft SQL Server 7.0 Only (OLE DB Provider), which is the default in this exercise Next to server, you can select the server that contains the database you will import into. By default, your local server should be already selected. While choosing your destination, you can choose the type of authentication you wish to use when logging in to SQL Server while performing the import. For this exercise, choose Use SQL Server authentication. Next to User Name, type in sa. Next to Password, type your password for sa if you have one. Next to Database, you can select the databases available on the server you selected. For this exercise, choose the pubs database. Click Next.
  11. You should now see the Specify Table Copy or Query screen. You can specify whether to copy all the records from one or more tables, or use a query to filter the data to a specific subset. In this exercise, choose Copy table from the source database. Click Next.
  12. .You should now see the Select Source Tables dialog box as shown in Figure 13-4. In this dialog box, you are presented with a three-column grid. The three columns are:

For this exercise, select all the Source Tables, leave the Destination table to the default, and set the Transform to the default setting. Click Next.

fig13-4.gif (9567 bytes)

Figure 4: Select Source Tables

  1. The next screen you should see is the Save, Schedule and Replicate Package dialog box (Figure 13-5). This dialog box is prompting whether or not you want to save the DTS package. A DTS package is the entire set of instructions you set for this import session. If you save this, you don’t have to run it again. Check Save Package on SQL Server if you will do this again at a later time. Uncheck it if this is a one-time import. If the import will be done a later time, you can schedule the number of times required. To do this, click on Schedule DTS package For Later Execution. For this exercise, make sure Save Package on SQL Server is not checked. If you check this setting, you save all your settings in a package, and you would not have to make those selections all over again. In this exercise, make sure only Run Immediately is checked and click Next.
  2. fig13-5.gif (6159 bytes)

    Figure 5: Save, Schedule and Replicate Package

  3. The next screen is the Completing the DTS Wizard dialog box (Figure 13-6). This dialog box shows you a summary of the choices you made. If this is not correct, you can click on Back to change the settings. If it is correct, click on Finish.

fig13-6.gif (10892 bytes)

Figure 6: Completing the DTS Wizard

Exercise 13-2: Importing Data with DTS

  1. Select Start | Programs | SQL Server 7.0 | SQL Enterprise Manager.
  2. Expand the SQL Server group and then expand the tree of the SQL Server you want to import your data into.
  3. Expand Database and click on your desired database. For this exercise, choose pubs. Select Action | Task | Import Wizard to start the Data Transformation Import Wizard.
  4. Click Next after you read the introduction screen.
  5. You should see the Choose a Data Source dialog box (Figure 13-7). Next to source, you can select the data format. In this exercise, choose the default which is Microsoft SQL Server 7.0 Only (OLE DB Provider.) Next to server, you can select the server that contains the database you will import to. By default, your local server should be already selected. While choosing your destination, you can choose the type of authentication you wish to use when logging in to SQL Server while performing the import. For this exercise, choose Use SQL Server authentication. Next to User Name, type in Sa. Next to Password, type your password for sa, if you have one. Next to Database, you can select the databases available on the server you selected. For this exercise, choose the pubs database. Click Next.
  6. fig13-7.gif (8843 bytes)

    Figure 7: Choose a Data Source

  7. The next screen is the Choose a Destination dialog box, as shown in Figure 13-8. This is similar to the Choose a Data Source dialog box in the DTS Import Wizard. In destination, we will choose Text File. Once you select this, a sub-screen appears prompting you for the destination of the text file. In this example, type C:\temp.txt. Click Next.
  8. fig13-8.gif (6637 bytes)

    Figure 8: Choose a data destination

  9. You should now see the Specify Table Copy or Query screen. You can specify whether to copy all the records from one or more tables or use a query to filter the data to a specific subset. In this exercise, choose Copy table from the source database. Click Next.
  10. You should see the Select Destination File Format as shown in Figure 13-9. This dialog box prompts you for the text file’s format you want to export. This dialog box has nine fields for you to set:

Click Next.

fig13-9.gif (8099 bytes)

Figure 9: Select Destination File Format

  1. The next screen you should see is the Save, Schedule and Replicate Package dialog box. This dialog box is prompting whether you want to save the DTS package. In this exercise, make sure only Run Immediately is checked and click Next.
  2. The next screen is the Completing the DTS Wizard dialog box. This dialog box shows you a summary of the choices you made. If this is not correct, you can click on Back to change the settings. If it is correct, click on Finish.

Using the Bulk Copy Program (BCP)

SQL Server’s bulk copy program, BCP, is a command-line based program that is used to copy data into and out of SQL Server. A database administrator would usually use this to transfer large amounts of data to SQL Server from or to another database management system. To use BCP, you must first export the SQL Server data to a data file. BCP then imports the data file into SQL Server.

In order for you to use BCP, you must have the appropriate permission to access the data in the table:

The BCP Command

Bulk copy program (BCP) is a program that must be run from a DOS prompt. BCP.exe can be found in the BINN subdirectory under the SQL Server installation directory. The syntax is:

bcp {[[database_name.][owner].]{table_name | view_name} | "query"}
{in | out | queryout | format} data_file [-m max_errors] [-f format_file] [-e err_file]
[-F first_row] [-L last_row] [-b batch_size] [-n] [-c] [-w] [-N] [-6] [-q] [-C code_page]
[-t field_term] [-r row_term] [-i input_file] [-o output_file] [-a packet_size]
[-S server_name] [-U login_id] [-P password] [-T] [-v] [-R] [-k] [-E] [-h "hint [,...n]"]

Table 13-1 lists the arguments and switches for you to control the way you want BCP to execute.

Arguments and switch Function
Database_name Name of the database. Required if source database is not the current database.
Owner Name of the owner. Optional if you own the source table or view. Required if you do not own the table.
Table_name Name of the table being copied.
View_name Name of the view being copied.
Query Transact SQL statement that returns a result set.
in | out | queryout | format Specified in for import, out for export, and the format creates a format file.
data_file The full path of the datafile used in the copy operation.
-m max_errors The maximum number of errors that can occur before copy operation is operated. Default is 10.
-f format_file The full path of the optional format file.
-e err_file The full path of the optional error file where BCP stores the rows that were not copied.
-F first_row The number of the first row to copy.
-L last_row The number of the last row to copy.
-b batch_size The number of rows copied in a batch. Default is all rows.
-n Sets the data’s datatype as native. This is best used when transferring data from one SQL Server to another. Since data is native, no conversions are required. Transferring data with the native format is the fastest way to copy since there is no need for conversion of data types and character formats.
-c Sets the data’s datatype as character. The character format is best used with non-SQL Server database type transfers such as Excel or Oracle. This format uses tabs between fields and new line characters at the end of every row. This is a universal format since data is outputted into a text file type format.
-w This is the same as the character format except it uses datacode data types so that transfers between two code pages will not result in a loss of character data. This should only be used on transfers between two heterogeneous databases with different code pages.
-N This is the same as the native format. The exception is it uses datacode data types so that transfers between two code pages will not result in a loss of character data. This should only be used on transfers between two heterogeneous databases with different code pages.
-6 Performs the bulk copy operation using SQL Server 6.x datatypes.
-q Requires quoted identifiers
-C code_page Sets the code page of the data in the data file. Can be set to ACP, OEM, RAW, or a specific code page number.
-t field_term Sets the field terminator.
-r row_term Sets the row terminator.
-i input_file Specifies the name of the file where all responses to the prompts during an BCP interactive mode.
-o output_file Specified the name of the file where output is redirected during a BCP session.
-a packet_size Sets the size of network packets sent during BCP in bytes.
-S server_name Specifies the SQL Server to connect to.
-U login_id Supplies the Login ID to connect to SQL Server.
-P password Supplies the password to connect to SQL Server.
-T Specifies that bcp connect via a trusted connection.
-v Reports the bcp version number.
-R Specifies currency, date, and time are copied in a regional format.
-k Specified that blank columns should be set to NULL.
-E Specifies identity columns must contain values.
-h "hint [,...n]" Specifies hints are to be used during bulk copy operations.

Table 1: BCP keywords and switches

Using BCP

In Exercise 13-3, we will bulk copy data from one table to another.

Exercise 13-3: Using bulk copy program to transfer data

  1. Click on Start | Run. Next to Open, type command.
  2. Click OK.

  3. At the command prompt, type BCP pubs..employee out exercise.dat –T –n.
  4. Your screen should look similar to Figure 13-10. This command exports the contents of the Employee table in the pubs database to the exercise.dat file. The contents are stored in their native format.

    fig13-10.gif (4572 bytes)

    Figure 10: BCP in the Command Prompt

  5. Now type BCP pubs..employee out exercise.txt –T –c.

This command exports the contents of the Employee table in the pubs database to the exercise.txt file. The contents of the file are stored as a text format.

Loading Data Using Other Methods

Though the main ways to copy data to and from SQL Server are with Data Transformation Services and Bulk Copy Program, you are still able to use other methods such as the INSERT, SELECT INTO, and BULK INSERT statement. There are different reasons why you should choose each. You should also carefully assess which tool you can use to effectively transfer data in the specification deemed appropriate.

INSERT Statement

Like BCP, you can use the Insert statement to add data from an external OLE DB provider to SQL Server in one or more rows. In order for you to run the INSERT command, you must be the table owner, or have the privilege transferred to another user. The syntax for an INSERT statement is:

INSERT [INTO] {table_name | view_name} [(column_list)] {DEFAULT VALUES | values | derived_table | execute_statement}

Table 13-2 lists Insert statement keywords and switches.

Keyword or Switch Function
INTO Optional word
Table_name | view_name Specifies the table name or the view name where the data will be inserted.
Column_list The list of columns in which data will be inserted.
DEFAULT VALUES Inserts the default values for all the columns.
Values The values going to be added to the columns.
Derived_table A SELECT statement used to retrieve values to be added to an existing table
Execute_statement A EXECUTE statement that returns data with SELECT or READTEXT statements.

Table 2: Insert statement keywords and switches

SELECT INTO Statement

The Select Into statement is used to retrieve one row of data and assign the value to another row:

SELECT [select_list] INTO {:hvar [,...]} select_options

Table 13-3 lists SELECT INTO statement keywords and switches.

Keyword or Switch Function
Select_list The list of table columns or expressions to retrieve data from.
:hvar The host variables to receive for the select_list keyword.
Select_options One or more Transact SQL SELECT statement. GROUP BY, HAVING, COMPUTE, CUBE, and ROLLUP clauses are not supported.

Table 3: SELECT INTO statement keywords and switches

On the Job: Before you execute the SELECT INTO statement, make sure the user has CREATE TABLE permissions in the destination database.

BULK INSERT

Bulk insert is used to copy data into SQL Server using the functionality of the BCP utility with a Transact-SQL statement rather than from the command prompt. This is the fastest way to transfer data from a text file to SQL Server. Since BULK INSERT cannot transform data, you should only use BULK INSERT to copy data. Speed is the most crucial factor in this situation.

The syntax for the BULK INSERT statement is:

BULK INSERT [['database_name'.]['owner'].]{'table_name' FROM data_file}
[WITH
(
[ BATCHSIZE [= batch_size]]
[[,] CHECK_CONSTRAINTS]
[[,] CODEPAGE [= 'ACP' | 'OEM' | 'RAW' | 'code_page']]
[[,] DATAFILETYPE [=
{'char' | 'native' | 'widechar' | 'widenative'}]]
[[,] FIELDTERMINATOR [= 'field_terminator']]
[[,] FIRSTROW [= first_row]]
[[,] FORMATFILE [= 'format_file_path']]
[[,] KEEPIDENTITY]
[[,] KEEPNULLS]
[[,] KILOBYTES_PER_BATCH [= kilobytes_per_batch]]
[[,] LASTROW [= last_row]]
[[,] MAXERRORS [= max_errors]]
[[,] ORDER ({column [ASC | DESC]} [,...n])]
[[,] ROWS_PER_BATCH [= rows_per_batch]]
[[,] ROWTERMINATOR [= 'row_terminator']]
[[,] TABLOCK]
)
]

Table 13-4 lists BULK INSERT statement arguments and their functions.

Arguments Function
‘database_name’ Name of the database. If not specified, BULK INSERT will use current database.
‘owner’ Table or view owner. If user is owner, than this argument is optional.
‘table_name’ Name of the table or view to bulk insert into.
Data_file Full path of the data file which BULK INSERT will copy data from.
BATCHSIZE [= batch_size] Sets the number of rows in the batch. Each batch counts as a single transaction. By default, all data will be copied as one batch.
CHECK_CONSTRAINTS Enables that all constraints are checked during the bulk insert.
CODEPAGE [= 'ACP' | 'OEM' | 'RAW' | 'code_page'] Sets the type of codepage of the data in the data file. The codepage value of ‘ACP’ will convert columns of char, varchar, or text data type from ANSI/Microsoft Windows code page to the SQL Server code page. The codepage value of ‘OEM’ which is also the default will convert columns of char, varchar, or text data type from the system OEM code page to the SQL Server code page. The codepage value of ‘RAW’ is the fastest option because it will not process any conversions. The ‘code page’ specifies the code page number.
DATAFILETYPE [=
{'char' | 'native'| 'widechar' | 'widenative'}]
Tells BULK INSERT to perform the bulk copy with the ‘char,’ ‘native,’ ‘widechar,’ or ‘widenative’ data file type.
FIELDTERMINATOR [= 'field_terminator'] Sets the field terminator for use with char or Unicode char (widechar) data files. The default is the tab character (\t).
FIRSTROW [= first_row] Sets the first row for the BULK INSERT operation to copy. The default is one, which is the first row.
FORMATFILE [= 'format_file_path'] Sets the full path of the format file. The format file is a file that describes the data file which contains the responses created using bcp. This argument should be used when the data file is different than the table or view. An example would be a different number of columns, different ordered columns, or different column delimiters.
KEEPIDENTITY This argument is used to tell BULK INSERT that the identity values are present in the file being imported. If this is not specified, SQL Server will automatically use seed and increment values to set the values for the rows.
KEEPNULLS Sets the columns with no values to contain a NULL value instead of having default values inserted.
KILOBYTES_PER_BATCH [= kilobytes_per_batch] Sets the size of data per batch in kilobytes.
LASTROW [= last_row] Sets the last row for the BULK INSERT operation to copy. The default is zero, which is the last row.
MAXERRORS [= max_errors] Sets the maximum number of errors which can occur before the bulk insert operation is aborted.
ORDER ({column [ASC | DESC]} [,...n]) Sets how the data in the data file is sorted. You can improve the performance of the bulk insert if the data is sorted the same as the clustered index on the table. By default, no ordering is assumed.
[[,] ROWS_PER_BATCH [= rows_per_batch] Sets the number of rows per batch. ROWS_PER_BATCH cannot be used at the same time as BATCHSIZE.
ROWTERMINATOR [= 'row_terminator'] Sets the row terminator used for widechar and char data files. The default is newline character (\n).
TABLOCK Sets a table-level lock for the bulk insert operation. Set the table-level only for the duration of the bulk insert operation. When you do this, you increase the performance of the operation by only having the table locked when necessary.

Table 4: BULK INSERT statement arguments

On the Job: In order to run the BULK INSERT, you must be a member of the sysadmin fixed server role.

Transfer Manager

SQL Server version 6.0 and earlier came with a separate tool called SQL Transfer Manager. This graphical user interface tool enabled you to transfer data easily and quickly. Version 6.5 of SQL Server added this functionality to SQL Server Enterprise Manager. Unlike the BCP, the SQL Transfer Manager was able to transfer both data and objects, such as triggers and stored procedures, between databases. With version 7.0, SQL Server introduces the new Data Transformation Wizard (DTS), which replaces the functionality of SQL Transfer Manager. DTS now allows storing commands in packages, scripting capabilities with JScript and VBScript, and access to OLE DB providers.

Exam watch: Though SQL Transfer Manager no longer exists in SQL Server 7.0, it is still included in the exam objectives for Exam 70-028 as of this writing. It is important that you are aware of the history of SQL Transfer Manager and its relationship with DTS and BCP.

Certification Summary

There are many corporations that use heterogeneous data sources, so it is very important that the database administrator is familiar with transferring data. Transferring data can be categorized as importing data and exporting data. Importing is the transfer of data into SQL Server. Exporting is the transfer of data out of the SQL Server.

SQL Server comes with two main tools used to import and export data: the Data Transformation Wizard (DTS) and the bulk copy program (BCP). The DTS import and export wizards are GUI SQL tools that transfer data across heterogeneous sources using OLE DB. This gives SQL Server access to a variety of database sources from Microsoft Excel to Oracle. SQL Server’s bulk copy program, BCP, is a command-line based program that is used to copy data into and out of SQL Server.

Though the main ways to copy data to and from SQL Server are with Data Transformation Services and Bulk Copy Program, you are still able to use other methods. Some of these methods include INSERT, SELECT INTO, and BULK INSERT statement. You can use the Insert statement to add data from an external OLE DB provider to SQL Server in one or more rows. The Select Into statement is used to retrieve one row of data and assign the value to another row. Bulk insert is used to copy data into SQL Server using the functionality of the BCP utility with a Transact-SQL statement, rather than from the command prompt.

Transferring data is a critical task for a database administrator, since many corporations need to import and export data from heterogeneous data sources. In order to utilize the transfer capabilities of SQL Server, you must be familiar with the tools available to you. By knowing the tools at your disposal, you will be in a better position to make the optimum choice in a given data transfer scenario.

Two-Minute Drill