Over the years, there have been many definitions of database. For our purposes, a database is an organized collection of data serving a central purpose. It is organized in the sense that it contains data that is stored, formatted, accessed, and represented in a consistent manner. It serves a central purpose in that it does not contain extraneous or superfluous data. A phone book is a good example of a database. It contains relevant data (that is, names) that allows access to phone numbers. It does not contain irrelevant data, such as the color of a person's phone. It stores only what is relevant to its purpose. Most often, a database's purpose is business, but it may store scientific, military, or other data not normally thought of as business data. Hence, there are business databases, scientific databases, military databases, and so on. In addition, data can be categorized not only by its business but also by its format. Modern databases contain many types of data other than text and numbers. For example, it is now commonplace to find databases storing pictures, graphs, audio, video, or compound documents, which include two or more of these types.
When discussing databases, and database design in particular, it is commonplace to refer to the central purpose a database serves as its business, regardless of its more specific field, such as aerospace or biomedicine. Furthermore, in real life a database is often highly specific to its business.
In earlier days, programmers who wrote code to serve Automatic Data Processing (ADP) requirements found they frequently needed to store data from run to run. This became known as the need for persistent storage; that is, the need for data to persist, or be saved, from one run of a program to the next. This fundamental need began the evolution of databases as we know them. A secondary need, simple data storage, also helped give rise to databases. Online archiving and historical data are two specific examples. Although files, directories, and file systems could meet most general data storage needs, including indexing variations, databases could do what file systems did and more.
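To make the idea of persistent storage concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is an illustration only: the function name record_run, the table run_log, and the file name runs.db are assumptions made for this example and do not come from any particular product. Each call stands in for one run of a program, and the count it returns grows because the earlier rows survive between runs.

```python
import os
import sqlite3
import tempfile


def record_run(db_path):
    """Simulate one program run: log the run and return how many runs are on record."""
    conn = sqlite3.connect(db_path)  # opens the file, creating it on the first run
    try:
        with conn:  # commits the transaction if no exception is raised
            conn.execute(
                "CREATE TABLE IF NOT EXISTS run_log ("
                "id INTEGER PRIMARY KEY, note TEXT)"
            )
            conn.execute("INSERT INTO run_log (note) VALUES (?)", ("run completed",))
        (count,) = conn.execute("SELECT COUNT(*) FROM run_log").fetchone()
        return count
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical location for the database file used by this demonstration.
    path = os.path.join(tempfile.mkdtemp(), "runs.db")
    print(record_run(path))  # first run
    print(record_run(path))  # second run sees the row saved by the first
```

Because the connection is closed and reopened on every call, nothing is carried over in memory; whatever the second call sees must have been saved to disk by the first.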
Modern databases typically serve some data processing and storage need for departments or smaller organizational units of their parent organization or enterprise. Hence, we use the terms enterprise-wide database, referring to the scope of the whole organization's business; department-wide database, referring to the level of a department; and workgroup database, usually referring to some programming or business unit within a department. Most often, databases are found at the department-wide and workgroup levels.
Occasionally one finds databases that serve enterprise-wide needs, such as payroll and personnel databases, but these are far outnumbered by their smaller brethren. In fact, bringing together, or integrating, several departmental databases into one large database is the essence of building a data warehouse (DW). The smaller databases, which act as the data sources for the larger database, are known as operational databases. However, this is nothing new. An operational database is just one that produces data, which we have known for years as a production database. Only in the context of building a DW do you find production databases also referred to as operational databases, or sometimes operational data stores. With the advent of Internet technology, databases and data warehouses now frequently serve as back ends for Web browser front ends.
When workgroup databases are integrated to serve a larger, departmental need, the result is typically referred to as a data mart (DM). A DM is nothing more than a departmental-scale DW. As you can imagine, just as with the term database, the term data warehouse has yielded a multitude of definitions. However, when you're integrating several smaller databases into one larger database serving a broader organizational need, the resulting database can generally be considered a DW if it stores data historically, provides decision support, offers summarized data, serves data read-only, and acts essentially as a data sink for all the relevant production databases that feed it.
Otherwise, if a database simply grows large because it is a historical database that's been storing data for a long period of time (such as a census database) or because of the type of data it must store (such as an image database) or because of the frequency with which it must store data (such as a satellite telemetry database), it is often referred to as a very large database (VLDB).
What qualifies as a VLDB has changed over time, as is to be expected with disk storage becoming denser and cheaper, the advent of small multiprocessing machines, the development of RAID technologies, and database software growing, or scaling, to handle these larger databases. Currently, a general guideline is that any database of 100GB or larger can be considered a VLDB. As recently as a few years ago, 10GB was considered the breakpoint.
A Database Management System (DBMS) is the software that manages a database. It acts as a repository for all the data and is responsible for its storage, security, integrity, concurrency, recovery, and access. The DBMS has a data dictionary, sometimes referred to as the system catalog, which stores data about everything it holds, such as names, structures, locations, and types. This data is also referred to as metadata, meaning data about data. The lifespan of a piece of data, from its creation to its deletion, is recorded in the data dictionary, as is all logical and physical information about that piece of data. A Database Administrator (DBA) should become intimately familiar with the data dictionary of the DBMS, which serves him or her over the life of the database.
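As a small illustration of metadata, the following Python sketch reads the catalog of SQLite, a lightweight DBMS included in Python's standard library. SQLite keeps its catalog in a table named sqlite_master and reports column details through PRAGMA table_info; other DBMSs expose their data dictionaries through different views and names. The helper names describe_catalog and build_example, and the phone_book table, are assumptions made for this example.

```python
import sqlite3


def describe_catalog(conn):
    """Return {table_name: [(column_name, declared_type), ...]} from SQLite's catalog."""
    catalog = {}
    # sqlite_master holds one row per schema object (tables, indexes, and so on).
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        catalog[table_name] = [(col[1], col[2]) for col in columns]
    return catalog


def build_example():
    """Create an in-memory database with one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE phone_book (name TEXT NOT NULL, phone_number TEXT)")
    return conn


if __name__ == "__main__":
    print(describe_catalog(build_example()))
```

Note that the program never stores the table and column names itself. It asks the DBMS, which recorded them in its catalog when the table was created.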
Security is always a concern in a production database, and often in a development or test database too. It is usually not a question of whether or not to have any security, but rather how much to have. A DBMS typically offers several layers of security, in addition to the operating system (OS) and network security facilities. Most often, a DBMS holds user accounts with passwords, requiring the user to log in, or be authenticated, in order to access the database.
DBMSs also offer other mechanisms, such as groups, roles, privileges, and profiles, all of which offer a further refinement of security. These security levels provide not only for enforcement but also for the establishment of business security policies. For example, only an authenticated user who belongs to an aviation group may access the aviation data. Or, only an authenticated user who has the role of operator may back up the database.
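The two example policies can be sketched in a few lines of Python. This is a toy model of the idea only, not how any particular DBMS implements security: the User class, the POLICY table, the action names, and the is_allowed function are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    authenticated: bool = False
    groups: set = field(default_factory=set)
    roles: set = field(default_factory=set)


# Hypothetical policy table: action -> (kind of membership, required name).
POLICY = {
    "read_aviation_data": ("group", "aviation"),
    "backup_database": ("role", "operator"),
}


def is_allowed(user, action):
    """Allow an action only for an authenticated user with the required membership."""
    if not user.authenticated:
        return False  # authentication comes before any other check
    requirement = POLICY.get(action)
    if requirement is None:
        return False  # actions with no stated policy are denied by default
    kind, required_name = requirement
    memberships = user.groups if kind == "group" else user.roles
    return required_name in memberships


if __name__ == "__main__":
    pilot = User("pat", authenticated=True, groups={"aviation"})
    print(is_allowed(pilot, "read_aviation_data"))
    print(is_allowed(pilot, "backup_database"))
```

Keeping the policy in a table separate from the checking logic mirrors the point made above: the same mechanism that enforces security also serves as the place where the business states its security policy.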