Databases demystified ebook download




















In the tradition of Bruce A. Tate's Seven Languages in Seven Weeks, this book goes beyond your basic tutorial to explore the essential concepts at the core of each technology. With each database, you'll tackle a real-world data problem that highlights the concepts and features that make it shine.

Unicode is a critical enabling technology for developers who want to internationalize applications for global environments. But, until now, developers have had to turn to standards documents for crucial information on utilizing Unicode. In Unicode Demystified, one of IBM's leading software internationalization experts covers every key aspect of Unicode development, offering practical examples and detailed guidance for integrating Unicode 3.

Writing from a developer's point of view, Rich Gillam presents a systematic introduction to Unicode's goals, evolution, and key elements.

It is far more common to include them in a text document that accompanies the diagram. A specialist who performs logical database design is called a database designer, but often the data modeler or database administrator (DBA) performs this design step.

The final design step is physical database design, which involves mapping the logical design to one or more physical designs—each tailored to the particular DBMS that will manage the database and the particular computer system on which the database will run.

The person who performs physical database design is usually the DBA. However, we work through the DBMS to implement the physical layer, making it difficult to separate the two layers.

In most DBMS implementations, defaults are used if the location and space allocation are not explicitly specified. Because so much of the physical implementation is buried in the DBMS definitions of the logical structures, we have elected not to try to separate them here. During logical database design, physical storage properties (filename, storage location, and sizing information) may be assigned to each database object as we map them from the conceptual model, or they may be omitted at first and added later in a physical design step that follows logical design.

For time efficiency, most data modelers and DBAs perform the two design steps (logical and physical) in parallel.

Tables

The primary unit of storage in the relational model is the table, which is a two-dimensional structure composed of rows and columns.

Each row represents one occurrence of the entity that the table represents, and each column represents one attribute for that entity. The process of mapping the entities in the conceptual design to tables in the logical design is called normalization and is covered in detail in Chapter 6.
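To make the row-and-column idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Microsoft Access. The cut-down orders table, its column names, and the sample values are assumptions made up for illustration; they are not the actual Northwind definition.

```python
import sqlite3

# In-memory database; the table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id    INTEGER,
        customer_id INTEGER,
        order_date  TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(30, 27, "2006-01-15"), (31, 4, "2006-01-20")],
)

# Each fetched tuple is one row (one order); each position is one column.
cursor = conn.execute(
    "SELECT order_id, customer_id, order_date FROM orders ORDER BY order_id"
)
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

for row in rows:
    print(dict(zip(column_names, row)))
```

Each printed dictionary corresponds to one occurrence of the entity, keyed by attribute name.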

Often, an entity in the conceptual model maps to exactly one table in the logical model, but this is not always the case. For reasons you will learn with the normalization process, entities are commonly split into multiple tables, and in rare cases, multiple entities may be combined into one table. Figure shows a listing of part of the Northwind Orders table.

It is important to remember that a relational table is a logical storage structure and usually does not exist in tabular form in the physical layer. In most DBMS products, the DBA assigns a table to a logical structure called a tablespace, and each tablespace is implemented using one or more operating system files in the physical layer.

It is quite common for multiple tables to be placed in a single tablespace. However, large tables may be placed in their own tablespace or split across multiple tablespaces, which is called partitioning.

Each table must be given a unique name by the DBA who creates it. The maximum length for these names varies considerably among RDBMS products, from as few as 18 characters in some to many more in others. Table names should be descriptive and should reflect the name of the real-world entity they represent. By convention, some DBAs always name entities in the singular and tables in the plural, and you will see this convention used in the Northwind database. I prefer that both be named in the singular, but obviously there are other learned professionals with counter opinions.

It is essential to establish naming standards at the outset so that names are not assigned in a haphazard manner, which only leads to confusion later. As a case in point, Microsoft Access permits embedded spaces in table and column names, which is counter to industry standards.

You may wish to set standards that forbid the use of names with embedded spaces and names in mixed case because such names are nonstandard and make any conversion between database vendors that much more difficult.

Columns and Data Types

As already mentioned, each column in a relational table represents an attribute from the conceptual model. The column is the smallest named unit of data that can be referenced in a relational database.

Each column must be assigned a unique name within the table and a data type. A data type is a category for the format of a particular column. Data types provide several valuable benefits. For example, if you subtract a number from another number, you get a number as a result; but if you subtract a date from another date, you get a number representing the elapsed days between the two dates as a result.
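The same type-dependent behavior can be seen outside a database. The short sketch below uses Python's datetime module purely as an analogy; the values are invented, and a particular DBMS may express date arithmetic with different syntax.

```python
from datetime import date

# Subtracting two numbers yields a number.
quantity_difference = 12 - 5

# Subtracting two dates yields an elapsed interval, read here in days.
elapsed = date(2006, 3, 1) - date(2006, 1, 15)
elapsed_days = elapsed.days

print(quantity_difference)
print(elapsed_days)
```

The minus operator is the same in both cases; the data type of the operands determines what the result means.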

For example, numbers can often be stored in an internal numeric format that saves space, compared with merely storing the numeric digits as a string of characters. Figure shows the table definition of the Northwind Orders table from Microsoft Access (the same table listed earlier). The data type for each column is listed in the second column from the left.

The data type names are usually self-evident, but if you find any of them confusing, you can find definitions of each in the Microsoft Access help pages. Most vendors did their own thing for many years before sitting down with other vendors to develop standards, and this is no more evident than in the wide variation of data type options across the major RDBMS products.

Today there are ANSI standards for relational data types, and the major vendors support all or most of the standard types. One could say in jest that the greatest thing about database standards is that there are so many from which to choose (each vendor having their own). In terms of industry standards for relational databases, Microsoft Access is probably the least compliant and MySQL the most compliant of the most popular products. Given the many levels of standards compliance and all the vendor extensions, the DBA must have a detailed knowledge of the data types available on the particular DBMS that is in use in order to successfully deploy the database.

And, of course, great care must be taken when converting logical designs from one vendor to another. As always, the devil is in the details, meaning that these are not identical data types, merely equivalent.

Constraints

A constraint is a rule placed on a database object (typically a table or column) that restricts the allowable data values for that database object in some way. Constraints are especially important in relational databases because they are the way we implement both the relationships and business rules specified in the logical design.

Each constraint is assigned a unique name to permit it to be referenced in error messages and subsequent database commands.

Primary Key Constraints

A primary key is a column or a set of columns that uniquely identifies each row in a table. A unique identifier in the conceptual design is thus implemented as a primary key in the logical design. The small icon that looks like a door key to the left of the Order ID field name in Figure indicates that this column has been defined as the primary key of the Orders table. When we define a primary key, the RDBMS implements it as a primary key constraint to guarantee that no two rows in the table will ever have duplicate values in the primary key column(s).

Note that for primary keys composed of multiple columns, each column by itself may have duplicate values in the table, but the combination of the values for the primary key columns must be unique among all rows in the table. Primary key constraints are nearly always implemented by the RDBMS using an index, which is a special type of database object that permits fast searches of column values.

As new rows are inserted into the table, the RDBMS automatically searches the index to make sure the value for the primary key of the new row is not already in use in the table, rejecting the insert request if it is.
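The rejection of a duplicate key can be observed directly. The following is a minimal sketch using SQLite through Python's sqlite3 module, not Microsoft Access; the table, column names, and values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: order_id is declared as the primary key.
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, ship_name TEXT)")
conn.execute("INSERT INTO orders (order_id, ship_name) VALUES (30, 'First order')")

duplicate_rejected = False
try:
    # Same primary key value as the existing row, so the insert must fail.
    conn.execute("INSERT INTO orders (order_id, ship_name) VALUES (30, 'Duplicate')")
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("Insert rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print("Rows in table:", row_count)
```

The application code does nothing to check for the duplicate; the database refuses the second insert on its own.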

Referential Constraints

To understand how the RDBMS enforces relationships using referential constraints, we must first understand the concept of foreign keys. A foreign key gets its name from the column(s) copied from another (foreign) table. In the Orders table definition shown earlier in Figure , the Employee ID column is a foreign key to the Employees table, the Customer ID column is a foreign key to the Customers table, and the Shipper ID column is a foreign key to the Shippers table.

In most relational databases, the foreign key must reference either the primary key of the parent table or a column or set of columns for which a unique index is defined. This again is for efficiency. Most people prefer that the foreign key column(s) have names identical to the corresponding primary key column(s), but again there are counter opinions, especially because like-named columns are a little more difficult to use in query languages.

It is best to set some standards up front and stick with them throughout your database project. Each relationship between entities in the conceptual design becomes a referential constraint in the logical design. A referential constraint (sometimes called a referential integrity constraint) is a constraint that enforces a relationship among tables in a relational database.

Microsoft Access provides a very nice feature for foreign key columns, but it takes a bit of getting used to. When you define a referential constraint, you can define an automatic lookup of the parent table rows, as was done throughout the Northwind database.

In Figure , the second column in the table is listed as Employee ID. Similarly, the Customer column of the table displays the customer name, and the Ship Via column displays the shipping company name. This is a convenient and easy feature for the database user, and it prevents a nonexistent customer, employee, or shipper from being associated with an order. The beauty of database constraints is that they are automatic and therefore cannot be circumvented unless the DBA disables or removes them.

At first, you probably wondered why anyone would ever want automatic deletion of child rows. Consider the Orders and Order Details tables. If an order is to be deleted, why not delete the order and the line items that belong to it in one easy step?

However, with the Employee table, we clearly would not want that option. If we attempt to delete Employee 9 from the Employee table (perhaps because he or she is no longer an employee), the RDBMS must check for rows assigned to Employee ID 9 in the Orders table and reject the delete request if any are found.

It would make no business sense to have orders automatically deleted when an employee left the company. In most relational databases, an SQL statement is used to define a referential constraint. SQL (Structured Query Language), the language used in relational databases to communicate with the database, is introduced in Chapter 4.
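As a rough preview of what such SQL can look like, here is a sketch using SQLite through Python's sqlite3 module. The table and column names are simplified assumptions, not the Northwind definitions, and the PRAGMA line is specific to SQLite, which enforces foreign keys only when asked. The orders table refuses deletion of an employee who still has orders, while order_details rows are deleted automatically along with their order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript(
    """
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        employee_id INTEGER REFERENCES employees (employee_id) ON DELETE RESTRICT
    );
    CREATE TABLE order_details (
        order_id   INTEGER REFERENCES orders (order_id) ON DELETE CASCADE,
        product_id INTEGER,
        quantity   INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO employees VALUES (9, 'Example');
    INSERT INTO orders VALUES (30, 9);
    INSERT INTO order_details VALUES (30, 1, 5), (30, 2, 10);
    """
)

# An order that names a nonexistent employee is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (31, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting Employee 9 is rejected because an order still refers to that row.
try:
    conn.execute("DELETE FROM employees WHERE employee_id = 9")
    employee_delete_rejected = False
except sqlite3.IntegrityError:
    employee_delete_rejected = True

# Deleting the order removes its line items in the same step.
conn.execute("DELETE FROM orders WHERE order_id = 30")
remaining_details = conn.execute("SELECT COUNT(*) FROM order_details").fetchone()[0]

print(orphan_rejected, employee_delete_rejected, remaining_details)
```

The choice between RESTRICT and CASCADE is made per constraint, which is why the two relationships can behave differently.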

Many vendors also provide GUI (graphical user interface) panels for defining database objects such as referential constraints. For Microsoft Access, Figure shows the Relationships panel that is used for defining referential constraints. For simplicity, only the Orders table and its immediate parent and child tables are shown. These constraints are defined by simply dragging the name of the primary key in the parent table to the name of the foreign key in the child table.

A pop-up window is then automatically displayed to allow the definition of options for the referential constraint, as shown in the figure. At the top of the Edit Relationships panel, the two table names appear with the parent table on the left and the child table on the right. Under each table name are rows for selection of the column names that constitute the primary key and the foreign key.

An update of primary key values is a rare situation. Think carefully here. There are times to use automatic deletion of child rows, such as the constraint between Orders and Order Details, and times when the option can lead to the disastrous unwanted loss of data, such as deleting an employee (perhaps accidentally) and having all the orders that employee handled automatically deleted from the database.

Intersection Tables

The discussion of many-to-many relationships earlier in this chapter pointed out that relational databases cannot implement these relationships directly and that an intersection table is formed to establish them.

Figure shows the implementation of the Order Details intersection table in Microsoft Access. Understanding this arrangement is fundamental to understanding how relational databases work.

Each Orders table row may have many related Order Details rows (one for each line item for that particular order), but each Order Details row belongs to one and only one Orders table row.
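The arrangement can be sketched in a few lines using SQLite through Python's sqlite3 module. The three tables below are simplified assumptions rather than the Northwind definitions. The intersection table carries a foreign key to each parent and uses the pair of foreign keys as its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript(
    """
    CREATE TABLE orders   (order_id   INTEGER PRIMARY KEY);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE order_details (
        order_id   INTEGER NOT NULL REFERENCES orders (order_id),
        product_id INTEGER NOT NULL REFERENCES products (product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO orders VALUES (30), (31);
    INSERT INTO products VALUES (1, 'Tea'), (2, 'Coffee');
    INSERT INTO order_details VALUES (30, 1, 5), (30, 2, 10), (31, 1, 3);
    """
)

# One order has many line items...
items_for_order_30 = conn.execute(
    "SELECT COUNT(*) FROM order_details WHERE order_id = 30"
).fetchone()[0]

# ...and one product appears on many orders.
orders_with_product_1 = conn.execute(
    "SELECT COUNT(*) FROM order_details WHERE product_id = 1"
).fetchone()[0]

# Each column may repeat, but the same (order, product) pair may not.
try:
    conn.execute("INSERT INTO order_details VALUES (30, 1, 99)")
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

print(items_for_order_30, orders_with_product_1, pair_rejected)
```

Two one-to-many relationships meeting in the middle table are what stand in for the single many-to-many relationship between orders and products.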

Integrity Constraints

As already mentioned, business rules from the conceptual design become constraints in the logical design.

An integrity constraint is a constraint (as defined earlier) that promotes the accuracy of the data in the database. The key benefit is that these constraints are invoked automatically by the RDBMS and cannot be circumvented (unless you are a DBA) no matter how you connect to the database.

A null value in a relational database is a special code that can be placed in a column that indicates that the value for that column in that row is unknown. A null value is not the same as a blank, an empty string, or a zero—it is indeed a special code that has no other meaning in the database.

A uniform way to treat null values is an ANSI standard for relational databases. However, there has been much debate over the usefulness of the option because the database cannot tell you why the value is unknown.

The other dilemma is that null values are not equal to anything, including other null values, which introduces three-valued logic into database searches. With nulls in use, a search can return the condition true (the column value matches), false (the column value does not match), or unknown (the column value is null). The developers who write the application programs have to handle null values as a special case. Figure shows the definition of the Order Date column of the Orders table.
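Three-valued logic is easy to observe. The sketch below uses SQLite through Python's sqlite3 module, where a null value arrives in Python as None; the table and its three sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(30, "2006-01-15"), (31, None), (32, "")],  # a date, a null, an empty string
)

# Comparing null with null is neither true nor false; the result is unknown.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# "= NULL" never matches a row; IS NULL is the test for unknown values.
matched_with_equals = conn.execute(
    "SELECT order_id FROM orders WHERE order_date = NULL"
).fetchall()
matched_with_is_null = conn.execute(
    "SELECT order_id FROM orders WHERE order_date IS NULL"
).fetchall()

# An empty string is an ordinary value, not a null.
matched_empty_string = conn.execute(
    "SELECT order_id FROM orders WHERE order_date = ''"
).fetchall()

# The null row is also absent from the "does not match" side of a search.
not_matching = conn.execute(
    "SELECT order_id FROM orders WHERE order_date <> '2006-01-15'"
).fetchall()

print(null_equals_null, matched_with_equals, matched_with_is_null)
print(matched_empty_string, not_matching)
```

Order 31 appears in neither the matching nor the non-matching result, which is the special case application developers have to handle.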

Note that the column is not required because the Required option is set to No. The outcome of the statement must be a logical true or false, with an outcome of true allowing the column value to be placed in the table, and a value of false causing the column value to be rejected with an appropriate error message.

This rule prevents order dates from the past (dates earlier than the current date) from being entered by making sure that the value supplied for the column is greater than or equal to the current date.

If this were not the case, a constraint like this one would cause error conditions anytime an old row was retrieved from the database. Although the syntax of the option will vary for other databases, the concept remains the same. If we choose to implement this constraint in the database, as opposed to leaving it up to application logic, we need the database to prevent new rows from being added to the Orders table if the Account Receivable row for the customer has an overdue amount that is greater than zero.
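One way a database can enforce a rule like this is with a trigger. The following is a minimal sketch in SQLite, driven from Python's sqlite3 module; the table names, column names, trigger name, and error message are all assumptions, and trigger syntax differs considerably among DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account_receivable (
        customer_id    INTEGER PRIMARY KEY,
        overdue_amount NUMERIC NOT NULL DEFAULT 0
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
    );

    -- Hypothetical trigger: look up the customer's overdue amount before
    -- each insert and abort the insert if the amount is greater than zero.
    CREATE TRIGGER orders_block_overdue
    BEFORE INSERT ON orders
    WHEN (SELECT overdue_amount
          FROM account_receivable
          WHERE customer_id = NEW.customer_id) > 0
    BEGIN
        SELECT RAISE(ABORT, 'customer has an overdue balance');
    END;

    INSERT INTO account_receivable VALUES (1, 0), (2, 125.50);
    """
)

# Customer 1 owes nothing overdue, so the order is accepted.
conn.execute("INSERT INTO orders VALUES (100, 1)")

# Customer 2 has an overdue amount, so the trigger stops the insert.
try:
    conn.execute("INSERT INTO orders VALUES (101, 2)")
    overdue_order_rejected = False
except sqlite3.DatabaseError as exc:
    overdue_order_rejected = True
    print("Insert rejected:", exc)

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print("Orders stored:", order_count)
```

Because the check lives in the database, it applies to every application that inserts orders, not only to the ones that remember to test the balance.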

In this example, we want the trigger to fire whenever a new row is inserted into the Orders table. The trigger obtains the overdue amount for the customer from the Account Receivable table or wherever the column is physically stored. If this amount is greater than zero, the trigger will raise a database error that stops the insert request and causes an appropriate error message to be displayed.

Views

A view is a stored database query that provides a database user with a customized subset of the data from one or more tables in the database.

Said another way, a view is a virtual table because it looks like a table and for the most part behaves like a table, yet it stores no data (only the defining query is stored). During logical design, each view is created using an appropriate method for the particular database. In Microsoft Access, views are called queries and are created using the Query panel. Figure shows the Microsoft Access definition of a simple view query that lists active products.
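In databases that use SQL, a comparable view could be sketched as follows. This example uses SQLite through Python's sqlite3 module; the products table, its columns, and the discontinued flag are assumptions for illustration and are not copied from the Access definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        product_id    INTEGER PRIMARY KEY,
        product_name  TEXT,
        category      TEXT,
        list_price    NUMERIC,
        standard_cost NUMERIC,
        reorder_level INTEGER,
        discontinued  INTEGER NOT NULL DEFAULT 0
    );

    -- Only the defining query is stored; the view holds no rows of its own.
    CREATE VIEW active_products AS
        SELECT product_id, product_name, category, list_price
        FROM products
        WHERE discontinued = 0;

    INSERT INTO products VALUES
        (1, 'Tea', 'Beverages', 18, 13.5, 10, 0),
        (2, 'Old Syrup', 'Condiments', 10, 7.5, 25, 1);
    """
)

# The view is queried exactly like a table.
view_rows = conn.execute("SELECT * FROM active_products").fetchall()

# A row added to the base table shows up in the view with no further work.
conn.execute("INSERT INTO products VALUES (3, 'Coffee', 'Beverages', 46, 34.5, 25, 0)")
view_rows_after = conn.execute(
    "SELECT product_name FROM active_products ORDER BY product_id"
).fetchall()

print(view_rows)
print(view_rows_after)
```

The view exposes four of the seven columns and hides discontinued products, so a user of the view never sees cost data or retired items.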

The view defined in Figure contains only four columns of a table that contains more than ten columns. Figure shows a portion of the query results. We explore the Microsoft Access Query panel in detail in Chapter 3.

Quiz

Choose the correct responses to each of the multiple-choice questions.

Table B. Column C. View D. Referential constraint E. Index

2. A primary key constraint is implemented using which type of object in the logical design? Index

3. A circle and a vertical tick mark near the end of the line B. Two vertical tick marks near the end of the line E.

4. Valid types of relationships among entities are A. One-to-one B. One-to-many C. One-to-many-to-one D. None-to-many E. Many-to-many

5. Examples of a business rule are A. An employee must be at least 18 years old. Employees below pay grade 6 are not permitted to modify orders. Every order may belong to only one customer, but each customer may have many orders. A referential constraint must refer to the primary key of the parent table. A database query that eliminates columns an employee should not see.

6. A primary key constraint: A. Must be defined for every database table B. Must reference one or more columns in a single table C. Guarantees that no two rows in a table have duplicate primary key values D. Is usually implemented using an index E. Enforces referential integrity constraints

7. Major types of integrity constraints are A. Indexes D. Constraints enforced with triggers E. One-to-one relationships

8. A referential constraint is defined: A. Using a database trigger B. Using the Relationships panel in Microsoft Access C. Using SQL in most relational databases D. Using the referential data type for the foreign key column(s) E. In a view

9. A relational table: A. Appears in the conceptual database design B. Is composed of rows and columns C. Is the primary unit of storage in the relational model D. Must be assigned a data type E. Must be assigned a unique name

10. A data type: A. May be selected based on business rules for an attribute B. Provides a set of behaviors for a column that assists the database user C. Restricts the data that may be stored in a view D. Restricts characters allowed in a database column

Chapter 3

Forms-Based Database Queries

This chapter provides an overview of forming and running database queries using the forms-based query tool in Microsoft Access. Even if you never intend to use Microsoft Access or another forms-based database query product, at least give this chapter a quick read because it will help you visualize database concepts.

Also keep in mind that Chapter 4 introduces SQL, the standard query language for all modern relational databases. Be able to use Microsoft Access and the video store sample database to create and run queries. The database user defines queries by entering sample data values directly into a query template to represent the result that the database is to achieve. An alternative query method uses a command-based query language, in which queries are written as text commands.

SQL is the ubiquitous command-based query language for relational databases. The emphasis with both forms-based and command-based query languages is on what the result should be rather than how the results are achieved.
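The difference between stating what is wanted and spelling out how to get it can be sketched briefly. The example below uses SQLite through Python's sqlite3 module; the movie table and its contents are invented for illustration and are not the video store sample database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT, rating TEXT)"
)
conn.executemany(
    "INSERT INTO movie (title, rating) VALUES (?, ?)",
    [("Zebra Tales", "PG"), ("Night Shift", "R"), ("Apple Road", "PG")],
)

# Declarative: describe the desired result; the DBMS decides how to
# locate, filter, and sort the rows.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM movie WHERE rating = 'PG' ORDER BY title"
    )
]

# Procedural: the application spells out each step itself.
procedural = []
for title, rating in conn.execute("SELECT title, rating FROM movie"):
    if rating == "PG":
        procedural.append(title)
procedural.sort()

print(declarative)
print(procedural)
```

Both approaches return the same titles, but only the first leaves the choice of access method to the database.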

The difference between the two is in the way the user describes the desired result—similar to the difference between using Microsoft Windows Explorer to copy a file versus using the MS-DOS copy command in the DOS command window to do the same thing. A command-based query language such as SQL requires queries to be entered by the database user as text commands. Personal computers, Microsoft Windows, the mouse, and many other modern computing amenities were unheard of then, but the interface was still graphical in nature.

A form was displayed, and database users typed sample data and simple commands in boxes, where today they would click an onscreen button using a mouse. IBM learned that most users preferred to use the method they learned first, which seems to be human nature.

TERMS: Forms-based Query Language. A forms-based query language uses a GUI panel on which database users define queries by entering sample data values directly into a query template to represent the result that the database query is to achieve.

Experience has shown us that both methods are useful to know. Forms-based queries lend themselves well to individuals who are more accustomed to GUI environments than to touch-typing commands.

However, database users familiar with command syntax and possessing reasonable typing skills can enter command-based queries more quickly than their GUI equivalents, and command-based queries can be directly used within a programming language such as Java or C.

Getting Started in Microsoft Access

I am using Microsoft Access to present database query concepts that will provide a foundation for the database design theory that follows later in this book.

I will provide enough basic information about using Access so you can follow along on your computer as you explore forms-based queries. The queries used in this chapter all feature a video store sample database available from the McGraw-Hill web site, as explained in Appendix C.

You will have the best learning experience if you try the queries presented in this chapter as you read.

Getting Started with the Video Store Sample Database

This topic contains two sets of instructions, one for each of two versions of Microsoft Access. Both versions use the same database (MDB) file format. Follow the steps in the procedure that applies to you.

If you have not already done so, follow the instructions in Appendix C for downloading the sample database file. Start Microsoft Access from your Start menu with no databases open.

The Getting Started panel, shown in Figure , is displayed. Click the Office button in the upper-left corner of the Getting Started panel (the round button with the Microsoft Office logo on it), and then click Open. The Microsoft Access main panel is displayed with the video store database tables listed along the left margin, as shown in the figure. Microsoft Access automatically disables application code such as Visual Basic macros when a database is opened.

The sample database does not contain any such code, so you can also simply close the message by clicking the Close button (the X to the far right of the Security Warning message).

Do not click the X at the upper-right corner of your screen; that will close Microsoft Access, and you will have to start all over. You will need to repeat this step every time you open the database. Windows will then launch Microsoft Access and open the database file in one continuous operation. The File panel is displayed as shown in the figure. Along the left margin, click Open. Once connected to the database, a screen like the one shown in Figure will be displayed.

Microsoft Access automatically disables content such as Visual Basic macros when databases are opened for the first time. The sample database has no such content, so it is safe to click the Enable Content button. A side benefit of doing this is that you will not have to repeat this step if you close the database and subsequently reopen it.

However, Access is similar enough that you should have no difficulty following along while using it. You will learn the most if you try the examples in this chapter as you read. Appendix C contains an overview of the video store sample database, including an ERD (entity-relationship diagram). Should you need to close Microsoft Access before completing this chapter, you can simply launch it later, and pick up where you left off.

If you do so, you will see a startup screen like the ones shown in the earlier figures, and the video store database should be listed on it because you have previously opened it. In one of the two Access versions, look for the database under the Open Recent Database heading on the right side of the panel; in the other, look for it along the left side of the panel, or click the Recent option to display the Recent Databases panel.

Simply click the listed filename to open the database. If the database is not listed, you can download it by following the procedure in the previous topic. However, if this happens, you can just download the database again. Once you have started Access and connected to the video store database, the main panel is displayed with the Home ribbon selected, as shown in the figure. The development of applications using Access is well beyond the scope of this book.

This chapter focuses on those components that are directly related to defining data structures and to managing the data stored in them. This ribbon-based user interface (Access is part of the Office suite of applications) is a radical departure from previous versions that used a series of drop-down menus. If you are accustomed to using the old interface, it takes a while to adapt to this new one.

A final option allows you to customize the toolbar. The icons are reasonably intuitive, but you can allow your cursor pointer to hover over each one for a second or two, and see the names of the options. These options are common to most Microsoft Office applications.

Directly below the Quick Access Toolbar are tabs for the major groupings of ribbon options available within Access. In previous versions, these were used to open drop-down menus; in Access and beyond, they are tabs that change the ribbon of options that appears immediately below the tabs. Figure shows the Home ribbon, for example.

Many of the Home ribbon options are related to building application components within Access (forms, reports, and so forth), which are beyond the scope of database work. However, you will use the View option often, because it allows you to switch between the Design View, which shows the metadata that defines a database object, and the Datasheet View, which shows the data that is stored in the database object in rows and columns.

The Create ribbon, shown in Figure , provides options for creating templates, tables, forms, reports, and other types of objects. As you can see, the Tables group of options allows you to create relational tables using various tools.

Figure shows the External Data ribbon, which contains options for importing or linking data from external sources, exporting to external file formats, including most of the other Office applications, collecting data from e-mail, and linking to data lists on web pages. The Database Tools ribbon, shown in Figure , contains various tools that assist in managing the database.

The most important of these in terms of database design is the Relationships option, which you will study in the next section. First, though, we need to cover another important navigation feature in Access. You might have noticed the Navigation Pane along the left side of the panels we have examined thus far. This is an essential feature of Access because it provides a common method of organizing, listing, and opening (accessing) the objects stored in the database.

Once minimized, as shown in Figures through , you can maximize it by again clicking the double arrowhead pointing to the right this time. You can right-click the top of the pane to change the way it organizes the listed objects. In the video store database, the default organization is Object Type, with only tables displayed.

To switch to an organization other than Object Type, right-click the top of the pane, and click Category and then the organization you desire. For the exercises in this chapter to make sense, you should select either Tables or All Access Objects. You can expand any category as needed to view the list of objects in that category, and of course, minimize the categories that are not of current interest. Note that Access does not display headings for categories that have no objects in them.

If you have used older versions of Access, the object types should be familiar because they appeared on the main panel of those older versions. Tables hold the actual database data in rows and columns. Queries are called views in nearly all other relational databases. As noted earlier, Microsoft Access is not only a database, but also a complete development environment for building and running applications.

Digital Electronics Demystified. The field of teaching digital electronics has not changed significantly in the past 20 years. Many of the same books that first became available in the late s and early s are still being used as basic texts. Courses teaching introductory digital electronics will fill in the missing areas of information for students, but neither the instructors nor students have resources to explain modern technology and interfaces.

One assumption made by all the standard texts is that experimenting with digital electronics cannot

Databases Demystified. Through clear language, step-by-step discussions, and quizzes at the end of each chapter, the author makes databases easy.


