
User's Guide
Part III. Using SQL Anywhere
Chapter 16. Designing Your Database
The design process
There are five major steps in the design process.
For information about implementing the database design, see the chapter "Working with Database Objects".
Step 1: identify entities and relationships
To identify the entities in your design and their relationship to each other:
- Define high-level activities. Identify the general activities you will use this database for. For example, you may want to keep track of information about employees.
- Identify entities. For the list of activities, identify the subject areas you need to maintain information about. These will become tables. For example, hire employees, assign to a department, and determine a skill level.
- Identify relationships. Look at the activities and determine what the relationships will be between the tables. For example, there is a relationship between departments and employees. We give this relationship a name.
- Break down the activities. You started out with high-level activities. Now examine these activities more carefully to see if some of them can be broken down into lower-level activities. For example, a high-level activity such as maintain employee information
can be broken down into:
- Add new employees
- Change existing employee information
- Delete terminated employees
- Identify business rules. Look at your business description and see what rules you follow. For example, one business rule might be that a department has one and only one department head. These rules will be built into the structure of the database.
Entity and relationship example
Example
ACME Corporation is a small company with offices in five locations. Currently, 75 employees work for ACME. The company is preparing for rapid growth and has identified nine departments, each with its own department head.
To help in its search for new employees, the personnel department has identified 68 skills that it believes the company will need in its future employee base. When an employee is hired, the employee's level of expertise for each skill is identified.
Define high-level activities
Some of the high-level activities for ACME Corporation are:
- Hire employees
- Terminate employees
- Maintain personal employee information
- Maintain information on skills required for the company
- Maintain information on which employees have which skills
- Maintain information on departments
- Maintain information on offices
Identify the entities and relationships
We can identify the subject areas (tables) and relationships that will hold the information and create a diagram based on the description and high-level activities.
We use boxes to show tables and diamonds to show relationships. We can also identify which relationships are one-to-many, one-to-one, and many-to-many.
This is a rough E-R diagram. It will be refined throughout the chapter.
Break down the high-level activities
The lower-level activities below are based on the high-level activities listed above:
- Add or delete an employee
- Add or delete an office
- List employees for a department
- Add a skill
- Add a skill for an employee
- Identify skills for an employee
- Identify an employee's skill level for each skill
- Identify all employees that have the same skill level for a particular skill
- Change an employee's skill level
These lower-level activities can be used to identify if any new tables or relationships are needed.
Identify business rules
Business rules often identify one-to-many, one-to-one, and many-to-many relationships.
The kind of business rules that may be relevant include the following:
- There are now five offices; expansion plans allow for a maximum of 10.
- Employees can change department or office
- Each department has one department head
- Each office has a maximum of three telephone numbers
- Each telephone number has one or more extensions
- When an employee is hired, the level of expertise in each of several skills is identified
- Each employee can have from three to 20 skills
- An employee may or may not be assigned to an office
Step 2: identify the required data
To identify the required data:
- Identify supporting data.
- List all the data you will need to keep track of. The data that describes the table (subject) answers the questions who, what, where, when, and why.
- Set up data for each table.
- List the available data for each table as it seems appropriate right now.
- Set up data for each relationship.
- List the data that applies to each relationship (if any).
Identify supporting data
The supporting data you identify will become the names of the columns in the table. For example, the data below might apply to the Employee table, the Skill table, and the Expert In table:
Employee |
Skill |
Expert In |
Employee ID |
Skill ID |
Skill level |
Employee first name |
Skill name |
Date skill was acquired |
Employee last name |
Description of skill |
|
Employee department |
|
|
Employee office |
|
|
Employee address |
|
|
If you make a diagram of this data, it will look like this:
Things to remember
- When you are identifying the supporting data, be sure to refer to the activities you identified earlier to see how you will need to access the data.
- For example, if you know that you will need a list of all employees sorted by last name, make sure that you specify supporting data as Last name and First name, rather than simply Name (which would contain both first and last names).
- The names you choose should be consistent. Consistency makes it easier to maintain your database and easier to read reports and output windows.
- For example, if you choose to use an abbreviated name such as Emp_status for one piece of data, you should not use the full name (Employee_ID) for another piece of data. Instead, the names should be Emp_status and Emp_ID.
- It is not crucial that the data be associated with the correct table. You can use your intuition. In the next section, you'll apply tests to check your judgment.
Step 3: normalize the data
Normalization is a series of tests you use to eliminate redundancy in the data and make sure the data is associated with the correct table or relationship. There are five tests. In this section, we will talk about the three tests that are usually used.
For more information about the normalization tests, see a book on database design.
Normal forms
Normal forms are the tests you usually use to normalize data. When your data passes the first test, it is considered to be in first normal form, when it passes the second test, it is in second normal form, and when it passes the third test, it is in third normal form.
To normalize the data:
- List the data:
- Identify at least one key for each table. Each table must have a primary key.
- Identify keys for relationships. The keys for a relationship are the keys from the two tables it joins.
- Check for calculated data in your supporting data list. Calculated data is not normally stored in the database.
- Put data in first normal form:
- Remove repeating data from tables and relationships.
- Create one or more tables and relationships with the data you remove.
- Put data in second normal form:
- Identify tables and relationships with more than one key.
- Remove data that depends on only one part of the key.
- Create one or more tables and relationships with the data you remove.
- Put data in third normal form:
- Remove data that depends on other data in the table or relationship and not on the key.
- Create one or more tables and relationships with the data you remove.
Data and keys
Before you begin to normalize (test your data), simply list the data and identify a unique primary key for each table. The key can be made up of one piece of data (column) or several (a concatenated key).
The primary key is the set of columns that uniquely identifies rows in a table. The primary key for the Employee table is the Employee ID column. The primary key for the Works In relationship consists of the Office Code and Employee ID columns. Give a key to each relationship in your database by taking the key from each of the tables it connects. In the example, the keys identified with an asterisk are the keys for the relationship:
Relationship |
Key |
Office |
*Office code |
|
Office address |
|
Phone number |
Works in |
*Office code |
|
*Employee ID |
Department |
*Department ID |
|
Department name |
Heads |
*Department ID |
|
*Employee ID |
Assoc with |
*Department ID |
|
*Employee ID |
Skill |
*Skill ID |
|
Skill name |
|
Skill description |
Expert in |
*Skill ID |
|
*Employee ID |
|
Skill level |
|
Date acquired |
Employee |
*Employee ID |
|
Employee last name |
|
Employee first name |
|
Social security number |
|
Employee street |
|
Employee city |
|
Employee state |
|
Employee phone |
|
Date of birth |
Putting data in first normal form
- Remove repeating groups.
- To test for first normal form, remove repeating groups and put them into a table of their own.
- In the example below, Phone number can repeat. (An office can have more than one telephone number.) Remove the repeating group and make a new table called Telephone. Set up a relationship called Associated With between Telephone and Office.
Putting data in second normal form
- Remove data that does not depend on the whole key.
- Look only at tables and relationships that have more than one key. To test for second normal form, remove any data that does not depend on the whole key (all the columns that make up the key).
- In this example, the original Employee table specifies a key composed of two columns. Some of the data does not depend on the whole key; for example, the department name depends on only one of those keys (Department ID). Therefore, the Department ID, which the other employee data does not depend on, is moved to a table of its own called Department, and a relationship called Assigned To is set up between Employee and Department.
Putting data in third normal form
- Remove data that doesn't depend directly on the key.
- To test for third normal form, remove any data that depends on other data rather than directly on the key.
- In this example, the original Employee table contains data that depends on its key (Employee ID). However, data such as office location and office phone depend on another piece of data, Office code. They do not depend directly on the key, Employee ID. Remove this group of data along with Office code, which it depends on, and make another table called Office. Then we will create a relationship called Works In that connects Employee with Office.
Step 4: resolve the relationships
When you finish the normalization process, your design is almost complete. All you need to do is resolve the relationships.
Resolving relationships that carry data
Some of your relationships may carry data. This situation often occurs in many-to-many relationships.
When this is the case, change the relationship to a table. The key to the new table remains the same as it was for the relationship.
Resolving relationships that do not carry data
In order to implement relationships that do not carry data, you need to define foreign keys. A foreign key is a column or set of columns that contains primary key values from another table. The foreign key allows you to access data from more than one table at one time.
There are some basic rules that help you decide where to put the keys:
One to many In a one-to-many relationship, the primary key in the one is carried in the many. In this example, the foreign key goes into the Employee table.
One to one In a one-to-one relationship, the foreign key can go into either table. If it is mandatory on one side, but not on the other, it should go on the mandatory side. In this example, the foreign key (Head ID) is in the Department table because it is mandatory there.
Many to many In a many-to-many relationship, a new table is created with two foreign keys. The existing tables are now related to each other through this new table.
Step 5: verify the design
Before you implement your design, you need to make sure it supports your needs. Examine the activities you identified at the start of the design process and make sure you can access all the data the activities require:
- Can you find a path to get all the information you need?
- Does the design meet your needs?
- Is all the required data available?
If you can answer yes to all the questions above, you are ready to implement your design.
Final design
The final design of the example looks like this:
For more information about creating a database in PowerBuilder or InfoMaker, see the PowerBuilder
User's Guide
or InfoMaker User's Guide
.
