Relationships: One table (relation) may be linked with another in what is known as a relationship. Relationships may be built into the database structure to facilitate the operation of relational joins at runtime.
- A relationship is between two tables in what is known as a one-to-many or parent-child or master-detail relationship where an occurrence on the ‘one’ or ‘parent’ or ‘master’ table may have any number of associated occurrences on the ‘many’ or ‘child’ or ‘detail’ table. To achieve this, the child table must contain fields which link back the primary key on the parent table. These fields on the child table are known as a foreign key, and the parent table is referred to as the foreign table (from the viewpoint of the child).
- It is possible for a record on the parent table to exist without corresponding records on the child table, but it should not be possible for an entry on the child table to exist without a corresponding entry on the parent table.
- A child record without a corresponding parent record is known as an orphan.
- It is possible for a table to be related to itself. For this to be possible it needs a foreign key which points back to the primary key. Note that these two keys cannot be comprised of exactly the same fields otherwise the record could only ever point to itself.
- A table may be the subject of any number of relationships, and it may be the
parent in some and the child in others.
- Some database engines allow a parent table to be linked via a candidate key, but if this were changed it could result in the link to the child table being broken.
- Some database engines allow relationships to be managed by rules known as referential integrity or foreign key restraints. These will prevent entries onchild tables from being created if the foreign key does not exist on the parent table, or will deal with entries on child tables when the entry on the parent table is updated or deleted.
Relational Joins
The join operator is used to combine data from two or more
relations (tables) in order to satisfy a particular query. Two relations
may be joined when they share at least one common attribute. The join
is implemented by considering each row in an instance of each relation. A
row in relation R1 is joined to a row in relation R2 when the value of
the common attribute(s) is equal in the two relations. The join of two
relations is often called a binary join.
The join of two relations creates a new relation. The notation ‘R1 x
R2’ indicates the join of relations R1 and R2. For example, consider
the following:
Note that the instances of relation RI and R2 contain the same data
values for attribute B. Data normalisation is concerned with
decomposing a relation (e.g. R(A,B,C,D,E) into smaller relations (e.g.
R1 and R2). The data values for attribute B in this context will be
identical in R1 and R2. The instances of R1 and R2 are projections of
the
instances of R(A,B,C,D,E) onto the attributes (A,B,C) and (B,D,E)
respectively. A projection will not eliminate data values duplicate rows
are removed, but this will not remove a data value from any attribute.
The join of relations RI and R2 is possible because B is a common attribute. The result of the join is:
The row (2 4 5 7 4) was formed by joining the row (2 4 5) from
relation R1 to the row (4 7 4) from relation R2. The two rows were
joined since each contained the same value for the common attribute B.
The row (2 4 5) was not joined to the row (6 2 3) since the values of
the common attribute (4 and 6) are not the same.
The relations joined in’ the preceding example shared exactly one
common attribute. However, relations may share multiple common
attributes. All of these common attributes must be used in creating a
join. For example, the instances of relations R1 and R2 in the following
example are joined using the common attributes B and C:
Before the join:
After the join:
The row (6 1 4 9) was formed by joining the row (6 1 4) from relation R1 to the row
(1 4 9) from relation R2. The join was created since the common set of attributes (B and
C) contained identical values (1 and 4). The row (6 1 4) from R1 was not joined to the
row (1 2 1) from R2 since the common attributes did not share identical values - (1 4) in
R1 and (1 2) in R2.
The join operation provides a method for reconstructing a relation
that was decomposed into two relations during the normalisation process.
The join of two rows, however, can create a new row that was not a
member of the original relation. Thus invalid information can be created
during the join process.
Now suppose that a list of courses with their corresponding room
numbers is required. Relations R1 and R4 contain the necessary
information and can be joined using the attribute HOUR. The result of
this join is:
This join creates the following invalid information (denoted by the coloured rows):
• Smith, Jones, and Brown take the same class at the same time from two different instructors in two different rooms.
• Jenkins (the Maths teacher) teaches English.
• Goldman (the English teacher) teaches Maths.
• Both instructors teach different courses at the same time.
Another possibility for a join is R3 and R4 (joined on INSTRUCTOR). The result would be:
This join creates the following invalid information:
• Jenkins teaches Math I and Algebra simultaneously at both 8:00 and 9:00.
A correct sequence is to join R1 and R3 (using COURSE) and then
join the resulting relation with R4 (using both INSTRUCTOR and HOUR).
The result would be:
Extracting the COURSE and ROOM attributes (and eliminating the
duplicate row produced for the English course) would yield the desired
result:
The correct result is obtained since the sequence (R1 x r3) x R4 satisfies the lossless (gainless?) join property
A relational database is in 4th normal form when the lossless join
property can be used to answer unanticipated queries. However, the
choice of joins must be evaluated carefully. Many different sequences of
joins will recreate an instance of a relation. Some sequences are more
desirable since they result in the creation of less invalid data during
the join operation.
Suppose that a relation is decomposed using functional dependencies
and multi- valued dependencies. Then at least one sequence of joins on
the resulting relations exists that recreates the original instance with
no invalid data created during any of the join operations.
For example, suppose that a list of grades by room number is
desired. This question, which was probably not anticipated during
database design, can be answered without creating invalid data by either
of the following two join sequences:
The required information is contained with relations R2 and R4, but these relations
cannot be joined directly. In this case the solution requires joining all 4 relations.
The database may require a ‘lossless join’ relation, which is
constructed to assure that any ad hoc inquiry’ can be answered with
relational operators. This relation may contain attributes that are not
logically related to each other. This occurs because the relation must
serve as a bridge between the other relations in the database. For
example, the lossless join relation will contain all attributes that
appear only on the left side of a functional dependency. Other
attributes may also be required, however, in developing the lossless
join relation.
Consider relational schema R (A, B, C, D), A
B and C
D. Relations
and
are in 4th normal form. A third relation
however,
is required to satisfy the lossless join property. This relation can be
used to join attributes B and D. This is accomplished by joining
relations R1 and R3 and then joining the result to relation
R2. No invalid data is created during these joins. The relation
is the lossless join relation for this database design.
A relation is usually developed by combining attributes about a
particular subject or entity. The lossless join relation, however, is
developed to represent a relationship among various relations. The
lossless join relation may be difficult to populate initially and
difficult to maintain - a result of including attributes that are not
logically associated with each other.
The attributes within a lossless join relation often contain
multi-valued dependencies. Consideration of 4th normal form is important
in this situation. The lossless join relation can sometimes be
decomposed into smaller relations by eliminating the multi-valued
dependencies. These smaller relations are easier to populate and
maintain.
Comments
Post a Comment