Fourth Normal Form (4NF)
- We saw that BC-schema was in BCNF, but still was not an ideal design as it suffered from repetition of information. We had the multivalued dependency cname street ccity, but no non-trivial functional dependencies.
- We can use the given multivalued dependencies to improve the database design by decomposing it into fourth normal form.
- A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in of the form , where and , at least one of the following hold:
- is a trivial multivalued dependency.
- is a superkey for schema R.
- A database design is in 4NF if each member of the set of relation schemas is in 4NF.
- The definition of 4NF differs from the BCNF definition only in the use of multivalued dependencies.
- Every 4NF schema is also in BCNF.
- To see why, note that if a schema is not in BCNF, there is a non-trivial functional dependency holding on R, where is not a superkey.
- Since implies , by the replication rule, R cannot be in 4NF.
- We have an algorithm similar to the BCNF algorithm for decomposing a schema into 4NF:
result := ;
done := false;
while (not done) do
if (there is a schema in result
that is not in 4NF)
let be a nontrivial multivalued
dependency that holds on such that
is not in , and
else done = true;
- If we apply this algorithm to BC-schema:
- cname loan# is a nontrivial multivalued dependency and cname is not a superkey for the schema.
- We then replace BC-schema by two schemas:
Customer-schema=(cname, street, ccity)
- These two schemas are in 4NF.
- We can show that our algorithm generates only lossless-join decompositions.
- Let R be a relation schema and D a set of functional and multivalued dependencies on R.
- Let and form a decomposition of R.
- This decomposition is lossless-join if and only if at least one of the following multivalued dependencies is in :
- We saw similar criteria for functional dependencies.
- This says that for every lossless-join decomposition of R into two schemas and , one of the two above dependencies must hold.
- You can see, by inspecting the algorithm, that this must be the case for every decomposition.
- Dependency preservation is not as simple to determine as with functional dependencies.
- Let R be a relation schema.
- Let be a decomposition of R.
- Let D be the set of functional and multivalued dependencies holding on R.
- The restriction of D to is the set consisting of:
- All functional dependencies in that include only attributes of .
- All multivalued dependencies of the form where and is in .
- A decomposition of schema R is dependency preserving with respect to a set D of functional and multivalued dependencies if for every set of relations such that for alli, satisfies , there exists a relation r(R) that satisfies D and for which for all i.
- What does this formal statement say? It says that a decomposition is dependency preserving if for every set of relations on the decomposition schema satisfying only the restrictions on D there exists a relation r on the entire schema R that the decomposed schemas can be derived from, and that r also satisfies the functional and multivalued dependencies.
- We’ll do an example using our decomposition algorithm and check the result for dependency preservation.
- Let R=(A,B,C,G,H,I).
- Let D be
- R is not in 4NF, as we have and A is not a superkey.
- The algorithm causes us to decompose using this dependency into
- is now in 4NF, but is not.
- Applying the multivalued dependency (how did we get this?), our algorithm then decomposes into
- is now in 4NF, but is not.
- Why? As is in (why?) then the restriction of this dependency to gives us .
- Applying this dependency in our algorithm finally decomposes into
- The algorithm terminates, and our decomposition is and .
- Let’s analyze the result.
Figure 1: Projection of relation r onto a 4NF decomposition of R.
- This decomposition is not dependency preserving as it fails to preserve .
- Figure 1 shows four relations that may result from projecting a relation onto the four schemas of our decomposition.
- The restriction of D to (A,B) is and some trivial dependencies.
- We can see that satisfies as there are no pairs with the same A value.
- Also, satisfies all functional and multivalued dependencies since no two tuples have the same value on any attribute.
- We can say the same for and .
- So our decomposed version satisfies all the dependencies in the restriction of D.
- However, there is no relation r on (A,B,C,G,H,I) that satisfies D and decomposes into and .
- Figure 2 shows .
- Relation r does not satisfy .
- Any relation s containing r and satisfying must include the tuple .
- However, includes a tuple that is not in .
- Thus our decomposition fails to detect a violation of .
- We have seen that if we are given a set of functional and multivalued dependencies, it is best to find a database design that meets the three criteria:
- Dependency Preservation.
- If we only have functional dependencies, the first criteria is just BCNF.
- We cannot always meet all three criteria. When this occurs, we compromise on 4NF, and accept BCNF, or even 3NF if necessary, to ensure dependency preservation.