# Fourth Normal Form (4NF)

## Fourth Normal Form (4NF)

- We saw that
*BC-schema*was in BCNF, but still was not an ideal design as it suffered from repetition of information. We had the multivalued dependency*cname**street ccity*, but no non-trivial functional dependencies. - We can use the given multivalued dependencies to improve the database design by decomposing it into
**fourth normal form**. - A relation schema
*R*is in 4NF with respect to a set*D*of functional and multivalued dependencies if for all multivalued dependencies in of the form , where and , at least one of the following hold:- is a trivial multivalued dependency.
- is a superkey for schema
*R*.

- A database design is in 4NF if each member of the set of relation schemas is in 4NF.
- The definition of 4NF differs from the BCNF definition only in the use of multivalued dependencies.
- Every 4NF schema is also in BCNF.
- To see why, note that if a schema is not in BCNF, there is a non-trivial functional dependency holding on
*R*, where is not a superkey. - Since implies , by the replication rule,
*R*cannot be in 4NF.

- We have an algorithm similar to the BCNF algorithm for decomposing a schema into 4NF:
*result*:= ;*done*:= false;compute ;

**while (not***done*)**do****if**(there is a schema in*result*that is not in 4NF)

**then begin**let be a nontrivial multivalued

dependency that holds on such that

is not in , and

;

*result*=**end****else***done*= true; - If we apply this algorithm to
*BC-schema*:*cname**loan#*is a nontrivial multivalued dependency and*cname*is not a superkey for the schema.- We then replace
*BC-schema*by two schemas:*Cust-loan-schema=(cname, loan#)**Customer-schema=(cname, street, ccity)* - These two schemas are in 4NF.

- We can show that our algorithm generates only lossless-join decompositions.
- Let
*R*be a relation schema and*D*a set of functional and multivalued dependencies on*R*. - Let and form a decomposition of
*R*. - This decomposition is lossless-join if and only if at least one of the following multivalued dependencies is in :
- We saw similar criteria for functional dependencies.
- This says that for
**every**lossless-join decomposition of*R*into two schemas and , one of the two above dependencies must hold. - You can see, by inspecting the algorithm, that this must be the case for every decomposition.

- Let
- Dependency preservation is not as simple to determine as with functional dependencies.
- Let
*R*be a relation schema. - Let be a decomposition of
*R*. - Let
*D*be the set of functional and multivalued dependencies holding on*R*. - The
**restriction**of*D*to is the set consisting of:- All functional dependencies in that include only attributes of .
- All multivalued dependencies of the form where and is in .

- A decomposition of schema
*R*is dependency preserving with respect to a set*D*of functional and multivalued dependencies if for every set of relations such that for all*i*, satisfies , there exists a relation*r*(*R*) that satisfies*D*and for which for all*i*.

- Let
- What does this formal statement say? It says that a decomposition is dependency preserving if for every set of relations on the decomposition schema satisfying only the restrictions on
*D*there exists a relation*r*on the entire schema*R*that the decomposed schemas can be derived from, and that*r*also satisfies the functional and multivalued dependencies. - We’ll do an example using our decomposition algorithm and check the result for dependency preservation.
- Let
*R*=(*A*,*B*,*C*,*G*,*H*,*I*). - Let D be
*R*is not in 4NF, as we have and*A*is not a superkey.- The algorithm causes us to decompose using this dependency into
- is now in 4NF, but is not.
- Applying the multivalued dependency (how did we get this?), our algorithm then decomposes into
- is now in 4NF, but is not.
- Why? As is in (why?) then the restriction of this dependency to gives us .
- Applying this dependency in our algorithm finally decomposes into
- The algorithm terminates, and our decomposition is and .

- Let
- Let’s analyze the result.

**Figure 1:**Projection of relation*r*onto a 4NF decomposition of*R*.- This decomposition is not dependency preserving as it fails to preserve .
- Figure 1 shows four relations that may result from projecting a relation onto the four schemas of our decomposition.
- The restriction of
*D*to (*A*,*B*) is and some trivial dependencies. - We can see that satisfies as there are no pairs with the same
*A*value. - Also, satisfies all functional and multivalued dependencies since no two tuples have the same value on any attribute.
- We can say the same for and .
- So our decomposed version satisfies all the dependencies in the restriction of
*D*. - However, there is no relation
*r*on (*A*,*B*,*C*,*G*,*H*,*I*) that satisfies*D*and decomposes into and . - Figure 2 shows .
- Relation
*r*does not satisfy . - Any relation
*s*containing*r*and satisfying must include the tuple . - However, includes a tuple that is not in .
- Thus our decomposition fails to detect a violation of .

- We have seen that if we are given a set of functional and multivalued dependencies, it is best to find a database design that meets the three criteria:
- 4NF.
- Dependency Preservation.
- Lossless-join.

- If we only have functional dependencies, the first criteria is just BCNF.
- We cannot always meet all three criteria. When this occurs, we compromise on 4NF, and accept BCNF, or even 3NF if necessary, to ensure dependency preservation.