Normalization Concept

dbtut November 26, 2018 General, MSSQL

The concept of normalization was created by Edgar F. Codd, the founder of the relational database model. During the design of the tables, we can achieve performance improvement by using normalization at an appropriate level. However, data consistency and integrity are maintained.

Normalization procedures consist of certain levels. It does not go to the other level before the operations in one level are completed.

Table and column based operations are performed at each level in order to maintain more atomic data. Basically there are 6 normalization levels. However, we will describe the 3 most commonly used levels of normalization.

Normalization Level 1:

Normalization Level 1 can be considered as the lowest level. For a database to have normalization at this level, the following rules must be followed.

The absence of repetitive columns in the same table
Only one value in each column
An identifier key column for each row

We can see the reasons for these requirements in the examples below.

Name	Surname	Phone 1	Phone 2	Phone 3
Nurullah	Cakir	(123) 4567890	(123) 1234567	(123)1231231
Hakan	Gurbaslar	(123) 1111111

In the above example, the person named Nurullah Cakir has 3 phone information. However, because the person named Hakan Gürbaşlar has only one phone information, the “Phone 2” and “Phone 3” fields are empty. If a person with 4 or more phones is added to the Table, the new columns will have to be opened. This will unnecessarily increase the number of columns in the table. Therefore, there should be no repetitive columns in the same table.

Name	Surname	Phone
Nurullah	Cakir	(123) 4567890 , (123) 1234567 , (123)1231231
Hakan	Gurbaslar	(123) 1111111

If we create the table as above, the number of columns will not be increased. However, if we want to find the person with the phone “(123) 1234567” or to delete the “(123) 1234567” phone, the operation will be more complicated.

Name	Surname	Phone
Nurullah	Cakir	(123) 4567890
Nurullah	Cakir	(123) 1234567
Nurullah	Cakir	(123) 1231231
Hakan	Gurbaslar	(123) 1111111

When we create the table as above, the phone information will be in one column and a single phone information is stored on each row. However, we do not know whether the records belonging to Nurullah Cakir belong to the same person. Therefore, we can add a new column and determine if these records belong to same people.

ID	Name	Surname	Phone
1	Nurullah	Çakır	(123) 4567890
1	Nurullah	Çakır	(123) 1234567
1	Nurullah	Çakır	(123)1231231
2	Hakan	Gürbaşlar	(123) 1111111

Normalization Level 2:

In order to be able to follow the steps in this level of normalization, Normalization Level 1 procedures must be completed.

Then we have to do the following checks:

Creating different tables for records that contain the same subset
Identify the foreign key to define the relationship between the sub table and the actual table.

When we look at the table in our first normalization level, we see that there are repeated records. Then we have to keep the contact information in a separate table, and the phone information of the contacts in a separate table.

Let’s create the contact table as follows.

ID	Name	Surname
101	Nurullah	Cakir
102	Hakan	Gurbaslar

Convert the phone table to the following.

ID	Phone
1	(123) 4567890
2	(123) 1234567
3	(123)1231231
4	(123) 1111111

In this table, it is not clear which phone belongs to which person. When we add the ID value that we created for the contact table, the table is as follows.

ID	Contact ID	Phone
1	101	(123) 4567890
2	101	(123) 1234567
3	101	(123)1231231
4	102	(123) 1111111

This means that we can access the ID in the Contact table from the phone number. We also access all of the contact’s information from the ID value in the contact table.

Normalization Level 3:

In order to be able to follow the steps in this level of normalization, Normalization Level 2 procedures must be completed.

At this level, it is also checked that each non-primary column is not connected to another column.

Add the city and county columns to the contact table.

ID	Name	Surname	City	County
101	Nurullah	Cakir	Ankara	Yenimahalle
102	Hakan	Gurbaslar	Ankara	Yenimahalle

City column(its not primary key) is related to the County column(its not primary key too).

Therefore we need to create a separate table for City and County information.

In OLTP systems, performance will increase as the level of normalization increases.

In OLAP systems, performance will decrease as the level of normalization increases.

Database Tutorials MSSQL, Oracle, PostgreSQL, MySQL, MariaDB, DB2, Sybase, Teradata, Big Data, NOSQL, MongoDB, Couchbase, Cassandra, Windows, Linux

Normalization Concept

Normalization Level 1:

Normalization Level 2:

Normalization Level 3:

About dbtut

Leave a Reply Cancel reply