Introduction
Today, we are going to take up one of the most important topics in database management systems: data independence. The three levels of data abstraction exist precisely to achieve data independence, which deals with how a change made at one level of the schema affects the other levels. Apart from the meaning of data independence, you will learn about its need and its types here. So, let us get started without further ado!
What is Data Independence?
We are all familiar with the two terms 'data' and 'independence' separately, so we can more or less work out the meaning of the term 'data independence' from them. It is clear from the term itself that we are talking about the independence of the data present in a database. In technical words, you may put it as follows: 'Data independence is the property of a DBMS that allows the user to change the schema definition at one level without having to change the schema definition at the next higher level'. It is essential that a change made at one level does not disturb the other levels. To be precise, data independence refers to the independence, or self-reliance, of the data described at the three levels of the database architecture. Moreover, it helps you keep the data separate from the application programs that use it. It is often seen as a type of data transparency that a centralized database management system is highly concerned with.
Before we look deeper into data independence, let us quickly recall the levels of a database. The internal or physical schema is the first level, and it is in direct contact with the database as it is stored in memory or on disk. The second is the conceptual or logical schema, which serves as a mediator between the first and third levels. The third level is the external schema, which dictates how the database appears to its various end users. Taking a Library database as an example, the three levels look somewhat like this:
Internal Schema | Logical Schema | External Schema |
Unordered files of relations of database | Visitors (id: int, name: string, age: int, contact: numeric, address: string) | View1: BookRecords (b_id: int, b_name: string, author: string) |
First column index of Visitors | Books (id: int, title: string, author: string, isbn: numeric) | View2: VisitorRecords (v_id: int, v_name: string) |
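The Library example above can be tried out in a few lines. What follows is only a rough sketch, assuming SQLite through Python's built-in sqlite3 module: the mapping of 'int' and 'string' to INTEGER and TEXT, the index name idx_visitors_id, the helper build_library, and the sample rows are all made up for illustration and are not part of the original example.

```python
import sqlite3

def build_library(conn):
    """Create the Library example: base tables, one index, and two views."""
    conn.executescript("""
        -- Logical schema: the relations and their attributes
        CREATE TABLE Visitors (
            id INTEGER PRIMARY KEY, name TEXT, age INTEGER,
            contact NUMERIC, address TEXT
        );
        CREATE TABLE Books (
            id INTEGER PRIMARY KEY, title TEXT, author TEXT, isbn NUMERIC
        );
        -- Internal schema: a storage-level choice (index on the first column)
        CREATE INDEX idx_visitors_id ON Visitors (id);
        -- External schema: what each group of end users gets to see
        CREATE VIEW BookRecords AS
            SELECT id AS b_id, title AS b_name, author FROM Books;
        CREATE VIEW VisitorRecords AS
            SELECT id AS v_id, name AS v_name FROM Visitors;
    """)

conn = sqlite3.connect(":memory:")
build_library(conn)

# Invented sample rows, only to show what the views expose
conn.execute("INSERT INTO Books VALUES (1, 'Sample Book', 'A. Author', 1234567890)")
conn.execute("INSERT INTO Visitors VALUES (1, 'Sample Visitor', 30, 5550100, 'Sample Address')")

book_rows = conn.execute("SELECT b_id, b_name, author FROM BookRecords").fetchall()
visitor_rows = conn.execute("SELECT v_id, v_name FROM VisitorRecords").fetchall()
print(book_rows)
print(visitor_rows)
```

Notice that a user of VisitorRecords never sees age, contact, or address, and nobody querying the views needs to know that an index exists at all.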
Data independence separates the data from the application programs that use it, and it absorbs a change made at one level in the mappings between levels instead of passing it on to the neighbouring levels. In this way, it preserves the freedom of the individual levels of the database. From some points of view, data independence and operation independence together bring about data abstraction in a DBMS.
How to achieve data independence in DBMS?
To achieve data independence, we make sure that our database fulfils the requirements of data abstraction. In simple words, data abstraction is the process of keeping irrelevant details hidden from the end user. As a real-world example, take a car. A driver knows perfectly well how to drive a car, but if for some reason the car does not start, the driver will need help from a mechanic. That is because the driver only knows how to drive; the internal circuitry and mechanism of the car are hidden from the driver. Likewise, the internal structure of a database is invisible to programmers and end users. This property of limiting visibility in a database is referred to as data abstraction. A database has three levels of abstraction, listed below:
- Physical level (Internal schema)
- Conceptual level (Logical level)
- View level
The physical level of data abstraction takes care of the internal schema. This level of abstraction defines how the data in a database is stored. It describes the detailed and complex data structures of the database that end users and programmers would not be interested in. It is also considered the lowest level of data abstraction.
The conceptual level, which is the middle level of data abstraction, corresponds to the logical level of the schema. The previous level answers the question 'How?'. Similarly, this level answers the question 'What?'. It describes what data is stored in the database and of which type.
The last and highest level of data abstraction is the view level. This level tells how the data is to be viewed by different users. It is responsible for user interaction with the database.
Let us take a very small-scale example where we store the details of the customers of a store. At the physical level, the data is stored as blocks of storage whose total size may range from bytes to gigabytes or terabytes. Basically, this level deals with complex memory storage, and this information is not visible to the programmers. The logical level describes which customer details are entered and their data types. The logical relationships between the data are also defined at this level, which is mainly dealt with by programmers. At the view level, the user interacts with the system through a GUI to enter the data, maybe through a form or some other set format. Now, each level must be independent of the others, so that when we make changes at one level, we are not required to make changes at the next higher level. And this is what data independence provides.
In order to maintain these three levels of data abstraction, we might need to make changes at one level of the database, which could be a great hassle if not for data independence. You will agree that changing the whole application program to reflect a slight change in the physical schema is not in our favour in terms of time and programming effort. Data independence ensures that modifications at one level do not affect the other levels of the database. On the basis of the three levels of data abstraction, data independence is divided into two types.
Types of Data Independence
Let us learn about the two types of data independence and their properties. The two categories are:
- Physical data independence
- Logical data independence
Physical Data Independence: Under physical data independence, we get the freedom to alter the physical schema without having to alter the application programs. It is responsible for separating the internal level from the conceptual level of a database structure. Physical data independence lets us give a logical description or overview of the database without having to specify the details of its physical structure. Under physical data independence, changes made at the internal level are not supposed to change the definition of the conceptual level or view level schema.
Physical independence lets you modify the file storage structures, hashing algorithms, compression techniques, storage devices, location of the database, access methods, indexes, and so on. So, basically, it deals with the implementation of efficient storage techniques. Any change made at this level is applied to the mapping between the internal and conceptual levels of the database. Keep in mind that the modifications introduced must be localized. Physical data independence is achieved through the mapping that transforms the conceptual level into the internal level of the database.
Occasionally, we are required to update the internal level to improve the performance of our DBMS in terms of memory management. Thus, physical data independence undeniably plays a vital role, since changing the storage techniques to suit our requirements is something an efficient DBMS must support.
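To see physical data independence in action, here is a minimal sketch, again assuming SQLite through Python's sqlite3 module. The table contents, the index name idx_visitors_age, and the helper functions are hypothetical. The application query is written once; an index is then added at the physical level, and the same query is run again.

```python
import sqlite3

# The application only knows the logical schema: table and column names.
QUERY = "SELECT name FROM Visitors WHERE age = ? ORDER BY name"

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Visitors (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO Visitors (name, age) VALUES (?, ?)",
        [("Asha", 30), ("Ben", 25), ("Chen", 30)],  # invented sample rows
    )
    return conn

def names_aged(conn, age):
    """Application code: unchanged before and after the physical change."""
    return [row[0] for row in conn.execute(QUERY, (age,))]

def query_plan(conn, age):
    """Ask SQLite how it intends to run QUERY (a physical-level detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (age,)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description

conn = make_db()
before = names_aged(conn, 30)
plan_before = query_plan(conn, 30)

# Physical-level change only: add an index. No table or view definition changes.
conn.execute("CREATE INDEX idx_visitors_age ON Visitors (age)")

after = names_aged(conn, 30)
plan_after = query_plan(conn, 30)

print(before == after)  # the application sees the same answer
print(plan_before)
print(plan_after)
```

The two result lists are identical, while the printed plans let you check whether SQLite chose a different access path once the index existed. The exact plan text depends on the SQLite version, so it is not reproduced here.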
Logical Data Independence: Logical data independence gives us the freedom to change the conceptual level of the schema without being compelled to change the external views or the application programs. Modifications made at this level are absorbed by the mapping between the logical level and the end-view level. Because application programs depend heavily on the conceptual level, logical data independence is harder to attain than physical data independence. In practice, many changes to the logical structure of the database still require us to change the programs as well. Thus, achieving logical data independence can be rather challenging. Logical data independence governs the separation between the end-view level and the conceptual level.
Logical data independence allows us to make changes like adding, modifying, or deleting an attribute, an entity, or even a relationship. Such modifications should not call for rewriting the application program; at most, they call for small corresponding alterations. It enables us to merge two records into one without affecting the external layer. Likewise, an existing record can be split into two without interfering with the end-user view level structure of the database.
Making timely modifications to the conceptual level in order to keep your DBMS up to date is essential. This is why logical data independence is said to play a pivotal role. It not only helps improve the performance and speed of the DBMS but also helps make your database handier and more reliable.
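The record-splitting case can be sketched as follows, once more assuming SQLite and Python's sqlite3 module. The table names VisitorNames and VisitorContacts and the sample rows are invented for this illustration. The application reads only from the view VisitorRecords; the base table is then split into two, and the view is redefined so that it keeps the same columns.

```python
import sqlite3

# The application reads from the external view only.
APP_QUERY = "SELECT v_id, v_name FROM VisitorRecords ORDER BY v_id"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Visitors (id INTEGER PRIMARY KEY, name TEXT, contact TEXT, address TEXT);
    INSERT INTO Visitors VALUES (1, 'Asha', '555-0101', '12 Elm St');
    INSERT INTO Visitors VALUES (2, 'Ben', '555-0102', '34 Oak Ave');
    CREATE VIEW VisitorRecords AS SELECT id AS v_id, name AS v_name FROM Visitors;
""")
before = conn.execute(APP_QUERY).fetchall()

# Conceptual-level change: split Visitors into two relations.
conn.executescript("""
    DROP VIEW VisitorRecords;
    CREATE TABLE VisitorNames (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE VisitorContacts (id INTEGER PRIMARY KEY, contact TEXT, address TEXT);
    INSERT INTO VisitorNames SELECT id, name FROM Visitors;
    INSERT INTO VisitorContacts SELECT id, contact, address FROM Visitors;
    DROP TABLE Visitors;
    -- Update the external/conceptual mapping so the view keeps its shape
    CREATE VIEW VisitorRecords AS SELECT id AS v_id, name AS v_name FROM VisitorNames;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before == after)  # APP_QUERY was never edited
```

Only the view definition, that is, the mapping between the two levels, had to change. A program that queried the Visitors table directly would not have been so lucky, which is why this kind of independence depends on applications going through the external level.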
Advantages of Data Independence
Data independence is indisputably one of the most significant characteristics of a database management system. There are several reasons that justify the need for data independence in a DBMS. So, let us take a look at its advantages.
- Data Quality – Data independence helps in boosting the quality of the data stored in a database. Since modifying the database structure becomes more convenient with data independence, data storage becomes more efficient. It also makes it easier to keep the data complete and intact, that is, to preserve its integrity. Hence, it results in an improvement in the quality of the data stored.
- Cost-effective Maintenance – Data independence saves us from the trouble of making changes at all the schema levels of our database when a change is required at only one of them. As a result, maintaining our database becomes considerably more affordable.
- Security Aspect – Proper enforcement of the standards and protocols for shielding the database becomes easier. Therefore, data independence is indeed helpful in improving database security.
- Developers Focusing on General Structure – Developers can focus solely on handling and updating the logical structure without being bothered about the internal implementation. Changes to the internal level are absorbed by the mapping between the conceptual level and the internal level.
- Reduction in Data Incongruity – Data independence supports modifying our database structure in order to increase its compatibility. This helps in controlling data incongruity.
- Improving Performance – Data independence fuels the cause of data abstraction. Besides, it facilitates the smooth implementation of new changes. As a result, data accessing, retrieval or data modification becomes speedy and convenient. This is how data independence proves to be useful in improving database performance.
As we can infer from the points listed above, data independence is one of the weapons a DBMS uses to overcome the drawbacks of file-based systems. One may see it as the immunity of user applications against changes made to the schema definition and data organization.
Every coin has two faces, and the same goes for data independence. Now that you are convinced that data independence is key to the reliability and security of a database, you must be wondering about its darker side. There are two major shortcomings that we commonly come across while dealing with data independence. The first is the increased complexity that comes with adopting it: databases have to be carefully designed to make optimum use of data independence. The second disadvantage is related to the first. Application programs depend heavily on the logical schema of the data, so changing the conceptual structure often means changing the respective application programs as well.
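The second shortcoming is easy to reproduce. The sketch below, which assumes SQLite via Python's sqlite3 module and uses made-up table names (Customers, Clients) and sample rows, shows an application function written directly against a base table. Once that table is renamed at the conceptual level, the unchanged function fails.

```python
import sqlite3

def fetch_names(conn):
    """Application code written directly against the base table Customers."""
    return [row[0] for row in conn.execute("SELECT name FROM Customers ORDER BY name")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Customers (name) VALUES ('Asha'), ('Ben')")
names_before = fetch_names(conn)

# Conceptual-level change: the relation gets a new name.
conn.execute("ALTER TABLE Customers RENAME TO Clients")

try:
    fetch_names(conn)
    broke = False
    error_message = ""
except sqlite3.OperationalError as exc:
    # The program depended on the old logical schema, so it now fails.
    broke = True
    error_message = str(exc)

print(names_before)
print(broke, error_message)
```

Here the fix is to edit the query inside fetch_names, which is exactly the kind of program change that full logical data independence is meant to avoid.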
We hope this blog has shed proper light on data independence and added value to your knowledge of DBMS. Apart from data independence, there are several other factors that have a great impact on a database. To get a good grasp of such relevant DBMS topics, you can refer to the other blogs of Great Learning. You might also prefer a crash course on DBMS with a fine chance of earning a certificate; we have numerous certified courses that will benefit you in one way or another. Stay tuned and continue your learning journey with us. Happy learning!