- What is DBMS?
- What is Database?
- Why Learn DBMS?
- Advantages of DBMS
- Disadvantages of DBMS
- Application of DBMS
- Few important terminology related to database are listed below:
- The different types of database management systems are-
- Actors/ Users:
- RELATIONAL ALGEBRA
- Unary Relational Operations
- Binary Relational Operations
- ER to Relational model
- SQL OVERVIEW
- DDL:
- DML:
- TCL:
- DCL:
- Relational Database Design
- Constraints in DBMS
- DATABASE NORMALISATION
- KEYS
- Keys in DBMS
- FUNCTIONAL DEPENDENCIES
- 2NF (Second Normal Form)
- 3NF (Third Normal Form)
- BCNF ( Boyce Codd Normal Form)
- 4NF(Fourth Normal Form)
- 5NF(Fifth Normal Form)
- JOINS
- DBMS Tutorial FAQs
What is DBMS?
Database Management System (DBMS) is software that enables its users to manage, access the databases and also helps in manipulating and reporting the data. Some of the examples of DBMS are MySQL, PostgreSQL, Microsoft Access, SQL Server, Oracle, etc. Check out database management courses.
This DMBS tutorial explains the basic of DMBS such as important terminologies, types of DBMS,E-R model, relation model,Constraints and much more.
DBMS has a wide range of applications in various sectors like banking, Finance, Telecom, Manufacturing etc. Traditional file systems have certain drawbacks that lead to the databases. Those are listed below:
- Main memory will be limited and the data to be processed needs more space.
- Special application programs are needed for the specific needs of the users.
- Consistency issues arise during system crashes.
Take a look at our Free database management systems course
What is Database?
The database is defined as a collection of data, that helps in creating, editing and maintaining database files and records. Database software handles data storage, data backup, and security. Database software is also called as “database management system” (DBMS).
Examples of database are: Microsoft Access, Oracle, MySQL, FoxPro etc.
DBMS makes the preceding tasks easier. The main characteristics of the database approach are-
- Self-describing nature of the database.
- Programs and data are insulated such that change in the data does not require a change in the program.
- Same data can be shown differently to different viewers.
- Data can be shared between the users, and transactions can be interleaved maintaining consistency.
Why Learn DBMS?
This DBMS tutorial helps you to understand why to learn DBMS
- Real-world entity
- Relation-based tables
- Isolation of data and application
- Less redundancy
- Consistency
- Query Language
Advantages of DBMS
- DBMS can control database redundancy
- DBMS allow authorized users of the organizations to share the data.
- DBMS can be maintained easily due to its centralized nature.
- DBMS reduces development time and maintenance need.
- DBMS provides backup and recovery subsystems
- DBMS offers Data Integrity and Security
Disadvantages of DBMS
- DBMS requires a high-speed data processor so the cost of both hardware and software is high.
- DBMS occupies a large space of disks and memory.
- In DBMS all the data is stored in a single database, and if the databases are damaged due to power failure then we may lose all the data.
- DBMS system is a complex system, it requires training for the users to use DBMS.
- DBMS can’t perform high-end calculations
Application of DBMS
Some of the fields where a database management system is used given below:
HR Management: Employees’ information and salary-related activities can be managed in DBMS software.
Banking: Customer records , customer information, account activities, payments, deposits, loans can be managed by DBMS software
Educational Institution: All the information regarding the students is handled by DBMS software.
Finance: For storing information about stock, sales, is handled by DBMS software.
Sales: DBMS is used for storing customer’s information.
Few important terminology related to database are listed below:
Metadata – the information stored in the catalog.
Data Abstraction – is the program data independence and program operation independence.
Database schema – describes the database. It is divided into 3 levels. External, Conceptual and Internal.
EXTERNAL SCHEMA |
|
CONCEPTUAL SCHEMA |
|
INTERNAL SCHEMA |
Logical Data Independence – Ability to change the conceptual schema without any need to change the external schema.
Physical Data Independence – Ability of the database to change its internal schema without the need for changing its conceptual schema.
DBMS can be organised in various manners.
The different types of database management systems are-
- Relational Database systems – Here all the data is organised in the form of tables called the relations.
- Hierarchical database systems – here the data is stored in the form of parent child relationships. These are confined to specific uses.
- Network database systems – These also follow hierarchy but the relationships are many to many.
- Object oriented databases – Here the data is stored as objects and relationships exist between various objects.
Actors/ Users:
The various types of users include:
- Application programmers
- Database Administrators
- End-users
RELATIONAL ALGEBRA
We all know that relational databases contain data in the form of relations. A relation might have an association with some other relations. E.F Codd had modelled data in the form of relations called the relational model. To get the information from the relational model, we need to use some formal language. The languages that are used for the relational model are Relational Algebra and Relational Calculus. Both these languages explain the concepts, but won’t implement them. Relational algebra is a procedural query language which explains how to perform a series of actions to get the required output. In contrast, the Relational Calculus is a non-procedural query language that specifies what the output should be without specifying the actions to be taken. Here we are going to discuss the relational algebra in detail.
To get the tasks done we need to perform operations. In Relational Algebra, we have two types of operations:
- Unary Relational Operations
- Binary Relational Operations.
As the name indicates, unary relational operations require one relation to operate with and the binary relational operations require two relations.
Unary Relational Operations
1. SELECT ( σ )
Syntax: σ p(R) where p is the predicate formula that we will apply on a relation R
The select operation can be applied on a single relation, to retrieve the rows from the table that satisfy the given predicate or condition.
The predicate will be a formula that can include connectors like AND,OR, NOT. it can have operators like =, <,>, <=, >=, !=.
For example: Course relation is as below:
Course_id | Course_name | Duration(in years) | exam_type |
1001 | Diploma in Software engineering | 2 | 3 |
1002 | B.Tech in Software Engineering | 4 | 3 |
1003 | M.Tech in Software Engineering | 2 | 3 |
1004 | Integrated B.Tech in SE | 5 | 3 |
1005 | Certificate Course on SE | 1 | 2 |
Here the exam_type is a categorical variable with values 1=practical, 2=theory, 3= both
Input: σ exam_type=3 (Course)
Output:
Course_id | Course_name | Duration(in years) | exam_type |
1001 | Diploma in Software engineering | 2 | 3 |
1002 | B.Tech in Software Engineering | 4 | 3 |
1003 | M.Tech in Software Engineering | 2 | 3 |
1004 | Integrated B.Tech in SE | 5 | 3 |
2. PROJECT (π)
Syntax: ∏ A1, A2, …An (R)
The specified column values will be projected.
Duplications will be removed.
Example: ∏ course_id, course_name (Course)
Output:
Course_id | Course_name |
1001 | Diploma in Software engineering |
1002 | B.Tech in Software Engineering |
1003 | M.Tech in Software Engineering |
1004 | Integrated B.Tech in SE |
1005 | Certificate Course on SE |
3. RENAME ( ρ )
Syntax : ρ(new_name, old_name)
To change the name of the attributes of a relation.
Example: ρ(NEW_COURSE,COURSE)
Binary Relational Operations
1. JOIN:
Syntax: R1 ⋈R2
Two or more relations can be joined based on some condition given.
It is nothing but the Cartesian Product followed by the Select operation.
Example: Course ⋈ Instructor
The Instructor table is given below:
Course_id | Instructor_ID |
1001 | 2034 |
1002 | 3056 |
1003 | 1092 |
1004 | 4089 |
1005 | 0199 |
And the output is given below:
Course_id | Course_name | Duration(in years) | exam_type | Instructor_ID |
1001 | Diploma in Software engineering | 2 | 3 | 2034 |
1002 | B.Tech in Software Engineering | 4 | 3 | 3056 |
1003 | M.Tech in Software Engineering | 2 | 3 | 1092 |
1004 | Integrated B.Tech in SE | 5 | 3 | 4089 |
1005 | Certificate Course on SE | 1 | 2 | 0199 |
There are different types of joins and were dealt in detail in the later part of the tutorial.
2. DIVISION
Syntax : R1 ÷ R2
Performing a Division operation is nothing but doing Project, Cartesian Product and Set Difference operations in order.
This operation can be done only if the attributes in R2 are a proper subset of that of R1.
It includes the attributes that are in R1 but not in R2.
It includes those tuples from the relationR1 that are associated with every tuple in the relation R2.
Relational Algebra has some operations that were in the set theory. Those were listed below:
1. UNION (U)
Syntax: A U B
- Can be applied to two relations if both the relations have the same number of attributes.
- The domains of the attributes must be compatible.
- All the tuples from both the relations will be included in the result eliminating the duplicates
2. INTERSECTION ( ∩)
Syntax: A ∩ B
- Can be applied on two relations if they both have the same number of attributes
- The domains of the attributes must be compatible.
- The tuples that are present in both the relations will be chosen for the output.
3. SET DIFFERENCE (-)
Syntax: A – B
- Includes only those tuples that are in the relation A but not in the relation B.
- The attribute names in relation A should match with the attribute names in relation B
- Both the relations should have the same number of attributes and also their domains must be compatible.
4. CARTESIAN PRODUCT or CROSS PRODUCT ( X )
Syntax: A X B
- To merge a row of a relation with a row in another relation.
- Stand-alone cartesian products are never useful in relational databases.
Few points to be noted regarding the above operations are:
a) Combination of the Relational Algebra Selection and Projection is nothing but the SQL SELECT command.
b) SQL SELECT does not remove the duplicate tuples whereas the relational algebra PROJECT will remove duplicate tuples.
c) Relational Algebra SELECT is a horizontal partition whereas the PROJECT is a vertical partition.
d) The SELECT operation is commutative whereas the PROJECT operation is not commutative
e) The UNION and the INTERSECTION are both commutative and associative, ie.,
AUB = BUA, A∩ B = B∩ A (commutative) and (AU B) U C = A U (B U C),
(A ∩ B)∩ C = A ∩(B ∩ C) (which is the associative rule).
The set of operations {σ, π, U, -, ρ, X } is a complete set. Any relational algebra expression can be expressed as a sequence of the above operations.
ER to Relational model
We often come across two different models, viz., the ER model and the Relational model. The ER model means the Entity-Relationship model. The Relational model tells us how the data is stored in the form of relations in the database. Then what is the need for the ER model? The ER model is the high-level conceptual data model that is required to systematically analyze the data requirements. Generally, the clients are not sure of how to model their data into tables. This ER model helps in visualizing their thoughts with the help of diagrams. This comes in the design phase of software development. It identifies what the real-world objects that exist in the system and their association with other objects are. In the ER model terminology, those objects are called the entities, and the association between them is called the relationship. In short, we can say that the ER model is the blueprint of the system that we are going to implement. The important concepts in the ER model are entities, attributes and relationships. The characteristics of the features of the entities are the attributes of the entities. Let us see how different types of entities, attributes and relationships are represented in the ER diagram.
Entity or Strong Entity is represented by a rectangle.
Weak Entity is represented by a double rectangle.
Key Attribute is represented by Ellipses with the name underlined
Multivalued Attribute is represented by Double ellipses
Derived attribute is represented by Dashed ellipses
Composite attribute is represented by ellipse connected to the other ellipses.
All the above are represented as below:
Relationship is represented by a diamond and the cardinality is mentioned on the lines connecting the entities involved in the relationship.
Inheritance is represented by a triangle for generalisation and a circle for its implementation.
Now let us see how the ER model is mapped into the Relational model.
Step1: Strong Entities – For each strong entity, create one table (relation).
Include all the simple attributes of the strong entity as the columns of the relation.
Only the simple components of the composite attribute need to be included in the relation.
Choose the primary key of the above created relation.
Step 2: Weak Entities – For each weak entity, a new relation has to be created.
All the simple attributes need to be included.
Include the primary key of the owner entity as the foreign key of the newly created relation.
The primary key of the newly created table will be the combination of the primary key of the owner entity and the partial key of the weak entity.
The mapping order should be first the owner entity and then the weak entity.
Step 3: Binary 1:1 relationship
For a relationship R, identify the participating relations R1 and R2.
In this relationship type, the mapping can be done in 3 ways.
a) i) Choose one relation R1.
ii) The primary key of the other relation R2 should be included in the relation R1 as foreign key.
iii) The entity that has a total participation in the relationship R should be taken as the relation
b) i) merge the relationship and the entities into one relation
ii) this option is chosen when both the entities are in total participation.
c) i) Create a new relation for the relationship R.
ii) Include the primary keys of both the relations in the relation.
iii) The primary key of the new relation is the combination of the primary keys of the participating relations in the relationship.
Step 4. Binary 1:N Relationship
For the relationship R with R1 and R2 as the participating relations with R1 on N and R2 on 1 side, do the following:
i) In R1 include the primary key of R2 as the foreign key.
ii) Any simple attributes of the relationship R are to be included in the relation R1.
Step 5: Binary M:N Relationship
i) Create a new relation by including the primary keys of both the participating entities as the foreign key.
ii) The primary key of the new relation is the combination of the primary keys of the participating entities.
iii) All the simple attributes of the relationship are to be included in the new relation.
Step 6: Multivalued attributes
A new relation R has to be created for each multivalued attribute A.
Also include the primary key of the relation/entity that has the attribute A in it, as the foreign key in R.
Step 7: N-ary relationship
A new relation S has to be created to represent the relationship R.
The primary keys of all the participating entities have to be included in S as the foreign key.
The primary key of the new relation is the combination of all the foreign keys.
SQL OVERVIEW
As the name indicates, SQL, Structured Query Language is a language used to question the database which is in a structured format. As relational databases store data in a structured format, SQL is used with RDBMS. RDBMS like Oracle, Ms Access, MySQL, Sybase etc. use SQL for data query operations. In this tutorial, we are going to cover all the important SQL commands and their usage. Please note that there will be slight variations in the syntax depending on the SQL dialect you are using, and hence syntax is not mentioned here.
All the commands in the SQL can be grouped into four categories, viz, Data Definition Language (DDL), Data Manipulation Language(DML), Transaction Control Language (TCL) and Data Control Language(DCL). DDL commands change the metadata of the database. DML commands are used to change the data in the database. DCL deals with the schema elements’ security aspects (such as tables, views, constraints, domains etc.). TCL commands are used for transactional purposes.
The usage of the various commands is summarised below:
DDL:
- CREATE SCHEMA/DATABASE schema_name/database_name AUTHORIZATION user_name – to create a schema or database giving privileges to a particular user_name.
- a) CREATE TABLE table_name (column1 datatype, column2 datatype, … )
– to create an empty table with columns and data types as specified.
b) CREATE TABLE new_table_name ( SELECT AS column1, column2, …. FROM old_table_name WHERE … )
– to create a new table from an existing table that satisfies a condition in the WHERE clause and fills the new table with rows from the old table. - CREATE DOMAIN domain_name AS type_name
– creates a domain name for a particular data type. - a) ALTER TABLE table_name ADD column_name data_type
– to add a new column to an existing table
b) ALTER TABLE table_name DROP COLUMN column_name
– to drop a column from an existing table.
c) ALTER TABLE table_name RENAME TO new_tabel_name
– to rename an existing table. - DROP DATABASE/TABLE/INDEX/VIEW database_name/table_name/index_name/view_name
– to drop an existing database or table or index or view. - TRUNCATE TABLE table_name
– to remove the data from the table.
DML:
- a) INSERT INTO table_name VALUES (val1, val2, val3…)
– to insert values into the table for all the attributes in the table.
b) INSERT INTO table_name(col1, col2, col3, …. ) VALUES (val1, val2, val3….)
– to insert values into the table for the specified columns of the table. - a) DELETE FROM table_name
– to delete all the tuples from a table.
b) DELETE FROM table_name WHERE condition
– to delete some tuples from a table that satisfies the given condition. - SELECT col1, col2,col3, … FROM table_name WHERE condition1
– outputs the selected columns from the given table that satisfy the condition.
TCL:
- SET TRANSACTION [ READ_WRITE | READ_ONLY ]
– to start a transaction with the operations that are allowed in it. - COMMIT
– all the instructions before the commit command should be permanently stored in the database. - a) ROLLBACK
– all the uncommitted transactions that are not saved in the database will be undone.
b) ROLLBACK savepoint_name
– all the commands between the rollback command and the savepoint command will be undone. - a) SAVEPOINT savepoint_name
– a name given to a particular point in the transaction which a transaction can be rolled back to that particular point instead of the entire transaction, if any unusual/abnormal operation occurs.
b) RELEASE SAVEPOINT savepoint_name
– to delete a savepoint that has been created and can no longer be used in a rollback command.
DCL:
- GRANT – to allow a specific user to have some privileges
- REVOKE – to take back the granted privileges that were granted
Some key points to be noted while using these commands are listed below:
- In the TRUNCATE command, the structure of the schema element is preserved as opposed to DROP command and hence DROP command cannot be rolled back.
- DELETE just removes the rows from the table and returns the number of rows deleted, unlike TRUNCATE which does not return the number of rows.
- For some of the commands above, SQL clauses will be used for row specific or column specific actions. Those include ORDER BY, GROUP BY, HAVING , DISTINCT, LIMIT etc.
Relational Database Design
For a database to be efficient, it should be designed properly. Some of the main aspects that we look into, while designing a database
- It should reflect the real-world structure of the problem.
- Redundancy should not be there.
- Users can efficiently access the database.
- Security and consistency issues should be taken care of.
In the relational database design, the data to be stored in the databases need to be organised as a collection of relations. In layman words, a relation is a table, the rows in it are called the tuples, and the column header is the attribute. The set of values that a particular attribute can have is the domain of that attribute. The degree or the arity of a relation is the number of attributes of its relation. Now, the characteristics of the relations are listed below:
- In relation, the ordering of the tuples doesn’t have any significance. This means that there is no row significance in the relational database.
- The ordering of the values within a tuple is important. This means that the order in which the attributes were declared during the creation of the relation that attributes ordering should be used while inserting the values in the relation.
- The tuples will have atomic values, i.e., values that cannot be divided further.
- There can be NULL values for some nonprime attributes. Here the NULL is either of the following-
a) value is unknown
b) value exists for that attribute but is not available(at present).
c) that particular attribute does not apply to this tuple. - The relation names must be unique.
Constraints in DBMS
- Inherent Model based constraints
- Schema based constraints
- Other constraints
Inherent model based constraints: These are the implicit constraints that hold good for all the relations. Eg : The tuples in a relation must be unique; atomic values should be stored in the relations.
Schema based constraints: Various constraints come under this category. These include:
a) Domain constraints – the values of the attributes must be atomic.
b) Key constraints – this covers the constraints on all the types of keys that can be present in the relations, like the super key for the uniqueness constraint; key as the minimal super key; candidate key; primary key which is one among the candidate key.
c) Entity Integrity constraint – which specifies that the primary key cannot have null values.
d) Referential Integrity constraint – this is the foreign key constraint which means that the tuple in one relation must refer to an existing tuple in another relation. If both the referred and the referring relations are the same then it is called the self-referencing relation.
Other constraints:
a) Semantic integrity constraint – eg: The supervisor’s salary should be more than his subordinate’s salary;
b) Transition Constraint – these specify the state changes in the database. For example, the salary of an employee can only increase; the marital status can change from unmarried to married but not vice-versa.
Now, let us look at some of the operations that can violate some of the above mentioned constraints.
- Insertion – this operation can violate
a) Domain constraint – this occurs if the value entered for an attribute is not within the domain of the attribute ie., the data type or the range of values allowed.
b) Key constraint
c) Entity integrity constraint
d) Referential integrity constraint
If any one of the above constraints is violated during the insertion into the relation, then the insertion operation is rejected, thereby indicating to the user that some values are entered incorrectly.
- Deletion – this operation can violate referential integrity constraint only. This can be dealt in three ways:
a) To reject the delete operation.
b) To cascade the deletion operation by deleting all the tuples that were referred by the tuple that is being deleted.
c) to modify the referencing attribute value that causes the violation.
- Updation – This is nothing but deleting one or more values and then inserting new values. Hence, all the violations that were stated for deletion and the insertion operation will apply here.
=======================================================
DATABASE NORMALISATION
The term Normalization has a great significance with respect to the relational databases. In general, Normalization means adjusting the values in such a way that they follow certain criteria. From the database perspective, Normalization means adjusting the attributes or columns into different tables or relations so as to minimize the data redundancy and eliminate various anomalies. In this regard, Edgar Codd had proposed the theory of Normalization and had laid down some rules. Those rules were called the Normal Forms. In this regard, he proposed the first three Normal forms and then Raymond F. Boyce proposed the Boyce Codd Normal Form. So the normal forms till now we stated were 1NF (First Normal Form), 2NF (Second Normal Form), 3NF (Third Normal Form), and BCNF(Boyce Codd Normal Form). Let us go check out what these rules are by diving deeper into these normal Forms.
1NF (First Normal Form)
Rules:
Attribute (Column) : Atomic (single value)
Record (Row ) : Unique
Attribute Domain should not vary.
For a table to be called as a relation, it should be in 1NF.
Example:
Student_id | Student_name | Student_phone_number |
1001 | Aakash | 9967543782, 7865432901 |
1002 | Bhanu | 8765432891,6784932189 |
1003 | Chaitanya | 7659372801 |
1004 | Durga | 8754329282 |
In the above table, the attribute Student_phone_number is a multivalued attribute so this table is not in 1NF. For the the table to be in 1NF, we have to change the table as below:
Student_id | Student_name | Student_phone_number |
1001 | Aakash | 9967543782 |
1001 | Aakash | 7865432901 |
1002 | Bhanu | 8765432891 |
1002 | Bhanu | 6784932189 |
1003 | Chaitanya | 7659372801 |
1004 | Durga | 8754329282 |
Before proceeding to the other normal forms, we should understand the concept of keys and functional dependencies in the database.
KEYS
A key can be composed of a single or a group of attributes. The main purpose of keys are:
- To uniquely identify the rows in a relation.
- To know the connection (relationship) between 2 or more relations in a database.
Keys in DBMS
- Primary Key – a candidate key
– unique
– non null
– only one for a relation - Super Key – unique
– contains redundant attributes that are not needed to check the uniqueness
– superset of candidate key - Candidate key – unique
– minimal subset of super key
– a relation can have many candidate keys - Foreign Key – is a primary key in some other relation
– can be null
– can be non-unique - Alternate Keys – unique
– candidate keys
– not chosen as primary key
Note:
Prime Attributes: the attributes that are part of primary key.
Non-Prime attributes: the attributes that are not part of the primary key.
FUNCTIONAL DEPENDENCIES
Let R be a relation. Let r, the relation state, denote the set of tuples that are currently present in the relation R. The Functional Dependency FD between two groups of attributes X-> Y denotes x and Y from the relation R. The set of attributes Y are functionally dependent on the set of attributes X. The FD states a constraint that for any two tuples t1 and t2 that are present in ‘r’, with t1[x] = t2[x], then t1[y] must be equal to t2[y]. An FD is the property of the semantics of the attributes.
The various types of functional dependencies are
1. Trivial FD – X -> Y is a FD and Y is subset of X
2. Non Trivial FD – X-> Y is a FD and Y is not subset of X
3. Completely Non Trivial FD – X->Y is a FD and X intersection Y is null
4. Multivalued dependency – X->->y , two independent attributes are both dependent on a third attribute. Occur in a relation with three or more attributes.
5. Transitive Dependency – X->Z is a FD, and the following properties hold
a) X->Y
b) Y is not functionally dependent on X
c) Y -> Z
Occur in a relation with three or more attributes.
6. Join Dependency (JD) – A generalization of multivalued dependency. Let R(A,B,C,D) decomposed to R1(A,B,C) and R2(C,D). If the join of R1 and R2 on C gives R, then JD exists.
Now let’s continue with the other Normal Forms.
2NF (Second Normal Form)
Should be in 1NF.
Non-prime attribute if any in the relation is fully functionally dependent on the primary key.
Example :
sid | sname | cid | cname | marks |
111 | Rahul | 1 | Maths | 55 |
111 | Rahul | 2 | Science | 46 |
112 | Raghav | 1 | Maths | 90 |
112 | Raghav | 2 | Science | 78 |
113 | Anup | 3 | Social | 40 |
Here the primary key is sid+cid.
But sid alone will give sname, and cid alone will give cname. Hence not in 2NF.
Now split the attributes into different tables.
sid | sname |
111 | Rahul |
112 | Raghav |
113 | Anup |
cid | cname |
1 | Maths |
2 | Science |
3 | Social |
sid | cid | marks |
111 | 1 | 55 |
111 | 2 | 46 |
112 | 1 | 90 |
112 | 2 | 78 |
113 | 3 | 40 |
3NF (Third Normal Form)
Should be in 2NF.
There should not be a transitive dependency of non-prime attribute on the primary key.
Alternatively, if X->P is a FD then either.
a) X is a superkey of R or
b) P is a prime attribute of R
Example:
cid | room_num | instructor_id | dept |
101 | 34 | 1 | cse |
102 | 38 | 2 | math |
103 | 56 | 3 | cse |
Primary key {cid}
The debt is dependent on the instructor id which is not a prime attribute, also instructor id is not the super key. Hence this table is split into 2 tables.
cid | room_num | instructor_id |
101 | 34 | 1 |
102 | 38 | 2 |
103 | 56 | 3 |
instructor_id | dept |
1 | cse |
2 | math |
3 | cse |
BCNF ( Boyce Codd Normal Form)
Should be in 3NF.
If X-> P is a FD then X is a superkey of R.
student_id | City | State |
101 | Mumbai | Maharashtra |
102 | Panaji | Goa |
Primary key: student_id
We can get the city or the state from student_id. Also given the city alone we can get the state. Hence, not in BCNF.
student_id | City |
101 | Mumbai |
102 | Panaji |
City | State |
Mumbai | Maharashtra |
Panaji | Goa |
For the real world databases, the highest normal form that is sufficient is BCNF. But let us have a glance about the other normal forms.
Beyond BCNF, there are other normal forms like 4NF, 5NF
4NF(Fourth Normal Form)
– Should be in BCNF
– Multivalued dependency is not allowed
5NF(Fifth Normal Form)
– Should be in 4NF
– Join Dependency is not allowed
– Join should be lossless.
JOINS
A Join in the RDBMS implies retrieving the tuples of two relations that have an attribute in common and a condition applied to it. Here, the main emphasis is that only those tuples are retrieved that satisfy the join condition as opposed to the cartesian product or the cross product of the relational database in which all the combinations of tuples are included in the result.
The different types of joins are:
- Inner Join – only the matching tuples appear in the result
a) Theta join
b) Equi join
c) Natural - Outer Join – all tuples in either or both the tables are included in the result.
a) Left Outer
b) Right Outer
c) Full Outer
Let’s dive into the details of all the above joins.
In the below definitions, we use two relations R and S on which the join operation is performed.
The attributes Ai belong to relation R and the attributes Bj belong to relation S.
Theta Join:
In the Theta join, the join condition can be any of the following: =, <, <=, >=, >, !=. This retrieves those tuples that satisfy the given condition.
Equi Join:
In the equi join, the join condition is always equal to =. Here, we always have one or more pairs of attributes that have identical values in every tuple.
Natural Join:
In the natural join, the second superfluous attribute present in the relations will be removed. In this case renaming of the join attributes should be done before the join operation to have the same name in both the relations.
Left Outer Join:
All the tuples from the left relation will be included in the result.
Only the matching tuples from the right relation are included in the result.
Right Outer Join:
All the tuples from the right relation are included in the result.
Only the matching tuples from the left relation are included in the result.
Full Outer Join:
All the tuples from the left and right relations where the condition is met are included.
Returns null columns where there is no match.
DBMS Tutorial FAQs
Q: What are the types of DBMS?
A: The various types of DBMS are Hierarchical databases, Network databases, Relational databases, Object-oriented databases, Graph databases, ER model databases, Document databases, and NoSQL databases.
Q: What are the basics of a database?
A: A database table appears a lot like a spreadsheet, in which data is stored in rows and columns. As a matter of fact, it is usually very easy to insert a spreadsheet into a database table. The basic difference between storing data in a spreadsheet and storing data in a database is in organizing the data.
Q: How do I learn DBMS?
A: To learn DBMS, you have to start with small steps. You can watch as many tutorials as you can, at your convenience. You can take a SQL class and even install a free SQL database.
Q: What are the examples of DBMS?
A: Some of the examples of DBMS are Microsoft SQL Server, MySQL, Microsoft Access, Oracle Database, FileMaker Pro, and dBASE.
Q: What are the components of DBMS?
A: The components of DBMS are software, hardware, procedure, data, and database access language. However, the main among these is software which helps in managing the database and to control the total computerized database.
Q: What is an entity in DBMS?
A: An entity in DBMS is an object that exists in real life. For example, in an organization, the entities can be the employers, employees, projects, etc. The attribute values are stored in the database.
Q: Why should I learn DBMS?
A: DBMS helps in a systematic method of creating, updating, recovering, and storing information in a database. A DBMS is also very important for data security, data integrity, data access control and optimization, automatic rollback, resume, and recovery.
Q: What is taught in DBMS?
A: In DBMS, you will learn to write programs with packages, the process of debugging, triggers, and database structures by using SQL. The other skills that you can learn include using data as well as object modeling, relational data models, relational algebra, and applications programming.
Q: Is MySQL a DBMS?
A: MySQL is a database management system. You would need a MySQL server to add, access, and process data that are stored in the database of a computer.
Q: What are the advantages of DBMS?
A: Some of the advantages of DBMS include data integrity, data security, lesser data inconsistency, faster data access, improved decision making, and simplicity.
Also Read:
Top DBMS (Database Management System) Interview Questions