Getting Started with Database Management Systems

From this blog, we will start with the Database Management Systems(DBMS) series. This is the perfect starting point if you are an absolute beginner and have never really heard anything about DBMS. All the blogs will be beginner-friendly. We will start from zero and gradually move on to the advanced stuff. It will be a long journey and we will cover all the important concepts of DBMS.

Content of this blog

An example of a Database
Characteristics of a Database Approach
Actors on the Scene
Actors behind the scene
Advantages of using a Database Approach

An example of a Database

Let us consider an example that most students are familiar with, a university database that maintains information concerning students, courses, and grades in a university environment.

Let this database have 5 files :

STUDENT: stores all the students' files.
COURSE: stores all the course details.
SECTION: stores data on each section of the course.
GRADE_REPORT_: stores the grades that the student receives.
PREREQUISITE: stores the prerequisites that are needed for a particular course.

To define a database, we must specify the structure of records of each file by specifying the different types of data elements and their data types to be stored in each record.

Notice that the records in various files may be related. For example, the record for Smith in the STUDENT file is related to two records in the GRADE_REPORT file that specify Smith’s grades in two sections. Most medium and large-size databases have many relationships among the records.

Characteristics of a Database Approach

In the database approach, a single repository maintains data that is defined once and then accessed by various users repeatedly through queries, transactions, and application programs. The main characteristics of the database approach versus the file-processing approach are the following:

Self-describing nature of a database system
Insulation between programs and data, and data abstraction
Support of multiple views of the data
Sharing of data and multiuser transaction processing

Self-describing nature of a database system

A fundamental characteristic of the database approach is that the database system contains not only the database itself but also a complete definition or description of the database structure and constraints. This definition is stored in the DBMS catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data. The information stored in the catalog is called meta-data, and it describes the structure of the primary database.

Insulation between programs and data, and data abstraction

In traditional file processing, data structures are hardcoded into programs, so any changes, like adding a new field, require modifying all programs that access the file. A DBMS solves this by storing data structures in a catalog, enabling program-data independence, where programs don’t need changes when the data structure changes. Similarly, operations on data, such as calculating a GPA, are defined separately from their implementation, allowing program-operation independence. This is achieved through data abstraction, where the DBMS hides complex storage and operational details, presenting users with a simple, logical view of data. In object-oriented databases, both data and operations are abstracted, so users can perform tasks without worrying about implementation specifics, ensuring flexibility and ease of use.

Support of multiple views of the data

A database serves multiple types of users, each requiring specific views of the data. A view can be a subset of the database or virtual data derived from existing files but not directly stored. Users may not need to know if the data is stored or calculated. A multiuser DBMS supports multiple views to meet these varying needs.

A multiuser DBMS allows multiple users to access and update the database simultaneously, ensuring data integrity and consistency through concurrency control. This is crucial for applications like online transaction processing (OLTP), where multiple users, such as airline reservation agents, may attempt to update the same data (e.g., assigning seats) at the same time. The DBMS ensures that such updates are handled correctly, preventing conflicts. Central to this is the concept of a transaction, which is a process involving one or more database operations, like reading or updating records. Transactions must follow two key properties: isolation, ensuring that each transaction operates as if it is alone, and atomicity, ensuring that either all operations within a transaction are completed or none are. These features differentiate DBMS from traditional file-processing systems by offering reliable and efficient concurrent data handling.

Actors on the scene

In this section, we will talk about the important roles that are played by individuals in maintaining a large-scale database management system.

Database Administrator

In a database environment, the database administrator (DBA) oversees and manages the database and related software. The DBA's responsibilities include authorizing access to the database, coordinating and monitoring its usage, and acquiring necessary software and hardware. They are also accountable for addressing issues like security breaches and slow system performance. In large organizations, the DBA often has a team to help with these tasks, ensuring efficient and secure database operations.

Database Designers

Database designers are responsible for deciding what data will be stored in the database and selecting the structures to store it. They perform these tasks before the database is implemented and filled with data. To create a design that meets the needs of all users, database designers communicate with prospective users, gather requirements, and develop specific views tailored to each group. These views are analyzed and combined into a single, unified database design that supports the needs of all user groups. Often, database designers work with the DBA and may take on other responsibilities once the design phase is complete.

End Users

End users are the primary users of a database, accessing it for tasks like querying, updating, and generating reports. They can be categorized as follows:

Casual End Users: They access the database occasionally. They require different information each time and use an advanced query interface.
Naive or Parametric End Users: They make up a sizeable portion of database users. Their main job function revolves around constantly querying and updating the database, using standard types of queries and updates— called canned transactions—that have been carefully programmed and tested.
Sophisticated end users: include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS to implement their applications to meet their complex requirements.
Standalone users: They maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces

System Analysts and Application Programmers

System analysts determine the requirements of end users, especially naive and parametric end users, and develop specifications for standard canned transactions that meet these requirements. Application programmers implement these specifications as programs; then they test, debug, document, and maintain these canned transactions.

Actors behind the scenes

In addition to those who directly use, design, or manage databases, some workers design, develop, and maintain the DBMS software and system environment. These individuals are primarily focused on the technical aspects of the system rather than its content.

DBMS System Designers and Implementers: These professionals design and build the various modules and interfaces that make up the DBMS software. Their work involves developing complex components such as the catalog, query processing, interface handling, concurrency control, and data recovery. They also ensure that the DBMS integrates seamlessly with other system software like operating systems and compilers.
Tool Developers: Tool developers create software tools that enhance the functionality and usability of database systems. These tools assist with database modeling, design, performance optimization, and interface development. Examples include packages for database design, performance monitoring, graphical interfaces, and test data generation. Often, these tools are developed and marketed by independent software vendors.
Operators and maintenance personnel: They are responsible for the actual running and maintenance of the hardware and software environment for the database system.

Advantages of Using the DBMS Approach

In this section, we are going to talk about some additional benefits of using a DBMS.

Controlling Redundancy

In traditional file processing, each user group maintains its files, often leading to data redundancy, where the same data is stored in multiple places. For example, in a university system, the registration office may keep files for courses and grades, while the accounting office keeps files for billing and registration. This redundancy results in problems such as:

Duplication of Effort: A single update (e.g., adding a new student) must be made separately in multiple files.
Wasted Storage: Storing the same data repeatedly consumes excessive space.
Inconsistency: Updates may not be synchronized, leading to errors. For instance, a student’s birthdate might be recorded differently in two systems.

In the database approach, data is normalized to store each logical data item (like a student's name or birthdate) only once, ensuring consistency and saving space. However, controlled redundancy is sometimes used to improve query performance. For instance, in a GRADE_REPORT file, storing student names and course numbers alongside grades avoids searching multiple files. To manage this redundancy, the DBMS enforces consistency checks. For example, it ensures that a student’s name and ID in the GRADE_REPORT file match those in the main STUDENT file. If these checks are absent, inconsistencies may arise.

Restricting Unauthorized Access

When multiple users share a large database, it is important to control access to the data to protect sensitive information and ensure proper functionality. For example, financial data like salaries may be restricted to authorized personnel. Additionally, some users may only be allowed to retrieve data, while others may be authorized to both retrieve and update it. The database should provide a security and authorization subsystem to enforce these access controls, ensuring that each user or user group has specific permissions based on their role. Typically, users are assigned account numbers and passwords to access the database, and the DBA is responsible for creating these accounts and specifying any restrictions. The DBMS enforces these restrictions automatically. For example, only the DBA’s staff may have access to certain privileged software tools, and users may only interact with the database through predefined applications or transactions that restrict their actions to what is permitted.

Providing Persistent Storage for Program Objects

In traditional programming languages like C++ or Java, when you create complex data structures—like classes or structs—they only exist while the program is running. Once the program ends, those structures are typically lost unless you take the extra step to save them to a file. This usually involves transforming them into a format that a file system can understand, and when you want to use them again, you have to figure out how to convert them back to their original form. Object-oriented databases simplify this process by allowing you to store these complex objects directly in the database, without extra conversions. This means these objects are persistent; they remain intact even after the program ends, and other programs can access them easily. This kind of storage resolves the impedance mismatching problem. With object-oriented DBMS, you can smoothly store and retrieve complex objects, making it much easier to work with them in languages designed for object-oriented programming.

Providing Storage Structures and Search Techniques for Efficient Query Processing

Database systems must efficiently handle queries and updates using specialized data structures and search techniques. Since data is stored on disk, indexes built on tree or hash structures improve search efficiency. When a query requests specific records, they are moved from disk to main memory, aided by buffering or caching modules in the DBMS for faster access. While the operating system manages basic disk-to-memory transfers, most DBMSs optimize their own data buffering for better performance. The query processing and optimization module identifies the best way to execute queries based on existing storage structures. A key part of database design and tuning is determining which indexes to create and maintain, typically managed by the database administrator (DBA).

This is the end of this blog.

In the next blog, we will learn about Database system concepts and architecture. If you want notifications then you can subscribe and follow me (Parth Gupta) on Hashnode.

You can connect with me on:

Thank you for Reading.