Untangling Database Concepts
As I learn more about data science, I find that there’s many terms tossed around interchangeably. As someone new to the field, this makes it hard to solidify certain concepts. When it comes to database management and storage, SQL is almost always part of the conversation, but is SQL the same as a Database Management System? What about MySQL or PostgreSQL? Or a Database Management System vs a Relational Database Management System? Let’s untangle some of these concepts.
Let’s start at the most basic level. A database is an organized collection of structured information (data) that is generally stored on a computer or other hardware. Databases are used extensively and store a variety of information, depending on the industry. Some data that may be stored in a database includes:
- Reservations for flights for an airline company
- Customer information and purchases for a retail giant
- Call records and monthly bills for a telecom provider
- Customer deposits/withdrawals and loan information for a bank
While this information is stored in the database itself, all the data can be accessed via a Database Management System.
Database Management System (DBMS)
A DBMS differs in that it is the piece of software that allows one to access, interact with and manipulate the data. Depending on the type of DBMS, the database can be organized into various structures, including:
- Object oriented or Non-relational (NoSQL)
- Relational (RDBMS)
Relational Database Management Systems (RDBMS)
RDBMS is a type of database management system that organizes data into a table with rows and columns (similar to an Excel spreadsheet). Each row represents a unique “record” and each column represents a feature or attribute of the records. This type of structure is very common and the system is accessed using SQL, which is why the term gets thrown around a lot.
SQL refers to the programming language that the user uses to interact with the DBMS and therefore the database. While many RDBMS have SQL in the title, they are all referring to the use of the same SQL language with some slight variations. SQL is specifically used for relational databases.
What about MySQL, PostgreSQL, SQlite?
These names refer to the different vendors of relational databases and their software. While all are built off of SQL, each has its own “dialect”, which is why the terms are used interchangeably with SQL. Some of the main vendors and software used are:
- Microsoft SQL Server
- SAP HANA
- Oracle Database
- Microsoft Access
Putting it All Together
In sum, all these terms represent moving parts in the database management and interaction process. While they are sometimes used interchangeably, there are subtle differences. To end, lets look at a visual representation: