A database management system (DBMS) is software that manages stored data. It exposes ways to retrieve and modify that data, and it handles the logistics we don’t want to build ourselves: concurrent access, durability, indexing, and query optimization.
We reach for a DBMS when files stop being enough: when we want to ask complicated questions, when the dataset is large enough that searching it takes meaningful time, when more than one person needs to access the data at the same time, or when we want guarantees about consistency. Below that scale, CSV, JSON, or HDF5 files are simpler. Above it, a DBMS pays for itself.
The names worth recognizing, all of them relational, are MySQL, MariaDB (a MySQL fork), Oracle Database, Microsoft Access, SQLite, PostgreSQL, and a handful of others. They all do roughly the same thing; they differ in licensing, scale, and which workloads they’re tuned for. MySQL and PostgreSQL are the standard open-source server-class databases. SQLite is the standard file-based embedded database. Oracle Database is a fixture of large enterprise deployments.
The way we communicate with a DBMS is by sending it commands written in a query language. The dominant query language, by a wide margin, is SQL. Recent developer surveys (Stack Overflow, JetBrains) consistently rank SQL among the four or five most-used languages in the world, and it’s the most-used language for working with data — listed as a required skill in nearly every data-engineering job posting. It’s worth learning well.
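To make "sending a command" concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory SQLite database. The choice of SQLite and the query itself are only illustrative assumptions; any of the systems named above would accept the same SQL through its own client library.

```python
import sqlite3

# Open an in-memory SQLite database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The command is just a string of SQL; the DBMS parses and executes it.
cursor = conn.execute("SELECT 2 + 3 AS total")

# The result comes back as rows; here, a single row with a single column.
row = cursor.fetchone()
print(row[0])  # prints 5

conn.close()
```

The program never computes the sum itself: it hands the SQL text to the database and reads back whatever rows the database returns.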
A query is a command we send to the database to retrieve information or perform an operation: inserting, updating, deleting, or defining a new table. The word covers all of them. SQL queries break into four families depending on what kind of work they do: data definition (DDL), data manipulation (DML), data control (DCL), and transaction control (TCL).
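The families are easier to tell apart with one example of each. The sketch below again assumes Python's `sqlite3` module with an in-memory database; the `readings` table and its columns are hypothetical names invented for illustration. It covers data definition (DDL), data manipulation (DML), and transaction control (TCL). Data control (DCL) appears only as a comment, because SQLite has no user accounts and does not implement `GRANT` or `REVOKE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL (data definition): define a new table. "readings" is a made-up example.
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, site TEXT, value REAL)"
)

# DML (data manipulation): insert and update rows.
conn.execute("INSERT INTO readings (site, value) VALUES (?, ?)", ("north", 1.5))
conn.execute("INSERT INTO readings (site, value) VALUES (?, ?)", ("south", 2.5))
conn.execute("UPDATE readings SET value = 3.0 WHERE site = ?", ("south",))

# TCL (transaction control): make the changes so far permanent.
conn.commit()

# A change that is rolled back never takes effect.
conn.execute("DELETE FROM readings")
conn.rollback()

# DCL (data control) would be something like
#   GRANT SELECT ON readings TO some_user;
# on a server-class DBMS. SQLite does not support it, so it is not run here.

# DML again: retrieve the rows that survived.
rows = conn.execute("SELECT site, value FROM readings ORDER BY id").fetchall()
print(rows)  # prints [('north', 1.5), ('south', 3.0)]

conn.close()
```

Both rows are still present at the end because the `DELETE` was undone by the rollback, while the earlier inserts and the update had already been committed.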