Types of databases and how do they work (WIP)#

What is a database?#

A database is an electronically stored, collection of data. It can contain any type of data, including words, numbers, images, videos, and files.1

You can use software called a database management system (DBMS) to store, retrieve, and edit data.1 A database management system (DBMS) is a computerized system that enables users to create and maintain a database.

What is a database model?#

A database model shows the logical structure of a database.

Types of databases#

There are several types of databases, each designed to handle different types of data and use cases. Types include:

---
config:
  look: handDrawn
  theme: redux
---
graph TD
    A[Databases]

    A --> R[Relational]
    A --> KV[Key-Value]
    A --> D[Document]
    A --> WC[Wide-Column]
    A --> G[Graph]
    A --> TS[Time-Series]

1. Relational Databases:#

A Relational Database Management System (RDBMS) stores data in tables (relations) composed of rows (tuples) and columns (attributes). Relationships between tables are enforced using primary keys and foreign keys. Data integrity and correctness are guaranteed through ACID transactions.

  • Data Model: Structured schema, Fixed column types, Relations between entities, Normalization (1NF, 2NF, 3NF), Query Language

  • Examples: PostgreSQL, MySQL, Oracle Database, Microsoft SQL Server

  • Advantages: Strong data consistency, ACID-compliant transactions, Mature tooling & ecosystem, Excellent for complex queries and joins, Well-understood theoretical foundation

  • Disadvantages: Rigid schema (hard to evolve), Vertical scaling limitations, Performance bottlenecks at massive scale, Less suitable for unstructured data

  • Use Cases: Banking & finance systems, ERP or CRM software, E-commerce systems, Any transactional system requiring correctness

2. Key–Value Databases#

A Key–Value database stores data as mapping from a unique key to a value. The database does not need to understand the type of the value.

  • Data Model: key:value pairs, No fixed schema

  • Examples: Redis, Amazon DynamoDB

  • Advantages: Extremely fast (O(1) lookups), Horizontally scalable, Ideal for caching

  • Disadvantages: No complex queries support, No relationships supported, Limited indexing

  • Use Cases: Session storage, Caching, Rate limiting, Real-time counters

3. Document Databases#

A Document Database stores data as self-describing documents (JSON/BSON/XML). Each document can have a different structure.

  • Data Model Schema-less, Nested structures, Object-like documents

  • Examples: MongoDB, CouchDB

  • Advantages Flexible schema, Easy horizontal scaling

  • Disadvantages Data duplication, Limited joins

  • Use Cases: Rapidly evolving applications, Content management systems

4. Wide-Column Databases#

A Wide-Column Store organizes data into rows and column families, allowing each row to have a dynamic set of columns.

  • Data Model: Sparse tables, Column families, Partition and clustering keys

  • Examples: Apache Cassandra, HBase

  • Advantages: Horizontal scalability, High write throughput, Fault tolerant, Excellent for time-series

  • Disadvantages: Complex data modeling, No joins, Limited ad-hoc queries, Eventual consistency

  • Use Cases: IoT data, Event logging, Analytics pipelines, Large-scale distributed systems

5. Graph Databases#

A Graph Database stores data as nodes (entities) and edges (relationships) with properties.

  • Data Model: Nodes, Edges, Properties

  • Examples: Neo4j, Amazon Neptune

  • Advantages Fast relationship queries, No joins needed, Natural graph modeling

  • Disadvantages Not ideal for tabular data, Harder to scale globally

  • Use Cases: Social networks, Recommendation systems, Fraud detection

6. Time-Series Databases#

A Time-Series Database is optimized for storing and querying data indexed by time.

  • Data Model Timestamp-based records, Metrics, Tags

  • Examples: InfluxDB, TimescaleDB

  • Advantages High ingestion rates, Optimized storage for metrics, Efficient aggregation, High write throughput

  • Disadvantages Limited querying, Not suitable for relational data

  • Use Cases: Monitoring systems, IoT telemetry, Financial market data

Provider native databases#

---
config:
  look: handDrawn
  theme: redux
  layout: elk
---
flowchart LR
    A["Database Providers"] --> AWS["AWS"] & GCP["Google Cloud"] & AZ["Azure"] & REDIS["Redis"] & MONGO["MongoDB Inc"] & CLOUDFLARE["Cloudflare"] & SNOW["Snowflake"] & COCK["Cockroach Labs"] & NEON["Neon"] & PLANET["PlanetScale"] & SUPABASE["Supabase"] & FAUNA["Fauna"]
    AWS --> AWS1["DynamoDB"] & AWS2["Aurora"] & AWS3["Redshift"] & AWS4["Keyspaces"]
    GCP --> GCP1["Spanner"] & GCP2["Bigtable"] & GCP3["BigQuery"] & GCP4["Firestore"]
    AZ --> AZ1["Cosmos DB"]
    REDIS --> R1["Redis OSS"] & R2["Redis Enterprise"]
    MONGO --> M1["MongoDB Atlas"]
    CLOUDFLARE --> C1["D1"]
    SNOW --> S1["Snowflake"]
    COCK --> C2["CockroachDB"]
    NEON --> N1["Neon"]
    PLANET --> P1["PlanetScale"]
    SUPABASE --> S2["Supabase"]
    FAUNA --> F1["FaunaDB"]

ACID and BASE Properties in Databases#

Modern database systems are designed around two contrasting consistency models: ACID and BASE.

1. ACID#

ACID is a set of guarantees that ensure strong consistency and reliability in transactional database systems, primarily relational databases.

  • A – Atomicity
  • C – Consistency
  • I – Isolation
  • D – Durability

2. BASE#

BASE is a model designed for large-scale distributed systems, prioritizing availability and scalability over strict consistency.

  • B – Basically Available
  • A – Soft State
  • E – Eventual Consistency

References: