// 1970 — 2026 · Codd · Chamberlin · Boyce · the relational lineage

SQL:DeclarativeQuery

SQL is not a general-purpose programming language — it's a declarative query language. On this site it lives in the "script / query" bucket alongside Bash, not next to C / Rust / Java. Codd published the relational model in 1970, Chamberlin & Boyce drafted SEQUEL at IBM in 1974, Oracle sold the first commercial RDBMS in 1979. Fifty-six years later, the data layer's default is still SQL.

1970Codd's paper
relational model · CACM
56 yrLanguage age
1970 → 2026
10+Major engine dialects
PG / MySQL / SQLite / DuckDB / ...
#1Most-deployed DB on Earth
SQLite · every phone / browser
SQL
SELECT *JOIN ... ONWHERE x IS NULLGROUP BYROW_NUMBER() OVERWITH cte ASEXPLAIN ANALYZEBEGIN; COMMIT;CREATE INDEXUPSERTJSONBPARTITION BY
scroll
01

What is SQL

SQL = Structured Query Language. Designed in 1974 by Donald Chamberlin and Raymond Boyce inside IBM's System R project as SEQUEL, renamed over a trademark conflict. It's not a Turing-complete general-purpose language (standard SQL isn't) — it's a declarative query language for relational databases: you write "give me these rows", the engine decides how to fetch them.

Declarative · not imperative paradigm

You state what (SELECT ... WHERE), you don't write the iteration, the index choice, or join order. The optimiser bets for you — fifty years of database research live in that translation.

Relational algebra under the hood model

The underlying model comes from Codd's 1970 paper: relations (tables), tuples (rows), and five algebraic operations — projection / selection / join / union / Cartesian product. SQL is syntactic sugar over that algebra.

Standard vs dialect spec

The ANSI / ISO standard runs from SQL-86 to SQL:2023, but no engine implements 100% of any version. CTEs, window functions, JSON, arrays — each vendor moves at its own pace. This is the real day-to-day pain of SQL engineering.

Not a general-purpose language scope

SQL itself doesn't write web pages, drive hardware, or train ML. It sits beside Bash in the "script / query" bucket — single-purpose, 50 years steady. Staying perfectly still for half a century is its own kind of skill.

postgres=> SELECT name, country, single
              FROM wca_results
              WHERE event_id = '333' AND single IS NOT NULL
              ORDER BY single LIMIT 5;
Time: 4.218 ms · Index Scan · idx_results_event_single
namecountrysingle
Yiheng Wang (王艺衡)CN3.05
Max ParkUS3.13
Xuanyi Geng (耿暄一)CN3.47
Asher Kim-MagierekUS3.66
Tymon KolasińskiPL3.79
(5 rows)
02

History : Timeline

Fifty-six years of SQL: from a 1970 paper at IBM San Jose, through the Oracle / Sybase / MSSQL commercial wars, the PostgreSQL / MySQL / SQLite open-source troika, the NoSQL revolt, and the Snowflake / BigQuery / ClickHouse / DuckDB analytics resurgence — the language barely moved, the engines underneath rotated entirely.

  1. 1970·06

    Codd publishes the relational model

    At IBM San Jose, Edgar F. Codd publishes "A Relational Model of Data for Large Shared Data Banks" (CACM, June 1970). Eleven pages lift data from hierarchical / network models into relational algebra. No SQL yet — but this is its seed.

  2. 1973—74

    System R · SEQUEL is designed

    Inside IBM's System R project, Donald Chamberlin and Raymond Boyce design SEQUEL (Structured English Query Language) — a language to let non-mathematicians query relational data. Renamed SQL over a trademark clash. Boyce dies in 1974 at age 35.

  3. 1979

    Oracle V2 — the first commercial RDBMS

    Larry Ellison's Relational Software Inc. (later renamed Oracle) ships the first commercial relational database. IBM wrote the paper, Oracle sold the product — a lesson in software commercialisation that echoed for decades.

  4. 1986

    ANSI SQL-86 standard

    SQL gets its first formal spec — SQL-86 ratified by ANSI, then ISO the same year. Standardisation only on paper — vendor dialects had already forked; the standard codifies the lowest common denominator.

  5. 1989

    Microsoft SQL Server 1.0 (a Sybase fork)

    Microsoft partners with Sybase to port Sybase SQL Server to OS/2; that becomes Microsoft SQL Server. It moves to Windows NT in 1993 and gradually splits from Sybase into its own T-SQL line. It will lock up US enterprise SQL for the next three decades.

  6. 1992

    SQL-92 — the baseline most people still learn

    SQL-92 is the most consequential version in history: explicit JOIN syntax (INNER / LEFT / RIGHT / FULL), VARCHAR, DATETIME, three-valued NULL logic all settle here. Even in 2026, most textbooks still teach this version.

  7. 1996·07

    PostgreSQL 1.0

    Michael Stonebraker's POSTGRES project at UC Berkeley (1986) gains a SQL frontend in 1995 — renamed Postgres95 — and ships as PostgreSQL 1.0 in July 1996. Open source, BSD-style licence. Thirty years later it is the de-facto king of open-source SQL.

  8. 1995

    MySQL — Monty's database

    Sweden's Michael "Monty" Widenius names the database after his daughter My. Early MySQL is fast but lacks full transactions / ACID, yet it fuses with PHP through the Web-1.0 years. The "M" in LAMP is this.

  9. 1999

    SQL:1999 — CTEs, triggers, OLAP, regex

    A big version: recursive CTEs, triggers, OLAP aggregates (ROLLUP / CUBE), regex, object types. Vendors had most of it; the standard merely caught up. But recursive CTE was a genuine novelty and defined a whole shape of query.

  10. 2000·08

    SQLite 1.0 (D. Richard Hipp)

    D. Richard Hipp at Hwaci writes an embedded SQL engine for the US Navy's DDG-79 destroyer. No server, one file per database, public-domain licence. It quietly becomes the most-deployed database on Earth — every phone, every browser, every aircraft avionics box.

  11. 2003

    SQL:2003 — window functions

    Window functions (ROW_NUMBER() OVER ..., RANK, LAG, LEAD, SUM() OVER) enter the standard. This is SQL's single most modernising step: expressing "look at this row plus its neighbours" without breaking relational algebra. XML also joins the spec the same year.

  12. 2008

    Sun acquires MySQL · the NoSQL wave

    Sun buys MySQL AB for $1B. Two years later Oracle buys Sun. Monty walks and forks MariaDB (2009). Meanwhile NoSQL takes off: MongoDB / Cassandra / Couchbase / Redis — "Schemas are dead" was the slogan.

  13. 2012

    Google Spanner — distributed SQL

    Google publishes the Spanner paper: globally distributed, strongly consistent (TrueTime API), SQL frontend. It debunks the "distributed means NoSQL" axiom. CockroachDB / TiDB / YugabyteDB are all the open-source response to this thread.

  14. 2014

    Snowflake — the cloud-warehouse era

    Snowflake goes public-cloud GA. Storage / compute separated, billed per second, SQL data warehouse — unlike legacy Teradata / Vertica, it's cloud-native from day one. IPOs at roughly $70B. BigQuery is maturing inside Google in the same window.

  15. 2016

    SQL:2016 + ClickHouse open-sourced

    Standards side: SQL:2016 adds JSON path and row-pattern matching. Open-source side: Yandex open-sources ClickHouse — a columnar OLAP engine that will dominate logs, time series and realtime analytics in the years to come.

  16. 2018

    DuckDB 1.0 — the pandas killer

    Mark Raasveldt and Hannes Mühleisen at CWI Amsterdam release DuckDB: an embedded columnar analytical SQL engine — SQLite-shaped, but built for analytics. "Replace most pandas workflows with DuckDB" becomes the biggest 2024-2026 consensus shift in data tooling.

  17. 2020

    NoSQL → "let's add SQL back"

    The 2009-2014 "schemaless" pitch ages badly — engineers realise "no schema means schema scattered across application code". MongoDB adds an Aggregation Pipeline (SQL-shaped); Cassandra ships CQL; Kafka ships ksqlDB. SQL didn't win — it just refused to die.

  18. 2023

    The LLM era — text → SQL

    GPT-4 / Claude / Gemini bring "natural-language to query" close to free. BI tools (Mode, Hex, ChartDB) plug LLMs in everywhere. But "let the LLM write production SQL" is still handled gingerly — schema leakage, wrong joins, runaway cost are all unresolved trade-offs.

  19. 2026

    SQL at 56 — still the default of the data layer

    State of the union, 2026: OLTP runs on PostgreSQL; analytics runs on ClickHouse / DuckDB / Snowflake / BigQuery; embedded runs on SQLite; enterprise legacy runs on Oracle / MS SQL. SQL the language never changed; the engines underneath rotated entirely — that is how it survived 56 years.

03

Language Essentials : SqlAlphabet

Eight cards covering the core structures every SQL writer needs: the declarative essence of SELECT, the four joins, the two-step aggregate, window functions, CTEs, the NULL three-valued-logic trap, transactions, indexes. Master these eight and 90% of daily queries are within reach.

A

SELECT — the declarative core

SQL is declarative: you state what you want, not how to fetch it. The query optimiser turns SELECT into an execution plan — 50 years of database research piled into that translation.

SELECT name, country, single
FROM wca_results
WHERE event_id = '333'
  AND single IS NOT NULL
ORDER BY single ASC
LIMIT 10;
B

JOIN — the meat of relational algebra

Four joins (INNER / LEFT / RIGHT / FULL) plus SQL-92's explicit syntax. A relational database's power is in the join: normalised tables, cross-table queries, every metric ultimately rides on it.

SELECT p.name, c.name AS comp
FROM persons p
JOIN results r ON r.person_id = p.id
JOIN competitions c ON c.id = r.comp_id
WHERE p.country = 'CN';
C

GROUP BY + HAVING

Aggregation in two steps: GROUP BY groups, then HAVING filters those groups (distinct from WHERE, which filters rows). This is the pair beginners conflate most often.

SELECT country, count(*) AS n
FROM persons
GROUP BY country
HAVING count(*) > 1000
ORDER BY n DESC;
D

Window functions — the gift of SQL:2003

OVER (PARTITION BY ... ORDER BY ...) makes "look at self plus neighbours" a one-liner. Ranks, running totals, moving averages, previous/next rows — all collapse to one expression. SQL's single most modernising feature.

SELECT name, event, single,
  ROW_NUMBER() OVER (
    PARTITION BY event
    ORDER BY single ASC
  ) AS rk
FROM wca_results;
E

CTEs — WITH ... AS

Pulls subqueries into named steps, chains them readably, and supports recursion (SQL:1999). "Five levels of nested subquery" is a code smell — rewrite it as five named CTE blocks.

WITH top10 AS (
  SELECT id, single FROM results
  ORDER BY single LIMIT 10
)
SELECT p.name, t.single
FROM top10 t
JOIN persons p ON p.id = t.id;
F

NULL — the three-valued logic trap

NULL means "unknown", not "empty". NULL = NULL is not true — it's UNKNOWN. WHERE x = NULL always returns no rows; you must use IS NULL. The single largest source of SQL bugs.

-- Wrong: returns 0 rows
SELECT * FROM t WHERE x = NULL;

-- Correct:
SELECT * FROM t WHERE x IS NULL;
G

Transactions / ACID

SQL databases offer real transactions: BEGIN / COMMIT / ROLLBACK. ACID (atomic / consistent / isolated / durable) is the SQL camp's moat — and the direct counterpoint to early NoSQL's "eventually consistent is good enough".

BEGIN;
UPDATE accounts SET bal = bal - 100
  WHERE id = 'A';
UPDATE accounts SET bal = bal + 100
  WHERE id = 'B';
COMMIT;
H

Indexes — the actual database magic

B-tree / hash / GIN / BRIN / columnar / bitmap. The same SQL is 1000× faster with the right index. The most valuable thing to learn beyond SQL itselfEXPLAIN is the DBA's stethoscope.

CREATE INDEX idx_results_event_single
  ON results (event_id, single)
  WHERE single IS NOT NULL;

EXPLAIN ANALYZE
SELECT ... -- see the plan

SQL itself isn't Turing-complete — but PL/pgSQL is

Pure SQL (SQL-92 core) is not Turing-complete: no loops, no recursion (recursive CTE arrived in SQL:1999 in a restricted form). But every engine ships a procedural extension: PL/pgSQL (PG), T-SQL (MSSQL), PL/SQL (Oracle), SQLite's WITH RECURSIVE — and these are Turing-complete. When you "write SQL", you almost always write some dialect mix of these.

"SQL is the standard; dialects are the reality — and the gap between them is every SQL engineer's daily job."

04

The Dialect Zoo : DialectMatrix

"I write SQL" — really you write some engine's dialect. Ten major dialects, six features, what each one ships. The SQL standard is the map; engine dialects are the terrain.

Window fnCTERecursive CTEJSONArraysJIT
ANSI SQL:2016
PostgreSQL
MySQL 8.0+
MariaDB 10.6+
SQLite
MS SQL Server
Oraclepartial
BigQuery
Snowflakepartial
DuckDB
ClickHousepartial

How to read: "yes" = supported in standard syntax; "partial" = supported but with non-standard syntax or limits; "—" = not natively (often emulable via JSON etc.). The JIT column means engine-level LLVM-class compilation (PostgreSQL 11+, Oracle 12c+ ship it). State as of 2024-2026 — vendors are still catching up and rows will shift.

05

Famous Anti-Patterns : DontDoThis

Fifty years on, SQL anti-patterns are more stable than its good patterns — every new engineer rediscovers them. These six are the ones that make every DBA reach for the off switch.

SELECT *

Fine for ad-hoc; fatal in production code, views, or reports. A new column breaks downstream consumers; the optimiser loses column pruning; covering indexes go to waste. Listing columns is the floor.

N+1 queries

"For each row, run another query". 1 + N round-trips; at N=10000, latency alone kills you. The single most common ORM-induced bug — fix with one JOIN or a batched WHERE id IN (...).

OR in WHERE

WHERE a = 1 OR b = 2 — many optimisers can't use two indexes at once and fall back to a full scan. Rewriting as UNION ALL of two halves, or adding a composite (a, b) index, often unlocks speedup.

Implicit cast

WHERE id = '123' when id is INT: the engine often casts the column (cast(id AS TEXT)) and the index dies. Same trap with dates, charsets, collations. Match types literally — don't gamble on the optimiser.

GROUP BY 1, 2, 3

Using positional ordinals instead of column names — change the column order and the grouping silently shifts. Painful to review. Fine for one-off ad-hoc; never in production code, reports, or views.

NULL treated as empty

COUNT(col) ignores NULL but COUNT(*) doesn't. SUM silently skips NULLs, but col + 1 on a NULL gives NULL. Three-valued logic is the most invisible bug source in SQL; explicit COALESCE(col, 0) is the antidote.

06

OLTP vs OLAP — two workloads : OneSqlTwoWorlds

One SQL, two utterly different engines. OLTP (online transactional) is write-heavy, single-row, millisecond-scale; OLAP (online analytical) is read-heavy, aggregation-heavy, scans gigabytes. The engines diverge wildly, the SQL stays the same — this is what makes SQL genuinely remarkable.

OLTP

Online Transactional

For applications. "Give me user 1234's cart". A query touches a handful of rows; rate is 10⁴-10⁶ QPS; strong consistency; strict ACID.

  • Storagerow-store (rows contiguous)
  • Indexes — B-tree
  • Representative — PostgreSQL, MySQL, Oracle, MSSQL, SQLite
  • Per-query sizerows < 1000
  • Latency target — < 10ms
  • SQLSELECT single-table + LIMIT, lots of INSERT/UPDATE

Typical query: SELECT * FROM users WHERE id = ?. An index pinpoints one row — the DB may be TBs but no single query scans the whole thing.

OLAP

Online Analytical

For analysts, reports, ML features. "Total sales by country, last 30 days". A query scans billions of rows; rate is 10⁰-10² QPS; eventual consistency is acceptable.

  • Storagecolumn-store (columns contiguous)
  • Indexeszone maps / bitmaps / data skipping
  • Representative — ClickHouse, Snowflake, BigQuery, DuckDB, Redshift
  • Per-query sizerows 10⁶-10¹¹
  • Latency targetseconds to minutes
  • SQLSELECT multi-table + heavy GROUP BY, occasional bulk INSERT

Typical query: SELECT country, SUM(amount) FROM sales WHERE date >= ... GROUP BY country. No index helps — but columnar storage means "read only the country and amount columns" costs 1/100 of a row-store scan.

In one line: For OLTP, reach for PostgreSQL / MySQL / SQLite; for OLAP, reach for ClickHouse / DuckDB / Snowflake / BigQuery. The wrong engine costs you 100× in performance; the right one runs the same SQL you already wrote.

07

Why SQL Won't Die : WhySqlLasts

The past 25 years contain at least four predictions of SQL's imminent replacement: object databases, NoSQL, GraphQL, ORM abstractions. None landed — this section is about why SQL keeps surviving.

"

SQL doesn't survive because it's good — it survives because it's good enough and stable enough. For 25 years engines moved from single-node to cloud to columnar to distributed, and the SQL interface barely flinched. Replacing the language costs 100× more than replacing the engine, and the whole data stack knows this.

— industry folk wisdom, paraphrasedDistilled from 50 years of applied use

Declarative — let the optimiser bet for you

You say "give me these rows"; the optimiser decides join order, index choice, memory vs spill. Fifty years of query-plan research live behind that single SELECTwhich is precisely why imperative APIs never managed to replace SQL.

-- Same SQL, different engines:
-- PostgreSQL: nested loop + index
-- ClickHouse: parallel column scan
-- DuckDB:     vectorized hash join
-- Your SQL doesn't change.

The relational model — the only paradigm that lasted

Hierarchical / network / object / document / graph databases all came and went; relational + SQL is the only paradigm intact across 50 years. The reason: it's a mathematical definition (relational algebra), not one product's implementation. After the NoSQL wave receded, the consensus left behind was "SQL is non-optional".

-- Codd 1970: relations + tuples
-- ↓ 50 years
-- 2026: still relations + tuples

One language, N engines

The same SELECT runs against PostgreSQL (row-store OLTP) / ClickHouse (column-store OLAP) / SQLite (embedded) / Spanner (distributed) — the engine changes, the SQL doesn't. This is SQL's real power as a universal interface: swap engines without rewriting application code.

-- portable across:
-- · PostgreSQL  (row OLTP)
-- · ClickHouse  (column OLAP)
-- · SQLite      (embedded)
-- · DuckDB      (analytical)
-- · BigQuery    (cloud DW)

NoSQL's quiet reversal

The 2009-2014 NoSQL wave promised "death to schemas". A decade later the consensus: no schema doesn't mean no schema — it means the schema is scattered across application code. MongoDB ships an Aggregation Pipeline, Cassandra ships CQL, ksqlDB lets you SELECT against Kafka — SQL didn't win the argument; it just refused to be left behind.

-- MongoDB Aggregation Pipeline
-- looks suspiciously like SQL:
db.x.aggregate([
  { $match:    { y: { $gt: 10 } } },
  { $group:    { _id: "$z", n: { $sum: 1 } } },
  { $sort:     { n: -1 } }
])

"ORMs will replace SQL" → "SQL replaced the ORM"

In the 2010s everyone tried Hibernate / ActiveRecord / Django ORM / Eloquent — "finally no more SQL". In the 2020s everyone is writing raw SQL again, just type-checked: sqlc (Go), sqlx (Rust), Drizzle (TS), jOOQ (Java). The tide came in twice; the SQL was still there.

-- 2014 (Hibernate):
session.createCriteria(User.class)...

-- 2026 (Drizzle / sqlc):
-- write the SQL · get typed rows
SELECT id, name FROM users WHERE ...
SPOTLIGHT

Window functions SQL:2003 — the feature that dragged SQL into the modern era

Before 2003, "top 3 cubers per country" required a correlated subquery plus self-join — thirty lines of SQL that ran slowly. Window functions collapse it to one line: ROW_NUMBER() OVER (PARTITION BY country ORDER BY single) with WHERE rk <= 3. SQL's single most modernising step.

  • ROW_NUMBER()row index, ties get different numbers
  • RANK()ties get same rank, with gaps
  • DENSE_RANK()ties get same rank, no gaps
  • LAG / LEADprevious / next row
  • SUM() OVER (...)running totals / moving averages
  • FIRST_VALUE / LAST_VALUEwindow-edge values
-- top-3 333 singles per country
WITH ranked AS (
  SELECT
    name, country, single,
    ROW_NUMBER() OVER (
      PARTITION BY country
      ORDER BY single ASC
    ) AS rk
  FROM wca_results
  WHERE event_id = '333'
    AND single IS NOT NULL
)
SELECT * FROM ranked
WHERE rk <= 3
ORDER BY country, rk;
08

The Ten Engines : SqlEngines

SQL's real strength isn't the language — it's the engine diversity. From SQLite at tens of megabytes embedded in every iPhone, to Snowflake and BigQuery spanning multi-region clouds, the same SQL covers ten orders of magnitude of data scale.

2026 toolchain / clients / frameworks

psql
PostgreSQL CLI
pgAdmin
PG GUI
DBeaver
Multi-engine GUI
DataGrip
JetBrains · commercial
dbt
Data transform framework
Hex / Mode
SQL notebook BI
sqlfluff
SQL linter / formatter
pgBench
PG benchmark
Drizzle ORM
TS · SQL-shaped
sqlx (Rust)
Compile-time SQL check
sqlc (Go)
SQL → typed Go
PostgREST
PG → REST auto-API
09

SQL vs Python pandas vs MongoDB : HowItStacks

SQL's real competition isn't another SQL — it's different paradigms for getting at data. Python pandas is imperative dataframes; MongoDB is document queries. Three styles, the same "fetch data" task.

SQLPython pandasMongoDB
ParadigmDeclarative · relationalImperative · dataframesDocument + Aggregation Pipeline
OriginIBM · 1974Wes McKinney · 200810gen / MongoDB Inc · 2009
Data structureTables (row × column × type)DataFrameBSON documents (JSON-like)
SchemaEnforced (DDL-declared)Inferred dynamicallyOptional / weak
JOINFirst-class · 4 semanticsmerge()$lookup (added late · slow)
AggregationGROUP BY + HAVINGgroupby().agg()$group
TransactionsNative ACIDNone (in-process memory)4.0+ multi-doc transactions
Data scalebytes → many TBRAM-bound (~10⁸ rows)GB → TB · shardable
InteropEvery language has a clientDeep in Python ecosystemJS / Python / Java widely
2026 usageOLTP + OLAP + embedded + warehouseData-science explorationLogs / IoT / config / cache
TrendSteady · 50-year defaultDuckDB eating into itSteady · added SQL-ish interface
10

Outlook : NextFiftyYears

SQL at 56 won't change its syntax — but the surroundings keep moving. LLMs help humans write queries; distributed SQL reshapes the ops model; DuckDB pulls analytics back to the laptop. SQL stays still, everything around it moves.

HOT · 2026+

LLM-written SQL — still not "production-ready"

Since 2023, LLMs write SQL surprisingly well, and every BI tool (Hex / Mode / ChartDB) plugs one in. But "let the LLM write production SQL unattended" hasn't landed — three blockers: schema injection (context windows can't hold large schemas), hallucinated queries (LLMs invent non-existent tables or columns), and cost (LLMs don't reason about query plans — one bad SQL can melt a warehouse).

State of the art in 2026: "LLM drafts, human reviews" remains the norm. Text-to-SQL models (NSQL, Defog SQLCoder) hit ~85% accuracy on narrow schemas; complex joins / cross-DB / tuned queries still need humans.

Narrow schema / single DB (today)~85%
Complex joins / cross-DB / tuned~40%
DISTRIBUTED

Distributed SQL — after Spanner

CockroachDB / TiDB / YugabyteDB open-source the Spanner pattern: distributed strong-consistency plus SQL frontend. The 2026 question isn't "can it speak SQL" — it's "can ops cost match plain Postgres" — and it doesn't quite yet.

EMBED

DuckDB — embedded analytics returns

Analytics used to demand a Snowflake / BQ warehouse. Now DuckDB chews 100 GB of Parquet on a laptop and runs SQL 10-100× faster than pandas. "Small-data analytics moves back local" is the most under-discussed data-stack shift of 2024-2026.

In one line: SQL doesn't need a "future" — it's already the data layer's default, 56 years deep, and likely good for another 56. New engines speak SQL, new BI tools emit SQL, even NoSQL came back and added SQL. It isn't a winner — it's infrastructure.