
Data Engineering System Design Software Architecture

May 30, 2026
2 min read


Why schema decisions are the ultimate "One-Way Door" in engineering.

In software, most decisions are reversible. You can refactor a messy class, swap out a library, rewrite a service, or change an API response shape. The cost is usually just developer hours.

But database schemas? Schema debt doesn't behave like regular technical debt. It compounds with every row you store. Once a table crosses roughly 500 million rows, data gravity takes over, and simple changes turn into high-stakes operations:

- Changing a column type usually requires a full table rewrite (hours of locking, or a complex online migration).

- Adding a NOT NULL column often forces a multi-step migration dance (add it as nullable, backfill, then enforce the constraint) to avoid locking the table.

- Upgrading a primary key (INT ➡️ BIGINT) means rewriting not just the main table, but every foreign key column that references it.

- Repartitioning typically means rebuilding the entire table and all its indexes from scratch.

- Denormalizing requires complex data backfills and coordinated application changes across every service touching those tables.
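
To make the "multi-step migration dance" concrete, here is a minimal sketch of a batched backfill. It uses Python's built-in sqlite3 module so it can run anywhere. The `orders` table, the `currency` column, the `'USD'` default, and the function name are all hypothetical, and a real migration on PostgreSQL or MySQL would need engine-specific handling for the final constraint step.

```python
import sqlite3


def add_column_in_batches(conn, batch_size=1000):
    """Add a hypothetical orders.currency column and backfill it in small batches.

    Returns the number of backfill batches that were committed.
    """
    # Step 1 (expand): add the column as nullable so existing rows stay valid.
    conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")
    conn.commit()

    # Step 2 (backfill): walk the primary key in ranges and commit each batch,
    # so no single transaction touches the whole table.
    batches = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        upper = rows[-1][0]
        conn.execute(
            "UPDATE orders SET currency = 'USD' "
            "WHERE id > ? AND id <= ? AND currency IS NULL",
            (last_id, upper),
        )
        conn.commit()
        last_id = upper
        batches += 1

    # Step 3 (verify): only when no NULLs remain is it safe to enforce NOT NULL.
    # The enforcement statement itself is engine-specific and omitted here.
    remaining = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE currency IS NULL"
    ).fetchone()[0]
    if remaining:
        raise RuntimeError(f"{remaining} rows still have no currency")
    return batches
```

On a toy table this is overkill. The point is the shape: each batch holds its locks briefly, and the loop can be paused and resumed from `last_id` if it has to stop partway.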

The math of architectural regret looks like this:

The cost of reversing a bad schema decision grows roughly linearly with data volume, and roughly quadratically with the number of services that depend on that schema, because each dependent service has to be migrated in coordination with the others.
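
As a toy illustration of that rule of thumb (not a measured model), the two terms can be written down directly. The additive form, the function names, and the unit coefficients are assumptions made for this sketch. Only the linear and quadratic shapes come from the claim above.

```python
def data_term(rows, cost_per_row=1.0):
    """Linear part: every row has to be rewritten or backfilled once."""
    return cost_per_row * rows


def coordination_term(services, cost_per_interaction=1.0):
    """Quadratic part: each dependent service coordinates with the others."""
    return cost_per_interaction * services ** 2


def reversal_cost(rows, services):
    """Assumed additive combination of the two terms, in arbitrary units."""
    return data_term(rows) + coordination_term(services)
```

Under these assumptions, doubling the row count doubles the data term, while doubling the number of dependent services quadruples the coordination term.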

At 10 billion rows, the math stops working entirely. Some decisions stop being "difficult to reverse" and become effectively irreversible: the only way back is extended, business-impacting downtime.

Speed matters, but 20 minutes of deliberate data modeling in week one saves 6 months of high-stress migration work in year three.
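
One example of what those 20 minutes can buy: a back-of-the-envelope check on how long a signed 32-bit INT primary key will last before it needs the BIGINT upgrade. The function name and the growth rate of one million new IDs per day are hypothetical, and the estimate assumes growth stays constant.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit INT key
INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit BIGINT key


def years_until_exhausted(current_max_id, ids_per_day, key_max=INT32_MAX):
    """Estimate years of headroom left in a key space at a constant growth rate."""
    remaining = key_max - current_max_id
    return remaining / ids_per_day / 365


# Hypothetical workload: a fresh table gaining one million rows per day.
int_years = years_until_exhausted(0, 1_000_000)
bigint_years = years_until_exhausted(0, 1_000_000, key_max=INT64_MAX)
```

At that assumed rate the INT key lasts a little under six years, which is well inside the lifetime of many production tables.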

If you are building at scale, measure twice. Because cutting twice might not even be an option.

#DataEngineering #SystemDesign #SoftwareArchitecture #Scalability #BackendEngineering #Database