Why Building Yet Another KV Database Misses the Point

The debate around key-value databases has reached a turning point in the data science community. As someone who’s implemented data systems for over a decade, I’ve watched the pendulum swing from “KV stores solve everything” to “stop building them already.” But what does the research actually tell us about when these seemingly simple data structures are appropriate and when they’re not?

The Right Tool for Specific Jobs

Key-value stores excel at certain tasks by design. Their simplicity creates genuine advantages for:

  • Caching systems: When performance matters more than complex querying
  • Distributed session storage: When you need quick access to ephemeral data
  • Configuration management: When you’re primarily concerned with name-value pairs
  • Nonce enforcement: When you need fast lookups with minimal overhead

The brilliance of KV stores lies in their conceptual clarity. They map keys to values with predictable performance characteristics: hash-based implementations offer O(1) average-case lookups, while ordered implementations built on B-trees or LSM-trees typically cost O(log n). This makes them well suited to scenarios where you’re frequently retrieving specific items by known identifiers.
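As a rough illustration of the caching and session cases, here is a minimal sketch of a dict-backed store with per-entry expiry. The `TTLCache` class, its method names, and the `session:42` key are assumptions made for this example, not the API of any particular product.

```python
import time
from typing import Any, Optional


class TTLCache:
    """Dict-backed key-value cache with per-entry expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._data: dict[str, tuple[Any, float]] = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def put(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)  # average-case O(1) hash lookup
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value


cache = TTLCache()
cache.put("session:42", {"user": "alice"}, ttl_seconds=30)
print(cache.get("session:42"))   # the stored dict, while the entry is fresh
print(cache.get("session:404"))  # None: unknown key
```

Everything here is a lookup by a known key; the moment you want "all sessions for alice", this interface has nothing to offer.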

When “Database” Becomes a Loaded Term

We’ve become somewhat sloppy with terminology. Is a key-value store actually a database? The answer depends on how we define our terms.

If a database must support complex queries, relationships, and structured data management, then no. But if we consider any persistent, organized collection of data a database, then yes—KV stores qualify, along with filesystems, spreadsheets, and even sorted lists.

This terminological confusion clouds the more important discussion: what’s the right data structure for your specific needs?

[Figure: database spectrum visualization]

Building on KV Foundations

Key-value stores often serve as building blocks for more complex data systems. Many distributed relational databases are built on KV underpinnings; CockroachDB and TiDB, for example, both layer SQL over a distributed key-value store. This architectural approach has distinct advantages:

  1. It leverages the proven reliability of KV systems
  2. It separates storage concerns from query processing
  3. It allows for incremental complexity as needed

But using KV stores as foundational elements doesn’t mean you should expose raw KV interfaces to your users. The abstraction you present should match the problem domain, not your implementation details.
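One way to picture this layering is a thin table-style facade over a KV mapping. The sketch below is hypothetical: the `UserTable` class and the key layout (`users/<id>` for rows, `idx/users/city/<city>/<id>` for index entries) are assumptions chosen for illustration, not the encoding used by any real database.

```python
import json


class UserTable:
    """Hypothetical table-style facade; callers never see the raw keys."""

    def __init__(self, kv: dict):
        self._kv = kv  # any mapping-like KV store; a plain dict stands in here

    def insert(self, user_id: int, name: str, city: str) -> None:
        row = {"id": user_id, "name": name, "city": city}
        self._kv[f"users/{user_id}"] = json.dumps(row)      # primary row
        self._kv[f"idx/users/city/{city}/{user_id}"] = ""   # secondary index entry

    def get(self, user_id: int):
        raw = self._kv.get(f"users/{user_id}")
        return json.loads(raw) if raw is not None else None

    def find_by_city(self, city: str) -> list:
        prefix = f"idx/users/city/{city}/"
        # An ordered KV store could serve this as a prefix range scan;
        # the dict stand-in has to filter every key instead.
        ids = sorted(int(k[len(prefix):]) for k in self._kv if k.startswith(prefix))
        return [self.get(i) for i in ids]


store = {}
users = UserTable(store)
users.insert(1, "Ada", "London")
users.insert(2, "Linus", "Helsinki")
users.insert(3, "Alan", "London")
print([u["name"] for u in users.find_by_city("London")])  # ['Ada', 'Alan']
```

Callers work with users and cities; the key encoding stays an implementation detail that could change without touching them.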

The Storage Engine vs. Database Distinction

A more useful distinction might be between:

  • Storage engines: The underlying mechanism for persisting data (disk, memory, network)
  • Data stores: Organized collections following a particular data model (key-value, document, graph)
  • Databases: Complete systems with management capabilities, query languages, and specific semantics

Under this taxonomy, key-value is merely a data model implemented by various storage engines. A complete database system provides additional capabilities on top of this foundation.

When KV Is Absolutely Not Enough

Despite their utility, KV stores face serious limitations for many use cases:

1. Relationship-Heavy Data

When your data’s value comes from connections between entities, KV stores become painful. You’ll find yourself manually implementing joins, which quickly devolves into:

# Anti-pattern: reconstructing relationships in application code.
# `kv` stands for any client exposing get(key); every call is a separate round trip.
user = kv.get("users:" + user_id)
orders = kv.get("orders:" + user_id)
for order in orders:
    order_items = kv.get("items:" + order.id)
    # ...and so on

This approach multiplies network trips and pushes complexity to application code.
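To put a number on those trips, here is a small sketch that counts lookups for one user with 50 orders. `CountingKV` is a hypothetical in-memory stand-in for a remote client, and the key names and data are made up for the example.

```python
class CountingKV:
    """In-memory stand-in for a remote KV client that counts get() calls."""

    def __init__(self, data: dict):
        self._data = data
        self.round_trips = 0

    def get(self, key: str):
        self.round_trips += 1  # each call would be one network round trip
        return self._data.get(key)


def load_order_history(kv: CountingKV, user_id: str) -> dict:
    user = kv.get("users:" + user_id)
    order_ids = kv.get("orders:" + user_id) or []
    items = {oid: kv.get("items:" + oid) for oid in order_ids}
    return {"user": user, "items": items}


data = {"users:u1": {"name": "Ada"}, "orders:u1": [f"o{i}" for i in range(50)]}
data.update({f"items:o{i}": ["widget"] for i in range(50)})

kv = CountingKV(data)
history = load_order_history(kv, "u1")
print(kv.round_trips)  # 52: one user + one order list + 50 item lookups
```

A relational database would typically answer the same question with a single joined query; batching calls (where the store supports multi-get) narrows the gap but leaves the join logic in your code.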

2. Analytical Workloads

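KV stores generally aren’t optimized for scanning large ranges of data efficiently. Values are typically opaque to the store, so filtering and aggregation tend to happen in the client. When you need to:

  • Aggregate across millions of records
  • Run complex statistical analyses
  • Process data in parallel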

…you’ll hit performance walls that dedicated analytical systems have already solved.
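A minimal sketch of why aggregation hurts, assuming one JSON-encoded order per key (the `order:<n>` keys and record fields are invented for this example): with no index or columnar layout to lean on, even a simple per-region sum has to fetch and decode every value.

```python
import json

# Hypothetical records: one JSON-encoded order per key.
kv = {
    f"order:{i}": json.dumps({"region": "eu" if i % 2 else "us", "total": i})
    for i in range(1, 1001)
}


def revenue_by_region(store: dict) -> tuple[dict, int]:
    totals: dict[str, int] = {}
    values_read = 0
    for raw in store.values():       # nothing narrows the scan
        order = json.loads(raw)      # each value is fetched and decoded whole
        values_read += 1
        totals[order["region"]] = totals.get(order["region"], 0) + order["total"]
    return totals, values_read


print(revenue_by_region(kv))  # ({'eu': 250000, 'us': 250500}, 1000)
```

The work grows linearly with the number of records, and against a remote store each of those reads also crosses the network.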

3. Consistency Requirements

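Many distributed KV stores default to eventual consistency and limit atomic operations to a single key, so applications requiring strong consistency guarantees may need more sophisticated systems. Banking transactions, inventory management, and other mission-critical operations benefit from multi-key ACID guarantees that go beyond basic KV capabilities. Some KV systems, such as etcd and FoundationDB, do provide strong consistency or transactions, so check what your store actually promises.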
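The inventory case can be sketched with a toy versioned store. `VersionedKV` and its `compare_and_set` method are hypothetical names for this example; the point is that a single-key compare-and-set can catch a stale write, but it does nothing for operations that span several keys.

```python
class VersionedKV:
    """Single-key compare-and-set over a dict; no multi-key transactions."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def get(self, key):
        return self._data.get(key, (None, 0))

    def put(self, key, value):
        _, version = self.get(key)
        self._data[key] = (value, version + 1)  # blind write, last writer wins

    def compare_and_set(self, key, value, expected_version) -> bool:
        _, version = self.get(key)
        if version != expected_version:
            return False  # someone else wrote first; caller must re-read and retry
        self._data[key] = (value, version + 1)
        return True


kv = VersionedKV()
kv.put("stock:widget", 10)

# Two clients read the same snapshot, then each tries to decrement by one.
a_value, a_version = kv.get("stock:widget")
b_value, b_version = kv.get("stock:widget")

print(kv.compare_and_set("stock:widget", a_value - 1, a_version))  # True
print(kv.compare_and_set("stock:widget", b_value - 1, b_version))  # False: stale version
print(kv.get("stock:widget"))  # (9, 2)
```

Had both clients used the blind `put`, the stock would read 9 after two sales. Moving money between two accounts needs both keys to change together, which is where transactional systems earn their keep.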

The Research Suggests: Fit Technology to the Problem

Recent database research points to a more nuanced approach than “KV stores everywhere” or “never use KV stores.” Systems like NewSQL databases, multi-model databases, and purpose-built time-series databases demonstrate that different workloads benefit from different architectures.

[Figure: workload-specific database architectures]

Instead of starting with the technology, start with your data access patterns:

  1. What are your read-to-write ratios?
  2. How important is query flexibility versus raw performance?
  3. What consistency guarantees do you actually need?
  4. How will your data volumes and access patterns evolve?

The answers will guide you toward the appropriate technology—which may or may not involve key-value stores.
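The first two questions can often be estimated from an operation trace before any technology is chosen. The sketch below assumes a hypothetical trace format of (operation, target) pairs; the operations and keys are made up for illustration.

```python
from collections import Counter

# Hypothetical trace of (operation, key-or-predicate) pairs from an application.
trace = [
    ("get", "session:1"), ("get", "session:2"), ("put", "session:1"),
    ("get", "session:1"), ("scan", "orders where total > 100"),
    ("get", "config:theme"), ("put", "session:3"), ("get", "session:3"),
]

ops = Counter(op for op, _ in trace)
reads = ops["get"] + ops["scan"]
writes = ops["put"]

print(f"read-to-write ratio: {reads / writes:.1f}")            # 3.0
print(f"point lookups among reads: {ops['get'] / reads:.0%}")  # 83%
```

A trace dominated by point lookups leans toward a KV store; a growing share of scans and predicates is a hint that query flexibility matters more than the trace first suggests.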

Building What Matters

The provocative statement “stop building KV databases” isn’t really about eliminating KV stores from our toolkit. It’s about moving past reinventing basic components to focus on solving higher-level problems.

We already have excellent KV implementations like Redis, RocksDB, and LMDB. What we need are better abstractions, smarter integration patterns, and systems that match real-world problems.

The next time you’re tempted to build a custom KV database, ask yourself: am I solving a genuinely new problem, or am I reinventing a wheel that’s already rolling quite well? Your users will thank you for focusing on what makes your solution uniquely valuable rather than its storage mechanics.