I find this topic difficult to reason about because I'm not intimately familiar ...

humodz · on March 3, 2025

Do you mind elaborating on why a db partitioned like that is not enough for your registration example? If the partitioning is based on the email address, then you know where the new user's email has to be if it exists; you don't need to query all partitions.

For example, following your partitioning logic, if the user registers as john.smith@example.com, we'd need to query only partition j.
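A minimal sketch of that routing, assuming partitions keyed by the first letter of the email address. The in-memory `partitions` map and the names `partition_key` and `register` are hypothetical stand-ins for real per-partition databases:

```python
from collections import defaultdict

# Hypothetical stand-in for per-partition databases: first letter -> set of emails.
partitions = defaultdict(set)


def partition_key(email: str) -> str:
    """Return the partition for an email: its first letter, lowercased."""
    return email.strip().lower()[0]


def register(email: str) -> bool:
    """Register an email, checking only the one partition that could hold it."""
    normalized = email.strip().lower()
    bucket = partitions[partition_key(normalized)]
    if normalized in bucket:
        return False  # already registered; no other partition was consulted
    bucket.add(normalized)
    return True
```

Because the partition is derived from the value being checked, the uniqueness check touches exactly one partition regardless of how many exist.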

refulgentis · on March 3, 2025

You're right, the email address example isn't clear-cut: it's not an issue at all at registration. From there, you could never allow an email change. Or you could just add a layer for coordination, e.g. we can imagine some global index that's only used for email changes and then somehow coordinates the partition change.

My broad understanding is that you can always "patch" or "work around" any single objection to partitioning or sharding—like using extra coordination services, adding more layers, or creating special-case code.

But each of these patches adds complexity, reduces flexibility, and constrains your ability to cleanly refactor or adapt later. Sure, partitioning email addresses might neatly solve registration checks initially, but then email changes require extra complexity (such as maintaining global indices and coordinating between partitions).

In other words, the real issue isn't that partitioning fails in a single obvious way—it usually doesn’t—but rather that global state always emerges somewhere, inevitably. You can try to bury this inevitability with clever workarounds and layers, but eventually you find yourself buried under a mountain of complexity.

At some point, the question becomes: are we building complexity to solve genuine problems, or just to preserve the appearance that we're fully partitioned?

(My visceral objection to it, coming from client-side dev virtually my entire career: if you don't need global state, why do you have the server at all? Just give me a .sqlite for my account, and store it for me on S3 for retrieval at will. And if you do need global state... odds are you or a nearby experienced engineer has Seen Some Shit, i.e. the horror that arises in a codebase worked on over years, doubling down on a seemingly small, innocuous initial decision, and knows it'll never be just one neat design decision or patch.)

gabeio · on March 4, 2025

> but then email changes require extra complexity

Check the other partition for the user name. Create the new user with the same pointer (uuid, etc.) to the user's sqlite file, then delete the old user in the original partition. Simple: user name changed. Not really that complex, to be honest. (After thinking this through I'm probably going to suggest we change to sqlite at work…)
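The steps above could be sketched roughly like this, assuming each partition maps an email to an opaque pointer for the user's sqlite file. The dict-based `partitions` and the names `partition_for`, `register`, and `change_email` are hypothetical:

```python
import uuid

# Hypothetical stand-in: first letter -> {email: pointer to the user's sqlite file}.
partitions: dict[str, dict[str, str]] = {}


def partition_for(email: str) -> dict[str, str]:
    """Return (creating if needed) the partition that owns this email."""
    return partitions.setdefault(email.lower()[0], {})


def register(email: str) -> str:
    """Create a user entry and return its pointer."""
    email = email.lower()
    bucket = partition_for(email)
    if email in bucket:
        raise ValueError("email already registered")
    bucket[email] = str(uuid.uuid4())
    return bucket[email]


def change_email(old: str, new: str) -> bool:
    """Move a user entry to the partition that owns the new email."""
    old, new = old.lower(), new.lower()
    source, target = partition_for(old), partition_for(new)
    if old not in source or new in target:
        return False  # unknown user, or the new email is taken
    target[new] = source[old]  # step 1: create the new entry with the same pointer
    del source[old]            # step 2: delete the old entry
    return True
```

Note that steps 1 and 2 write to two different partitions, so in a real deployment they are not atomic by default; a crash between them would leave both entries pointing at the same file unless something retries or cleans up.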

> if you don't need global state, why do you have the server at all?

Two reasons I can think of off the top of my head:

- validation (preventing bad actors, or just bad input)

- calls to external services

juliuskiesian · on March 3, 2025

What if the users are partitioned by ID instead of email? You would have to iterate through all the partitions.
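A small sketch of why that is, assuming users are placed by `user_id % NUM_PARTITIONS`. The partition count, the in-memory dicts, and the names `add_user` and `email_exists` are all hypothetical:

```python
NUM_PARTITIONS = 4  # assumed partition count, for illustration only

# Hypothetical stand-in: one dict per partition, mapping user_id -> email.
partitions: list[dict[int, str]] = [{} for _ in range(NUM_PARTITIONS)]


def add_user(user_id: int, email: str) -> None:
    """Place the user in the partition chosen by ID, not by email."""
    partitions[user_id % NUM_PARTITIONS][user_id] = email.lower()


def email_exists(email: str) -> tuple[bool, int]:
    """Return (found, partitions_scanned) for an email lookup."""
    email = email.lower()
    scanned = 0
    for partition in partitions:
        scanned += 1
        # The email says nothing about the ID, so every partition is a candidate.
        if email in partition.values():
            return True, scanned
    return False, scanned
```

A miss, which is the common case at registration, always scans every partition, since nothing in the email narrows down where the row could live.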

TylerE · on March 3, 2025

Not much of a partition if it's on what is essentially an opaque unique key.

manmal · on March 3, 2025

FWIW, I've seen consensus here on HN, in another thread on SQLite-on-server, that there must indeed be a central DB for metadata (user profiles, billing, etc.), and all the rest is then partitioned.

NathanFlurry · on March 3, 2025

I (sort of) disagree. Cassandra- & DynamoDB-based systems which are also partitioned do fine without a central OLTP DB.

Wrote a bit about it here: https://news.ycombinator.com/item?id=43246212