ObsessionDB
ClickHouse · Cloud · Infrastructure · Distributed Systems
I co-founded ObsessionDB after years running ClickHouse® infrastructure at scale. We kept writing internal tooling to work around the operational pain of running replicated tables. At some point we had 2,000 lines of workarounds and realized we were solving the wrong problem: the architecture was the issue. So we fixed it, and now it's open to the public.
The Product
ObsessionDB is managed ClickHouse with decoupled storage and compute. Shared engines instead of Replicated ones. All nodes share object storage as the source of truth. Compute is stateless. No ZooKeeper. No replication queues backing up at 3am.
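As a rough mental model of that architecture, here is a pure-Python sketch, not the actual implementation. `SharedStore` and `ComputeNode` are illustrative names: once a write is committed to the shared store, every stateless node sees it, with no replication step in between.

```python
# Conceptual sketch: object storage as the single source of truth,
# with stateless compute nodes that hold no data of their own.
# SharedStore / ComputeNode are illustrative names, not product APIs.

class SharedStore:
    """Stands in for the shared object-storage bucket (e.g. S3)."""
    def __init__(self):
        self.parts = {}          # part name -> rows

    def commit(self, part, rows):
        self.parts[part] = rows  # one write, immediately visible to all nodes

class ComputeNode:
    """Stateless: every query resolves against the shared store."""
    def __init__(self, store):
        self.store = store       # the only state is a reference to the store

    def count(self):
        return sum(len(rows) for rows in self.store.parts.values())

store = SharedStore()
writer, reader = ComputeNode(store), ComputeNode(store)
writer.store.commit("part_0", [1, 2, 3])
print(reader.count())  # -> 3: the reader sees the write without any replication queue
```

The contrast with ReplicatedMergeTree is that there, each replica holds its own copy and a coordination layer (ZooKeeper) has to converge them; here, convergence is trivial because there is only one copy to agree on.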
If you've run ClickHouse at scale, you know the pain I'm talking about. Schema changes that require an ALTER on every shard. Replica drift when one node fails mid-migration. ZooKeeper sessions expiring under load. We dealt with this for years before realizing the problems were baked into ReplicatedMergeTree itself. Now we just don't have them.
How It Works
Storage and compute scale independently. You don't need a platform team to run it. Your developers, and even your agents, get tools to handle it programmatically.
Storage
S3 is the source of truth. Hot data gets cached on local NVMe, cold data pulls from object storage when needed. You pay for what you store, not for replicated copies across shards.
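The hot/cold split behaves like a read-through cache. This is an illustrative Python sketch, not the real implementation (the actual hot tier is server-managed NVMe):

```python
class TieredReader:
    """Read-through cache: serve from the hot tier, else pull from object storage."""
    def __init__(self, object_storage):
        self.object_storage = object_storage  # cold tier (stands in for S3)
        self.cache = {}                       # hot tier (stands in for local NVMe)

    def read(self, key):
        if key in self.cache:                 # hot path: local hit
            return self.cache[key]
        value = self.object_storage[key]      # cold path: fetch from object storage
        self.cache[key] = value               # keep it hot for the next read
        return value

s3 = {"2024-01.part": b"rows..."}
reader = TieredReader(s3)
reader.read("2024-01.part")   # first read pulls from object storage
reader.read("2024-01.part")   # second read is a local cache hit
```

The billing point follows from the same picture: the cold tier is the only durable copy, so storage cost tracks what you store, not a replication factor.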
Compute
Stateless query execution across separate read and write nodes. They scale with load: add capacity in minutes, and when traffic drops, compute scales back down. You're not paying for idle machines.
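Load-based scaling reduces to a simple target calculation. The thresholds, node counts, and function name below are made up for illustration; they are not the product's actual scaling policy:

```python
def target_nodes(queries_per_sec, per_node_capacity=100, min_nodes=1, max_nodes=16):
    """Scale compute with load: enough nodes to cover demand, within bounds.

    All parameters are illustrative, not real product limits.
    """
    needed = -(-queries_per_sec // per_node_capacity)  # ceiling division
    return max(min_nodes, min(max_nodes, needed))

print(target_nodes(0))     # idle: scales down to the floor -> 1
print(target_nodes(950))   # burst: ~950 qps needs 10 nodes -> 10
```

Because nodes are stateless, acting on this number is just starting or stopping processes; there is no data to rebalance when the count changes.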
Developer Tools
Native ClickHouse wire protocol. Your existing tools and SQL just work. We built a CLI for schema migrations, health checks, reingestions, and other daily tasks. There's a console for monitoring and an API for automation. We also maintain AI skills for things like schema design and query optimization, the kind of knowledge that usually takes years to accumulate.
Who This Is For
Teams processing millions of events who want ClickHouse but not the infrastructure overhead. Engineers who've spent too many nights debugging replication lag. Companies that looked at hosted ClickHouse pricing and kept self-hosting anyway.
If you're already running ClickHouse, migration is straightforward. The wire protocol is native, so your queries work unchanged. We built this because we needed it ourselves.
What I Learned
Most distributed-systems complexity is created by the architecture, not inherent to the problem. ReplicatedMergeTree forces you to manage distribution; shared engines delegate it to object storage. One approach creates problems you spend years building tooling around. The other just doesn't have them.
Infrastructure products are different from applications. Reliability matters more than features. A database that's down is worse than one missing functionality. We think about the 3am scenario a lot, because that's when trust gets built or broken.
Starting a company with people you've worked with before matters. We knew each other's styles already. No surprises about who handles what. The technical debates are productive because the trust was there from day one.