Many years ago I was obsessed with Berkeley DB and its performance.
But when I discovered Tokyo Cabinet and Tokyo Tyrant I almost literally fell in love. We used it for things that would have been impossible without it at the time.
Still worth checking it out: https://github.com/hthetiot/Tokyo-Cabinet
I had a similar experience: I used Berkeley DB until I found SQLite ;) Of course it is not directly key/value, but its small size, simplicity and IO performance were amazing to me.
I love the AOSA book. I learned a lot about systems design from it. Ironically, I usually fail the Systems Design interviews at fancy companies because they only ask about LBs, sharding, obscure data structures like CRDTs, and whatnot.
Berkeley DB is one of those things everyone respected, for some reason, but that didn't actually work if you threw a bit of data at it. And not just for us. I remember talking to companies that paid them lots of money to work on reliability, and it never got better.
But I do remember reading much of the source (trying to figure out why it didn't work) and thinking "this is pretty nice code".
Well, it worked for Amazon: Berkeley DB was used extensively there as the main database, right from the beginning. I remember talking to an ex-Amazon engineer in 2006 who said BDB was still the main database used for inventory, and complained that everything was a mess, with different teams using different tech for everything. Around that time Amazon made DynamoDB to solve some of that mess, and it sat on top of BDB.
An old thread about this: https://news.ycombinator.com/item?id=29290095.
It worked well for Amazon because they kept it within a tight operating envelope. They used it to persist bytes on disk in multiple, smaller BDBs per node. This kept it out of trouble. They also sidestepped the concurrency and locking problems by taking care of that in the layers above. It was used more like SSTables in BigTable.
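To make the "many small stores per node, locking handled above" idea concrete, here is a rough sketch. It is not Amazon's code and it does not use BDB; it uses Python's stdlib `dbm.dumb` as a stand-in store, and the `ShardedStore` class, the shard count, and the SHA-1 key hashing are all my own assumptions for illustration.

```python
import dbm.dumb
import hashlib
import os
import tempfile
import threading


class ShardedStore:
    """Hypothetical wrapper: several small on-disk stores, one lock each.

    The underlying store is only asked to persist bytes; all concurrency
    control lives in this layer, not in the storage engine.
    """

    def __init__(self, directory, shards=8):
        os.makedirs(directory, exist_ok=True)
        # One small database file set per shard instead of one big database.
        self._dbs = [
            dbm.dumb.open(os.path.join(directory, f"shard-{i}"), "c")
            for i in range(shards)
        ]
        self._locks = [threading.Lock() for _ in range(shards)]

    def _shard(self, key: bytes) -> int:
        # Stable hash so a key maps to the same shard across restarts.
        digest = hashlib.sha1(key).digest()
        return int.from_bytes(digest[:4], "big") % len(self._dbs)

    def put(self, key: bytes, value: bytes) -> None:
        i = self._shard(key)
        with self._locks[i]:  # serialize access to this shard only
            self._dbs[i][key] = value

    def get(self, key: bytes, default=None):
        i = self._shard(key)
        with self._locks[i]:
            return self._dbs[i].get(key, default)

    def close(self) -> None:
        for lock, db in zip(self._locks, self._dbs):
            with lock:
                db.close()


if __name__ == "__main__":
    store = ShardedStore(tempfile.mkdtemp(), shards=4)
    store.put(b"sku:1", b"42")
    print(store.get(b"sku:1"))
    store.close()
```

The point of the design is that each store stays small and only ever sees one writer at a time, so the storage engine's own locking and concurrency features are never exercised.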
They phased out BDB before DynamoDB was launched. Some time between 2007 and 2010. By the time DynamoDB launched as a product in 2012(?), BDB was gone.
I wanted to love Berkeley DB; it was available everywhere, seemed simple, and was fast when tested. In practice it never worked well though, with pretty frequent corruption under load, and license confusion from Oracle. It has a lot of features you're never going to use, and if you try, you'll be disappointed.
There's no shortage of embeddable key-value stores with C bindings like LevelDB, RocksDB, or even gdbm, and all of them have worked better for me.
Loved this chapter, great design, well written.