From the Department of Redundancy Department
This is a comic about redundancy. It’s also redundant since it’s a redone version of my second comic. The difference is, this shall appear in my upcoming book, Seven Databases in Seven Weeks.
Something you need to understand about replication: it exists to improve read yield, uptime, and recovery. That’s about it. What it won’t do is increase write availability or allow you to scale horizontally like you can with sharded data. I just wanted to make this clear. It seems to be a common misunderstanding that because replication is “distributed”, it must also be “scalable”.
There are broadly two strategies for distributing data: replication and sharding. Sharding exists to stretch resources by dividing computation across several machines or, in weird cases, CPUs. This is why I hate the word distributed… it’s too vague to be useful.
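To make the difference concrete, here is a toy sketch in Python. It is not how Mongo or Riak actually work: the class names `ReplicatedStore` and `ShardedStore` and the hash-modulo placement are made up for illustration, and real systems add failover, rebalancing, and consistency handling that this ignores. The only point is to count write work per node. With replication every node absorbs every write, while with sharding each write lands on a single node.

```python
import hashlib


class ReplicatedStore:
    """Hypothetical toy: every node holds a full copy of the data."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.writes_per_node = [0] * node_count

    def put(self, key, value):
        # A write must reach every replica, so adding nodes adds no write capacity.
        for i, node in enumerate(self.nodes):
            node[key] = value
            self.writes_per_node[i] += 1

    def get(self, key, replica=0):
        # Any replica can answer a read, which is where read yield comes from.
        return self.nodes[replica][key]


class ShardedStore:
    """Hypothetical toy: each key lives on exactly one node."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.writes_per_node = [0] * node_count

    def _owner(self, key):
        # Assumed placement rule: hash the key, then take it modulo the node count.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key, value):
        # Only the owning node does the work, so writes spread across the cluster.
        i = self._owner(key)
        self.nodes[i][key] = value
        self.writes_per_node[i] += 1

    def get(self, key):
        return self.nodes[self._owner(key)][key]


replicated = ReplicatedStore(3)
sharded = ShardedStore(3)
for n in range(300):
    replicated.put(f"user:{n}", n)
    sharded.put(f"user:{n}", n)

print(replicated.writes_per_node)  # [300, 300, 300]: every node did every write
print(sharded.writes_per_node)     # three counts that add up to 300
```

Losing a node in the replicated toy costs nothing but capacity for reads, whereas losing a node in the sharded toy loses that node's keys. That trade-off is why the two strategies tend to be combined.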
Mongo supports both and manages them quite deliberately. Contrast this with something like Riak, which shards and replicates by default, so you must be very deliberate to turn one off. I won’t venture to say which is better, because it’s largely driven by use case. But what I will venture to say is that one is much easier to manage operationally. Then again, Riak doesn’t have a company to manage it for you.