When I first started down the path to the Clam project, I realized I could get sub-second responses when the db, cache, PHP, and nginx were all colocated on the same machine. Colocation removes almost all network effects. So how could I use that to my advantage?
From a database perspective, for a small site, it is rather simple. There only needs to be one partition, with many replicas. For larger sites, it can get rather complex. For example, how should a site be partitioned?
At first glance, the wp_users, wp_usermeta, wp_options, and wp_term* tables would belong in a global partition that appears across all site partitions and replicas.
A naive partitioning scheme would be to partition by post ID. However, I’m not sure I like it. It would leave hotspots on the nodes, since a popular post concentrates traffic on whichever partition holds it. Still, it is the most logical scheme. A page load is generated by a user viewing a post … so that makes sense at first glance.
So let us take a look at this with a naive partitioning using even and odd post IDs:
There are many replicas of the “MetaDb” across all nodes. However, there are only a few replicas of the evens and odds of the posts. I’m going to venture down this partitioning scheme and see if the obvious works in practice.
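To make the even/odd idea concrete, here is a minimal routing sketch. It is only an illustration: the partition labels ("meta", "evens", "odds") and the partition_for function are placeholders invented for this example, and it assumes every non-global query can be tied to a single post ID.

```python
# Hypothetical routing sketch: decide which partition a query should hit.
# Partition names and this function are illustrative assumptions only.

# Tables treated as global (the "MetaDb"), replicated on every node.
GLOBAL_TABLES = {"wp_users", "wp_usermeta", "wp_options"}
GLOBAL_PREFIXES = ("wp_term",)  # covers the wp_term* tables


def partition_for(table, post_id=None):
    """Return the partition label for a table and optional post ID."""
    if table in GLOBAL_TABLES or table.startswith(GLOBAL_PREFIXES):
        return "meta"
    if post_id is None:
        # Post-scoped tables cannot be routed without a post ID.
        raise ValueError(f"{table} needs a post ID to be routed")
    return "evens" if post_id % 2 == 0 else "odds"
```

A call such as partition_for("wp_posts", 42) lands on the evens partition, while anything in wp_options goes to the global one. Note that nothing here addresses the hotspot concern: a single very popular post still pins its traffic to one partition.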
Interestingly, this brings up a few things, like caching and rebalancing the db partitions. In the caching realm, some things become impossible to cache. For example, we can no longer cache the wp_options table, since Node B’s changes, while replicated to the db, will not invalidate the copies held in the other nodes’ local caches. If we’re using some distributed cache like RethinkDB, maybe we can share cache invalidations across nodes. Until that work gets completed, caching is completely out.
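Here is a toy, single-process sketch of what sharing invalidations across nodes could look like. Everything in it is an assumption made for illustration: the InvalidationBus and Node classes are invented, a plain dict stands in for the replicated wp_options table, and a real deployment would need an actual message channel between machines.

```python
# Toy model: each node keeps a local cache; writes broadcast an invalidation.
# InvalidationBus and Node are hypothetical names, not part of any library.


class InvalidationBus:
    """Stand-in for whatever channel carries invalidations between nodes."""

    def __init__(self):
        self.nodes = []

    def register(self, node):
        self.nodes.append(node)

    def publish(self, key, sender):
        # Drop the key from every other node's local cache.
        for node in self.nodes:
            if node is not sender:
                node.cache.pop(key, None)


class Node:
    def __init__(self, name, db, bus):
        self.name = name
        self.db = db      # shared dict standing in for replicated wp_options
        self.cache = {}   # this node's local cache
        self.bus = bus
        bus.register(self)

    def get_option(self, key):
        if key not in self.cache:
            self.cache[key] = self.db[key]  # cache miss: read from the db
        return self.cache[key]

    def set_option(self, key, value):
        self.db[key] = value
        self.cache[key] = value
        self.bus.publish(key, sender=self)  # tell the other nodes
```

With two nodes sharing one bus, Node A caches a value, Node B changes it, and Node A's next read misses the cache and picks up the new value. Without the publish call, Node A would keep serving the stale copy, which is exactly the problem described above.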
Rebalancing the db requires quite a bit more work. Imagine that, in the diagram above, there were just Node A and the PostDb wasn’t partitioned at all. When Node B comes online, the “evens” PostDb has to be created from Node A’s data and then replicated. The PostDb for odds also has to be created from the original db … with zero downtime. That’s a challenge.
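One common way to approach a split like this is to copy a snapshot and then replay the writes that arrived during the copy. The sketch below shows only that data movement, using dicts in place of tables. The function names and the change-log format are invented for this example, and it does not solve the hard part (cutting traffic over with no downtime).

```python
# Hypothetical sketch: split one unpartitioned PostDb into evens and odds,
# then replay writes that happened while the copy was running.
# Dicts stand in for tables; a change is (post_id, row) and row=None deletes.


def split_posts(posts):
    """Build the evens and odds partitions from a snapshot of all posts."""
    evens = {pid: row for pid, row in posts.items() if pid % 2 == 0}
    odds = {pid: row for pid, row in posts.items() if pid % 2 == 1}
    return evens, odds


def apply_changes(evens, odds, changes):
    """Replay writes captured during the copy onto the new partitions."""
    for pid, row in changes:
        target = evens if pid % 2 == 0 else odds
        if row is None:
            target.pop(pid, None)  # the post was deleted during the copy
        else:
            target[pid] = row      # the post was inserted or updated
```

The replay step has to be repeated until the new partitions have caught up with the original, and only then can reads and writes be switched over. That switch is where the zero-downtime difficulty lives.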
Anyway, this is still theoretical. There’s still a long way to go… lots of experiments to do.
Until next time,
Rob