Empower Blog
Multi-Tenant Data and MySQL
Here at Empower Campaigns we’re building a multi-tenant software as a service (SaaS) application. This means that while we have many organizations using our product simultaneously, they’re not able to see the data and activity of other organizations - each organization sees its own private version (instance) of the product.
There are many ways to accomplish this. We could duplicate every resource in our architecture and maintain a full stack for every client. Sounds expensive and error-prone - imagine tracking down the one database with a table that failed to update. At the other end of the spectrum, we could share everything and enforce separation in our application - i.e. the application always remembers to ask for only the current instance’s data. Also not good - a single developer mistake could cause data to leak between instances.
We’d like to find a comfortable position near the latter end of that spectrum since that would maximize server utilization, minimize infrastructure costs, and minimize the risk of updates (which we intend to do really often - more on that later.) We found an interesting solution in this blog post:
http://www.reachcrm.com/2010/03/11/multi-tenant-strategy-for-saas-using-mysql5/
The key is to give each instance its own database login with which to access views containing data only for that instance. Not separate schemas - the underlying tables contain data for all instances. This keeps our maintenance tasks (backup, deployment, etc.) manageable while making data separation failures very unlikely (and easily detected).
Here’s a concrete example. As root, create the base table, view and trigger:
As MySQL user tenant1, add data:
As tenant2, add data and verify that you see only your data:
As tenant1, verify that you still see only your data:
As root, verify that the tenant IDs were stored properly:
I ran some primitive benchmarks and didn’t see any performance bottlenecks in this approach. I thought we might see inefficient caching or poor query plans but we didn’t see any such problems. This approach does preclude persistent connections from the web tier. And of course the real test is production load. We may have to revisit this approach, but even if we have to abandon it our job is relatively easy - we can just change the model (or later service) layer to access the base tables and drop the views.
Dan Copeland, VP of Engineering (twitter: http://twitter.com/dpc47)