I'm a bit confused by the "branching" [0] and "non-blocking schema changes" [1] features. They are presented as "the next big thing", yet I don't see anything special here. I'm not saying they are bad concepts or ideas, quite the contrary; but they don't seem really useful either. Surely I'm missing something, so I would love to hear from the PlanetScale team here if possible.
I have a long, strong Postgres operational background, so I may be bringing assumptions that don't hold in MySQL/Vitess/PlanetScale. My main concerns/questions are:
* I can't imagine testing DDL changes without data. Having data there is so important for understanding the change and its impact that I won't make changes without it. And unless I'm mistaken, these branches only contain DDL, no data at all ("data from the main database is not copied to development branches").
* While it sounds neat and has a web UI and a CLI, managing schema branches with CI and an approval lifecycle is something that sounds like I could do, possibly better (as it would be more integrated with existing tooling and workflows), from Git platforms themselves, couldn't I? I could do branches, merges, CI, comments on MRs, approvals... I could even easily build a deploy queue ("promote") with CI. It doesn't sound too hard.
* I don't understand how the "safeness" and the non-blocking nature of changes are ensured. Many DDL changes will take different kinds of locks on rows or tables, which may cause queuing and even lock storms in the presence of incoming traffic. Without incoming traffic, they may run fine. In other words: the impact of a migration can only be determined in combination with the traffic hitting production. How does PlanetScale handle this? How, for example, is the case handled where a DDL changes a column's type to another type and causes a table rewrite, which essentially locks the table and prevents concurrent writes (see the sketch just below for the kind of change I mean)?
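To make that last point concrete, here is the kind of change I have in mind (purely hypothetical table and column names):

    -- A made-up orders table whose amount column needs a wider type.
    -- In a stock database, this kind of type change rewrites the whole table,
    -- and concurrent writes are blocked for the duration of the rewrite.
    ALTER TABLE orders
      MODIFY COLUMN amount DECIMAL(18,4) NOT NULL;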
Again, I'm not saying either concept is bad. The terminology and methodology may already be an innovation in themselves, and surely I'm missing a lot. But beyond that, I don't see myself using this (testing migrations without data is a showstopper, and not the only one), and I don't see much innovation from a safety perspective here.
Why isn't this a system where thin clones of the database are created as the branches (e.g. as in Database Lab Engine [2]), where you can play with data too, and then some data synchronization is performed to switch over to the branch once done (not easy at all, but doable with many precautions)? That would be a significant improvement to the process, IMHO.
[0]: https://docs.planetscale.com/concepts/branching
[1]: https://docs.planetscale.com/concepts/nonblocking-schema-cha...
[2]: https://postgres.ai/products/realistic-test-environments
Hi, engineer at PlanetScale and maintainer for Vitess here. I appreciate your thoughtful comment; a couple of answers:
> is something that sounds like I could do ... from Git platforms themselves
Git is very bad at analyzing SQL diffs. It can show you the textual diff between two CREATE TABLE statements, but it will not know what it takes to get you from _here_ to _there_. It has many parsing issues, like capturing unrelated columns in a diff due to trailing commas. Or, for example, if you change a column's data type as per your suggestion, Git cannot differentiate that from a complete drop and recreation of the column. It just doesn't have insight into how SQL works/parses. And most importantly, it is unable to provide the actual operational diff you're seeking: the ALTER TABLE statement to take you from state A to state B. At this point I just want to give a shout-out to skeema [0] and its underlying library tengo [1], which tackled this issue in a git-like manner for MySQL dialects/flavors.
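To make that concrete, here is a rough illustration with a made-up table: Git sees two versions of a CREATE TABLE, but what you actually need to run is the ALTER TABLE in between, which a schema diff tool derives:

    -- Version on the main branch:
    CREATE TABLE users (
      id BIGINT NOT NULL AUTO_INCREMENT,
      email VARCHAR(255) NOT NULL,
      PRIMARY KEY (id)
    );

    -- Version on the feature branch:
    CREATE TABLE users (
      id BIGINT NOT NULL AUTO_INCREMENT,
      email VARCHAR(255) NOT NULL,
      created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
      PRIMARY KEY (id)
    );

    -- Git shows a one-line textual diff; the operational diff you need is:
    ALTER TABLE users
      ADD COLUMN created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP;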
> Many DDL changes will take different kinds of locks on rows or tables, which may cause queuing and even lock storms in the presence of incoming traffic.
This is indeed one of our main premises. Online schema changes in PlanetScale (built on Vitess) will:
- Run concurrently with your production traffic
- Automatically throttle when your production traffic gets too high, taking particular care not to affect replication lag
- Run completely lockless throughout the migration, up to the cut-over point, where locking is required
- At the cut-over point, only cut over when smooth operation is predicted (i.e. when it is satisfied that its own backlog for cutting over is short enough that lock time will be minimal)
- On top of all that, PlanetScale also manages the lifecycle of the Online DDL: service discovery, scheduling, error handling, throttling (mentioned above), cleanup and garbage collection, and more.
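For a sense of what this looks like to the user in Vitess (roughly, per the docs linked below; exact syntax varies by version, and the table is the same made-up orders example), an online migration is simply an ALTER issued under an online DDL strategy:

    -- Ask Vitess to run subsequent DDL as managed, online migrations:
    SET @@ddl_strategy = 'online';

    -- Returns a migration UUID immediately; the work happens asynchronously:
    ALTER TABLE orders MODIFY COLUMN amount DECIMAL(18,4) NOT NULL;

    -- Track progress, throttling and completion of managed migrations:
    SHOW VITESS_MIGRATIONS;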
I just want to clarify that the above is based on proven technologies, widely used in the MySQL community, which I have been fortunate enough to be involved with over the past years. These run at scale for the largest deployments in the world today. We keep evolving Online DDL, with more to come.
Please also see these docs on the Vitess website: [2]
[0]: https://github.com/skeema/skeema
[1]: https://github.com/skeema/tengo
[2]: https://vitess.io/docs/user-guides/schema-changes/#the-schem...
(Comment edited for formatting and grammar)
Thank you for your comments, I appreciate them. I'm still not sold, however. I would like to understand the underlying principles, the "how this works". I don't need implementation details (happy to have them if shared, though), just the main principles of operation. Please see my further comments below:
> Git is very bad at analyzing SQL diffs.
Agreed, nothing against that. So PS has a nice SQL diff built in. Neat! But what does this really bring? I mean, it's not as if there aren't already SQL diff tools and tools to manage DDL migrations. Besides, why not layer it on top of Git? Many orgs and integration tools already have similar workflows (e.g. approval workflows, issue management tools, CI, etc.), and if instead of coming up with a new system this were a layer on top of the existing ones, it would probably have less friction to adopt. Just my perspective on this, of course.
> Run concurrently with your production traffic
Can you elaborate? How? Do they run on other servers? Or do they wait in a queue of changes to be applied? If they run on different servers, what runs there, since AFAIK the migration is only DDL and there's no data?
> Automatically throttle when your production traffic gets too high, taking particular care not to affect replication lag
Same as above: who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running: on the main server (apparently not), or on a shadow server (apparently not that either, since migrations have no data)? Actually, you mention "when your production traffic gets too high". What is "high", can you quantify? We run customers that do dozens to thousands of transactions per second. Is this high enough? Will their migrations ever run, or will they wait for very long periods of time, maybe forever?
> Run completely lockless throughout the migration
How is this possible? Where is the migration running, then? A shadow table, a shadow server... none?
> At the cut-over point
What's cut-over? Are groups of servers switched? That's what it sounds like to me, and it would explain how this could be lock-less and not affect production traffic. However, it does not explain how data is synchronized from the production database to the migration branch, nor how it keeps being updated with the real production traffic. This is essentially the crux of my failure to understand how this system works.
I apologize if these are too many questions. In essence, this all sounds really good, but unless I gain a deeper understanding of how the principles work, and they sound solid to me, I won't be able to recommend this for production usage, as I know from experience the many caveats migrations have. If they are all solved, hats off, but I would appreciate it if this were explained more clearly from a technical perspective.
Thank you!
I'm guessing that your database background isn't MySQL, which is just fine of course. The reason I'm guessing is that in the MySQL space, online schema change tools have been around for over a decade and are the go-to solution for schema changes. A small minority of the industry, based on my understanding as a member of the community, uses other techniques such as rolling migrations on replicas, etc., but the vast majority uses one of the common schema change tools:
- pt-online-schema-change
- Facebook's OSC
- gh-ost
I authored the original schema change tool, oak-online-alter-table https://shlomi-noach.github.io/openarkkit/oak-online-alter-t..., which is no longer supported, but thankfully I did invest some time in documenting how it works. Similarly, I co-designed and was the main author of gh-ost, https://github.com/github/gh-ost, as part of the database infrastructure team at GitHub. We developed gh-ost because the existing schema change tools could not cope with our particular workloads. Read this engineering blog: https://github.blog/2016-08-01-gh-ost-github-s-online-migrat... to get a better sense of what gh-ost is and how it works. In particular, I suggest reading these:
- https://github.com/github/gh-ost/blob/master/doc/cheatsheet....
- https://github.com/github/gh-ost/blob/master/doc/cut-over.md
- https://github.com/github/gh-ost/blob/master/doc/subsecond-l...
- https://github.com/github/gh-ost/blob/master/doc/throttle.md
- https://github.com/github/gh-ost/blob/master/doc/why-trigger...
At PlanetScale I also integrated VReplication into the Online DDL flow. This comment is far too short to explain how VReplication works, but thankfully we again have some docs:
- https://vitess.io/docs/user-guides/schema-changes/ddl-strate... (and really, see the entire page; there's a comparison between the different tools)
- https://vitess.io/docs/design-docs/vreplication/
- or see this self tracking issue: https://github.com/vitessio/vitess/issues/8056#issue-8771509...
Not to leave you with only a bunch of reading material, I'll answer some questions here:
> Can you elaborate? How? Do they run on another servers? Or are they waiting on a queue change waiting to be applied? If they run on different servers, what they run there, since AFAIK the migration is only DDL, there's no data?
All the schema change tools mentioned above work by creating a shadow (aka ghost) table on the same primary server where your original table is located. By carefully copying data from the original table while also tracking ongoing changes to it (either by utilizing triggers or by tailing the binary logs), and by using different techniques to mitigate conflicts between the two streams, the tools populate the shadow table with up-to-date data from your original table.
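A very rough sketch of the flow, using your column type change example on the same made-up orders table (the real tools generate their own table names and do much more bookkeeping):

    -- 1. Create the ghost table with the original structure, then apply the
    --    requested change to it; the ghost table is empty, so this is cheap:
    CREATE TABLE _orders_gho LIKE orders;
    ALTER TABLE _orders_gho MODIFY COLUMN amount DECIMAL(18,4) NOT NULL;

    -- 2. Backfill: copy existing rows from orders into _orders_gho in small
    --    chunks (see the chunked copy example further down).

    -- 3. In parallel, replay ongoing INSERT/UPDATE/DELETE traffic on orders
    --    onto _orders_gho, either via triggers (pt-online-schema-change) or
    --    by tailing the binary logs (gh-ost, VReplication).

    -- 4. When the backfill is done and the replay has caught up, cut over by
    --    swapping the two tables (see the cut-over example further down).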
This can take a long time and requires extra disk space to accommodate the shadow table (both time and space are also required by the "natural" ALTER TABLE implementations in the DBs I'm aware of).
With non-trigger solutions such as gh-ost and VReplication, the tooling has almost complete control over the pace. Given load on the primary server or increasing replication lag, they can choose to throttle or completely halt execution, and resume later when load has subsided. We used this technique at GitHub specifically to run the largest migrations on our busiest tables at any time of the week, including at peak traffic, and it has been shown to pose little to no impact on production. Again, these techniques are universally used today by almost all large-scale MySQL players, including Facebook, Shopify, Slack, etc.
> who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running: on the main server (apparently not), or on a shadow server (apparently not that either, since migrations have no data)? Actually, you mention "when your production traffic gets too high". What is "high", can you quantify?
The tool (or Vitess if you will, or PlanetScale in our discussion) throttles based on continuously collected metrics. The single most important metric is replication lag, and we found that it predicts load better than any other metric, by far. We throttle at 1sec replication lag. A secondary metric is the number of concurrently executing threads on the primary; this is more important for pt-online-schema-change, but for gh-ost and VReplication, given their single-threaded writes, we found that this metric is not very important to throttle on. It is also trickier, since the threshold to throttle at depends on your time of day, particular expected workload, etc.
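For illustration, lag is typically measured with a heartbeat row injected by the tool itself rather than by trusting the server's own lag counter. A simplified sketch, with a hypothetical heartbeat table (the real tools use their own changelog tables and sub-second resolution):

    -- On the primary, a heartbeat row is updated at a fixed interval:
    UPDATE _migration_heartbeat SET ts = NOW(6) WHERE id = 1;

    -- On a replica, the observed lag is the age of that heartbeat:
    SELECT TIMESTAMPDIFF(MICROSECOND, ts, NOW(6)) / 1000000.0 AS lag_seconds
      FROM _migration_heartbeat WHERE id = 1;

    -- If lag_seconds exceeds the threshold (e.g. 1 second), the migration
    -- pauses its row copy and binlog apply until the lag subsides.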
> We run customers that do dozens to thousands of transactions per second. Is this high enough?
The tooling is known to work well at these transaction rates. VReplication and gh-ost add one more transaction at a time (well, two really, but the second one is book-keeping and so low-volume that we can neglect it); the transactions are intentionally kept small so as not to overload the transaction log or the MVCC mechanism. The rule of thumb is to copy only 100 rows at a time, so expect possibly millions of such small sequential transactions on a billion-row table.
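A simplified sketch of one such copy chunk, again with the made-up orders/_orders_gho tables and column names (the real tools compute chunk boundaries dynamically over ordered key ranges rather than using fixed numbers):

    -- Copy one small chunk of rows from the original table into the ghost table.
    -- INSERT IGNORE avoids conflicts with rows already applied from the binlog.
    START TRANSACTION;
    INSERT IGNORE INTO _orders_gho (id, customer_id, amount, created_at)
      SELECT id, customer_id, amount, created_at
        FROM orders
       WHERE id >= 4200 AND id < 4300;   -- next ~100-row key range
    COMMIT;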
> Will their migrations ever run, or will they wait for very long periods of time, maybe forever?
Sometimes, if the load is very high, migrations will throttle more. At other times, they will push as fast as they can while still keeping to the low replication lag threshold. In my experience, a gh-ost or VReplication migration is normally fine to run even at the busiest times. If a database system _always_ has substantial replication lag, such that a migration cannot complete in a timely manner, then I'd say the database system is beyond its own capacity anyway and should be optimized/sharded/whatever.
> How is this possible? Where is the migration running, then? A shadow table, a shadow server... none?
So, I already mentioned the ghost table, which lives on the same primary as the original table. And SELECTs on the original table are non-blocking throughout.
> What's cut-over?
Cut-over is what we call the final step of the migration: flipping the tables. Specifically, moving your original table out of the way and renaming the ghost table in its place. This requires a metadata lock and is the single most critical part of the schema migration, for any tooling involved. This is where something has to give. Tools such as gh-ost and pt-online-schema-change acquire a metadata lock such that queries are blocked momentarily, until the cut-over is complete. With very high load the app will feel it. With extremely high load the database may not be able to (or may not be configured to) accommodate so many blocked queries, and the app will see rejections. With low-volume load, apps may not even notice.
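In its simplest form (pt-online-schema-change style), the flip is a single atomic rename; gh-ost's cut-over is a more elaborate, safer dance described in the cut-over doc linked above. Table names are the same hypothetical ones as before:

    -- Atomically swap the tables: the old table is kept around under a new
    -- name for safety, and the ghost table takes over the original name.
    -- The RENAME waits on a metadata lock, which is why queries may briefly queue.
    RENAME TABLE orders      TO _orders_del,
                 _orders_gho TO orders;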
I hope this helps. Obviously a single comment can only cover so much, but hopefully the documentation links I provided are of help.