What does HackerNews think of gh-ost?

GitHub's Online Schema-migration Tool for MySQL

Language: Go

#4 in MySQL
With foreign keys, I can't use https://github.com/github/gh-ost for zero downtime migrations. Instant deal breaker, because schema changes are inevitable given business evolution.
Gh-ost is the new hotness. Simple to use and lots of great features: https://github.com/github/gh-ost
MySQL has some robust tooling in this space. Some of the tools use triggers to copy to a new table. GitHub's gh-ost[1] is probably the state of the art, and uses the binary log stream to replicate the data.

[1] https://github.com/github/gh-ost

So you're right but also wrong thanks to good tooling. If you have enough discipline to not put triggers on your tables then pt-online-schema-change has got you sorted. Does it take forever? Yep, but does it matter that it takes forever? Nope, because it doesn't lock tables.

And even if you do have triggers, you can still do it with https://github.com/github/gh-ost.

For MySQL we use GitHub's tool gh-ost: https://github.com/github/gh-ost There are a few tech restrictions, but if your database meets them it works great.

I migrated huge tables with absolutely no impact on production.

Online schema change solutions have been around for over a decade and are commonly used to ALTER TABLE with no downtime (or with minimal interruption) on the largest deployments of MySQL today.

The two most common solutions are pt-online-schema-change and gh-ost, and if you are running MySQL today and still running direct ALTER TABLE statements that cause outages, then you're in for a pleasant change.

On top of that, most MySQL ALTER TABLE operations on InnoDB tables support non-blocking, lockless operation as well. My main concern with these is that they're still replicated sequentially, leading to replication lag.

MySQL is also slowly adding "Instant DDL", currently still limited to just a few types of changes.
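
As a rough illustration of those non-blocking and instant ALTERs, here is a minimal sketch in Go, assuming a hypothetical `orders` table; whether a given change qualifies for INSTANT or INPLACE depends on the MySQL version and the change itself (see the innodb-online-ddl link below):

```go
// Hedged illustration of MySQL's online / instant DDL options mentioned above.
// Table and column names are hypothetical.
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // MySQL driver
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(primary:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Ask MySQL to refuse anything slower than an instant metadata-only change.
	if _, err := db.Exec(
		`ALTER TABLE orders ADD COLUMN note VARCHAR(255), ALGORITHM=INSTANT`); err != nil {
		log.Printf("not instant-capable (%v); retrying as online in-place DDL", err)
		// In-place, non-locking rebuild: it still runs on the primary and is
		// replicated as one statement, hence the replication-lag concern above.
		if _, err := db.Exec(
			`ALTER TABLE orders ADD COLUMN note VARCHAR(255), ALGORITHM=INPLACE, LOCK=NONE`); err != nil {
			log.Fatal(err)
		}
	}
}
```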

Disclosure: I authored gh-ost (at GitHub) and oak-online-alter-table (the original schema change tool), and am a maintainer for Vitess, working on online schema changes in Vitess.

Links:

- https://www.percona.com/doc/percona-toolkit/3.0/pt-online-sc...

- https://github.com/github/gh-ost

- Past HN discussion: https://news.ycombinator.com/item?id=16982986

- https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-op...

- https://vitess.io/docs/user-guides/schema-changes/

Edited for formatting.

Again, thank you for the questions.

I am estimating that your database space isn't MySQL, which is just fine of course. The reason I'm asking/guessing is that in the MySQL space, online schema change tools have been around for over a decade and are the go-to solution for schema changes. A small minority of the industry, based on my understanding as a member of the community, uses other techniques such as rolling migrations on replicas etc., but the vast majority uses one of the common schema change tools:

- pt-online-schema-change
- Facebook's OSC
- gh-ost

I authored the original schema change tool, oak-online-alter-table https://shlomi-noach.github.io/openarkkit/oak-online-alter-t..., which is no longer supported, but thankfully I did invest some time in documenting how it works. Similarly, I co-designed and was the main author of gh-ost, https://github.com/github/gh-ost, as part of the database infrastructure team at GitHub. We developed gh-ost because the existing schema change tools could not cope with our particular workloads. Read this engineering blog: https://github.blog/2016-08-01-gh-ost-github-s-online-migrat... to get a better sense of what gh-ost is and how it works. In particular, I suggest reading these:

- https://github.com/github/gh-ost/blob/master/doc/cheatsheet....

- https://github.com/github/gh-ost/blob/master/doc/cut-over.md

- https://github.com/github/gh-ost/blob/master/doc/subsecond-l...

- https://github.com/github/gh-ost/blob/master/doc/throttle.md

- https://github.com/github/gh-ost/blob/master/doc/why-trigger...

At PlanetScale I also integrated VReplication into the Online DDL flow. This comment is far too short to explain how VReplication works, but thankfully we again have some docs:

- https://vitess.io/docs/user-guides/schema-changes/ddl-strate... (and really see entire page, there's comparison between the different tools)

- https://vitess.io/docs/design-docs/vreplication/

- or see this self tracking issue: https://github.com/vitessio/vitess/issues/8056#issue-8771509...

Not to leave you with only a bunch of reading material, I'll answer some questions here:

> Can you elaborate? How? Do they run on another servers? Or are they waiting on a queue change waiting to be applied? If they run on different servers, what they run there, since AFAIK the migration is only DDL, there's no data?

The way all the schema change tools mentioned above work is by creating a shadow (aka ghost) table on the same primary server where your original table is located. By carefully both copying data from the original table and tracking ongoing changes to it (whether by utilizing triggers or by tailing the binary logs), and using different techniques to mitigate conflicts between the two, the tools populate the shadow table with up-to-date data from your original table.

This can take a long time, and requires an extra amount of space to accommodate the shadow table (both time and space are also required by "natural" ALTER TABLE implementations in DBs I'm aware of).
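
As a rough sketch of that shadow-table flow (not gh-ost's actual code; table and column names are hypothetical, and the ongoing-change tracking is only noted in comments):

```go
// Minimal sketch of the shadow ("ghost") table flow described above.
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // MySQL driver
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(primary:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// 1. Create the ghost table with the same structure on the same primary,
	//    then apply the desired schema change to it (cheap: the table is empty).
	mustExec(db, `CREATE TABLE _orders_gho LIKE orders`)
	mustExec(db, `ALTER TABLE _orders_gho ADD COLUMN note VARCHAR(255) NULL`)

	// 2. Copy existing rows into the ghost table in small chunks, while a
	//    separate process tracks ongoing INSERT/UPDATE/DELETEs on `orders`
	//    (via triggers or binlog tailing) and applies them to `_orders_gho`
	//    as well. (See the row-copy and throttling sketches below.)

	// 3. Once the ghost table has caught up, cut over by atomically swapping
	//    the table names. (See the cut-over sketch further down.)
}

func mustExec(db *sql.DB, query string) {
	if _, err := db.Exec(query); err != nil {
		log.Fatal(err)
	}
}
```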

With non-trigger solutions, such as gh-ost and VReplication, the tooling has almost complete control over the pace. Given load on the primary server or increasing replication lag, they can choose to throttle or completely halt execution, to resume later on when load has subsided. We have used this technique specifically at GitHub to run the largest migrations on our busiest tables at any time of the week, including at peak traffic, and this has been shown to pose little to no impact to production. Again, these techniques are universally used today by almost all large-scale MySQL players, including Facebook, Shopify, Slack, etc.

> who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running, on the main server (apparently not) or on a shadow server (apparently either since migrations have no data)? Actually you mention "when your production traffic gets too high". What is "high", can you quantify?

The tool (or Vitess if you will, or PlanetScale in our discussion) will throttle based on continuously collected metrics. The single most important metric is replication lag, and we found that it predicts load better than any other metric, by far. We throttle at 1 sec replication lag. A secondary metric is the number of concurrently executing threads on the primary; this is more important for pt-online-schema-change, but for gh-ost and VReplication, given their single-threaded writes, we found that the metric is not very important to throttle on. It is also trickier, since the threshold to throttle at depends on the time of day, the particular expected workload, etc.
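
A minimal sketch of such lag-based throttling, assuming a hypothetical heartbeat table that the primary updates frequently; the loop and the 1 sec threshold are illustrative, not the tools' actual implementation:

```go
// Hedged sketch of lag-based throttling as described above.
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

// replicaLag estimates lag by comparing a frequently updated heartbeat
// timestamp (written on the primary) with the replica's clock.
func replicaLag(replica *sql.DB) (float64, error) {
	var lagSeconds float64
	err := replica.QueryRow(
		`SELECT TIMESTAMPDIFF(MICROSECOND, ts, NOW(6)) / 1000000.0 FROM heartbeat LIMIT 1`,
	).Scan(&lagSeconds)
	return lagSeconds, err
}

// copyNextChunk would copy one small chunk of rows into the ghost table and
// report whether the copy has finished; stubbed here (see the row-copy sketch
// further down).
func copyNextChunk(primary *sql.DB) (done bool, err error) { return true, nil }

func main() {
	primary, err := sql.Open("mysql", "user:pass@tcp(primary:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	replica, err := sql.Open("mysql", "user:pass@tcp(replica:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer primary.Close()
	defer replica.Close()

	const maxLag = 1.0 // seconds: the replication-lag threshold mentioned above

	for {
		lag, err := replicaLag(replica)
		if err != nil {
			log.Fatal(err)
		}
		if lag > maxLag {
			// Throttle: stop copying and let replicas catch up.
			time.Sleep(250 * time.Millisecond)
			continue
		}
		done, err := copyNextChunk(primary)
		if err != nil {
			log.Fatal(err)
		}
		if done {
			break
		}
	}
}
```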

> We run customers that do dozens to thousands of transactions per second. Is this high enough?

The tooling is known to work well with these transaction rates. VReplication and gh-ost will add one more transaction at a time (well, two really, but the 2nd one is book-keeping and so low volume that we can neglect it); the transactions are intentionally kept small so as to not overload the transaction log or the MVCC mechanism; the rule of thumb is to only copy 100 rows at a time, so expect possibly millions of such small sequential transactions on a billion-row table.
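
A minimal sketch of those small, sequential copy transactions, with hypothetical table and column names; real tools compute shared-column lists and chunk ranges more carefully:

```go
// Hedged sketch of chunked row copying in small transactions, as described above.
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(primary:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	const chunkSize = 100 // the "copy 100 rows at a time" rule of thumb
	lastID := int64(0)

	for {
		// Find the upper bound of the next chunk by primary key.
		var nextID sql.NullInt64
		err := db.QueryRow(
			`SELECT MAX(id) FROM (
			   SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?
			 ) AS chunk`, lastID, chunkSize).Scan(&nextID)
		if err != nil {
			log.Fatal(err)
		}
		if !nextID.Valid {
			break // no rows left to copy
		}

		// One tiny transaction per chunk, copying only the columns shared by
		// both tables; INSERT IGNORE lets rows already written to the ghost
		// table by the ongoing-change stream take precedence.
		if _, err := db.Exec(
			`INSERT IGNORE INTO _orders_gho (id, customer_id, amount)
			 SELECT id, customer_id, amount FROM orders
			 WHERE id > ? AND id <= ?`, lastID, nextID.Int64); err != nil {
			log.Fatal(err)
		}
		lastID = nextID.Int64
	}
}
```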

> Will their migrations ever run, or will wait for very long periods of time, maybe forever?

Sometimes, if the load is very high, migrations will throttle more. At other times, they will push as fast as they can while still keeping to the low replication lag threshold. In my experience a gh-ost or VReplication migration is normally good to run even at the busiest times. If a database system is such that it _always_ has substantial replication lag, such that a migration cannot complete in a timely manner, then I'd say the database system is beyond its own capacity anyway, and should be optimized/sharded/whatever.

> How is this possible? Where the migration is running, then? A shadow table, shadow server... none?

So I already mentioned the ghost table. And then, SELECTs are non-blocking on the original table.

> What's cut-over?

Cut-over is what we call the final step of the migration: flipping the tables. Specifically, moving away your original table and renaming the ghost table in its place. This requires a metadata lock, and is the single most critical part of the schema migration, for any tooling involved. This is where something has to give. Tooling such as gh-ost and pt-online-schema-change acquires a metadata lock such that queries are blocked momentarily, until cut-over is complete. With very high load the app will feel it. With extremely high load the database may not be able to (or may not be configured to) accommodate so many blocked queries, and the app will see rejections. For low-volume loads, apps may not even notice.
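
A minimal sketch of such a cut-over, with hypothetical table names; gh-ost's real cut-over is more involved (see its cut-over.md doc linked above), but the core is an atomic two-table rename under a brief metadata lock, with a short lock wait so a failed attempt can be retried instead of piling up blocked queries:

```go
// Hedged sketch of the cut-over step described above.
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(primary:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Keep the metadata-lock wait short: if the rename cannot acquire the
	// lock quickly, fail and retry rather than blocking queued queries.
	if _, err := db.Exec(`SET SESSION lock_wait_timeout = 3`); err != nil {
		log.Fatal(err)
	}

	for attempt := 1; attempt <= 5; attempt++ {
		// The swap is atomic: queries see either the old table or the new
		// one, but are briefly blocked while the metadata lock is held.
		_, err := db.Exec(`RENAME TABLE orders TO _orders_old, _orders_gho TO orders`)
		if err == nil {
			log.Println("cut-over complete")
			return
		}
		log.Printf("cut-over attempt %d failed: %v; retrying", attempt, err)
		time.Sleep(time.Second)
	}
	log.Fatal("cut-over failed; original table left in place")
}
```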

I hope this helps. Obviously this comment cannot cover much more, but hopefully the documentation links I provided are of help.

First off, every branch in PlanetScale has its own separate MySQL connection credentials (and conceptually just looks like a separate DB), so no. You'll be using your main production database as usual, and you'll have separate credentials for a branch where you can make schema changes like ALTER TABLE, and you "connect" to that branch with... whatever. Deploy a canary version of your app, use your SQL IDE, whatever. The main branch always prohibits changes like ALTER TABLE or CREATE DATABASE. Merging a branch where the CREATE/ALTER happened is the only way to bring changes to the production database that serves your users' queries.

When you merge a branch in PlanetScale, you can still write to the database while the migration is happening, online. This is done through "Online Schema Change", but the TL;DR is that a system in the background creates the new database, reads the mysql binlogs from your original database, including all the writes, and propagates those writes to the new database with the changes applied. It does this continuously, in real time. Once the new database is caught up, the system switches over transparently to make the changes atomically appear live.[1]

The requirement is that the running application server has to be able to handle both the old and new schema at the same time. This is easy if you do something like "add a column that isn't yet used and will be soon", but harder for "dropping a column that is in use". No database alone can solve this however, you have to coordinate your application layer correctly so that it can handle the switch over when you apply the changes. But this means you can have very high uptime with e.g. blue/green deployments and staged rollouts, and you'll never have to take your DB offline for it.
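
A minimal sketch of that "handle both schemas" requirement, using the classic expand/contract ordering for replacing a column; the table and column names are hypothetical, and in PlanetScale the ALTERs would go through the branch/merge flow rather than being run directly:

```go
// Hedged sketch of expand/contract schema changes, so old and new app
// versions can run side by side during the rollout.
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(primary:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// 1. Expand: add the new column; app versions that don't know it simply ignore it.
	exec(db, `ALTER TABLE users ADD COLUMN full_name VARCHAR(255) NULL`)

	// 2. Deploy app code that writes both `name` and `full_name`, then
	//    backfill existing rows (in practice chunked and throttled, as above).
	exec(db, `UPDATE users SET full_name = name WHERE full_name IS NULL`)

	// 3. Deploy app code that reads only `full_name`.

	// 4. Contract: once no running code references `name`, drop it.
	exec(db, `ALTER TABLE users DROP COLUMN name`)
}

func exec(db *sql.DB, q string) {
	if _, err := db.Exec(q); err != nil {
		log.Fatal(err)
	}
}
```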

It's also trite but "hundreds of writes a second" is nothing. Vitess can handle 100x that write load (probably much more) and deal with schema changes to petabyte-scale databases completely online. PlanetScale currently doesn't let you control the Vitess keyspace sharding configuration (yet), which is crucial to this, but I suspect it will happen eventually. But they'll also let you run Vitess in your own K8S environment while they provide a control plane, so you wouldn't pay list price, anyway.

I do not have any affiliation with PlanetScale but it's a pretty cool product I could see myself using for my low-volume needs, just because of the pay-as-you-go model where I don't pay for e.g. RDS allocation. And I say that as a really big PostgreSQL fan.

[1] https://github.com/github/gh-ost

GitHub uses MySQL, not Postgres. They built the best-in-class online schema change tool gh-ost [1], and have a custom declarative schema change execution system built around Skeema [2], which contains a wealth of linters [3].

Even so, it's always possible for an engineer to submit a schema change which is detrimental to performance. For example, dropping an important index, or changing it such that some necessary column is no longer present. Linters simply cannot catch some classes of these problems, as they're application/workload-specific. Usually they must be caught in code review, but people make mistakes and could approve a bad change.

Disclosure: I'm the author of Skeema, but have not worked for or with GitHub in any capacity.

[1] https://github.com/github/gh-ost

[2] https://github.blog/2020-02-14-automating-mysql-schema-migra...

[3] https://www.skeema.io/docs/options/#lint

SQLite did get support for renaming columns recently [1].

Anyway, the process you describe is also used in MySQL for doing online schema migrations [2], since "proper" database migrations cause downtime.

[1]: https://stackoverflow.com/questions/805363/how-do-i-rename-a...
[2]: https://github.com/github/gh-ost

I confirm the sibling post that adding a column is instantaneous on MySQL 8 (although there are limitations), and point to the reference document: https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-op....

Outside the context of db comparisons, and in relation to the specific case, if you don't have triggers/foreign keys on a given MySQL table, Gh-ost¹ solves the DDL locking issues.

¹=https://github.com/github/gh-ost

Re db migrations: they've built their own DB management tooling (https://github.com/openark/orchestrator) and online migration tooling (https://github.com/github/gh-ost)
They do, and obviously it is workable, but I think every big MySQL org ends up with a tool/workflow like gh-ost: https://github.com/github/gh-ost which has its own set of challenges, like not playing well with foreign keys.
A true version-controlled database is tough given the nature of the data. A bad migration and you lose data. Being able to do them on a replica and then run tests is cool. If you use MySQL, look into GitHub's tool https://github.com/github/gh-ost
Is there any equivalent for Django+MySQL? I historically found more tools for online schema changes in the MySQL sphere, such as gh-ost ( https://github.com/github/gh-ost )
That's not great either. Moving the "master" role from one server to a replica isn't instantaneous (meaning you have a period of read-only state), and while your replica is performing the schema operation you still need to keep the data in sync with changes made on the master server (which to the best of my knowledge is non-trivial).

There are tools that "replay" data changes though:

- FB's OSC (https://www.facebook.com/notes/mysql-at-facebook/online-sche...)
- GH's gh-ost (https://github.com/github/gh-ost)

Indeed. It seems there are many such tools available for MySQL: Openark kit [0] for instance, whose author also brought us gh-ost [1]. Unfortunately most tools focus on just MySQL, and none really have an answer when you want to use (and enforce) foreign key constraints. Although it's hard to say, I suspect the users of these tools have "given up" on foreign key constraints.

[0] http://code.openark.org/forge/openark-kit
[1] https://github.com/github/gh-ost

I don't think the claim of MySQL replication being better is related to statement vs. row vs. binary diff. I think it is about the tooling and community knowledge about replication, and about running MySQL in large scale production environments in general.

MySQL is run more often at extremely large scale (Facebook, YouTube, Twitter, Dropbox, etc.) than Postgres. That results in very battle-tested and/or featureful tooling like orchestrator (https://github.com/github/orchestrator), MHA (https://github.com/yoshinorim/mha4mysql-manager), ProxySQL (http://www.proxysql.com/), and gh-ost (https://github.com/github/gh-ost), along with knowledge and best practices shared by those organizations.

Another tool in this category is gh-ost by github: https://github.com/github/gh-ost

Its main advantage seems to be that it doesn't require triggers on the original table (it works by subscribing to the replication log), thus not making write locks worse during the migration.
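
For a sense of what "subscribing to the replication log" looks like, here is a minimal sketch using the go-mysql-org/go-mysql replication package (gh-ost reads binlogs through a library of this kind); the connection details and starting binlog position are hypothetical, and a real tool would filter events for the migrated table and re-apply them to the ghost table:

```go
// Hedged sketch of tailing the MySQL binary log instead of using triggers.
package main

import (
	"context"
	"log"

	"github.com/go-mysql-org/go-mysql/mysql"
	"github.com/go-mysql-org/go-mysql/replication"
)

func main() {
	syncer := replication.NewBinlogSyncer(replication.BinlogSyncerConfig{
		ServerID: 100, // must be unique among replicas of this primary
		Flavor:   "mysql",
		Host:     "primary",
		Port:     3306,
		User:     "repl",
		Password: "secret",
	})

	// Start streaming from a known binlog position, normally recorded when
	// the row copy begins so that no change is missed.
	streamer, err := syncer.StartSync(mysql.Position{Name: "binlog.000001", Pos: 4})
	if err != nil {
		log.Fatal(err)
	}

	for {
		ev, err := streamer.GetEvent(context.Background())
		if err != nil {
			log.Fatal(err)
		}
		// Row events carry the actual INSERT/UPDATE/DELETE data; a schema
		// change tool would re-apply the ones touching the original table
		// onto the ghost table.
		if rows, ok := ev.Event.(*replication.RowsEvent); ok {
			log.Printf("%s: %d row(s) changed on %s",
				ev.Header.EventType, len(rows.Rows), rows.Table.Table)
		}
	}
}
```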

I haven't tried either yet, so if anybody has some experience with them, do please share a comparison!