Posts Tagged ‘migrate’

SQLAlchemy Migrate Process Hiccups

Friday, December 5th, 2008

So, I’ve done migration processes for two medium-large database schemas (50-100 tables) and I have found what I believe to be a disconnect in the process of migrating and a database and developing a database application.


The Problem
——————-
Here is the problem in general.  I am sorry it is so long winded, but it’s hard to see what is going on without this full expression.

You create a schema, and use that schema to create a database.  Maybe
you are using Pylons, and using setup-app to create the schema. Everyone is happy in schema land, and then someone realizes you need to make a change, and maybe even more changes in the future.  You decide to use migrate, because hell, migrate will make things easier. So, you write the migration, modify your model code and everything is honky dory.  Clients are happy, your developer feels like he has a maintainable codebase.  Your project begins to grow.  Awesome.

Not Awesome.  Your second developer needs a development environment. You give them your model, they setup-app it and they are ready to go. While they are plugging along, you realize you need a new migrate, so you create it, and upgrade your development version of the database, which is now at migration 2.  The production also goes off without a hitch, it’s now on 2.

What about your new development buddy?  When he ran migrate_version, he set his database up to version 0.  So, migrate will try to do 0->1 and then 1->2, but 0->1 will fail because his database is already at version 1 even though his version table says its 0.  Now he is in pain, and re-creates his entire database to get moving, and the whole cycle starts over again, although he can work… until you do the next migration.

This is not ideal, and I have been struggling with it at one of my clients.  In that situation, I am the developer buddy, bopping in from time to time to lend a hand.  I spend probably 5% of my time getting up and running, where if their migration system was somehow linked to their database model, I could be synchronized, and actually use it for a development tool, rather than a dusty old production tool.

Solution 1
—————
So, I came up with my own solution, and have talked to a close friend about an alternative solution.  I’d like to discuss both of them as they pertain to migrate, in the hopes that we can improve migrate, and  re-connect the development environment with the production one.

My solution is to connect the model base code to the migration version.  The way I accomplish this is to add a version.py file at the root of my model code, which has, incredibly, the version of the migration changeset that matches this model definition.  I also have a custom database creation script which creates a migrate_version table and fills it with a record of the correct version number.  Now I can svn up, and migrate up and everyone is happy. The problem with this method is that I have to maintain a damn file with the version number, and not go insane doing it.  Since my model contains 3 different database schemas in it, that means I have to maintain 3 variables in my version.py .  Personally, I don’t mind this, but I had to write more documentation than the amount in this post to make certain that my predecessors (or me in 6 months) can figure out what the heck is going on.

Solution 2
————–
The second solution, which Mark came up with, is to preserve the initial model.  Then, you do all of the migrations to bring you up to the current codebase.  This does not require any version file tracking, and if your migrations add boilerplate entries, you don’t have to record them in a second place. I think solution 2 is reasonable, but could be more time consuming to execute (which is probably not really a problem with everyone’s 2ghz machines these days).  The problems I see are that the boilerplate entries you make for production may a not be the same as you make for development.  This makes testing a bit harder if they differ. I also feel that solution 2 is more prone to problems, simply because there are a lot more cranks to turn to get a development database.

How can migrate help
—————————–
The way to provide solution 1 to the problem in migrate would be to allow the developer to connect his migrations to his model code, and allow migrate some level of control over the “versions” module in the model code.  So, when you do a migrate commit, you are also saying that the model code supports this version of migration.  It will probably only take me  a few hours to make this happen.  We then create an easy way to create the migrate_version table by importing something.  This way you can have a custom createdb script that also creates the migrate_version table.  Perhaps we could even provide a reasonable createdb template.

Solution 2 only requires us to provide a custom creation script.  For this we will have to preserve the initial schema (schema 0) and then upgrade.  We could probably also include a template instead of a script so that you can manage boilerplate on your own.

My own conclusions
—————————-
I lean more towards solution 1, mostly because I feel that it will provide for easier boilerplating, because you will have access to your whole model at once, instead of bits and pieces of your model as it moves from one version the next.  I also think the boilerplate for dev. is likely to be a lot different than that of prod for most folks, and I feel like solution 1 offers an easier way of handling it.  It also seems like solution 1 is much more efficient, because you don’t have to go and re-do all of that migration work, you have already arrived at the correct-for-now solution.

If you made it this far congratulations!  Now, go grab a cup of coffee, stretch your arms, come back and tell me what you think.