Easy SQL Schema Migration for SqlAlchemy and Flask

While SqlAlchemy (and Flask-SqlAlchemy) provides an easy way to create DB schema from scratch,  migration of an existing schema is bit more challenging. As soon as you change attributes  in your declarative ORM models, the underlying DB schema is to be changed for application to work.   If you need to keep data , you’ll need to modify DB schema with some DDL commands (ALTER TABLE …)  and this functionality is not part of SqlAlchemy nor Flask-SqlAlchemy.

Usually this task is fairly manual,  but several tools exist to help you with it. Probably most sophisticated tool is Alembic and its Flask wrapper flask-Migrate. While Alembic advertise itself as “a lightweight database migration tool”, there is still a lot of manual  work to be done.  This additional effort is then rewarded with advanced features like downgrading, cross database compatibility, branching etc.

However if you need something simpler with just one database product in mind (PostgreSQL) and  forward  migration only, Alembic can be an overkill (especially requirement to learn custom scripting to modify DB).

So if we stick to PostgreSQL, we can use other tools to compare different database instances and generate migration SQL DDL scripts for us.   I’ve found this tool apgdiff very usefull. It’s written in Java and can compare two database dumps ( from pg_dump tool) and create differential DDL script. We just need two databases, one with the old and other with the new schema.   But is you are running tests,  you should already have another database for testing, which is upgraded to new schema as you are running tests after SqlAlchemy declarative models changes.

So my generic approach to DB schema migration is following:

  1. Add table to hold current version of schema in database (declarative model):
    class Version(Base):
        version = Column(Integer)
  2. Add required version of DB schema into Python source code:
    __db_version__ = 1
  3. Modify declarative models, run tests and create diff script with apgdiff, name it as SQL_DIR/migration/vX.sql, where X is the next version of DB schema. Add this script to version control. Increase __db_version__ in code.
  4. Create a migration task – it basically runs all scripts with version bigger then schema version stored in database and smaller or equal then version required by the code.  Scripts run in order of increasing db versions.  Here is migration task written for flask-script:
    def migrate_tables():
        import psycopg2
        print('This will migrate database to latest schema, you are advised to backup database before running this command')
        if prompt_bool('Do you want to continue?'):
            mdir = os.path.join(SQL_DIR, 'migration')
            if not version_obj:
                    version_obj=model.Version(version=0, version_id=1)
            old_version = version_obj.version
            if old_version == db_version:
                print('DB is at correct version %d'% old_version)
            scripts = []
            for script in os.listdir(mdir):
                m=re.match(r'v(\d+)\.sql', script)
                if m:
                    version = int(m.group(1))
                    if version <= db_version and version > old_version:
                        scripts.append((version, os.path.join(mdir,script)))
            connection = psycopg2.connect(database=settings.DB_NAME,
                                          user = settings.DB_USER,
                                          password = settings.DB_PASSWORD,
                                          host = settings.DB_HOST,
                                          port = settings.DB_PORT)
            connection.autocommit = True
            #connection = db.engine.raw_connection()  # @UndefinedVariable
                c = connection.cursor()
                for v,fname in scripts:
                    script = open(fname, 'rt', encoding='utf-8-sig').read()
                    print('Upgrading database to version %d'% v)
                    res = c.execute(script)
                    version_obj.version = v

    For PostgreSQL there is one tricky issue here.   Some DDL commands like ALTER TYPE type_name ADD  VALUE 'x' cannot run within transaction.   So as workaround  we use direct connection from psycopg2 package (because  all connections from sqlalchemy engine by default use transaction already, unless you configure the engine  with autocommit and we did not want  to this this here). If you will not need such special DDL commands, you can easily use connection provided from default slqlalchemy engine ( commented lines).

  5. Checkout code into desired environment (stage, production, …)  and run the migration task there.

After this infrastructure in set-up in your code you just need to run items 3 and 5 in next migrations.

Leave a Reply

Your email address will not be published. Required fields are marked *