Developer: Ruby on Rails
Guide to Ruby on Rails Migrations
by Matt Kern
Published June 2007
Join the exodus and make managing your database schemas a snap.
Developers are flocking to Ruby on Rails at an incredible pace. There's a great reason for this migration: Rails provides a robust framework built on one of the most flexible languages ever conceived.
DOWNLOAD
One of the tricks in the Rails bag is the concept of "Migrations." Migrations provide an excellent example of why developers would want to use the framework, and here's why: Generally speaking, managing changes to a database schema has been one of the most odious tasks for a team of developers. Most have relied on storing DDL in revision control, ever vigilant to ensure that our database creation scripts are updated and consistent with each rollout. That solution can be very clumsy in an Extreme Programming project. And because Rails encourages iterative development, it would be very easy to imagine the constant schema changes turning into nightmares.
Fortunately, Migrations allows a developer to manage rollout, and rollback, of database schema changes in a controlled and consistent manner, and one that happens to feel very natural to a Rails programmer.
A Troubled Past
There's probably as many ways to manage a schema for a project as there are open source content management systems. What most developers end up settling on is some sort of combination of ongoing database creation scripts and data load scripts written in pure SQL. Mine usually end up being called create_db.sql and populate_db.sql. They'll often look something like this:
CREATE TABLE artists (
id NUMBER(38) PRIMARY KEY,
name VARCHAR(100) NOT NULL,
updated_on DATE NOT NULL,
created_on DATE NOT NULL
);
CREATE TABLE albums (
id NUMBER PRIMARY KEY,
release_name VARCHAR2(255),
year DATE,
created_on DATE NOT NULL,
updated_on DATE NOT NULL,
artist_id NUMBER(38) NOT NULL
);
CREATE TABLE songs (
id NUMBER(38) PRIMARY KEY,
title VARCHAR2(255) NOT NULL,
length NUMBER(6,2) NOT NULL,
created_on DATE NOT NULL,
updated_on DATE NOT NULL,
album_id NUMBER(38) NOT NULL
);
CREATE SEQUENCE artists_seq;
CREATE SEQUENCE albums_seq;
CREATE SEQUENCE songs_seq;
There are quite a few problems with this approach, though. While it gets the job done, it's entirely inflexible. Every time you need to change the data model for any given enhancement one or both of these files must change. That's not so much a problem if you're using a version control system like Subversion or CVS because you'll at least have a version history. But if you're working on a project for distribution where you're maintaining three or four different sets of files for each database, it can be very painful—and completely contrary to the principle of DRY. Even with a single set of database files rolling out your changes is problematic at best. And if you need to roll the change back, it's likely that the only way to do it is to drop all the tables, restore to a known good version of your schema, and reload the data from a backup. You could choose to try to roll back the changes by retracing your steps, but it's very error prone.
Another potential problem area is portability. If your schema generation scripts were written for a specific database like MySQL and later in the development process you decide to move to Oracle, you'll have to rewrite the SQL in its entirety. The Migration library takes the burden of having to remember the details of each database vendor's implementation off your shoulders. You'll see how this works in practice later.
To get the most out of this article you'll want to have at least a rudimentary understanding of the way Rails works. Jump over to "Ruby on Rails on Oracle: A Simple Tutorial" if you're new to Rails or you just need a refresher on the conventions upon which Rails relies.
A Bright Future
Until now, we've been captive to this ball-and-chain of SQL generation scripts (or whatever your "favorite" mechanism for managing database schemas as been). Enter Rails' ActiveRecord Migrations.
At the most basic level Migrations allow you to define incremental changes to your data model (and data itself!). This sort of approach melds seamlessly with Agile and XP methodologies favored by Rails developers the world over. A Migration is defined by subclassing ActiveRecord::Migration and overriding the two required method definitions, self.up and self.down:
class SampleMigration < ActiveRecord::Migration
def self.up
end
def self.down
end
end
Within the up and down definitions you can create tables and indexes, add data, manipulate models and more. The concept is simple in practice and theory: Actions you want to perform to migrate to the next version of the schema go in the up method definition. Anything you need to do to revert the change goes in the down definition. For example:
1 class SampleMigration < ActiveRecord::Migration
2 def self.up
3 create_table :people do |t|
4 t.column :name, :string
5 end
6 end
7
8 def self.down
9 drop_table :people
10 end
11 end
On line 3 we create the table definition and enter a block where we can alter the table by adding, dropping or otherwise modifying the table definition. In this case we created a new column called "name" of type string. Rails maps Oracle data types to Ruby logical types like :string, :decimal, :text and so on.
To revert our change is simple, you need only drop the table you created in the up definition. On line 9 you do just that: drop the table. Couldn't be easier, right? Of course it can! This is Rails after all.
To make it even easier to implement Migrations the Rails team has given us a generator script to create the Migration file and make it conform to Rails conventions for running the Migration. In your RAILS_ROOT run:
$ script/generate migration SampleMigration
create db/migrate
create db/migrate/001_sample_migration.rb
Open the db/migrate/001_sample_migration.rb script and you'll see the class definition already laid out for you. All that's left is to fill in the method definitions!
You'll probably notice that the filename containing our migration has a three digit number appended to it, 001. That number plays a huge role in the Migrations implementation. Rails Migrations transparently adds a SCHEMA_INFO table to keep track of the current version number of the schema. That version number corresponds directly to the numbers appended to the actual migration files (more on that subject later).
First, let's see how to run a Migration. Rails makes heavy use of another Ruby library called "Rake". Rake is a pure ruby implementation analogous to Unix/Linux "make" utility and is another very useful tool for managing a project. Tasks are defined and run through Rake as rake <task>. So to run a migration you can issue. For a list of all database tasks rake knows about, use rake -T db*):
$ rake db:migrate
That command will run all the Migrations it finds under db/migrate until it's at the highest version number. You can override the version number by appending VERSION=x after the command. You can also pass RAILS_ENV="production" to force the migration to run on the production database (or test, dev, etc.). Be very careful with that; unless you've tested your migration extensively you could seriously damage your production database!
Migrations in the Wild: Discographr
Now you're going to walk through the use of Rails' Migrations in an example (contrived) application called Discographr. Discographr utilizes a very simple data model containing artists, albums, and songs.
Remember the antiquated SQL generation script you saw in the first part of this article? Let's convert that to a Migration. First you'll need to create the migration file using a generator:
$ script/generate migration StartupSchema
exists db/migrate
create db/migrate/001_startup_schema.rb
Notice how Rails takes your camel case words and splits them for you to follow the Rails convention for naming Migration filenames. If you look inside the file you'll see that the class name is automatically formatted to Rails conventions as well. Edit db/migrate/001_startup_schema.rb to read as follows:
class StartupSchema < ActiveRecord::Migration
def up
create_table :artists do |t|
t.column :name, :string, :null => false, :limit => 100
t.column :created_on, :timestamp, :null => false
t.column :updated_on, :timestamp, :null => false
end
create_table :albums do |t|
t.column :release_name, :string, :null => false
t.column :year, :date
t.column :created_on, :timestamp, :null => false
t.column :updated_on, :timestamp, :null => false
t.column :artist_id, :integer, :null => false
end
create_table :songs do |t|
t.column :title, :string, :null => false
t.column :length, :decimal, :precision => 6, :scale => 2
t.column :created_on, :timestamp, :null => false
t.column :updated_on, :timestamp, :null => false
t.column :album_id, :integer, :null => false
end
end
def down
drop_table :songs
drop_table :albums
drop_table :artists
end
end
To run the migration, simply call:
$ rake db:migrate
Open up SQLPlus and run the following query (b sure to log in as the correct user):
SQL> SELECT table_name FROM user_tables;
TABLE_NAME
------------------------------
SCHEMA_INFO
SONGS
ALBUMS
ARTISTS
You see the three tables you defined along with the Migration added and maintained SCHEMA_INFO table. A quick peek inside the SCHEMA_INFO table reveals what version of the schema you're currently at:
SQL> SELECT * FROM schema_info;
VERSION
----------
1
If you have a look at the definition of the tables according to Oracle you see:
SQL> DESC songs;
Name Null? Type
--------------------------------- -------- ----------------------------
ID NOT NULL NUMBER(38)
TITLE NOT NULL VARCHAR2(255)
TRACK_NUMBER NUMBER(3)
CREATED_ON NOT NULL DATE
UPDATED_ON NOT NULL DATE
ALBUM_ID NOT NULL NUMBER(38)
SQL> DESC albums;
Name Null? Type
--------------------------------- -------- ----------------------------
ID NOT NULL NUMBER(38)
RELEASE_NAME NOT NULL VARCHAR2(255)
YEAR DATE
CREATED_ON NOT NULL DATE
UPDATED_ON NOT NULL DATE
ARTIST_ID NOT NULL NUMBER(38)
SQL> DESC artists;
Name Null? Type
--------------------------------- -------- ----------------------------
ID NOT NULL NUMBER(38)
NAME NOT NULL VARCHAR2(100)
CREATED_ON NOT NULL DATE
UPDATED_ON NOT NULL DATE
There's no need to define the id column that Rails requires as a primary key. ("Requires" might be a bit of a strong word as you can override the name of the primary key column as you can with most Rails conventions—but things just go more smoothly if you follow the conventions.) The Migration knows what is expected of a table used in Rails and it takes the burden of remembering all these conventions. But how does Rails get the values used in the ID column? Just like it should for Oracle, of course, with sequences! Luckily, the OracleAdapter knows that, by convention again, Rails expects to find a sequence called <tablename>_seq for the ID column to use as a primary key value so it takes care of that detail for you:
SQL> SELECT sequence_name FROM user_sequences;
SEQUENCE_NAME
------------------------------
ARTISTS_SEQ
ALBUMS_SEQ
SONGS_SEQ
You'll notice that you do have to add your foreign key columns manually. Rails doesn't know what sort of relationship a given object will have with others so it's necessary to define those columns in the Migration table definition. As I'll explain later, though, I prefer not to formalize the foreign key relationship in the database and leave it to Rails in the application layer using macros like :belongs_to and :has_and_belongs_to_many. But for now, all you need to be concerned with is that the column is in place. The model classes will handle the rest.
Now, let's pretend that you forgot to add a column to the songs table for the track number. You could run the migration generator and add it in a new and separate migration, but since you just started this schema, instead you decide to roll back the schema and add it to the StartupSchema. (Expect lots of changes to the schema as you're relying heavily on the iterative agile development methodology, so let's just keep things tidy for now.)
To roll back to an earlier version add an environment variable to the rake command:
$ rake db:migrate VERSION=0
And, just to make sure you've got a clean slate:
SQL> SELECT table_name FROM user_tables;
TABLE_NAME
------------------------------
SCHEMA_INFO
SQL> SELECT sequence_name FROM user_sequences;
no rows selected
SQL> SELECT * FROM schema_info;
VERSION
----------
0
All looks well: the sequences have been dropped automatically along with the tables. The SCHEMA_INFO table remains, but a quick look inside it tells you that you're at version 0 again. Now you have a clean slate to work with and can add anything else you want to the StartupSchema class. So, add the following line to the StartupSchema Migration in the songs table definition:
t.column :track_number, :integer, :limit => 3
All that's left to after making the changes is to rerun the migration to move to version 1 again:
$ rake db:migrate
You now have a good prototype, but let's say that after a few days you decide that you really need to add a genre table so you can associate any of your three canonical models with a genre, such as a song or an artist with a genre. You also decide that the year column would be better handled as a four-digit integer than a date object as you're really not interested in doing many date calculations with the data.
Thanks to Migrations your changes are almost trivial to implement. First you'll use the model generator to build out all the files you'll need to add a new model. This includes stubs for unit testing and, of course, the migration file!
$ script/generate model Genre
exists app/models/
exists test/unit/
exists test/fixtures/
create app/models/genre.rb
create test/unit/genre_test.rb
create test/fixtures/genres.yml
exists db/migrate
create db/migrate/002_create_genres.rb
If you edit the newly created migration file you'll see:
1 class CreateGenres < ActiveRecord::Migration
2 def self.up
3 create_table :genres do |t|
4 end
5 end
6
7 def self.down
8 drop_table :genres
9 end
10 end
Rails' code generation continues to help you conform to best practices and Rails conventions by stubbing out everything for the new Genre model. The only thing that's left is to flesh out the table definition. (It can't be quite that smart!)
1 class CreateGenres < ActiveRecord::Migration
2 def self.up
3 create_table :genres do |t|
4 t.column :name, :string, :null => false, :limit => 100
5 t.column :created_on, :timestamp, :null => false
6 t.column :updated_on, :timestamp, :null => false
7 end
8 end
9
10 def self.down
11 drop_table :genres
12 end
13 end
Now change the year column:
$ script/generate migration ChangeAlbumYearToInteger
exists db/migrate
create db/migrate/005_change_album_year_to_integer.rb
Edit the migration file:
1 class ChangeAlbumYearToInteger < ActiveRecord::Migration
2 def self.up
3 add_column :albums, :year_int, :integer, :limit => 4
4 Album.reset_column_information
5 say_with_time "Updating albums" do
6 albums = Album.find_all
7 albums.each do |a|
8 a.update_attribute(:year_int, a.year.year.to_i)
9 say "#{a.release_name} updated!", true
10 end
11 end
12
13 remove_column :albums, :year
14 rename_column :albums, :year_int, :year
15 change_column :albums, :year, :integer, :limit => 4, :null => false
16 end
17
18 def self.down
19 add_column :albums, :year_date, :date
20 Album.reset_column_information
21 say_with_time "Updating albums" do
22 albums = Album.find_all
23 albums.each do |a|
24 a.update_attribute(:year_date, Date.new(a.year, 01, 01))
25 a.save!
26 say "#{a.release_name} updated!", true
27 end
28 end
29 remove_column :albums, :year
30 rename_column :albums, :year_date, :year
31 end
32 end
This migration was a bit tricky because you're assuming there's already data in the year column. You don't want to do something that would change the primary key as the Song model depends on the relationship. So, the first thing you do is create a temporary column on line 3 to allow you to do type transformation without losing data. Next, on line 4, you refresh the Album model's view of the columns because you added a new one and you'll want to use it right away. If you don't do this, the model isn't aware of the changes made to the underlying table. Then you retrieve all the records in the table on line 6 and iterate through them saving the value from the year column into the year_int column as an Integer on lines 7 and 8.
The say_with_time and say calls are convenience methods to let you print to the screen during your migrations. It's just a handy way of letting everyone know what's happening. You then delete the column on line 13 and rename the temporary column year_int to year on line 14. However, you may decide that you really want to require data in the year column—in which case you would add:null => false in the change_column call.
To get a feel for what this looks like in practice here's the output from running the migration (assuming there's one album in the database called "Greatest Hits"):
$ rake db:migrate
(in /Users/mattkern/dev/discographr)
== ChangeAlbumYearToInteger: migrating ========================================
-- add_column(:albums, :year_int, :integer, {:limit=>4})
-> 0.0404s
-- Updating albums
-> Greatest Hits updated!
-> 0.3315s
-- remove_column(:albums, :year)
-> 1.5591s
-- rename_column(:albums, :year_int, :year)
-> 0.0987s
-- change_column(:albums, :year, :integer, {:limit=>4, :null=>false})
-> 0.1794s
== ChangeAlbumYearToInteger: migrated (2.2352s) ===============================
The self.down definition reverses all those changes. Occasionally you'll make a change where you can't revert to the prior schema version. For example, if you had instead deleted all the data in the year column, effectively discarding all that data, you couldn't reverse the migration. In those cases you can raise an IrreversibleMigration exception in the down method.
Migrations are powerful, as we've seen so far, but it's important to keep in mind that they're not perfect. It's still possible to corrupt your database if you're not careful. In the next section you'll learn some potential pitfalls along with a few tips and tricks.
Potholes on the Road to Schema Nirvana
No code is perfect. If you want to use Migrations with Oracle, it's imperative that you use at least version 1.2.1 (ActiveRecord version 1.15.1). Versions prior to 1.2.1 can't handle :decimal data types and they fumble on options like :precision and :scale effectively precluding you from using any numbers with decimal points, both fixed and floats. Make sure you double-check your schema as Oracle sees it while testing your migrations, and make sure to test them on a development environment prior to running them in production. Rails has tons of built-in facilities to run multiple environments for development, testing, and production so there's really no excuse for not testing prior to deployment. That fact is one of the biggest benefits to using both the Rails framework and Migrations in particular.
Also, make sure that you set up a separate schema (user) for your Rails application. Part of a migration run includes doing a schema dump to a file called schema.rb. You can control weather this dump is SQL or Ruby (in Migration format of course) with a configuration parameter in environment.rb. If you are using a schema that has a lot of non-Rails tables, they all get inspected and dumped. So if you notice your migrations hanging at the end of a run, make sure you're not using a user like "system" to connect. You'll end up with a schema.rb filled with all the system tables—if you're patient enough to wait out the dump!
Generators are convenient but you can easily generate superfluous migration classes. For example, if you create an application with a "StartupSchema" as you did in developing Discographr, then use a generator to create your model; you'll automatically get another migration file for that model. Feel free to delete that migration since you've already handled the tables needed for the model. Watch out, though: if you don't delete it you get an error saying that the object already exists in the database.
If your migrations do fail your best bet is probably to wipe the database clean and rerun rake db:migrate to get up to the current schema version. Just make sure you fix the offending code first. And of course make sure you're not going to lose critical data. Well-tested migrations should have very little chance of failing in production, especially if you're loading sample data from your test fixtures.
Tips for Smooth Sailing
While many developers like make heavy use of constraints in the database, I find it easier and cleaner to keep most of them away from the database. I consider constraints part of the business logic of the application and find that they're easier to track if they stay in the application code and out of the database. Take foreign keys for example; although there are no built-in ways to add foreign keys to your Migrations, there are plugins and other help out there. However, I'm explicitly being vague because I think they're a bad idea. Adding foreign keys to your Migrations degrades database independence and it can cause problems when trying to fix data that's gone wrong. Rely on Rails filters and validations instead and avoid the headaches associated with adding foreign keys. Many developers, myself included, like to be able to develop on a lightweight database like SQLite3 and deploy to production on big workhorses like Oracle, and foreign keys in your migrations will fail on SQLite. Stay flexible.
Try to keep each Migration class to a single task or feature. Decide what works best for you and stick with it. My rule of thumb is that if the file/class name gets too long to describe the change being made it should be broken up into separate Migrations. You should be able to recognize the change from the filename without having to dig through the code itself. Remember, this is about making development more efficient and enjoyable. I'll even let you in on a little secret: ActiveRecord and Migrations can be used outside the Rails framework, too. (Okay, that's not really a secret, but it's good to know!)
Finally, test, test, and retest. Make sure that your migrations are solid before moving them to production. The order in which tasks are completed can make a big difference when running a migration so testing is absolutely crucial. For example, if you try to change a column with data in it before deleting the data, your migration will fail. I can think of no bigger mistake than to move a poorly tested Migration into production. Migrations are powerful, and they can really do damage to a production database if you're not careful. Of course, used wisely that power can go a very long way to increase your efficiency and enjoyment of the development process.
Ultimately, the sky's the limit when it comes to what you can do with Migrations. You can load data to your development database only with access to RAILS_ENV and test fixtures. You can do complex data transformations and calculations as part of a major feature upgrade. With Migrations you've got the power, elegance, and simplicity of the entire Rails framework at your fingertips for any schema or data change you can imagine.
Matt Kern has been searching for and developing ways to make life easier through technologies like Rails for years—mostly an attempt at finding ways to spend ever more time roaming the mountains of Central Oregon with his family. He is the founder of Artisan Technologies Inc. and co-founder of Atlanta PHP.