Who doesn't enjoy the big term, DevOps? It sounds promising and sounds interesting as well when someone says, they work as a DevOps engineer! But like they say, 

With great power, comes great responsibility!

Similarly, in the case of DevOps, every process, which DevOps exposes has to be handled with a great care or the results are catastrophic. In this post, I am going to talk about the swissknife of continuous delivery, in the terms of database migrations and maintenance. It has been quite a while, since I have joined DevOps practices and they are quite promising in what they make a statement for -- making development easier, and less cumbersome. But there are several problems, which arise as we move onwards toward the final stages of the cycle. The stage of continuous delivery is where you have to apply tests, like integration testing, and then push the content out to your users. Different application models, require different ways of managing the dependencies. For example, web applications require backend databases to be connected and initialized, whereas the local applications might require to check the database file to be there. 

But in all of these cases, databases must be checked and migrated if there are some changes in the model. Otherwise, the ORM, which you are using will show errors. 

Problem at hand

First, let us understand the problem, which we will be facing and then I will move onwards and show you how you can make appropriate changes to the model, and the code system to ensure that database models are always up to date for your applications. The problem is different, on server-side databases and for the client-side databases. We are focusing on the problems residing on the server-side. Server-side databases have a major problem as they, 

  1. Have to be available at every instance. There can be users accessing information every time, thus you cannot go down for even an instance. 
  2. Must be available for users through different software you have, sometimes your users access content through a web portal, sometimes a software or a Web API or Web Service. 
  3. Same database is being used, thus you cannot expect to have a group of people down and a group using it. 
  4. Even your own applications can cause a problem and crash. 
  5. Migrations of database, is required as soon as you update.
  6. Backward compatibility must be kept, so that in case if you didn't apply patch, your application runs on the older schema. 

To understand the problem we are facing, just try to visualize the following model, which we have, 

x
1
public class Note {
2
    public string Id { get; set; }
3
    public string Title { get; set; }
4
    public string Content { get; set; }
5
}

Now, we know that we have to manage a repository for the storage type of our notes, we need to see beyond the current state and see how the notes could possible change and how it could possible break our application. The notes can provide us, 

  1. Title information about the Note
  2. Some content for the Note
  3. The ID to search, query and slice-dice the note through.

But, what if we want to include more information to this Note? That is where the problem actually starts. At the moment, our table might look something like this, 

x
1
+------+---------+---------------------+
2
| Id   | Title   | Content             |
3
+------+---------+---------------------+
4
| 1    | Hey!    | Something here...   |
5
|------+---------+---------------------+

Imagine that now we have to add an extra column to our database table, in the C# program we can do that easily, but however we cannot do that in the database without a migration...

x
 
1
public class Note {
2
    public string Id { get; set; }
3
    public string Title { get; set; }
4
    public string Content { get; set; }
5
  
6
    // Addition; allows null.
7
    public DateTime? PublishedOn { get; set; }
8
}

As seen clearly, this cannot be used in our applications because the model has been changed and thus our database requires to upgrade as well. Now, let us go down and understand what are the best ways to avoid any downtime while we migrate our databases, to ensure a proper DevOps kind of development and release process. 

Migrating the database

I will give you a few hints, to understand where the problem starts, if you believe that the problem starts when the code executes, then you are wrong. Your control over the code has been lost the moment you published the code without proper maintenance. There are several factors that play a very vital role in smooth migration of databases and integration of databases with your application's code. If either case of them fails, everything fails and you have to call the test teams, and Ops for this. There are two types of changes in the database schema and both require a migration, 

  1. First one holds the case where you add columns to the database, and do not remove any.
  2. Second case holds where you delete the columns and not remove any.
  3. Third case, is actually an aggregate of them being, removing and adding the columns and thus you have to make sure both the cases are met. 

Now, let us look into solving the problem one by one.

Migration with addition of column

The first sort of migration is where you add a column to the database, and don't remove the column. In such cases, the problem is not with code as you can see here, 

x
 
1
+------+---------+---------------------+-----------------+
2
| Id   | Title   | Content             | PublishedOn     |
3
+------+---------+---------------------+-----------------+
4
| 1    | Hey!    | Something here...   | NULL            |
5
|------+---------+---------------------+-----------------+

You can see, if our queries were column-strict or not, we would still get the results pretty much simply. 

1
1
SELECT * FROM table
2
3
-- Or through column-selected
4
SELECT Id, Title, Content FROM table -- skipping PublishedOn

But, the only problem here, is what would happen if you access the data of a record where your PublishedOn value is null. The problem comes in the code, that is like this, 

x
1
var record = dbContext.Notes.Where(x => x.Id == 1234); 
2
3
// Assume it has no PublishedOn value
4
myControl.Text = record.PublishedOn.ToString("MMMM dd, yyyy"); // Accessible, but null, thus raising error.

Thus, we now need to check for NULL values, and ensure that our data has the values which we require it to have, not in most cases, but you can update the code to check for a condition, 

1
 
1
// Requires DateTime? and not DateTime
2
if(record.PublishedOn != null) {
3
     // Publish the data
4
} else {
5
     // Skip the date printing. 
6
}

Otherwise, you can have a valid value for the date time column in SQL database. Now, our database migration won't effect our database and the application, because we have specified that, 

  1. We can allow the NULL values to be there, NULL values would be the values for records which do not have this value. 
  2. We are properly checking and ensuring that our records are highlighting the state of data, printing date time only if there is any. 
  3. Any older application is not affected by this change, because they are unaware of this column. The changes are visible as you update your code, and your model. 
    1. This is a case, especially when you are using an API backed application and API requires the clients to send list of columns they want to access. 
    2. Mobile applications for instance follow this pattern, you either get a complete table (*) or you request for columns (Id, Title...).
    3. In such case, mobile app won't ask for this column at all, as it is new to it.

You can also try a fallback value, but that would not make much sense at all. Now, let us see, what would possibly happen once we delete the column from the table and how to solve the problem.

Migration with deletion of the column

This is somewhat critical case, because you are trying to remove data, unlike addition, this would require that your clients start to stop accessing the column information as well. For sake of this example, let us assume that instead of adding the PublishedOn column, we removed it instead and then we required to modify the clients as to support their request for data. 

Let's see what happens, once we do that, instead of showing the C# code I will start by the database table structure first, 

x
 
1
+------+---------+---------------------+
2
| Id   | Title   | Content             | 
3
+------+---------+---------------------+
4
| 1    | Hey!    | Something here...   |
5
|------+---------+---------------------+

Now that our table does not have a PublishedOn column, any request that accesses this information must fail. This is somewhat tough, because now we require to ensure that none of our clients access this information. So we do, 

x
 
1
public class Note {
2
    public string Id { get; set; }
3
    public string Title { get; set; }
4
    public string Content { get; set; }
5
  
6
    [System.Obsolete("This column is obsolete and will be removed. Update your repositories as per.")]
7
    public DateTime? PublishedOn { get; set; }
8
}

This way, our own application will not access the information, but there must be cases where your clients, might request this and some cases where your older applications are running. In such cases, it is not best approach to hide or delete the column. Note that this will never throw an exception (unless, IsError field is set to True), but in other cases it must give a formal warning to our developers as well as customers, telling them that this field is obsolete and must never be used. Now that said, Entity Framework will not ask for anything else, because the model is still same... Also, 

Now we see, upon this change, 

  1. Our developers won't use it and we can enforce that they never use it by treating the usage as errors. 
  2. Our clients will get a warning as well, and they can upgrade their systems and code to reflect the changes. 
  3. The database table will not be deleted. Deleting the table forces everyone to accept the change, which is not at all an agile way of development.

DevOps does require a prompt response, but not a sudden one. Thus, giving a warning before deleting the columns is a better approach because it supports both case scenarios: 1) Leaving the column and not using it at all, 2) Notifying the users that they must change their behavior, as future might remove the column. It does this all by properly ensuring the database does not get changed, plus it also ensures that none of our code breaks -- by only adding a single attribute, Obsolete.

Final words

In this post, we saw the two major types of changes in the database and how they can cause a problem in our application. We then saw how to properly take notice of these changes and make sure our code does not break as we move onwards in the DevOps cycle of continuous deployment. Hopefully, after this post you will be able to better understand how DevOps flows require your database administrators and database management tools to upgrade and make sure that a database migration does not require any downtime in the application itself.