Wednesday, February 2, 2011

Managing Database Code for Continuous Integration (Part 2 of 4)

Introduction
Part 2 of this four-part series on Continuous Database Integration (CDBI) covers creating, managing and provisioning a database project for a continuous integration environment. I will be using Visual Studio 2010 to create the database project (.dbproj) and Subversion (open source) for source control. Visual Studio is used mainly to manage the SQL scripts and, most importantly, for the deployment file it generates, but you can do without it and set up a complete CDBI system in an open source environment. Most of the tools selected in this article to set up CI are free.
If your environment makes extensive use of Team Foundation Server (instead of Subversion) and Application Lifecycle Management (ALM), you should look into setting up continuous integration with those tools before looking for free ones. If you are an open source shop (two thumbs up) or want to try standing up a CI environment on your own, read on.

This article focuses on the SQL Server database project type in Visual Studio 2010. The project type has ‘build’ and ‘deploy’ capabilities (along with a bunch of other features), and these are the pivotal component of the CI setup. If you do not want to use VS 2010 Database Projects and instead keep the database scripts in a plain folder structure on the file system, you need to create the re-runnable, deployable script yourself (an OSQL or SQLCMD command script that executes all your deployable .sql scripts). Make sure you test it against a local instance of your database to emulate the build and deploy features. On an earlier project I wrote a custom C# console application that looks into specific folders (Tables, Stored Procedures etc.) and creates a command line batch script with error handling embedded in it. I needed a custom deploy file generator because that project involved both Oracle and SQL Server and I was working with the Visual Studio 2005 .dbp project type. With the VS 2010 .dbproj project type you get all of these benefits plus database testing and refactoring, and it plainly makes managing a database project much simpler.
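For readers going the file-system route, here is a minimal sketch of what such a deploy file generator could look like. It is written in Python rather than C# and is not the tool I actually used; the folder names, their order and the function name build_deploy_batch are assumptions made for illustration. It emits one sqlcmd call per .sql file and stops at the first failure.

```python
from pathlib import Path

# Hypothetical folder names and run order; adjust to your own layout.
FOLDER_ORDER = ["Tables", "Views", "Functions", "Stored Procedures"]


def build_deploy_batch(root, server, database, folders=FOLDER_ORDER):
    """Return the text of a Windows batch file that runs every .sql file
    under the given folders with sqlcmd, stopping at the first failure."""
    lines = ["@echo off"]
    for folder in folders:
        folder_path = Path(root) / folder
        if not folder_path.is_dir():
            continue  # skip folders this project does not have
        # Sort by file name so the run order is repeatable between builds.
        for script in sorted(folder_path.glob("*.sql")):
            # -E uses Windows authentication; -b makes sqlcmd return a
            # non-zero exit code when the script raises an error.
            lines.append(f'sqlcmd -S {server} -d {database} -E -b -i "{script}"')
            lines.append("if errorlevel 1 goto :failed")
    lines += [
        "echo Deployment succeeded",
        "exit /b 0",
        ":failed",
        "echo Deployment failed",
        "exit /b 1",
    ]
    return "\r\n".join(lines) + "\r\n"
```

Write the returned text to a .cmd file and run it against a local instance first. Scripts within a folder run in file name order, so any dependency between them has to be reflected in the names.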
Before we begin on the CI setup of the database project, download the following tools to set up the environment:
1.    Subversion – For source control of the database project. This is by far the best and most comfortable source control system I have worked with (sorry, TFS). After installing Subversion and creating a repository, make a note of the repository location, as you will need it to link your database project to the repository.
2.    Plugin for Subversion integration with the Visual Studio IDE (either one of the two)
       a. Visual SVN (Free to try, $49 per license)
       b. Ankh SVN (Free)

Visual SVN vs. Ankh: I would recommend Visual SVN to a database-centric shop that does heavy-duty database development or has SSIS, Reporting and\or SSAS solutions. I have had problems getting the Ankh SVN plugin to work correctly with these project types: it does not recognize them from the IDE, and you end up managing them from Windows Explorer instead of using the commit\revert operations in the IDE. Visual SVN is much simpler to use and works perfectly with all the project types a database developer needs. Yes, it comes with a price tag, but $49 per license is dirt cheap. My experience dates from Visual Studio 2005 integration, and things may have changed with Ankh since then, so try it out and see what works best for your scenario.

3.    SQL Server SSMS Tools Pack: This is more of a helper plugin than a requirement. It helps you generate seed data, save custom snippets as hotkeys, generate CRUD statements and more. Once you start using it you will want to get more. Download it here.

Database Project Setup
Once the prerequisite software is installed, open Visual Studio 2010 and create a new SQL Server 2008 project.

Database Project Type

For demo purposes I am creating a database project for the AdventureWorks SQL Server 2008 database by reverse engineering it, that is, by importing the existing database during the project setup wizard. [If the AdventureWorks sample database is not installed on your database server, it can be downloaded from CodePlex.] Complete the project setup by following the steps appropriate to your database configuration. Leaving the settings at their defaults is also fine for the moment.
Once the project setup wizard completes, the solution explorer should resemble the fig. below. Right click on the project name “AdventureWorks” and select Properties to bring up the project settings. Click on the ‘Build’ option to view the location of the deployment script. Select the ‘Deploy’ tab on the left to view the deployment settings.

AdventureWorks Deployment Options

Now that we have the project ready, right click the project and select ‘Build’. The status bar should go from ‘Build Started’ to ‘Build Succeeded’. After the build succeeds, deploy the project by right clicking the project and selecting ‘Deploy’. This creates a deployment script named ‘AdventureWorks.sql’ (in Visual Studio\Projects\YourProjectFolder\sql\debug). Two deployment options are available; for now, leave the setting at its default, option 1.

1.    Create a deployment script (default)
2.    Create a deployment script and run it against a database.

Location of AdventureWorks.sql deployment script

The deployment script is the most important artifact for a successful CI system setup. It is a compilation of all the database objects belonging to your database project (including seed scripts, security etc.). After the first deployment, whenever a change is made to the database project, a script with the same name is generated that includes the changes.
The next step is to add your project to Subversion source control. Right click on the project and select ‘Add solution to subversion’, then select the repository path and add the project. Finally, right click, add the files, and commit\check in the solution. The project is now ready to be shared with anyone who has the setup listed earlier.
 

Preparing artifacts for CI
An isolated database instance of SQL Server needs to be provisioned for the continuous build and deploy of database scripts (tear-down and reinstall). This database should not be accessible to developers for development or testing purposes. Its sole reason for existing is to test the continuous deployment of the database, either on code commits or at regular intervals. It also serves as a sanity check of your end product at any point in time.
A Visual Studio 2010 database project produces the tear-down and install script (tear-down = recreate database) when you right click and select Deploy. In a continuous integration environment, however, we want this file created automatically on every build. This can be done with the VSDBCMD command. You can use VSDBCMD just to create the re-runnable deployable script (with the /dd- command line option) or to create and run it (with the /dd+ option). If you use it only to create the script, the script then has to be executed separately in each environment with either ‘sqlcmd’ or ‘OSQL’.

Ideally, I prefer using VSDBCMD just to create the deployment script and then handing the script over to the DBA along with its parameters (documented in an implementation plan for the database). DBAs are more familiar with sqlcmd\OSQL than with VSDBCMD, and using VSDBCMD to execute the deployment script requires a bunch of assemblies (dll files) to be copied to the database server. I am not sure how the production DBA of today will accept this change. Thinking like a developer, sure, VSDBCMD is cool and you should definitely use it in QA and production environments. But in the real world DBAs run the show. By creating the deployable file in development and then running that same file in QA and production using sqlcmd, you standardize your deployments and make them simpler and worry free. (Did I mention that VSDBCMD also requires a registry change on the database server if Visual Studio is not installed on it?)
It is not always a smooth ride, though. Enter the obstacle: the hardcoded variables in the deployment file.

Hardcoded variables in Visual Studio deployment file: The Visual Studio deploy process creates three default parameters in the deployment file: DatabaseName, DefaultDataPath and DefaultLogPath. The ability to edit\override them is what makes the sqlcmd vs. VSDBCMD discussion interesting.
The main advantage of sqlcmd over VSDBCMD is the ability to pass variables as parameters to the deployment script from the command line. This is a big advantage because the VS DB project hardcodes the database name and the data and log file paths (mdf and ldf) in the deployment script (AdventureWorks.sql, see the setvar commands below), and although there is a way to get around this with VSDBCMD, it is painful.


:setvar DatabaseName "AdventureWorks"
:setvar DefaultDataPath "C:\Program Files\...\DATA\"
:setvar DefaultLogPath "C:\Program Files\...\DATA\"

Note: The above variables can be suppressed by editing the project deployment configurations. (This option can be used at runtime via command line params also).
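Before overriding anything, it helps to see which variables a generated script actually hardcodes. The following is a small sketch, assuming the :setvar lines follow the name-then-quoted-value layout shown above; the function name read_setvars is my own and not part of any Microsoft tool.

```python
import re

# Matches lines such as:  :setvar DatabaseName "AdventureWorks"
SETVAR = re.compile(r'^\s*:setvar\s+(\w+)\s+"?(.*?)"?\s*$', re.IGNORECASE)


def read_setvars(script_text):
    """Return {name: value} for every :setvar line in a deployment script."""
    found = {}
    for line in script_text.splitlines():
        match = SETVAR.match(line)
        if match:
            # A later :setvar for the same name replaces the earlier value.
            found[match.group(1)] = match.group(2)
    return found
```

Feeding it the text of AdventureWorks.sql gives a quick checklist of the values a DBA has to supply or confirm for the target environment.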


At this point you have two options with sqlcmd: manually change the DatabaseName, DefaultDataPath and DefaultLogPath variables in the deployment file, or override them on the command line using the sqlcmd “-v” flag.
Ex: sqlcmd -S <server> -d master -v DatabaseName="NewAdventureWorks" DefaultDataPath="C:\Data\" DefaultLogPath="C:\Data\"

If you decide to go with VSDBCMD for creating the deployment file and deploying to the database server, a workaround is required to make this work. Complete the following workaround steps (skip both steps if you are going to go with sqlcmd for qa & production deployments):
1.    Override the DatabaseName at runtime with the TargetDatabase command line option. Ex: /p:TargetDatabase="NewAdventureWorks". This will override the :setvar DatabaseName "AdventureWorks" to “NewAdventureWorks”.

2.    Overriding file path variables: Let’s get something straight first – ‘The variables DefaultDataPath and DefaultLogPath cannot be overwritten’. Microsoft has received requests for this and is planning to allow for overwriting in the next release of database projects. For now we will have to do with a workaround.

a. Right click on project ‘AdventureWorks’ and select ‘Deploy’. Edit the Sql command variables file by clicking the Edit button.



b. Add two additional variables ‘myDataPath’ and ‘myLogPath’ as shown below.


c.    In the database project, navigate to Schema Objects \Storage\Files and change the data file path variable in the AdventureWorks_Data.sqlfile.sql and the log file path variable in AdventureWorks_Log.sqlfile.sql to reference the newly created command variables.
-    Rename $(DefaultDataPath) to $(myDataPath)
-    Rename $(DefaultLogPath) to $(myLogPath)


d.    Right click and Build the project. Navigate to .\AdventureWorks\sql\debug (location of your project) and open the AdventureWorks_Database.sqlcmdvars with Notepad. The new variables will be available to change in here.
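If there are many file definitions, the renames in step c can be scripted instead of done by hand. This is only a sketch: the variable names myDataPath and myLogPath come from step b, while the function name and the sample line in the usage note are made up for illustration.

```python
# Old variable reference -> new variable reference, as in step c.
RENAMES = {
    "$(DefaultDataPath)": "$(myDataPath)",
    "$(DefaultLogPath)": "$(myLogPath)",
}


def rename_path_variables(sql_text, renames=RENAMES):
    """Return sql_text with each default path variable replaced by its
    custom counterpart; text without those variables is returned as is."""
    for old, new in renames.items():
        sql_text = sql_text.replace(old, new)
    return sql_text
```

For example, a hypothetical line such as FILENAME = '$(DefaultDataPath)AdventureWorks_Data.mdf' would come back with $(myDataPath) in place of $(DefaultDataPath). Read each .sqlfile.sql file, pass its text through the function, and write it back before building.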


As you can see from steps 1 & 2 above, the workaround for using VSDBCMD can be a bit painful. Another important thing to keep in mind is that VSDBCMD does not execute pre-prepared deployment files. This is also something the MS team is considering changing in the next iteration. To create a deployment package, VSDBCMD needs the necessary assemblies, build files, manifest, sqlcmdvars file etc. to prepare the end product (the deployment file) and run it. With sqlcmd, on the other hand, it is easy to run pre-prepared deployment files (like AdventureWorks.sql).

Creating the build package (build files) & executing the deployment output
For now, I am going to demonstrate creating the deployable file with VSDBCMD (minus steps 1 & 2 above) and deploying it to different environments with sqlcmd instead of VSDBCMD.

The workflow of the continuous builds and deployments that we are trying to emulate is:
a. Clear existing deployable file: In the \sql\debug folder, delete the file AdventureWorks.sql.

b. Build the project: For now just right click and select “Build”. I will be using MSBuild to perform this task in the next article. Behind the scenes, when you right click and build, Visual Studio uses MSBuild for the build process to create the build files in \sql\debug folder.

c. Generate deployable file AdventureWorks.sql: Use the VSDBCMD command line tool on the integration machine, together with the database manifest file (AdventureWorks.deploymanifest), to generate the deployment file “AdventureWorks.sql”.

Before trying out this step, make sure VSDBCMD is installed on your integration machine.
-    If Visual Studio is not already installed, then follow the instructions here to download and install VSDBCMD.
-    If Visual Studio is already installed (VSDBCMD is located in “C:\Program Files\Microsoft Visual Studio 10.0\VSTSDB\Deploy”), make a reference to it by adding it to your PATH variable in environment variables (instead of copying the files).

Execute the following command from a command prompt in the \sql\debug folder:

VSDBCMD /dd:- /a:Deploy /manifest:AdventureWorks.deploymanifest


The /dd:- option ensures that the deployment script ‘AdventureWorks.sql’ is generated but not executed against the database server.

d.    Execute the output of the step above (AdventureWorks.sql) against the integration database instance using ‘sqlcmd’ to test the deployment.

sqlcmd -S <server> -d <database> -E -i <deployment script> -o <log file> -v DatabaseName="<name>" DefaultDataPath="<data path>" DefaultLogPath="<log path>"

Ex: SqlCmd -S myServer -d master -E -i C:\AdventureWorks.sql -o C:\LogOutput.txt -v DatabaseName="NewAdventureWorks" DefaultDataPath="C:\Data\" DefaultLogPath="C:\Data\"

Note: Variables set with the -v parameter override the hardcoded variables set in the deployment script.
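Steps a, c and d above can be strung together in a small script, which is roughly what the CI server will be doing for us later. Treat this as a sketch under assumptions: the VSDBCMD switches are copied from the command shown earlier, the file names match this demo project, both tools are assumed to be on the PATH, and the function names are my own. Step b (the build) is assumed to have run already.

```python
import subprocess
from pathlib import Path


def vsdbcmd_args(manifest):
    """Step c: generate the deployment script without running it.
    The switches are the ones used in the article's command line."""
    return ["VSDBCMD", "/dd:-", "/a:Deploy", f"/manifest:{manifest}"]


def sqlcmd_args(server, script, log, variables):
    """Step d: run the generated script, overriding its :setvar values."""
    args = ["sqlcmd", "-S", server, "-d", "master", "-E",
            "-i", str(script), "-o", str(log)]
    if variables:
        # One -v switch followed by name=value pairs.
        args.append("-v")
        args += [f"{name}={value}" for name, value in variables.items()]
    return args


def run_build(build_dir, server, variables, run=subprocess.run):
    """Run steps a, c and d in build_dir (the project's sql\\debug folder)."""
    build_dir = Path(build_dir)
    script = build_dir / "AdventureWorks.sql"
    # Step a: clear the previous deployable file, if there is one.
    if script.exists():
        script.unlink()
    # Step c: VSDBCMD reads the manifest from the build folder.
    run(vsdbcmd_args("AdventureWorks.deploymanifest"), cwd=build_dir, check=True)
    # Step d: check=True raises if sqlcmd returns a non-zero exit code.
    log = build_dir / "LogOutput.txt"
    run(sqlcmd_args(server, script, log, variables), check=True)
```

A call could look like run_build(r".\AdventureWorks\sql\debug", "myServer", {"DatabaseName": "NewAdventureWorks"}). The run parameter exists only so the commands can be inspected or swapped out in a dry run without touching a real server.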

Summary
This completes the preparation of the build items needed to set up the database project for Continuous Integration. Not to worry: the manual steps above are only meant to show what the CI product (Hudson) will orchestrate for us on the server in the next part of this series. Whether you use VSDBCMD only for build or for both build and deploy depends on your environment. If you can accommodate the changes needed to get VSDBCMD working in your environment, go for it; otherwise use it only on the build server to create the deployable product, and carry that product into the next environments (QA and Production).

The next part of the series deals with orchestrating the steps a through d above in a repetitive manner based on either code commit\check-in or regular intervals of time. This will be set up using free tools - Hudson and NAnt. The next article will explain in detail setting up Hudson as a CI server for Database projects and configuring NAnt tasks for the actual implementation. I will also take some time to discuss a proactive vs. a reactive CI setup and how that affects development. That’s all I have for now. Thanks for reading.