softdev/installer softdev/ softdev/copyDB

That wonderful software product you've developed and want to put live, how do you know it is going to work?

You have arranged with your client that their new software will be put live, today, out of hours.  It's been a long day, you're tired, if anything goes wrong, you know you're past fixing it.  Nervous?


You tested it, it's going to work, so you start going through your deployment.  It can't fail, right?

I met a friend for lunch some time ago.  He was most amused telling me the story of his morning.  His colleague was going on holiday, but the project he had been working on for a client was due to go live whilst he was away.  It was Very Important Stuff, though clearly needed signing off before going live, so this wasn't going to happen today.

His colleague had handed him a handwritten sheet of paper, listing all the files which needed to be copied from an export of HEAD in CVS over the top of their counterparts on the live server.  Alongside this were other instructions on directories to create, config files to change and .php files to manually adjust on the live server.  This was going to be very slow, nerve-racking, and would leave the system slightly broken for short amounts of time however fast he typed.

We chuckled about this for a while in disbelief, speculating about how much effort it might be to restore this server to its current state if it blew up, given all the random files being copied around manually in the name of deployment.  Luckily, nothing came of it as the manager involved forgot to sign it off for a couple of weeks.  So a happy ending, of sorts, at least for my friend.

But if you don't notice anything wrong with this situation, if you think the stress and the time taken to do a deployment like this is just a part of the fun of it, you should read on.

So what's the key to successful software deployment?  Everything.  Everything done on this project since its inception leads up to its perceived success.  Unlike all good stories, let's start at the end.

Note: This is written from a "PHP running on a LAMP stack" perspective, but the principles should translate well to almost any other set-up.

Pressing the button

What does it mean, to "press the button"?  To me, and I am very lazy, it means to press as few buttons as possible to deploy software.  No complication, no stress, knowing it will just work.  Unlike in the story above, I want to have some kind of program do everything for me - to remove the old code and put the new code in place, to load stored procedures in to the database, to do any other housekeeping.

Write an installer.  It doesn't have to do anything clever, and in fact you should write all your code in such a way that nothing clever needs to happen at install time:

  • Where configuration options are not in the main config file, have sensible defaults used or warnings sent out.
  • Make it so that no code needs to be changed in the production system.
  • Make it so the whole working directory (vhost) can be deleted and replaced without any other work needing to be done: Place your site configuration file outside the vhost.  Make sure any downloadables which are not in the database are outside the vhost.  Make sure temporary files and caches are either not relied on or are kept outside the vhost.

I can't stress that enough: No code should ever be manually changed on the production system.  There just isn't any need for it, even for reproducing bugs that only seem to happen on production data (see below).  You run the risk of introducing bugs now or having your changes removed by the next proper deployment.

To be technical about it, a layout like this works quite nicely to fulfil all these requirements:

Config file:


Vhost directory:


The code inside the vhost can easily find the config file as it is named after the directory it is in and is one level up.
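
As a rough sketch of that naming rule, here is how an install script might work out the config path from the vhost directory.  The function name config_path_for, the example path and the ".conf" extension are all made up for illustration; use whatever your project already has.

```shell
# Hypothetical helper: print the config file path for a given vhost directory.
# The rule from the text: same name as the vhost directory, one level up.
# The ".conf" extension is an assumption, not something the layout dictates.
config_path_for() {
    vhost_dir=${1%/}    # tolerate a trailing slash
    printf '%s/%s.conf\n' "$(dirname "$vhost_dir")" "$(basename "$vhost_dir")"
}

# Example: config_path_for /var/www/myproj
# prints /var/www/myproj.conf
```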

Now, you might baulk at the idea of writing an installer.  Your code needs to be installed and live nownownow, you don't have time to write an installer.  So don't.  Write an installer creator, which will take your lovely, well tested code and wrap it up in to a single file with an install script on the front.  There is an example one attached at the top; feel free to modify it and use it.  Protip: When you modify it for use in a single project, change the program arguments so that only the bare minimum needs to be typed each time.  For instance, your Install Paths are always going to be the same, as is your project name.  In your case, perhaps only the "release tag" is needed - be lazy, minimise your typing, minimise the knowledge you might need one day to hand over to a friend or colleague, minimise potential mistakes.
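
To show the general idea of "a single file with an install script on the front", here is a minimal sketch.  It is not the attached installer creator; the name make_installer, the __PAYLOAD__ marker and the argument order are my own assumptions.  It glues a small install script to a gzipped tarball of your code, and the install script unpacks whatever follows the marker.

```shell
# Hypothetical installer creator: wraps SRC_DIR into one self-extracting file.
# Usage: make_installer SRC_DIR OUT_FILE
make_installer() {
    src_dir=$1
    out_file=$2

    # The install script on the front. The quoted 'STUB' stops the shell
    # expanding anything now; it all runs later, on the target server.
    cat > "$out_file" <<'STUB'
#!/bin/sh
# Usage: sh installer.sh TARGET_DIR
set -e
target=${1:?usage: installer.sh TARGET_DIR}
# The archive starts on the line after the __PAYLOAD__ marker.
payload_line=$(awk '/^__PAYLOAD__$/ { print NR + 1; exit }' "$0")
# Unpack beside the target first, then swap, to keep the broken window short.
new_dir="$target.new.$$"
mkdir -p "$new_dir"
tail -n +"$payload_line" "$0" | tar -xzf - -C "$new_dir"
rm -rf "$target"
mv "$new_dir" "$target"
echo "Installed to $target"
exit 0
__PAYLOAD__
STUB

    # Append the gzipped tarball of the code after the marker.
    tar -czf - -C "$src_dir" . >> "$out_file"
    chmod +x "$out_file"
}
```

On the server, deployment is then one command along the lines of "sh myproj-installer.sh /var/www/myproj" (both names are examples).  Note that this sketch deletes and replaces the whole target directory, which is only safe if the config file, downloadables and caches live outside the vhost as described above.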

What else might we need to do when we "hit the button"?  With the new code in place, it should be a matter of upgrading the database and the config file.  And that is only if this is a "major" release.

Maintain upgrade.sql and readme files

Inside your code directory, keep a /docs.  Use htaccess to deny all web access to it, and file away in there any information on APIs and error codes, plus an upgrade.sql file and a readme file.

In your upgrade.sql file goes any database alter command or any data loading.  If you are using phpMyAdmin, it will give you the SQL to pop in here when you run it on your dev database.  If writing by hand, obviously just copy and paste it from your console.

To go live, simply:

cat upgrade.sql | mysql -u root -p myproddb

In your readme file goes anything else you might need to remember for later - mostly setting things in your config file, but maybe supporting packages to install in production too.  This is your checklist.  Get in to the habit of checking this before deployment and write it in a way that means you can just copy and paste to production any commands you might need to run.  The aim of the game is to minimise mistakes.  The sooner you're finished, the sooner you get dinner.

One could automate these things, but personally I prefer to have these steps under manual control.

But how do you know that when you've "pressed the button" it will all work?

Test your installer, test your code, test your DB

For the deployment to be immediately successful, you need to have well tested code and scripts.

But what does it mean to have well tested code?  You can just check things seem ok in your dev area and then push straight to live, right?

What we are after here is confidence.  Confidence that our software will work properly when put live, that it will behave properly with live data, that the live database can be upgraded cleanly and perfectly by your script.

Three steps to deployment testing heaven:

  1. Branch your code to a new area with a separate configuration file, configured for a separate database.
  2. Create and maintain a "Live Mirror" database
  3. Copy your Live Mirror to your testing database and run your upgrade.sql script over it

Step one should be fairly self explanatory: You should not test in your dev area.  You should create a separate area for testing, from scratch.  Remember: be lazy.  This should be fast to do, really just a single local branch of your code.  More on branching later.

Step two: a live mirror.  You don't want production data on your development server, it probably isn't secure.  But you do want a database which has that volume of data, where that data is completely natural and follows certain patterns.  

The way around this paradox is with anonymisation.  And another script.  You want making a new Live Mirror to be this easy:

Production server:

  mysqldump -u root -p livedbname | gzip -c9 > livedb.sql.gz

Dev server:

  scp prodserver:livedb.sql.gz .

  buildLiveMirror ~/livedb.sql.gz

Production server:

  rm livedb.sql.gz

 When you get back from making a cup of tea, your new Live Mirror database will be created.  But what's that script do?

  1. Drop and create myproj_livemirror database
  2. Load in that .sql dump file
  3. Run an anonymising .sql script over that database
  4. Remove that dump file of live data

Step three is where the magic happens: For every table which contains sensitive data, such as name and address, usernames, (hashed - always hashed) passwords, dates of birth, credit card numbers etc, you will have a line in that script to do an "update table" and replace each letter or digit in that field with a random character of the same type (letter or number).  You can write an SQL function quite easily that will do this, taking the current field contents and returning new field contents.  Do not use cryptography here, we are looking to replace the data with something that looks similar but is utter garbage. Cryptography would slow the process down and potentially make it insecure by making it reversible.  It does not need to be reversible: In a customer name, it doesn't matter what a letter is, as long as it is upper or lower case and the name is the correct length.  Same with a credit card or phone number.  Preserve non-alphanumeric characters, such as spaces or apostrophes.
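
To make the replacement rule concrete, here is a small demonstration of it on plain text.  This is not the SQL function itself, which lives in your database and is personal to your schema; it is just the same character rule written as a shell function around awk, and the name scramble is hypothetical.

```shell
# Hypothetical demo of the scrambling rule on plain text, one value per line.
# Letters become random letters of the same case, digits become random digits,
# everything else (spaces, apostrophes, hyphens) is kept as it is.
scramble() {
    awk -v seed="$$" '
    BEGIN {
        srand(seed)
        lower = "abcdefghijklmnopqrstuvwxyz"
        upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        digit = "0123456789"
    }
    {
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (index(lower, c))      c = substr(lower, int(rand() * 26) + 1, 1)
            else if (index(upper, c)) c = substr(upper, int(rand() * 26) + 1, 1)
            else if (index(digit, c)) c = substr(digit, int(rand() * 10) + 1, 1)
            out = out c
        }
        print out
    }'
}

# Example: printf '%s\n' "O'Brien 4111-1111" | scramble
# gives something of the same shape, with the apostrophe, space and hyphen kept.
```

The output depends only on whether each character was an upper case letter, a lower case letter or a digit, never on which one it was, so there is nothing to reverse.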

As a nice-to-have, have the script change the admin password to something easy for development.  There is nothing of interest now in your database, so just "password" would do nicely.

This anonymisation is obviously a slow step, that's why you now have a nice hot cup of tea in your hand.

Note: I have not included an example script here, as by this point you should be able to happily create a script to drop/create a DB, load in a dump file and execute an SQL script on your database.  Your anonymisation script is personal to you.

What are you going to do with this nice, accurate, live mirror of your production database?  Nothing.  Absolutely nothing.  You're not going to touch it.  You're going to copy it.  This bit is a lot faster.

./copyDB myproj_test

Another time saving script you will use many, many times: This will drop the database you specify, make a new one and copy your myproj_livemirror to it before running another script to load your stored procedures (more on this later).
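
For a feel of what such a script involves, here is a minimal sketch.  It is not the attached copyDB; the function name copy_db, the PROCS_SQL variable and the procedures.sql file name are assumptions, and it assumes your MySQL credentials come from ~/.my.cnf so that nothing prompts for a password.

```shell
# Hypothetical sketch of a copyDB-style script, not the attached one.
# Assumes MySQL credentials come from ~/.my.cnf, so no -u/-p here.
copy_db() {
    target_db=${1:?usage: copy_db TARGET_DB}
    source_db=myproj_livemirror
    procs_sql=${PROCS_SQL:-procedures.sql}   # assumed file holding your stored procedures

    # Never let the pristine mirror be the thing we drop.
    if [ "$target_db" = "$source_db" ]; then
        echo "refusing to overwrite $source_db" >&2
        return 1
    fi

    mysql -e "DROP DATABASE IF EXISTS $target_db; CREATE DATABASE $target_db" || return 1
    # Copy the mirror straight across without an intermediate dump file.
    mysqldump "$source_db" | mysql "$target_db" || return 1
    # Reload stored procedures into the fresh copy.
    mysql "$target_db" < "$procs_sql"
}
```

By default mysqldump leaves stored procedures out of the dump, which fits with loading them from their own file as the last step.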

Now you have a nice, fresh database to test with, you can test your upgrade.sql script.

cat upgrade.sql | mysql -u root -p myproj_test

If it works, great.  If not, fix and repeat the copyDB step to copy the _livemirror to a new _test DB and try again.  Once all good, go through your app - at least the bits you know the spec applies to - and make sure it works.

At this stage, a full regression test of your product might be in order.  You could even go through the last few steps to create a Staging area, completely separate from Dev, Test and Production, where your colleagues and even your client can have a play safely.

You're developing on a production quality database, testing code for deployment, but how do you know it will work on your production server?

Production Environment = Development Environment

Of course, all of this assumes that your development and production servers are in sync version-wise.  I like to do this in several ways:

  1. Use a Linux server distribution that has major version numbers.
  2. Do identical installs on Production and Development - say, Ubuntu 12.04 minimal, with no extras.
  3. Use a single "apt-get install" command to install your LAMP stack and all the PHP modules you think might be required.
  4. Put these few details in a page on your Wiki.  If you don't have a Wiki, set one up, they are invaluable for specs, notes, howtos, asset registers and, as here, server build details.
  5. Make sure the systems are regularly dist-upgraded.  If you install a package on one that relates to your code (e.g. a PHP module), install it on the other.  Note this in your readme file / production checklist.
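
If you want to check that the two servers really are in step, one option is to compare their package lists.  A minimal sketch, assuming you have saved a "name version" list from each server first; the function name package_drift and the file names are hypothetical.

```shell
# Hypothetical check: print the packages that differ between two saved lists,
# one "name version" pair per line. Lines only in the second file come out
# indented with a tab, which is how comm marks its second column.
package_drift() {
    dev_sorted=$(mktemp) || return 1
    prod_sorted=$(mktemp) || return 1
    sort "$1" > "$dev_sorted"
    sort "$2" > "$prod_sorted"
    comm -3 "$dev_sorted" "$prod_sorted"
    rm -f "$dev_sorted" "$prod_sorted"
}
```

On a Debian or Ubuntu box, each list can be produced with dpkg-query -W -f='${Package} ${Version}\n' redirected to a file, then "package_drift dev-packages.txt prod-packages.txt" prints nothing when the two match.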

Simple, huh?  It should be.  If you think this advice is too obvious, think again about the environments you have coded in, on ancient servers, which in no way match development, where the sysadmins are scared of the day they might have to build a working replica of one from scratch.

Here's a fun thing you might like to do on your production and development servers: Use a Source Control tool to version control your configuration files.  More on Source Control later, but for now it is enough to know that you can use a tool like bzr to keep old versions of every configuration file.

cd /etc

bzr init

bzr add *

bzr commit -m "initial"

The set of commands above will place every file under /etc under control of bzr.  When you change any Apache config or vsftpd or exim or any number of other programs' config files in /etc, if you are happy with the changes you can "bzr commit" to tell bzr to take these files as the next version to store.  If you later change something and are not happy with the changes, you can "bzr diff" to see what changed or "bzr revert" to undo all changes.  Everything stays sane, everything stays happy, you stay lazy.

So now you know your development and production environments are in sync, your database upgrade will work, your code is tested and works and you're ready to press the button.

But meanwhile, back at the ranch... where did you create the installer from?  How did you write this code?  How have you kept this code in any way sane over the many thousands of lines you've written in an abstract order?

Source Control for fun and profit

What is "Source Control"?  It is a way of managing how your code changes over time.  You may have heard of bzr, git, mercurial, cvs... these are tools that help you incrementally change your code, review changes and, eventually, export a working version to test and then install to Live.

Good use of good Source Control software can save your skin.

I prefer bzr, so any examples will be based on bzr commands.  But most tools like this have many common operations, just differing syntax.

If you want to learn bzr, git, mercurial, etc, there are plenty of resources online which will explain these commands and all the theory behind them.  Here, we will talk about how to use these tools effectively to help you in the deployment process.

The key to making sure of a good deployment is making sure your "trunk" is always, within reason, stable.

What do we mean by trunk?  

You are maintaining a Source Code "tree", which has a trunk.  At the start of the project, there is only a Trunk.  You add files to the project, edit them and "commit" them.  To "commit" means to push them in to Source Control, saving new versions of all changed files on to Trunk.  You have made Trunk longer.  The previous versions can all be pulled out, and one can compare the differences between any two versions of a file any time.


--->(commit v1)--->(commit v2)--->(commit v3)--->0

"0" is the tip of the "Trunk" branch.  You should aim to keep this as close as possible to Live.  If you have a bug on Live, you fix it on Trunk, test and promote Trunk to Live as quickly as possible.  If something goes wrong with the new release, you export the previous version of Trunk and put that Live.

Your client will approach you at some point and ask for a new feature.  This feature might take a few hours, days, weeks or even months.  How can you develop this feature but keep Trunk identical to Live?  You create a Feature Branch.  

A Feature Branch (Or just Branch) is a snapshot of the code you can develop in parallel to Trunk.  This lets you keep Trunk clean but carry on development.  When you are happy, you Merge the Branch on to Trunk, do some testing and promote to Live.

Here's how your Source Tree will look with a Branch:

--->(commit v1)--->(commit v2)--->(commit v3 - bugfix)--->(commit v4 - bugfix)--->0


                              \----->(commit v2.1)---->(commit v2.2)---->1

So, at the point where the code was "v2", we created a Branch.  The code was changed and committed several times, independently of the Trunk.  At the same time, bugs were found on live, fixed on Trunk and Trunk was placed live - completely independently of your new whizzbang feature.

What bzr will do for you is fold or merge "1" on to "0".  It will take all those changes you made in all those files and give you a new version which combines all those changes, quickly and sanely.  

Now you can test, build your installer and deploy from Trunk.  If you really want to be crafty, and you know testing will take some time (days, weeks), you can make a new testing branch and merge everything on to there and test.  This way, your trunk stays clean and in step with Live, allowing for bug fixing.
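
In bzr terms, that branch-and-merge cycle might look something like the sketch below.  The directory layout (a project directory holding "trunk", with feature branches created beside it) and the three function names are assumptions for illustration; the bzr subcommands themselves are branch, merge and commit.

```shell
# Hypothetical layout: a project directory holding "trunk" with feature
# branches created beside it. Run these from that project directory.

# Start a Feature Branch as a snapshot of Trunk.
start_feature() {
    name=${1:?usage: start_feature NAME}
    bzr branch trunk "$name"
}

# Fold the finished branch back on to Trunk. The merge is left uncommitted
# in the trunk working tree so it can be tested first.
merge_feature() {
    name=${1:?usage: merge_feature NAME}
    ( cd trunk && bzr merge "../$name" )
}

# Once testing passes, record the merge on Trunk.
commit_merge() {
    name=${1:?usage: commit_merge NAME}
    ( cd trunk && bzr commit -m "Merge feature branch $name" )
}
```

Keeping the merge and the commit as separate steps leaves room for the testing in between: if the merged code misbehaves, "bzr revert" in trunk puts it back to how it was.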

The two other most useful functions of bzr are tag and export.  You may have wondered earlier what a "Release Tag" is.

When you want to batch up all your files and stamp them as "this is all ready to test or go live", you tag them with something descriptive.  You can then always refer to this exact set of files at this exact point in time when you export or when you need to check for differences between versions.

So you might tag your testing branch as release_1.30_RC1, then push to your testing/staging area.  As bugs are found, you rev this to RC2, RC3 etc.  When you are ready for production, you finally tag as release_1.30 and instruct your installer creator to build you an installer for release_1.30.

Now you can export, test and deploy, nice and clean.
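
As a sketch of tag and export working together: the helper name cut_release and the export path are hypothetical, while "bzr tag" and "bzr export -r tag:NAME" are the real subcommands being wrapped.

```shell
# Hypothetical helper: stamp the current branch with a release tag and
# export exactly that set of files. Run it from inside the branch.
cut_release() {
    tag=${1:?usage: cut_release TAG EXPORT_DIR}
    export_dir=${2:?usage: cut_release TAG EXPORT_DIR}
    bzr tag "$tag" || return 1
    # "tag:NAME" is bzr's revision spec for a tagged revision.
    bzr export -r "tag:$tag" "$export_dir"
}
```

So "cut_release release_1.30_RC1 /tmp/myproj_RC1" would give you a clean directory of exactly the tagged files, with no source control metadata in it, ready to hand to your installer creator.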

But how do you know what you have written up to now is what is needed?

Spec small, spec clean

Well, this is the million pound question.  How do we write software that people want to use?  How do we satisfy our clients?  You may recognise some of the below, and indeed the above, from the Agile or Extreme Programming methodologies.  I see a lot of these methodologies as common sense or obvious, but collecting these rules together in one place and calling them a methodology is a great way to give the business confidence that much method is being applied to the madness, and that software quality is being maintained.

Never write a whole spec for the product in one go.  You are going to be working on this product for a long time and much will change.

By all means, sketch out what the interface will look like overall, where things might go eventually.  But not everything has to be planned in minute details from the beginning.

I'm not saying do not plan; quite the opposite.  Plan, but only plan for what is needed now, but with some thought as to what will be implemented next.  Do not plan and write code "just in case", write it with a mind for what may be required later, but plan to change it later: refactor regularly.  Don't write generic code when you have a single use-case.  

From a design point of view, try to picture your product as a set of features which need to be implemented in a certain order, and that your customers will be using each feature as soon as it is ready and stable.

Break each feature down in to logical blocks, plan to do similar features around the same time and, where possible, make them share code.

Do not write generic code unless you have multiple use-cases. Write re-usable code, write specific code.  If you find a use-case which is similar in spirit to something you have previously coded, write now and refactor later.  

Try to re-use UI features, but do not write main driver functions to control disparate UI features.  Your users will thank you for the consistency, as will your software testers.

If you find yourself writing "generic" functions which take huge numbers of arguments to support multiple uses and contain much "spaghetti" code to decide what to do, you are doing it wrong.

If you're working in a UI, look at each screen as a set of features, each of which is a living, breathing entity, and write down how it should behave.  A feature could be a single screen widget or it could be a whole process of finding and displaying data.  Make it, where possible, so that a feature is self-contained.  By that, I mean it should be useful in its own right or add on to something else.  "Search" is not a feature unless you also implement "Search Results".  Though "Filtering of search results" would be a self-contained, useful add-on.  And from this, you have a "feature spec", which leads to an implementation and test plan.

Break the features down in to logical blocks until you can picture pieces small enough to code and can attribute time to them: a few hours here, a few days there.  Note down any unknowns in your spec so you can fill in the gaps later.  Also jot down ideas for solving particular problems, useful libraries you think should be used, or design patterns which will make something reusable.  Remember, you're not planning anything else right now, but you are thinking about everything.

Use bullet-points for requirements as well as a clean description of behaviour.  Your client is going to read and understand these, and people are lazy.

Prioritise these features in order of use and put them in phases, where each phase will last for a few weeks at most.  At the end of a phase, there may be a fully tested product installed in to production - or there will at the very least be a client demo.  This involves your client in the process, which keeps everyone happy.

Always be happy to throw away bad code.  Demo early, demo often, let your client lead you.  If you are throwing away code and reimplementing it due to refactoring, or because you (or a colleague) have found a better way, be happy you have learned something.

But never try to implement too much in one go - your end product will have too many features and requirements to describe in a single spec.

Now you have planned, branched, coded, tested, merged, tested, db+install+code tested and deployed a feature, and it was almost entirely stress-free.

But where do these "features" come from?

Talk to your users

They are, after all, the people who will have the dubious pleasure of using your software on a daily basis.  Sit with them, watch what they do.  If they are describing what they have to do with their current software and are pointing out problems with it, ask yourself "should it be implemented this way without the bugs or has their last programmer approached this problem from completely the wrong angle?".  

Think about the way they work and imagine doing their job for them.  Are they doing too much?  What can be automated?

Make suggestions.  Word them as questions.  Never, ever be afraid to be wrong, as you will never learn anything if you are always right.

Go away and think.  Sketch things.  Go back to them with several ideas.  Involve your users in the whole process, make them proud of what you create together.  One of the benefits of working on the smallest, most useful cogs at a time is that you can always throw it away or change it if it is wrong.  Be flexible in what you produce and you'll turn your users in to excited designers.

The beginning.
