How to restart your Azure workers in less than a minute

Ever found yourself waiting in front of the Windows Azure Console for your apps to get deployed and restarted?

Well, although this behavior is rather annoying, the console is only behaving as expected. Indeed, even if you’re only deploying a 1MB package, the Windows Azure fabric ends up redeploying a whole virtual machine, or rather a whole set of virtual machines if your role happens to have multiple instances.

Obviously, part of the problem comes from the super heavy stack that an OS represents. Progress should be expected as Just Enough OSes get trimmed down for the sake of public clouds (Windows Azure or Amazon), but I suspect it will take years.

In the meantime, at Lokad, we are in too much of a hurry to spend our lives waiting for Azure deployments. Thus, we decided to cheat.

Instead of going through the Windows Azure Console to redeploy our apps, we set up an AppDomain isolation layer within Lokad.Cloud, our open source O/C mapper. Basically, instead of redeploying apps through the Windows Azure Console or through the Management API, we redeploy by uploading a ZIP archive to the Blob Storage.

Lokad.Cloud monitors the Blob Storage. When a new ZIP archive becomes available, Lokad.Cloud simply unloads the client AppDomain running on the WorkerRoles and restarts the processes with fresh AppDomains. Unloading AppDomains and reloading them takes about 2s.
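To make the mechanism concrete, here is a minimal sketch of the polling logic, written in Python as a language-neutral stand-in. It is not the Lokad.Cloud code. The names PackageWatcher, fetch_version and restart_worker are hypothetical: fetch_version stands for whatever cheap check tells you the archive has changed (an ETag or last-modified value, for example), and restart_worker stands for the unload-and-reload step, which in .NET would be the AppDomain swap.

```python
import time
from typing import Callable, Optional


class PackageWatcher:
    """Polls a package version marker and restarts the worker when it changes.

    Hypothetical illustration only; not part of Lokad.Cloud.
    """

    def __init__(
        self,
        fetch_version: Callable[[], Optional[str]],
        restart_worker: Callable[[str], None],
    ) -> None:
        # fetch_version: assumed to return a cheap change marker for the
        # uploaded archive (e.g. its ETag), or None if nothing is uploaded yet.
        self._fetch_version = fetch_version
        # restart_worker: assumed to unload the running client code and
        # start it again from the package identified by the marker.
        self._restart_worker = restart_worker
        self._current: Optional[str] = None

    def poll_once(self) -> bool:
        """Check storage once; restart and return True if the package changed."""
        version = self._fetch_version()
        if version is None or version == self._current:
            return False  # nothing uploaded, or same package as before
        self._current = version
        self._restart_worker(version)  # the first package seen also triggers a load
        return True

    def run(
        self,
        interval_seconds: float = 30.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        """Poll forever, pausing interval_seconds between two checks."""
        while True:
            self.poll_once()
            sleep(interval_seconds)
```

The point of the design is that the host process never goes away: only the client code is swapped, so a new package costs one cheap storage check plus a reload rather than a full virtual machine redeployment.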

Then, in order to keep the overhead very low, we only ping the storage for updated assemblies every 30s (pinging costs 0.1cts / month / worker), but this interval can obviously be adjusted depending on your needs.
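If you want to redo the math for your own polling interval, the arithmetic is simple: one request every 30s is 86,400 requests per worker over a 30-day month. The sketch below assumes that storage requests are billed per block of 10,000 and leaves the price as a parameter. The function names and the billing unit are assumptions for illustration; plug in whatever your current pricing sheet says.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_month(interval_seconds: float, days: int = 30) -> float:
    """Number of storage requests one worker issues over a month of `days` days."""
    return days * SECONDS_PER_DAY / interval_seconds


def monthly_polling_cost(
    interval_seconds: float,
    price_per_10k_requests: float,
    days: int = 30,
) -> float:
    """Monthly polling cost per worker, in the currency unit of the price.

    Assumes requests are billed per block of 10,000 (an assumption, not a quote).
    """
    return polls_per_month(interval_seconds, days) / 10_000 * price_per_10k_requests


if __name__ == "__main__":
    # Halving the polling frequency halves the number of requests.
    print(polls_per_month(30))
    print(polls_per_month(60))
```

Since the cost scales linearly with the polling frequency, the interval is a direct trade-off between your storage bill and how long you wait for a new package to be picked up.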

Don’t wait, just go for AppDomain isolation too.