I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades. (RSS - ATOM)


Entries in azure (26)


Fat entities for Table Storage in Lokad.Cloud

After realizing the value of the Table Storage, giving a lot of thoughts about higher level abstractions, and stumbling upon a lot of gotcha, I have finally end-up with what I believe to be a decent abstraction for the Table Storage.

The purpose of this post is to outline the strategy adopted for this abstraction which is now part of Lokad.Cloud.

Table Storage (TS) comes with an ADO.NET provider part of the StorageClient library. Although I think that TS itself is a great addition to Windows Azure, frankly, I am disappointed by the quality of the table client library. It looks like an half backed prototype, far from what I typically expect from a v1.0 library produced by Microsoft.

In many ways, the TS provider now featured by Lokad.Cloud is a pile of workarounds for glitches that are found in the underlying ADO.NET implementation; but it's also much more than that.

The primary benefit brought by TS is a much cheaper way of accessing fine grained data on the cloud, thanks to the Entity Group Transactions.

Although, secondary indexes may bring extra bonus points in the future, cheap access to fine grained data is basically the only advantage of Table Storage compared to the Blob Storage at present day.

I believe there are couple of frequent misunderstandings about Table Storage. In particular, TS is nowhere an alternative to SQL Azure. TS features nothing you would typically expect from a relational database.

TS does feature a query language (pseudo equivalent of SQL), that supposedly support querying entities against any properties. Unfortunately, for scalability purposes, TS should never be queried without specifying row keys and/or partition keys. Indeed, specifying arbitrary properties may give a false impression that it just works ; yet to perform such queries, TS has no alternative but to scan the entire storage, which means that your queries will become intractable as soon your storage grows.

Note: If your storage is not expect to grow, then don't even bother about Table Storage, and go for SQL Azure instead. There is no point in dealing with the quirks of a NoSQL store, if you don't need to scale in the first place.

Back to original point, TS features cheaper data access costs, and obviously this aspect had to be central in Lokad.Cloud - otherwise, it would not been worthwhile to even bother with TS in the first place.

Fat entities

To some extend, Lokad.Cloud puts aside most of the property-oriented features of TS. Indeed, query aspects of properties don't scale anyway (except for the system ones).

Thus, the first idea of was to go for fat entities. Here is the entity class shipped in Lokad.Cloud:

    public class CloudEntity < T >
        public string RowRey { get; set; }
        public string PartitionKey { get; set; }
        public DateTime Timestamp { get; set; }
        public T Value { get; set; }

Lokad.Cloud exposes the 3 system properties of TS entities. Then, the CloudEntity class is generic and exposes a single custom property of type T.

When an entity is pushed toward TS, the entity is serialized using the usual serialization pattern applied within Lokad.Cloud.

This entity is said to be fat because the maximal size for CloudEntity in 1MB (actually it's 960KB) which corresponds to the maximal size for an entity in TS in the first place.

Instead of going for 64KB limitation per property, Lokad.Cloud offers an implementation that come with a single 1MB limitation for the whole entity.

Note: Lokad.Cloud relies, under the hood, on a hacky implementation which involves spreading the serialized representation of the CloudEntity over 15 binary properties.

At first glance, this design appears questionable as it introduces some serialization overhead instead of relying on the native TS property mechanism. Well, a raw naked entity costs about 1KB due to its Atom representation. In fact, the serialization overhead is negligible, even for small entities; and for complex entities, our serialized representation is usually more compact anyway due to GZIP compression.

The whole point of fat entities is to remove as much friction as possible from the end-developer. Instead of worrying about tight 64KB limits for each entity, the developer has only to worry about a single and much higher limitation.

Furthermore, instead of trying to cram your logic into a dozen of supported property types, Lokad.Cloud offers full strong-typing support through serialization.

Batching everywhere

Lokad.Cloud features a table provider that abstracts the Table Storage. A couple of key methods are illustrated below.

public interface ITableStorageProvider
  void Insert(string tableName, IEnumerable> entities);
  void Delete(string tableName, string partitionKeys, IEnumerable rowKeys);
  IEnumerable> Get(string tn, string pk, IEnumerable rowKeys);

Those methods have no limitations concerning the number of entities. Lokad.Cloud takes care of building batches of 100 entities - or less, since the group transaction should also ensures that the total request weight less than 4MB.

Note that the 4MB restriction of the TS for transactions is a very reasonable limitation (I am not criticizing this aspect) , but the client code of the cloud app is really not the right place to enforce this constraint as it significantly complicates the logic.

Then, the table provider also abstracts away all subtle retry policies that are needed while interacting with TS. For example, when posting a 4MB transaction request, there is a non-zero probability of hitting a OperationTimedOut error. In such a situation, you don't want to just retry your transaction, because its very likely to fail again. Indeed, time-out happens when your upload speed does not match the 30s time-out of the TS. Hence, the transaction needs to be split into small batches, instead of being retried as such.

Lokad.Cloud goes through those details so that you don't have to.


Table Storage gotcha in Azure

Table Storage is a powerful component of the Windows Azure Storage. Yet, I feel that there is quite a significant friction working directly against the Table Storage, and it really calls for more high level patterns.

Recently, I have been toying more with the v1.0 of the Azure tools released in November'09, and I would like to share a couple of gotchas with the community hoping it will save you a couple of hours.

Gotcha 1: no REST level .NET library is provided

Contrary to other storage services, there is no .NET library provided as a wrapper around the raw REST specification of the Table Storage. Hence, you have no choice but to go for ADO.NET.

This situation rather frustrating because ADO.NET does not really reflect the real power of the Table Storage. Intrinsically, there nothing fundamentaly wrong with ADO.NET, it just suffers the law of leaky abstractions, and yes, the table client is leaking.

Gotcha 2: constraints on Table names are specific

I would have expected all the storage units (that is to say queues, containers and tables) in Windows Azure to come with similar naming constraints. Unfortunately it's not the case, as table names do not support hyphens for example.

Gotcha 3: table client performs no client-side validation

If your entity has properties that do not match the set of supported property types then properties get silently ignored. I got burned through a int[] property that I was naively expecting to be supported. Note that I am perfectly fine with the limitations of the Table Storage, yet, I would have expected the table client to throw an exception instead of silently ignoring the problem.

Similarly, since the table client performs no validation, DataServiceContext.SaveChangesWithRetries behaves very poorly with the default retry policy as a failing call due to, say, and entity that already exists in the storage, is going to attempted again and again, as if it was a network failure. In this sort of situation, you really want to fail fast, not to spend 180s re-attempting the operation.

Gotcha 4: no batching by default

By default DataServiceContext.SaveChanges does not save entities in batch, but performs 1 storage call for each entity. Obviously, this is a very inefficient approach if you have many entities. Hence, you should really make sure that SaveChanges is called with the option SaveChangeOptions.Batch.

Gotcha 4: paging takes a lot of plumbing

Contrary to Blob Storage library that abstracts away most nitty-gritty details such as the management of continuation tokens, the table client does not. You are forced into a lot of plumbing to perform something as simple as paging through entities.

Then, back to the method SaveChanges, if you need to save more than 100 entities at once, you will have to deal yourself with the capacity limitations of the Table Storage. Simply put, you will have to split your calls into smaller ones: the table client doesn't do that for you.

Gotcha 5: random access to many entities are once takes even more plumbing

As outlined before, the primary benefit of the Table Storage is to provide a cloud storage much more suited than the Blob Storage for fine-grained data access (up to 100x cheaper actually). Hence, you really want to grab entities by batch of 100 whenever possible.

Turns out that retrieving 100 entities following a random access pattern (within the same partition obviously) is really far from being straightforward. You can check my solution posted on the Azure forums.

Gotcha 6: table client support limited tweaks through events

Although there is no REST level API available in the StorageClient, the ADO.NET table does support limited customization through events: ReadingEntity and WritingEntity.

It took me a while to realize that such customization was possible in the first place as those events feel like outliers in the whole StorageClient design. It's about the only part where events are used, and leveraging side-effects on events is usually considered as really brittle .NET design.

Stay tuned for an O/C mapper to be included in Lokad.Cloud for Table Storage. I am still figuring out how to deal with overflowing entities.


Scaling-down for Tactical Apps with Azure

Cloud computing is frequently quoted for unleashing the scalability potential of your apps, and the press goes wild quoting the story of XYZ a random Web 2.0 company that as gone from a few web servers to 1 zillion web servers in 3 days due to massive traffic surge.

Yet, the horrid truth is: most web apps won’t ever need to scale, and an even smaller fraction will even need to scale out (as opposed of scaling up).

A pair of servers already scales up to millions of monthly visitors for an app as complex as StackOverflow. Granted, people behind SO did a stunning job at improving the performance of their app, but still, it illustrates that moderatly scaling up already brings you very far.

At Lokad, although we direly need our core forecasting technology to be scalable, nearly all  other parts do not. I wish we had so many invoices to proceed that we would need to scale out our accounting database, but I don’t see that happen any time soon.

Actually, over the last few months, we have discovered that cloud computing have the potential to unleash yet another aspect in the software industry: the power of scaled-down apps.

There is an old rule of thumb in software development that says that increasing the complexity of a project by 25% increases the development efforts by 200%. Obviously, it does not look too good for the software industry, but the bright side is: if you cut the complexity by 20% then you halve the development effort as well.

Based on this insight, I have refined the strategy of Lokad with tactical apps. Basically, a tactical app is a stripped-down web app:

  • not core business, if the app crashes, it’s not a showstopper.
  • features are fanatically stripped down.
  • from idea to live app in less than 2 weeks, single developer on command.
  • no need to scale, or rather very unlikely.
  • addresses an immediate need.

Over the last couple of weeks, I have released 3 tactical apps based on Windows Azure:

Basically, each app took me less than 10 full working days to develop, and each app is addressing some long standing issues in its own narrow yet useful way:

  • Website localization had been a pain for us from the very beginning. Formalized process where tedious, and by the time the results were obtained, translations were already outdated. Lokad.Translate automates most of the mundane aspect of website localization.
  • Helping partners figuring out their own implementation bugs while they were developing against our Forecasting API was a slow painful process. We had to spend hours guessing what could be the problem in partner's code (as we typically don’t have access to the code).
  • Helping prospects to figure out how to integrate Lokad in their IT, we end-up facing about 20 new environments (ERP/CRM/MRP/eCommerce/…) every week, which is a daunting task for a small company such as Lokad. Hence, we really need to plug partners in, and Lokad.Leads is helping us to that in a more straightforward manner.

Obviously, if we were reach 10k visits per day for any one of those apps that would be already a LOT of traffic.

The power of Windows Azure for tactical apps

Tactical apps are not so much a type of apps but rather a fast-paced process to deliver short-term results. The key ingredient is simplicity. In this respect, I have found that the combination of Windows Azure + ASP.NET MVC + SQL Azure + NHibernate + OpenID is a terrific combo for tactical apps.

Basically, ASP.NET MVC offers an app template that is ready to go (following Ruby on Rails motto of convention over contention). Actually, for Lokad.Translate and Lokad.Debug, I did not even bother in re-skinning the app.

Then, Windows Azure + SQL Azure offer an ultra-standardized environment. No need to think about setting up the environment, environment is already setup, and it leaves you very little freedom to change anything which is GREAT as far productivity is concerned.

Also, ultra-rapid development is obviously error-prone (which is OK because tactical apps are really simple). Nevertheless, Azure provides a very strong isolation from one app to the next (VM level isolation). It does not matter much if one app fails and dies suffering some terminal design error, damage will be limited to app itself anyway. Obviously, it would not have been the case in a shared environment.

Finally, through OpenID (and its nice .NET implementation), you can externalize the bulk of your user management (think of registration, confirmation emails, and so on).

At this point, the only major limitation for tactical apps is the Windows Azure pricing which is rather unfriendly to this sort of apps, but I expect the situation to improve over 2010.

·         not part of your core business, if the app crashes, it’s annoying, but not a showstopper.

·         features are fanatically stripped down.

·         from idea to live app in less than 2 weeks, single developer on command.

·         no need to scale, or rather, the probability of needing scalability is really low.

·         addresses an immediate need, ROI is expected in less than a few months.


Lokad.Translate v1.0 released (and best wishes for 2010)

A few weeks ago, I have been discussing the idea of continuous localization. In summary, the whole point is to do for localization (either websites or webapps) what is done by the continuous integration server.

Obviously, the translation itself should be done by professional translators, as automated translation tools are still light years away from the required quality.

Beyond this aspect, nearly all the mundane steps involved in localization works can be automated.

This project has taken the shape of an open-source contribution codenamed Lokad.Translate. This webapp is based on ASP.NET MVC / C# / NHibernate and targets Windows Azure.

This first release comes as a single-tenant app. We are hosting our own instance at but if you want start using Lokad.Translate, you will need to setup your own.

Considering that Lokad.Translate is using both a WebRole and a WorkerRole (plus a 1Gb SQL Azure instance too), hosting Lokad.Translate on Azure should cost a bit less than $200 / month (see the Windows Azure pricing for the details).

Obviously, that's a pretty steep pricing for a small webapp. It's not surprising that Make it less expensive to run my very small service on Azure comes as the No1 community most-voted feature.

Yet, I think the situation will improve within 2010. Many cloud hosting providers such as RackSpace are already featuring small VMs that would be vastly sufficient for a single tenant version of Lokad.Translate. Considering that Microsoft will be offering similar VMs at some point, it would already drop the price around $30.

If we add also that CPU pricing isn't going to stay stuck at $0.12 forever, the hosting price of Lokad.Translate is likely to drop significantly within 2010.

Obviously, the most efficient way to cut the hosting costs would be to turn Lokad.Translate into a multi-tenant webapp. Depending on community feedback, we might consider going that road later on.

The next milestone for Lokad.Translate will be to add support for RESX files in order to support not only website localization, but webapp localization as well.


O/C mapper for TableStorage 

The Table Service API is the most subtle service provided among the cloud storage services offered by Windows Azure (with also include Blob and Queue Series for now). I did struggle a while to eventually figure out what was the unique specificity of Table Storage from a scalability perspective or rather from a cost-to-scale perspective as the cloud charges you according to your consumption.

Since the scope of the Table Storage remained a fuzzy element for me for a long time, the beta version of Lokad.Cloud does not include (yet) support for Table Storage although. Rest assured that this is definitively part of our roadmap.

TableStorage vs. others

Let's start by identifying the specifics of TableStorage compared to other storage options:

  • Compared to Blob Storage,
    • Table Storage provides a much cheaper fine-grained access to individual bits of information. In terms of I/O costs, Table Storage is up to 100x cheaper than Blob Storage through Entity Group Transaction.
    • Table Storage will (in a near feature) provides secondary indexes while the Blob Storage only provide 1 single hierarchical access to blobs.
  • Compared to SQL Azure,
    • Table Storage lacks about everything you would expect from a relational database. You cannot perform any Join operation or establish a Foreign key relationship and this is very unlikely to be ever available.
    • yet, while SQL Azure is limited to 10GB (this value might increase in the future, this is really not the way to go), Table Storage is expected to be nearly infinitely scalable for its own limited set of operations.

The StorageClient library shipped with Azure SDK is nice as it provides a first layer of abstraction against the raw REST API.  Nevertheless, coding your app directly against the ADO.NET client library seems painful due to the many implementation contraints that comes with the REST API. Further separation of concerns is needed here.

The Fluent NHibernate inspiration

TableStorage has way much less expressivity than relational databases, nonetheless, classical O/R mappers are great source of inspiration, especially nicely designed ones such as NHibernate and its must-have addon Fluent NHibernate.

Although, the mapping entity-to-object isn't that complex in the case of TableStorage, I firmly believe that a proper mapping abstraction ala Fluent NH could considerably ease the implementation of cloud apps.

Among key scenarios that I would like to see addressed by Lokad.Cloud:

  • A seamless management of large entity batches when no atomicity is involved: let's say you want to update 1M entities in your Table Storage. Entity Group can actually reduce I/O costs by 100x. Yet, Entity Group comes with various constraints such as no more than 100 entities per batch, no more than 4MB by operation, ... Fine-tuning I/O from the client app would have to be replicated for every table, it really makes sense to abstract that away.
  • A seamless overflowing management toward the Blob Storage. Indeed, Lokad.Cloud already natively push overflowing queued items toward the Blob Storage. In particular, Table Storage assume than no properties should weight more than 64kb, but manually handling the overflow from the client app seems very tedious (actually a similar feature is already considered for blogs in SQL Azure).
  • A more customizable mapping from .NET type to native property types. The amount of property types supported by the Table Storage is very limited. Although a few more types might be added in the future, Table Storage won't (ever?) be handling native .NET type. Yet, if you have a serializer at hand, problem is no more.
  • A better versioning management as .NET properties may or may not match the entity properties. Fluent NH has an exemplary approach here: by default, match with default rule, otherwise override matching. In particular, I do not want the .NET client code to be carved in stone because of some legacy entity that lies in my Table Storage.
  • Entity access has to be made through indexed properties (ok, for now, there isn't many). With the native ADO.NET, it's easy to write Linq queries that give a false sense of efficiency as if entities can be accessed and filtered against any property. Yet, as data grow, performance is expected to be abysmal (beware of timeouts) unless entities are accessed through their indexes. If data is not expected to grow, then you go for SQL Azure instead, as it's way more convenient anyway.

Any further aspects that should be managed by the O/C mapper? Any suggestion? I will be coming back soon with some more implementation details.