Author

I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.


Monday, Feb 12, 2007

What's wrong with PAD files

Quite a lot of things are simply wrong in the IT industry nowadays. I have already discussed the case of Google AdWords; let's move on to the subject of PAD files.

PAD stands for Portable Application Description. It is an XML format designed by the shareware industry to facilitate the submission of software products to software directories. The idea is simple and rather nice. As a software manufacturer, you create a PAD file for each of your products; then you publish this PAD file directly on your website. For example, when Lokad released its first open source product, Lokad Sales Forecasting for ASP.Net, I created (and submitted) a PAD file for this application.

Submitting through cut-and-paste


Before PAD, you manually submitted your product description to every software directory on the web. With PAD, you still submit your product description to every single software directory on the web, but the submission is now (usually) reduced to a single operation: cut-and-pasting the URL of your PAD file. The support for PAD among shareware/freeware distributors is really impressive. I would guess that over 95% of the freeware/shareware industry now supports PAD files.

But the only thing really impressive about PAD is its absolute lack of design.

When the XML design makes no sense


As a software producer, you don't need to write your PAD file by hand: there is a free editor for that. Yet I don't think I have ever seen an XML schema so massively adopted while being so poorly designed.

There are so many issues with PAD that it's hard to even summarize the topic. In quasi-random order, the main PAD issues would be:

  • You need to specify the size of your software in bytes, kilobytes AND megabytes (File_Size_Byte, File_Size_K AND File_Size_MB). Don't you think that this information is somewhat redundant?

  • The requirements description is restricted to the OS version. What about required third-party software like DirectX or .Net?

  • Open source (or source availability) is not part of the fields; furthermore it is not really possible to use PAD to describe open source software.

  • Software components and libraries cannot be described; they do not really "fit" the PAD template.

  • The software category field makes no sense; a tag-based system (think swik.net) would have been so much simpler AND so much more efficient.

  • No (X)HTML support for the description fields. Your software description ends up as plain text. As a result, big lumps of text (like the 2000-character description) are almost totally unreadable.

  • No consistency in XML tag naming
    • some tags are UPPER_CASED

    • some tags are Camel_Cased

    • some tags are explicit (Program_System_Requirements)

    • some tags are abbreviated (Char_Desc_45)
  • The localization makes no sense (localizing software ~ translating the software + adapting it to the regional settings)
    • only the Description tag can be localized.

    • it is not possible to localize the other fields, such as the contact or support emails, or the screenshot.

    • no encoding specified upfront in the XML file.
  • The company address fields make no sense for non-US locations (State_Province only applies to USA/Canada).

  • Why hard-code the cost in US dollars (Program_Cost_Dollars)? There are a lot of currencies out there. And why not support a price list (a list of currency/value pairs)?

  • The Download URL section is just moronic. You can specify up to 4 download URLs (why 4?) and each URL gets its own special tag with no naming consistency
    • Primary_Download_URL / Secondary_Download_URL / Additional_Download_1 / Additional_Download_2

    • why not simply provide a list of URLs?
  • The screenshot section is restricted to a single image URL. Why not a list?

  • No extensible mechanism for the affiliate programs (because the list of affiliate programs is hard-coded).

Note that all those suggestions would have made PAD easier to document, to produce and to consume.
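
To make two of these points concrete, here is a minimal Python sketch. The XML fragment is a simplified, hypothetical PAD-like document: the tag names are the ones quoted above, but the flat layout, the root element and the values are assumptions made for illustration, not the actual PAD schema. It shows that the two extra size fields can be derived from the byte count alone, and that a consumer has to know four unrelated tag names to rebuild what is really a plain list of URLs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified PAD-like fragment.
# The flat layout, root element and values are assumptions for illustration.
PAD_SAMPLE = """
<Program_Info>
  <File_Size_Byte>2097152</File_Size_Byte>
  <File_Size_K>2048</File_Size_K>
  <File_Size_MB>2.0</File_Size_MB>
  <Primary_Download_URL>http://example.com/app.zip</Primary_Download_URL>
  <Secondary_Download_URL>http://mirror.example.com/app.zip</Secondary_Download_URL>
  <Additional_Download_1></Additional_Download_1>
  <Additional_Download_2></Additional_Download_2>
</Program_Info>
"""

# The four tag names a consumer must hard-code to find every download URL.
DOWNLOAD_TAGS = [
    "Primary_Download_URL",
    "Secondary_Download_URL",
    "Additional_Download_1",
    "Additional_Download_2",
]


def derived_sizes(size_bytes):
    """Return (kilobytes, megabytes) computed from the byte count alone."""
    return size_bytes / 1024, size_bytes / (1024 * 1024)


def download_urls(root):
    """Collect the non-empty download URLs into a single list."""
    urls = []
    for tag in DOWNLOAD_TAGS:
        text = root.findtext(tag, default="").strip()
        if text:
            urls.append(text)
    return urls


root = ET.fromstring(PAD_SAMPLE)
size_bytes = int(root.findtext("File_Size_Byte"))
size_k, size_mb = derived_sizes(size_bytes)
urls = download_urls(root)

print(size_k, size_mb)
print(urls)
```

With a single byte count and a repeated URL element, neither helper function would be needed at all.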

Some high-level criticisms could also be made:

  • no mechanism to link to other PAD files (especially useful to support software versions).

  • no persistent mechanism using Global Identifiers (especially useful to detect replicated PADs).

  • no mechanism to retrieve the PAD files by simply crawling the web (think XML feed links in HTML pages).
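
The last point can be sketched in a few lines of Python, mirroring the way feed links are advertised in HTML pages. Everything specific here is an assumption: the MIME type `application/x-pad+xml` is made up for the example, and nothing in PAD defines such a link. The sketch only shows how cheap discovery would be for a crawler if product pages advertised their PAD file in a `<link>` element.

```python
from html.parser import HTMLParser

# Hypothetical MIME type, invented for this sketch; PAD defines no such thing.
PAD_MIME_TYPE = "application/x-pad+xml"


class PadLinkFinder(HTMLParser):
    """Collect the href of every <link rel="alternate"> advertising a PAD file."""

    def __init__(self):
        super().__init__()
        self.pad_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        if (
            attributes.get("rel") == "alternate"
            and attributes.get("type") == PAD_MIME_TYPE
            and attributes.get("href")
        ):
            self.pad_urls.append(attributes["href"])


# A made-up product page advertising both a feed and a PAD file.
PAGE = """<html><head>
<link rel="alternate" type="application/rss+xml" href="/feed.rss">
<link rel="alternate" type="application/x-pad+xml" href="/pad/product.xml">
</head><body></body></html>"""

finder = PadLinkFinder()
finder.feed(PAGE)
print(finder.pad_urls)
```

A directory could then crawl vendor websites and pick up new or updated PAD files without any manual submission.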

Summary: PAD has been designed by junior high kids (probably)


Based on the previous elements, we could say that the PAD authors had no clue about:

  • XML design: tag naming is random, data structures like lists are ignored.

  • Web design: text readability is not a concern, screenshots are unimportant.

  • The world outside of the USA: utterly naive attempt to support internationalization.

  • Software industry: operating systems are the only components worth mentioning.

Still, I think that the Portable Application Description is based on a good idea; it just needs to be redesigned from scratch.

Wednesday, Nov 8, 2006

A small developer-oriented PowerShell wish list

I started using PowerShell just a few days ago, and I am more and more impressed by the work that has been done by the MS folks. Yet, being a developer, I have the feeling that many aspects of PowerShell still need to be polished.

  • Too bad that there are no Visual Studio project templates for CmdLets. Providing a hello-world CmdLet with its associated SnapIn would really make developers' lives easier and smooth the learning curve.

  • Implementing a CmdLet is so full of traps that I would really like to see some strong FxCop support to help poor developers avoid the most obvious errors.

  • A few objects dedicated to CmdLet unit testing. Currently, in order to get a CmdLet unit tested, you have to resort to some custom tricks. A small dedicated & embedded testing framework would ensure that the CmdLet environment is simulated in a proper manner.

  • A dedicated MSBuild task to include whole PowerShell scripts (using CDATA), with the MSBuild properties inherited as PowerShell variables. MSBuild and PowerShell have very complementary designs; integration would be the key to leveraging both of them.

  • Some XSLT conversion to transform the .Net documentation produced by the compiler into CmdLet Help documentation. In the current situation, the CmdLet documentation ends up duplicated in the .Net source and in the CmdLet XML documentation file.

Wednesday, Oct 18, 2006

Blogosphere quantitative elements, a few slides

Being part of the Corps des Telecoms, I have been assigned a modest blog study within the scope of a communication course. I have found the subject highly interesting because blogs open whole new directions of research and whole new methodologies. Indeed, contrary to most usual communication events (e.g. person-to-person talks), it's potentially possible to retrieve most of the blogosphere content through intensive web crawling.

The slides of the talk have been uploaded. Disclaimer: don't take the content for granted; the professor in charge of the communication course thought that this talk was a big pile of stupidities from the first slide to the last. My personal opinion is quite different on this matter. Well, I can't make everybody happy.

Tuesday, Oct 3, 2006

The unteachable parts of software engineering

Being responsible for the software engineering course at the ENS in spring 2007, I have started to think about the qualities that make the difference between an average developer and a brilliant one. Indeed, I can't think of a better goal for this course than to actually try to develop such qualities.

What makes a "good" software developer?

It's an obvious fact that many qualities and skills are required to make a good developer. Smart and Get things done are often cited as the top criteria to decide whether a candidate should be recruited or not.

... a passionate curiosity for software related matter ...

I would complete those two criteria with a third one that I consider to be no less important: a passionate curiosity for software-related matters. Most well-known antipatterns, such as programming by permutation, golden hammering or re-inventing the wheel, are caused by a lack of curiosity. There is far too much to know on the subject of software engineering to trust any particular school diploma (or certification) to be sufficient to produce even a "passable" developer.

Additionally, the software world is fast-paced. Hardware and software become obsolete alike. Development methods evolve. Better tools are released continuously. The sheer (intellectual) complexity of those evolutions is beyond what a single individual can possibly handle. Curiosity, when leveraged through teamwork, is a strong driver to keep development practices and tools as close as possible to the state of the art.

Finally, in the long run, I do not see how it is possible to stay motivated (and therefore productive) without an eager interest in mastering this constant flow of evolutions.

How to train such a developer?

I have listed smart, get things done and passionate curiosity as being the top qualities for a software engineer. But this list is more than troublesome for a teacher: it's not even clear whether any of those qualities can actually be taught.

Concerning the first criterion, i.e. smart, I have simply surrendered all "academic" ambitions. The course schedule is far too short; besides, I have the chance of having ENS students who are already very smart (the entrance exams have already filtered out students who were not so smart). Yet the course can give them the opportunity to apply their intelligence to a large variety of fuzzy problems that are inherent to software development.

The second criterion, get things done, is probably the most actionable of the three. The French education system (highly selective and highly individualistic) usually produces people who are relatively weak against this criterion, the ENS students being no exception (quite the opposite, in fact). I plan to incorporate a student software project within the course, mostly to push students to develop a get things done mentality.

As for the third criterion, I don't have much ambition to actually transmit a passionate curiosity to the students. In this respect, my first ambition will be to avoid the issue that usually plagues software engineering courses: an overwhelming boredom. I do not think it is really possible to create curiosity ex nihilo if people have no interest beforehand. But, assuming that students are at least somewhat interested (well, if not, it's going to be tough for the students and me alike), an "expand your horizons" strategy for the course might give them material to apply and develop their curiosity.

There are two kinds of students: those who are too weak to be taught anything and those who are so strong that the teacher is totally unnecessary.

Bottom line: for the three main qualities that make a good developer, the ambitions of the software engineering course I have in mind are pretty weak. Well, considering the importance of the subject, I still believe it's a worthy attempt (stay tuned, more in later posts...).
