Say you've just created an application, and it uses a new type of file. This new type of file will be identified by its very own file extension, associated with the new app.

For example, you might use a ".snapper" file extension, even though the file itself is just "xml".

The ".snapper" extension is not very helpful to a user, as it hides the fact that this is an xml file. The only way for a person to work out that this is an xml file would be to look inside.

Conversely, a ".xml" file extension would be unhelpful to the operating system, as it hides the fact that this is a snapper file. Again, the only way to work out that this is a snapper file would be to look inside and find the schema that this document matches (if any... knowing me, probably none, sorry).

And this is a common scenario, particularly with variations of xml files.

So I'm suggesting a new micro-format, and this micro-format has nothing to do with the current microformat buzz on the internet. This is to do with multiple file extensions, set theory, cascading inheritance, and all sorts of tricky stuff. Yet it's very simple.

You can pick it up in under a minute.


Instead of just one file extension, why not give a file a whole bunch of file extensions, starting with the least specific and ending with the most specific?

An over the top example would be:


What does this file name mean?

  • Everything before the first dot is the name itself.
  • After the first dot we have the most general type of the file: it's a text file.
  • Then we have a more specific rule: it's an sgml file.
  • Then a more specific fact again: this particular sgml file is xml.
  • Then a more specific fact again: this particular xml file is of type 'snapper'.
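The reading described in the list above can be sketched in a few lines of Python. This is only an illustration: the file name "timesheet.txt.sgml.xml.snapper" and the function name are made up for the example, and the sketch assumes the name itself contains no dots.

```python
def split_cascading_name(filename):
    """Split a cascading file name into (name, [types]).

    The types come back in the order they are written:
    least specific first, most specific last.
    Assumption: the name part itself contains no dots.
    """
    name, *types = filename.split(".")
    return name, types


# Hypothetical file name, used only for illustration.
name, types = split_cascading_name("timesheet.txt.sgml.xml.snapper")
print(name)             # the part before the first dot
print(types)            # general -> specific
print(types[::-1])      # specific -> general, the order a shell would likely try
```

Everything after the first dot is treated as a type, so an ordinary single-extension file still works: it just has a list of one.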

Now this could be useful if, for example, the only verb defined for .snapper files is 'open', but the 'edit' verb is defined for .xml files.

Or maybe on your system, you don't know how to edit xml files, but you do know how to edit text files. Then, right clicking on the file in windows explorer, you'd not only have the choice to open the file with TimeSnapper, for example, but also to edit it with a text editor.
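Here's a rough sketch of how that verb lookup could cascade. The handler table, the application names and the `resolve_verbs` function are all invented for the example (this is not how Windows Explorer actually stores its associations); the point is just that the most specific extension wins each verb, and the more general ones fill in whatever is left.

```python
def resolve_verbs(filename, handlers):
    """Collect the verbs available for a cascading file name.

    handlers maps an extension to a dict of {verb: application}.
    Extensions are tried from most specific (last) to least specific
    (first), so a specific type can override a general one, while
    general types still contribute verbs the specific type lacks.
    """
    _, *types = filename.split(".")
    resolved = {}
    for ext in reversed(types):
        for verb, app in handlers.get(ext, {}).items():
            # Keep the first (most specific) application seen for each verb.
            resolved.setdefault(verb, app)
    return resolved


# Hypothetical associations, for illustration only.
handlers = {
    "snapper": {"open": "TimeSnapper"},
    "txt": {"open": "text editor", "edit": "text editor"},
}

print(resolve_verbs("timesheet.txt.sgml.xml.snapper", handlers))
```

In this made-up setup, 'open' goes to TimeSnapper because ".snapper" is the most specific type, while 'edit' falls through to the text editor because nothing more specific claims it.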

Today we often layer a specific format inside a general open format. And general open formats are built upon more general, more open, formats. (We could be fancy and call it some kind of Aristotelian hierarchical classification system... but it's been too long since I read Sophie's World, so I'm not gonna keep pretending I remember that stuff.)

Anyway, I came up with this idea for a different reason altogether.

Bloody Polyglotics Again!

What if a file combined two languages, intermingled in the one document? For example, what if a file could be opened both as a valid sql file, say, and as a seXml file? Or as a C# file and a seXml file?

There's a technical name for a program that can be compiled by two different compilers, and after a lot of googling I tracked it down... polyglot!

A more general case: what about files that contain multiple discrete syntaxes in a single document? A common example: a valid html file might also be a valid xml file. You want to view it as html, but you want to edit it as xml.

(Okay we have the xhtml extension for that... but if we invent new extensions for every combination of two or more existing extensions, we'll be looking at a lot of extensions within the next ten thousand years.)

Or how about a file that combines javascript, css, and html. Perhaps you'd like to edit the css component in one application, the javascript component in another and the html component in a third. Maybe these multiple file extensions could allow for such behaviour.

(In this last case, the applications would need to be clever enough to know the data they're interested in, and to avoid the data they're not interested in. But it's kind of possible.)

(What I'd like to see is a code generator that spits out all types of files: it might create ".cs" files, ".config" files, ".sql" files and everything else. But by adding other names earlier in the list, such as ".wscg.cs", ".wscg.config" and ".wscg.sql", it can still reserve the right to edit these file types... even though it knows nothing about them. Provided it knows how comments work in the target format, it can embed its own iXml or seXml tags amongst these comments... possibly providing enough information to re-generate the files and identify user-edited portions.)
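To make the comment trick a bit more concrete, here's a minimal sketch. The `<generated>` marker is a placeholder I've made up for the example (it is not real iXml or seXml syntax), and the comment-style table only covers the three target types mentioned above, assuming ".config" files are xml.

```python
# Comment delimiters per target type: (prefix, suffix).
# Assumption: ".config" files are xml, so they use xml comments.
COMMENT_STYLES = {
    "cs": ("// ", ""),
    "sql": ("-- ", ""),
    "config": ("<!-- ", " -->"),
}


def wrap_generated(filename, body, region):
    """Wrap generated text in marker comments for the target format.

    The target format is the last (most specific) extension.
    The <generated> tag is a hypothetical marker, not iXml/seXml.
    """
    target = filename.rsplit(".", 1)[-1]
    prefix, suffix = COMMENT_STYLES[target]
    start = f'{prefix}<generated region="{region}">{suffix}'
    end = f"{prefix}</generated>{suffix}"
    return "\n".join([start, body, end])


print(wrap_generated("Customer.wscg.sql", "SELECT 1;", "query"))
```

The generator only needs to know the comment syntax of the last extension; anything outside its own markers can be treated as user-edited and left alone when the file is re-generated.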

Well, that's my 'microformat' idea of the day. It's only micro-useful, so don't micro-flame me.

A follow on thought from this was covered in yesterday's iXml post.

[Update: renamed as 'Cascading File Types' based on comment from Jonno. Cheers Jonno!]

