Goals | Data Storage and Organisation

With our data distributed among many devices and many different kinds of file system, it’s important for the user to be able to keep that data coherent. It’s also important to keep a log of what has changed, when and how it was changed and, interestingly, by whom.

Synchronisation and Conflict Handling

Most existing distributed file systems don’t allow conflicts, so no disconnected operation is possible. Those that do allow them resolve conflicts with complicated and error-prone rules when the system synchronises. This means that the user must be aware of, and know how to correct, errors which occur as a result of the synchronisation process.

Conversely, a distributed version control system (DVCS) is capable of resolving conflicts in textual documents reasonably easily, whether they are written in XML, C or just plain text. Branching and merging are two of the most important features of a DVCS, and they’re also the basis of what we need in order to achieve a distributed file system that enhances the user experience across all of a user’s devices.

There is of course one final issue which needs resolving: what if the data format isn’t textual? For instance, suppose you have a GIMP image (XCF) with many layers. You edit one copy of the document on your desktop computer, modifying one layer, and edit another copy on your laptop, adding a new layer. In this case we will build the conflict resolution into the applications, providing a set of tools which an application can use to interrogate different versions of the file, as only the applications fully understand their data formats. This also means we don’t interrupt someone’s workflow by forcing them to merge at an inappropriate time.
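The layered-image case above can be sketched as a three-way merge over layers rather than over lines of text. This is only an illustration under stated assumptions: the function name merge_layers and the model of an image as a mapping from layer name to a content fingerprint are hypothetical, and nothing here reflects the real XCF format or any GIMP API.

```python
from typing import Dict, Set, Tuple

# Hypothetical model: an image is a mapping of layer name -> content fingerprint.
Layers = Dict[str, str]


def merge_layers(base: Layers, desktop: Layers, laptop: Layers) -> Tuple[Layers, Set[str]]:
    """Three-way merge of two edited copies against their common ancestor.

    Returns the merged layers plus the names of layers that both copies
    changed in different ways (these need the application or the user).
    """
    merged: Layers = {}
    conflicts: Set[str] = set()
    for name in sorted(set(base) | set(desktop) | set(laptop)):
        b, d, l = base.get(name), desktop.get(name), laptop.get(name)
        if d == l:
            chosen = d          # both copies agree (unchanged, same edit, or both deleted)
        elif d == b:
            chosen = l          # only the laptop copy changed this layer
        elif l == b:
            chosen = d          # only the desktop copy changed this layer
        else:
            conflicts.add(name)  # both changed it differently: defer the decision
            continue
        if chosen is not None:   # None means the layer was deleted
            merged[name] = chosen
    return merged, conflicts


base = {"background": "v1", "text": "v1"}
desktop = {"background": "v2", "text": "v1"}              # modified one layer
laptop = {"background": "v1", "text": "v1", "sketch": "v1"}  # added a new layer
merged, conflicts = merge_layers(base, desktop, laptop)
print(merged, conflicts)
```

In this model the desktop edit and the laptop addition touch different layers, so they combine without a conflict. Only a layer changed differently on both machines is reported back, and the application can choose a suitable moment to ask the user about it.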

Continuous backup

The backup problem has been tackled in many ways in the past, but it has always been difficult to solve effectively. Apple’s Time Machine, for instance, is capable of doing a lot of what we’d expect from a backup system, and it achieves this in a fairly brilliant way. However, it doesn’t account for the subtlety of changes: changes are tracked as duplicates of the changed documents. Those duplicates are sent off to either an external hard disk or a Time Capsule.

So let’s look at a DVCS as a backup solution. Our changes are tracked as individual ‘change sets’, so we have the ability to cherry-pick our backup restores: consider each change set as an atomic object which can be joined into or removed from the current active version. Of course, we can still duplicate this DVCS file system on an external device for data security, push it off onto the web, or store our data in a ‘cloud’ of replicators for added security.
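As a rough sketch of what treating change sets as atomic objects could mean for restores, the example below rebuilds the active version by replaying a history while leaving out selected change sets. The names ChangeSet and materialise are hypothetical, and each change set is assumed to record whole-file contents. A real DVCS stores differences, and removing one change can conflict with later ones.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ChangeSet:
    """Hypothetical change set: path -> new content, or None for a deletion."""
    ident: str
    changes: Dict[str, Optional[str]]


def materialise(history: List[ChangeSet], skip: Iterable[str] = ()) -> Dict[str, str]:
    """Replay the history in order, leaving out the change sets named in `skip`."""
    skipped = set(skip)
    files: Dict[str, str] = {}
    for change_set in history:
        if change_set.ident in skipped:
            continue                      # this change set is removed from the active version
        for path, content in change_set.changes.items():
            if content is None:
                files.pop(path, None)     # deletion
            else:
                files[path] = content     # creation or modification
    return files


history = [
    ChangeSet("c1", {"notes.txt": "draft"}),
    ChangeSet("c2", {"notes.txt": "draft, edited"}),
    ChangeSet("c3", {"todo.txt": "buy milk"}),
]
print(materialise(history))               # the current active version
print(materialise(history, skip={"c2"}))  # the same version with the edit from c2 removed
```

Unlike a backup made of dated duplicates, the restore here undoes one unwanted edit to notes.txt while keeping the later, unrelated todo.txt.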

Collaboration

With Data Storage and Organisation we get collaboration tools for free: for different file formats, across many different devices and across networks.

Semantic Organisation

Folder hierarchies are a thing of the past. Users tend either to build complicated and difficult-to-manage file systems or to drop all of their files into a single folder. Instead of relying on the user to build a structure for organising their own data, why don’t we allow the computer to organise it, using information already present and information supplied by the user?

Documents already include a lot of information about themselves. This information is currently being used by technologies such as Apple Spotlight, Beagle Search and Meta Tracker. After looking at projects like Organise Framework, we decided that building this technology on top of Meta Tracker, and integrating Meta Tracker with Data Storage and Organisation, would provide a self-organising file system that meets the needs of today’s computer users.
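To make the idea of organising by metadata rather than by folder a little more concrete, here is a minimal sketch of an index that is queried by properties. The class MetadataIndex and the metadata keys used are assumptions made for illustration. This is not the Meta Tracker API, and a real system would extract the metadata from the documents themselves.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple


class MetadataIndex:
    """Hypothetical index: documents are found by their properties, not their path."""

    def __init__(self) -> None:
        # (key, value) -> identifiers of the documents carrying that property
        self._index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, doc_id: str, metadata: Dict[str, str]) -> None:
        """Record extracted or user-supplied metadata for one document."""
        for key, value in metadata.items():
            self._index[(key, value)].add(doc_id)

    def find(self, **criteria: str) -> Set[str]:
        """Return the documents that match every given key=value criterion."""
        results: Optional[Set[str]] = None
        for key, value in criteria.items():
            matches = self._index.get((key, value), set())
            results = set(matches) if results is None else results & matches
        return results or set()


index = MetadataIndex()
index.add("report.odt", {"author": "alice", "type": "document", "project": "thesis"})
index.add("holiday.xcf", {"author": "alice", "type": "image"})
print(index.find(author="alice", type="image"))
```

A "folder" then becomes a saved query. The same document can appear under its author, its project and its type at once, without the user having to file it anywhere.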