I saw an ad on StackOverflow today for a project called WinFsp.
The site mentions the following features:
- Allows for easy development of file systems in user mode. There are no restrictions on what a process can do in order to implement a file system (other than respond in a timely manner to file system requests).
- Support for disk and network based file systems.
- Support for NTFS level security and access control.
- Support for memory mapped files, cached files and the NT cache manager.
- Support for file change notifications.
- Support for file locking.
- Correct NT semantics with respect to file sharing, file deletion and renaming.
Out of sheer curiosity: what might such a library be used for? Why would anyone want to re-implement NTFS or something similar?
There are two aspects to that question: why would one want to develop a filesystem interface to data, and why would one want to do that in userspace?
Let’s answer the second one first, because it’s the obvious one: because it is waaaaayyyyy easier. In fact, the question should be asked the other way around: why would one want to implement a filesystem in the kernel? Why would one want to implement anything in the kernel, really? And the answer is: you don’t want to implement anything in the kernel. You do it, because you have to, because you can’t implement it any other way. Well, in this case, you can, and so you do.
The first one is a bit trickier: why do we want to have a filesystem interface to data? And the answer is: because there are a vast number of tools that can work with files, so by exposing data as files, you get access to all those tools.
Imagine, for example, a metadata filesystem, which exposes file metadata as files within a folder. So, an MP3 or a JPEG file would appear as a folder, and you could cd into an MP3 file and inside that folder, there would be files like lyrics.txt, etc. Now, if you wanted to mass-edit the titles of your MP3 files, you wouldn’t need to use a tag editor that someone else wrote, which may lack the particular feature you need. You could just use all the tools for editing text files you are already familiar with, and editing text files is something we have been doing ever since the dawn of computing, so we are very good at that! (Similarly, deleting all author info from every Word document on your system would be absolutely trivial!)
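To make the mass-editing idea concrete, here is a minimal sketch in Python. It assumes a hypothetical layout in which every song appears as a folder containing a title.txt file; that file name, the helper names, and the sample titles are made up for illustration. Since no such filesystem is mounted here, a temporary directory stands in for the mount point. The point is only that the "tag editor" collapses into ordinary file reads and writes.

```python
import tempfile
from pathlib import Path


def build_fake_mount(root: Path, titles: dict[str, str]) -> None:
    """Stand-in for the mounted metadata filesystem (hypothetical layout):
    one folder per song, each holding a title.txt file."""
    for song, title in titles.items():
        song_dir = root / song
        song_dir.mkdir(parents=True)
        (song_dir / "title.txt").write_text(title + "\n", encoding="utf-8")


def retitle_all(root: Path, old: str, new: str) -> int:
    """Replace `old` with `new` in every <song>/title.txt under root.
    Returns the number of files that actually changed."""
    changed = 0
    for title_file in sorted(root.glob("*/title.txt")):
        text = title_file.read_text(encoding="utf-8")
        updated = text.replace(old, new)
        if updated != text:
            title_file.write_text(updated, encoding="utf-8")
            changed += 1
    return changed


def demo() -> dict[str, str]:
    """Build a fake mount, strip ' (Live)' from all titles, return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_fake_mount(root, {
            "01.mp3": "Intro (Live)",
            "02.mp3": "Second Song (Live)",
            "03.mp3": "Studio Track",
        })
        retitle_all(root, " (Live)", "")
        return {
            p.parent.name: p.read_text(encoding="utf-8").strip()
            for p in sorted(root.glob("*/title.txt"))
        }


if __name__ == "__main__":
    print(demo())
```

Nothing in retitle_all knows anything about MP3 tags; the same few lines (or a shell one-liner with sed) would work on any data that happens to be exposed as text files.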
Or, imagine a gitfs, which exposes branches, tags, commits, etc. as folders, and you can explore your history using the same tools you already know how to use for exploring your filesystem. Or, a mediaconvertfs, which exposes all your audio files, regardless of which format they are actually in, as MP3 files and converts them on-the-fly. A firefoxfs which exposes Firefox’s browsing history or settings as a folder hierarchy. A filesystem that can mount your Google Drive, Dropbox, iCloud, OneDrive, etc. as a folder hierarchy. A gmailfs that can mount your emails as a folder hierarchy. A wordpressfs that lets you edit your blog as a folder hierarchy. And so on.
Note that something similar is already happening in the Windows world. For example, PowerShell has very generic “cmdlets” that work with all sorts of objects, and files and folders are just another kind of object. However, this requires new cmdlets to be written that can work with those generic objects, and it only works in PowerShell. The userspace filesystem approach is the dual of this: instead of treating files as just objects and writing cmdlets that deal with objects, you treat objects as just files and rely on the vast number of tools that have already been written for dealing with files. Similarly, Windows Explorer in some cases pretends that something is a folder hierarchy when it really isn’t, and lets you navigate it that way. But that only works in Explorer and nowhere else, whereas exposing something as a filesystem works everywhere, because everything knows how to deal with files.
Another, different thing you can do with a userspace filesystem is to rapidly implement other actual filesystems. Remember, writing kernel code is harder than writing userspace code, so it is often faster to implement a filesystem in userspace than in the kernel. In the Linux world, for example, there used to be a very clever implementation of NTFS using a userspace filesystem: it was implemented as a compatibility layer that emulates the NT kernel, and then simply used the actual ntfs.sys file from Windows! The disadvantage is that this is somewhat slow, and it requires you to own a Windows license (to copy ntfs.sys from), but the advantage is that it is by definition always correct! How much more compatible can you be than using the actual filesystem driver code from Microsoft itself?
You could do something similar the other way around. For example, libext2fs is a userspace implementation of the ext2/3/4 filesystem family. This library is already used by several ext2-for-Windows tools, but those tools are usually one-off file managers that let you copy files to and from the filesystem but not much else. It shouldn’t be too hard to build a “proper” userspace filesystem driver around it, to seamlessly mount ext2/3/4 filesystems the same way you can mount NTFS and FAT filesystems today.
More specifically, why would anyone want to build a User Space file system?
- Easier semantics. No obscure system calls.
- Better isolation. A User Space file system crash doesn’t bring down the whole OS.
- Easier to update than the kernel.
- Support for several programming languages.
You’re not re-implementing NTFS. What you’re actually doing is adding features to an already-existing file system, creating a file system metaphor over some existing data store, or providing a universal file system API. Imagine being able to browse a SQL database using File Explorer, or having your mail show up as files.
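As a rough illustration of a "file system metaphor over an existing data store", the Python sketch below maps paths onto a SQLite database: the root lists tables as folders, and each table folder lists its rows as files. The path layout and the function names list_dir and read_file are assumptions made for this example; they are not part of the WinFsp or FUSE APIs. A real user-mode filesystem would call logic like this from its directory-listing and read callbacks.

```python
import sqlite3


def _split(path: str) -> list[str]:
    """Split '/users/1' into ['users', '1'], ignoring empty components."""
    return [part for part in path.split("/") if part]


def _tables(conn: sqlite3.Connection) -> list[str]:
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def _quote(name: str) -> str:
    """Quote an identifier; callers only pass names found in sqlite_master."""
    return '"' + name.replace('"', '""') + '"'


def list_dir(conn: sqlite3.Connection, path: str) -> list[str]:
    """Hypothetical layout: '/' lists tables, '/<table>' lists rowids."""
    parts = _split(path)
    if not parts:
        return _tables(conn)
    if len(parts) == 1 and parts[0] in _tables(conn):
        rows = conn.execute(
            f"SELECT rowid FROM {_quote(parts[0])} ORDER BY rowid"
        )
        return [str(row[0]) for row in rows]
    raise FileNotFoundError(path)


def read_file(conn: sqlite3.Connection, path: str) -> str:
    """Render the row at '/<table>/<rowid>' as 'column: value' lines."""
    parts = _split(path)
    if len(parts) != 2 or parts[0] not in _tables(conn):
        raise FileNotFoundError(path)
    try:
        rowid = int(parts[1])
    except ValueError:
        raise FileNotFoundError(path) from None
    cur = conn.execute(
        f"SELECT * FROM {_quote(parts[0])} WHERE rowid = ?", (rowid,)
    )
    row = cur.fetchone()
    if row is None:
        raise FileNotFoundError(path)
    columns = [desc[0] for desc in cur.description]
    return "".join(f"{col}: {val}\n" for col, val in zip(columns, row))


def make_demo_db() -> sqlite3.Connection:
    """In-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    print(list_dir(db, "/"))
    print(list_dir(db, "/users"))
    print(read_file(db, "/users/2"), end="")
```

This sketch is read-only and assumes ordinary rowid tables; handling writes, renames, and deletes is where most of the real design work would be.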
At a glance, it appears to be similar to FUSE, which is used for the development of dozens of filesystems and filesystem-like tools. Essentially, anything that makes sense in the form of files and directories can be exposed as a filesystem, and mounted through FUSE or WinFsp as if it were a physical disk, all with the added benefits of running as a user process instead of as a kernel service.
For example, GNOME GVfs uses FUSE to expose its GIO interfaces and to present SFTP, SMB, HTTP, DAV, and a lot of other services as filesystem mounts, so that you can use, for example, an SFTP server as if it were a disk in your system.