Advanced Linux Desktop Search -- MetaFS
MetaFS is still in the pre-alpha stage of development. That means
not everything discussed here will work, or has even been written
yet.
Background
So ... what's the problem, anyway?
Most modern computers use hierarchical filesystems for storing and retrieving
data. This method of organization has been in wide and successful use for
decades. However, the amount of information we store and access has grown by
many orders of magnitude since filesystems were first created.
Pure hierarchies don't work anymore. Users often create immense folder
structures for their information, but such structures require a
non-trivial amount of effort to maintain. Furthermore, users sometimes have
difficulty remembering how they categorized a particular document, so they
waste time looking for it. Some users may not create folder structures at
all, preferring instead to save everything in one big dumping ground. They
then have to search through this dumping ground repeatedly for lost
information.
Manual searching (or even automated searching without an index) is a
time-consuming and frustrating operation. This is especially true with more
heterogeneous sets of information, or information that has been only loosely
categorized.
But hasn't someone else solved this?
No. There have been other attempts at solving the organizational problem, but
they are either too narrowly focused (such as pure search tools) or
incompatible with existing storage APIs.
Some, such as Google's Desktop Search, Apple's Spotlight, or GNOME's Beagle,
focus primarily
on searching and presenting search results to the user. Instead of building
search into the filesystem, they provide it as a separate application. This
approach is not helpful to power users, who often would prefer to work mostly
within their super-customized xterms and shells. It is also limited in its
ability to provide additional metadata (such as the title and artist of a song
in MP3 format) to other applications. (Also, the only one of these projects
that works on Linux is Beagle.)
Other projects, such as MIT's Haystack Project, throw out the hierarchical
filesystem completely,
preferring instead to use their own internal structure, which is then exposed
to the user and other applications via a proprietary interface. However, in
order to be used comprehensively (and thus effectively), these projects would
require all applications to be refactored to take advantage of their
APIs.
So why is MetaFS better?
MetaFS strikes a balance between these two extremes. It aims to build advanced
functionality (such as search) into the filesystem in a way that is transparent to all
applications (including a hacker's favorite shell). Users and developers may continue
using the applications and APIs they know and love, and need only step into the
realm of MetaFS when it suits them.
Details
MetaFS is an enhanced layer of functionality that sits atop the standard Linux
filesystem. It provides the usual semantics one would expect from a UNIX
filesystem -- inodes, directories, files, (sym)links, and such -- as well as
additional information about the files themselves. This additional information
comes in the form of either extended attributes or "services" that appear as
virtual files or directories.
Plugins
MetaFS is strongly based on plugins, and is therefore extensible in almost every
way imaginable. These plugins fall into a few general categories:
- Attribute plugins -- These extract data from files (such as artist/title
information from MP3s), and present that data through the filesystem as
extended attributes.
- Service plugins -- These provide an "alternate view" of a file. For
example, a service plugin may allow you to browse through a .tar.gz file as
though it were a normal directory.
- Root plugin -- The root plugin, as its name implies, provides MetaFS
with its root directory (usually a subdirectory somewhere in the underlying
"real" filesystem), and thus the basic filesystem hierarchy we've come to
know and loathe--er, love.
MetaFS plugins can mix and match in these categories as appropriate; they are
not limited to being a single type of plugin.
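To make the attribute plugin idea concrete, here is a minimal Python sketch of what such a plugin might compute for an MP3. It is not MetaFS code: the function name and the user.tag.title and user.tag.album attribute names are assumptions made for illustration (only user.tag.artist appears elsewhere on this page), and it reads only the simple ID3v1 tag, which occupies the last 128 bytes of a file.

```python
from typing import Dict

ID3V1_SIZE = 128  # an ID3v1 tag is the final 128 bytes of the file


def _text(field: bytes) -> str:
    """Decode a fixed-width ID3v1 field, dropping NUL and space padding."""
    return field.split(b"\x00", 1)[0].decode("latin-1").strip()


def extract_id3v1_attributes(data: bytes) -> Dict[str, str]:
    """Hypothetical attribute plugin: map ID3v1 fields to xattr-style names.

    Returns an empty dict when the data carries no ID3v1 tag.
    """
    tag = data[-ID3V1_SIZE:]
    if len(tag) < ID3V1_SIZE or not tag.startswith(b"TAG"):
        return {}
    # Layout after the 3-byte "TAG" marker: title, artist, album (30 bytes each).
    return {
        "user.tag.title": _text(tag[3:33]),
        "user.tag.artist": _text(tag[33:63]),
        "user.tag.album": _text(tag[63:93]),
    }
```

A real plugin would also have to handle ID3v2 tags and hand its results to the filesystem layer. The sketch covers only the extraction step.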
Searching and Services
MetaFS will also provide searching functionality through a service plugin. For
example, suppose you want to search for all your Beatles MP3s. You need only
create a text file containing your search parameters:
zaphod@deep-thought ~ $ cat > beatles-mp3s.ms
metasearch
find {user.tag.artist} == "Beatles"
Then, go into the search service of the file you just created:
zaphod@deep-thought ~ $ cd beatles-mp3s.ms#search
zaphod@deep-thought ~/beatles-mp3s.ms#search $ ls
...
Inside the search service, you will find symbolic links to all your Beatles
MP3s. Furthermore, if you buy and rip a new CD, the new MP3s will automatically
appear in the search as they are added to your collection.
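As a rough illustration of what the search service has to do, the following Python sketch parses a search file like the one above and matches it against an in-memory index of attributes. Everything here is an assumption: the query grammar is inferred from the single example on this page (a "metasearch" header followed by one "find {name} == "value"" line), and the function names and the index structure are invented for illustration.

```python
import re
from typing import Dict, List, Tuple

# Matches only the one query form shown above: find {name} == "value"
QUERY_RE = re.compile(r'^find\s+\{([^}]+)\}\s*==\s*"([^"]*)"$')


def parse_search(text: str) -> Tuple[str, str]:
    """Parse a two-line search file into (attribute name, wanted value)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 2 or lines[0] != "metasearch":
        raise ValueError("not a metasearch file")
    match = QUERY_RE.match(lines[1])
    if match is None:
        raise ValueError("unsupported query: " + lines[1])
    return match.group(1), match.group(2)


def run_search(text: str, index: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the sorted paths whose attribute equals the wanted value.

    `index` is a hypothetical mapping of file path -> extended attributes.
    """
    name, wanted = parse_search(text)
    return sorted(
        path for path, attrs in index.items() if attrs.get(name) == wanted
    )
```

In MetaFS the matching paths would be presented as symbolic links inside the #search directory. The sketch stops at computing the list of paths.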
Now, suppose you have a .tar.gz file, and you want to check what's inside before
extracting it. Services should be nestable, like so:
zaphod@deep-thought ~ $ cd suspicious-tarball.tar.gz#gz#tar
zaphod@deep-thought ~/suspicious-tarball.tar.gz#gz#tar $ ls
...
The gz service decompresses, and the tar service looks inside the
decompressed tarball.
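The nesting described above can be sketched in Python as a chain of functions, each consuming the output of the previous one. This is a sketch under stated assumptions, not the MetaFS design: the "#" splitting rule is inferred from the example path, the function names and the SERVICES table are hypothetical, and the services work on in-memory bytes rather than on virtual directories.

```python
import gzip
import io
import tarfile
from typing import Callable, Dict, List, Tuple


def split_services(name: str) -> Tuple[str, List[str]]:
    """Split 'file#svc1#svc2' into the file name and its ordered services."""
    base, *services = name.split("#")
    return base, services


def gz_service(data: bytes) -> bytes:
    """Decompress gzip data."""
    return gzip.decompress(data)


def tar_service(data: bytes) -> List[str]:
    """List the member names of an uncompressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as archive:
        return archive.getnames()


# Hypothetical registry mapping a service name to its implementation.
SERVICES: Dict[str, Callable] = {"gz": gz_service, "tar": tar_service}


def apply_services(data, services: List[str]):
    """Feed the data through each named service, left to right."""
    for name in services:
        data = SERVICES[name](data)
    return data
```

Splitting on every "#" means a file whose real name contains "#" would be ambiguous. A real implementation would need an escaping rule or a lookup against the underlying filesystem.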