Distributing Content

One of the really interesting things that I heard at FUDCon was loupgaroublond’s thoughts on distributing content.  Content has been a bit of a hot topic for fedora recently.  In the past month I’ve seen mailing list and IRC discussions on:

  • Whether rpm packages of the books on Project Gutenberg would be good for Fedora
  • How to decide whether someone’s random pictures of London deserves to be packaged in an rpm that people can download from Fedora as wallpapers
  • How to best manage package (or not rpm package) the documentation that the Fedora Docs project produces given that translations and which Fedora Release a document is for can cause an explosion of documents.

These are all attempts to manage content using rpm, software which was originally meant for packaging software.  There are several advantages to using the package manager:

  • It lets the system administrator easily see that certain packages of content are installed on the system via the tool that they already know.
  • Having it managed this way means that individual account holders don’t all have to download separate copies.  You canhave a single copy of the files on the computer.

However, there’s also many disadvantages including:

  • How do we avoid conflicts in package names?  This is especially prevelant in content like “london-pictures”.
  • How do we decide where to break up content?  Should my pictures of London and your pictures of London be forced into the same package or separate?  On the one extreme we will end up generating huge packages for users to download, on the other we explode the number of packages (and hence the meta-data associated with the repositories).
  • How do we provide people with the ability to search and browse the content?  If you compare deviantart, KDE Look, or flickr to the Fedora Package List you’ll see that the process of looking for content in the package list is definitely suboptimal.

What loupgaroublond is proposing as a GSoC project is to make an alternate way of hosting, delivering, and managing content that does not involve the packaging formats that we use for software. In the comments to loupgaroublond’s post, Kevin Kofler links to the Open Collaboration Services API that’s being used by opendesktop.org.  This looks like a good place to start for the delivery portion of the equation.  Since opendesktop drives gnome-look.org, kde-look.org, and others, they probably have some of the hosting equation worked out as well.  (Although I have a feeling that we could bring a lot to the table wrt managing mirror networks of the content via mirrormanager and other things we’ve developed purely for hosting the software that makes up Fedora).

However, that still leaves us needing to figure out how to manage the content once it’s on the computer.  This consists of at least two parts:

  1. A standard for how to organize and find the content on the system.  This probably includes a set place in the filesystem for installing content systemwide and another spot for installing software in the user’s home directory.  It would also include metadata that could be associated with the software and some standard APIs for accessing the information.
  2. A method of tracking what things are installed on the system.  loupgaroublond hints about this when he talks about a “content database system”.  For software packages in Fedora, rpm keeps a database in /var/lib/rpm that tracks what software is installed, metadata about the software, and where on the filesystem it lives.  We’d need something similar for tracking content with the addition that it needs to merge with tracking information stored in the user’s home directory.

Figuring out answers to these problems would be a worthy summer project that would benefit a broad spectrum of projects.  Unix-like distributions, would gain a way to deliver a hoard of free content.  Content authors would find a larger audience for their work.  Programmers would be provided with a means of sharing content resources on the system with each other and an API to access the content locally and possibly remotely.  System administrators would be able to manage free content similarly to how they manage free software on their systems.  End users should gain access to content from a consistent interface rather than having to traverse the internet, searching a variety of different websites with different UIs for what they’re looking for.

Any takers?  Please contact loupgaroublond using one of his preferred contact methods.