For those who need a detailed explanation of what RSP is and why it can be useful, the following sections cover the economic aspects of resource sharing for system administration.

The Problem

Sometimes free software projects face a resource crisis that can be understood as a lack of means to keep their efforts going, but at the same time as a crisis of the current paradigm of organization (maybe both technological and social).

That happens especially with software projects in the long tail, and when they're not willing to use an overly standardized hosting service such as Sourceforge or Google Code.

Maybe it wouldn't be too hard to make a paradigm shift -- not a profound one, but one that addresses the current precarity of the server infrastructure.

Some kind of coordination/organization, or even a standard or protocol, could help facilitate resource sharing. Currently things are done more or less loosely among the different groups that share resources -- not exactly because people are "loose", but because that way worked well enough until the community realized it needed something better. Things can then easily get messy, with no one knowing whether some system is updated, has backups, or whether its disks are failing.

For groups which depend on servers, it's possible to frame the problem in an economic fashion, considering a set of limited resources, usually:

  • Servers
    • Bandwidth
    • Disk
    • CPU
    • Memory
  • Operational: number of people and hours/week they can help.
  • Financial
  • Legal

The possible solutions for group organization are sets of ways these resources can be combined within a group, or among a group of groups, in order to maximize resource usage.
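To make that economic framing concrete, here is a minimal sketch in Python of how a group's limited resources could be inventoried and pooled. All field names are illustrative assumptions, not part of any RSP specification:

```python
from dataclasses import dataclass

@dataclass
class Resources:
    """Hypothetical inventory of a group's limited resources."""
    bandwidth_mbps: float = 0.0   # server: network capacity
    disk_gb: float = 0.0          # server: storage
    cpu_cores: int = 0            # server: processing
    memory_gb: float = 0.0        # server: RAM
    people: int = 0               # operational: available admins
    hours_per_week: float = 0.0   # operational: total labor available

def combine(groups):
    """Pool the resources of several groups: rival resources
    simply add up when groups agree to share them."""
    total = Resources()
    for g in groups:
        total.bandwidth_mbps += g.bandwidth_mbps
        total.disk_gb += g.disk_gb
        total.cpu_cores += g.cpu_cores
        total.memory_gb += g.memory_gb
        total.people += g.people
        total.hours_per_week += g.hours_per_week
    return total

pool = combine([
    Resources(disk_gb=500, people=2, hours_per_week=6),
    Resources(disk_gb=250, people=1, hours_per_week=4),
])
print(pool.disk_gb, pool.hours_per_week)  # 750.0 10.0
```

A real accounting would of course be more nuanced (rival resources cannot always be summed linearly), but even this naive model lets a group of groups see what they have in aggregate.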

One approach is to build social organizations that help make better use of the resources, such as sharing protocols, software, configurations or even the physical resources. Software is a useful way to "encode" some effort into a package that can be easily replicated, and the same happens with protocols and configurations.

Organizational crisis

When such things happen, the groups pass from one accumulation "level" to another. A tech-infrastructure crisis can be understood as a failure to sustain an accumulation level: so far, groups have accumulated a lot of (sometimes sensitive) data that they try to protect from a hostile environment.

The current way groups organize their resource usage might be insufficient to protect them from multiple different threats (data loss, centralization, etc).

Maybe this happens because they're still not so used to sharing their physical resources. Sharing a software solution is way easier, because one just needs to make a regular package and give it to people, while sharing concurrent/rival resources seems to be a lot more complex.

So, in short: a resource crisis can be seen as a failure to organize resource usage inside a group or among a group of groups. While groups are partially successful in sharing non-rival goods (protocols, software, configurations -- and "partially" because it's possible to do it better), they are still failing at sharing their concurrent/rival resources like disk space and bandwidth.

It's not just a matter of having a big server, creating a lot of vservers and giving them to groups: a solution that scales, can be easily managed and doesn't require more workload is still needed.

Each group can usually take physical care of just one or two servers/racks in the same or different locations. But having servers in only two locations is still not so safe, so an easy way to share this physical resource would be to let groups give and receive virtual spaces inside each other's real boxes. This way, a group with just one server can have lots of virtual spaces in friendly groups' machines.

Virtualization Alliance

An "alliance" assumes some sort of mutual benefit -- although there are groups that can't, or don't need to, give anything back for the help they receive. There are groups that can help each other (though this is not a requisite for participation), and such coordination goes beyond a given technology like server virtualization: it also needs a social technology.

Usually, replication of virtualized environments consists basically of backing up both the server space and the configuration needed to run it. Each virtualization platform has its advantages and disadvantages (vserver uses a lot fewer resources, while other platforms allow running things in kernel mode, like filesystem encryption, independently from the host).

For a group that requests virtual space, depending on the use the group wants to give it, it doesn't matter much whether their virtual space uses technology X or Y (virtual spaces are intended to act as independent servers anyway).

So a general "Virtualization Alliance" could be formed by

  • Groups that host virtualized environments using technology X.
  • Groups that host virtualized environments using technology Y.
  • [...]
  • Groups that need virtualized environments using technology X.
  • Groups that need virtualized environments using technology Y.
  • [...]
  • Groups that need virtualized environments with no tech. preference.

So groups using the same technology could perhaps be grouped together to share data, specific configurations, etc.
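The alliance above can be sketched as a naive matching between hosting groups and requesting groups. This is only an illustration under assumed names (a real alliance would also weigh trust, capacity and geography):

```python
def match_requests(hosts, requests):
    """Naively match groups that need virtual space to groups offering it.

    `hosts` maps a technology name ("vserver", "xen", ...) to a list of
    hosting groups; `requests` is a list of (group, preferred_tech)
    pairs, where preferred_tech may be None for "no preference".
    """
    matches = []
    for group, tech in requests:
        if tech is None:
            # No preference: any hosting group will do.
            candidates = [h for hs in hosts.values() for h in hs]
        else:
            candidates = hosts.get(tech, [])
        # Pick the first available host, or None if nobody offers it.
        matches.append((group, candidates[0] if candidates else None))
    return matches

hosts = {"vserver": ["group-a"], "xen": ["group-b"]}
requests = [("group-c", "xen"), ("group-d", None), ("group-e", "kvm")]
print(match_requests(hosts, requests))
# [('group-c', 'group-b'), ('group-d', 'group-a'), ('group-e', None)]
```

An unmatched request (`None` host) signals a resource the alliance currently lacks, which is itself useful coordination information.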

Backups of virtualized environments

As an example, for backup purposes, we could consider the following options:

  • External backups (i.e. backups made from the host server by the group hosting the virtual space):
    • Groups that host virtualized environments and make backups of them.
    • Groups that host virtualized environments and do not make backups of them.
  • Internal backups: those made by the hosted group, from inside a virtual environment to somewhere else.

Note that it's possible for a given virtual environment to be backed up by both "external" and "internal" methods; they are not mutually exclusive.
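The classification above can be turned into a simple coverage check, sketched here with made-up field names: each environment records whether it has an external (host-side) and/or internal (guest-side) backup, and anything with neither is flagged:

```python
def backup_coverage(environments):
    """Split virtual environments into backed-up and at-risk.

    `environments` is a list of dicts with illustrative keys:
    "name", "external" (host-side backup) and "internal"
    (guest-side backup). The two methods are not mutually
    exclusive; either one counts as coverage.
    """
    covered, uncovered = [], []
    for env in environments:
        if env.get("external") or env.get("internal"):
            covered.append(env["name"])
        else:
            uncovered.append(env["name"])
    return covered, uncovered

envs = [
    {"name": "lists", "external": True, "internal": True},   # both methods
    {"name": "web", "external": False, "internal": True},
    {"name": "dns", "external": False, "internal": False},   # at risk
]
print(backup_coverage(envs))  # (['lists', 'web'], ['dns'])
```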


There are also some other parameters to consider: how much labor is needed to keep a server running? Considering a Debian box with vservers, how much work is needed to:

  • Install/reinstall a system?
  • Keep the system up to date?
  • Change configurations?
  • Handle emergencies?

Sure, these points depend on the configuration (e.g. with Puppet, the cost of install/reinstall may fall to near zero). Crossing this information with the number of available people (and how much work they can do) can help determine whether the group is ready to take on more boxes.

If the server setup is scalable and easy to maintain, maybe a small group can do the job, but one important point is to shape everything to minimize points of failure (like systems depending on single servers or on just a small number of people to operate them).
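The "is the group ready for more boxes" question can be sketched as a back-of-the-envelope check. The per-box weekly cost is an assumed lump sum covering updates, configuration changes and emergencies; the install/reinstall cost is ignored on the assumption that a tool like Puppet amortizes it to near zero:

```python
def ready_for_more_boxes(hours_per_box, people_hours_per_week, boxes):
    """Rough check: does the available weekly labor cover the
    upkeep of the current boxes plus one more?

    `hours_per_box` is an assumed average weekly maintenance cost
    per server; all numbers here are hypothetical.
    """
    needed = hours_per_box * (boxes + 1)
    return people_hours_per_week >= needed

# Hypothetical numbers: 3 admins at 4 h/week each, 2 h/week per box.
print(ready_for_more_boxes(hours_per_box=2, people_hours_per_week=12, boxes=4))  # True
print(ready_for_more_boxes(hours_per_box=2, people_hours_per_week=12, boxes=6))  # False
```

The point is not the arithmetic itself but making the labor budget explicit, so a group notices before (not after) it takes on more infrastructure than its people can sustain.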

Towards a resource sharing protocol

If the problem is restricted just to virtualized environments, a lot of effort can still be lost in system and service replication even while labour is saved elsewhere. It's important to note that there are groups that can deal with:

  • Hardware.
  • Sysadmin (hosts or virtual spaces).
  • Applications (websites, mailing lists, etc).
  • Other stuff.

Maybe there are groups that can do all of these, but we can assume the first three are the big types of labor, so we'll have groups that can or want to do just one, two or all of these jobs. The problem, then, is how to combine these resources and specializations.

In order to integrate groups working at different "levels" and allow them to share resources, let's start thinking in terms of layers, and a protocol to help groups share things across different layers.

Layer-based replication

Splitting resources into service layers not only helps groups share their resources but also eases the task of service and data replication. That's not new: the whole OSI model addresses at least part of these goals.

As an example, consider two different machines composed by service layers like in the diagram below:

 _________               _________
|         | replication |         |
| service |------------>| service |
|_________|     .------>|_________|
|         |----'        |         |
|   OS    |             |   OS    |
|_________|             |_________|

 machine 1               machine 2

We can even consider that:

  • Each service can inherit some characteristics from the lower layers.
  • Hence, each service can depend on some set of characteristics from the lower layers.
  • Service/data replication can be dealt with as backups, and vice-versa.
  • Layers can usually "see" and access each other, but an upper layer usually cannot replicate the lower ones, while the contrary is generally true.

Then, replication can in general take place from a layer that copies itself to another layer it has read/write access to, or from lower layers that copy themselves and the upper layers to other places. There isn't even a requisite that a layer (e.g. a vserver) copy itself to another vserver: it can copy itself to wherever it has access, although some groups might prefer to set a layer hierarchy and replication rules to avoid confusion.

The property inheritance between layers should be considered when replicating a layer: the overall characteristics of a layer depend on the characteristics of the layers below it.
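The replication rule above (lower layers can replicate upper ones, but not the contrary) can be sketched as a trivial check. The bottom-up level numbering is an assumption for illustration only:

```python
def can_replicate(source_level, target_level):
    """A layer may replicate itself and any layer above it, but an
    upper layer cannot replicate the layers below it.

    Levels are indexed bottom-up (0 = machine, 1 = OS,
    2 = service, ...); purely illustrative.
    """
    return target_level >= source_level

# The OS layer (1) may replicate the service layer (2) above it...
print(can_replicate(1, 2))  # True
# ...but the service layer cannot replicate the OS below it.
print(can_replicate(2, 1))  # False
```

A real policy would also encode where each layer has read/write access, since the text allows a layer to copy itself to any place it can reach, not just to a sibling of the same kind.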

Layers can even be virtual: consider the equivalence of these two systems:

 _________
|         |
|  email  |
|_________|                    ____________
|         |                   |            |
|  Zone   |    equivalence    | OS / Email |
|_________|  -------------->  |____________|
|         |                   |            |
|   OS    |                   |   Machine  |
|_________|                   |____________|
|         |
| Machine |
|_________|

Then, for a layer sitting on top of either system, there's not much difference in terms of property inheritance.

How RSP fits into all this mess

The RSP can be used to help the groups:

  • Evaluate their available resources and possible uses for them, like establishing classes of resources with similar metadata that fit sets of uses.
  • Inform other groups about the resources they have and need.
  • Set their layer/replication policy.
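As a purely hypothetical sketch of those three uses (the field names below are invented for illustration and come from no actual RSP specification), a group's resource advertisement to its peers might look like:

```python
# Hypothetical advertisement a group could publish to allied groups;
# every key name here is an assumption, not a defined RSP field.
advertisement = {
    "group": "example-collective",
    "offers": [
        {"class": "virtual-space", "tech": "vserver", "disk_gb": 20},
    ],
    "needs": [
        {"class": "backup-space", "tech": None, "disk_gb": 50},
    ],
    "replication_policy": "internal-backups-to-peers",
}

def resource_classes(ad):
    """List the resource classes a group offers and needs."""
    return (
        [o["class"] for o in ad["offers"]],
        [n["class"] for n in ad["needs"]],
    )

print(resource_classes(advertisement))  # (['virtual-space'], ['backup-space'])
```

Exchanging such structured descriptions is what would let the evaluation, matching and policy steps described throughout this document happen between groups rather than inside each one's head.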