May 7, 2017

What Are Capabilities?

 

Some preliminary remarks

You can skip this initial section, which just sets some context, without loss to the technical substance of the essay that follows, though perhaps at some loss in entertainment value.

At a gathering of some of my coconspirators, er, friends a couple months ago, Alan Karp lamented the lack of a good, basic introduction to capabilities for folks who aren’t already familiar with the paradigm. There’s been a lot written on the topic, but it seems like everything is either really long (though if you’re up for a great nerdy read I recommend Mark Miller’s PhD thesis), or really old (I think the root of the family tree is probably Dennis and Van Horn’s seminal paper from 1966), or embedded in explanations of specific programming languages (such as Marc Stiegler’s excellent E In A Walnut or the capabilities section of the Pony language tutorial) or operating systems (such as KeyKOS or seL4), or descriptions of things that use capabilities (like smart contracts or distributed file storage), or discussions of aspects of capabilities (Norm Hardy has written a ton of useful fragments on his website). But nothing that’s just a good “here, read this” that you can toss at curious people who are technically able but unfamiliar with the ideas. So Alan says, “somebody should write something like that,” while giving me a meaningful stare. Somebody, huh? OK, I can take a hint. I’ll give it a shot. Given my tendency to Explain Things, this will probably end up being a lot longer than what Alan wants, but ya gotta start somewhere.

The first thing to confront is that term, “capabilities”, itself. It’s confusing. The word has a perfectly useful everyday meaning, even in the context of software engineering. When I was at PayPal, for example, people would regularly talk about our system’s capabilities, meaning what it can do. And this everyday meaning is actually pretty close to the technical meaning, because in both cases we’re talking about what a system “can” do, but usually what people mean by that is the functionality it realizes rather than the permissions it has been given. One path out of this terminological confusion takes its cue from the natural alignment between capabilities and object oriented programming, since it’s very easy to express capability concepts with object oriented abstractions (I’ll get into this shortly). This has led, without too much loss of meaning, to the term “object capabilities”, which embraces this affinity. This phrase has the virtue that we can talk about it in abbreviated form as “ocaps” and slough off some of the lexical confusion even further. It does have the downside that there are some historically important capability systems that aren’t really what you’d think of as object oriented, but sometimes oversimplification is the price of clarity. The main thing is, just don’t let the word “capabilities” lead you down the garden path; instead, focus on the underlying ideas.

The other thing to be aware of is that there’s some controversy surrounding capabilities. Part of this is a natural immune response to criticism (nobody likes being told that they’re doing things all wrong), part of it is academic tribalism at work, and part of it is the engineer’s instinctive and often healthy wariness of novelty. I almost hesitate to mention this (some of my colleagues might argue I shouldn’t have), but it’s important to understand the historical context if you read through the literature. Some of the pushback these ideas have received doesn’t really have as much to do with their actual merits or lack thereof as one might hope; some of it is profoundly incorrect nonsense and should be called out as such.

The idea

Norm Hardy summarizes the capability mindset with the admonition “don’t separate designation from authority”. I like this a lot, but it’s the kind of zen aphorism that’s mainly useful to people who already understand it. To everybody else, it just invites questions: (a) What does that mean? and (b) Why should I care? So let’s take this apart and see…

The capability paradigm is about access control. When a system, such as an OS or a website, is presented with a request for a service it provides, it needs to decide if it should actually do what the requestor is asking for. The way it decides is what we’re talking about when we talk about access control. If you’re like most people, the first thing you’re likely to think of is to ask the requestor “who are you?” The fundamental insight of the capabilities paradigm is to recognize that this question is the first step on the road to perdition. That’s highly counterintuitive to most people, hence the related controversy.

For example, let’s say you’re editing a document in Microsoft Word, and you click on the “Save” button. This causes Word to issue a request to the operating system to write to the document file. The OS checks if you have write permission for that file and then allows or forbids the operation accordingly. Everybody thinks this is normal and natural. And in this case, it is: you asked Word, a program you chose to run, to write your changes to a file you own. The write succeeded because the operating system’s access control mechanism allowed it on account of it being your file, but that mechanism wasn’t doing quite what you might think. In particular, it didn’t check whether the specific file write operation in question was the one you asked for (because it can’t actually tell), it just checked if you were allowed to do it.

The access control model here is what’s known as an ACL, which stands for Access Control List. The basic idea is that for each thing the operating system wants to control access to (like a file, for example), it keeps a list of who is allowed to do what. The ACL model is how every current mainstream operating system handles this, so it doesn’t matter if we’re talking about Windows, macOS, Linux, FreeBSD, iOS, Android, or whatever. While there are a lot of variations in the details of how they handle access control (e.g., the Unix file owner/group/others model, or the principal-per-app model common on phone OSs), in this respect they’re all fundamentally the same.

As I said, this all seems natural and intuitive to most people. It’s also fatally flawed. When you run an application, as far as the OS is concerned, everything the application does is done by you. Another way to put this is, an application you run can do anything you can do. This seems OK in the example we gave of Word saving your file. But what if Word did something else, like transmit the contents of your file over the internet to a server in Macedonia run by the mafia, or erase any of your files whose names begin with a vowel, or encrypt all your files and demand payment in bitcoins to decrypt them? Well, you’re allowed to do all those things, if for some crazy reason you wanted to, so it can too. Now, you might say, we trust Word not to do evil stuff like that. Microsoft would get in trouble. People would talk. And that’s true. But it’s not just Microsoft Word, it’s every single piece of software in your computer, including lots of stuff you don’t even know is there, much of it originating from sources far more difficult to hold accountable than Microsoft Corporation, if you even know who they are at all.

The underlying problem is that the access control mechanism has no way to determine what you really wanted. One way to deal with this might be to have the operating system ask you for confirmation each time a program wants to do something that is access controlled: “Is it OK for Word to write to this file, yes or no?” Experience with this approach has been pretty dismal. Completely aside from the fact that this is profoundly annoying, people quickly become trained to reflexively answer “yes” without a moment’s thought, since that’s almost always the right answer anyway and they just want to get on with whatever they’re doing. Plus, a lot of the access controlled operations a typical program does are internal things (like fiddling with a configuration file, for example) whose appropriateness the user has no way to determine anyhow.

An alternative approach starts by considering how you told Word what you wanted in the first place. When you first opened the document for editing, you typically either double-clicked on an icon representing the file, or picked the file from an Open File dialog. Note, by the way, that both of these user interface interactions are typically implemented by the operating system (or by libraries provided by the operating system), not by Word. The way current APIs work, what happens in either of these cases is that the operating system provides the application with a character string: the pathname of the file you chose. The application is then free to use this string however it likes, typically passing it as a parameter to another operating system API call to open the file. But this is actually a little weird: you designated a file, but the operating system turned this into a character string which it gave to Word, and then when Word actually wanted to open the file it passed the string back to the operating system, which converted it back into a file again. As I’ve said, this works fine in the normal case. But Word is not actually limited to using just the string that names the particular file you specified. It can pass any string it chooses to the Open File call, and the only access limitation it has is what permissions you have. If it’s your own computer, that’s likely to be permissions to everything on the machine, but certainly it’s at least permissions to all your stuff.

Now imagine things working a little differently. Imagine that when Word starts running it has no access at all to anything that’s access controlled – no files, peripheral devices, networks, nothing. When you double click the file icon or pick from the open file dialog, instead of giving Word a pathname string, the operating system itself opens the file and gives Word a handle to it (that is, it gives Word basically the same thing it would have given Word in response to the Open File API call when doing things the old way). Now Word has access to your document, but that’s all. It can’t send your file to Macedonia, because it doesn’t have access to the network – you didn’t give it that, you just gave it the document. It can’t delete or encrypt any of your other files, because it wasn’t given access to any of them either. It can mess up the one file you told it to edit, but it’s just the one file, and if it did that you’d stop using Word and not suffer any further damage. And notice that the user experience – your experience – is exactly the same as it was before. You didn’t have to answer any “mother may I?” security questions or put up with any of the other annoying stuff that people normally associate with security. In this world, that handle to the open file is an example of what we call a “capability”.
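To make the contrast concrete, here is a minimal sketch of the two styles of API, written in Java with invented interface names (no real operating system exposes exactly these; it’s just to pin down the shape of the idea):

    import java.io.IOException;
    import java.io.OutputStream;

    // Legacy style: the application holds a string and asks the system to turn
    // it into file access. Nothing stops it from passing any other string.
    interface LegacyFileApi {
        OutputStream openForWrite(String pathname) throws IOException;
    }

    // Capability style: the system (say, the file dialog) opens the file itself
    // and hands the application an already-open handle. That handle is the only
    // file the application can reach.
    interface DocumentEditor {
        void edit(OutputStream openDocument);
    }

In the second style, the act of designating the file (picking it in the dialog) and the act of authorizing access to it (handing over the open handle) are one and the same.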

This is where we get back to Norm Hardy’s “don’t separate designation from authority” motto. By “designation” we mean how we indicate to, for example, the OS, which thing we are talking about. By “authority” we mean what we are allowed by the OS to do with that thing. In the traditional ACL world, these are two largely disconnected concepts. In the case of a file, the designator is typically a pathname – a character string – that you use to refer to the file when operating upon it. The OS provides operations like Write File or Delete File that are parameterized by the path name of the file to be written to or deleted. Authority is managed separately as an ACL that the OS maintains in association with each file. This means that the decision to grant access to a file is unrelated to the decision to make use of it. But this in turn means that the decision to grant access has to be made without knowledge of the specific uses. It means that the two pieces of information the operating system needs in order to make its access control decision travel to it via separate routes, with no assurance that they will be properly associated with each other when they arrive. In particular, it means that a program can often do things (or be fooled into doing things) that were never intended to be allowed.

Here’s the original example of the kind of thing I’m talking about, a tale from Norm. It’s important to note, by the way, that this is an actual true story, not something I just made up for pedagogical purposes.

Once upon a time, Norm worked for a company that ran a bunch of timeshared computers, kind of like what we now call “cloud computing” only with an earlier generation’s buzzwords. One service they provided was a FORTRAN compiler, so customers could write their own software.

It being so many generations of Moore’s Law ago, computing was expensive, so each time the compiler ran it wrote a billing record to a system accounting file noting the resources used, so the customer could be charged for them. Since this was a shared system, the operators knew to be careful with file permissions. So, for example, if you told the compiler to output to a file that belonged to somebody else, this would fail because you didn’t have permission. They also took care to make sure that only the compiler itself could write to the system accounting file – you wouldn’t want random users to mess with the billing records, that would obviously be bad.

Then one day somebody figured out they could tell the compiler the name of the system accounting file as the name of the file to write the compilation output to. The access control system looked at this and asked, “does this program have permission to write to this file?” – and it did! And so the compiler was allowed to overwrite the billing records and the billing information was lost and everybody got all their compilations for free that day.

Fixing this turned out to be surprisingly slippery. Norm named the underlying problem “The Confused Deputy”. At heart, the FORTRAN compiler was deputized by two different masters: the customer and the system operators. To serve the customer, it had been given permission to access the customer’s files. To serve the operators, it had been given permission to access the accounting file. But it was confused about which master it was serving for which purpose, because it had no way to associate the permissions it had with their intended uses. It couldn’t specify “use this permission for this file, use that permission for that file”, because the permissions themselves were not distinct things it could wield selectively – the compiler never actually saw or handled them directly. We call this sort of thing “ambient authority”, because it’s just sitting there in the environment, waiting to be used automatically without regard to intent or context.

If this system had been built on capability principles, rather than accessing the files by name, the compiler would instead have been given a capability by the system operators to access the accounting file with, which it would use to update the billing records, and then gotten a different capability from the customer to access the output file, which it would use when outputting the result of the compilation. There would have been no confusion and no exploit.
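In code terms, the fix amounts to the compiler holding a distinct reference for each purpose. A hypothetical Java sketch (the actual system long predates anything like this, so treat the names as illustration only):

    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.Reader;
    import java.nio.charset.StandardCharsets;

    // The operators endow the compiler with the billing capability when they set
    // it up; the customer supplies the output capability with each request.
    final class FortranCompiler {
        private final OutputStream billingLog;

        FortranCompiler(OutputStream billingLog) {
            this.billingLog = billingLog;
        }

        void compile(Reader source, OutputStream customerOutput) throws IOException {
            // ... read the source and write object code to customerOutput ...
            billingLog.write("1 compilation\n".getBytes(StandardCharsets.UTF_8));
        }
    }

There is no string the customer can supply that makes customerOutput refer to the billing file, because the billing capability is never in the customer’s hands at all.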

You might think this is some obscure problem those old people had back somewhere at the dawn of the industry, but a whole bunch of security problems plaguing us today – which you might think are all very different – fit this template, including many kinds of injection attacks, cross-site request forgery, cross-site scripting attacks, click-jacking – in all, depending on how you look at it, somewhere between 5 and 8 members of the OWASP top 10 list. These are all arguably confused deputy problems, manifestations of this one conceptual flaw first noticed in the 1970s!

Getting more precise

We said separating designation from authority is dangerous, and that instead these two things should be combined, but we didn’t really say much about what it actually means to combine them. So at this point I think it’s time to get a bit more precise about what a capability actually is.

A capability is a single thing that both designates a resource and authorizes some kind of access to it.

There’s a bunch of abstract words in there, so let’s unpack it a bit.

By resource we just mean something the access control mechanism controls access to. It’s some specific thing we have that somebody might want to use somehow, whose use we seek to regulate. It could be a file, an I/O device, a network connection, a database record, or really any kind of object. The access control mechanism itself doesn’t much care what kind of thing the resource is or what someone wants to do with it. In specific use cases, of course, we care very much about those things, but then we’re talking about what we use the access control mechanism for, not about how it works.

In the same vein, when we talk about access, we just mean actually doing something that can be done with the resource. Access could be reading, writing, invoking, using, destroying, activating, or whatever. Once again, which of these it is is important for specific uses but not for the mechanism itself. Also, keep in mind that the specific kind of access that’s authorized is one of the things the capability embodies. Thus, for example, a read capability to a file is a different thing from a write capability to the same file (and of course, there might be a read+write capability to that file, which would be yet a third thing).
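Jumping slightly ahead to the object-flavored notation we’ll adopt below, a hypothetical sketch of that distinction might look like this (invented interfaces, not any particular system’s API):

    // A read capability, a write capability, and a read+write capability to the
    // same file are three distinct objects with three distinct interfaces.
    interface ReadCap      { byte[] read(); }
    interface WriteCap     { void write(byte[] data); }
    interface ReadWriteCap extends ReadCap, WriteCap { }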

By designation, we mean indicating, somehow, specifically which resource we’re talking about. And by authorizing we mean that we are allowing the access to happen. Hopefully, none of this is any surprise.

Because the capability combines designation with authority, the possessor of the capability exercises their authority – that is, does whatever it is they are allowed to do with the resource the capability is a capability to – by wielding the capability itself. (What that means in practice should be clearer after a few examples). If you don’t possess the capability, you can’t use it, and thus you don’t have access. Access is regulated by controlling possession.

A key idea is that capabilities are transferable, that someone who possesses a capability can convey it to someone else. An important implication that falls out of this is that capabilities fundamentally enable delegation of authority. If you are able to do something, it means you possess a capability for that something. If you pass this capability to somebody else, then they are now also able to do whatever it is. Delegation is one of the main things that make capabilities powerful and useful. However, it also tends to cause a lot of people to freak out at the apparent loss of control. A common response is to try to invent mechanisms to limit or forbid delegation, which is a terrible idea and won’t work anyway, for reasons I’ll get into.

If you’re one of these people, please don’t freak out yourself; I’ll come back to this shortly and explain some important capability patterns that hopefully will address your concerns. In the meantime, a detail that might be helpful to meditate on: two capabilities that authorize the same access to the same resource are not necessarily the same capability (note: this is just a hint to tease the folks who are trying to guess where this goes, so if you’re not one of those people, don’t worry if it’s not obvious).

Another implication of our definition is that capabilities must be unforgeable. By this we mean that you can’t by yourself create a capability to a resource that you don’t already have access to. This is a basic requirement that any capability system must satisfy. For example, using pathnames to designate files is problematic because anybody can create any character string they want, which means they can designate any file they want if pathnames are how you do it. Pathnames are highly forgeable. They work fine as designators, but can’t by themselves be used to authorize access. In the same vein, an object pointer in C++ is forgeable, since you can typecast an integer into a pointer and thus produce a pointer to any kind of object at any memory address of your choosing, whereas in Java, Smalltalk, or pretty much any other memory-safe language where this kind of casting is not available, an object reference is unforgeable.

As I’ve talked about all this, I’ve tended to personify the entities that possess, transfer, and wield capabilities – for example, sometimes by referring to one of them as “you”. This has let me avoid saying much about what kind of entities these are. I did this so you wouldn’t get too anchored in specifics, because there are many different ways capability systems can work, and the kinds of actors that populate these systems vary. In particular, personification let me gloss over whether these actors were bits of software or actual people. However, we’re ultimately interested in building software, so now let’s talk about these entities as “objects”, in the traditional way we speak of objects in object oriented programming. By getting under the hood a bit, I hope things may be made a little easier to understand. Later on we can generalize to other kinds of systems beyond OOP.

I’ll alert you now that I’ll still tend to personify these things a bit. It’s helpful for us humans, in trying to understand the actions of an intentional agent, to think of it as if it’s a person even if it’s really code. Plus – and I’ll admit to some conceptual ju-jitsu here – we really do want to talk about objects as distinct intentional agents. Another of the weaknesses of the ACL approach is that it roots everything in the identity of the user (or other vaguely user-like abstractions like roles, groups, service accounts, and so on) as if that user was the one doing things, that is, as if the user is the intentional agent. However, when an object actually does something it does it in a particular way that depends on how it is coded. While this behavior might reflect the intentions of the specific user who ultimately set it in motion, it might as easily reflect the intentions of the programmers who wrote it – more often, in fact, because most of what a typical piece of software does involves internal mechanics that we often work very hard to shield the user from having to know anything about.

In what we’re calling an “object capability” system (or “ocap” system, to use the convenient contraction I mentioned in the beginning), a reference to an object is a capability. The interesting thing about objects in such a system is that they are both wielders of capabilities and resources themselves. An object wields a capability – an object reference – by invoking methods on it. You transfer a capability by passing an object reference as a parameter in a method invocation, returning it from a method, or by storing it in a variable. An ocap system goes beyond an ordinary OOP system by imposing a couple additional requirements: (1) that object references be unforgeable, as just discussed, and (2) that there be some means of strong encapsulation, so that one object can hold onto references to other objects in a way that these can’t be accessed from outside it. For example, you can implement ocap principles in Java using ordinary Java object references held in private instance variables (to make Java itself into a pure ocap language – which you can totally do, by the way – requires introducing a few additional rules, but that’s more detail than we have time for here).
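Here’s a minimal sketch of what that looks like in ordinary Java. Nothing in it is special capability machinery, just disciplined use of references and private fields (the class names are invented for illustration):

    // The reference held in the private field is the capability; the public
    // methods are the access it grants.
    final class Counter {
        private int count = 0;
        public void increment() { count += 1; }
        public int value() { return count; }
    }

    final class Turnstile {
        private final Counter counter;   // held via strong encapsulation

        Turnstile(Counter counter) {
            this.counter = counter;
        }

        public void pass() {
            counter.increment();         // wielding the capability
        }
    }

A Turnstile can bump the count but can never reset it, because the only authority it holds is the Counter’s method interface, and nothing outside the Turnstile can reach its Counter through it.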

In an ocap system, there are only three possible ways you can come to have a capability to some resource, which is to say, to have a reference to some object: creation, transfer, and endowment.

Creation means you created the resource yourself. We follow the convention that, as a byproduct of the act of creation, the creator receives a capability that provides full access to the new resource. This is natural in an OOP system, where an object constructor typically returns a reference to the new object it constructed. In a sense, creation is an optional feature, because it’s not actually a requirement that a capability system have a way to produce new resources at all (that is, it might be limited to resources that already exist), but if it does, there needs to be a way for the new resources to enter into the capability world, and handing them to their creator is a good way to do it.

Transfer means somebody else gave the capability to you. This is the most important and interesting case. Capability passing is how the authority graph – the map of who has what authority to do what with what – can change over time (by the way, the lack of a principled way to talk about how authorities change over time is another big problem with the ACL model). The simple idea is: Alice has a capability to Bob, Alice passes this capability to Carol, now Carol also has a capability to Bob. That simple narrative, however, conceals some important subtleties. First, Alice can only do this if she actually possesses the capability to Bob in the first place. Hopefully this isn’t surprising, but it is important. Second, Alice also has to have a capability to Carol (or some capability to communicate with Carol, which amounts to the same thing). Now things get interesting; it means we have a form of confinement, in that you can’t leak a capability unless you have another capability that lets you communicate with someone to whom you’d leak it. Third, Alice had to choose to pass the capability on; neither Bob nor Carol (nor anyone else) could cause the transfer without Alice’s participation (this is what motivates the requirement for strong encapsulation).
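A sketch of that narrative in Java (hypothetical classes, just to pin down the mechanics of who can do what):

    final class Bob {
        void greet() { System.out.println("hi"); }
    }

    final class Carol {
        private Bob bob;                 // Carol starts with no access to Bob

        void receiveBob(Bob bob) {       // transfer arrives as a method argument
            this.bob = bob;
        }

        void useBob() {
            if (bob != null) bob.greet();
        }
    }

    final class Alice {
        private final Bob bob;           // Alice's capability to Bob
        private final Carol carol;       // Alice's capability to (communicate with) Carol

        Alice(Bob bob, Carol carol) {
            this.bob = bob;
            this.carol = carol;
        }

        void introduce() {
            carol.receiveBob(bob);       // now Carol also has a capability to Bob
        }
    }

Alice can make the introduction only because she already holds references to both Bob and Carol, and only because she chooses to; nobody else can make the transfer happen on her behalf.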

Endowment means you were born with the capability. An object’s creator can give it a reference to some other object as part of its initial state. In one sense, this is just creation followed by transfer. However, we treat endowment as its own thing for a couple of reasons. First, it’s how we can have an immutable object that holds a capability. Second, it’s how we avoid infinite regress when we follow the rules to their logical conclusion.

Endowment is how objects end up with capabilities that are realized by the ocap system implementation itself rather than by code executing within it. What this means varies depending on the nature of the system; for example, an ocap language framework running on a conventional OS might provide a capability-oriented interface to the OS’s non-capability-oriented file system. An ocap operating system (such as KeyKOS or seL4) might provide capability-oriented access to primitive hardware resources such as disk blocks or network interfaces. In both cases we’re talking about things that exist outside the ocap model, which must be wrapped in special privileged objects that have native access to those things. Such objects can’t be created within the ocap rules, so they have to be endowed by the system itself.

So, to summarize: in the ocap model, a resource is an object and a capability is an object reference. The access that a given capability enables is the method interface that the object reference exposes. Another way to think of this is: ocaps are just object oriented programming with some additional strictness.

Here we come to another key difference from the ACL model: in the ocap world, the kinds of resources that may be access controlled, and the varieties of access to them that can be provided, are typically more diverse and more finely grained. They’re also generally more dynamic, since it’s usually possible, and indeed normal, to introduce new kinds of resources over time, with new kinds of access affordances, simply by defining new object classes. In contrast, the typical ACL framework has a fixed set of resource types baked into it, along with a small set of access modes that can be separately controlled. This difference is not fundamental – you could certainly create an extensible ACL system or an ocap framework based on a small, static set of object types – but it points to an important philosophical divergence between the two approaches.

In the ACL model, access decisions are made on the basis of access configuration settings associated with the resources. These settings must be administered, often manually, by direct interaction with the access control machinery, typically using tools that are part of the access control system itself. While policy abstractions (such as groups or roles, for example) can reduce the need for humans to make large numbers of individual configuration decisions, it is typically the case that each resource acquires its access control settings as the consequence of people making deliberate access configuration choices for it.

In contrast, the ocap approach dispenses with most of this configuration information and its attendant administrative activity. The vast majority of access control decisions are realized by the logic of how the resources themselves operate. Most access control choices are subsumed by the code of the corresponding objects. At the granularity of individual objects, the decisions to be made are usually simple and clear from context, further reducing the cognitive burden. Only at the periphery, where the system comes into actual contact with its human users, do questions of policy and human intent arise. And in many of these cases, intent can be inferred from the normal acts of control and designation that users make through their normal UI interactions (such as picking a file from a dialog or clicking on a save button, to return to the example we began with).

Consequently, thinking about access control policy and administration is an entirely different activity in an ocap system than in an ACL system. This thinking extends into the fundamental architecture of applications themselves, as well as that of things like programming languages, application frameworks, network protocols, and operating systems.

Capability patterns

To give you a taste of what I mean by affecting fundamental architecture, let’s fulfill the promise I made earlier to talk about how we address some of the concerns that someone from a more traditional ACL background might have.

The ocap approach both enables and relies on compositionality – putting things together in different ways to make new kinds of things. This isn’t really part of the ACL toolbox at all. The word “compositionality” is kind of vague, so I’ll illustrate what I’m talking about with some specific capability patterns. For discussion purposes, I’m going to group these patterns into a few rough categories: modulation, attenuation, abstraction, and combination. Note that there’s nothing fundamental about these, they’re just useful for presentation.

Modulation

By modulation, I mean having one object modulate access to another. The most important example of this is called a revoker. A major source of the anxiety that some people from an ACL background have about capabilities is the feeling that a capability can easily escape their control. If I’ve given someone access to some resource, what happens if later I decide it’s inappropriate for them to have it? In the ACL model, the answer appears to be simple: I merely remove that person’s entry from the resource’s ACL. In the ocap model, if I’ve given them one of my capabilities, then now they have it too, so what can I do if I don’t want them to have it any more? The answer is that I didn’t give them my capability. Instead I gave them a new capability that I created, a reference to an intermediate object that holds my capability but remains controlled by me in a way that lets me disable it later. We call such a thing a revoker, because it can revoke access. A rudimentary form of this is just a simple message forwarder that can be commanded to drop its forwarding pointer.
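A bare-bones revoker might look like this (a hypothetical Java sketch; a real one would wrap whatever interface the underlying capability exposes in the same way):

    import java.io.IOException;
    import java.io.OutputStream;

    // I hand out the RevocableOutput instead of my real stream. Dropping the
    // forwarding pointer cuts off access without touching the underlying resource.
    final class RevocableOutput extends OutputStream {
        private OutputStream target;

        RevocableOutput(OutputStream target) {
            this.target = target;
        }

        void revoke() {
            target = null;
        }

        @Override
        public void write(int b) throws IOException {
            if (target == null) throw new IOException("access revoked");
            target.write(b);
        }
    }

In a more careful version the revoke operation would live on a separate facet held only by the grantor, so the grantee never even sees it; this sketch compresses the two roles into one object for brevity.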

Modulation can be more sophisticated than simple revocation. For example, I could provide someone with a capability that I can switch on or off at will. I could make access conditional on time or date or location. I could put controls on the frequency or quantity of use (a use-once capability with a built-in expiration date might be particularly useful). I could even make an intermediary object that requires payment in exchange for access. The possibilities are limited only by need and imagination.
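For instance, a use-once capability with a built-in expiration date is just another thin wrapper (again a hypothetical Java sketch):

    import java.time.Instant;
    import java.util.function.Supplier;

    // Forwards at most one use of the wrapped capability, and only before the deadline.
    final class UseOnceBefore<T> implements Supplier<T> {
        private Supplier<T> target;
        private final Instant deadline;

        UseOnceBefore(Supplier<T> target, Instant deadline) {
            this.target = target;
            this.deadline = deadline;
        }

        @Override
        public synchronized T get() {
            if (target == null || Instant.now().isAfter(deadline)) {
                throw new IllegalStateException("capability expired");
            }
            Supplier<T> once = target;
            target = null;               // self-revokes after the first use
            return once.get();
        }
    }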

The revoker pattern solves the problem of taking away access, but what about controlling delegation? Capabilities are essentially bearer instruments – they convey their authority to whoever holds them, without regard to who the holder is. This means that if I give someone a capability, they could pass it to someone else whose access I don’t approve of. This is another big source of anxiety for people in the ACL camp: the idea that in the capability model there’s no way to know who has access. This is not rooted in some misunderstanding of capabilities either; it’s actually true. But the ACL model doesn’t really help with this, because it has the same problem.

In real world use cases, the need to share resources and to delegate access is very common. Since the ACL model provides no mechanism for this, people fall back on sharing credentials, often in the face of corporate policies or terms of use that specifically forbid this. When presented with the choice between following the rules and getting their job done, people will often pick the latter. Consider, for example, how common it is for a secretary or executive assistant to know their boss’s password – in my experience, it’s almost universal.

There’s a widespread belief that an ACL tells you who has access, but this is just an illusion, due to the fact that credential sharing is invisible to the access control system. What you really have is something that tells you who to hold responsible if a resource is used inappropriately. And if you think about it, this is what you actually want anyway. The ocap model also supports this type of accountability, but can do a much better job of it.

The first problem with credential sharing is that it’s far too permissive. If my boss gives me their company LDAP password so I can access their calendar and email, they’re also giving me access to everything else that’s protected by that password, which might extend to things like sensitive financial or personnel records, or even the power to spend money from the company bank account. Capabilities, in contrast, allow them to selectively grant me access to specific things.

The second problem with credential sharing is that if I use my access inappropriately, there’s no way to distinguish my accesses from theirs. It’s hard for my boss to claim “my flunky did it!” if the activity logs are tied to the boss’s name, especially if they weren’t supposed to have shared the credentials in the first place. And of course this risk applies in the other direction as well: if it’s an open secret that I have my boss’s password, suspicion for their misbehavior can fall on me; indeed, if my boss was malicious they might share credentials just to gain plausible deniability when they abscond with company assets. The revoker pattern, however, can be extended to enable delegation to be auditable. I delegate by passing someone an intermediary object that takes note of who is being delegated to and why, and then it can record this information in an audit log when it is used. Now, if the resource is misused, we actually know who to blame.
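An auditing forwarder is only slightly more code than the plain revoker. Another hypothetical Java sketch:

    import java.io.IOException;
    import java.io.OutputStream;

    // The intermediary remembers who it was created for and records every use,
    // so misuse of the delegated authority can be traced to the delegate.
    final class AuditedOutput extends OutputStream {
        private final OutputStream target;
        private final String delegateName;
        private final Appendable auditLog;

        AuditedOutput(OutputStream target, String delegateName, Appendable auditLog) {
            this.target = target;
            this.delegateName = delegateName;
            this.auditLog = auditLog;
        }

        @Override
        public void write(int b) throws IOException {
            auditLog.append("write by " + delegateName + "\n");
            target.write(b);
        }
    }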

Keep in mind also that credential sharing isn’t limited to shared passwords. For example, if somebody asks me to run a program for them, then whatever it is that they wanted done by that program gets done using my credentials. Even if what the program did was benign and the request was made with the best of intentions, we’ve still lost track of who was responsible. This is the reason why some companies forbid running software they haven’t approved on company computers.

Attenuation

When I talk about attenuation, I mean reducing what a capability lets you do – its scope of authority. The scope of authority can encompass both the operations that are enabled and the range of resources that can be accessed. The latter is particularly important, because it’s quite common for methods on an object’s API to return references to other objects as a result (once again, a concept that is foreign to the ACL world). For example, one might have a capability that gives access to a computer’s file system. Using this, an attenuator object might instead provide access only to a specific file, or perhaps to some discrete sub-directory tree in a file hierarchy (i.e., a less clumsy version of what the Unix chroot operation does).

Attenuating functionality is also possible. For example, the base capability to a file might allow any operation the underlying file system supports: read, write, append, delete, truncate, etc. From this you can readily produce a read-only capability to the same file: simply have the intermediary object support read requests without providing any other file API methods.
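Such an attenuator can be a very small amount of code. A hypothetical Java sketch (a real one would mirror whatever operations the underlying file API actually offers):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.util.Arrays;

    // Holds full authority over the file privately, but passes through only reads.
    final class ReadOnlyFacet {
        private final RandomAccessFile file;

        ReadOnlyFacet(RandomAccessFile file) {
            this.file = file;
        }

        byte[] read(long offset, int length) throws IOException {
            byte[] buffer = new byte[length];
            file.seek(offset);
            int count = file.read(buffer, 0, length);
            return count == length ? buffer : Arrays.copyOf(buffer, Math.max(count, 0));
        }
    }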

Of course, these are composable: one could readily produce a read-only capability to a particular file from a capability providing unlimited access to an entire file system. Attenuators are particularly useful for packaging access to the existing, non-capability oriented world into capabilities. In addition to the hierarchical file system wrapper just described, attenuators are helpful for mediating access to network communications (for example, limiting connections to particular domains, allowing applications to be securely distributed across datacenters without also enabling them to talk to arbitrary hosts on the internet – the sort of thing that would normally be regulated by firewall configuration, but without the operational overhead or administrative inconvenience). Another use would be controlling access to specific portions of the rendering surface of a display device, something that many window systems already do in an almost capability-like fashion anyway.

Abstraction

Abstraction enters the picture because once we have narrowed what authority a given capability represents, it often makes sense to refactor what it does into something with a more appropriately narrow set of affordances. For example, it might make sense to package the read-only file capability mentioned above into an input stream object, rather than something that represents a file per se. At this point you might ask if this is really any different from ordinary good object oriented programming practice. The short answer is, it’s not – capabilities and OOP are strongly aligned, as I’ve mentioned several times already. A somewhat longer answer is that the capability perspective usefully shapes how you design interfaces.
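A hypothetical sketch of that repackaging, continuing in the same vein as the attenuation example above:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.RandomAccessFile;

    // The recipient's interface now says "stream of bytes", not "file": the
    // abstraction has narrowed along with the authority.
    final class FileBytesStream extends InputStream {
        private final RandomAccessFile file;

        FileBytesStream(RandomAccessFile file) {
            this.file = file;
        }

        @Override
        public int read() throws IOException {
            return file.read();          // returns -1 at end of file
        }
    }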

A core idea that capability enthusiasts use heavily is the Principle of Least Authority (abbreviated POLA, happily pronounceable). The principle states that objects should be given only the specific authority necessary to do their jobs, and no more. The idea is that the fewer things an object can do, the less harm can result if it misbehaves or if its integrity is breached.

Least Authority is related to the notions of Least Privilege or Least Permission that you’ll frequently see in a lot of the traditional (non-capability) security literature. In part, this difference in jargon is just a cultural marker that separates the two camps. Often the traditional literature will tell you that authority and permission and privilege all mean more or less the same thing.

However, we really do prefer to talk about “authority”, which we take to represent the full scope of what someone or something is able to do, whereas “permission” refers to a particular set of access settings. For example, on a Unix system I typically don’t have permission to modify the /etc/passwd file, but I do typically have permission to execute the passwd command, which does have permission to modify the file. This command will make selected changes to the file on my behalf, thus giving me the authority to change my password. We also think of authority in terms of what you can actually do. To continue the example of the passwd command, it has permission to delete the password file entirely, but it does not make this available to me, thus it does not convey that authority to me even though it could if it were programmed to do so.

The passwd command is an example of abstracting away the low level details of file access and data formats, instead repackaging them into a more specific set of operations that is more directly meaningful to its user. This kind of functionality refactoring is very natural from a programming perspective, but using it to also refactor access is awkward in the ACL case. ACL systems typically have to leverage slippery abstractions like the Unix setuid mechanism. Setuid is what makes the Unix passwd command possible in the first place, but it’s a potent source of confused deputy problems that’s difficult to use safely; an astonishing number of Unix security exploits over the years have involved setuid missteps. The ocap approach avoids these missteps because the appropriate reduction in authority often comes for free as a natural consequence of the straightforward implementation of the operation being provided.
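To make that concrete, here is a toy ocap rendition of the passwd idea. It is entirely hypothetical (the real command obviously does far more, including hashing and locking), but it shows how the reduction in authority falls out of the implementation:

    import java.util.Map;

    // Holds full write authority over the password store privately, but exposes
    // only the one narrow operation its holder is meant to have.
    final class PasswordChanger {
        private final Map<String, String> passwordStore;
        private final String user;

        PasswordChanger(Map<String, String> passwordStore, String user) {
            this.passwordStore = passwordStore;
            this.user = user;
        }

        void changePassword(String oldPassword, String newPassword) {
            if (!oldPassword.equals(passwordStore.get(user))) {
                throw new IllegalArgumentException("old password does not match");
            }
            passwordStore.put(user, newPassword);
        }
    }

The object has permission to clobber any entry in the store, but the only authority it conveys to its holder is the ability to change one user’s password.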

Combination

When I talk about combination, I mean using two or more capabilities together to create a new capability to some specific joint functionality. In some cases, this is simply the intersection of two different authorities. However, the more interesting cases are when we put things together to create something truly new.

For example, imagine a smartphone running a capability oriented operating system instead of iOS or Android. The hardware features of such a phone would, of course, be accessed via capabilities, which the OS would hand out to applications according to configuration rules or user input. So we could imagine combining three important capabilities: the authority to capture images using the camera, the authority to obtain the device’s geographic position via its built-in GPS receiver, and the authority to read the system clock. These could be encapsulated inside an object, along with a (possibly manufacturer provided) private cryptographic key, yielding a new capability that when invoked provides signed, authenticated, time stamped, geo-referenced images from the camera. This capability could then be granted to applications that require high integrity imaging, like police body cameras, automobile dash cams, journalistic apps, and so on. If this capability is the only way for such applications to get access to the camera at all, then the applications’ developers don’t have to be trusted to maintain a secure chain of evidence for the imagery. This both simplifies their implementation task – they can focus their efforts on their applications’ unique needs instead of fiddling with signatures and image formats – and makes their output far more trustworthy, since they don’t have to prove their application code doesn’t tamper with the data (you still have to trust the phone and the OS, but that’s at least a separable problem).
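A rough sketch of such a combining object, in Java, with invented interfaces standing in for the capabilities this hypothetical OS would hand out:

    import java.nio.charset.StandardCharsets;
    import java.security.PrivateKey;
    import java.security.Signature;
    import java.time.Instant;

    interface Camera { byte[] captureImage(); }
    interface Gps    { String currentPosition(); }   // e.g. "37.4,-122.1"
    interface Clock  { Instant now(); }

    // Combines three capabilities and a device key into a new capability that
    // produces signed, timestamped, geo-referenced images.
    final class AttestedCamera {
        private final Camera camera;
        private final Gps gps;
        private final Clock clock;
        private final PrivateKey deviceKey;

        AttestedCamera(Camera camera, Gps gps, Clock clock, PrivateKey deviceKey) {
            this.camera = camera;
            this.gps = gps;
            this.clock = clock;
            this.deviceKey = deviceKey;
        }

        // Returns { image, metadata, signature over both }.
        byte[][] capture() throws Exception {
            byte[] image = camera.captureImage();
            byte[] metadata = (gps.currentPosition() + " " + clock.now())
                    .getBytes(StandardCharsets.UTF_8);
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(deviceKey);
            signer.update(image);
            signer.update(metadata);
            return new byte[][] { image, metadata, signer.sign() };
        }
    }

An application granted only an AttestedCamera can obtain attested pictures but can never touch the raw camera, the GPS, or the signing key directly.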

What can we do with this?

I’ve talked at length about the virtues of the capability approach, but at the same time observed repeatedly (if only in criticism) that this is not how most contemporary systems work. So even if these ideas are as virtuous as I maintain they are, we’re still left with the question of what use we can make of them absent some counterfactual universe of technical wonderfulness.

There are several ways these ideas can provide direct value without first demanding that we replace the entire installed base of software that makes the world go. This is not to say that the installed base never gets replaced, but it’s a gradual, incremental process. It’s driven by small, local changes rather than by the unfolding of some kind of authoritative master plan. So here are a few incremental ways to apply these ideas to the current world. My hope is that these can deliver enough immediate value to bias practitioners in a positive direction, shaping the incentive landscape so it tilts towards a less dysfunctional software ecosystem. Four areas in particular seem salient to me in this regard: embedded systems, compartmentalized computation, distributed services, and software engineering practices.

Embedded systems

Capability principles are a very good way to organize an operating system. Two of the most noteworthy examples, in my opinion, are KeyKOS and seL4.

KeyKOS was developed in the 1980s for IBM mainframes by Key Logic, a spinoff from Tymshare. In addition to being a fully capability secure OS, it attained extraordinarily high reliability via an amazing, high performance orthogonal persistence mechanism that allowed processes to run indefinitely, surviving things like loss of power or hardware failure. Some commercial KeyKOS installations had processes that ran for years, in a few cases even spanning replacement of the underlying computer on which they were running. Towards the end of its commercial life, KeyKOS was also ported to several other processor architectures, making it a potentially interesting jumping off point for further development. KeyKOS has inspired a number of follow-ons, including EROS, CapROS, and Coyotos. Unfortunately most of these efforts have been significantly resource starved and consequently have not yet had much real world impact. But the code for KeyKOS and its descendants is out there for the taking if anybody wants to give it a go.

seL4 is a secure variant of the L4 operating system, developed by NICTA in Australia. While building on the earlier L3 and L4 microkernels, seL4 is a from scratch design heavily influenced by KeyKOS. seL4 notably has a formal proof of functional correctness, making it an extremely sound basis for building secure and reliable systems. It’s starting to make promising inroads into applications that demand this kind of assurance, such as military avionics. Like KeyKOS, seL4, as well as seL4’s associated suite of proofs, is available as open source software.

Embedded systems, including much of the so called “Internet of Things”, are sometimes less constrained by installed base issues on account of being standalone products with narrow functionality, rather than general purpose computational systems. They often have fewer points where legacy interoperability is important. Moreover, they’re usually cross-developed with tools that already expect the development and runtime environments to be completely different, allowing them to be bootstrapped via legacy toolchains. In other words, you don’t have to port your entire development system to the new OS in order to take advantage of it, but rather can continue using most of your existing tools and workflow processes. This is certainly true of the capability OS efforts I just mentioned, which have all dealt with these issues.

Furthermore, embedded software is often found in mission critical systems that must function reliably in a high threat environment. In these applications, reliability and security can take priority over cost minimization, making the assurances that a capability OS can offer comparatively more attractive. Consequently, using one of these operating systems as the basis for a new embedded application platform seems like an opportunity, particularly in areas where reliability is important.

A number of recent security incidents on the internet have revolved around compromised IoT devices. A big part of the problem is that the application code in these products typically has complete access to everything in the device, largely as a convenience to the developers. This massive violation of least privilege then makes these devices highly vulnerable to exploitation when an attacker finds flaws in the application code.

Rigorously compartmentalizing available functionality would greatly reduce the chances of these kinds of vulnerabilities, but this usually doesn’t happen. Partly this is just ignorance – most of these developers are not generally also security experts, especially when the things they are working on are not, on their face, security sensitive applications. However, I think a bigger issue is that the effort and inconvenience involved in building a secure system with current building blocks doesn’t seem justified by the payoff.

No doubt the developers of these products would prefer to produce more secure systems than they often do, all other things being equal, but all other things are rarely equal. One way to tilt the balance in our favor would be to give them a platform that more or less automatically delivers desirable security and reliability properties as a consequence of developers simply following the path of least resistance. This is the payoff that building on top of a capability OS offers.

Compartmentalized computation

Norm Hardy – one of the primary architects of KeyKOS, who I’ve already mentioned several times – has quipped that “the last piece of software anyone ever writes for a secure, capability OS is always the Unix virtualization layer.” This is a depressing testimony to the power that the installed base has over the software ecosystem. However, it also suggests an important benefit that these kinds of OS’s can provide, even in an era when Linux is the de facto standard.

In the new world of cloud computing, virtualization is increasingly how everything gets done. Safety-through-compartmentalization has long been one of the key selling points driving this trend. The idea is that even if an individual VM is compromised due to an exploitable flaw in the particular mix of application code, libraries, and OS services that it happens to be running, this does not gain the attacker access to other, adjacent VMs running on the same hardware.

The underlying idea – isolate independent pieces of computation so they can’t interfere with each other – is not new. It is to computer science what vision is to evolutionary biology, an immensely useful trick that gets reinvented over and over again in different contexts. In particular, it’s a key idea motivating the architecture of most multitasking operating systems in the first place. Process isolation has long been the standard way for keeping one application from messing up another. What virtualization brings to the table is to give application and service operators control over a raft of version and configuration management issues that were traditionally out of their hands, typically in the domain of the operators of the underlying systems on which they were running. Thus, for example, even if everyone in your company is using Linux it could still be the case that a service you manage depends on some Apache module that only works on Linux version X, while some other wing of your company has a service requiring a version of MySQL that only works with Linux version Y. But with virtualization you don’t need to fight about which version of Linux to run on your company server machines. Instead, you can each have your own VMs running whichever version you need. More significantly, even if the virtualization system itself requires Linux version Z, it’s still not a problem, because it’s at a different layer of abstraction.

Virtualization doesn’t just free us from fights over which version of Linux to use, but which operating system entirely. With virtualization you can run Linux on Windows, or Windows on Mac, or FreeBSD on Linux, or whatever. In particular, it means you can run Linux on seL4. This is interesting because all the mainstream operating systems have structural vulnerabilities that mean they inevitably tend to get breached, and when somebody gets into the OS that’s running the virtualization layer it means they get into all the hosted VMs as well, regardless of their OS. While it’s still early days, initial indications are that seL4 makes a much more solid base for the virtualization layer than Linux or the others, while still allowing the vast bulk of the code that needs to run to continue working in its familiar environment.

By providing a secure base for the virtualization layer, you can provide a safe place to stand for datacenter operators and other providers of virtualized services. You have to replace some of the software that manages your datacenter, but the datacenter’s customers don’t have to change anything to benefit from this; indeed, they need not even be aware that you’ve done it.

This idea of giving applications a secure place to run, a place where the rules make sense and critical invariants can be relied upon – what I like to call an island of sanity – is not limited to hardware virtualization. “Frozen Realms”, currently working its slow way through the JavaScript standardization process, is a proposal to apply ocap-based compartmentalization principles to the execution environment of JavaScript code in the web browser.

The stock JavaScript environment is highly plastic; code can rearrange, reconfigure, redefine, and extend what’s there to an extraordinary degree. This massive flexibility is both blessing and curse. On the blessing side, it’s just plain useful. In particular, a piece of code that relies on features or behavior from a newer version of the language standard can patch the environment of an older implementation to emulate the newer pieces that it needs (albeit sometimes with a substantial performance penalty). This malleability is essential to how the language evolves without breaking the web. On the other hand, it makes it treacherous to combine code from different providers, since it’s very easy for one chunk of code to undermine the invariants that another part relies on. This is a substantial maintenance burden for application developers, and especially for the creators of application frameworks and widely used libraries. And this is before we even consider what can happen if code behaves maliciously.

Frozen Realms is a scheme to let you create an isolated execution environment, configure it with whatever modifications and enhancements it requires, lock it down so that it is henceforth immutable, and then load and execute code within it. One of the goals of Frozen Realms is to enable defensively consistent software – code that can protect its invariants against arbitrary or improper behavior by things it’s interoperating with. In a frozen realm, you can rely on things not to change beneath you unpredictably. In particular, you could load independent pieces of software from separate developers (who perhaps don’t entirely trust each other) into a common realm, and then allow these to interact safely. Ocaps are key to making this work. All of the ocap coding patterns mentioned earlier become available as trustworthy tools, since the requisite invariants are guaranteed. Because the environment is immutable, the only way pieces of code can affect each other is via object references they pass between them. Because all external authority enters the environment via object references originating outside it, rather than being ambiently available, you have control over what any piece of code will be allowed to do. Most significantly, you can have assurances about what it will not be allowed to do.

Distributed services

There are many problems in the distributed services arena for which the capability approach can be helpful. In the interest of not making this already long essay even longer, I’ll just talk here about one of the most important: the service chaining problem, for which the ACL approach has no satisfactory solution at all.

The web is a vast ecosystem of services using services using services. This is especially true in the corporate world, where companies routinely contract with specialized service providers to administer non-core operational functions like benefits, payroll, travel, and so on. These service providers often call upon even more specialized services from a range of other providers. Thus, for example, booking a business trip may involve going through your company’s corporate intranet to the website of the company’s contracted travel agency, which in turn invokes services provided by airlines, hotels, and car rental companies to book reservations or purchase tickets. Those services may themselves call out to yet other services to do things like email you your travel itinerary or arrange to send you a text if your flight is delayed.

Now we have the question: if you invoke one service that makes use of another, whose credentials should be used to access the second one? If the upstream service uses its own credentials, then it might be fooled, by intention or accident, into doing something on your behalf that it is allowed to do but which the downstream service wouldn’t let you do (a classic instance of the Confused Deputy problem). On the other hand, if the upstream service needs your credentials to invoke the downstream service, it can now do things that you wouldn’t allow. In fact, by giving it your credentials, you’ve empowered it to impersonate you however it likes. And the same issues arise for each service invocation farther down the chain.

Consider, for example, a service like Mint that keeps track of your finances and gives you money management advice. In order to do this, they need to access banks, brokerages, and credit card companies to obtain your financial data. When you sign up for Mint, you give them the login names and passwords for all your accounts at all the financial institutions you do business with, so they can fetch your information and organize it for you. While they promise they’re only going to use these credentials to read your data, you’re actually giving them unlimited access and trusting them not to abuse it. There’s no reason to believe they have any intention of breaking their promise, and they do, in fact, take security very seriously. But in the end the guarantee you get comes down to “we’re a big company, if we messed with you too badly we might get in trouble”; there are no technical assurances they can really provide. Instead, they display the logos of various security promoting consortia and double pinky swear they’ll guard your credentials with like really strong encryption and stuff. Moreover, their terms of use work really hard to persuade you that you have no recourse if they fail (though who actually gets stuck holding the bag in the event they have a major security breach is, I suspect, virgin territory, legally speaking).

While I’m quite critical of them here, I’m not actually writing this to beat up on them. I’m reasonably confident (and I say this without knowing or having spoken to anyone there, merely on the basis of having been in management at various companies myself) that they would strongly prefer not to be in the business of running a giant single point of failure for millions of people’s finances. It’s just that given the legacy security architecture of the web, they have no practical alternative, so they accept the risk as a cost of doing business, and then invest in a lot of complex, messy, and very expensive security infrastructure to try to minimize that risk.

To someone steeped in capability concepts, the idea that you would willingly give strangers on the web unlimited access to all your financial accounts seems like madness. I suspect it also seems like madness to lay people who haven’t drunk the conventional computer security kool aid, evidence of which is the lengths Mint has to go to in its marketing materials trying to persuade prospective customers that, no really, this is OK, trust us, please pay no attention to the elephant in the room.

The capability alternative (which, I stress, is not an option currently available to you), would be to obtain a separate credential – a capability! – from each of your financial institutions that you could pass along to a data management service like Mint. These credentials would grant read access to the relevant portions of your data, while providing no other authority. They would also be revocable, so that you could unilaterally withdraw this access later, say in the event of a security breach at the data manager, without disrupting your own normal access to your financial institutions. And there would be distinct credentials to give to each data manager that you use (say, you’re trying out a competitor to Mint) so that they could be independently controlled.

There are no particular technical obstacles to doing any of this. Alan Karp worked out much of the mechanics at HP Labs in a very informative paper called “Zebra Copy: A reference implementation of federated access management” that should be on everyone’s must read list.

Even with existing infrastructure and tools there are many available implementation options. Alan worked it out using SAML certificates, but you can do this just as well with OAuth2 bearer tokens, or even just special URLs. There are some new user interface things that would have to be done to make this easy and relatively transparent for users, but there’s been a fair bit of experimentation and prototyping done in this area that has pointed to a number of very satisfactory and practical UI options. The real problem is that the various providers and consumers of data and services would all have to agree that this new way of doing things is desirable, and then commit to switching over to it, and then standardize the protocols involved, and then actually change their systems, whereas the status quo doesn’t require any such coordination. In other words, we’re back to the installed base problem mentioned earlier.
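To make this concrete, here is a minimal sketch in Java of what a bank-side grant service along these lines might look like. The names (ReadGrantService, mintReadGrant, and so on) are entirely made up for illustration; this is not a description of anything any bank or data manager actually does. The idea is simply that each grant is an unguessable bearer token, scoped to read-only access to a single account, and revocable independently of every other grant.

    import java.security.SecureRandom;
    import java.util.Base64;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch of a bank-side grant service: each grant is an
    // unguessable bearer token bound to read-only access to one account,
    // and can be revoked independently of any other grant.
    public class ReadGrantService {
        private final SecureRandom random = new SecureRandom();
        private final Map<String, String> grants = new ConcurrentHashMap<>(); // token -> accountId

        // Mint a fresh capability token the customer can hand to a data manager.
        public String mintReadGrant(String accountId) {
            byte[] bits = new byte[32];
            random.nextBytes(bits);
            String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bits);
            grants.put(token, accountId);
            return token;  // could be presented as a bearer token or embedded in a special URL
        }

        // The data manager presents the token; it conveys read access and nothing else.
        public String fetchStatement(String token) {
            String accountId = grants.get(token);
            if (accountId == null) {
                throw new SecurityException("unknown or revoked grant");
            }
            return readOnlyStatementFor(accountId);
        }

        // The customer (or the bank) can withdraw the grant at any time.
        public void revoke(String token) {
            grants.remove(token);
        }

        private String readOnlyStatementFor(String accountId) {
            return "...statement data for " + accountId + "...";  // stand-in for the real data source
        }
    }

The essential properties are that the token is unguessable, conveys only the read authority it was minted for, and can be withdrawn without touching the customer’s own login credentials.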

However, web companies and other large enterprises are constantly developing and deploying hordes of new services that need to interoperate, so even if the capability approach is an overreach for the incumbents, it looks to me like a competitive opportunity for ambitious upstarts. Enterprises in particular currently lack a satisfactory service chaining solution, even though they’re in dire need of one.

In practice, the main defense against bad things happening in these sorts of systems is not the access control mechanisms at all, it’s the contractual obligations between the various parties. This can be adequate for big companies doing business with other big companies, but it’s not a sound basis for a robust service ecosystem. In particular, you’d like software developers to be able to build services by combining other services without the involvement of lawyers. And in any case, when something does go wrong, with or without lawyers it can be hard to determine who to hold at fault because confused deputy problems are rooted in losing track of who was trying to do what. In essence we have engineered everything with built in accountability laundering.

ACL proponents typically try to patch the inherent problems of identity-based access controls (that is, ones rooted in the “who are you?” question) by piling on more complicated mechanisms like role-based access control or attribute-based access control or policy-based access control (Google these if you want to know more; they’re not worth describing here). None of these schemes actually solves the problem, because they’re all at heart just variations of the same one broken thing: ambient authority. I think it’s time for somebody to try to get away from that.

Software engineering practices

At Electric Communities we set out to create technology for building a fully decentralized, fully user extensible virtual world. By “fully decentralized” we meant that different people and organizations could run different parts of the world on their own machines for specific purposes or with specific creative intentions of their own. By “fully user extensible” we meant that folks could add new objects to the world over time, and not just new objects but new kinds of objects.

Accomplishing this requires solving some rather hairy authority management problems. One example that we used as a touchstone was the notion of a fantasy role playing game environment adjacent to an online stock exchange. Obviously you don’t want someone taking their dwarf axe into the stock exchange and whacking peoples’ heads off, nor do you want a stock trader visiting the RPG during their lunch break to have their portfolio stolen by brigands. While interconnecting these two disparate applications doesn’t actually make a lot of sense, it does vividly capture the flavor of problems we were trying to solve. For example, if the dwarf axe is a user programmed object, how does it acquire the ability to whack people’s heads off in one place but not have that ability in another place?

Naturally, ocaps became our power tool of choice, and lots of interesting and innovative technology resulted (the E programming language is one notable example that actually made it to the outside world). However, all the necessary infrastructure was ridiculously ambitious, and consequently development was expensive and time consuming. Unfortunately, extensible virtual world technology was not actually something the market desperately craved, so being expensive and time consuming was a problem. As a result, the company was forced to pivot for survival, turning its attentions to other, related businesses and applications we hoped would be more viable.

I mention all of the above as preamble to what happened next. When we shifted to other lines of business, we did so with a portfolio of aggressive technologies and paranoid techniques forged in the crucible of this outrageous virtual world project. We went on using these tools mainly because they were familiar and readily at hand, rather than due to some profound motivating principle, though we did retain our embrace of the ocap mindset. However, when we did this, we made an unexpected discovery: code produced with these tools and techniques had greater odds of being correct on the first try compared to historical experience. A higher proportion of development time went to designing and coding, a smaller fraction of time to debugging, things came together much more quickly, and the resulting product tended to be considerably more robust and reliable. Hmmm, it seems we were onto something here. The key insight is that measures that prevent deliberate misbehavior tend to be good at preventing accidental misbehavior also. Since a bug is, almost by definition, a form of accidental misbehavior, the result was fewer bugs.

Shorn of all the exotic technology, however, the principles at work here are quite simple and very easy to apply to ordinary programming practice, though the consequences you experience may be far reaching.

As an example, I’ll explain how we apply these ideas to Java. There’s nothing magic or special about Java per se, other than it can be tamed – some languages cannot – and that we’ve had a lot of experience doing so. In Java we can reduce most of it to three simple rules:

  • Rule #1: All instance variables must be private
  • Rule #2: No mutable static state or statically accessible authority
  • Rule #3: No mutable state accessible across thread boundaries

That’s basically it, though Rule #2 does merit a little further explication. Rule #2 means that all static variables must be declared final and may only reference objects that are themselves transitively immutable. Moreover, constructors and static methods must not provide access to any mutable state or side-effects on the outside world (such as I/O), unless these are obtained via objects passed into them as parameters.

These rules simply ensure the qualities of reference unforgeability and encapsulation that I mentioned a while back. The avoidance of static state and static authority is because Java class names constitute a form of forgeable reference, since anyone can access a class just by typing its name. For example, anybody at all can try to read a file by creating an instance of java.io.FileInputStream, since this will open a file given a string. The only limitations on opening a file this way are imposed by the operating system’s ACL mechanism, the very thing we are trying to avoid relying on. On the other hand, a specific instance of java.io.InputStream is essentially a read capability, since the only authority it exposes is the ability to read the particular stream it is connected to, reachable only through its instance methods.
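Here is a small illustrative contrast (the method names are mine, not drawn from any real codebase). The first method reaches for ambient authority by naming FileInputStream directly, which is exactly what Rule #2 forbids; the second receives an InputStream as a parameter, so the only thing it can ever read is the stream its caller chose to hand it.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class LogScanner {
        // Ambient authority: any code can name the class and open any file the
        // process can reach. Rule #2 forbids acquiring authority this way.
        static int countLinesAmbient(String path) throws IOException {
            try (InputStream in = new FileInputStream(path)) {
                return countLines(in);
            }
        }

        // Capability style: the caller decides which stream this code may read,
        // and that stream is the only authority the method receives.
        static int countLines(InputStream in) throws IOException {
            int lines = 0;
            for (int c = in.read(); c != -1; c = in.read()) {
                if (c == '\n') {
                    lines++;
                }
            }
            return lines;
        }
    }

Notice that the second method cannot name a file at all; whatever limits the caller imposes on the stream are automatically limits on this code.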

These rules cover most of what you need. If you want to get really extreme about having a pure ocap language, in Java there are a few additional edge case things you’d like to be careful of. And, of course, it would also be nice if the rules could be enforced automatically by your development tools. If your thinking runs along these lines, I highly recommend checking out Adrian Mettler’s Joe-E, which defines a pure ocap subset of Java much more rigorously, and provides an Eclipse plugin that supports it. However, simply following these three rules in your ordinary coding will give you about 90% of the benefits if what you care about is improving your code rather than security per se.

Applying these rules in practice will change your code in profound ways. In particular, many of the standard Java class libraries don’t follow them – for example lots of I/O authority is accessed via constructors and static methods. In practice, what you do is quarantine all the unavoidable violations of Rule #2 into carefully considered factory classes that you use during your program’s startup phase. This can feel awkward at first, but it’s an experience rather like using a strongly typed programming language for the first time: in the beginning you keep wondering why you’re blocked from doing obvious things you want to do, and then it slowly dawns on you that actually you’ve been blocked from doing things that tend to get you into trouble. Plus, the discipline forces you to think through things like your I/O architecture, and the result is generally improved structure and greater robustness.
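As a sketch of what such quarantining might look like (the class name ConfigAuthority and its methods are hypothetical, not a prescription), a single factory constructed during the startup phase can be the only place that touches the authority-bearing constructors, with everything downstream receiving plain InputStream capabilities as parameters:

    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.InputStream;

    // Hypothetical startup-phase factory: the one place allowed to touch the
    // authority-laden constructors in the standard library. Everything built
    // after startup receives streams from here as parameters instead of
    // conjuring them out of class names.
    public class ConfigAuthority {
        private final String configDir;

        public ConfigAuthority(String configDir) {
            this.configDir = configDir;
        }

        // Hands out a read capability for one named file under the config directory.
        public InputStream openConfig(String name) throws FileNotFoundException {
            return new FileInputStream(configDir + "/" + name);  // quarantined Rule #2 violation
        }
    }

Code built after startup never names FileInputStream itself; if it needs to read a configuration file, it has to have been handed either an already-open stream or an instance of this factory.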

(Dealing with the standard Java class libraries is a bit of an open issue. The approach taken by Joe-E and its brethren has been to use the standard libraries pruned of the dangerous stuff, a process we call “taming”. But while this yields safety, it’s less than ideal from an ergonomics perspective. A project to produce a good set of capability-oriented wrappers for the functionality in the core Java library classes would probably be a valuable contribution to the community, if anyone out there is so inclined.)

Something like the three rules for Java can often be devised for other languages as well, though the feasibility of this does vary quite a bit depending on how disciplined the language in question is. For example, people have done this for Scala and OCaml, and it should be quite straightforward for C#, but probably hopeless for PHP or Fortran. Whether C++ is redeemable in this sense is an open question; it seems plausible to me, although the requisite discipline somewhat cuts against the grain of how people use C++. It’s definitely possible for JavaScript, as a number of features in recent versions of the language standard were put there expressly to enable this kind of thing. It’s probably also worth pointing out that there’s a vibrant community of open source developers creating new languages that apply these ideas. In particular, you should check out Monte, which takes Python as its jumping off point, and Pony, which is really its own thing but very promising.

There’s a fairly soft boundary here between practices that simply improve the robustness and reliability of your code if you follow them, and things that actively block various species of bad outcomes from happening. Obviously, the stronger the discipline enforced by the tools is, the stronger the assurances you’ll have about the resulting product. Once again, the analogy to data types comes to mind, where there are best practices that are basically just conventions to be followed, and then there are things enforced, to a greater or lesser degree, by the programming language itself. From my own perspective, the good news is that in the short term you can start applying these practices in the places where it’s practical to do so and get immediate benefit, without having to change everything else. In the long term, I expect language, library, and framework developers to deliver us increasingly strong tools that will enforce these kinds of rules natively.

Conclusion

At its heart, the capability paradigm is about organizing access control around specific acts of authorization rather than around identity. Identity is so fundamental to how we interact with each other as human beings, and how we have historically interacted with our institutions, that it is easy to automatically assume it should be central to organizing our interactions with our tools as well. But identity is not always the best place to start, because it often fails to tell us what we really need to know to make an access decision (plus, it often says far too much about other things, but that’s an entirely separate discussion).

Organizing access control around the question “who are you?” is incoherent, because the answer is fundamentally fuzzy. The driving intuition is that the human who clicked a button or typed a command is the person who wanted whatever it was to happen. But this is not obviously true, and in some important cases it’s not true at all. Consider, for example, interacting with a website using your browser. Who is the intentional agent in this scenario? Well, obviously there’s you. But also there are the authors of every piece of software that sits between you and the site. This includes your operating system, the web browser itself, the web server, its operating system, and any other major subsystems involved on your end or theirs, plus the thousands of commercial and open source libraries that have been incorporated into these systems. And possibly other stuff running on your (or their) computers at the time. Plus intermediaries like your household or corporate wireless access point, not to mention endless proxies, routers, switches, and whatnot in the communications path from here to there. And since you’re on a web page, there’s whatever scripting is on the page itself, which includes not only the main content provided by the site operators but any of the other third party stuff one typically finds on the web, such as ads, plus another unpredictably large bundle of libraries and frameworks that were used to cobble the whole site together. Is it really correct to say that any action taken by this vast pile of software was taken by you? Even though the software has literally tens of thousands of authors with a breathtakingly wide scope of interests and objectives? Do you really want to grant all those people the power to act as you? I’m fairly sure you don’t, but that’s pretty much what you’re actually doing, quite possibly thousands of times per day. The question that the capability crowd keeps asking is, “why?”

Several years ago, the computer security guru Marcus Ranum wrote: “After all, if the conventional wisdom was working, the rate of systems being compromised would be going down, wouldn’t it?” I have no idea where he stands on capabilities, nor if he’s even aware of them, but this assertion still seems on point to me.

I’m on record comparing the current state of computer security to the state of medicine at the dawn of the germ theory of disease. I’d like to think of capabilities as computer security’s germ theory. The analogy is imperfect, of course, since germ theory talks about causality whereas here we’re talking about the right sort of building blocks to use. But I keep being drawn back to the parallel largely because of the ugly and painful history of germ theory’s slow acceptance. On countless occasions I’ve presented capability ideas to folks who I think ought to know about them – security people, engineers, bosses. The typical response is not argument, but indifference. The most common pushback, when it happens, is some variation of “you may well be right, but…”, usually followed by some expression of helplessness or passive acceptance of the status quo. I’ve had people enthusiastically agree with everything I’ve said, then go on to behave as if these ideas had never ever entered their brains. People have trouble absorbing ideas that they don’t already have at least some tentative place for in their mental model of the world; this is just how human minds work. My hope is that some of the stuff I’ve written here will have given these ideas a toehold in your head.

Acknowledgements

This essay benefitted from a lot of helpful feedback from various members of the Capabilities Mafia, the Friam group, and the cap-talk mailing list, notably David Bruant, Raoul Duke, Bill Frantz, Norm Hardy, Carl Hewitt, Chris Hibbert, Baldur Jóhannsson, Alan Karp, Kris Kowal, William Leslie, Mark Miller, David Nicol, Kevin Reid, and Dale Schumacher. My thanks to all of them, whose collective input improved things considerably, though of course any remaining errors and infelicities are mine.

October 17, 2008

The Tripartite Identity Pattern

One of the most misunderstood patterns in social media design is that of user identity management. Product designers often confuse the many different roles required by various user identifiers. This confusion is compounded by using older online services, such as Yahoo!, eBay, and America Online, as canonical references. These services established their identity models based on engineering-centric requirements long before we had a more subtle understanding of user requirements for social media. By conjoining the requirements of engineering (establishing sessions, retrieving database records, etc.) with the user requirements of recognizability and self-expression, many older identity models actually discourage user participation. For example: Yahoo! found that the fear of spammers farming their e-mail address was the number one reason users gave for abandoning the creation of user-created content, such as restaurant reviews and message board postings. This ultimately led to a very expensive and radical re-engineering of the Yahoo! identity model, which has been underway since 2006.

Consistently I’ve found that a tripartite identity model best fits most online services and should be forward compatible with current identity sharing methods and future proposals.

The three components of user identity are: the account identifier, the login identifier, and the public identifier.

[Diagram: the three components of user identity]

Account Identifier (DB Key)

From an engineering point of view, there is always one database key – one way to access a user’s record, one way to refer to them in cookies and potentially in URLs. In a real sense the account identifier is the closest thing the company has to a user. It is required to be unique and permanent. Typically this is represented by a very large random number and is not under the user’s control in any way. In fact, from the user’s point of view this identifier should be invisible or at the very least inert; there should be no inherent public capabilities associated with this identifier. For example it should not be an e-mail address, accepted as a login name, displayed as a public name, or an instant messenger address.

Login Identifier(s) (Session Authentication)

Login identifiers are necessary to create valid sessions associated with an account identifier. They are the user’s method of granting access to his privileged information on the service. Historically, these are represented by unique and validated name/password pairs. Note that the service need not generate its own unique namespace for login identifiers but may adopt identifiers from other providers. For example, many services accept external e-mail addresses as login identifiers, usually after verifying that the user is in control of that address. Increasingly, more sophisticated capability-based identities are accepted from services such as OpenID, OAuth, and Facebook Connect; these provide login credentials without constantly asking a user for their name and password.

By separating the login identifier from the account identifier, it is much easier to allow the user to customize their login as the situation changes. Since the account identifier need never change, data migration issues are mitigated. Likewise, separating the login identifier from public identifiers protects the user from those who would crack their accounts. Lastly, a service could provide the opportunity to attach multiple different login identifiers to a single account — thus allowing the service to aggregate information gathered from multiple identity suppliers.

Public Identifier(s) (Social Identity)

Unlike the service-required account and login identifiers, the public identifier represents how the user wishes to be perceived by other users on the service. Think of it like clothing or the familiar name people know you by. By definition, it does not possess the technical requirement to be 100% unique. There are many John Smiths in the world, thousands of them on Amazon.com, hundreds of them write reviews, and everything seems to work out fine.

Online, a user’s public identifier is usually a compound object: a photo, a nickname, and perhaps age, gender, and location. It provides sufficient information for any viewer to quickly interpret personal context. Public identifiers are usually linked to a detailed user profile, where further identity differentiation is available: ‘Is this the same John Smith from New York that also wrote the review of The Great Gatsby that I like so much?’ ‘Is this the Mary Jones I went to college with?’

A sufficiently diverse service, such as Yahoo!, may wish to offer multiple public identifiers when a specific context requires it. For example, when playing wild-west poker a user may wish to present the public identity of a rough-and-tumble outlaw or a saloon girl, without having that imagery associated with their movie reviews.
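For concreteness, here is one way the three kinds of identifiers might be represented in code. This is a hedged sketch in Java with invented names (UserIdentity, PublicId), not a prescription for any particular schema: the account identifier is permanent and inert, while login identifiers and public identifiers are both plural and replaceable.

    import java.security.SecureRandom;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch of the tripartite model: one permanent, inert account
    // identifier, any number of login identifiers, and any number of public
    // identifiers, none of which needs to be unique across the service.
    public class UserIdentity {
        private static final SecureRandom RANDOM = new SecureRandom();

        private final long accountId = RANDOM.nextLong();        // DB key: permanent, never shown to anyone
        private final List<String> loginIds = new ArrayList<>();   // e.g. verified email, OpenID URL
        private final List<PublicId> publicIds = new ArrayList<>();

        public static class PublicId {
            String nickname;   // not required to be unique
            String photoUrl;
            String context;    // e.g. "movie reviews", "poker room"
        }

        public void addLoginId(String verifiedIdentifier) {
            loginIds.add(verifiedIdentifier);
        }

        public void addPublicId(PublicId id) {
            publicIds.add(id);
        }
    }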

Update 11/12/2008: This model was presented yesterday at the Internet Identity Workshop as an answer to much of the confusion surrounding making the distributed identity experience easier for users. The key insight this model provides is that no publicly shared identifier is required (or even desirable) for session authentication; in fact, requiring the user to enter one on an RP website is an unnecessary security risk.

Three main critiques of the model were raised that should be addressed in a wider forum:

  1. There was some confusion of the scope of the model – Are the Account IDs global?

    I hand modified the diagram to add an encompassing circle to show the context is local – a single context/site/RP. In a few days I’ll modify the image in this post to reflect the change.

  2. The term “Public Identity” is already in use by iCards to mean something incompatible with this model.

    I am more than open to an alternative term that captures this concept. Leave comments or contact me at randy dot farmer at pobox dot com.

  3. Publicly shareable capability-based identifiers are not included in this model. These include email addresses, easy-to-read URLs, cell phone numbers, etc.

    There was much controversy on this point. To me, these capability-based identifiers are outside the scope of the model, and generating them and the policies for sharing them are within the scope of the context/site/RP. Perhaps an interested party might adopt the tripartite pattern as a sub-pattern of a bigger sea of identifiers. My goal was not to be all encompassing, but to demonstrate that only three identifiers are required for sites that have user generated content, and that no public capability-bound ID exchange is required. RPs should only see the Public ID and some unique key for the session that grants permission-bound access to the user’s Account.

May 19, 2007

Second Life History: The Jessie Massacre

Or: The first deployment of user-created WMDs in a 3D virtual world
As told by the perpetrator, Oracle Omega

My first impression of Second Life was formed when it was still under development, when Phillip came to visit Chip and me at our third little startup: State Software. Technically, it was pretty amazing. They’d finally created an extensible, programmable world with physics built right in. On the social side the model was that everyone would live and build on one of a few large continents. We cautioned that this would be fraught with peril. Even before the first beta testers arrived, they’d been warned that their biggest problems were going to be property encroachment, bad neighbors, and script-griefing. Alpha World had demonstrated that many of the neighborhoods would be something between garbage dumps, billboard farms, and smutty slums next to some amazingly creative and wonderful stuff. Much of the predicted chaos happened during beta, but the full force wasn’t felt until broader release, especially when anyone could join instantly and for free.

I happened to be unemployed during late alpha and early beta, and had been so intrigued by Second Life that I decided to run some experiments, pushing the limits of how I thought future users would abuse the system, specifically property rights and scripting capabilities. As I’ve written elsewhere, regular beta testers normally don’t push the limits as much as we’d like them to because they fear losing their status as testers by being ejected.

Having co-created several of the progenitors of this type of system, I knew where to look for cracks. I had no fear of being ejected for taking the servers down. On the contrary, it was an explicit goal. Better now, during testing, than later with paying customers.

Probably the most legendary of my experiments was the Invisible Teleporting Grenade of Death. Nothing special compared to the offensive and defensive objects in Second Life today, but it caused quite a stir during beta because it was the first known deployment of a user-created Weapon of Mass Destruction in a 3D virtual world.

Note: This wasn’t the first programmable world I’d done massive damage to: Years earlier, after a certain Wizard on LambdaMOO decided to show off and summon all the food in his world to our room for a food-fight, I was inspired to write a script that would summon all instances of any class into the room with me. I tried it on Class:Paper, and it worked perfectly, first try. It was at that moment I realized I had no way to put the paper back where it belonged! I quickly wrote a script that stuffed the paper into the pockets of their owners and reported this flaw to another Wizard. She was not happy.

During the Second Life beta test, its initial culture was starting to emerge. In my experience, worlds like this one attract early adopters of a somewhat democratic-libertarian bent – “Let’s just all get along” and “Leave Real Life rules behind” often reflect the mentality of the most vocal users. But something unusual happened this time – another virtual world, called World War II Online, was failing and its 1940s role-playing refugees migrated to Second Life, en masse. Since Second Life provided for personal combat (hit points) and death (being teleported home), and you could build just about anything, including weapons, it seemed like an ideal fit. Quickly they’d built up WWII cultural and military items, including Nazi uniforms, gear, and propaganda, including flags and posters with Swastikas and the like. Eventually they took over the only remaining full-combat enabled simulator [patch of land], named Jessie, and made it their home.


A WWIIOL emplacement in Jessie

This ticked off many members of the existing community, who detested all of the pro-Nazi imagery. The WWII online-ers said they just wanted to be left alone to play their war games. Both sides were sniping at each other, both literally and with virtual weapons. Eventually there was a huge wall constructed separating Jessie from its neighbors. It didn’t help.

I’d built and run too many worlds and had seen this kind of thing end badly so many times that I just stayed out of it. Honestly, this was the kind of thing I’d warned about from the beginning and I just wanted to see what would happen.

Until the day I’d completed my latest experiment.

I’d been working with the object spawning directives in the scripting language. I’d also discovered that I could make an object very small (less than an inch in diameter), and very transparent (virtually invisible). It struck me that I could make a weapon of mass destruction and do it very cheaply. It worked like this: a tiny invisible floating grenade that would explode into dozens of invisible tiny fragments flying outward spherically at maximum velocity and doing maximum damage, and then immediately teleport itself to another random location in the simulator. It would be undetectable, unstoppable, and lethal: the perfect killing machine. It could only be stopped by me shouting the keyword: STOP!

Small-scale tests on my land were successful. It fired up to 100 rounds per minute. But where could I test this at full scale? There was only one answer – Jessie – the only Sim with an active population and the fatality flag on. As a special guest beta tester I had 30 minutes early access to the servers, so I dropped six of these little gems in Jessie just before opening time; they wouldn’t have a chance to catch me. Back then, each object spawn cost $L10, so my balance indicator started fluctuating wildly as the invisible fragments spawned, flew, and eventually hit something or someone.

I flew to the simulators with the most users and tried to chat naturally, but it was difficult, knowing the chaos that was going on in Jessie when people arrived: Log in, poke around awhile then seem to randomly die, get teleported home, which is also in Jessie, wait a short moment, repeat!

After about a half hour, people around me were starting to say “Wow! Someone is slaughtering those WWII guys in Jessie!” “That place is in a panic!” “That guy’s my hero!” “Lets go see!” The grenades were working. Besides making my point about the scripting language, I’d created one of the first legendary events of the world. That was exciting.

But, only then did I realize I’d chosen sides in a fight that I didn’t really care about. I wasn’t really sure what to do at that moment, when I got an Instant Message from one of the Lindens: “Did you release an auto cannon in Jessie?” I had to be a smartass and answer: “No. I released six. I’ll go and deactivate them now.”

I flew to the edge of Jessie and shouted the keyword. My balance meter stopped jumping around and stabilized; the attack was over. It had been well over an hour since opening, and I was certain that I had the highest kill rate in Second Life history. But now I had a problem. I had no way to extract them (and I wasn’t about to enter Jessie at that moment anyway – I was certainly Kill On Sight at that point, assuming they knew the name of the bomber).

It turned out that my grenades were too small and invisible. Though they were now inert, I couldn’t find them to remove them. In effect, they were a dormant virus in Jessie. So, I filed a bug report: “Unable to select small, invisible objects.” In the next day or two there was a patch to the client to “show transparency” so that it would be possible for me to see them, select them, and delete them – which I promptly did. But the legend remains.

In the end, very little was done to mitigate the design of WMDs like mine, and I was told that to “fix” the problem would put serious limits on the creativity of future users. So be it. But, given the history of the service since then, with so many sim-failures based on malicious and accidental infinite spawning scripts, I’m not so sure that ignoring this problem was the best choice. I hope it is not too late.

March 1, 2007

The Untold History of Toontown’s SpeedChat (or BlockChat™ from Disney finally arrives)

This story has been recorded as part of the Social Media Clarity podcast: http://www.buzzsprout.com/16050/138068-disney-s-hercworld-toontown-and-blockchat-tm-s01e08.mp3

Disney's ToonTown SpeedChat

In 1992, I co-founded a company with Chip Morningstar and Douglas Crockford named Electric Communities. We initially did a lot of consulting for various media companies that were looking to leverage the emerging online gaming industry. One of those companies was Disney.

Disney had formed a group to look into taking the brand online, including a full-fledged multiplayer experience, as early as 1996, when they were considering a product called HercWorld, which was to leverage the upcoming movie franchise Hercules. Having built Lucasfilm’s Habitat and WorldsAway, we were clearly amongst a handful of teams that had successfully constructed social virtual worlds that’d made any real money, and Crock had media connections from his days at Paramount, so they brought us in to discuss what it would take to build a kid-safe virtual world experience.

They had hired their own expert to lead the project, a former product manager for Knowledge Adventure – a kid’s software company that’d done some 3D work as well as their own online project KA-Worlds, which was meant to link sick children in hospitals together using computers and avatars.

Disney makes no bones about how tightly they want to control and protect their brand, and rightly so. Disney means "Safe For Kids". There could be no swearing, no sex, no innuendo, and nothing that would allow one child (or adult pretending to be a child) to upset another.

I found myself unable to reconcile the idea of a virtual world, where kids would run around, play with objects, and chat with each other without someone saying or doing something that might upset another. Even in 1996, we knew that text-filters are no good at solving this kind of problem, so I asked for a clarification: "I’m confused. What standard should we use to decide if a message would be a problem for Disney?"

The response was one I will never forget: "Disney’s standard is quite clear:

No kid will be harassed, even if they don’t know they are being harassed."

“So much for no-harm, no-foul,” Chip grumbled, quietly. This requirement led me to some deep thinking over the coming weeks and months about a moderation design I called “The Disney Panopticon”, but that’s a post for another day…

"OK. That means Chat Is Out of HercWorld, there is absolutely no way to meet your standard without exorbitantly high moderation costs," we replied.

One of their guys piped up: "Couldn’t we do some kind of sentence constructor, with a limited vocabulary of safe words?"

Before we could give it any serious thought, their own project manager interrupted, "That won’t work. We tried it for KA-Worlds."

"We spent several weeks building a UI that used pop-downs to construct sentences, and only had completely harmless words – the standard parts of grammar and safe nouns like cars, animals, and objects in the world."

“We thought it was the perfect solution, until we set our first 14-year-old boy down in front of it. Within minutes he’d created the following sentence:

I want to stick my long-necked Giraffe up your fluffy white bunny.

KA-Worlds abandoned that approach. Electric Communities is right, chat is out."

That was pretty much settled, but it felt like we had collectively gutted the project. After all, if the kids can’t chat, how could they coordinate? It’d end up being more like a world where you could see other players playing but you couldn’t really work with them much. [Side note: Sadly, a lot of MMORPG play is like this anyway, see Playing Alone Together.]

As I started daydreaming about how to get chat back into this project, we moved on to what activities the kids might do in the now-chat-free HercWorld. It was standard fare: Collect stuff, ride stuff, shoot at stuff, build stuff… Oops, what was that last thing again?

“…kids can push around Roman columns and blocks to solve puzzles, make custom shapes, and buildings,” one of the designers said.

I couldn’t resist, "Umm. Doesn’t that violate the Disney standard? In this chat-free world, people will push the stones around until they spell Hi! or F-U-C-K or their phone number or whatever. You’ve just invented Block-Chat™. If you can put down objects, you’ve got chat. We learned this in Habitat and WorldsAway, where people would turn 100 Afro-Heads into a waterbed." We all laughed, but it was that kind of awkward laugh that you know means that we’re all probably just wasting our time.

HercWorld never happened.

Once again, into the breach

Electric Communities moved on, renamed itself Communities.com (which has nothing in common with the current company/site using that name and URL), and did some wonderful design work on a giant multimedia 3D kid’s world for Cartoon Network, which ended up being much too ambitious to fund, but I mention it because the project was headed by Brian Bowman. Brian eventually left Atlanta for Disney, where he was in charge of the online experience for Zoog Disney, a pre-teen programming block. Brian remembered his work with us and asked us to help build a world for the Zoog audience. Nothing so extravagant this time, just something simple, like The Palace (which by then had been acquired by Communities.com), a no-download, in-browser, 2D graphical chat with some programmed object capabilities.

"The Disney Standard" (now a legend amongst our employees) still held. No harassment, detectable or not, and no heavy moderation overhead.

Brian had an idea though: Fully pre-constructed sentences – dozens of them, easy to access. Specialize them for the activities available in the world. Vaz Douglas, our project manager working with Zoog, liked to call this feature "Chatless Chat." So, we built and launched it for them. Disney was still very tentative about the genre, so they only ran it for about six months; I doubt it was ever very popular.

Third time’s a charm

But the concept resurfaced at Disney a few years later [2002] in the form of SpeedChat in ToonTown. It was refined – you select a subject and then a sentence from a submenu, each automatically customized to the correct context. Selecting "I need to find …" would magically insert the names of the items you have quests for. For all walk-up users, all interactions would be via SpeedChat.

They added a method to allow direct chat between users that involves the exchange of secret codes that are generated for each user (with parental permission). The idea is that kids would print them out and give them to each other on the playground. This was a great way for Disney to end-run the standard – since SpeedChat was an effective method of preventing the exchange of these codes, and theoretically the codes had to be given "in-person", making the recipient not-a-stranger. Sure, some folks post them on message boards, but presumably those are folks who 1) are adults, or 2) know each other, right? In any case, as long as no one could pass secret codes within Toontown itself, Disney feels safe.

The Ghost of BlockChat™ past

Soon after ToonTown opened its doors, they added Toon Estates – a feature that gives you a house with furniture, initially just a bed, gumball machine, chair, and armoire. Then they added the ability to buy more furniture of all shapes and sizes from catalogs, and then you could invite people to visit your house to see how you have arranged all your cool stuff.

Sure enough, chatters figured out a few simple protocols to pass their secret codes; several variants follow this general form:

User A:"Please be my friend."
User A:"Come to my house?"
User B:"Okay."
A:[Move the picture frames on your wall, or move your furniture on the floor to make the number 4.]
A:"Okay"
B:[Writes down 4 on a piece of paper and says] "Okay."
A:[Move objects to make the next letter/number in the code] "Okay"
B:[Writes…] "Okay"
A:[Remove objects to represent a "space" in the code] "Okay"
[Repeat steps as needed, until…]
A:"Okay"
B:[Enters secret code into Toontown software.]
B:"There, that worked. Hi! I’m Jim 15/M/CA, what’s your A/S/L?"

Passing bits in my ToonTown Estate

It seems that many of The Lessons of Lucasfilm’s Habitat still ring true.
I’ll consider this as The SpeedChat Corollary:

By hook, or by crook, customers will always find a way to connect with each other.

P.S.: Brian tells me that Cartoon Network is actually resuming the project, more than ten years later; "…now that is being ahead of your time."

[Thanks to the legendary Robin Hood of Neopets for telling me about this Secret Code exchange protocol.]

[Yes, the BlockChat™ brand is a joke.]

July 31, 2004

The Birth of Habitat (with Many Digressions on the Early History of Lucasfilm Games and All That)

A posting by Andrew Kirmse in Orkut’s Lucas Valley High community (a group for current and former LucasArts and Lucasfilm Games people) expressed interest in the history of how Habitat came to be. I started to compose a response and then realized that this would probably be interesting to a wider audience than our little ghetto on Orkut, so I decided to publish it here instead. It also ended up being rather longer than I had counted on. Some of this may be a little rambling, but history is like that.

In 1985, when Habitat got started, Lucasfilm Games was a very different organization from the LucasArts Entertainment Company that people see today.

In the early 1980s, Lucasfilm had invested heavily in a number of different high-tech R&D projects intended to push the envelope of motion picture production technology. All this stuff was organized into a unit called the Lucasfilm Computer Division. There were groups working on computer graphics, digital audio processing, digital film editing, and a number of other things. Among the other things was a small computer games group that had been started as a joint venture with Atari, who had given Lucasfilm some money in hopes of capturing some kind of benefit by basking in the reflected glory of Star Wars (unfortunately, the proxy glow of celebrity did nothing to help Atari with its most fundamental problem, which was that it was quite possibly the worst managed company in the history of the universe). In order to understand where the games group went, I guess I first have to explain its somewhat odd position in the Lucasfilm organization.

Of the various elements of the Computer Division, by far the most prestigious and well-known was the Graphics Group, which had assembled a bunch of the smartest, most talented, most famous names in the computer graphics field into a world-class research team, working under the direction of Ed Catmull. Every year they would cut a wide swath through SIGGRAPH, stunning everyone with amazing images and brilliant technical papers. (This figures into how I actually came to work for Lucasfilm in the first place, but that’s a long and complicated story of its own for another time).

In 1984 they set out to make a really big splash. Various major bits of magic they had invented were written up for publication (I guess in those days you still got points for showing off how you did things rather than keeping them secret). Along with this they produced a short animated film, “The Adventures of Andre and Wally B”, for the SIGGRAPH film show. “Andre & Wally” made full use of the anti-aliased, motion-blurred, filtered and texture mapped rendering techniques they had developed. It was quite a tour de force relative to the kinds of things that could be put on film by computers in those days. They brought in an up-and-coming animator from Disney named John Lasseter to give the thing some life and filmic sensibility, to distinguish it from the run-of-the-mill SIGGRAPH film show fare, which generally consisted of (a) rendering demos, produced by and for engineers (and composed with exactly the aesthetic qualities that the obvious stereotypes would lead you to expect), (b) TV commercial promo reels, and (c) excruciating, unwatchable, hideous art crap (“Please God, not another Jane Veeder clip! Just kill me now.”).

“Andre & Wally” was, truth be told, perhaps a wee bit more ambitious than their production capabilities at the time were really ready for. It may well be that they thought it would do them good to try to stretch, but in any event it ended up absorbing a lot more time, effort, and money than anybody had really counted on, as they pushed to get all the pixels rendered by SIGGRAPH. In the end it had ballooned into a very big deal indeed. Rumors inside Lucasfilm put the production cost at something in the neighborhood of $500,000, which is a pretty tidy sum for a two minute demo film aimed at the computer graphics in-crowd. They invited George Lucas to come to the premiere at the SIGGRAPH 84 film show in Minneapolis — and he came. At the party afterwards, he was reserved and polite, but apparently he was not very happy with how things had developed.

At a time when Lucasfilm had no major film productions in the pipeline, the Computer Division was a major cash drain, even by the standards of George’s notoriously money sucking organizational empire. He’d invested in all this stuff on the promise that the technology could slash the wildly escalating costs of making the kinds of movies he wanted to make, and here he’d spent a fortune with no usable tools ready for prime time and no end in sight. He’d hired these guys to do technology development for him and they’d gone and spent half a million dollars of his money and made a movie, and not that great a movie either (though it was, admittedly, technically very advanced). He wasn’t paying these guys so they could make movies, he was paying them so he could make movies.

The fallout of this was that George decided he didn’t want to continue funding the world’s most glamorous technology research if there wasn’t going to be some fairly immediate, fairly concrete payoff in it for him. On the other hand, Lucasfilm had invested a lot of money in this stuff and didn’t want to just write it off either. So they made a decision to spin off these various projects into separate companies, in hopes that outside investors could be attracted and the technology might be better commercialized. In the end, the Computer Division was divided, like Gaul, into three pieces.

The Graphics Group became a company that was eventually called Pixar (“Pixar” was originally the name of a piece of hardware they had been developing, but the name ended up getting tacked onto the company because nobody could agree on another name they liked better). Pixar got sold off to Steve Jobs, who had the technology savvy, steely nerves, deep pockets and general megalomania needed to see it through to maturity. They eventually figured out that their biggest assets included not just their technology but John Lasseter, and the rest of that story you probably know. George (or perhaps his accountant) no doubt wishes now that he’d hung onto a bigger piece.

The digital audio and digital film editing projects got spun off into a company called DroidWorks. They attempted to market the film editing system — the EditDroid — and the Lucasfilm Audio Signal Processor (a huge DSP machine) — the SoundDroid. The film and television industry, contrary to what they’d like you to believe, is very conservative technologically. Selling big, expensive pieces of edge-of-the-art technology into that market was just too hard. DroidWorks sold a few EditDroids, but not enough to get any real traction. The company folded after just a few years (George’s reputed comment on this was, “Helluva waste of a good name.”) Ironically, 20 years later, Moore’s Law has brought the cost of this technology down to the point where it is now a big business. The kinds of video editing systems now sold by companies like Avid are pretty close to the EditDroid in both form and function. The vision that George originally bought into has largely been realized by the market; he was right, just too early.

The third piece was the Games Group. It was different from the rest of the various Computer Division elements in a couple of important ways. First of all, it was small — at this point, just eight people, so it was relatively inexpensive and didn’t demand a lot of management’s attention. And because Lucasfilm had gotten some money from Atari, it didn’t have a history as a major cost center. Second, whereas the other groups were principally concerned with creating technology that would be used to create entertainment products, the Games Group’s mission was to create entertainment products directly. This made it seem a lot closer to Lucasfilm’s core business. Also, George had an intuition that this interactive stuff was probably going to be an important part of the business somewhere off in the future, so it was probably a good idea to cultivate a native understanding of the medium in-house, in preparation for the distant day when the technology was mature. At one point George labeled us “The Lost Patrol” — Nobody knows for sure where they are or what they’re doing, but they’re somewhere out there; every now and again somebody sights their flag on the horizon. In some far off day they’ll return to us, bringing news of distant lands and wonders beyond imagining. So the Games Group was not spun off along with the rest, but instead became its own business unit within the company: the Lucasfilm Games Division.

Even though Games managed to stay within the fold, the institutional trauma of The Great Andre & Wally B Budget Blowout definitely had an impact on the group’s mandate. We had already been in the midst of shifting from a researchy mode into a more product oriented one, as the first games developed on Atari’s nickel came ready for market. This transformation was now to be accelerated and we were expected to become truly self-sufficient. We still had a bit of a research bent — we were expected not to be a drain, but we weren’t necessarily expected to make a big profit at first. Our mission statement was dictated directly by George: “Stay small, be the best, and don’t lose any money.”

Because of the phenomenal success of the Star Wars and Indiana Jones franchises, Lucasfilm existed in a weird kind of bubble that made it very different from other companies, especially companies in the computer games industry. Most of this weirdness had to do with money, or expectations about money. The basic attitude can be summed up as, “we are Lucasfilm, people will pay us.” The fundamental business concept of making an investment in expectation of a future return was not part of the general mindset. The expectation rather was that people would pay us to do things, and then we would take a share of the profits of whatever resulted. In other words, we wanted a cut of the proceeds but were not interested in sharing any of the risk. This attitude is a luxury most business people would love to have, but quite correctly recognize for the fantasy that it is. Except in Lucasfilm’s case it wasn’t a complete fantasy. Companies would, in fact, line up to make deals with us in which they took all the risk, either because they wanted what we had so badly that they were willing to pay an extraordinary premium to get it, or just because the cachet of being associated with us was so entrancing. (Lucasfilm insider joke: Q. How many Lucasfilm employees does it take to screw in a light bulb? A. Just one. He holds the light bulb and the world revolves around him.) Lucasfilm was, in fact, in the stone soup business, selling a very expensive, attractively branded stone. Thus, the predominant mindset in this extraordinarily successful and prosperous company was paradoxically one of extreme risk aversion.

This frame of mind colored everything. It led to a couple of fundamental constraints that rather tightly restricted what we could do. The first rule was that we were not to do anything that required spending the company’s own money. We could do pretty much whatever we wanted, but we had to get somebody else to pay for it, arguments about ROI notwithstanding. The second constraint was that although we had a fairly high level of creative freedom, we were absolutely forbidden from doing anything that made use of the company’s film properties, especially Star Wars. That was viewed as just like spending money, since these properties were, in effect, money in the bank. If somebody else wanted to make a Star Wars game, they had to pay a hefty license fee, and so we made money no matter how well or how poorly their game did, whereas if we made such a game ourselves we would be taking all the risk if it bombed (and never mind that we’d also get 100% of the upside if the game was a hit).

The practice the Games Division evolved in this environment was this: anyone in the group who came up with a serious design concept or project idea would write up a two or three page design proposal document, which we would kick around amongst ourselves for critical discussion but which would ultimately get placed in Steve Arnold’s ideas file (Steve was the head of the Games Division at the time). From time to time (a few times a month, I’d estimate), companies would come shopping, looking to do business with us. Sometimes they had specific projects in mind, but more often they just had a vague idea or two and were perhaps drawn here mostly because they thought it would be cool to visit Lucasfilm — tour the ranch, see some movie production facilities, maybe get a glimpse of George or some other famous person over lunch (the role of glamour in this process can’t be overstated). In the course of shmoozing with these folks, Steve would make a judgement about whether they were serious prospects, and if so he’d grab a couple of ideas from the idea file that seemed related to what they were interested in, grab the authors of those ideas, and have a meeting, where we’d make a pitch (often this process would extend over several visits and take weeks or months). Usually this ended up being just talk, but every now and then somebody would bite and we’d land a deal. During the time I was there we did projects not only with traditional games companies like Atari, Epyx, Activision, Electronic Arts, and Nintendo, but also RCA, Apple Computer, Phillips, IBM, the National Geographic Society, Fujitsu, Commodore, and others.

So one day, I think it was around October, 1984, I was chatting with my office mate, Noah Falstein, over lunch. We got to musing about the new generation of more powerful personal computers then appearing that were based on the Motorola 68000 16-bit processor, such as the Apple Macintosh and the soon-to-ship Amiga (not to mention such never-to-be-heard-from-again platforms as the Philips CD-i box and the Mindset), as well as the increasing number of computer owners with modems. We had recently finished a many-weeks-long in-house experimental playthrough of Peter Langston’s ground-breaking multi-player Empire game (Peter was actually the founder of Lucasfilm Games, though he had left the company earlier that summer). One of the problems with Empire was that every game seemed to follow the same evolution: exploration, resource buildup, consolidation, nuclear annihilation. Even though you had the potential for rich interaction that came from having the players be real human beings (much more interesting than any game AI thus far produced), the same pattern always unfolded because there really wasn’t much else people could do. The game was closed-ended in this sense. We thought it might be more interesting if the world was much bigger and the player goals more open-ended. It seemed like the platform technology had matured to the point where it might be feasible to attempt such a thing. What resulted was a pair of proposals, one for something we called Lucasnet, which would correspond to what nowadays we’d call a games portal, and one for something we called the Lucasfilm Games Alliance, which would correspond to what nowadays we’d call an MMORPG (and indeed, which looked in concept a lot like what Star Wars Galaxies turned out to be in practice, albeit 20 years later). The latter proposal asserted the following goals:

  • open-ended
  • large-scale
  • low cost to play
  • permits people with varying time commitments to play
  • ability to rejoin if you get wiped out
  • science fiction/interstellar theme
  • distributed processing on home machines
  • permits different levels of interest and ability

Other than “science fiction/interstellar theme”, this actually seems like a reasonable set of desiderata for any large scale online game, even today. (The thematic goal was included partly because it seemed to provide a hook for the kind of open-endedness that we were seeking, but mainly because it appealed to us personally and we thought it would be cool. It’s really the only element of the proposal that was completely arbitrary.)

This proposal went through the usual process of discussion and revision, got expanded rather considerably (and retitled Lucasfilm’s Universe), and then found its way into Steve’s file along with the rest. And that was the last of it, aside from periodic bouts of wistful speculation about just what a fun project it would be if only we could find somebody to fund it. Then Clive Smith came along.

In 1985, the Commodore 64 was the king of the consumer-level machines, and Commodore International (née Commodore Business Machines) was riding high. Clive Smith was their Vice President for Strategic Planning, and sometime during the spring or early summer of 1985 (I forget exactly when, and my notes aren’t clear on this), he came shopping. Every year, Commodore came up with some accessory to try to sell to Commodore 64 owners, in hopes of extracting another couple hundred bucks from each of them. One year it had been cheap printers, another year, floppy disk drives, and this year it was going to be a cheap 300 baud modem. In support of this, Commodore had made (at Clive’s instigation) a large investment in an up-and-coming, consumer-oriented online services company called Quantum Computer Services, which ran a service called QuantumLink (Q-Link for short) that at the time was targeting the Commodore 64 exclusively.

Clive had also been the one to push Commodore into purchasing Amiga, an event that resulted in much bad blood between Commodore and Atari. Prior to Atari’s meltdown in 1984, everybody’s expectation had been that Amiga was destined to become part of Atari’s empire. Rumor had it that there had been a gentlemen’s agreement to this effect between the founders of Amiga and certain high Atari executives. The talent behind Amiga were the same folks who had developed the Atari 400 and 800, and at Amiga they were fixing to take the next step along the same evolutionary pathway. The story went that Amiga had been set up as a separate company mainly because it was impossible to get anything done within Atari’s dysfunctional confines, but once the machine was ready Atari would buy back in (I don’t know if this is actually true, but it’s certainly the stuff of Silicon Valley legend). Jack Tramiel, Commodore’s founder, had been forced out of Commodore in a palace coup the previous year. In a weird sort of role reversal, he had then swallowed the bankrupt Atari’s remains and gone into competition with Commodore. There is evidence to suggest that Amiga had been a big part of what he (incorrectly) thought he was buying as part of the Atari deal. But Clive and Commodore had come along and snatched it up first, winning them Tramiel’s undying enmity (and also lawsuits), and forcing the rushed, impromptu, second-rate engineering behind the 520ST and 1040ST models that Atari introduced the following year.

Consequently, when Clive Smith came to Lucasfilm, he was shopping for two kinds of projects: things that would leverage modems and an online service, and things that would leverage the Amiga. Out of the file came the Lucasfilm’s Universe proposal, and another proposal for an Amiga-based space game that David Fox had written. Steve grabbed David and me, and off to the conference room we went to meet with Clive. We made our pitch, and Clive loved both proposals. However, they experienced very different fates.

Commodore, it turns out, was even more of a cheapskate company than Lucasfilm, and didn’t care to spend money on anything. They would be thrilled to have us develop David’s Amiga space game for them, but their concept of the deal seemed to consist of us developing the game and them being thrilled. They didn’t want to put up any actual cash as part of the transaction. Over the next few years we took a wait-and-see attitude towards how the market for Amiga games developed, as did all the other game developers, and so the whole thing was pretty much a flash in the pan: cool hardware, no sales. Mind you, Amiga enthusiasts are among the most dedicated in the industry, more so even than Mac zealots, so you would always hear a lot about the Amiga, but in round numbers (say, to the nearest million units) there weren’t any Amigas out there to develop for and so few of us did. David’s Amiga game never happened.

Quantum Computer Services, on the other hand, was a completely different matter. Although Commodore was one of their major investors, they were very much their own company, and they had considerably more of an entrepreneurial attitude than Commodore did (as one can readily see today from their respective fates — Commodore is gone, and Quantum Computer Services changed its name to America Online). From a business point of view, they were a good match for us because they already knew they were selling a consumer oriented product. We didn’t have to deal with any sales resistance on that point, as we might have with some of their competitors. All Clive had to do was broker some introductions, and we were off and running. Well, sort of. When two companies set out to do business with each other, there is this sort of dance that takes place involving the business people and the lawyers on each side. It became clear pretty early on in this process that we were going to do a deal, but the dance itself took a very long time.

The initial introduction took place sometime in the summer of 1985. It was promising enough that Steve put me to work preparing a design to pitch to them. I started thinking about open-ended virtual worlds (though we didn’t have the vocabulary to talk about it then, which made the whole process much more difficult), and the design immediately started moving away from the outer-space, conquer-the-galaxy kind of fantasy that Noah and I had originally discussed. It was clear that a lot of the experience that QuantumLink was delivering to its customers lay purely in the social dimension — people interacting with other people. We wanted to appeal to people outside the hardcore gamer demographic, which tends to be adolescent males, and try instead to appeal to the more mainstream, non-gamer population who used Q-Link. So the design became much more generic, and we ended up with a world that looked kind of funky and suburban, rather than something based on science fiction or fantasy (though it certainly had many SF and fantasy elements).

This was my second big encounter with You Can’t Tell People Anything (the first having been Project Xanadu). The Commodore and QuantumLink people were visibly enthusiastic about the prospect of working with us, but nobody really had a clue what I was talking about when I tried to explain what the thing was supposed to be. Actually, most of the people at Lucasfilm pretty much didn’t have a clue either. At Steve’s suggestion, I wrote a number of “day in the life” scenarios, to help people who had never seen anything like this imagine what the experience might be like. I got our lead artist, Gary Winnick, to draw up lots of story boards to visualize these. Gary’s illustrations not only helped clarify what I was talking about to everyone else, they also helped me get my own head straight on how this would all end up working. Having all this visual material around gave the project a big creative boost. (It didn’t hurt that I really resonate with Gary’s artistic sensibilities and so found his visual wit and imagination very energizing.)

QuantumLink proved to be an interesting business. It was the first major online service that was really tailored for the consumer market. Although the existing services actually got a lot of their paying user hours from consumers, I think they tended to view the whole consumer aspect of things as a sort of disreputable sideline. The customers they really cared about were businesses, which they viewed as both more serious and more profitable. The 800 pound gorilla in this industry was CompuServe, which at the time charged $20.00/hour during “prime time” (basically, during normal business hours) and $12.00/hour “off peak” (evenings and weekends). This is pretty pricey for your average consumer, even if you think of it in inflated 2004 dollars. In contrast, QuantumLink charged $3.60/hour on evenings and weekends and wasn’t available at all during the work day. They did a number of clever things that let them get away with charging this radically lower price. First of all, they had a sweetheart deal with Telenet, in those days one of the major nationwide X.25 packet network vendors, to buy up unused off peak network capacity at deep discount rates. This was pure gravy for Telenet, who wouldn’t otherwise be selling those lines then anyway, while it gave Q-Link a huge reduction in operating costs. This is what accounted for Q-Link’s unusual operating hours. (It may have helped that some of Quantum’s founders were former Telenet people who had an inside track on who to talk to to make such an arrangement in the first place.) Second, they used a client-server architecture, then quite unheard of in an industry that was historically oriented toward alphanumeric terminals. This enabled them to offload a whole bunch of user interface computation onto the customer’s computer while at the same time dramatically reducing the telecommunications load they placed on the network and on their servers, further reducing their operating costs. The smart client also allowed them to make their interface vastly easier to use, since they could do all kinds of modern UI things that aren’t feasible in the keyboard/terminal world. This reduced their customer support costs. Finally, by standardizing on the Commodore 64, they made a virtue of the necessity to have platform-specific client software, because it meant that all client endpoints were essentially the same. This standardization cut customer support overhead even more.
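
To get a feel for why that client-server decision mattered so much on consumer hardware, here is a little back-of-envelope sketch in Python. It is purely illustrative, and the message sizes are assumptions of mine rather than QuantumLink’s actual wire format: the point is simply that a smart client receiving a compact “something changed” event needs far less modem time than a dumb terminal receiving a full screen repaint.

    # Illustrative only: rough modem-time comparison between a terminal-style
    # full-screen repaint and a compact client-server event message.
    # The 300 baud rate matches the consumer modem described above; the
    # message sizes below are hypothetical, chosen just to show the ratio.

    BAUD = 300                # bits per second on the consumer modem
    BITS_PER_BYTE = 10        # 8 data bits plus start/stop framing, roughly

    full_screen_repaint = 40 * 25   # a 40x25 character screen, ~1000 bytes
    compact_event = 16              # hypothetical "object moved" message

    def transfer_seconds(num_bytes):
        return num_bytes * BITS_PER_BYTE / BAUD

    print(f"screen repaint: ~{transfer_seconds(full_screen_repaint):.1f} seconds")
    print(f"compact event:  ~{transfer_seconds(compact_event):.2f} seconds")

At those rates the repaint ties up the phone line for over half a minute while the compact event costs about half a second, which is (in caricature) the economics behind both the lower telecom bills and the snappier interface.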

From a technical point of view, QuantumLink was a good match because I didn’t have to sell them on the idea of a client-server game architecture. Heck, I didn’t even have to explain to them what a client-server architecture was, as I had had to do with many other folks (today it’s hard to imagine this was once considered an exotic way to do things). Plus, they had already worked out the technical details of doing data communications between Commodore 64 machines and their servers, and could even supply us with source code (well, mostly; as it turned out there were some issues wherein their implementation, uh, deviated from perfection, but that came up much later). Being based on the Commodore 64 was also advantageous since it meant that all of Lucasfilm Games’ heavy duty development tools and our big bag of C64 and 6502 tricks could be brought to bear. It wasn’t the slick, oomphy Amiga that we had originally been thinking in terms of, but it was a platform that we knew how to squeeze for every drop of performance it had.

As the summer of 1985 rolled on, it seemed increasingly likely that we were going to do this project (which had by this time been renamed again, and was now called MicroCosm — remember, this was an age when these machines we were developing for were still called “microcomputers”, so this name seemed like kind of a cool play on words). We had regular communications with the QuantumLink folks and the chemistry was pretty good. I started writing up a detailed technical design and putting together a project schedule. This was getting pretty serious. We already had in hand an amazing 6502-based cel animation engine that Charlie Kellner had developed for another Lucasfilm game, The Eidolon, which looked like it could provide a big chunk of the C64 client. It was all looking like it would come together.

Finally, in early October, Steve Arnold and I flew out to Virginia for a big nail-it-all-down meeting with Quantum. I spent a day with their technical team (Marc Seriff, their VP of Engineering, and Janet Hunter, who was going to be the technical lead on their end), going over the design and working through how it would mesh with their system. They liked what I had to show them and they in turn gave me a good long look at how their stuff worked. The next day we were joined by Steve Arnold from our side, Clive Smith from Commodore, and some of Quantum’s business and legal folks (notably their CEO, Jim Kimsey, and their head of marketing, Steve Case). Quantum management was really pumped and eager to do the deal.

So there we all are in this big conference room. All that remains at this point is for the business people to agree to the terms and for the lawyers to work out the details of the contract. Steve and the Quantum guys kick around the structure of the deal and block out the general shape of things. Actually, most of this has pretty much already been worked out ahead of time; the meeting is supposed to be a formality where management on both sides blesses the deal, they shake hands, and off we go. This is where the project has its first serious brush with death. Quantum’s lawyer jumps in and starts trying to renegotiate the deal: doing business with Lucasfilm is weird, it doesn’t work like anything he’s used to, so he starts trying to mold the relationship into a model he understands (basically, they pay us money, we give them software, they own everything). Steve attempts to enlighten him as to the nature of creative work as opposed to engineering, but the lawyer’s having none of it. But Lucasfilm just doesn’t do deals where the other guy ends up owning everything. Discussion with the business folks continues, ranging from engineering risks and scheduling, to creative and philosophical issues, but there’s this cloud of legal nastiness hanging over us the whole rest of the day. The lawyer doesn’t help things, every so often interjecting with further attempts to rip us off. Even Kimsey starts getting irritated with the guy.

We break at the end of the day with most of the outlines of the deal settled, but an air of extreme uneasiness about ownership and intellectual property issues. Steve Arnold and I go out to dinner with Clive Smith, Jim Kimsey, and Steve Case. Thankfully, over dinner we come to a handshake agreement about the ownership questions. The next day, Steve and I fly back to California. The project is on! (Except for actually finishing the legal paperwork.)

Once we returned to Marin, Lucasfilm’s lawyers got into the act. One thing I could always say with supreme confidence while working at Lucasfilm was: my lawyer can beat up your lawyer. Lucasfilm doesn’t have lawyers, it has legal ninjas — demons and wizards of the bar, each one utterly lacking in mercy. Our lawyers contacted Quantum’s lawyers, and a contract started taking shape. They worked on the contract through October. And through November. And through December…

While we waited, I worked on the technical specification. I worked on the spec through October. And through November. And through December. It turned out to be a helluva design (and all that unexpected extra preplanning would really pay off later). In December we concluded that this contract really was going to happen sooner or later (later, as it turned out, like February), and that we had better get started on the actual work. Steve gave me the go-ahead to start recruiting development team members. I grabbed Aric Wilmunder, who had been a contractor working on Koronis Rift for Noah, and tapped Gary to do the artwork. The game was on! Really! This time, for sure!

And Steve pointed me to this other guy who we’d contracted with to port Koronis Rift to the Apple II. Maybe I should talk to him. I said “OK, Steve, if you say so”. So I went and talked to the guy. I didn’t have a lot of time to interview people, but this guy had done a really nice job on the port, and done it pretty fast. And he was totally excited about the project, he’d been thinking about this sort of thing for years, it’s just the kind of thing he always wanted to work on. OK, snap judgement, you’re hired.

His name was Randy Farmer.

Best. Management decision. Ever.

July 4, 2004

Beware the Platform II

A long time ago we said “The implementation platform is relatively unimportant.” This was a statement made at a time when a lot of people were insisting that to do “real” cyberspace (whatever that is), you needed an $80,000 Silicon Graphics system (at least), whereas we came along and somewhat arrogantly claimed that all the stuff that really interested us could be done with a $150 Commodore 64. And we still believe that, notwithstanding the fact that today’s analog of the C64, a $150 PlayStation 2 or Xbox, has performance specs that exceed those of 1989’s $80,000 SGI machine. (Hey, we never said that we didn’t want cool 3D graphics, just that they weren’t the main event.)

So it should come as no great surprise, at least to those of you who have come to recognize us for the crusty contrarians that we are, when I tell you that one of the lessons that we’ve had our noses rubbed in over the past decade or so is that the platform is actually pretty darned important.

Our point about the platform being unimportant was really about performance specs: pixels and polygons, MIPS and megabytes. It was about what our long-time collaborator Doug Crockford calls the “threshold of goodenoughness”. We were in an era when the most salient characteristic of a computational platform was its performance envelope, which seemed to define the fundamental limits on what you could do. Our thesis was simply that much of what we wanted to do was already inside that envelope. Of course we always hunger for more performance, but the point remains. What we didn’t pay enough attention to in our original story, however, was that a platform is characterized by more than just its horsepower.

No matter how much our technical capabilities advance, there will always be something which acts as a limiting constraint. But though there are always limits, our experience had always been that these limits kept moving outward with the march of progress. While we were always champing at the bit for the next innovation, we were also fundamentally optimistic that the inexorable workings of Moore’s Law would eventually knock down whatever barrier was currently vexing us.

In the past 5-10 years, however, we have begun to encounter very different kinds of limits in the platforms that are available in the marketplace. These limits have little to do with the sorts of quantitative issues we worry about in the performance domain, and none of them are addressed (at least not directly) by Moore’s Law. They include such things as:

  • Operating system misfeatures
  • Dysfunctional standards
  • The ascendancy of the web application model
  • The progressive gumming up of the workings of the Internet by the IT managers and ISPs of the world
  • Distribution channel bottlenecks, notably customer reluctance or inability to download and/or install software
  • A grotesquely out of balance intellectual property system
  • The ascendancy of game consoles and attendant closed-system issues
  • Clueless regulators, corrupt legislators, and evil governments

As with the performance limitations that the march of progress has overcome for us, none of these are fundamental showstoppers, but they are all “friction factors” impeding development of the kinds of systems that we are interested in. In particular, several of these problems interact with each other in a kind of negative synergy, where one problem impedes solutions to another.

For example, the technical deficiencies of popular operating systems (Microsoft Windows being the most egregious offender in this regard, though certainly not the only one) have encouraged the proliferation of firewalls, proxies, and other function impeding features by ISPs and corporate network administrators. These in turn have shrunk many users’ connectivity options, reducing them from the universe of IP to HTTP plus whatever idiosyncratic collection of protocols their local administrators have deigned to allow. (Folks should remind me, once I get the current batch of posts I’m working on through the pipeline, to write something about the grotty reality of HTTP tunneling.) Furthermore, the security holes in Windows have made people rationally hesitant to install new software off the net (setting aside for a moment the additional inhibiting issues of download bandwidth and the quantum leap in user confusion caused by any kind of “OK to install?” dialog). Yet such downloaded software is the major pathway by which one could hope to distribute workarounds to these various connectivity barriers. And working around these barriers in turn often comes down to overcoming impediments deliberately placed by self-interested vendors who attempt to use various kinds of closed systems to achieve by technical means what they could not achieve by honest competition. And these workarounds must be developed and deployed in the face of government actions, such as the DMCA, which attempt to impose legal obstacles to their creation and distribution. Although we enjoyed a brief flowering of the open systems philosophy during the 1990s, I think this era is passing.

Note that, barring the imposition of a DRM regime that is both comprehensive and effective (which strikes me as unlikely in the extreme), the inexorable logic of technological evolution suggests that these barriers will be both permeable and temporary. That is a hopeful notion if you are, for example, a human rights activist working to tunnel through “The Great Firewall of China”. On the other hand, these things are, as I said, friction factors. In business terms that means they increase the cost of doing business: longer development times due to more complex systems that need to be coded and debugged, the time and expense of patent and intellectual property licensing requirements, more complicated distribution and marketing relationships that need to be negotiated, greater legal expenses and liability exposure, and the general hassle of other people getting into your business. This in turn means a bumpier road ahead for people like Randy and me if we try to raise investment capital for The Next Big Thing.

April 21, 2004

Getting started

Chip:
In May, 1990, Randy and I gave a talk at the First International Conference on Cyberspace, which we entitled “The Lessons of Lucasfilm’s Habitat”. In it, we presented the work we had done creating Habitat, one of the first big online virtual worlds. We talked about the things we had learned, the mistakes we made, and gave some advice for others who might be traveling down the same road. (The written version of the paper was published the next year in the book Cyberspace: First Steps, the conference proceedings edited by organizer Michael Benedikt.) The attention this generated surpassed our wildest expectations. It seemed we had struck a chord with a lot of people. We were invited to talk about Habitat in numerous other venues. Electronic copies of the paper were widely mirrored, first on FTP sites and then on the Web (even today, a Google search on an exact quote of the title will yield hundreds of hits). This all led to a consulting practice and ultimately to a remarkable company, Electric Communities.

This year, the organizers of the 2004 Muddev conference invited us to give a “fireside chat” presentation, updating the multiplayer games developer community on our experiences since the publication of The Lessons of Lucasfilm’s Habitat. So, on March 27, we gave the original lessons a critical reappraisal, talked about the projects we’ve worked on during the intervening years, and then put forward a batch of new lessons based on our more recent experiences.

You can find a copy of our PowerPoint slides from that presentation here. However, we promised the conference attendees that we would render the presentation into a somewhat less elliptical form than a set of slides that only make sense together with the words that were spoken. At the time we made this promise, we expected to be writing another paper, but, upon discussion, Randy and I both realized that we wanted to do something a little more adventurous. Since the presentation is part history, part sermonizing, part prognostication, and all very subjective, the different pieces did not seem like they wanted to go together in any sort of traditional academic form. We’ve been talking all these years about the unique capabilities of the electronic realm. And lately we’ve been preaching the virtues of incrementalism. Plus there’s a lot of good stuff that we just didn’t have room for in the talk. Plus there’s a lot of good stuff out there that other folks have been doing that we’d like to direct everyone’s attention to. So we decided it made more sense to do something on the web, something a little more dynamic and open ended than just writing an academic paper and posting the PDF file. Hence this site you are now reading.

This is an experiment. It’s going to be part weblog, part document repository. In the coming months we hope to fill in some history, document some cool technology, explain some ideas, offer some advice, pontificate wildly. And we want to look forward as well as backward. There are many cool and wonderful things remaining to be done out there.

Watch this space!

Randy:
Besides virtual worlds/communities, Chip and I have also done work in the area of reputation systems, online-moderated negotiations, and e-commerce systems. If it’s about building systems for people to interact online, we’ve either done it, or have an educated (but sometimes unfounded) opinion on the subject – but so have you.

That’s why we’ve started a blog. :-) It’s not just about two old farts pontificating; it’s about sharing our thoughts and learning from each other. We expect this to be a dialog, with you.

For now, all we ask is that you keep a civil tone and stay on topic.

This should be fun!