May 7, 2017

What Are Capabilities?

Some preliminary remarks

You can skip this initial section, which just sets some context, without loss to the technical substance of the essay that follows, though perhaps at some loss in entertainment value.

At a gathering of some of my coconspirators (er, friends) a couple months ago, Alan Karp lamented the lack of a good, basic introduction to capabilities for folks who aren’t already familiar with the paradigm. There’s been a lot written on the topic, but it seems like everything is either really long (though if you’re up for a great nerdy read I recommend Mark Miller’s PhD thesis), or really old (I think the root of the family tree is probably Dennis and Van Horn’s seminal paper from 1966), or embedded in explanations of specific programming languages (such as Marc Stiegler’s excellent E In A Walnut or the capabilities section of the Pony language tutorial) or operating systems (such as KeyKOS or seL4), or descriptions of things that use capabilities (like smart contracts or distributed file storage), or discussions of aspects of capabilities (Norm Hardy has written a ton of useful fragments on his website). But nothing that’s just a good “here, read this” that you can toss at curious people who are technically able but unfamiliar with the ideas. So Alan says, “somebody should write something like that,” while giving me a meaningful stare. Somebody, huh? OK, I can take a hint. I’ll give it a shot. Given my tendency to Explain Things this will probably end up being a lot longer than what Alan wants, but ya gotta start somewhere.

The first thing to confront is that term, “capabilities”, itself. It’s confusing. The word has a perfectly useful everyday meaning, even in the context of software engineering. When I was at PayPal, for example, people would regularly talk about our system’s capabilities, meaning what it can do. And this everyday meaning is actually pretty close to the technical meaning, because in both cases we’re talking about what a system “can” do, but usually what people mean by that is the functionality it realizes rather than the permissions it has been given. One path out of this terminological confusion takes its cue from the natural alignment between capabilities and object oriented programming, since it’s very easy to express capability concepts with object oriented abstractions (I’ll get into this shortly). This has led, without too much loss of meaning, to the term “object capabilities”, which embraces this affinity. This phrase has the virtue that we can talk about it in abbreviated form as “ocaps” and slough off some of the lexical confusion even further. It does have the downside that there are some historically important capability systems that aren’t really what you’d think of as object oriented, but sometimes oversimplification is the price of clarity. The main thing is, just don’t let the word “capabilities” lead you down the garden path; instead, focus on the underlying ideas.

The other thing to be aware of is that there’s some controversy surrounding capabilities. Part of this is a natural immune response to criticism (nobody likes being told that they’re doing things all wrong), part of it is academic tribalism at work, and part of it is the engineer’s instinctive and often healthy wariness of novelty. I almost hesitate to mention this (some of my colleagues might argue I shouldn’t have), but it’s important to understand the historical context if you read through the literature. Some of the pushback these ideas have received doesn’t really have as much to do with their actual merits or lack thereof as one might hope; some of it is profoundly incorrect nonsense and should be called out as such.

The idea

Norm Hardy summarizes the capability mindset with the admonition “don’t separate designation from authority”. I like this a lot, but it’s the kind of zen aphorism that’s mainly useful to people who already understand it. To everybody else, it just invites questions: (a) What does that mean? and (b) Why should I care? So let’s take this apart and see…

The capability paradigm is about access control. When a system, such as an OS or a website, is presented with a request for a service it provides, it needs to decide if it should actually do what the requestor is asking for. The way it decides is what we’re talking about when we talk about access control. If you’re like most people, the first thing you’re likely to think of is to ask the requestor “who are you?” The fundamental insight of the capabilities paradigm is to recognize that this question is the first step on the road to perdition. That’s highly counterintuitive to most people, hence the related controversy.

For example, let’s say you’re editing a document in Microsoft Word, and you click on the “Save” button. This causes Word to issue a request to the operating system to write to the document file. The OS checks if you have write permission for that file and then allows or forbids the operation accordingly. Everybody thinks this is normal and natural. And in this case, it is: you asked Word, a program you chose to run, to write your changes to a file you own. The write succeeded because the operating system’s access control mechanism allowed it on account of it being your file, but that mechanism wasn’t doing quite what you might think. In particular, it didn’t check whether the specific file write operation in question was the one you asked for (because it can’t actually tell), it just checked if you were allowed to do it.

The access control model here is what’s known as an ACL, which stands for Access Control List. The basic idea is that for each thing the operating system wants to control access to (like a file, for example), it keeps a list of who is allowed to do what. The ACL model is how every current mainstream operating system handles this, so it doesn’t matter if we’re talking about Windows, macOS, Linux, FreeBSD, iOS, Android, or whatever. While there are a lot of variations in the details of how they handle access control (e.g., the Unix file owner/group/others model, or the principal-per-app model common on phone OSs), in this respect they’re all fundamentally the same.

As I said, this all seems natural and intuitive to most people. It’s also fatally flawed. When you run an application, as far as the OS is concerned, everything the application does is done by you. Another way to put this is, an application you run can do anything you can do. This seems OK in the example we gave of Word saving your file. But what if Word did something else, like transmit the contents of your file over the internet to a server in Macedonia run by the mafia, or erase any of your files whose names begin with a vowel, or encrypt all your files and demand payment in bitcoins to decrypt them? Well, you’re allowed to do all those things, if for some crazy reason you wanted to, so it can too. Now, you might say, we trust Word not to do evil stuff like that. Microsoft would get in trouble. People would talk. And that’s true. But it’s not just Microsoft Word, it’s every single piece of software in your computer, including lots of stuff you don’t even know is there, much of it originating from sources far more difficult to hold accountable than Microsoft Corporation, if you even know who they are at all.

The underlying problem is that the access control mechanism has no way to determine what you really wanted. One way to deal with this might be to have the operating system ask you for confirmation each time a program wants to do something that is access controlled: “Is it OK for Word to write to this file, yes or no?” Experience with this approach has been pretty dismal. Completely aside from the fact that this is profoundly annoying, people quickly become trained to reflexively answer “yes” without a moment’s thought, since that’s almost always the right answer anyway and they just want to get on with whatever they’re doing. Plus, a lot of the access controlled operations a typical program does are internal things (like fiddling with a configuration file, for example) whose appropriateness the user has no way to determine anyhow.

An alternative approach starts by considering how you told Word what you wanted in the first place. When you first opened the document for editing, you typically either double-clicked on an icon representing the file, or picked the file from an Open File dialog. Note, by the way, that both of these user interface interactions are typically implemented by the operating system (or by libraries provided by the operating system), not by Word. The way current APIs work, what happens in either of these cases is that the operating system provides the application with a character string: the pathname of the file you chose. The application is then free to use this string however it likes, typically passing it as a parameter to another operating system API call to open the file. But this is actually a little weird: you designated a file, but the operating system turned this into a character string which it gave to Word, and then when Word actually wanted to open the file it passed the string back to the operating system, which converted it back into a file again. As I’ve said, this works fine in the normal case. But Word is not actually limited to using just the string that names the particular file you specified. It can pass any string it chooses to the Open File call, and the only access limitation it has is what permissions you have. If it’s your own computer, that’s likely to be permissions to everything on the machine, but certainly it’s at least permissions to all your stuff.

Now imagine things working a little differently. Imagine that when Word starts running it has no access at all to anything that’s access controlled – no files, peripheral devices, networks, nothing. When you double-click the file icon or pick from the open file dialog, instead of giving Word a pathname string, the operating system itself opens the file and gives Word a handle to it (that is, it gives Word basically the same thing it would have given Word in response to the Open File API call when doing things the old way). Now Word has access to your document, but that’s all. It can’t send your file to Macedonia, because it doesn’t have access to the network – you didn’t give it that, you just gave it the document. It can’t delete or encrypt any of your other files, because it wasn’t given access to any of them either. It can mess up the one file you told it to edit, but it’s just the one file, and if it did that you’d stop using Word and not suffer any further damage. And notice that the user experience – your experience – is exactly the same as it was before. You didn’t have to answer any “mother may I?” security questions or put up with any of the other annoying stuff that people normally associate with security. In this world, that handle to the open file is an example of what we call a “capability”.
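To make the shape of this concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: FILES stands in for the user’s files, open_file_dialog plays the OS-provided dialog, and editor plays Word. Python won’t actually stop editor from reaching for the global FILES, so imagine the editor living somewhere that name isn’t visible; the point is only that the editor is handed an open handle rather than a pathname.

```python
import io

# Hypothetical stand-in for the user's files; in real life the OS holds these.
FILES = {
    "report.txt": io.StringIO("draft"),
    "diary.txt": io.StringIO("secret"),
}

def open_file_dialog(chosen_name: str) -> io.StringIO:
    """Plays the OS-provided dialog: it opens the file itself and hands back
    the open handle, never the name."""
    return FILES[chosen_name]

def editor(doc: io.StringIO) -> None:
    """Plays Word. It is handed exactly one handle, so that one document is
    all it is meant to be able to touch."""
    doc.seek(0, io.SEEK_END)  # move to the end of the document
    doc.write(" (edited)")    # append the user's change

# The user picks report.txt; the editor never learns any pathname.
editor(open_file_dialog("report.txt"))
```

Notice that editor has no parameter through which a second file could be named. The designation (which document) and the authority (permission to write it) arrived together, as one object.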

This is where we get back to Norm Hardy’s “don’t separate designation from authority” motto. By “designation” we mean how we indicate to, for example, the OS, which thing we are talking about. By “authority” we mean what we are allowed by the OS to do with that thing. In the traditional ACL world, these are two largely disconnected concepts. In the case of a file, the designator is typically a pathname – a character string – that you use to refer to the file when operating upon it. The OS provides operations like Write File or Delete File that are parameterized by the path name of the file to be written to or deleted. Authority is managed separately as an ACL that the OS maintains in association with each file. This means that the decision to grant access to a file is unrelated to the decision to make use of it. But this in turn means that the decision to grant access has to be made without knowledge of the specific uses. It means that the two pieces of information the operating system needs in order to make its access control decision travel to it via separate routes, with no assurance that they will be properly associated with each other when they arrive. In particular, it means that a program can often do things (or be fooled into doing things) that were never intended to be allowed.

Here’s the original example of the kind of thing I’m talking about, a tale from Norm. It’s important to note, by the way, that this is an actual true story, not something I just made up for pedagogical purposes.

Once upon a time, Norm worked for a company that ran a bunch of timeshared computers, kind of like what we now call “cloud computing” only with an earlier generation’s buzzwords. One service they provided was a FORTRAN compiler, so customers could write their own software.

It being so many generations of Moore’s Law ago, computing was expensive, so each time the compiler ran it wrote a billing record to a system accounting file noting the resources used, so the customer could be charged for them. Since this was a shared system, the operators knew to be careful with file permissions. So, for example, if you told the compiler to output to a file that belonged to somebody else, this would fail because you didn’t have permission. They also took care to make sure that only the compiler itself could write to the system accounting file – you wouldn’t want random users to mess with the billing records, that would obviously be bad.

Then one day somebody figured out they could tell the compiler the name of the system accounting file as the name of the file to write the compilation output to. The access control system looked at this and asked, “does this program have permission to write to this file?” – and it did! And so the compiler was allowed to overwrite the billing records and the billing information was lost and everybody got all their compilations for free that day.

Fixing this turned out to be surprisingly slippery. Norm named the underlying problem “The Confused Deputy”. At heart, the FORTRAN compiler was deputized by two different masters: the customer and the system operators. To serve the customer, it had been given permission to access the customer’s files. To serve the operators, it had been given permission to access the accounting file. But it was confused about which master it was serving for which purpose, because it had no way to associate the permissions it had with their intended uses. It couldn’t specify “use this permission for this file, use that permission for that file”, because the permissions themselves were not distinct things it could wield selectively – the compiler never actually saw or handled them directly. We call this sort of thing “ambient authority”, because it’s just sitting there in the environment, waiting to be used automatically without regard to intent or context.

If this system had been built on capability principles, rather than accessing the files by name, the compiler would instead have been given a capability by the system operators to access the accounting file with, which it would use to update the billing records, and then gotten a different capability from the customer to access the output file, which it would use when outputting the result of the compilation. There would have been no confusion and no exploit.
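Here’s a rough sketch of what that capability-style compiler might look like, in Python. The class names (BillingLog, OutputFile, Compiler) are my own inventions for illustration, not anything from Norm’s actual system, and Python’s underscore convention is only a polite request rather than real encapsulation. The thing to look at is that the compiler holds the billing capability privately, and the customer supplies the output capability as an object rather than as a name.

```python
class BillingLog:
    """Hypothetical accounting file."""
    def __init__(self):
        self._records = []

    def append(self, record: str) -> None:
        self._records.append(record)

    def records(self) -> list:
        return list(self._records)


class OutputFile:
    """Hypothetical customer-owned file."""
    def __init__(self):
        self.contents = ""

    def write(self, text: str) -> None:
        self.contents = text


class Compiler:
    def __init__(self, billing: BillingLog):
        # Capability from the operators, used only for billing.
        self._billing = billing

    def compile(self, source: str, output: OutputFile) -> None:
        # Capability from the customer, used only for output.
        output.write(f"compiled({source})")
        self._billing.append(f"charged for {len(source)} chars")


billing = BillingLog()            # held by the operators and the compiler only
compiler = Compiler(billing)
customer_output = OutputFile()    # held by the customer
compiler.compile("x = 1", customer_output)
```

The customer can only pass in capabilities the customer actually holds. Since nobody ever gave the customer a reference to the billing log, there is nothing the customer can hand to compile that would point the output at it.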

You might think this is some obscure problem those old people had back somewhere at the dawn of the industry, but a whole bunch of security problems plaguing us today – which you might think are all very different – fit this template, including many kinds of injection attacks, cross-site request forgery, cross-site scripting attacks, click-jacking – including, depending on how you look at it, somewhere between 5 and 8 members of the OWASP top 10 list. These are all arguably confused deputy problems, manifestations of this one conceptual flaw first noticed in the 1970s!

Getting more precise

We said separating designation from authority is dangerous, and that instead these two things should be combined, but we didn’t really say much about what it actually means to combine them. So at this point I think it’s time to get a bit more precise about what a capability actually is.

A capability is a single thing that both designates a resource and authorizes some kind of access to it.

There’s a bunch of abstract words in there, so let’s unpack it a bit.

By resource we just mean something the access control mechanism controls access to. It’s some specific thing we have that somebody might want to use somehow, whose use we seek to regulate. It could be a file, an I/O device, a network connection, a database record, or really any kind of object. The access control mechanism itself doesn’t much care what kind of thing the resource is or what someone wants to do with it. In specific use cases, of course, we care very much about those things, but then we’re talking about what we use the access control mechanism for, not about how it works.

In the same vein, when we talk about access, we just mean actually doing something that can be done with the resource. Access could be reading, writing, invoking, using, destroying, activating, or whatever. Once again, which of these it is is important for specific uses but not for the mechanism itself. Also, keep in mind that the specific kind of access that’s authorized is one of the things the capability embodies. Thus, for example, a read capability to a file is a different thing from a write capability to the same file (and of course, there might be a read+write capability to that file, which would be yet a third thing).
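As a small illustration of that last point, here is a sketch in Python of one resource with two distinct capabilities to it. The names make_file, ReadCap and WriteCap are hypothetical, and the shared state is kept in a closure rather than on the objects themselves, which is about as close as plain Python gets to hiding it (a determined caller can still dig it out, so this is a sketch and not a security boundary).

```python
def make_file(initial: str = ""):
    """Create a (hypothetical) in-memory file and return two separate
    capabilities to it: one that can only read, one that can only write."""
    contents = [initial]  # shared state, held in the enclosing closure

    class ReadCap:
        def read(self) -> str:
            return contents[0]

    class WriteCap:
        def write(self, text: str) -> None:
            contents[0] = text

    return ReadCap(), WriteCap()


reader, writer = make_file("hello")
writer.write("goodbye")
```

Both objects designate the same file, but each one authorizes a different kind of access. Handing someone reader gives them no way to write, and handing someone writer gives them no way to read.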

By designation, we mean indicating, somehow, specifically which resource we’re talking about. And by authorizing we mean that we are allowing the access to happen. Hopefully, none of this is any surprise.

Because the capability combines designation with authority, the possessor of the capability exercises their authority – that is, does whatever it is they are allowed to do with the resource the capability is a capability to – by wielding the capability itself. (What that means in practice should be clearer after a few examples.) If you don’t possess the capability, you can’t use it, and thus you don’t have access. Access is regulated by controlling possession.

A key idea is that capabilities are transferable, that someone who possesses a capability can convey it to someone else. An important implication that falls out of this is that capabilities fundamentally enable delegation of authority. If you are able to do something, it means you possess a capability for that something. If you pass this capability to somebody else, then they are now also able to do whatever it is. Delegation is one of the main things that make capabilities powerful and useful. However, it also tends to cause a lot of people to freak out at the apparent loss of control. A common response is to try to invent mechanisms to limit or forbid delegation, which is a terrible idea and won’t work anyway, for reasons I’ll get into.

If you’re one of these people, please don’t freak out yourself; I’ll come back to this shortly and explain some important capability patterns that hopefully will address your concerns. In the meantime, a detail that might be helpful to meditate on: two capabilities that authorize the same access to the same resource are not necessarily the same capability (note: this is just a hint to tease the folks who are trying to guess where this goes, so if you’re not one of those people, don’t worry if it’s not obvious).

Another implication of our definition is that capabilities must be unforgeable. By this we mean that you can’t by yourself create a capability to a resource that you don’t already have access to. This is a basic requirement that any capability system must satisfy. For example, using pathnames to designate files is problematic because anybody can create any character string they want, which means they can designate any file they want if pathnames are how you do it. Pathnames are highly forgeable. They work fine as designators, but can’t by themselves be used to authorize access. In the same vein, an object pointer in C++ is forgeable, since you can typecast an integer into a pointer and thus produce a pointer to any kind of object at any memory address of your choosing, whereas in Java, Smalltalk, or pretty much any other memory-safe language where this kind of casting is not available, an object reference is unforgeable.

As I’ve talked about all this, I’ve tended to personify the entities that possess, transfer, and wield capabilities – for example, sometimes by referring to one of them as “you”. This has let me avoid saying much about what kind of entities these are. I did this so you wouldn’t get too anchored in specifics, because there are many different ways capability systems can work, and the kinds of actors that populate these systems vary. In particular, personification let me gloss over whether these actors were bits of software or actual people. However, we’re ultimately interested in building software, so now let’s talk about these entities as “objects”, in the traditional way we speak of objects in object oriented programming. By getting under the hood a bit, I hope things may be made a little easier to understand. Later on we can generalize to other kinds of systems beyond OOP.

I’ll alert you now that I’ll still tend to personify these things a bit. It’s helpful for us humans, in trying to understand the actions of an intentional agent, to think of it as if it’s a person even if it’s really code. Plus – and I’ll admit to some conceptual ju-jitsu here – we really do want to talk about objects as distinct intentional agents. Another of the weaknesses of the ACL approach is that it roots everything in the identity of the user (or other vaguely user-like abstractions like roles, groups, service accounts, and so on) as if that user was the one doing things, that is, as if the user is the intentional agent. However, when an object actually does something it does it in a particular way that depends on how it is coded. While this behavior might reflect the intentions of the specific user who ultimately set it in motion, it might as easily reflect the intentions of the programmers who wrote it – more often, in fact, because most of what a typical piece of software does involves internal mechanics that we often work very hard to shield the user from having to know anything about.

In what we’re calling an “object capability” system (or “ocap” system, to use the convenient contraction I mentioned in the beginning), a reference to an object is a capability. The interesting thing about objects in such a system is that they are both wielders of capabilities and resources themselves. An object wields a capability – an object reference – by invoking methods on it. You transfer a capability by passing an object reference as a parameter in a method invocation, returning it from a method, or by storing it in a variable. An ocap system goes beyond an ordinary OOP system by imposing a couple additional requirements: (1) that object references be unforgeable, as just discussed, and (2) that there be some means of strong encapsulation, so that one object can hold onto references to other objects in a way that these can’t be accessed from outside it. For example, you can implement ocap principles in Java using ordinary Java object references held in private instance variables (to make Java itself into a pure ocap language – which you can totally do, by the way – requires introducing a few additional rules, but that’s more detail than we have time for here).

In an ocap system, there are only three possible ways you can come to have a capability to some resource, which is to say, to have a reference to some object: creation, transfer, and endowment.

Creation means you created the resource yourself. We follow the convention that, as a byproduct of the act of creation, the creator receives a capability that provides full access to the new resource. This is natural in an OOP system, where an object constructor typically returns a reference to the new object it constructed. In a sense, creation is an optional feature, because it’s not actually a requirement that a capability system have a way to produce new resources at all (that is, it might be limited to resources that already exist), but if it does, there needs to be a way for the new resources to enter into the capability world, and handing them to their creator is a good way to do it.

Transfer means somebody else gave the capability to you. This is the most important and interesting case. Capability passing is how the authority graph – the map of who has what authority to do what with what – can change over time (by the way, the lack of a principled way to talk about how authorities change over time is another big problem with the ACL model). The simple idea is: Alice has a capability to Bob, Alice passes this capability to Carol, now Carol also has a capability to Bob. That simple narrative, however, conceals some important subtleties. First, Alice can only do this if she actually possesses the capability to Bob in the first place. Hopefully this isn’t surprising, but it is important. Second, Alice also has to have a capability to Carol (or some capability to communicate with Carol, which amounts to the same thing). Now things get interesting; it means we have a form of confinement, in that you can’t leak a capability unless you have another capability that lets you communicate with someone to whom you’d leak it. Third, Alice had to choose to pass the capability on; neither Bob nor Carol (nor anyone else) could cause the transfer without Alice’s participation (this is what motivates the requirement for strong encapsulation).
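The Alice, Bob, and Carol story can be sketched in a few lines of Python. The method names here (greet, receive, use, introduce) are invented for the example, and as usual Python only approximates the encapsulation a real ocap language would enforce. What the sketch tries to show is the three conditions: Alice holds a reference to Bob, Alice holds a reference to Carol, and the transfer happens only because Alice’s own code chooses to do it.

```python
class Bob:
    def greet(self) -> str:
        return "hello from Bob"


class Carol:
    def __init__(self):
        self._bob = None  # Carol starts out with no capability to Bob

    def receive(self, cap) -> None:
        self._bob = cap   # store whatever capability she is handed

    def use(self):
        # Carol can wield the capability only once she possesses it.
        return self._bob.greet() if self._bob is not None else None


class Alice:
    def __init__(self, bob: Bob, carol: Carol):
        self._bob = bob      # Alice must already hold a capability to Bob...
        self._carol = carol  # ...and one to Carol, or she has nobody to tell

    def introduce(self) -> None:
        # The transfer is Alice's choice: a method call with Bob as argument.
        self._carol.receive(self._bob)


bob = Bob()
carol = Carol()
alice = Alice(bob, carol)
before = carol.use()   # Carol has nothing to wield yet
alice.introduce()
after = carol.use()    # now she does
```

Nothing in Carol’s code can reach into Alice to pull the reference out (in a language with real encapsulation, anyway); the only path from Bob to Carol runs through a call that Alice makes.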

Endowment means you were born with the capability. An object’s creator can give it a reference to some other object as part of its initial state. In one sense, this is just creation followed by transfer. However, we treat endowment as its own thing for a couple of reasons. First, it’s how we can have an immutable object that holds a capability. Second, it’s how we avoid infinite regress when we follow the rules to their logical conclusion.
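In ordinary OOP terms, creation and this simple form of endowment look like nothing more exotic than calling a constructor and passing a constructor argument. A tiny Python sketch, with hypothetical Counter and Clicker classes:

```python
class Counter:
    def __init__(self):
        self._n = 0

    def increment(self) -> int:
        self._n += 1
        return self._n


class Clicker:
    """Born holding a capability to exactly one counter. It has no method
    for receiving another one later, so this is the only way it gets it."""
    def __init__(self, counter: Counter):
        self._counter = counter  # endowment: part of the initial state

    def click(self) -> int:
        return self._counter.increment()


counter = Counter()          # creation: the creator receives the reference
clicker = Clicker(counter)   # endowment: the new object is born with it
first = clicker.click()
second = clicker.click()
```

Since Clicker never reassigns _counter, it behaves like the immutable capability-holding object described above: there is no later moment at which the capability could have been transferred to it.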

Endowment is how objects end up with capabilities that are realized by the ocap system implementation itself rather than by code executing within it. What this means varies depending on the nature of the system; for example, an ocap language framework running on a conventional OS might provide a capability-oriented interface to the OS’s non-capability-oriented file system. An ocap operating system (such as KeyKOS or seL4) might provide capability-oriented access to primitive hardware resources such as disk blocks or network interfaces. In both cases we’re talking about things that exist outside the ocap model, which must be wrapped in special privileged objects that have native access to those things. Such objects can’t be created within the ocap rules, so they have to be endowed by the system itself.

So, to summarize: in the ocap model, a resource is an object and a capability is an object reference. The access that a given capability enables is the method interface that the object reference exposes. Another way to think of this is: ocaps are just object oriented programming with some additional strictness.

Here we come to another key difference from the ACL model: in the ocap world, the kinds of resources that may be access controlled, and the varieties of access to them that can be provided, are typically more diverse and more finely grained. They’re also generally more dynamic, since it’s usually possible, and indeed normal, to introduce new kinds of resources over time, with new kinds of access affordances, simply by defining new object classes. In contrast, the typical ACL framework has a fixed set of resource types baked into it, along with a small set of access modes that can be separately controlled. This difference is not fundamental – you could certainly create an extensible ACL system or an ocap framework based on a small, static set of object types – but it points to an important philosophical divergence between the two approaches.

In the ACL model, access decisions are made on the basis of access configuration settings associated with the resources. These settings must be administered, often manually, by direct interaction with the access control machinery, typically using tools that are part of the access control system itself. While policy abstractions (such as groups or roles, for example) can reduce the need for humans to make large numbers of individual configuration decisions, it is typically the case that each resource acquires its access control settings as the consequence of people making deliberate access configuration choices for it.

In contrast, the ocap approach dispenses with most of this configuration information and its attendant administrative activity. The vast majority of access control decisions are realized by the logic of how the resources themselves operate. Most access control choices are subsumed by the code of the corresponding objects. At the granularity of individual objects, the decisions to be made are usually simple and clear from context, further reducing the cognitive burden. Only at the periphery, where the system comes into actual contact with its human users, do questions of policy and human intent arise. And in many of these cases, intent can be inferred from the normal acts of control and designation that users make through their normal UI interactions (such as picking a file from a dialog or clicking on a save button, to return to the example we began with).

Consequently, thinking about access control policy and administration is an entirely different activity in an ocap system than in an ACL system. This thinking extends into the fundamental architecture of applications themselves, as well as that of things like programming languages, application frameworks, network protocols, and operating systems.

Capability patterns

To give you a taste of what I mean by affecting fundamental architecture, let’s fulfill the promise I made earlier to talk about how we address some of the concerns that someone from a more traditional ACL background might have.

The ocap approach both enables and relies on compositionality – putting things together in different ways to make new kinds of things. This isn’t really part of the ACL toolbox at all. The word “compositionality” is kind of vague, so I’ll illustrate what I’m talking about with some specific capability patterns. For discussion purposes, I’m going to group these patterns into a few rough categories: modulation, attenuation, abstraction, and combination. Note that there’s nothing fundamental about these, they’re just useful for presentation.

Modulation

By modulation, I mean having one object modulate access to another. The most important example of this is called a revoker. A major source of the anxiety that some people from an ACL background have about capabilities is the feeling that a capability can easily escape their control. If I’ve given someone access to some resource, what happens if later I decide it’s inappropriate for them to have it? In the ACL model, the answer appears to be simple: I merely remove that person’s entry from the resource’s ACL. In the ocap model, if I’ve given them one of my capabilities, then now they have it too, so what can I do if I don’t want them to have it any more? The answer is that I didn’t give them my capability. Instead I gave them a new capability that I created, a reference to an intermediate object that holds my capability but remains controlled by me in a way that lets me disable it later. We call such a thing a revoker, because it can revoke access. A rudimentary form of this is just a simple message forwarder that can be commanded to drop its forwarding pointer.
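A rudimentary revoker of that kind might look something like the following Python sketch. The names make_revocable, Forwarder, Revoked and Document are all hypothetical. Note that this simple version only guards method calls on the one object; if a method returned a reference to something inside the target, that reference would slip past the forwarder, and dealing with that takes more machinery than I’m showing here.

```python
class Revoked(Exception):
    """Raised when a revoked capability is used."""


def make_revocable(target):
    """Return (forwarder, revoke). The forwarder relays method calls to
    target until revoke() is called; after that every call fails."""
    cell = [target]  # the forwarding pointer, held in a closure

    class Forwarder:
        def __getattr__(self, name):
            # Look up the target at call time, so that a method fetched
            # before revocation stops working afterwards too.
            def call(*args, **kwargs):
                if cell[0] is None:
                    raise Revoked("access has been revoked")
                return getattr(cell[0], name)(*args, **kwargs)
            return call

    def revoke() -> None:
        cell[0] = None  # drop the forwarding pointer

    return Forwarder(), revoke


class Document:
    """Hypothetical resource whose access we want to be able to take back."""
    def read(self) -> str:
        return "contents"


# I keep `revoke` for myself and hand out only `shared`.
shared, revoke = make_revocable(Document())
```

The person holding shared can use it exactly as they would use the document, right up until I call revoke(). They never held my own reference to the document, so there is nothing left for them to fall back on.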

Modulation can be more sophisticated than simple revocation. For example, I could provide someone with a capability that I can switch on or off at will. I could make access conditional on time or date or location. I could put controls on the frequency or quantity of use (a use-once capability with a built-in expiration date might be particularly useful). I could even make an intermediary object that requires payment in exchange for access. The possibilities are limited only by need and imagination.
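As one illustration of these variations, the following Python sketch wraps a function so that it can be invoked only a limited number of times and only before a deadline. The helper name make_limited and its parameters are assumptions made for this example, and the clock is injectable purely so the behavior is easy to test.

```python
import time


def make_limited(func, max_uses, lifetime_seconds, clock=time.monotonic):
    """Wrap func so it works at most max_uses times, and only before a deadline."""
    deadline = clock() + lifetime_seconds
    remaining = max_uses

    def limited(*args, **kwargs):
        nonlocal remaining
        if clock() > deadline:
            raise PermissionError("capability has expired")
        if remaining <= 0:
            raise PermissionError("capability has been used up")
        remaining -= 1
        return func(*args, **kwargs)

    return limited


def open_door():
    """Hypothetical operation being protected."""
    return "door opened"


# A use-once capability that is good for one hour.
open_door_once = make_limited(open_door, max_uses=1, lifetime_seconds=3600)
```

Payment, location checks, or an on/off switch would follow the same shape: a condition tested inside the wrapper before the call is forwarded.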

The revoker pattern solves the problem of taking away access, but what about controlling delegation? Capabilities are essentially bearer instruments – they convey their authority to whoever holds them, without regard to who the holder is. This means that if I give someone a capability, they could pass it to someone else whose access I don’t approve of. This is another big source of anxiety for people in the ACL camp: the idea that in the capability model there’s no way to know who has access. This is not rooted in some misunderstanding of capabilities either; it’s actually true. But the ACL model doesn’t really help with this, because it has the same problem.

In real world use cases, the need to share resources and to delegate access is very common. Since the ACL model provides no mechanism for this, people fall back on sharing credentials, often in the face of corporate policies or terms of use that specifically forbid this. When presented with the choice between following the rules and getting their job done, people will often pick the latter. Consider, for example, how common it is for a secretary or executive assistant to know their boss’s password – in my experience, it’s almost universal.

There’s a widespread belief that an ACL tells you who has access, but this is just an illusion, due to the fact that credential sharing is invisible to the access control system. What you really have is something that tells you who to hold responsible if a resource is used inappropriately. And if you think about it, this is what you actually want anyway. The ocap model also supports this type of accountability, but can do a much better job of it.

The first problem with credential sharing is that it’s far too permissive. If my boss gives me their company LDAP password so I can access their calendar and email, they’re also giving me access to everything else that’s protected by that password, which might extend to things like sensitive financial or personnel records, or even the power to spend money from the company bank account. Capabilities, in contrast, allow them to selectively grant me access to specific things.

The second problem with credential sharing is that if I use my access inappropriately, there’s no way to distinguish my accesses from theirs. It’s hard for my boss to claim “my flunky did it!” if the activity logs are tied to the boss’s name, especially if they weren’t supposed to have shared the credentials in the first place. And of course this risk applies in the other direction as well: if it’s an open secret that I have my boss’s password, suspicion for their misbehavior can fall on me; indeed, if my boss was malicious they might share credentials just to gain plausible deniability when they abscond with company assets. The revoker pattern, however, can be extended to enable delegation to be auditable. I delegate by passing someone an intermediary object that takes note of who is being delegated to and why, and then it can record this information in an audit log when it is used. Now, if the resource is misused, we actually know who to blame.
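A rough Python sketch of such an auditing intermediary follows. The function delegate_with_audit, the Calendar class, and the log record layout are all hypothetical. A real system would also want the log itself to be tamper resistant, which this sketch does not attempt.

```python
def delegate_with_audit(target, delegate_name, reason, audit_log):
    """Return a forwarder to target that records who used it, why, and for what."""

    class AuditedForwarder:
        def __getattr__(self, name):
            if name.startswith("_"):
                raise AttributeError(name)

            def call(*args, **kwargs):
                # Record the use before forwarding it.
                audit_log.append(
                    {"who": delegate_name, "why": reason, "operation": name}
                )
                return getattr(target, name)(*args, **kwargs)

            return call

    return AuditedForwarder()


class Calendar:
    """Hypothetical resource belonging to the boss."""

    def __init__(self):
        self.events = []

    def add_event(self, title):
        self.events.append(title)
        return len(self.events)


log = []
boss_calendar = Calendar()
assistant_access = delegate_with_audit(
    boss_calendar, "assistant", "schedules meetings", log
)
assistant_access.add_event("Board meeting")
# log now holds one record naming the assistant, not the boss
```

Because the boss keeps using boss_calendar directly while the assistant goes through assistant_access, the two sets of accesses are distinguishable after the fact.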

Keep in mind also that credential sharing isn’t limited to shared passwords. For example, if somebody asks me to run a program for them, then whatever it is that they wanted done by that program gets done using my credentials. Even if what the program did was benign and the request was made with the best of intentions, we’ve still lost track of who was responsible. This is the reason why some companies forbid running software they haven’t approved on company computers.

Attenuation

When I talk about attenuation, I mean reducing what a capability lets you do – its scope of authority. The scope of authority can encompass both the operations that are enabled and the range of resources that can be accessed. The latter is particularly important, because it’s quite common for methods on an object’s API to return references to other objects as a result (once again, a concept that is foreign to the ACL world). For example, one might have a capability that gives access to a computer’s file system. Using this, an attenuator object might instead provide access only to a specific file, or perhaps to some discrete sub-directory tree in a file hierarchy (i.e., a less clumsy version of what the Unix chroot operation does).

Attenuating functionality is also possible. For example, the base capability to a file might allow any operation the underlying file system supports: read, write, append, delete, truncate, etc. From this you can readily produce a read-only capability to the same file: simply have the intermediary object support read requests without providing any other file API methods.
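In Python, a read-only attenuator might be sketched like this. MemoryFile is a hypothetical in-memory stand-in for a full-authority file capability, used so the example needs no real file system. As before, Python cannot actually prevent a determined caller from digging the underlying object out of the closure.

```python
class MemoryFile:
    """Hypothetical full-authority file capability, held in memory."""

    def __init__(self, data=b""):
        self._data = bytearray(data)

    def read(self):
        return bytes(self._data)

    def write(self, data):
        self._data[:] = data

    def append(self, data):
        self._data += data

    def truncate(self):
        self._data.clear()


def make_read_only(file):
    """Return an object that exposes only the read operation of file."""

    class ReadOnly:
        def read(self):
            return file.read()

    return ReadOnly()


notes = MemoryFile(b"hello")
notes_reader = make_read_only(notes)
notes_reader.read()  # works; write, append, and truncate simply do not exist here
```

The attenuated object is not a file with a permission flag turned off. It is a different object with a smaller API, so there is nothing to misconfigure.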

Of course, these are composable: one could readily produce a read-only capability to a particular file from a capability providing unlimited access to an entire file system. Attenuators are particularly useful for packaging access to the existing, non-capability oriented world into capabilities. In addition to the hierarchical file system wrapper just described, attenuators are helpful for mediating access to network communications (for example, limiting connections to particular domains, allowing applications to be securely distributed across datacenters without also enabling them to talk to arbitrary hosts on the internet – the sort of thing that would normally be regulated by firewall configuration, but without the operational overhead or administrative inconvenience). Another use would be controlling access to specific portions of the rendering surface of a display device, something that many window systems already do in an almost capability-like fashion anyway.
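The file system composition can be sketched in Python as below. The Directory class and read_only_file helper are invented for this example and assume a POSIX-style file system. The confinement check relies on os.path.realpath and os.path.commonpath, and ignores subtleties such as symlinks being swapped between the check and the open.

```python
import os
import tempfile


class Directory:
    """Capability to read and write files beneath one directory, and nowhere else."""

    def __init__(self, root):
        self._root = os.path.realpath(root)

    def _locate(self, relative_path):
        full = os.path.realpath(os.path.join(self._root, relative_path))
        # Reject anything that resolves outside the root ("..", absolute paths).
        if os.path.commonpath([self._root, full]) != self._root:
            raise PermissionError("path escapes this directory")
        return full

    def read(self, relative_path):
        with open(self._locate(relative_path), "rb") as f:
            return f.read()

    def write(self, relative_path, data):
        with open(self._locate(relative_path), "wb") as f:
            f.write(data)

    def subdirectory(self, relative_path):
        # Attenuate the range of resources: a narrower tree.
        return Directory(self._locate(relative_path))


def read_only_file(directory, relative_path):
    """Attenuate both range and operations: read one file, nothing more."""

    def read():
        return directory.read(relative_path)

    return read


with tempfile.TemporaryDirectory() as tmp:
    whole_tree = Directory(tmp)
    os.mkdir(os.path.join(tmp, "reports"))
    whole_tree.write("reports/q1.txt", b"revenue up")
    reports_only = whole_tree.subdirectory("reports")
    read_q1 = read_only_file(reports_only, "q1.txt")
    contents = read_q1()
```

Each step hands out strictly less than the step before it: the whole tree, then one sub-tree, then read access to one file.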

Abstraction

Abstraction enters the picture because once we have narrowed what authority a given capability represents, it often makes sense to refactor what it does into something with a more appropriately narrow set of affordances. For example, it might make sense to package the read-only file capability mentioned above into an input stream object, rather than something that represents a file per se. At this point you might ask if this is really any different from ordinary good object oriented programming practice. The short answer is, it’s not – capabilities and OOP are strongly aligned, as I’ve mentioned several times already. A somewhat longer answer is that the capability perspective usefully shapes how you design interfaces.

A core idea that capability enthusiasts use heavily is the Principle of Least Authority (abbreviated POLA, happily pronounceable). The principle states that objects should be given only the specific authority necessary to do their jobs, and no more. The idea is that the fewer things an object can do, the less harm can result if it misbehaves or if its integrity is breached.

Least Authority is related to the notions of Least Privilege or Least Permission that you’ll frequently see in a lot of the traditional (non-capability) security literature. In part, this difference in jargon is just a cultural marker that separates the two camps. Often the traditional literature will tell you that authority and permission and privilege all mean more or less the same thing.

However, we really do prefer to talk about “authority”, which we take to represent the full scope of what someone or something is able to do, whereas “permission” refers to a particular set of access settings. For example, on a Unix system I typically don’t have permission to modify the /etc/passwd file, but I do typically have permission to execute the passwd command, which does have permission to modify the file. This command will make selected changes to the file on my behalf, thus giving me the authority to change my password. We also think of authority in terms of what you can actually do. To continue the example of the passwd command, it has permission to delete the password file entirely, but it does not make this available to me, thus it does not convey that authority to me even though it could if it were programmed to do so.

The passwd command is an example of abstracting away the low level details of file access and data formats, instead repackaging them into a more specific set of operations that is more directly meaningful to its user. This kind of functionality refactoring is very natural from a programming perspective, but using it to also refactor access is awkward in the ACL case. ACL systems typically have to leverage slippery abstractions like the Unix setuid mechanism. Setuid is what makes the Unix passwd command possible in the first place, but it’s a potent source of confused deputy problems that’s difficult to use safely; an astonishing number of Unix security exploits over the years have involved setuid missteps. The ocap approach avoids these missteps because the appropriate reduction in authority often comes for free as a natural consequence of the straightforward implementation of the operation being provided.
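Here is a small Python sketch of the same refactoring done the ocap way. PasswordTable and make_password_changer are hypothetical names, and the table is only a stand-in for the real password file. The point is that the object handed to a user can change that one user’s password and has no way to express anything else.

```python
import hashlib
import os


class PasswordTable:
    """Hypothetical stand-in for the password file, with full authority over it."""

    def __init__(self):
        self.entries = {}

    def set_entry(self, user, salt, digest):
        self.entries[user] = (salt, digest)

    def delete_everything(self):
        self.entries.clear()


def make_password_changer(table, user):
    """Return a capability that can change one user's password and nothing else."""

    def change_password(new_password):
        if len(new_password) < 8:
            raise ValueError("password too short")
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac(
            "sha256", new_password.encode("utf-8"), salt, 100_000
        )
        table.set_entry(user, salt, digest)

    return change_password


table = PasswordTable()
change_alice_password = make_password_changer(table, "alice")
change_alice_password("correct horse battery")
```

The code holding change_alice_password has the authority to set Alice’s password. It has no authority to delete the table or touch another user’s entry, even though the table object itself permits both, and no setuid-style privilege switch is involved.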

Combination

When I talk about combination, I mean using two or more capabilities together to create a new capability to some specific joint functionality. In some cases, this is simply the intersection of two different authorities. However, the more interesting cases are when we put things together to create something truly new.

For example, imagine a smartphone running a capability oriented operating system instead of iOS or Android. The hardware features of such a phone would, of course, be accessed via capabilities, which the OS would hand out to applications according to configuration rules or user input. So we could imagine combining three important capabilities: the authority to capture images using the camera, the authority to obtain the device’s geographic position via its built-in GPS receiver, and the authority to read the system clock. These could be encapsulated inside an object, along with a (possibly manufacturer provided) private cryptographic key, yielding a new capability that when invoked provides signed, authenticated, time stamped, geo-referenced images from the camera. This capability could then be granted to applications that require high integrity imaging, like police body cameras, automobile dash cams, journalistic apps, and so on. If this capability is the only way for such applications to get access to the camera at all, then the applications’ developers don’t have to be trusted to maintain a secure chain of evidence for the imagery. This both simplifies their implementation task – they can focus their efforts on their applications’ unique needs instead of fiddling with signatures and image formats – and makes their output far more trustworthy, since they don’t have to prove their application code doesn’t tamper with the data (you still have to trust the phone and the OS, but that’s at least a separable problem).
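The combined camera capability could be sketched in Python roughly as follows. Everything here is hypothetical: the three input capabilities are modeled as plain functions, and an HMAC with a shared secret stands in for the private-key signature described above, only because Python’s standard library has no public-key signing. A real device would use an asymmetric signature so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json


def make_evidence_camera(capture_image, read_position, read_clock, secret_key):
    """Combine camera, GPS, and clock capabilities into one signed-image capability."""

    def capture():
        image = capture_image()
        record = {
            "image_sha256": hashlib.sha256(image).hexdigest(),
            "position": read_position(),
            "timestamp": read_clock(),
        }
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        signature = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
        return image, record, signature

    return capture


def verify(image, record, signature, secret_key):
    """Check that the record is authentic and that it describes this image."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(signature, expected)
        and record["image_sha256"] == hashlib.sha256(image).hexdigest()
    )


# Stub capabilities standing in for the real hardware.
key = b"device-secret"
camera = make_evidence_camera(
    capture_image=lambda: b"\x89PNG...pixels",
    read_position=lambda: [37.77, -122.42],
    read_clock=lambda: 1494115200,
    secret_key=key,
)
image, record, signature = camera()
```

An application granted only camera never sees the raw camera, the GPS, the clock, or the key, so it cannot produce an image that verifies with a forged time or place.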

What can we do with this?

I’ve talked at length about the virtues of the capability approach, but at the same time observed repeatedly (if only in criticism) that this is not how most contemporary systems work. So even if these ideas are as virtuous as I maintain they are, we’re still left with the question of what use we can make of them absent some counterfactual universe of technical wonderfulness.

There are several ways these ideas can provide direct value without first demanding that we replace the entire installed base of software that makes the world go. This is not to say that the installed base never gets replaced, but it’s a gradual, incremental process. It’s driven by small, local changes rather than by the unfolding of some kind of authoritative master plan. So here are a few incremental ways to apply these ideas to the current world. My hope is that these can deliver enough immediate value to bias practitioners in a positive direction, shaping the incentive landscape so it tilts towards a less dysfunctional software ecosystem. Four areas in particular seem salient to me in this regard: embedded systems, compartmentalized computation, distributed services, and software engineering practices.

Embedded systems

Capability principles are a very good way to organize an operating system. Two of the most noteworthy examples, in my opinion, are KeyKOS and seL4.

KeyKOS was developed in the 1980s for IBM mainframes by Key Logic, a spinoff from Tymshare. In addition to being a fully capability secure OS, it attained extraordinarily high reliability via an amazing, high performance orthogonal persistence mechanism that allowed processes to run indefinitely, surviving things like loss of power or hardware failure. Some commercial KeyKOS installations had processes that ran for years, in a few cases even spanning replacement of the underlying computer on which they were running. Towards the end of its commercial life, KeyKOS was also ported to several other processor architectures, making it a potentially interesting jumping off point for further development. KeyKOS has inspired a number of follow ons, including Eros, CapROS, and Coyotos. Unfortunately most of these efforts have been significantly resource starved and consequently have not yet had much real world impact. But the code for KeyKOS and its descendants is out there for the taking if anybody wants to give it a go.

seL4 is a secure variant of the L4 operating system, developed by NICTA in Australia. While building on the earlier L3 and L4 microkernels, seL4 is a from scratch design heavily influenced by KeyKOS. seL4 notably has a formal proof of functional correctness, making it an extremely sound basis for building secure and reliable systems. It’s starting to make promising inroads into applications that demand this kind of assurance, such as military avionics. Like KeyKOS, seL4, as well as seL4’s associated suite of proofs, is available as open source software.

Embedded systems, including much of the so called “Internet of Things”, are sometimes less constrained by installed base issues on account of being standalone products with narrow functionality, rather than general purpose computational systems. They often have fewer points where legacy interoperability is as important. Moreover, they’re usually cross-developed with tools that already expect the development and runtime environments to be completely different, allowing them to be bootstrapped via legacy toolchains. In other words, you don’t have to port your entire development system to the new OS in order to take advantage of it, but rather can continue using most of your existing tools and workflow processes. This is certainly true of the capability OS efforts I just mentioned, which have all dealt with these issues.

Furthermore, embedded software is often found in mission critical systems that must function reliably in a high threat environment. In these applications, reliability and security can take priority over cost minimization, making the assurances that a capability OS can offer comparatively more attractive. Consequently, using one of these operating systems as the basis for a new embedded application platform seems like an opportunity, particularly in areas where reliability is important.

A number of recent security incidents on the internet have revolved around compromised IoT devices. A big part of the problem is that the application code in these products typically has complete access to everything in the device, largely as a convenience to the developers. This massive violation of least privilege then makes these devices highly vulnerable to exploitation when an attacker finds flaws in the application code.

Rigorously compartmentalizing available functionality would greatly reduce the chances of these kinds of vulnerabilities, but this usually doesn’t happen. Partly this is just ignorance – most of these developers are not generally also security experts, especially when the things they are working on are not, on their face, security sensitive applications. However, I think a bigger issue is that the effort and inconvenience involved in building a secure system with current building blocks doesn’t seem justified by the payoff.

No doubt the developers of these products would prefer to produce more secure systems than they often do, all other things being equal, but all other things are rarely equal. One way to tilt the balance in our favor would be to give them a platform that more or less automatically delivers desirable security and reliability properties as a consequence of developers simply following the path of least resistance. This is the payoff that building on top of a capability OS offers.

Compartmentalized computation

Norm Hardy – one of the primary architects of KeyKOS, who I’ve already mentioned several times – has quipped that “the last piece of software anyone ever writes for a secure, capability OS is always the Unix virtualization layer.” This is a depressing testimony to the power that the installed base has over the software ecosystem. However, it also suggests an important benefit that these kinds of OS’s can provide, even in an era when Linux is the de facto standard.

In the new world of cloud computing, virtualization is increasingly how everything gets done. Safety-through-compartmentalization has long been one of the key selling points driving this trend. The idea is that even if an individual VM is compromised due to an exploitable flaw in the particular mix of application code, libraries, and OS services that it happens to be running, this does not gain the attacker access to other, adjacent VMs running on the same hardware.

The underlying idea – isolate independent pieces of computation so they can’t interfere with each other – is not new. It is to computer science what vision is to evolutionary biology, an immensely useful trick that gets reinvented over and over again in different contexts. In particular, it’s a key idea motivating the architecture of most multitasking operating systems in the first place. Process isolation has long been the standard way for keeping one application from messing up another. What virtualization brings to the table is to give application and service operators control over a raft of version and configuration management issues that were traditionally out of their hands, typically in the domain of the operators of the underlying systems on which they were running. Thus, for example, even if everyone in your company is using Linux it could still be the case that a service you manage depends on some Apache module that only works on Linux version X, while some other wing of your company has a service requiring a version of MySQL that only works with Linux version Y. But with virtualization you don’t need to fight about which version of Linux to run on your company server machines. Instead, you can each have your own VMs running whichever version you need. More significantly, even if the virtualization system itself requires Linux version Z, it’s still not a problem, because it’s at a different layer of abstraction.

Virtualization frees us not just from fights over which version of Linux to use, but from fights over which operating system to use at all. With virtualization you can run Linux on Windows, or Windows on Mac, or FreeBSD on Linux, or whatever. In particular, it means you can run Linux on seL4. This is interesting because all the mainstream operating systems have structural vulnerabilities that mean they inevitably tend to get breached, and when somebody gets into the OS that’s running the virtualization layer it means they get into all the hosted VMs as well, regardless of their OS. While it’s still early days, initial indications are that seL4 makes a much more solid base for the virtualization layer than Linux or the others, while still allowing the vast bulk of the code that needs to run to continue working in its familiar environment.

By providing a secure base for the virtualization layer, you can provide a safe place to stand for datacenter operators and other providers of virtualized services. You have to replace some of the software that manages your datacenter, but the datacenter’s customers don’t have to change anything to benefit from this; indeed, they need not even be aware that you’ve done it.

This idea of giving applications a secure place to run, a place where the rules make sense and critical invariants can be relied upon – what I like to call an island of sanity – is not limited to hardware virtualization. “Frozen Realms”, currently working its slow way through the JavaScript standardization process, is a proposal to apply ocap-based compartmentalization principles to the execution environment of JavaScript code in the web browser.

The stock JavaScript environment is highly plastic; code can rearrange, reconfigure, redefine, and extend what’s there to an extraordinary degree. This massive flexibility is both blessing and curse. On the blessing side, it’s just plain useful. In particular, a piece of code that relies on features or behavior from a newer version of the language standard can patch the environment of an older implementation to emulate the newer pieces that it needs (albeit sometimes with a substantial performance penalty). This malleability is essential to how the language evolves without breaking the web. On the other hand, it makes it treacherous to combine code from different providers, since it’s very easy for one chunk of code to undermine the invariants that another part relies on. This is a substantial maintenance burden for application developers, and especially for the creators of application frameworks and widely used libraries. And this is before we even consider what can happen if code behaves maliciously.

Frozen Realms is a scheme to let you create an isolated execution environment, configure it with whatever modifications and enhancements it requires, lock it down so that it is henceforth immutable, and then load and execute code within it. One of the goals of Frozen Realms is to enable defensively consistent software – code that can protect its invariants against arbitrary or improper behavior by things it’s interoperating with. In a frozen realm, you can rely on things not to change beneath you unpredictably. In particular, you could load independent pieces of software from separate developers (who perhaps don’t entirely trust each other) into a common realm, and then allow these to interact safely. Ocaps are key to making this work. All of the ocap coding patterns mentioned earlier become available as trustworthy tools, since the requisite invariants are guaranteed. Because the environment is immutable, the only way pieces of code can affect each other is via object references they pass between them. Because all external authority enters the environment via object references originating outside it, rather than being ambiently available, you have control over what any piece of code will be allowed to do. Most significantly, you can have assurances about what it will not be allowed to do.

Distributed services

There are many problems in the distributed services arena for which the capability approach can be helpful. In the interest of not making this already long essay even longer, I’ll just talk here about one of the most important: the service chaining problem, for which the ACL approach has no satisfactory solution at all.

The web is a vast ecosystem of services using services using services. This is especially true in the corporate world, where companies routinely contract with specialized service providers to administer non-core operational functions like benefits, payroll, travel, and so on. These service providers often call upon even more specialized services from a range of other providers. Thus, for example, booking a business trip may involve going through your company’s corporate intranet to the website of the company’s contracted travel agency, which in turn invokes services provided by airlines, hotels, and car rental companies to book reservations or purchase tickets. Those services may themselves call out to yet other services to do things like email you your travel itinerary or arrange to send you a text if your flight is delayed.

Now we have the question: if you invoke one service that makes use of another, whose credentials should be used to access the second one? If the upstream service uses its own credentials, then it might be fooled, by intention or accident, into doing something on your behalf that it is allowed to do but which the downstream service wouldn’t let you do (a classic instance of the Confused Deputy problem). On the other hand, if the upstream service needs your credentials to invoke the downstream service, it can now do things that you wouldn’t allow. In fact, by giving it your credentials, you’ve empowered it to impersonate you however it likes. And the same issues arise for each service invocation farther down the chain.
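The first horn of this dilemma can be shown in a few lines of Python. The Storage class, the report services, and the file names are all made up for illustration. Storage offers both an identity-checked write and a way to mint a capability for a single path, so the two styles can be compared side by side.

```python
class Storage:
    """Hypothetical downstream service offering both styles of access."""

    def __init__(self, acl):
        self.files = {}
        self._acl = acl  # path -> identities allowed to write it

    def write_as(self, identity, path, data):
        # ACL style: the decision depends on who is asking.
        if identity not in self._acl.get(path, set()):
            raise PermissionError(f"{identity} may not write {path}")
        self.files[path] = data

    def writer_for(self, path):
        # Ocap style: mint a capability that can write this one path only.
        def write(data):
            self.files[path] = data

        return write


def report_service_acl(storage, output_path):
    # Upstream service using its own credentials: the caller picks the path,
    # but the permission check is made against the service's identity.
    storage.write_as("report-service", output_path, "quarterly report")


def report_service_ocap(write_output):
    # The caller supplies the authority together with the designation.
    write_output("quarterly report")


storage = Storage(
    {
        "billing.log": {"report-service"},
        "alice/report.txt": {"alice", "report-service"},
    }
)
storage.files["billing.log"] = "alice owes $100"

# Alice cannot write billing.log herself, but she can name it as her output
# and the service, acting under its own identity, overwrites it for her.
report_service_acl(storage, "billing.log")
```

In the ocap version Alice can only hand over a writer she actually holds, such as storage.writer_for("alice/report.txt"), so there is no path name for the service to be confused about. How Alice comes to hold that writer is outside the scope of this sketch.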

Consider, for example, a service like Mint that keeps track of your finances and gives you money management advice. In order to do this, they need to access banks, brokerages, and credit card companies to obtain your financial data. When you sign up for Mint, you give them the login names and passwords for all your accounts at all the financial institutions you do business with, so they can fetch your information and organize it for you. While they promise they’re only going to use these credentials to read your data, you’re actually giving them unlimited access and trusting them not to abuse it. There’s no reason to believe they have any intention of breaking their promise, and they do, in fact, take security very seriously. But in the end the guarantee you get comes down to “we’re a big company, if we messed with you too badly we might get in trouble”; there are no technical assurances they can really provide. Instead, they display the logos of various security promoting consortia and double pinky swear they’ll guard your credentials with like really strong encryption and stuff. Moreover, their terms of use work really hard to persuade you that you have no recourse if they fail (though who actually gets stuck holding the bag in the event they have a major security breach is, I suspect, virgin territory, legally speaking).

While I’m quite critical of them here, I’m not actually writing this to beat up on them. I’m reasonably confident (and I say this without knowing or having spoken to anyone there, merely on the basis of having been in management at various companies myself) that they would strongly prefer not to be in the business of running a giant single point of failure for millions of people’s finances. It’s just that given the legacy security architecture of the web, they have no practical alternative, so they accept the risk as a cost of doing business, and then invest in a lot of complex, messy, and very expensive security infrastructure to try to minimize that risk.

To someone steeped in capability concepts, the idea that you would willingly give strangers on the web unlimited access to all your financial accounts seems like madness. I suspect it also seems like madness to lay people who haven’t drunk the conventional computer security kool aid, evidence of which is the lengths Mint has to go to in its marketing materials trying to persuade prospective customers that, no really, this is OK, trust us, please pay no attention to the elephant in the room.

The capability alternative (which, I stress, is not an option currently available to you), would be to obtain a separate credential – a capability! – from each of your financial institutions that you could pass along to a data management service like Mint. These credentials would grant read access to the relevant portions of your data, while providing no other authority. They would also be revocable, so that you could unilaterally withdraw this access later, say in the event of a security breach at the data manager, without disrupting your own normal access to your financial institutions. And there would be distinct credentials to give to each data manager that you use (say, you’re trying out a competitor to Mint) so that they could be independently controlled.
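As a sketch of what such credentials might look like, the Python below models a bank that issues unguessable bearer tokens, each granting read access to one account and each independently revocable. The Bank class and its methods are invented for this example. Who is allowed to call grant_read_access and revoke (in practice, the authenticated account owner) is deliberately left out.

```python
import secrets


class Bank:
    """Hypothetical financial institution issuing read-only capabilities."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._grants = {}  # token -> account it can read

    def grant_read_access(self, account):
        # An unguessable token: holding it is the authority to read.
        token = secrets.token_urlsafe(32)
        self._grants[token] = account
        return token

    def revoke(self, token):
        self._grants.pop(token, None)

    def read_balance(self, token):
        account = self._grants.get(token)
        if account is None:
            raise PermissionError("unknown or revoked capability")
        return self._balances[account]


bank = Bank({"alice-checking": 1250})

# One credential per data manager, so each can be withdrawn separately.
token_for_mint = bank.grant_read_access("alice-checking")
token_for_rival = bank.grant_read_access("alice-checking")

bank.revoke(token_for_mint)  # say, after a breach at the first data manager
bank.read_balance(token_for_rival)  # the second grant is unaffected
```

Neither token can move money or change settings, because the only operation that accepts a token is read_balance, and the owner’s own login is untouched by any revocation.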

There are no particular technical obstacles to doing any of this. Alan Karp worked out much of the mechanics at HP Labs in a very informative paper called “Zebra Copy: A reference implementation of federated access management” that should be on everyone’s must read list.

Even with existing infrastructure and tools there are many available implementation options. Alan worked it out using SAML certificates, but you can do this just as well with OAuth2 bearer tokens, or even just special URLs. There are some new user interface things that would have to be done to make this easy and relatively transparent for users, but there’s been a fair bit of experimentation and prototyping done in this area that has pointed to a number of very satisfactory and practical UI options. The real problem is that the various providers and consumers of data and services would all have to agree that this new way of doing things is desirable, and then commit to switching over to it, and then standardize the protocols involved, and then actually change their systems, whereas the status quo doesn’t require any such coordination. In other words, we’re back to the installed base problem mentioned earlier.

However, web companies and other large enterprises are constantly developing and deploying hordes of new services that need to interoperate, so even if the capability approach is an overreach for the incumbents, it looks to me like a competitive opportunity for ambitious upstarts. Enterprises in particular currently lack a satisfactory service chaining solution, even though they’re in dire need of one.

In practice, the main defense against bad things happening in these sorts of systems is not the access control mechanisms at all, it’s the contractual obligations between the various parties. This can be adequate for big companies doing business with other big companies, but it’s not a sound basis for a robust service ecosystem. In particular, you’d like software developers to be able to build services by combining other services without the involvement of lawyers. And in any case, when something does go wrong, with or without lawyers it can be hard to determine who to hold at fault because confused deputy problems are rooted in losing track of who was trying to do what. In essence we have engineered everything with built in accountability laundering.

ACL proponents typically try to patch the inherent problems of identity-based access controls (that is, ones rooted in the “who are you?” question) by piling on more complicated mechanisms like role-based access control or attribute-based access control or policy-based access control (Google these if you want to know more; they’re not worth describing here). None of these schemes actually solves the problem, because they’re all at heart just variations of the same one broken thing: ambient authority. I think it’s time for somebody to try to get away from that.

Software engineering practices

At Electric Communities we set out to create technology for building a fully decentralized, fully user extensible virtual world. By “fully decentralized” we meant that different people and organizations could run different parts of the world on their own machines for specific purposes or with specific creative intentions of their own. By “fully user extensible” we meant that folks could add new objects to the world over time, and not just new objects but new kinds of objects.

Accomplishing this requires solving some rather hairy authority management problems. One example that we used as a touchstone was the notion of a fantasy role playing game environment adjacent to an online stock exchange. Obviously you don’t want someone taking their dwarf axe into the stock exchange and whacking people’s heads off, nor do you want a stock trader visiting the RPG during their lunch break to have their portfolio stolen by brigands. While interconnecting these two disparate applications doesn’t actually make a lot of sense, it does vividly capture the flavor of problems we were trying to solve. For example, if the dwarf axe is a user programmed object, how does it acquire the ability to whack people’s heads off in one place but not have that ability in another place?

Naturally, ocaps became our power tool of choice, and lots of interesting and innovative technology resulted (the E programming language is one notable example that actually made it to the outside world). However, all the necessary infrastructure was ridiculously ambitious, and consequently development was expensive and time consuming. Unfortunately, extensible virtual world technology was not actually something the market desperately craved, so being expensive and time consuming was a problem. As a result, the company was forced to pivot for survival, turning its attentions to other, related businesses and applications we hoped would be more viable.

I mention all of the above as preamble to what happened next. When we shifted to other lines of business, we did so with a portfolio of aggressive technologies and paranoid techniques forged in the crucible of this outrageous virtual world project. We went on using these tools mainly because they were familiar and readily at hand, rather than due to some profound motivating principle, though we did retain our embrace of the ocap mindset. However, when we did this, we made an unexpected discovery: code produced with these tools and techniques had greater odds of being correct on the first try compared to historical experience. A higher proportion of development time went to designing and coding, a smaller fraction of time to debugging, things came together much more quickly, and the resulting product tended to be considerably more robust and reliable. Hmmm, it seems we were onto something here. The key insight is that measures that prevent deliberate misbehavior tend to be good at preventing accidental misbehavior also. Since a bug is, almost by definition, a form of accidental misbehavior, the result was fewer bugs.

Shorn of all the exotic technology, however, the principles at work here are quite simple and very easy to apply to ordinary programming practice, though the consequences you experience may be far reaching.

As an example, I’ll explain how we apply these ideas to Java. There’s nothing magic or special about Java per se, other than it can be tamed – some languages cannot – and that we’ve had a lot of experience doing so. In Java we can reduce most of it to three simple rules:

  • Rule #1: All instance variables must be private
  • Rule #2: No mutable static state or statically accessible authority
  • Rule #3: No mutable state accessible across thread boundaries

That’s basically it, though Rule #2 does merit a little further explication. Rule #2 means that all static variables must be declared final and may only reference objects that are themselves transitively immutable. Moreover, constructors and static methods must not provide access to any mutable state or side-effects on the outside world (such as I/O), unless these are obtained via objects passed into them as parameters.
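To make the rules a bit more concrete, here is a minimal sketch of a class that follows them. The class and its names (Greeter, sink, and so on) are invented for illustration and don't come from any real library. It also assumes each Greeter is used from a single thread, which is what keeps it within Rule #3.

```java
import java.util.function.Consumer;

// Hypothetical class for illustration; none of these names come from a real library.
final class Greeter {
    // Rule #2: the only static state is final and immutable (a String constant).
    private static final String PREFIX = "Hello, ";

    // Rule #1: every instance variable is private.
    private final Consumer<String> sink; // output authority, handed in by the caller
    private int greetingsSent;           // mutable, but confined to this one instance

    // The constructor performs no I/O and reaches for no global state;
    // whatever effect a Greeter has on the world goes through 'sink'.
    Greeter(Consumer<String> sink) {
        this.sink = sink;
    }

    void greet(String name) {
        sink.accept(PREFIX + name);
        greetingsSent++;
    }

    int greetingsSent() {
        return greetingsSent;
    }
}

final class GreeterDemo {
    public static void main(String[] args) {
        // Only the program's entry point decides that output goes to the console.
        Greeter greeter = new Greeter(System.out::println);
        greeter.greet("world");
    }
}
```

The point of the shape is that a Greeter can only affect the world through the sink it was handed, so whoever constructs it decides how much authority it gets.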

These rules simply ensure the qualities of reference unforgeability and encapsulation that I mentioned a while back. The avoidance of static state and static authority is because Java class names constitute a form of forgeable reference, since anyone can access a class just by typing its name. For example, anybody at all can try to read a file by creating an instance of java.io.FileInputStream, since this will open a file given a string. The only limitations on opening a file this way are imposed by the operating system’s ACL mechanism, the very thing we are trying to avoid relying on. On the other hand, a specific instance of java.io.InputStream is essentially a read capability, since the only authority it exposes is on its instance methods.
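The contrast between a file name and a stream instance can be sketched roughly as follows. LineCounter and countNewlines are hypothetical names made up for this example; the only real APIs used are the standard java.io classes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical utility for illustration only.
final class LineCounter {
    private LineCounter() {}

    // An ambient-authority version would accept a String path and call
    // new FileInputStream(path) itself, which lets it try to open any file at all.
    // This version can read only the one stream the caller chose to hand over.
    static int countNewlines(InputStream in) {
        try {
            int count = 0;
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class LineCounterDemo {
    public static void main(String[] args) {
        // An in-memory stream stands in for whatever source the caller decides to grant.
        InputStream in = new ByteArrayInputStream(
                "one\ntwo\nthree\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(LineCounter.countNewlines(in));
    }
}
```

Because the method never names a file, it works unchanged on a socket, a test fixture, or a deliberately restricted wrapper around some other stream.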

These rules cover most of what you need. If you want to get really extreme about having a pure ocap language, in Java there are a few additional edge case things you’d like to be careful of. And, of course, it would also be nice if the rules could be enforced automatically by your development tools. If your thinking runs along these lines, I highly recommend checking out Adrian Mettler’s Joe-E, which defines a pure ocap subset of Java much more rigorously, and provides an Eclipse plugin that supports it. However, simply following these three rules in your ordinary coding will give you about 90% of the benefits if what you care about is improving your code rather than security per se.

Applying these rules in practice will change your code in profound ways. In particular, many of the standard Java class libraries don’t follow them – for example lots of I/O authority is accessed via constructors and static methods. In practice, what you do is quarantine all the unavoidable violations of Rule #2 into carefully considered factory classes that you use during your program’s startup phase. This can feel awkward at first, but it’s an experience rather like using a strongly typed programming language for the first time: in the beginning you keep wondering why you’re blocked from doing obvious things you want to do, and then it slowly dawns on you that actually you’ve been blocked from doing things that tend to get you into trouble. Plus, the discipline forces you to think through things like your I/O architecture, and the result is generally improved structure and greater robustness.
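One way the quarantine might look, as a rough sketch rather than a prescription: a single startup class holds the unavoidable Rule #2 violation, and everything else receives only the objects it hands out. StartupFiles, Banner, and StartupDemo are hypothetical names invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical names throughout. This is the quarantined Rule #2 violation:
// the one class allowed to turn a file name into a live stream.
final class StartupFiles {
    private StartupFiles() {}

    static InputStream openForRead(Path file) {
        try {
            return Files.newInputStream(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Ordinary code: it can read the one stream it was given and nothing else.
final class Banner {
    private final String text;

    Banner(InputStream source) {
        this.text = readFirstLine(source);
    }

    String text() {
        return text;
    }

    // Reads the first line of the stream (empty string if there is none) and closes it.
    private static String readFirstLine(InputStream source) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line = reader.readLine();
            return (line == null) ? "" : line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class StartupDemo {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: StartupDemo <banner-file>");
            return;
        }
        // Authority is minted once, here at startup, and then passed inward.
        Banner banner = new Banner(StartupFiles.openForRead(Paths.get(args[0])));
        System.out.println(banner.text());
    }
}
```

Auditing what the program can touch then largely comes down to reading the startup code and following where each object is passed.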

(Dealing with the standard Java class libraries is a bit of an open issue. The approach taken by Joe-E and its brethren has been to use the standard libraries pruned of the dangerous stuff, a process we call “taming”. But while this yields safety, it’s less than ideal from an ergonomics perspective. A project to produce a good set of capability-oriented wrappers for the functionality in the core Java library classes would probably be a valuable contribution to the community, if anyone out there is so inclined.)

Something like the three rules for Java can often be devised for other languages as well, though the feasibility of this does vary quite a bit depending on how disciplined the language in question is. For example, people have done this for Scala and OCaml, and it should be quite straightforward for C#, but probably hopeless for PHP or Fortran. Whether C++ is redeemable in this sense is an open question; it seems plausible to me, although the requisite discipline somewhat cuts against the grain of how people use C++. It’s definitely possible for JavaScript, as a number of features in recent versions of the language standard were put there expressly to enable this kind of thing. It’s probably also worth pointing out that there’s a vibrant community of open source developers creating new languages that apply these ideas. In particular, you should check out Monte, which takes Python as its jumping off point, and Pony, which is really its own thing but very promising.

There’s a fairly soft boundary here between practices that simply improve the robustness and reliability of your code if you follow them, and things that actively block various species of bad outcomes from happening. Obviously, the stronger the discipline enforced by the tools is, the stronger the assurances you’ll have about the resulting product. Once again, the analogy to data types comes to mind, where there are best practices that are basically just conventions to be followed, and then there are things enforced, to a greater or lesser degree, by the programming language itself. From my own perspective, the good news is that in the short term you can start applying these practices in the places where it’s practical to do so and get immediate benefit, without having to change everything else. In the long term, I expect language, library, and framework developers to deliver us increasingly strong tools that will enforce these kinds of rules natively.

Conclusion

At its heart, the capability paradigm is about organizing access control around specific acts of authorization rather than around identity. Identity is so fundamental to how we interact with each other as human beings, and how we have historically interacted with our institutions, that it is easy to automatically assume it should be central to organizing our interactions with our tools as well. But identity is not always the best place to start, because it often fails to tell us what we really need to know to make an access decision (plus, it often says far too much about other things, but that’s an entirely separate discussion).

Organizing access control around the question “who are you?” is incoherent, because the answer is fundamentally fuzzy. The driving intuition is that the human who clicked a button or typed a command is the person who wanted whatever it was to happen. But this is not obviously true, and in some important cases it’s not true at all. Consider, for example, interacting with a website using your browser. Who is the intentional agent in this scenario? Well, obviously there’s you. But also there are the authors of every piece of software that sits between you and the site. This includes your operating system, the web browser itself, the web server, its operating system, and any other major subsystems involved on your end or theirs, plus the thousands of commercial and open source libraries that have been incorporated into these systems. And possibly other stuff running on your (or their) computers at the time. Plus intermediaries like your household or corporate wireless access point, not to mention endless proxies, routers, switches, and whatnot in the communications path from here to there. And since you’re on a web page, there’s whatever scripting is on the page itself, which includes not only the main content provided by the site operators but any of the other third party stuff one typically finds on the web, such as ads, plus another unpredictably large bundle of libraries and frameworks that were used to cobble the whole site together. Is it really correct to say that any action taken by this vast pile of software was taken by you? Even though the software has literally tens of thousands of authors with a breathtakingly wide scope of interests and objectives? Do you really want to grant all those people the power to act as you? I’m fairly sure you don’t, but that’s pretty much what you’re actually doing, quite possibly thousands of times per day. The question that the capability crowd keeps asking is, “why?”

Several years ago, the computer security guru Marcus Ranum wrote: “After all, if the conventional wisdom was working, the rate of systems being compromised would be going down, wouldn’t it?” I have no idea where he stands on capabilities, nor if he’s even aware of them, but this assertion still seems on point to me.

I’m on record comparing the current state of computer security to the state of medicine at the dawn of the germ theory of disease. I’d like to think of capabilities as computer security’s germ theory. The analogy is imperfect, of course, since germ theory talks about causality whereas here we’re talking about the right sort of building blocks to use. But I keep being drawn back to the parallel largely because of the ugly and painful history of germ theory’s slow acceptance. On countless occasions I’ve presented capability ideas to folks who I think ought to know about them – security people, engineers, bosses. The typical response is not argument, but indifference. The most common pushback, when it happens, is some variation of “you may well be right, but…”, usually followed by some expression of helplessness or passive acceptance of the status quo. I’ve had people enthusiastically agree with everything I’ve said, then go on to behave as if these ideas had never ever entered their brains. People have trouble absorbing ideas that they don’t already have at least some tentative place for in their mental model of the world; this is just how human minds work. My hope is that some of the stuff I’ve written here will have given these ideas a toehold in your head.

Acknowledgements

This essay benefitted from a lot of helpful feedback from various members of the Capabilities Mafia, the Friam group, and the cap-talk mailing list, notably David Bruant, Raoul Duke, Bill Frantz, Norm Hardy, Carl Hewitt, Chris Hibbert, Baldur Jóhannsson, Alan Karp, Kris Kowal, William Leslie, Mark Miller, David Nicol, Kevin Reid, and Dale Schumacher. My thanks to all of them, whose collective input improved things considerably, though of course any remaining errors and infelicities are mine.

February 7, 2017

Open Source Lucasfilm’s Habitat Restoration Underway

[Photos: Habitat Frontyard and Project Hub, taken 12/30/2017]

It’s all open source!

Yes – if you haven’t heard, we’ve got the core of the first graphical MMO/VW up and running and the project needs help with code, tools, doc, and world restoration.

I’m leading the effort, with Chip leading the underlying modern server: the Elko project – the Nth generation gaming server, still implementing the basic object model from the original game.

http://neohabitat.org is the root of it all.
http://slack.neohabitat.org to join the project team Slack.
http://github.com/frandallfarmer/neohabitat to fork the repo.

To contribute, you should be able to use a shell, fork a repo, build it, and run it. Current developers use: shell, Eclipse, Vagrant, or Docker.

To get access to the demo server (not at all bullet proofed) join the project.

We’ve had people from around the world in there already! (See the photos)

http://neohabitat.org #opensource #c64 #themade

[Photos: Habitat Turf and Habitat Beach, taken 12/30/2017]

October 14, 2016

Software Crisis: The Next Generation

tl;dr: If you consider the current state of the art in software alongside current trends in the tech business, it’s hard not to conclude: Oh My God We’re All Gonna Die. I think we can fix things so we don’t die.

Marc Andreessen famously says software is eating the world.

His analysis is basically just a confirmation of a bunch of my long standing biases, so of course I think he is completely right about this. Also, I’m a software guy, so naturally I would think this is a natural way for the world to be.

And it’s observably true: an increasing fraction of everything that’s physically or socially or economically important in our world is turning into software.

The problem with this is the software itself. Most of this software is crap.

And I don’t mean mere Sturgeon’s Law (“90% of everything is crap”) levels of crap, either. I’d put the threshold much higher. How much? I don’t know, maybe 99.9%? But then, I’m an optimist.

This is one of the dirty little secrets of our industry, spoken about among developers with a kind of masochistic glee whenever they gather to talk shop, but little understood or appreciated by outsiders.

Anybody who’s seen the systems inside a major tech company knows this is true. Or a minor tech company. Or the insides of any product with a software component. It’s especially bad in the products of non-tech companies; they’re run by people who are even more removed from engineering reality than tech executives (who themselves tend to be pretty far removed, even if they came up through the technical ranks originally, or indeed even if what they do is oversee technical things on a daily basis). But I’m not here to talk about dysfunctional corporate cultures, as entertaining as that always is.

The reason this stuff is crap is far more basic. It’s because better-than-crap costs a lot more, and crap is usually sufficient. And I’m not even prepared to argue, from a purely Darwinian, return on investment basis, that we’ve actually made this tradeoff wrong, whether we’re talking about the ROI of a specific company or about human civilization as a whole. Every dollar put into making software less crappy can’t be spent on other things we might also want, the list of which is basically endless. From the perspective of evolutionary biology, good enough is good enough.

But… (and you knew there was a “but”, right?)

Our economy’s ferocious appetite for software has produced teeming masses of developers who only know how to produce crap. And tooling optimized for producing more crap faster. And methodologies and processes organized around these crap producing developers with their crap producing tools. Because we want all the cool new stuff, and the cool new stuff needs software, and crappy software is good enough. And like I said, that’s OK, at least for a lot of it. If Facebook loses your post every now and then, or your Netflix feed dies and your movie gets interrupted, or if your web browser gets jammed up by some clickbait website you got fooled into visiting, well, all of these things are irritating, but rarely of lasting consequence. Besides, it’s not like you paid very much (if you paid anything at all) for that thing that didn’t work quite as well as you might wish, so what’s your grounds for complaint?

But now, like Andreessen says, software is eating the world. And the software is crap. So the world is being eaten by crap.

And still, this wouldn’t be so bad, if the crap wasn’t starting to seep into things that actually matter.

A leading indicator of what’s to come is the state of computer security. We’re seeing an alarming rise in big security breaches, each more epic than the last, to the point where they’re hardly news any more. Target has 70 million customers’ credit card and identity information stolen. Oh no! Security clearance and personal data for 21.5 million federal employees is taken from the Office of Personnel Management. How unfortunate. Somebody breaks into Yahoo! and makes off with a billion or so account records with password hashes and answers to security questions. Ho hum. And we regularly see survey articles like “Top 10 Security Breaches of 2015”. I Googled “major security breaches” and the autocompletes it offered were “2014”, “2015”, and “2016”.

And then this past month we had the website of security pundit Brian Krebs taken down by a distributed denial of service attack originating in a botnet made of a million or so compromised IoT devices (many of them, ironically, security cameras), an attack so extreme it got him evicted by his hosting provider, Akamai, whose business is protecting its customers against DDOS attacks.

Here we’re starting to get bleedover between the world where crap is good enough, and the world where crap kills. Obviously, something serious, like an implanted medical device — a pacemaker or an insulin pump, say — has to have software that’s not crap. If your pacemaker glitches up, you can die. If somebody hacks into your insulin pump, they can fiddle with the settings and kill you. For these things, crap just won’t do. Except of course, the software in those devices is still mostly crap anyway, and we’ll come back to that in a moment. But you can at least make the argument that being crap-free is an actual requirement here and people will (or anyway should) take this argument seriously. A $60 web-enabled security camera, on the other hand, doesn’t seem to have these kinds of life-or-death entanglements. Almost certainly this was not something its developers gave much thought to. But consider Krebs’ DDOS — that was possible because the devices used to do it had software flaws that let them be taken over and repurposed as attack bots. In this case, they were mainly used to get some attention. It was noisy and expensive, but mostly just grabbed some headlines. But the same machinery could have as easily been used to clobber the IT systems of a hospital emergency room, or some other kind of critical infrastructure, and now we’re talking consequences that matter.

The potential for those kinds of much more serious impacts has not been lost on the people who think and talk about computer security. But while they’ve done a lot of hand wringing, very few of them are asking a much more fundamental question: Why is this kind of foolishness even possible in the first place? It’s just assumed that these kinds of security incidents will inevitably happen from time to time, part of the price of having a technological civilization. Certainly they say we need to try harder to secure our systems, but it’s largely accepted without question that this is just how things are. Psychologists have a term for this kind of thinking. They call it learned helplessness.

Another example: every few months, it seems, we’re greeted with yet another study by researchers shocked (shocked!) to discover how readily people will plug random USB sticks they find into their computers. Depending on the spin of the coverage, this is variously represented as “look at those stupid people, har, har, har” or “how can we train people not to do that?” There seems to be a pervasive idea in the computer security world that maybe we can fix our problems by getting better users. My question is: why blame the users? Why the hell shouldn’t it be perfectly OK to plug in a random USB stick you found? For that matter, why is the overwhelming majority of malware even possible at all? Why shouldn’t you be able to visit a random web site, click on a link in a strange email, or for that matter run any damn executable you happen to stumble across? Why should anything bad happen if you do those things? The solutions have been known since at least the mid ’70s, but it’s a struggle to get security and software professionals to pay attention. I feel like we’re trapped in the world of medicine prior to the germ theory of disease. It’s like it’s 1870 and a few lone voices are crying, “hey, doctors, wash your hands” and the doctors are all like, “wut?”. The very people who’ve made it their job to protect us from this evil have internalized the crap as normal and can’t even imagine things being any other way. Another telling, albeit horrifying, fact: a lot of malware isn’t even bothering to exploit bugs like buffer overflows and whatnot, a lot of it is just using the normal APIs in the normal ways they were intended to be used. It’s not just the implementations that are flawed, the very designs are crap.

But let’s turn our attention back to those medical devices. Here you’d expect people to be really careful, and indeed in many cases they have tried, but even so you still have headlines about terrible vulnerabilities in insulin pumps and pacemakers. And cars. And even mundane but important things like hotel door locks. And basically just think of some random item of technology that you’d imagine needs to be secure and Google “<fill in the blank> security vulnerability” and you’ll generally find something horrible.

In the current tech ecosystem, non-crap is treated as an edge case, dealt with by hand on a by-exception basis. Basically, when it’s really needed, quality is put in by brute force. This makes non-crap crazy expensive, so the cost is only justified in extreme use cases (say, avionics). Companies that produce a lot of software have software QA organizations within them who do make an effort, but current software QA practices are dysfunctional much like contemporary security practices are. There’s a big emphasis on testing, often with the idea you can use statistical metrics for quality, which works for assembling Toyotas but not so much for software because software is pathologically non-linear. The QA folks at companies where I’ve worked have been some of the most dedicated & capable people there, but generally speaking they’re ultimately unsuccessful. The issue is not a lack of diligence or competence; it’s that the underlying problem is just too big and complicated. And there’s no appetite for the kinds of heavyweight processes that can sometimes help, either from the people paying the bills or from developers themselves.

One of the reasons things are so bad is that the core infrastructure that we rely on — programming languages, operating systems, network protocols — predates our current need for software to actually not be crap.

In historical terms, today’s open ecosystem was an unanticipated development. Well, lots of people anticipated it, but few people in positions of responsibility took them very seriously. Arguably we took a wrong fork in the road sometime vaguely in the 1970s, but hindsight is not very helpful. Anyway, we now have a giant installed base and replacing it is a boil the ocean problem.

Back in the late ’60s there started to be a lot of talk about what came to be called “the software crisis”, which was basically the same problem but without malware or the Internet.

Back then, the big concern was that hardware had advanced faster than our ability to produce software for it. Bigger, faster computers and all that. People were worried about the problem of complexity and correctness, but also the problem of “who’s going to write all the software we’ll need?”. We sort of solved the first problem by settling for crap, and we sort of solved the second problem by making it easier to produce that crap, which meant more people could do it. But we never really figured out how to make it easy to produce non-crap, or to protect ourselves from the crap that was already there, and so the crisis didn’t so much go away as get swept under the rug.

Now we see the real problem is that the scope and complexity of what we’ve asked the machines to do has exceeded the bounds of our coping ability, while at the same time our dependence on these systems has grown to the point where we really, really, really can’t live without them. This is the software crisis coming back in a newer, more virulent form.

Basically, if we stop using all this software-entangled technology (as if we could do that — hey, there’s nobody in the driver’s seat here), civilization collapses and we all die. If we keep using it, we are increasingly at risk of a catastrophic failure that kills us by accident, or a catastrophic vulnerability where some asshole kills us on purpose.

I don’t want to die. I assume you don’t either. So what do we do?

We have to accept that we can’t really be 100% crap free, because we are fallible. But we can certainly arrange to radically limit the scope of damage available to any particular piece of crap, which should vastly reduce systemic crappiness.

I see a three pronged strategy:

1. Embrace a much more aggressive and fine-grained level of software compartmentalization.

2. Actively deprecate coding and architectural patterns that we know lead to crap, while developing tooling (frameworks, libraries, etc.) that makes better practices the path of least resistance for developers.

3. Work to move formal verification techniques out of ivory tower academia and into the center of the practical developer’s work flow.

Each of these corresponds to a bigger bundle of specific technical proposals that I won’t unpack here, as I suspect I’m already taxing a lot of readers’ attention spans. I do hope to go into these more deeply in future postings. I will say a few things now about overall strategy, though.

There have been, and continue to be, a number of interesting initiatives to try to build a reliable, secure computing infrastructure from the bottom up. A couple of my favorites are the Midori project at Microsoft, which has, alas, gone to the great source code repo in the sky (full disclosure: I did a little bit of work for the Midori group a few years back) and the CTSRD project at the University of Cambridge, still among the living. But while these have spun off useful, relevant technology, they haven’t really tried to take a run at the installed base problem. And that problem is significant and real.

A more plausible (to me) approach has been laid out by Mark Miller at Google with his efforts to secure the JavaScript environment, notably Caja and Secure EcmaScript (SES) (also full disclosure: I’m one of the co-champions, along with Mark, and Caridy Patiño of Salesforce, of a SES-related proposal, Frozen Realms, currently working its way through the JavaScript standardization process). Mark advocates an incremental, top down approach: deliver an environment that supports creating and running defensively consistent application code, one that we can ensure will be secure and reliable as long as the underlying computational substrate (initially, a web browser) hasn’t itself been compromised in some way. This gives application developers a sane place to stand, and begins delivering immediate benefit. Then use this to drive demand for securing the next layer down, and then the next, and so on. This approach doesn’t require us to replace everything at once, which I think means it has much higher odds of success.

You may have noticed this essay has tended to weave back and forth between software quality issues and computer security issues. This is not a coincidence, as these two things are joined at the hip. Crappy software is software that misbehaves in some way. The problem is the misbehavior itself and not so much whether this misbehavior is accidental or deliberate. Consequently, things that constrain misbehavior help with both quality and security. What we need to do is get to work adding such constraints to our world.

October 27, 2014

The Bureaucratic Failure Mode Pattern

When we try to take purposeful action within an organization (or even in our lives more generally), we often find ourselves blocked or slowed by various bits of seemingly unrelated process that must first be satisfied before we are allowed to move forward. Some of these were put in place very deliberately, while others just grew more or less organically, but what they often have in common, aside from increasing the friction of activity, is that they seem disconnected from our ultimate purpose. If I want to drive my car to work, having to register my car with the DMV seems like a mechanically unnecessary step (regardless of what the real underlying reason for it may be).

Note that I’m not talking about the intrinsic difficulty or inconvenience of the process itself (car registration might entail waiting around for several hours in the DMV office or it might be 30 seconds online with a web page, for example), but the cost imposed by the mere existence of the need to report information or get permission or put things in some particular way just so or align or coordinate with some other thing (and the concomitant need to know that you are supposed to do whatever it is, and the need to know or find out how). Each of these is a friction factor; the competence or user-friendliness of whatever necessary procedure is involved may influence the magnitude of the inconvenience, but not the fact of it. (Other recursive friction factors embedded in the organizations or processes behind these things may well figure into why many of them are in fact incompetently executed or needlessly complex or time consuming, but that is a separate matter.)

Over time, organizations tend to acquire these bits of process, the way ships accumulate barnacles, with the accompanying increase in drag that makes forward progress increasingly difficult and expensive. However, barnacles are purely parasitic. They attach themselves to the hull for their own benefit, while the ship gains nothing of value in return. But even though organizational cynics enjoy characterizing these bits of process as also being purely parasitic, each of those bits of operational friction was usually put there for some purpose, presumably a purpose of some value. It may be that the cost-benefit analysis involved was flawed, but the intent was generally positive. (I’m ignoring here for a moment those things that were put in place for malicious reasons or to deliberately impede one person’s actions for the benefit of someone else. These kinds of counter-productive interventions do happen from time to time, and while they tend to loom large in people’s institutional mythologies, I believe such evil behavior is actually comparatively rare – perhaps not that uncommon in absolute terms, but still dwarfed by the truly vast number of ordinary, well-intentioned process elements that slow us down every day.)

Because I’m analyzing this from a premise of benign intent, I’m going to avoid characterizing these things with a loaded word like “barnacles”, even though they often have a similar effect. Instead, let’s refer to them as “checkpoints” – gates or control points or tests that you have to pass in order to move forward. They are annoying and progress-impeding but not necessarily valueless.

We are forced to pass through checkpoints all the time – having to swipe your badge past a reader to get into the office (or having to unlock the door to your own home, for that matter), entering a user name and password dozens of times per day to access various network services, getting approval from your boss to take a vacation day, having to fill out an expense report form (with receipts!) to get reimbursed for expenses you have incurred, all of the various layers of review and approval to push a software change into production, having to get approval from someone in the legal department before you can adopt a new piece of open source software; the list is potentially endless.

Note that while these vary wildly in terms of how much drag they introduce, for many of them the actual amount is very little, and this is a key point. The vast majority of these were motivated by some real world problem that called for some tiny addition to the process flow to prevent (or at least inhibit) whatever the problem was from happening again. No doubt some were the result of bad dealing or of an underemployed lawyer or administrator trying to preempt something purely hypothetical, but I think these latter kinds of checkpoint are the exception, and we weaken our campaign to reduce friction by paying too much attention to them – that is, by focusing too much on the unjustified bureaucracy, we distract attention from the far larger (and therefore far more problematic) volume of justified bureaucracy.

Let’s just presume, for the purpose of argument, that each of the checkpoints that we encounter is actually well motivated: that it exists for a reason, that the reason can be clearly articulated, that the reason is real, that it is more or less objective, that people, when presented with the argument for the checkpoint, will find it basically convincing. Let’s further presume that the friction imposed by the checkpoint is relatively modest – that the friction that results is not because the checkpoint is badly implemented but simply because it is there. And yes, I am trying, for purposes of argument, to cast things in a light that is as favorable to the checkpoints as possible. The reason I’m being so kind hearted towards them is because I think that, even given the most generous concessions to process, we still have a problem: the “death of a thousand cuts” phenomenon.

Checkpoints tend to accumulate over time. Organizations usually start out simple and only introduce new checkpoints as problems are encountered – most checkpoints are the product of actual experience. Checkpoints tend to accumulate with scale. As an organization grows, it finds itself doing each particular operation it does more often, which means that the frequency of actually encountering any particular low probability problem goes up. As an organization grows, it finds itself doing a greater variety of things, and this variety in turn implies greater variety of opportunities to encounter whole new species of problems. Both of these kinds of scale-driven problem sources motivate the introduction of additional checkpoints. What’s more, the greater variety of activities also means a greater number of permutations and combinations of activities that can be problematic when they interact with each other.

Checkpoints, once in place, tend to be sticky – they tend not to go away. Partly this is because if the checkpoint is successful at addressing its motivating problem, it’s hard to tell if the problem later ceases to exist – either way you don’t see it. In general, it is much easier for organizations to start doing things than it is for them to stop doing things.

The problem with checkpoints is their cumulative cost. In part, this is because the small cost of each makes them seductive. If the cost of checkpoint A is close to zero, it is not too painful, and there is little motivation or, really, little actual reason to do anything about it. Unfortunately, this same logic applies to checkpoint B, and to checkpoint C, and indeed to all of them. But the sum of a large number of values near zero is not necessarily itself a value near zero. It can, instead, be very large indeed. However, as we stipulated in our premises above, each one of them is individually justified and defensible. It is merely their aggregate that is indefensible – there is nothing to tell you, “here, this one, this is the problem” because there isn’t any one which is the problem. The problem is an emergent phenomenon.

Any specific checkpoint may be one that you encounter only rarely, or perhaps only once. Consider, for example, all the various procedures we make new hires go through. When you hit such a checkpoint, it may be tedious and annoying, but once you’ve passed it it’s done with. Thereafter you really have no incentive at all to do anything about it, because you’ll never encounter it again. But if we make a large number of people each go through it once, there’s still a large multiplier, and we’ve still burdened our organization with the cumulative cost.

A problem of particular note is that, because checkpoints tend to be specialized, they are often individually not well known. Plus, a larger total number of checkpoints increases the odds in general that you will encounter checkpoints that are unknown or mysterious to you, even if they are well known to others. Thus it becomes easy for somebody without the relevant specialized knowledge to get into trouble by violating a rule that they didn’t even know to exist.

Unknown or poorly understood checkpoints increase friction disproportionately. They trigger various kinds of remedial responses from the organization, in the form of compliance monitoring, mandatory training sessions, emailed warning messages and other notices that everyone has to read, and so on. Each such checkpoint thus generates a whole new set of additional checkpoints, meaning that the cumulative frictions multiply instead of just adding.

Violation of a checkpoint may visit sanctions or punishment on the transgressor, even if the transgression was inadvertent. The threat of this makes the environment more hostile. It trains people to become more timid and risk averse. It encourages them to limit their actions to those areas where they are confident they know all the rules, lest they step on some unfamiliar procedural landmine, thus making the organization more insular and inflexible. It gives people incentives to spend their time and effort on defensive measures at the expense of forward progress.

When I worked at Electric Communities, we had (as most companies do) a bulletin board in our break room where we displayed all the various mandatory notices required by a profusion of different government agencies, including arms of the federal government, three states (though we were a California company, we had employees who commuted from Arizona and Oregon and so we were subject to some of those states’ rules too), a couple of different regional agencies, and the City of Cupertino. I called it The Wall Of Bureaucracy. At one point I counted 34 different such notices (and employees, of course, were expected to read them, hence the requirement that they be posted in a prominent, common location, though of course I suspect few people actually bothered). If you are required to post one notice, it’s pretty easy to know that you are in compliance: either you posted it or you didn’t. But if you are required to post 34 different notices, it’s nearly impossible to know that the number shouldn’t be 35 or 36 or that some of the ones you have are out of date or otherwise mistaken. Until, of course, some government inspector from some agency you never heard from before happens to wander in and issue you a citation and a fine (and often accuse you of being a bad person while they’re at it). As Alan Perlis once said, talking about programming, “If you have a procedure with ten parameters, you probably missed some.”

In the extreme case, the cumulative costs of all the checkpoints within an organization can exceed the working resources the organization has available, and forward progress becomes impossible. When this happens, the organization generally dies. From an external perspective – from the outside, or even from one part of the organization looking at another – this appears insane and self-destructive, but from the local perspective governing any particular piece of it, it all makes sense and so nothing is done to fix it until the inexorable laws of arithmetic put a stop to the whole thing. A famous example of this was Atari, where by 1984 the combined scleroses affecting the product development process became so extreme that no significant new products were able to make it out the door because the decision making and approval process managed to kill them all before they could ship, even though a vast quantity of time and money and effort was spent on developing products, many of them with great potential. Few organizations manage to achieve this kind of epic self-absorption, though some do seem to approach it as an asymptote (e.g., General Motors). In practice, however, what seems to keep the problem under control, here in Silicon Valley anyway, is that the organization reaches a level of dysfunction where it is no longer able to compete effectively and it is supplanted in the marketplace by nimbler and generally younger rivals whose sclerosis is not as advanced.

The challenge, of course, is how to deal with this problem. The most common pathway, as alluded to above, is for a newer organization to supplant the older one. This works, not because the one organization is intrinsically more immune to the phenomenon than the other but simply due to the fact that because it is younger and smaller it has not yet developed as many internal checkpoints. From the perspective of society, this is a fine way of handling things; this is Schumpeter’s “creative destruction” at work. It is less fine from the perspective of the people whose money or lives are invested in the organization being creatively destroyed.

Another path out of the dilemma is strong leadership that is prepared to ride roughshod over the sound justifications supporting all these checkpoints and simply do away with them by fiat. Leaders like this will disregard the relevant constituencies and just cut, even if crudely. Such leaders also tend to be authoritarian, megalomaniacal, visionary, insensitive, and arguably insane – and, disturbingly often, right – i.e., they are Steve Jobs. They also tend to be a bit rough on their subordinates. This kind of willingness to disrespect procedure can also sometimes be engendered by dire necessity, enabling even the most hidebound bureaucracies to manifest surprising bursts of speed and effectiveness. A well known and much studied example of this phenomenon is the military, ordinarily among the stuffiest and most procedure bound of institutions, which can become radically more effective in times of actual war. In the first three weeks of American involvement in World War II, when we weren’t yet really doing anything serious, Army Chief of Staff George Marshall merely started carefully asking people questions and half the generals in the US Army found themselves retired or otherwise displaced.

A more user-friendly way to approach the problem is to foster an institutional culture that sees the avoidance of checkpoints as a value unto itself. This is very hard to do, and I am hard pressed to think of any examples of organizations that have managed to do this consistently over the long term. Even in the short term, examples are few, and tend to be smaller organizations embedded within much larger, more traditional ones. Examples might include Bell Labs during AT&T’s pre-breakup years, Xerox PARC during its heyday, the Lucasfilm Computer Division during the early 1980s, or the early years of the Apollo program. Each of these examples, by the way, benefited from a generous surplus of externally provided resources, which allowed them to trade a substantial amount of resource inefficiency for effective productivity. Surplus resources, however, tend also to engender actual parasitism, which ultimately ends the golden age, as all these examples attest.

The foregoing was expressed in terms of people and organizations, but essentially the same analysis applies almost without modification to software systems. Each of the myriad little inefficiencies, rough edges, performance draining extra steps, needless added layers of indirection, and bits of accumulated cruft that plague mature software is like an organizational checkpoint.

October 19, 2014

Map of The Habitat World

By now a lot of you may have heard about the initiative at Oakland’s Museum of Digital Arts & Entertainment to resurrect Habitat on the web using C64 emulators and vintage server hardware. If not, you can read more about it here (there’s also been a modest bit of coverage in the game press, for example at Wired, Joystiq, and Gamasutra).

Part of this effort has had me digging through my archives, looking at old source files to answer questions that people had and to refresh my own memory of how things worked. It’s been pretty nostalgic, actually. One of the cooler things I stumbled across was the Habitat world map, something which relatively few people have ever seen because when Habitat was finally released to the public it got rebranded (as “Club Caribe”) with an entirely different set of publicity materials. I had a big printout of this decorating my office at Skywalker Ranch and later at American Information Exchange, but not very many people will have been in either of those places. Now, however, thanks to the web, I can share it publicly for the first time.

We wanted to have a map because we thought we would need a plan for enlarging the world as the user population grew. The idea was to have a framework into which we could plug new population centers and new places for stories and adventures.

The specific map we ended up with came about because I was playing around writing code to generate plausible topographic surfaces using fractal techniques (and, of course, lots and lots and LOTS of random numbers). The little program I wrote to do this was quite a CPU hog, but I could run it on a bunch of different computers in parallel and combine the results (sort of like modern MapReduce techniques, only by hand!). One night I grabbed every Unix machine on the Lucasfilm network that I could lay my hands on (two or three Vax minicomputers and six or eight Sun workstations) and let the thing cook for an epic all-nighter of virtual die rolling. In the morning I was left with this awesome height field, in the form of a file containing a big matrix of altitude numbers. Then, of course, the question was what to do with it, and in particular, how to look at it. Remember that in those days, computers didn’t have much in the way of image display capability; everything was either low resolution or low color fidelity or both (the Pixar graphics guys had some high end display hardware, but I didn’t have access to it and anyway I’d have to write more code to do something with the file I had, which wasn’t in any kind of standard image format). Then I realized that we had these new Apple LaserWriter printers. Although they were 1-bit per pixel monochrome devices, they printed at 300 DPI, which meant you could get away with dithering for grayscale. And you fed stuff to them using PostScript, a newfangled Forth-like programming language. So I ordered Adobe’s book on PostScript and went to work.
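For readers who want a feel for what "fractal techniques plus lots of random numbers" can look like, here is a minimal Python sketch. The post does not say which fractal algorithm the original program used, so this uses diamond-square midpoint displacement, one common choice, purely as an illustration; the function name `diamond_square` and its parameters are assumptions made for this example, not a reconstruction of the original code.

```python
import random


def diamond_square(n, roughness=0.5, seed=0):
    """Return a (2**n + 1) x (2**n + 1) grid of heights.

    Illustrative only: diamond-square is an assumed stand-in for the
    unspecified fractal technique mentioned in the text.
    """
    rng = random.Random(seed)
    size = 2 ** n + 1
    h = [[0.0] * size for _ in range(size)]

    # Seed the four corners with random heights.
    for y in (0, size - 1):
        for x in (0, size - 1):
            h[y][x] = rng.uniform(-1.0, 1.0)

    step = size - 1
    scale = 1.0
    while step > 1:
        half = step // 2

        # Diamond step: the centre of each square gets the average of
        # its four corners plus a random offset.
        for y in range(half, size, step):
            for x in range(half, size, step):
                avg = (h[y - half][x - half] + h[y - half][x + half] +
                       h[y + half][x - half] + h[y + half][x + half]) / 4.0
                h[y][x] = avg + rng.uniform(-scale, scale)

        # Square step: each edge midpoint gets the average of its
        # in-bounds neighbours plus a random offset.
        for y in range(0, size, half):
            for x in range((y + half) % step, size, step):
                total, count = 0.0, 0
                for dy, dx in ((-half, 0), (half, 0), (0, -half), (0, half)):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < size and 0 <= xx < size:
                        total += h[yy][xx]
                        count += 1
                h[y][x] = total / count + rng.uniform(-scale, scale)

        step = half
        scale *= roughness  # smaller random offsets at finer detail

    return h
```

Lower `roughness` values give smoother terrain; higher ones give craggier terrain. Running with different seeds gives independent surfaces, which is the kind of work that splits naturally across many machines.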

I wrote a little C program that took my big height field and reduced it to a 500×100 image at 4 bits per pixel, and converted this to a file full of ASCII hexadecimal values. I then wrapped this in a little bit of custom PostScript that would interpret the hex dump as an image and print it, and voilà, out of the printer comes a lovely grayscale topographic map. Another little quick filter and I clipped all the topography below a reasonable altitude to establish “sea level”, and I had some pretty sweet looking landscape. At this point, you could make out a bunch of obvious geographic features, so we picked locations for cities, and drew some lines for roads between them, and suddenly it was a world. A little bit more PostScript hacking and I was able to actually draw nicely rendered roads and city labels directly on the map. Then I blew it up to a much larger size and printed it over several pages which I trimmed and taped together to yield a six and a half foot wide map suitable for posting on the wall.
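The pipeline described above (clip at sea level, reduce to 4 bits per pixel, dump as ASCII hex, wrap in PostScript) can be sketched roughly as follows. This is a Python illustration rather than the original C program, and the function names, the scaling factor, and the exact quantization rule are all assumptions made for the example. The PostScript wrapper uses the standard `image` operator fed by `currentfile ... readhexstring`, which is the usual Level 1 idiom for inline hex image data.

```python
def quantize(heights, sea_level, levels=16):
    """Map heights to gray levels 0..levels-1.

    Anything at or below sea_level becomes 0; land is spread over
    1..levels-1. (Assumed rule, chosen for illustration.)
    """
    top = max(max(row) for row in heights)
    span = top - sea_level
    gray = []
    for row in heights:
        out = []
        for h in row:
            if span <= 0 or h <= sea_level:
                out.append(0)
            else:
                level = 1 + int((h - sea_level) / span * (levels - 1))
                out.append(min(levels - 1, level))
        gray.append(out)
    return gray


def to_hex_rows(gray):
    """One hex digit per 4-bit sample; pad odd-width rows to a whole byte."""
    rows = []
    for row in gray:
        digits = "".join("%X" % g for g in row)
        if len(digits) % 2:
            digits += "0"
        rows.append(digits)
    return rows


def to_postscript(gray, points_per_pixel=4):
    """Wrap the hex dump in a small PostScript program that prints it."""
    height = len(gray)
    width = len(gray[0])
    bytes_per_row = (width + 1) // 2
    header = [
        "%!PS",
        # Buffer holding one row of packed samples.
        f"/row {bytes_per_row} string def",
        # The image operator paints into the unit square, so scale first.
        f"{width * points_per_pixel} {height * points_per_pixel} scale",
        # width height bits-per-sample matrix (rows run top to bottom).
        f"{width} {height} 4 [{width} 0 0 -{height} 0 {height}]",
        "{ currentfile row readhexstring pop } image",
    ]
    return "\n".join(header + to_hex_rows(gray) + ["showpage"]) + "\n"
```

In PostScript grayscale images, sample value 0 is black and the maximum is white, so with this rule the sea prints black and the highest peaks print white. Inverting the levels before packing would flip that.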

As I was going through my archives in conjunction with the project to reboot Habitat, I encountered the original PostScript source for the map. I ran it through GhostScript and rendered it into a 22,800×4,560 pixel TIFF image which I could open in Photoshop and wallow around in. This immediately tempted me to do a bit more embellishment with Photoshop, so a little bit more hacking on the PostScript and I could split the various components of the image (the topographic relief, the roads, the city labels, etc.) into separate images which could then be individually manipulated as layers. I colorized the topography, put it through a Gaussian blur to reduce the chunkiness, and did a few other little bits of cosmetic tweaking, and the result is the image you see here (clicking on the picture will take you to a much larger version):

Habitat map

(Also, if you care to fiddle with this in other formats, the PostScript for the raw map can be gotten here. Beware that depending on what kind of configuration your browser has, your browser may just attempt to render the PostScript, which might not have exactly the results you want or expect. Have fun.)

There are a number of interesting details here worth mentioning. Note that the Habitat world is cylindrical. This lets us encompass several different interesting storytelling possibilities: Going around the cylinder lets you circumnavigate the world; obviously, the first avatar to do this would be famous. The top edge is bounded by a wall, the bottom edge by a cliff. This means that you can fall off the edge of the world, or explore the wall for mysterious openings. By the way, the top edge is West. Habitat compasses point towards the West Pole, which was endlessly confusing for nearly everyone.

We had all kinds of plans for what to do with this, which obviously we never had a chance to follow through on. One of my favorites was the notion that if you walked along the top (west) wall enough, eventually you’d find a door, and if you went through this door you’d find yourself in a control room of some kind, with all kinds of control panels and switches and whatnot. What these switches would do would not be obvious, but in fact they’d control things like the lights and the day/night cycle in different parts of the world, the color palette in various places, the prices of things, etc. Also, each of the cities had a little backstory that explained its name and what kinds of things you might expect to find there. If I run across that document I’ll post it here too.

April 29, 2014

Troll Indulgences: Virtual Goods Patent Gutted [7,076,445]

Indulgence

Another terrible virtual currency/goods patent has been rightfully destroyed – this time in an unusual (but worthy) way. From Law360: “EA, Zynga Beat Gametek Video Game Purchases Patent Suit,” by Michael Lipkin

Law360, Los Angeles (April 25, 2014, 7:20 PM ET) — A California federal judge on Friday sided with Electronic Arts Inc., Zynga Inc. and two other video game companies, agreeing to toss a series of Gametek LLC suits accusing them of infringing its patent on in-game purchases because the patent covers an abstract idea. … “Despite the presumption that every issued patent is valid, this appears to be the rare case in which the defendants have met their burden at the pleadings stage to show by clear and convincing evidence that the ’445 patent claims an unpatentable abstract idea,” the opinion said.

The very first thing I thought when I saw this patent was: “Indulgences! They’re suing for Indulgences? The prior art goes back centuries!” It wasn’t much of a stretch, given the text of the patent contains this little fragment (which refers to the image at the head of this post):

Alternatively, in an illustrative non-computing application of the present invention, organizations or institutions may elect to offer and monetize non-computing environment features and/or elements (e.g. pay for the right to drive above the speed limit) by charging participating users fees for these environment features and/or elements.

WTF? Looks like reasoning something along those lines was used to nuke this stinker out of existence. It is quite unusual for a patent to be tossed out in court. Usually the invalidation process has to take a separate track, as it has with other cases I’ve helped with, such as The Word Balloon Patent. I’m very glad to see this happen – not just for the defendant, but for the industry as a whole. Just adding “on a computer [network]” to existing abstract processes doesn’t make them intellectual property! Hopefully this precedent will help kill other bad cases in the pipeline already…

March 5, 2014

Two Recipes for Stone Soup [A Fable of Pre-Funding Startups]

There once was a young Zen master, who had earned a decent name for himself throughout the land. He was not famous, but many of his peers knew of his reputation for being wise and fair. During his career, he was renowned for his loyalty to whatever dojo he was attached to, usually for many years at a time. One year his patronage decided to merge with another, larger dojo, and the young master found himself unexpectedly looking for a new livelihood. But he was not desperate, as he’d heeded the words of his mentor and had kept close contact with many other Zen masters over the years and considered many options.

As word spread about the young master’s availability, he began to receive more interest than he could possibly ever fulfill. It took all of his Zen training and long nights just to keep up with the correspondence and meetings. He was getting queries from well-established cooperatives, various governments, charitable groups, many recently formed houses, and even more people who had a grand idea around which to form a whole-new kind of dojo. This latter category was intriguing, but the most fraught with peril. There were too many people with too many ideas for the young master to sort between. So he decided to consult with his mentor. At least one more time, he would be the apprentice and ventured forth to the dojo of his youth, a half-day’s journey away.

“Master, the road ahead is filled with many choices, some are well traveled roads and others are merely slight indentations in the grass that may some day become paths. How can I choose?” asked the apprentice.

The mentor replied, “Have you considered the wide roads and the state-maintained roads?”

“Yes, I know them well and have many reasons to continue on one of them, but these untrodden paths still call to me. It is as if there is a man with his hands at his mouth standing at each one shouting to follow his new path to riches and glory. How do I sort out the truth of their words?” The young master was genuinely perplexed.

“You are wise, my son, to seek counsel on this matter, as sweet-smelling words are enticing indeed and could lead you down a path of ruin or great fortune. Recount to me now two of the recruiting stories that you have heard and I will advise you.” The mentor’s face relaxed and his eyes closed as he dropped into thought, which was exactly what the young master needed to calm himself sufficiently to relate the stories.

After the mentor had heard the stories, he continued meditating for several minutes before speaking again: “Former apprentice, do you recall the story and lesson of Stone Soup?”

“Yes, master. We learned it as young adepts. It is the story of a man who pretended that he had a magic stone for making the world’s best soup, which he then used to convince others to contribute ingredients to the broth until a delicious brew was made. This story was about how leadership and an idea can ease people into cooperating to create great things for the good of them all,” recounted the student. “I can see the similarity between the callers standing on the new paths and the man with the magic stone. Also it is clear that the ingredients are symbolic of the skills of the potential recruits. But, I don’t see how that helps me.” The apprentice had many years of experience with the mentor, and knew that this challenge would get the answer he was looking for.

“The stories you told me are two different recipes for Stone Soup,” the master started.

“The first caller was a man with a certain and impressive voice that said to you ‘You should join my dojo! It is like none other and it is a good and easy path that will lead to great riches. Many people that you know, such as Haruko and Jin, have tested this path and others who have great reputations including Master Po and Teacher Win are going to walk upon it as well. Your reputation would be invaluable to our venture. Join us now!'”

“The second caller was a humble and uncertain man who spoke softly as he said ‘You should join my dojo. It is like none other and the path, though potentially fraught with peril, could lead to riches if the right combination of people were to take to it. Your reputation is well known, and if you were to join the party, the chance of success would increase greatly. Would you consider meeting here in two days’ time to talk to others to discuss our goals and to see if a suitable party could be formed? Even if you don’t join us, any advice you have would be invaluable.’” The mentor paused to see if his former student understood.

The young master said “I don’t see much difference, other than the second man seems the weaker.”

The mentor suppressed a sigh. Clearly this visit would not have been necessary if the young master were able to see this himself. Besides, it was good to see his student again and to be discussing such a wealth of opportunities.

He resumed, “Remember the parable of Stone Soup. The first man did not. He recited many names as if those names carried the weight of the reputations of their owners. He has forgotten the objective of the parable: The Soup. It is not the names or reputations of the people who placed the ingredients into the soup that mattered. It was that the soup needed the ingredients and the people added them anonymously, in exchange for a bowl of the broth. The first man merely suggested that important people were committed to the journey. I am quite certain that, were you to ask Haruko and Jin what names they have heard as being associated with the proposed dojo, you would find that your name was provided as a reference without your knowledge or consent.”

The student clearly became agitated as the truth of his mentor’s words sank in. There was work to do before the day was done in order to repair any damage to his reputation that speaking with the first man may have caused.

The mentor continued, “The first recipe for Stone Soup is The Braggart’s Brew. It tastes just like hot water because when everyone finds out that the founder is a liar, they all recover whatever ingredients they can to take them home and try to dry them out.”

The mentor took a quick drink, but gave a quelling glance that told the apprentice to remain silent until the lesson was over.

“You called the second man weaker, but his weakness is like that of the man with the Stone from the parable. He keeps his eyes on the goal — creating the Soup or staffing his dojo. Without excellent ingredients, there will be no success; and the best way to get them is to appeal to the better nature of those who possess them. He, by listening to them, transforms the dojo into a community project — which many contribute to, even if only a little bit.”

“Your skills, young master, are impressive on their own. You need not compare yourself with others, nor should you be impressed with one who would so trivially invoke the reputation of others, as if they were magic words in some charm.”

“The second recipe for Stone Soup is Humble Chowder, seasoned with a healthy dash of realism. This is the tempting broth.” And the mentor was finished.

The apprentice jumped up — “Master! I am so thankful! I knew that coming to you would help me see the truth. And now, I see a greater truth — you are also the man with a Stone. Please tell me what I can contribute to your Soup.”

“Choose your next course wisely, and return to me with the story so that I may share it with the next class of students.”

“I will!”

And with that, the young master ran as quickly as he could to catch up with the group meeting about the second man’s dojo. He wasn’t certain if he’d join them, but the honor of being able to contribute to its foundation would be enough payment for now. When he approached the seated group, he was delighted to see several people whose reputation he respected around the fire, discussing amazing possibilities. One of them was Jin, who was shocked to learn that the first man had given his name to the young master…

[This is a long-lost post, originally posted on our old site six years ago. Once again, the internet archive to the rescue!]

February 21, 2014

White Paper: 5 Questions for Selecting an Online Community Platform

From Cultivating Community (a Ning blog)

Today, we’re proud to announce a project that’s been in the works for a while: A collaboration with Community Pioneer F. Randall Farmer to produce this exclusive white paper – “Five Questions for Selecting an Online Community Platform.” 

Randy is co-host of the Social Media Clarity podcast, a prolific social media innovator, and literally co-wrote the book on Building Web Reputation Systems. We were very excited to bring him on board for this much needed project. While there are numerous books, blogs, and white papers out there to help Community Managers grow and manage their communities, there’s no true guide to how to pick the right kind of platform for your community.

In this white paper, Randy has developed five key questions that can help determine what platform suits your community best. This platform agnostic guide covers top level content permissions, contributor identity, community size, costs, and infrastructure. It truly is the first guide of its kind and we’re delighted to share it with you.

Go to the Cultivating Community post to get the paper.

December 19, 2013

Audio version of classic “BlockChat” post is up!

On the Social Media Clarity Podcast, we’re trying a new rotational format for episodes: “Stories from the Vault” – and the inaugural tale is a reading of the May 2007 post The Untold History of Toontown’s SpeedChat (or BlockChat™ from Disney finally arrives)

http://traffic.libsyn.com/socialmediaclarity/138068-disney-s-hercworld-toontown-and-blockchat-tm-s01e08.mp3

Link to podcast episode page

October 30, 2013

Origin of Avatars, MMOs, and Freemium

Origin of Avatars, MMOs, and Freemium – S01E06 Social Media Clarity Podcast

The latest episode of the Social Media Clarity Podcast contains an interview with Chip Morningstar (and podcast hosts: Randy Farmer and Scott Moore). This segment focuses on the emergent social phenomenon encountered the first time people used avatars with virtual currency, and artificial scarcity.

Links and transcription at http://socialmediaclarity.net