Richard Golding

Research statement

My work over the last several years has focused on the questions that arise in the practical construction of systems made up of many independent components. Much of this work, though not all, has been in the context of distributed storage systems.

The K2 distributed storage project provides an example of this kind of system. It implements a storage system built from up to a few tens of thousands of server nodes. Each server node has a processor, memory, some disks, and network connections, and operates independently of the other servers. Together the servers provide several petabytes of storage capacity, with virtualization that ensures that different users get the proper amount of capacity, performance, and reliability.

Systems of this scale are fundamentally different from smaller systems. The population of servers is usually heterogeneous, and changes continuously from failure, replacement, and growth. The system is likely shared among many users or applications, possibly including multiple organizations and hence separate administrative and trust domains. Shared systems like this must ensure that resource competition between users or applications is managed so that each gets the resources it needs. And, of course, there is simple scale: algorithms that scale worse than linearly often will not work.

Many of the current approaches to building distributed systems do not provide the tools needed to build this kind of system. Consensus-oriented group services and the replicated state machine design pattern, for example, do not provide for reasoning about heterogeneity and trust.

Social and biological systems, on the other hand, regularly adapt to change and cope with diverse populations, and so the approach in my research has been to draw on these systems to help define models for reasoning about large-scale distributed systems.

K2 also shows concretely what kind of system results from this research approach. K2 provides self-managed, virtualized storage: administrators define virtual storage pools for different users or applications, giving each pool attributes that define its required capacity, performance, and reliability. The system works to ensure that the right resources are assigned to each pool to meet those requirements. As the requirements or the available resources change, the system reallocates resources and moves data. K2 is thus designed to support higher-level application and user management by handling the details of provisioning resources and leaving policy to high-level management.

Internally, the K2 design rests on two mechanisms: federated small groups and resource allocation.

Each virtual storage pool in K2 is implemented by allocating a part of the resources on one or more storage servers, and running a server process on each to handle requests associated with the pool. The server processes that make up one pool form a group and elect a manager, which watches the server processes for failure and coordinates updates to the group. This is the federated small groups model, which was inspired by the way small human work groups organize themselves. The approach scales well because most decision-making is centralized within each group (while remaining fault-tolerant), and each server needs to communicate with only a few other servers in the normal case.

The manager runs the resource allocation decision algorithms for the group when resources need to be adjusted. The global allocation algorithm, based on a multidimensional bin-packing heuristic, draws inspiration from economic models: each group asks servers for bids on the resources it needs. The algorithms work to balance the needs of one pool against the needs of all pools, in part to avoid the Tragedy of the Commons. Once the global decision is made, each server process is told the amount of each resource that it is expected to provide for the pool, and the server process then enforces the global decision locally. For example, each server runs an I/O scheduling algorithm to shape I/O traffic to meet performance requirements.
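
As an illustration of what a multidimensional bin-packing heuristic looks like, the sketch below uses first-fit decreasing over two resource dimensions. It is not the K2 algorithm: the names (Server, allocate, capacity_gb, iops) are hypothetical, the bidding exchange is omitted, and each pool is placed whole on a single server, whereas a K2 pool may span several.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server with free resources per dimension."""
    name: str
    free: dict                      # e.g. {"capacity_gb": 100, "iops": 100}
    assigned: list = field(default_factory=list)

    def fits(self, demand):
        # A demand fits only if every dimension fits at once.
        return all(self.free.get(k, 0) >= v for k, v in demand.items())

    def place(self, pool, demand):
        for k, v in demand.items():
            self.free[k] -= v
        self.assigned.append(pool)


def allocate(servers, demands):
    """First-fit decreasing over several resource dimensions.

    demands maps a pool name to its resource vector. Returns a map from
    pool name to the chosen server name, or None if nothing fits.
    """
    # Total free resource per dimension, used to normalize demands.
    totals = {}
    for s in servers:
        for k, v in s.free.items():
            totals[k] = totals.get(k, 0) + v

    def size(demand):
        # A pool is as "large" as its most constrained dimension.
        return max((v / totals[k] for k, v in demand.items() if totals.get(k)),
                   default=0.0)

    placement = {}
    ordered = sorted(demands.items(), key=lambda kv: size(kv[1]), reverse=True)
    for pool, demand in ordered:
        for s in servers:
            if s.fits(demand):
                s.place(pool, demand)
                placement[pool] = s.name
                break
        else:
            placement[pool] = None   # no server can host this pool
    return placement


servers = [
    Server("a", {"capacity_gb": 100, "iops": 100}),
    Server("b", {"capacity_gb": 100, "iops": 50}),
]
demands = {
    "p1": {"capacity_gb": 80, "iops": 20},
    "p2": {"capacity_gb": 30, "iops": 90},
    "p3": {"capacity_gb": 60, "iops": 40},
}
placement = allocate(servers, demands)
```

In this example p3 cannot be placed even though enough capacity and enough I/O rate remain in aggregate, because no single server has both. That is the kind of interaction between dimensions that makes the multidimensional problem harder than packing on capacity alone.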


Earlier research

In earlier research, I worked on object storage: I designed the first commercially available object storage device (in the Panasas ActiveScale file system) and managed the team that implemented it. I had hoped that object storage would provide a standardized model for intelligent, independent storage services.

I have also worked on ways to make systems adapt their behavior to the workload imposed on them; in particular, the work on idleness provides a framework for reasoning about how to do some work when the system is less busy.

My dissertation work was on weak-consistency group communication, and was a predecessor to my current work on large-scale systems. In that work, a set of nodes attempts to coordinate its activities using a group communication protocol, but the nodes are geographically distant and often unable to communicate. The work proposed a family of protocols for weak-consistency groups, in which all group members act on a consistent view of a sequence of messages (typically carrying updates to shared data, as in the replicated state machine model) -- but it may take some time for some members to receive some of the messages, with implications for fault tolerance.
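
The weak-consistency idea can be illustrated with a minimal sketch in which members exchange message logs pairwise and order messages by a (timestamp, sender) key. This is a simplified anti-entropy exchange written for illustration, not one of the dissertation's protocols; the Member class and its methods are hypothetical.

```python
class Member:
    """Hypothetical group member holding a log of timestamped messages."""

    def __init__(self, name):
        self.name = name
        self.clock = 0               # logical clock, not wall time
        self.log = {}                # (timestamp, sender) -> payload

    def send(self, payload):
        # Record a new message locally; it spreads on later exchanges.
        self.clock += 1
        self.log[(self.clock, self.name)] = payload

    def anti_entropy(self, other):
        # Pairwise exchange: afterwards both members hold the union of logs.
        merged = {**self.log, **other.log}
        self.log = dict(merged)
        other.log = dict(merged)
        self.clock = other.clock = max(self.clock, other.clock)

    def view(self):
        # Every member sorts by the same key, so members holding the same
        # set of messages see the same sequence.
        return [self.log[k] for k in sorted(self.log)]


a, b, c = Member("a"), Member("b"), Member("c")
a.send("x=1")
b.send("y=2")

a.anti_entropy(b)        # a and b now agree
stale = c.view()         # c was unreachable and has seen nothing yet

b.anti_entropy(c)        # c catches up once it can communicate
```

Between the two exchanges c acts on an older view than a and b, which is the delay the paragraph above describes. Once c has exchanged logs with any up-to-date member, all three views are identical.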


Teaching statement

Most of my teaching has been done one-on-one, rather than in classrooms, though I taught the undergraduate Computer Networks course in Fall 1993 at the Vrije Universiteit Amsterdam. I have mentored several summer interns, served on several thesis committees, and co-advised one PhD student. The greatest satisfaction I get from teaching has been from these individual interactions, helping a student improve their skills and become ready to be an independent researcher or developer.

There are two areas in which I have especially helped students improve: organizing projects and writing. I have helped several students learn to organize the questions they ask in order to determine what to work on, and how to go about investigating the ideas they have. I have helped other students learn to write clear technical prose -- how to organize a paper to hit the marks that readers expect, and how to structure longer documents, including theses.