The Very Large Scale Computation Lab (VLSC) is trying to solve some of the key software and hardware challenges of building global-scale services. These services may encompass tens of geo-distributed datacenters, each of which can contain tens of thousands of servers. Compounding these challenges of scale and distribution, the services have very high availability and responsiveness requirements, and they are under continual development and evolution.
Currently, VLSC is focused on three research directions:
- Reconfigurable computing. The imminent end of Moore’s Law has stimulated considerable interest in alternative computing models, which use a large number of transistors to build something other than a conventional, general-purpose processor. At Microsoft, I helped start the Catapult project, which built an FPGA accelerator for the Bing search engine. At EPFL, we are looking at using FPGAs (including a full Microsoft Catapult system) as accelerators for genomic data processing.
- Genomic computing. Next-generation genome sequencing technology has reached a point at which it is becoming cost-effective to sequence all patients. Biobanks and researchers are faced with an oncoming deluge of genomic data, whose processing requires new and scalable bioinformatics architectures and systems. Processing raw genetic sequence data is computationally expensive and datasets are large. Current software systems can require many hours to process a single genome and generally run only on a single computer. Common file formats are monolithic and row-oriented, which is a barrier to distributed computation. To address these challenges, we built Persona, a cluster-scale, high-throughput bioinformatics framework.
- Non-volatile memory. New non-volatile memory (NVM) technologies allow direct, durable storage of data in an application’s heap. Durable random-access memory can simplify the construction of reliable applications that do not lose data at system shutdown or power failure. However, taking advantage of non-volatility is not as simple as just keeping data in NVM. Without programming support, it is challenging to write correct, efficient code that permits recovery after a power failure, since the restart mechanism must find a consistent state in durable storage, which may have been last written at an arbitrary point in a program’s execution.
All of this research is cross-disciplinary and considers hardware and software as complementary techniques that need to be studied and pushed forward together.
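To make the non-volatile memory recovery problem concrete, here is a minimal sketch of one well-known approach, undo logging. It is an illustration only, not a description of the lab's own system. Everything in it is an assumption made for the example: a Python dictionary stands in for durable memory, the `SimulatedCrash` exception stands in for a power failure, and the function names are hypothetical. Real NVM code would also have to control when writes leave the CPU caches and in what order, which this sketch ignores.

```python
class SimulatedCrash(Exception):
    """Hypothetical stand-in for a power failure at an arbitrary point."""


def transfer_unsafe(nvm, src, dst, amount, crash=False):
    # Two separate durable writes with nothing tying them together.
    nvm[src] -= amount
    if crash:
        raise SimulatedCrash()  # failure between the two writes
    nvm[dst] += amount


def transfer_logged(nvm, src, dst, amount, crash=False):
    # Save the old values in the durable store before changing anything.
    nvm["undo_log"] = {src: nvm[src], dst: nvm[dst]}
    nvm[src] -= amount
    if crash:
        raise SimulatedCrash()
    nvm[dst] += amount
    nvm["undo_log"] = None  # clearing the log marks the update as complete


def recover(nvm):
    # On restart, roll back any update that did not finish.
    log = nvm.get("undo_log")
    if log:
        for key, old_value in log.items():
            nvm[key] = old_value
        nvm["undo_log"] = None


def run_demo():
    unsafe = {"a": 100, "b": 0}
    try:
        transfer_unsafe(unsafe, "a", "b", 30, crash=True)
    except SimulatedCrash:
        pass

    logged = {"a": 100, "b": 0, "undo_log": None}
    try:
        transfer_logged(logged, "a", "b", 30, crash=True)
    except SimulatedCrash:
        pass
    recover(logged)
    return unsafe, logged
```

After the simulated failure, the unlogged store has lost 30 units (the total is 70 instead of 100), while the logged store is rolled back to its last consistent state. The price is an extra durable write per update, which is one reason programming support for NVM is an active research topic.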