Research

 

My research interests are in distributed systems and datacenter applications. At Berkeley, I was advised by Prof. Ion Stoica and was part of the Algorithms, Machines and People (AMP) Lab, one of the larger collaborative multi-area labs at Berkeley.

Current Projects

As I have taken a position at Cloudera, I am not currently working on any new research projects.

Past Projects

Cake (formerly Frosting) is a multi-resource scheduler for shared storage systems that supports enforcement of high-level service-level objectives. This allows latency-sensitive and batch workloads to be consolidated onto the same storage system, reducing provisioning costs, improving utilization, and shortening traditional copy-then-process analytics cycles. Cake was presented at SoCC 2012.

PACMan is an in-memory caching infrastructure for Hadoop and HDFS that is tuned for large datacenter workloads. We developed novel cache management policies that perform significantly better than theoretically "optimal" cache-eviction algorithms (such as farthest-in-the-future) by exploiting characteristics of workload traces from large internet companies. PACMan was published at NSDI 2012.

CrowdDB is a relational database that has been extended with crowdsourcing operators, allowing it to incorporate human computation and knowledge as part of query execution. By crowdsourcing small tasks to human workers, CrowdDB can address AI-hard problems that are difficult to tackle with traditional databases. I contributed to a demo of CrowdDB shown at VLDB 2011 in Seattle, which won the inaugural Best Demo award.

Papers

My publications are also listed on DBLP.

Andrew Wang, Shivaram Venkataraman, Sara Alspaugh, Ion Stoica, Randy Katz: Cake: Enabling High-level SLOs on Shared Storage Systems. SoCC 2012. [PDF] [Talk Slides]

Andrew Wang, Shivaram Venkataraman, Sara Alspaugh, Ion Stoica, Randy Katz: Sweet Storage SLOs with Frosting. HotCloud 2012. [PDF] [Talk Slides]

Ganesh Ananthanarayanan, Ali Ghodsi, Andrew Wang, Dhruba Borthakur, Srikanth Kandula, Scott Shenker, Ion Stoica: PACMan: Coordinated Memory Caching for Parallel Jobs. NSDI 2012. [PDF]

Amber Feng, Michael J. Franklin, Donald Kossmann, Tim Kraska, Samuel Madden, Sukriti Ramesh, Andrew Wang, Reynold Xin: CrowdDB: Query Processing with the VLDB Crowd. PVLDB 4(12): 1387-1390 (2011). Best Demo Award!

Raghavendra Rajkumar, Andrew Wang, Jason Hiser, Anh Nguyen-Tuong, Jack W. Davidson, John C. Knight: Component-Oriented Monitoring of Binaries for Security. HICSS 2011: 1-10.

Anh Nguyen-Tuong, Andrew Wang, Jason Hiser, John C. Knight, Jack W. Davidson: On the effectiveness of the metamorphic shield. ECSA Companion Volume 2010: 170-174.