Distributed Systems

Cloud Computing: Hybrid Clouds, Federations, Brokers

The proliferation of public cloud providers and advancements in virtualization technologies create new opportunities: end-users can integrate private datacenters with public clouds to save on costs while satisfying privacy requirements; federations of small-scale cloud providers can partner with each other, to create economies of scale and sustain workload spikes while retaining business autonomy; cloud brokers can take advantage of multiple spot markets and volume discounts to make profit. Our goal is to develop mathematical performance and market models to evaluate the economic opportunities of cloud markets. For example, in (Pal et al., 2013) we proposed models for cloud pricing and capacity planning; in (Song et al., 2013), we addressed mechanism design for datacenter spot markets.

Optimization of Distributed Machine Learning

Machine learning frameworks such as TensorFlow can easily deploy distributed training jobs over multiple nodes and, within each node, multiple CPU cores or GPUs. While communication among nodes or devices is seamlessly managed by frameworks, the user has to decide how many nodes to allocate for each type of task. In parameter server architectures, nodes can serve as parameter servers (e.g., collecting and applying updates to weights of a neural network) or worker nodes (processing training examples). Our goal is to develop mathematical models to quickly assess the performance (e.g., training examples processed per second) of possible configurations. In turn, this will allow users to optimally assign cluster resources to a set of machine learning jobs.

Job Scheduling in Geo-distributed Datacenters

With growing data volumes generated and stored across geo-distributed locations, it is becoming increasingly inefficient to aggregate all data required for computation at a single datacenter. Instead, a recent trend is to distribute computation to take advantage of data locality, thus reducing the costs (e.g., bandwidth) while improving performance. New challenges are emerging for job scheduling, which requires coordination among datacenters as each job runs across geo-distributed sites. We proposed novel job scheduling algorithms that coordinate across datacenters with low overhead and near-optimal performance, with up to 50% improvement in average job completion time over Shortest Remaining Processing Time (SRPT) scheduling.

P2P Video Streaming Systems

Inconsistent quality of service is a significant problem in P2P video streaming systems: playback pauses are common for low-capacity peers, as they can usually upload relatively little compared to high-capacity ones. In (Wang et al., 2014) we proposed a framework where high-capacity peers are rewarded with reduced ads duration; a system for P2P video streaming based on this framework was analyzed in (Lin et al., 2015). Our approach adopts utility-theoretic market models, where market stakeholders consist of a content provider, an advertisement provider, and network peers. Using game theory, we determined optimal system parameters to reach market efficiency, and studied the practical implications of equilibria on stakeholders' satisfaction.

Privacy Engineering

Differential Privacy Mechanisms

Differential privacy is a framework to quantify the level of privacy preserved in statistical databases when aggregate information is released. We are interested in understanding the optimality of noise generation mechanisms that take into consideration query sensitivity, query side information, and longitudinal/collusion attacks. In our work (Chen et al., 2016), we investigated oblivious noise generation mechanisms for scalar queries in both non-Bayesian and Bayesian user settings, and provided supporting evidence and counterexamples to existing theory results for relaxed assumption sets.

Data and Algorithmic Transparency

Algorithms for large-scale data analysis are routinely used for tasks as diverse as online content recommendation (Facebook), targeted ads (AdWords, DoubleClick), allocation of labor (Uber), medical diagnosis, and prediction of criminal activity (Northpointe Analytics). However, the general public (users, researchers, regulators) has little information on the data and algorithms used by these systems. We are interested in mathematical models to evaluate mechanisms that address concerns about fairness, privacy, potential discrimination, and manipulations in data-driven systems.

Past Projects


Cyber-insurance protects businesses and individuals from risks due to attacks to IT infrastructure (e.g., data destruction, extortion, theft, hacking, and denial of service), reducing liability for damages caused to others by errors or omissions. In (Pal et al., 2010), we proposed a general mathematical framework by which cooperative and non-cooperative Internet users can decide whether or not to invest in self-defense mechanisms (e.g., personal antivirus) when insurance plans are offered. In (Pal et al., 2014), we proved the inefficiency of cyber-insurance markets under conditions of partial information, asymmetry and correlated risks, and showed the existence of efficient markets (both regulated and unregulated) under premium discrimination.

Reliability of Software Architectures

Requirements of modern software systems dictate an early assessment of dependability qualities (such as reliability) in their life cycle. In (Cheung et al., 2012), we proposed SHARP, an architecture-level reliability prediction framework that analyzes a hierarchical, scenario-based specification of system behavior. Models of basic scenarios are solved and combined to obtain results for higher-lever scenarios composed by sequential and parallel executions.