THEMIS: Fairness in Federated Stream Processing under Overload

Eva Kalyvianaki

Federated stream processing systems, which utilise nodes from multiple independent domains, are increasingly found in multi-provider cloud deployments, internet-of-things systems, collaborative sensing applications and large-scale grid systems. To pool resources from several sites and take advantage of local processing, submitted queries are split into query fragments, which are executed collaboratively by different sites. When supporting many concurrent users, however, queries may exhaust the available processing resources, thus requiring constant load shedding. Given that individual sites have autonomy over how they allocate query fragments to their nodes, it remains an open challenge to ensure global fairness in the processing quality experienced by queries in a federated scenario.

In this talk, I will describe THEMIS, a federated stream processing system for resource-starved, multi-site deployments. It executes queries in a globally fair fashion and provides users with constant feedback on the processing quality experienced by their queries. THEMIS associates stream data with its source information content (SIC), a metric that quantifies the contribution of that data towards the query result, based on the amount of source data used to generate it. We present the THEMIS distributed load shedding algorithm, which balances the SIC values of result data across queries. Our evaluation shows that the THEMIS algorithm yields balanced SIC values across queries, as measured by Jain's Fairness Index. Our approach also incurs a low execution time overhead.
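
As a rough illustration of the fairness measure mentioned above, the following sketch computes Jain's Fairness Index, J = (sum of x_i)^2 / (n * sum of x_i^2), over a list of per-query values. The function name, the example SIC values and the handling of the all-zero case are assumptions made for illustration; this is not the THEMIS implementation.

```python
def jain_fairness_index(values):
    """Return Jain's Fairness Index for a list of non-negative values.

    The index ranges from 1/n (one query receives everything) to 1.0
    (all queries receive the same value).
    """
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    if sum_of_squares == 0:
        # Assumption: if every value is zero, treat the allocation as equal.
        return 1.0
    return (total * total) / (n * sum_of_squares)


# Hypothetical per-query SIC values, for illustration only.
balanced = [0.5, 0.5, 0.5, 0.5]
skewed = [1.0, 0.0, 0.0, 0.0]

print(jain_fairness_index(balanced))
print(jain_fairness_index(skewed))
```

With equal values the index reaches its maximum of 1, and when a single query holds all of the value it drops to 1/n, so a result close to 1 would indicate that load shedding has been spread evenly across queries.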

Short bio: Eva is a Lecturer in the Department of Computer Science at City, University of London. She holds a PhD from Cambridge University and a BSc and MSc from the University of Crete, Greece. Her research interests span the areas of Cloud Computing, Data Stream Processing, Autonomic Computing, Distributed Systems and Systems Research in general. She is interested in the design and management of next-generation, large-scale applications, and in addressing the complexity of such systems with mathematical reasoning.