0
TD-Orch: Scalable Load-Balancing for Distributed Systems with Applications to Graph Processing
arXiv:2511.11843v2 Announce Type: replace
Abstract: In this paper, we introduce a task-data orchestration abstraction that supports a range of distributed applications, including graph processing and key-value stores. Given a batch of lambda tasks each requesting one or more data items, where both tasks and data are distributed across multiple machines, each task must be co-located with its target data (by moving tasks and/or data) and then executed. We present TD-Orch, an efficient and scalable orchestration framework featuring a simple application developer interface. TD-Orch employs a distributed push-pull technique, leveraging the bidirectional flow of both tasks and data to achieve scalable load balance across machines even under highly skewed data requests (data hot spots), with minimal communication overhead. Experimental results show that TD-Orch achieves up to 2.8x speedup over existing distributed scheduling baselines. Building on TD-Orch, we present TDO-GP, a distributed graph processing system for general graph problems, demonstrating the effectiveness of the underlying framework. We design three families of implementation techniques to fully leverage the execution flow provided by TD-Orch. Experimental results show that TDO-GP achieves an average speedup of 4.1x over the best prior open-source distributed graph systems for general graph processing.
Abstract: In this paper, we introduce a task-data orchestration abstraction that supports a range of distributed applications, including graph processing and key-value stores. Given a batch of lambda tasks each requesting one or more data items, where both tasks and data are distributed across multiple machines, each task must be co-located with its target data (by moving tasks and/or data) and then executed. We present TD-Orch, an efficient and scalable orchestration framework featuring a simple application developer interface. TD-Orch employs a distributed push-pull technique, leveraging the bidirectional flow of both tasks and data to achieve scalable load balance across machines even under highly skewed data requests (data hot spots), with minimal communication overhead. Experimental results show that TD-Orch achieves up to 2.8x speedup over existing distributed scheduling baselines. Building on TD-Orch, we present TDO-GP, a distributed graph processing system for general graph problems, demonstrating the effectiveness of the underlying framework. We design three families of implementation techniques to fully leverage the execution flow provided by TD-Orch. Experimental results show that TDO-GP achieves an average speedup of 4.1x over the best prior open-source distributed graph systems for general graph processing.