拓扑感知的算法大规模平行计算模型

论文标题

拓扑感知的算法大规模平行计算模型

Algorithms for a Topology-aware Massively Parallel Computation Model

论文作者

Hu, Xiao, Koutris, Paraschos, Blanas, Spyros

论文摘要

大量并行数据处理中的大多数工作都假定同质性，即每个计算单元具有相同的计算能力，并且可以与其他延迟和带宽相同的其他单元进行通信。但是，这种统一拓扑的强烈假设很少在实用环境中，在这种情况下，计算单元通过复杂的网络连接。为了解决这个问题，Blanas等。最近提出了一个大规模平行计算模型，该模型在建模成本中整合了网络结构和异质性。网络被建模为有向图，其中每个边缘都与成本函数关联，该成本函数取决于两个端点之间传输的数据。计算以同步回合进行，每个回合的成本都被视为网络中所有边缘的最高成本。在这项工作中，我们迈出了第一步，以研究此拓扑感知的并行模型中的三个基本数据处理任务：设置相交，笛卡尔产品和分类。我们关注的是树拓扑的网络拓扑，并呈现下限以及（渐近）匹配上限的网络拓扑。我们算法的最佳性是关于网络节点之间的初始数据分布，而不是像以前的结果那样假设最坏情况分布。除了我们结果的理论最优性外，我们的协议很简单，使用恒定的回合，我们认为也可以在实际环境中实现。

Most of the prior work in massively parallel data processing assumes homogeneity, i.e., every computing unit has the same computational capability, and can communicate with every other unit with the same latency and bandwidth. However, this strong assumption of a uniform topology rarely holds in practical settings, where computing units are connected through complex networks. To address this issue, Blanas et al. recently proposed a topology-aware massively parallel computation model that integrates the network structure and heterogeneity in the modeling cost. The network is modeled as a directed graph, where each edge is associated with a cost function that depends on the data transferred between the two endpoints. The computation proceeds in synchronous rounds, and the cost of each round is measured as the maximum cost over all the edges in the network. In this work, we take the first step into investigating three fundamental data processing tasks in this topology-aware parallel model: set intersection, cartesian product, and sorting. We focus on network topologies that are tree topologies, and present both lower bounds, as well as (asymptotically) matching upper bounds. The optimality of our algorithms is with respect to the initial data distribution among the network nodes, instead of assuming worst-case distribution as in previous results. Apart from the theoretical optimality of our results, our protocols are simple, use a constant number of rounds, and we believe can be implemented in practical settings as well.

下载PDF全文

下载文献需遵守相关版权规定

论文标题