Paper title
The TRaCaR Ratio: Selecting the Right Storage Technology for Active Dataset-Serving Databases
Paper authors
Paper abstract
Main memory database systems aim to provide users with low-latency, high-throughput access to data. Most data resides in secondary storage, which is limited by the access speed of the technology. For hot content, data resides in DRAM, which has become increasingly expensive as datasets grow in size and access demand. With the emergence of low-latency storage solutions such as Flash and Intel's 3D XPoint (3DXP), there is an opportunity for these systems to give users high Quality-of-Service while reducing the cost for providers. To achieve high performance, providers must provision the server hosts for these datasets with the proper amount of DRAM and secondary storage, and must select a storage technology. The growth of capacity and transaction load over time makes it expensive to flip back and forth between different storage technologies and memory-storage combinations: servers set up for one storage technology must be reconfigured, repartitioned, and potentially replaced altogether. As more low-latency storage solutions become available, how does one decide on the right memory-storage combination and storage technology, given a predicted trend in dataset growth and offered load? In this paper, we describe and make the case for using the TRaCaR ratio, the transaction rate divided by the storage capacity needed for a workload, to let providers choose the most cost-effective memory-storage combination and storage technology given their predicted dataset trend and load requirement. We explore how the TRaCaR ratio can be used with 3DXP and Flash for a highly Zipfian B-tree database, and discuss potential research directions that can leverage the ratio.
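As a minimal illustration of the definition in the abstract, the ratio can be computed for a forecast of load and dataset size. The function name `tracar_ratio`, the units (transactions per second and gigabytes), and the forecast figures are assumptions made for this sketch. The abstract does not specify units or how the ratio maps to a particular technology choice.

```python
def tracar_ratio(transaction_rate: float, capacity: float) -> float:
    """Transaction rate divided by the storage capacity the workload needs.

    Units are an assumption of this sketch: transactions per second
    for the rate and gigabytes for the capacity.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return transaction_rate / capacity


# Hypothetical forecast, one (transactions/s, capacity in GB) pair per year.
forecast = [(50_000, 500), (80_000, 1_000), (100_000, 2_000)]

# The ratio falls when the dataset grows faster than the offered load.
ratios = [tracar_ratio(rate, cap) for rate, cap in forecast]
print(ratios)
```

In this made-up forecast the load doubles while the dataset quadruples, so the ratio declines from year to year. Tracking that trend, rather than a single snapshot, is what the abstract suggests should inform the provisioning decision.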