Finding Efficient Shard Keys with a Learning Process on Query Logs in Database Sharding


Industry is experiencing dramatic data growth. Not only must this data be processed properly, it must also be stored with a smart strategy so that it can be written and read at the highest possible speed. Over the past decades, vendors have been motivated to migrate their brown-field database solutions to distributed versions through partitioning and sharding. The choice of shard keys, appropriate or inappropriate, has a great impact on the future performance of the whole application. Improper choices may cause SLA violations for enterprises and can end in business failure. In the first chapter, we introduce various approaches to data partitioning along with the challenges that may arise. The second chapter explores the sharding strategies utilised by two well-known vendors. Lastly, we propose an automatic approach for detecting efficient sharding schemes and shard keys with a learning process on the existing query logs of a database.

Keywords: Database Sharding, Partitioning, Sharding, Shard Key, Learning


Emad Heydari Beni