Introduction to Elasticsearch

Elasticsearch is an open-source, distributed, scalable, highly available, document-oriented, RESTful full-text search engine with near-real-time search and analytics capabilities. It is built on top of Apache Lucene, which it uses for indexing. In this post, I give a very quick conceptual introduction to Elasticsearch.

Finding efficient Shard Keys with a learning process on query logs in Database Sharding

Abstract

Industry is experiencing dramatic data growth. Not only must this data be processed properly, it must also be stored with a smart strategy so that it can be written and read at the highest possible speed. Over the past decades, vendors have been motivated to migrate their brownfield database solutions to distributed versions through partitioning and sharding. The choice of shard keys, appropriate or not, has a great impact on the future performance of the whole application. Improper choices may cause SLA violations for enterprises and can end in business failure. In the first chapter, we introduce various approaches to data partitioning along with the challenges you may face. The second chapter explores the sharding strategies used by two well-known vendors. Lastly, we propose an automatic approach for detecting an efficient sharding scheme and shard keys through a learning process on the existing query logs of a database.

Keywords: Database Sharding, Partitioning, Sharding, Shard Key, Learning

Emad Heydari Beni