Elasticsearch is a popular search engine used around the world, so demand for Elasticsearch expertise is high. Whether you are a fresher or an experienced professional, it helps to have a firm grip on the basic topics and syntax before you attend an interview. In this section, you will find the top 30 frequently asked Elasticsearch interview questions and answers to help you prepare for your dream job interview.
Ans: Elasticsearch is an open-source, distributed search engine built on top of Lucene. It is used for storing, searching and analyzing large volumes of data. The data can be structured, unstructured, geo-spatial, etc. It stores data as JSON documents, supports full-text search and returns query results in near real-time. It is widely used by companies such as Netflix, LinkedIn and Stack Overflow.
Ans: The advantages of Elasticsearch are,
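Ans: A term-level query searches for an exact match: the search value is not analyzed and is compared as-is with the terms stored in the index. A full-text query first analyzes the search text, splitting it into terms, and then finds the documents that contain those terms, ranked by relevance.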
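As a rough illustration of the difference, the two request bodies below are built as plain Python dictionaries. The field names status and title and their values are assumptions made up for this example; only the term and match query types come from Elasticsearch's Query DSL.

```python
import json

# Term-level query: the value is compared as-is with the terms stored in the index.
# "status" is a hypothetical keyword field.
term_query = {"query": {"term": {"status": {"value": "published"}}}}

# Full-text query: the text is analyzed (for example tokenized and lowercased)
# before it is matched. "title" is a hypothetical text field.
match_query = {"query": {"match": {"title": {"query": "Quick Brown Fox"}}}}

for body in (term_query, match_query):
    print(json.dumps(body))
```

Either dictionary could be sent as the JSON body of a search request. A common pitfall is running a term query against an analyzed text field: the stored terms are usually lowercased, so a search value with capital letters may not match.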
Ans: Kibana is an open-source data visualization tool for Elasticsearch. Using Kibana, we can perform data operations on the indexed data in an Elasticsearch cluster. We can even create dashboards with pie charts, bar graphs, scatter plots, etc.
Ans: Logstash is an open-source data processing tool. It is mainly used to collect data such as logs, chat messages and order details, process it, and send it to a destination, which in our case is an Elasticsearch cluster.
Ans: We can perform four main operations on the data stored in Elasticsearch
Ans: Yes, Elasticsearch does have a schema. The schema is the mapping, which describes the fields in a document and their data types. Elasticsearch can also be used without defining a schema up front: when documents are indexed without a mapping, it generates one dynamically by default.
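The idea of a dynamically generated schema can be sketched in a few lines of Python. This is a simplified stand-in for dynamic mapping, not Elasticsearch's actual implementation: it ignores arrays, null values and date detection, and the sample document is made up for the example.

```python
def infer_field_type(value):
    """Return a simplified guess of the type dynamic mapping would assign."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        # Real dynamic mapping also adds a "keyword" sub-field for strings.
        return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def infer_mapping(document):
    """Build a mapping-like dictionary from the fields of one document."""
    return {
        "properties": {
            name: {"type": infer_field_type(value)}
            for name, value in document.items()
        }
    }


# Hypothetical document used only for illustration.
doc = {"name": "Ada", "age": 36, "rating": 4.5, "active": True}
print(infer_mapping(doc))
```

Defining the mapping explicitly instead is usually preferable when the field types are known in advance, because a wrongly inferred type cannot simply be changed on an existing field.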
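Ans: A type in Elasticsearch represented a set of similar documents within an index. The type name of a document was stored in the ‘_type’ metadata field, so a search could be restricted to documents of one type by using that field. Mapping types were deprecated in Elasticsearch 6 and removed in version 8, so in current versions each index holds a single kind of document.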
Ans: An index in Elasticsearch is roughly comparable to a table in an RDBMS. It is a collection of documents and has a mapping that defines the type of each field. Data operations in Elasticsearch are performed against indices.
Ans: A document in Elasticsearch is equivalent to a row of a table. It contains all the fields of an entry. The data in a document is in JSON form, i.e., key-value pairs.
Ans: A node is a single running instance of Elasticsearch, typically one server, that is part of a cluster. The data in Elasticsearch is physically stored on the nodes, and nodes provide the indexing and search capabilities.
Ans: A collection of connected nodes forms a cluster. When we install Elasticsearch on our system, by default it creates a single-node cluster named 'elasticsearch'. When data is stored in Elasticsearch, it is distributed across the available nodes.
Ans: Mapping is metadata, i.e., the definition of the fields that a document contains and of how each field is stored and indexed. In older versions an index could have several mapping types so that similar documents could be grouped together; current versions have one mapping per index. We can also define custom rules (dynamic templates) for mapping fields that are added dynamically.
Ans: Text analysis is the process of converting unstructured text into a structured format that is optimized for search. It enables full-text search, so that the results returned are more relevant to the query.
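Ans: Modules are responsible for various aspects of Elasticsearch's functionality. Module settings are of two types: static settings, which must be configured before a node starts (or, for an index, when it is created or closed), and dynamic settings, which can be updated on a running cluster or a live index.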
Ans: We can perform searching in three different ways
Ans: To add or create an index, we send a PUT request with the index name as the path: PUT /IndexName
IndexName is the name of the index that you want to create.
Ans: To retrieve a document, we use the GET API (the form shown is for Elasticsearch 7 and later): GET /IndexName/_doc/DocumentID
DocumentID is the ID of the document.
IndexName is the name of the index that you want to retrieve the document from.
Ans: We can delete a document with a DELETE request (the form shown is for Elasticsearch 7 and later): DELETE /IndexName/_doc/DocumentID
DocumentID is the ID of the document.
IndexName is the name of the index in which the document is present.
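The three requests described above (create an index, retrieve a document, delete a document) can be sketched as small Python helpers that return the HTTP method and path. The helper names, the index name books and the document ID 1 are hypothetical, and the paths assume the typeless _doc endpoint of Elasticsearch 7 and later.

```python
def create_index_request(index_name):
    """HTTP method and path for creating an index."""
    return ("PUT", f"/{index_name}")


def get_document_request(index_name, document_id):
    """HTTP method and path for retrieving one document by ID."""
    return ("GET", f"/{index_name}/_doc/{document_id}")


def delete_document_request(index_name, document_id):
    """HTTP method and path for deleting one document by ID."""
    return ("DELETE", f"/{index_name}/_doc/{document_id}")


# Hypothetical index name and document ID.
for method, path in (
    create_index_request("books"),
    get_document_request("books", "1"),
    delete_document_request("books", "1"),
):
    print(method, path)
```

The printed lines have the same shape as requests typed into the Kibana console; with an HTTP client they would be sent to the address of a node in the cluster.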
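Ans: Elasticsearch provides a JSON-based query language called Query DSL (Domain Specific Language), which is built on top of Lucene's query capabilities. Lucene's own query string syntax is also available through the query_string query.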
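Ans: Query DSL (Domain Specific Language) is used to define JSON-based queries in Elasticsearch. It has two types of clauses: leaf query clauses, which look for a particular value in a particular field (for example match, term or range), and compound query clauses, which wrap other leaf or compound clauses to combine them logically (for example bool).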
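A minimal sketch of a Query DSL request body, written as a Python dictionary, shows how a compound bool query wraps several leaf queries. The field names title, status and year and their values are assumptions for this example.

```python
import json

# Leaf clauses: each one looks for a value in a single field.
match_title = {"match": {"title": "search engine"}}
term_status = {"term": {"status": "published"}}
range_year = {"range": {"year": {"gte": 2020}}}

# Compound clause: a bool query that combines the leaf clauses.
query = {
    "query": {
        "bool": {
            "must": [match_title],                # must match, contributes to the score
            "filter": [term_status, range_year],  # must match, does not affect the score
        }
    }
}

print(json.dumps(query, indent=2))
```

Putting the exact-value conditions under filter rather than must is a common choice, because filter clauses only answer yes or no and leave the relevance score to the full-text part of the query.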
Ans: A shard is a bucket that stores part of an index's data. When an index is created in Elasticsearch, its data is split into shards, which are stored across the nodes. An index can be made up of a single shard or of multiple shards. We specify the number of primary shards at the time of creating the index.
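How a document ends up in a particular shard can be sketched as a hash of its routing value (by default the document ID) taken modulo the number of primary shards. The code below is a simplified illustration: zlib.crc32 is only a stand-in for the hash function Elasticsearch actually uses, and the function name pick_shard is hypothetical.

```python
import zlib


def pick_shard(routing_value, num_primary_shards):
    """Map a routing value (by default the document ID) to a shard number."""
    # zlib.crc32 is a stand-in; Elasticsearch uses its own hash function.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % num_primary_shards


num_shards = 3
for doc_id in ("1", "2", "3", "4", "5"):
    print("document", doc_id, "-> shard", pick_shard(doc_id, num_shards))
```

Because the shard number depends on the number of primary shards, changing that number would send existing documents to different shards. This is why the number of primary shards is fixed when the index is created.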
Ans: A replica is a copy of a shard in Elasticsearch. To avoid data loss in case of a failure, replica shards are stored on different nodes from their primary shards so that the data remains available. The default number of replicas in Elasticsearch is 1. We can change this setting manually to have more replicas of the data.
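The shard and replica counts are set through index settings. The sketch below builds such a settings body in Python and computes how many shard copies the cluster would hold in total; the helper names and the example values 3 and 2 are assumptions.

```python
import json


def index_settings(number_of_shards, number_of_replicas):
    """Build the settings body used when creating an index."""
    if number_of_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if number_of_replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }


def total_shard_copies(number_of_shards, number_of_replicas):
    """Each primary shard is stored once plus once per replica."""
    return number_of_shards * (1 + number_of_replicas)


body = index_settings(3, 2)
print(json.dumps(body))
print("total shard copies:", total_shard_copies(3, 2))
```

With 3 primary shards and 2 replicas, the cluster stores 9 shard copies, so at least 3 nodes are needed for every copy of a given shard to sit on a different node.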
Ans: An analyzer defines how text analysis is performed, i.e., how text (both the indexed content and the search sentence) is divided into terms. Elasticsearch provides built-in analyzers. We can also create a custom analyzer by combining character filters, a tokenizer and token filters.
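The three stages of an analyzer (character filters, a tokenizer, token filters) can be imitated in plain Python. This is only a toy model to show the order of the stages; the function names and the tiny stop-word list are made up and do not correspond to Elasticsearch's built-in components.

```python
import re

# Tiny hypothetical stop-word list for the example.
STOP_WORDS = {"a", "an", "the", "is", "of"}


def strip_html(text):
    """Character filter: replace HTML tags with spaces."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Tokenizer: split the text into word tokens."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    """Token filter: drop very common words."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Run the stages in order: character filter, tokenizer, token filters."""
    tokens = tokenize(strip_html(text))
    return remove_stop_words(lowercase(tokens))


print(analyze("<p>The Quick Brown Fox</p>"))  # ['quick', 'brown', 'fox']
```

The same analysis is normally applied to the indexed text and to the search text, which is why a search for "quick" can find a document that contains "Quick".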
Ans: ECS (Elastic Common Schema) is an open-source specification, developed with support from the Elastic user community, that gives developers a common schema to follow. It defines a common set of fields to be used when storing event data, such as logs and metrics, in Elasticsearch.
Ans: An inverted index is the data structure that enables full-text search. It maps every unique term that appears in the documents to the list of documents (by ID) in which that term appears. When we search a text field, Elasticsearch looks up the search terms in the inverted index to find the matching documents instead of scanning every document.
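A toy inverted index can be built in a few lines of Python. The sketch below uses a dictionary of sets and a naive lowercase-and-split tokenizer; Lucene's real on-disk structures are far more elaborate, and the sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each term to the set of IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted IDs of the documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical documents keyed by ID.
docs = {
    1: "Elasticsearch is fast",
    2: "Lucene powers Elasticsearch",
    3: "Kibana draws charts",
}
index = build_inverted_index(docs)

print(search(index, "elasticsearch"))  # [1, 2]
print(search(index, "kibana"))         # [3]
print(search(index, "logstash"))       # []
```

A lookup touches only the entry for the search term, so its cost does not grow with the total amount of text in the way a scan of every document would.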
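Ans: Filters are used, typically on structured data such as dates, numbers or exact keywords, to narrow down search results. A filter only answers whether a document matches or not; it does not affect the relevance score, and frequently used filters can be cached. Older versions of Elasticsearch provided filters such as AND, OR and EXISTS; in current versions the AND and OR logic is expressed with the clauses of the bool query, while exists remains available as a query.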
Ans: Elasticsearch's scripting language, Painless, supports for, while and do-while loops, which can be used in scripts to loop through values such as the fields of a document.
Ans: When we search for a document, Elasticsearch uses the Lucene scoring algorithm (BM25 by default in current versions), which calculates a relevance score for each matching document based on factors such as term frequency, how rare the term is across the index, and field length. The score is returned in the ‘_score’ field as a positive floating-point number for each document. By default, results are sorted by this score, so the most relevant documents are returned first.
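The way term frequency and document length influence the score can be sketched with the textbook BM25 formula for a single term. The parameter values k1 = 1.2 and b = 0.75 are the usual defaults; the function name and the sample numbers are assumptions, and Lucene's implementation differs in details such as how field lengths are encoded.

```python
import math


def bm25_term_score(term_freq, doc_len, avg_doc_len, num_docs, docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook BM25 score of one term for one document."""
    # Rare terms (present in few documents) get a higher weight.
    idf = math.log(1 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Term frequency saturates, and longer documents are penalized.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    tf = term_freq * (k1 + 1) / (term_freq + k1 * length_norm)
    return idf * tf


# Hypothetical index statistics: 1000 documents, 10 of them contain the term.
once = bm25_term_score(term_freq=1, doc_len=100, avg_doc_len=100,
                       num_docs=1000, docs_with_term=10)
twice = bm25_term_score(term_freq=2, doc_len=100, avg_doc_len=100,
                        num_docs=1000, docs_with_term=10)
long_doc = bm25_term_score(term_freq=1, doc_len=400, avg_doc_len=100,
                           num_docs=1000, docs_with_term=10)

print(once, twice, long_doc)
```

With these inputs, the document that contains the term twice scores higher than the one that contains it once, and the longer document scores lower for the same term frequency. For a query with several terms, the per-term scores are added together.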