Elasticsearch Tutorial

Welcome to the Elasticsearch tutorial! This tutorial covers the fundamental concepts that are needed to acquire good knowledge of Elasticsearch. Through this Elasticsearch training, you will be able to get hands-on experience as well.

What is Elasticsearch and how can it be used?

Elasticsearch is an open-source and distributed analytics search engine. It is written in Java and was released in the year 2010. Elasticsearch is built on top of Lucene. It is a document-oriented engine that stores, searches, and analyzes data in large quantities. It is optimized to work well on huge data sets such that the search happens in near real-time.

It works with both structured and unstructured documents. Once a document is indexed, it becomes available for search. Data in Elasticsearch is stored as JSON documents. Traditional databases can take a long time to run a text search query, whereas Elasticsearch is designed for fast retrieval and supports full-text search.

The features of Elasticsearch are exposed as REST APIs, for example:

  • Index API - for indexing documents
  • Get API - for retrieving a document
  • Search API - for querying the documents
  • Put Mapping API - for defining the custom mapping

Elasticsearch can be used to build a search engine for any type of application. We have an option to configure auto-suggestions, pagination of results, etc.

Advantages of Elasticsearch

Distributed Storage

When an index is created, it is divided into shards, and each shard can have zero or more replicas. The shards and their replicas are distributed across the nodes of a cluster.

Scalability

It runs on a single machine as well as on a cluster of nodes.

Improved performance

Since the indices are based on a distributed approach, the search results of a query will be retrieved very quickly.

Near real-time

Search results are available in near real-time, i.e., a document becomes searchable shortly after it is indexed (by default, within about one second).

Document oriented

The search is document-oriented. When the documents are stored, all the fields will be indexed by default.

Installation of Elasticsearch

First, make sure that Java 8 or higher is installed on your machine. Go to the Download Elasticsearch page, choose an installer file based on your operating system, and download it.

Windows

Download the elasticsearch-7.7.1.msi installer file. Run the installer file and follow the prompts to finish the installation. The environment path will be automatically set during installation.

Linux

Download the elasticsearch-7.7.1-linux-x86_64.tar.gz tar file. Extract it using the command below:

$ tar -xzf elasticsearch-7.7.1-linux-x86_64.tar.gz

To run Elasticsearch, go to the installation location and run the elasticsearch script under bin using the commands below:

$ cd installationpath/bin

$ ./elasticsearch

The installationpath in the above command would be the path of the installation folder.

Mac

Download the elasticsearch-7.7.1-darwin-x86_64.tar.gz tar file and extract it using the below command,

$ tar -xvf elasticsearch-7.7.1-darwin-x86_64.tar.gz

Add the below lines in .bash_profile file to set the path to environment variables,

export ES_HOME=~/installationpath

export PATH=$ES_HOME/bin:$PATH

To run Elasticsearch, execute the below command,

$ elasticsearch

Once Elasticsearch is running, it listens on port 9200 by default. To check whether it is up and running, open http://localhost:9200/ in a browser. You should get a JSON response like the one below:

{
  "name" : "",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "7_soBVcOQuaShrZur0kjzw",
  "version" : {
    "number" : "7.7.1",
    "build_flavor" : "unknown",
    "build_type" : "unknown",
    "build_hash" : "ad56dce891c901a492bb1ee393f12dfff473a423",
    "build_date" : "2020-05-28T16:30:01.040088Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Kibana Installation

Kibana is a front-end application for Elasticsearch. We can run all our queries using Kibana. 

Go to Download Kibana and download the zip file that suits your OS. Unzip the file to your desired installation path. 

If you are using Windows, run the command - bin\kibana.bat

If you are using Linux or Mac, open config/kibana.yml and set the URL of your Elasticsearch instance in the elasticsearch.hosts field. Then open a terminal and run bin/kibana

Once Kibana has started, go to http://localhost:5601 to access Elasticsearch through Kibana.

API Conventions

We can read, update, and delete data in Elasticsearch using REST APIs. Here are some conventions that apply to the REST APIs:

Multiple Indices

We can search for documents present in different indices through a single search query. Indices can be given using the following notations:

  • Comma-separated (indexname1, indexname2, indexname3)
  • _all for searching through all the indices
  • Wildcards (*name, index*, *indexname*)
  • - for excluding an index (indexname*, -indexname2)
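The notations above all end up in the path of the request URL. The following is a minimal Python sketch of how such a path could be assembled for a search request; the helper name search_path and the index names are hypothetical and only serve as an illustration. Note that the index names are joined with commas without spaces.

```python
from urllib.parse import quote


def search_path(index_patterns):
    """Build the URL path of a _search request that targets several indices."""
    joined = ",".join(index_patterns)
    # Keep ',', '*' and '-' literal: they carry meaning in index expressions.
    return "/" + quote(joined, safe=",*-") + "/_search"


# Hypothetical index names, used only for illustration.
print(search_path(["employees", "contractors"]))   # comma-separated
print(search_path(["emp*", "-employees_old"]))     # wildcard plus exclusion
print(search_path(["_all"]))                       # every index
```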

Date Math Support in Index Names

Date math in index names lets you address time-series indices by a date relative to the current time, so that a search can be limited to the indices for the range of dates you want. To do that, wrap the index name in angle brackets and include a date math expression in it. Below is the syntax:

<index_name{date_math_expr{date_format|time_zone}}>

  • index_name: The static part of the index name
  • date_math_expr: A date math expression that computes the date dynamically (for example, now/d)
  • date_format: The format in which the computed date is rendered
  • time_zone: The time zone to use. By default, it is UTC
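As an illustration of the syntax above, here is a small Python sketch. It assumes a hypothetical static name "logs-", the expression now/d (the current time rounded down to the day), and a default date format of yyyy.MM.dd. The first helper only mimics on the client side what the name would resolve to; the second percent-encodes the expression, which is needed because characters such as <, { and / are not allowed as-is in a URL path.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def resolve_daily_index(static_name, now):
    """Mimic how <static_name{now/d}> resolves, assuming the format yyyy.MM.dd."""
    # now/d rounds down to the start of the day, in UTC by default.
    return static_name + now.astimezone(timezone.utc).strftime("%Y.%m.%d")


def encode_index_expression(expression):
    """Percent-encode a date math index name so it can be used in a URL path."""
    return quote(expression, safe="")


# "logs-" is a hypothetical static index name, used only for illustration.
expression = "<logs-{now/d}>"
print(encode_index_expression(expression))
print(resolve_daily_index("logs-", datetime(2024, 3, 22, 12, 0, tzinfo=timezone.utc)))
```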

Cron Expressions

We can schedule triggers in Elasticsearch using cron expressions. The syntax for this is:

<seconds> <minutes> <hours> <day_of_month> <month> <day_of_week> [year]

We can schedule daily, monthly, yearly, or for a range of days too. For example, if we give the expression as 0 15 11 ? 6 SAT, then a trigger will be scheduled to run at 11:15 AM UTC on every Saturday in the month of June.
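To make the positions in such an expression easier to read, the following Python sketch splits it into named fields. It assumes the field order seconds, minutes, hours, day of month, month, day of week, and an optional year; the helper parse_cron is hypothetical and does not validate the values themselves.

```python
CRON_FIELDS = (
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
)


def parse_cron(expression):
    """Split a cron expression into a dict of named fields."""
    parts = expression.split()
    # Six fields are mandatory; the seventh (year) is optional.
    if not 6 <= len(parts) <= 7:
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    return dict(zip(CRON_FIELDS, parts))


fields = parse_cron("0 15 11 ? 6 SAT")
print(fields["hours"], fields["minutes"], fields["month"], fields["day_of_week"])
```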

Common Options

Elasticsearch provides a number of common options with which we can customize the output of any REST API. Most of them are passed as query string parameters appended to the request URL. Some of the common options are pretty-printed results (?pretty), human-readable output (?human), distance units, and response filtering (filter_path).
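A short Python sketch of appending such options to a request path is given below. The helper with_options is hypothetical; pretty and filter_path are the query string parameters for pretty-printed results and response filtering.

```python
from urllib.parse import urlencode


def with_options(path, **options):
    """Append common options to a request path as query string parameters."""
    if not options:
        return path
    # Booleans are sent as lowercase true/false.
    params = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in options.items()
    }
    return path + "?" + urlencode(params)


# Pretty-print the response and keep only the _source of each hit.
print(with_options("/employees/_search", pretty=True, filter_path="hits.hits._source"))
```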

URL based access control

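Some APIs, such as multi-search, multi-get, and bulk, accept an index name in the request body, which can override the index given in the URL. For more secure access to your indices, this can be prevented by setting rest.action.multi.allow_explicit_index to false in elasticsearch.yml, so that only the index in the URL is used.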

Create an Elasticsearch index

Open Kibana and go to Dev Tools in the left side menu to access the console.

Execute the following command to create an index:

PUT employees

You will get the below response on successful creation of the index,

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "employees"
}

Each document in an index is identified by an id. Here we specify the id in the request; if the id is omitted (POST employees/_doc), Elasticsearch generates one. To add a document to the created index, execute the below command:

POST employees/_doc/1
{
  "name" : "John Doe",
  "Employee ID" : "1234",
  "Department" : "Development",
  "Location" : "Andhra Pradesh"
}

The response would be,

{
  "_index" : "employees",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Add some more documents using the syntax specified above. Remember to give a unique id for each of the documents added to the index.
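When several documents have to be added, they can also be sent in one request through the bulk API (POST _bulk, with the Content-Type header application/x-ndjson). The Python sketch below only builds the request body and does not send it; the helper bulk_index_body is hypothetical. Each document takes two lines: an action line that names the index and the id, followed by the document itself.

```python
import json


def bulk_index_body(index, documents):
    """Build a newline-delimited JSON body for the _bulk API."""
    lines = []
    for doc_id, source in documents.items():
        # Action line: index this document under the given id.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


documents = {
    "4": {"name": "Jessica Davis", "Employee ID": "1236",
          "Department": "Operations", "Location": "Andhra Pradesh"},
    "5": {"name": "Justin Foley", "Employee ID": "1237",
          "Department": "Operations", "Location": "Andhra Pradesh"},
}
body = bulk_index_body("employees", documents)
print(body, end="")
```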

Search API

We can search through the documents that were added to an index. Continuing the above example, to search for the employees whose location is Andhra Pradesh, the request would be:

POST /employees/_search
{
   "query":{
      "query_string":{
         "query":"Andhra Pradesh"
      }
   }
}

This request returns the documents that match the query, along with their details. The response would be:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.6471939,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.6471939,
        "_source" : {
          "name" : "John Doe",
          "Employee ID" : "1234",
          "Department" : "Development",
          "Location" : "Andhra Pradesh"
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.6471939,
        "_source" : {
          "name" : "Jessica Davis",
          "Employee ID" : "1236",
          "Department" : "Operations",
          "Location" : "Andhra Pradesh"
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.6471939,
        "_source" : {
          "name" : "Justin Foley",
          "Employee ID" : "1237",
          "Department" : "Operations",
          "Location" : "Andhra Pradesh"
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.6471939,
        "_source" : {
          "name" : "Tyler Down",
          "Employee ID" : "1243",
          "Department" : "Development",
          "Location" : "Andhra Pradesh"
        }
      }
    ]
  }
}
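In an application, the part of this response that usually matters is the list under hits.hits, where each entry carries the stored document in _source. The following Python sketch shows one way to read it, using a trimmed copy of the response above; the helper names_from_hits is hypothetical.

```python
def names_from_hits(response):
    """Return the name field of every document in a search response."""
    return [hit["_source"]["name"] for hit in response["hits"]["hits"]]


# Trimmed copy of the search response shown above (first two hits only).
response = {
    "hits": {
        "total": {"value": 4, "relation": "eq"},
        "hits": [
            {"_id": "1", "_source": {"name": "John Doe", "Location": "Andhra Pradesh"}},
            {"_id": "4", "_source": {"name": "Jessica Davis", "Location": "Andhra Pradesh"}},
        ],
    }
}

print(response["hits"]["total"]["value"])
print(names_from_hits(response))
```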

Get API

If we want to get the details of a particular document, we can do so by specifying the id through GET API like below,

GET employees/_doc/3

Response :

The response contains the metadata of the document (_index, _type, _id, _version, _seq_no, _primary_term), a found flag that tells whether the document exists, and the fields of the document under _source. Unlike a search response, it contains a single document and no hits array.

Delete API

For deleting a document in an index, the API will be,

DELETE employees/_doc/10

Response : 

{
  "_index" : "employees",
  "_type" : "_doc",
  "_id" : "10",
  "_version" : 2,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 11,
  "_primary_term" : 1
}

Update API

To update fields in a document, we need to specify the document id along with the fields to change and their new values, wrapped in a doc object. For example, to update a field of document 7 (the field and value below are only illustrative):

POST employees/_update/7
{
  "doc" : {
    "Department" : "Operations"
  }
}

Response : 

{
  "_index" : "employees",
  "_type" : "_doc",
  "_id" : "7",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 12,
  "_primary_term" : 1
}

Query DSL

Query DSL (Domain Specific Language) is used to define JSON-based queries in Elasticsearch. It has two types of clauses:

  • Leaf Query Clause - to find a specific value through match, term or range
  • Compound Query Clause - multiple queries, basically a combination of leaf query clauses 

Query DSL supports both exact queries, such as term queries and filters, and approximate queries, such as fuzzy and regexp queries. The two can be combined in a compound query.
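As an illustration, the Python sketch below builds the body of a compound bool query that combines two leaf clauses: a full-text match on Location and an exact term filter on Department. It assumes the employees index from the earlier examples and the default dynamic mapping, under which a string field also gets a .keyword sub-field for exact matching; the helper employees_in is hypothetical. The resulting JSON would be sent with POST /employees/_search.

```python
import json


def employees_in(location, department):
    """Build a bool query: full-text match on Location, exact filter on Department."""
    return {
        "query": {
            "bool": {
                # Leaf clause 1: analyzed full-text match, contributes to the score.
                "must": [{"match": {"Location": location}}],
                # Leaf clause 2: exact value filter, does not affect the score.
                # Assumes the default .keyword sub-field exists for Department.
                "filter": [{"term": {"Department.keyword": department}}],
            }
        }
    }


body = employees_in("Andhra Pradesh", "Operations")
print(json.dumps(body, indent=2))
```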

Mapping, Text Analysis, and Index Modules 

Mapping

Every index has a mapping. A mapping is metadata, i.e., the definition of the fields that the documents in the index contain and their types. When we add a document with new fields to an index, the mapping for those fields is added automatically (dynamic mapping). We can also define the mapping explicitly.
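A manual mapping is expressed as a JSON body that lists field names and their types under properties. The Python sketch below builds such a body for a new field; the field name "Joining Date" and the helper field_mapping are hypothetical. The resulting JSON would be sent with the Put Mapping API (PUT employees/_mapping).

```python
import json


def field_mapping(fields):
    """Build a Put Mapping request body from a {field name: type} dict."""
    return {
        "properties": {name: {"type": field_type} for name, field_type in fields.items()}
    }


# "Joining Date" is a hypothetical new field, used only for illustration.
mapping = field_mapping({"Joining Date": "date"})
print(json.dumps(mapping, indent=2))
```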

Text Analysis

Text analysis is the process of converting text into a structured format, such as breaking up a sentence into words, so that it can be searched efficiently. Elasticsearch performs text analysis on fields of type text, both when a document is indexed and when those fields are searched.
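The idea can be sketched in a few lines of Python. This is a deliberately rough approximation, not the analyzer that Elasticsearch uses: it only splits the text on non-word characters and lowercases the tokens. It is enough to show why a search for "ANDHRA" can match a document whose Location is "Andhra Pradesh": both sides are reduced to the same tokens.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: split into words and lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", text)]


indexed_tokens = analyze("Andhra Pradesh")
query_tokens = analyze("ANDHRA")

print(indexed_tokens)
# A document matches when a query token appears among its indexed tokens.
print(any(token in indexed_tokens for token in query_tokens))
```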

Index Modules

Index modules control different aspects of an index. Their settings are of two types:

  • Static Settings - These can be set only at the time of index creation or on a closed index
  • Dynamic Settings - These can be set when the index is live on Elasticsearch
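To illustrate the difference, the Python sketch below separates a few well-known index settings into the two groups: number_of_shards is static, while number_of_replicas and refresh_interval are dynamic. The two sets cover only these three settings and the helper split_index_settings is hypothetical. The static part would go into the body of the index creation request, and the dynamic part could be changed later through PUT employees/_settings.

```python
# Only a small, illustrative subset of the available index settings.
STATIC_SETTINGS = {"number_of_shards"}
DYNAMIC_SETTINGS = {"number_of_replicas", "refresh_interval"}


def split_index_settings(settings):
    """Split settings into (static, dynamic) dicts; reject unknown names."""
    unknown = set(settings) - STATIC_SETTINGS - DYNAMIC_SETTINGS
    if unknown:
        raise ValueError("settings not covered by this sketch: %s" % sorted(unknown))
    static = {k: v for k, v in settings.items() if k in STATIC_SETTINGS}
    dynamic = {k: v for k, v in settings.items() if k in DYNAMIC_SETTINGS}
    return static, dynamic


static, dynamic = split_index_settings(
    {"number_of_shards": 3, "number_of_replicas": 2, "refresh_interval": "5s"}
)
print(static)   # set when the index is created
print(dynamic)  # can be updated on a live index
```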

Conclusion

Elasticsearch is widely used in building websites where product or document search is essential. Client libraries are available for various programming languages. It supports search on documents in 34 text languages. Elasticsearch is also provided as a managed service on AWS, GCP, and Alibaba Cloud.

Mudassir