How to back up and restore Elasticsearch indices using Curator

Prabhin Prabharkaran Administrator
DevOps Engineer

Hi all, this document shows how to back up and restore Elasticsearch indices using Curator.
Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.

Elasticsearch Curator helps you curate, or manage, your Elasticsearch indices and snapshots. It obtains the full list of indices (or snapshots) from the cluster as the actionable list, then iterates through a list of user-defined filters to progressively remove indices (or snapshots) from this actionable list as needed. The configured action is then performed on whatever remains in the list.

Step 1: Install curator

On CentOS, create the repo file with the following content:

vi /etc/yum.repos.d/curator.repo
[curator-5]
name=CentOS/RHEL 7 repository for Elasticsearch Curator 5.x packages
baseurl=https://packages.elastic.co/curator/5/centos/7
gpgcheck=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1

Install Elasticsearch Curator

yum install elasticsearch-curator

On Ubuntu
Import the Elastic PGP signing key first.

wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Add the Curator repository to the system.
Create a file called curator.list

vi /etc/apt/sources.list.d/curator.list

Add the below line in the above-mentioned file.

deb [arch=amd64] https://packages.elastic.co/curator/5/debian stable main

Update the repo by executing the below command.

sudo apt-get update

Install Elasticsearch Curator

sudo apt-get install elasticsearch-curator

Step 2: Add the repo in elasticsearch.yml file

vi /etc/elasticsearch/elasticsearch.yml
path.repo: ["/backup/es"]

Restart the Elasticsearch service for the change to take effect.

service elasticsearch restart

Here I'm using a shared filesystem as the repository.
For an Elasticsearch cluster, make sure the same directory is mounted at the same path on all the Elasticsearch nodes. The file system has to be a shared one, such as an NFS share.

Step 3: Configure the snapshot repo by executing the below API

curl -XPUT -H "Content-Type: application/json;charset=UTF-8" \
  'http://localhost:9200/_snapshot/esbackup' -d '{
  "type": "fs",
  "settings": {
    "location": "/backup/es",
    "compress": true
  }
}'

Note: In a cluster setup, this needs to be executed only once, against any one of the nodes, because the repository registration is stored in the cluster state. The master node is a good choice.
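
If you prefer to script this step, here is a minimal Python sketch of the same request, assuming the host, repository name and location used in this post. The helper name build_repo_request is my own and not part of any library. It only builds the PUT request, so you can inspect it before sending anything to the cluster.

```python
import json
import urllib.request


def build_repo_request(host, repo_name, location, compress=True):
    """Build (but do not send) the PUT request that registers an 'fs' repository."""
    body = {
        "type": "fs",
        "settings": {"location": location, "compress": compress},
    }
    return urllib.request.Request(
        url=f"http://{host}:9200/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Values taken from this post; change them for your own setup.
req = build_repo_request("localhost", "esbackup", "/backup/es")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it against a running node, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Keeping the build and the send separate makes it easy to check the URL and body first.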

Step 4: Configure curator

cd

mkdir .curator
vi .curator/curator.yml

Add the below lines

---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
client:
  hosts: ["10.1.15.13", "10.1.15.12", "10.1.15.11"]
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile: '/root/.curator/log.log'
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

In my case I have a cluster, so I list all three nodes. Add only one IP address if you have a standalone instance.

Create the log file

touch /root/.curator/log.log

Step 5: Now let's create the YAML files for snapshot, delete and restore

mkdir yaml-files
cd yaml-files
vi snapshot.yml

Add the below lines

---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: snapshot
    description: >-
      Snapshot indices older than 90 days (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'.  Wait for the snapshot to complete.  Do not skip
      the repository filesystem access check.  Use the other options to create
      the snapshot.
    options:
      repository: esbackup
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: 'fluentd'
      exclude: True
    - filtertype: kibana
      exclude: True
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 90
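
To make the filter chain above easier to follow, here is a rough Python sketch of what the three filters do to the actionable list. This is my own simplified illustration, not Curator's actual code: the function name and the sample index names and dates are made up, and the Kibana check is reduced to a simple name prefix test.

```python
import re
from datetime import datetime, timedelta


def select_for_snapshot(indices, now, days=90):
    """indices maps index name -> creation datetime (hypothetical sample data)."""
    selected = []
    for name, created in indices.items():
        if re.search(r"fluentd", name):            # pattern filter, exclude: True
            continue
        if name.startswith(".kibana"):              # kibana filter, exclude: True (simplified)
            continue
        if now - created > timedelta(days=days):    # age filter, direction: older
            selected.append(name)
    return sorted(selected)


now = datetime(2020, 9, 15)
sample = {
    "nginx-2020.05.01": datetime(2020, 5, 1),     # older than 90 days, kept
    "nginx-2020.09.01": datetime(2020, 9, 1),     # too recent, dropped
    "fluentd-2020.01.01": datetime(2020, 1, 1),   # dropped by the pattern filter
    ".kibana_1": datetime(2019, 1, 1),            # dropped by the kibana filter
}
print(select_for_snapshot(sample, now))  # → ['nginx-2020.05.01']
```

Each filter only removes entries from the list, so the order of the filters does not change the final result here.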

Step 6: Perform snapshot.

curator --config /root/.curator/curator.yml snapshot.yml

Expected output in the log file:
Snapshot curator-20200915083757 successfully completed.
2020-09-15 08:38:06,980 INFO Action ID: 1, "snapshot" completed.
2020-09-15 08:38:06,980 INFO Job completed.
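
The snapshot name in the log comes from the default pattern 'curator-%Y%m%d%H%M%S' mentioned in snapshot.yml. As a small sketch, assuming the pattern is expanded with the usual strftime rules, this is how that name is formed (the function name is my own):

```python
from datetime import datetime


def default_snapshot_name(when):
    """Expand the default Curator snapshot name pattern for a given time."""
    return when.strftime("curator-%Y%m%d%H%M%S")


name = default_snapshot_name(datetime(2020, 9, 15, 8, 37, 57))
print(name)  # → curator-20200915083757
```

Because the timestamp is part of the name, names sort in the order the snapshots were taken.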

Step 7: Create yaml file for delete.

vi delete.yml

Add the below lines

---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 90 days. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: kibana
      exclude: True
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 90
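
Note that this age filter uses source: name, so the age comes from the date in the index name and not from the creation date. Here is a minimal Python sketch of that idea, assuming index names that carry a date in the '%Y.%m.%d' format. It is only an illustration with made-up helper names, not how Curator itself is written.

```python
import re
from datetime import datetime, timedelta


def index_date_from_name(name, timestring="%Y.%m.%d"):
    """Return the date embedded in an index name, or None if there is none."""
    # This regex matches only the '%Y.%m.%d' shape used in delete.yml.
    match = re.search(r"\d{4}\.\d{2}\.\d{2}", name)
    if match is None:
        return None
    return datetime.strptime(match.group(0), timestring)


def is_older_than(name, now, days=90):
    """True if the date in the index name is more than `days` before `now`."""
    stamp = index_date_from_name(name)
    return stamp is not None and now - stamp > timedelta(days=days)


now = datetime(2020, 9, 15)
print(is_older_than("nginx-2020.05.01", now))  # → True
print(is_older_than("nginx-2020.08.31", now))  # → False
print(is_older_than("nginx", now))             # → False (no date in the name)
```

In this sketch an index without a date in its name is never selected, so make sure your index names really carry the date if you rely on source: name.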

Step 8: Delete indexes.

curator --config /root/.curator/curator.yml delete.yml

Expected output in the log file:

---deleting index nginx
---deleting index nodejs
---deleting index
Action ID: 1,
Job completed.

Step 9: Perform restore

vi restore.yml

Add the below lines

---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: close
    description: "Close indices before restoring snapshot"
    options:
      continue_if_exception: True
      ignore_empty_list: True
    filters:
    - filtertype: kibana
      exclude: false
    - filtertype: age 
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 15
  2:
    action: restore
    description: Restore indices from the most recent successful snapshot in the esbackup repository
    options:
      repository: esbackup
      name: 
      indices: 
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
    - filtertype: state
      state: SUCCESS
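
In the restore action above, name is left blank and the only filter is state: SUCCESS. My understanding from the Curator documentation is that in this case the most recent snapshot left in the list is the one restored. Here is a small Python sketch of that selection, using made-up snapshot entries. The field names follow what the Elasticsearch snapshot API returns, but the values and the function name are hypothetical.

```python
def pick_snapshot_to_restore(snapshots):
    """Return the name of the newest snapshot whose state is SUCCESS, or None."""
    ok = [s for s in snapshots if s["state"] == "SUCCESS"]
    if not ok:
        return None
    newest = max(ok, key=lambda s: s["start_time_in_millis"])
    return newest["snapshot"]


# Hypothetical sample data; only the ordering of the times matters here.
snapshots = [
    {"snapshot": "curator-20200913083757", "state": "SUCCESS", "start_time_in_millis": 1000},
    {"snapshot": "curator-20200915083757", "state": "SUCCESS", "start_time_in_millis": 3000},
    {"snapshot": "curator-20200916083757", "state": "FAILED", "start_time_in_millis": 4000},
]
print(pick_snapshot_to_restore(snapshots))  # → curator-20200915083757
```

If you want a specific snapshot instead, set name in restore.yml to that snapshot's name.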

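Run the restore the same way as the previous actions:

curator --config /root/.curator/curator.yml restore.yml

Expected output in the log file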
nginx-2020.08.31
2020-09-15 08:41:26,292 INFO
have been restored.

NOTE: Before restoring, make sure the snapshot data is present in the repository location. In my case, esbackup is the repository and its location is /backup/es.

You can use filters in your Curator YAML files in case you need to include or exclude some indices by index name, prefix, suffix, age, etc.
Refer to the Elasticsearch Curator documentation for the supported values.

 

© 2020, Techrunnr. All rights reserved.
