Elasticsearch: Using Curator alongside Marvel to automatically delete old Marvel indices

2017/12/11 posted in  ELK

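Marvel is standard equipment for almost every Elasticsearch user. The cost of keeping its monitoring data is that,
by default, it creates a new index every day, named like this: .marvel-2017.12.10.
Marvel's own indices can produce roughly 500 MB of data per day, and as they accumulate they take up more and more disk space.
Is there a way to make them expire automatically, for example keeping only the last two days of monitoring data and discarding everything else?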

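There is: curator can do this for you.

What is curator?

It is a command-line tool that helps you manage your indices in Elasticsearch: it can delete, close,
and open them. That is only a partial description, of course; the full documentation is at: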
https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html

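Hands-on

Our cluster runs Elasticsearch 2.1.1.
Following the official site, I installed the latest 5.x release of curator, and it complained that the version was wrong.
Following this blog post, http://blog.csdn.net/hereiskxm/article/details/47423715, I then installed version 3.3.0.
That also reported a version error.

After some searching, it seemed I should install a version somewhere in between, so I installed 4.0.0: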

pip install elasticsearch-curator==4.0.0

Then I looked at the options this version provides:

 curator --help
Usage: curator [OPTIONS] ACTION_FILE

  Curator for Elasticsearch indices.

  See http://elastic.co/guide/en/elasticsearch/client/curator/current

Options:
  --config PATH  Path to configuration file. Default: ~/.curator/curator.yml
  --dry-run      Do not perform any changes.
  --version      Show the version and exit.
  --help         Show this message and exit.

This looks identical to the latest 5.x release I had installed earlier. I also happened to find a way to configure it on this page:
https://stackoverflow.com/questions/33430055/removing-old-indices-in-elasticsearch/42268400#42268400

The 3.3.0 version from the blog post mentioned earlier, by contrast, is not compatible.
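
Since the trial and error above came down to matching the curator release to the cluster, it can help to confirm what version the cluster actually reports before installing anything. The following is a minimal Python sketch, not part of the original setup. It assumes the standard Elasticsearch root endpoint (GET /), which returns JSON containing a version.number field. The function names are hypothetical, and the default host and port are simply the ones used in the config file further down.

```python
import json
import re
from urllib.request import urlopen


def parse_es_version(root_response_body):
    """Extract (major, minor, patch) from the JSON body returned by GET /."""
    info = json.loads(root_response_body)
    number = info["version"]["number"]  # e.g. "2.1.1"
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", number)
    if match is None:
        raise ValueError("unexpected version string: %r" % number)
    return tuple(int(part) for part in match.groups())


def fetch_es_version(host="10.10.25.217", port=9200, timeout=5):
    """Ask a node for its version (needs network access to the cluster)."""
    with urlopen("http://%s:%d/" % (host, port), timeout=timeout) as resp:
        return parse_es_version(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a trimmed-down sample response.
    sample = '{"version": {"number": "2.1.1"}}'
    print(parse_es_version(sample))
```

Knowing the major version up front makes it easier to look up a matching release in curator's own compatibility notes rather than guessing.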

Usage

The goal is to delete indices whose names start with .marvel and that are more than 2 days old.

Create the directory /opt/curator:

~ pwd
/opt/curator
~ ll
total 12
-rw-r--r-- 1 root root  184 Dec 12 10:48 config_file.yml
-rw-r--r-- 1 root root 1311 Dec 12 10:37 delete_marvel_indices.yml
drwxr-xr-x 2 root root 4096 Dec 12 10:49 logs

config_file.yml

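# Remember: the logfile path (here the /opt/curator/logs directory) must be created in advance, otherwise curator fails at startup.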
vim config_file.yml

---
client:
  hosts:
   - 10.10.25.217
  port: 9200
logging:
  loglevel: INFO
  logfile: "/opt/curator/logs/actions.log"
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
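
The note at the top of this config says the logfile location has to exist before curator starts. As a small illustration, here is one way to prepare it using only the Python standard library. The function name is hypothetical and the path is the one from the config above. This is a sketch of the preparation step, not something curator provides.

```python
import os


def ensure_log_dir(logfile_path):
    """Create the parent directory of the logfile if it is missing.

    Returns the directory path so the caller can log or verify it.
    """
    log_dir = os.path.dirname(logfile_path)
    if log_dir:
        # exist_ok=True makes the call safe to repeat on every run.
        os.makedirs(log_dir, exist_ok=True)
    return log_dir


if __name__ == "__main__":
    # Needs write permission on /opt/curator.
    print(ensure_log_dir("/opt/curator/logs/actions.log"))
```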

delete_marvel_indices.yml

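Delete indices that have the .marvel prefix and are more than 2 days old (action 2 below; action 1 is a similar rule that removes rc- prefixed indices after 30 days).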

---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name), for rc- prefixed indices.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: rc-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete indices older than 2 days (based on index name), for .marvel prefixed indices.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: .marvel
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 2
      exclude:

Configuration is done.
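
To make the two filters in action 2 more concrete, here is a rough Python sketch of the selection they describe: keep names with the given prefix, read the date embedded in the name using the timestring, and select those older than unit_count days. It only illustrates the idea and is not curator's actual implementation, whose cutoff arithmetic may differ in detail. The function name and the sample index names are made up, and the sketch assumes the date is the last part of the index name.

```python
from datetime import datetime, timedelta


def select_old_indices(names, prefix, timestring, unit_count, now):
    """Return names starting with `prefix` whose trailing date is more than
    `unit_count` days before `now`."""
    cutoff = now - timedelta(days=unit_count)
    # Assumption: the date is a fixed-width suffix of the index name.
    date_len = len(now.strftime(timestring))
    selected = []
    for name in names:
        if not name.startswith(prefix):
            continue  # "pattern" filter: wrong prefix
        try:
            stamp = datetime.strptime(name[-date_len:], timestring)
        except ValueError:
            continue  # no parseable date in the name, leave it alone
        if stamp < cutoff:  # "age" filter, direction: older
            selected.append(name)
    return selected


if __name__ == "__main__":
    sample_names = [
        ".marvel-2017.12.08",
        ".marvel-2017.12.09",
        ".marvel-2017.12.10",
        ".marvel-2017.12.11",
        ".marvel-2017.12.12",
        ".marvel-kibana",
        "rc-2017.12.01",
    ]
    run_time = datetime(2017, 12, 12, 2, 0)  # a 2:00 AM run, as in the cron job
    for name in select_old_indices(sample_names, ".marvel", "%Y.%m.%d", 2, run_time):
        print(name)
```

Because the date parsed from a name corresponds to midnight, a run at 2:00 AM with unit_count 2 treats the index from exactly two days earlier as already older than the cutoff.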

Run the command

curator --config config_file.yml [--dry-run] delete_marvel_indices.yml

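Notes:

  1. --dry-run is optional. With it, nothing is actually deleted; curator only runs through the logic, and you can check the log to judge whether the result is correct.
    Once it looks right, remove --dry-run and run the command again to perform the real deletion.
  2. If the logfile option is not set in config_file.yml, the log is printed to the console.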
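
The dry-run-first workflow from note 1 can be wrapped so that the safe mode is the default. The sketch below is a hypothetical helper, not part of curator. It only assembles the same command line shown above, using the --config and --dry-run options listed in the help output, and runs it with the standard subprocess module.

```python
import subprocess


def build_curator_command(config_path, action_file, dry_run=True):
    """Assemble the curator command line; dry-run is on unless disabled."""
    cmd = ["curator", "--config", config_path]
    if dry_run:
        cmd.append("--dry-run")
    cmd.append(action_file)
    return cmd


def run_curator(config_path, action_file, dry_run=True):
    """Run curator and return its exit code (curator must be on PATH)."""
    return subprocess.call(build_curator_command(config_path, action_file, dry_run))


if __name__ == "__main__":
    # Show the command that would be run, without executing it.
    print(" ".join(build_curator_command("config_file.yml", "delete_marvel_indices.yml")))
```

Calling run_curator(..., dry_run=False) only after inspecting the log from a dry run mirrors the two-step procedure described in the notes.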

Setting up the daily job

Clearly we want to automate this so that it runs every day, so write a script and let crontab call it daily:

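#!/bin/bash

# Run the curator actions and report success only if curator exits cleanly.
if curator --config /opt/curator/config_file.yml /opt/curator/delete_marvel_indices.yml; then
    echo "delete success"
else
    echo "delete failed" >&2
    exit 1
fi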

Configure crontab

# Run the delete script every day at 2:00 AM
0 2 * * * source /etc/profile;bash /opt/curator/delete_marvel_daily.sh > /opt/curator/delete.log 2>&1