Elasticsearch: Using Curator alongside Marvel to automatically delete old Marvel indices

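Marvel is standard equipment for almost every Elasticsearch user. The price of keeping its monitoring data is that,
by default, it creates a new index every day, named along these lines: .marvel-2017-12-10.
Marvel's own indices can produce roughly 500 MB of data per day, and as they pile up they take more and more disk space.
Is there a way to make them expire automatically, for example keeping only the last two days of monitoring data and discarding the rest?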

Of course there is: Curator can do this for you.

What is Curator?

It is a command-line tool that helps you manage your Elasticsearch indices: it can delete, close,
and open them. That is only a partial description; the full documentation is here:
https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html

Hands-on

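The Elasticsearch version installed on our cluster is 2.1.1.
Following the official site, I first installed the latest Curator 5.x release, which reported a version mismatch.
Following the blog post at http://blog.csdn.net/hereiskxm/article/details/47423715 I then installed version 3.3.0.
That did not work either.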

After some searching, it seemed I needed a version in between, so I installed 4.0.0:

pip install elasticsearch-curator==4.0.0

Then I looked at the options this version provides:

curator --help
Usage: curator [OPTIONS] ACTION_FILE

  Curator for Elasticsearch indices.

  See http://elastic.co/guide/en/elasticsearch/client/curator/current

Options:
  --config PATH  Path to configuration file. Default: ~/.curator/curator.yml
  --dry-run      Do not perform any changes.
  --version      Show the version and exit.
  --help         Show this message and exit.

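This looks the same as the latest 5.x version I had installed earlier. I then came across a description of how to configure it on this page: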
https://stackoverflow.com/questions/33430055/removing-old-indices-in-elasticsearch/42268400#42268400

The 3.3.0 version described in the earlier blog post is not compatible with this style of configuration.

Usage

The goal is to delete indices whose names start with .marvel and that are more than 2 days old.

Create the directory /opt/curator:

~ pwd
/opt/curator
~ ll
total 12
-rw-r--r-- 1 root root 184 Dec 12 10:48 config_file.yml
-rw-r--r-- 1 root root 1311 Dec 12 10:37 delete_marvel_indices.yml
drwxr-xr-x 2 root root 4096 Dec 12 10:49 logs

config_file.yml

# Note: the logfile (at minimum its parent directory) must exist beforehand, otherwise Curator reports an error at startup.
vim config_file.yml
---
client:
  hosts:
    - 10.10.25.217
  port: 9200
logging:
  loglevel: INFO
  logfile: "/opt/curator/logs/actions.log"
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
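
Since the log location has to exist before Curator starts, it can help to create it from a small setup step. The following is only a minimal Python 3 sketch of that idea; the helper name ensure_log_dir is made up for this illustration and is not part of Curator, and the demo deliberately works in a temporary directory rather than in /opt/curator.

```python
import os
import tempfile


def ensure_log_dir(logfile_path):
    """Create the parent directory of logfile_path if it is missing.

    Hypothetical helper for illustration; returns the directory path.
    """
    log_dir = os.path.dirname(os.path.abspath(logfile_path))
    os.makedirs(log_dir, exist_ok=True)  # no error if it already exists
    return log_dir


if __name__ == "__main__":
    # Demo against a temporary directory so the sketch does not touch
    # a real installation; in practice the argument would be the
    # logfile value from config_file.yml.
    with tempfile.TemporaryDirectory() as tmp:
        demo_logfile = os.path.join(tmp, "logs", "actions.log")
        print(ensure_log_dir(demo_logfile))
```

Calling the helper twice is harmless, so it can sit at the top of any wrapper script that later invokes curator.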

delete_marvel_indices.yml

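Action 2 deletes indices that carry the .marvel prefix and are more than 2 days old; action 1 does the same for rc- prefixed indices older than 30 days. Both age filters read the date from the index name, so timestring has to match the date format your index names actually use.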

---
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True. If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name), for rc- prefixed indices.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: rc-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete indices older than 2 days (based on index name), for .marvel prefixed indices.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: .marvel
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 2
      exclude:
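
To make the interplay of the two filters easier to picture, here is a simplified Python 3 sketch of what a prefix pattern filter followed by a name-based age filter selects. It is a re-implementation written for this post under my own assumptions, not Curator's code: the function names and the sample index names are hypothetical, and only the %Y, %m and %d codes are handled.

```python
import re
from datetime import datetime, timedelta

# Only the strftime codes that appear in the action file above.
_CODES = {"%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}"}


def timestring_to_regex(timestring):
    """Turn a timestring such as '%Y.%m.%d' into a compiled regex."""
    parts = []
    i = 0
    while i < len(timestring):
        token = timestring[i:i + 2]
        if token in _CODES:
            parts.append(_CODES[token])
            i += 2
        else:
            parts.append(re.escape(timestring[i]))
            i += 1
    return re.compile("".join(parts))


def select_for_deletion(index_names, prefix, timestring, unit_count, now):
    """Return names with the prefix whose embedded date is older than
    unit_count days before `now` (hypothetical helper)."""
    date_re = timestring_to_regex(timestring)
    cutoff = now - timedelta(days=unit_count)
    selected = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # pattern filter: wrong prefix
        match = date_re.search(name)
        if match is None:
            continue  # age filter: no date in the expected format
        index_date = datetime.strptime(match.group(0), timestring)
        if index_date < cutoff:
            selected.append(name)
    return selected


if __name__ == "__main__":
    names = [  # made-up index names for the demo
        ".marvel-es-2017.12.08",
        ".marvel-es-2017.12.10",
        ".marvel-es-2017.12.11",
        ".marvel-2017-12-09",
        "rc-2017.12.01",
    ]
    now = datetime(2017, 12, 12, 2, 0)
    print(select_for_deletion(names, ".marvel", "%Y.%m.%d", 2, now))
```

In this sketch the dash-separated name is never selected with timestring '%Y.%m.%d', because no date in that format can be found in it; if your index names use dashes, the timestring would have to be '%Y-%m-%d' instead.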

Configuration is done.

Running the command

curator --config config_file.yml [--dry-run] delete_marvel_indices.yml

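Notes:

  1. --dry-run is an optional flag. With it, nothing is actually deleted; Curator only runs through the logic,
    and you can check the log to judge whether the result is correct.
    Once you have confirmed it is, drop --dry-run and run the command again to perform the real deletion.
  2. If logfile is not set in config_file.yml, the log is printed to the console.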
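
The "dry run first, then the real run" routine can also be wrapped in a few lines of Python 3. This is only a sketch: the two function names are invented for this illustration, and the only curator options used are --config and --dry-run from the help output shown earlier.

```python
import subprocess


def build_curator_command(config_file, action_file, dry_run=True):
    """Assemble the curator command line as an argument list.

    Hypothetical helper; dry_run defaults to True so that nothing is
    deleted unless the caller asks for it explicitly.
    """
    cmd = ["curator", "--config", config_file]
    if dry_run:
        cmd.append("--dry-run")
    cmd.append(action_file)
    return cmd


def run_curator(config_file, action_file, dry_run=True):
    """Run curator and return its exit status (0 means success)."""
    return subprocess.call(build_curator_command(config_file, action_file, dry_run))


if __name__ == "__main__":
    # Only print the command here; call run_curator(...) yourself once
    # the dry-run log looks right.
    print(" ".join(build_curator_command("config_file.yml", "delete_marvel_indices.yml")))
```

Defaulting to a dry run is a deliberate choice in this sketch: a forgotten argument then results in a harmless log entry instead of deleted indices.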

Setting up the daily job

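Obviously we want to automate this so that it runs every day. Write a script (saved as /opt/curator/delete_marvel_daily.sh, the path used in the crontab entry) and let crontab call it daily: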

#!/bin/bash
# Print the success message only if curator exits with status 0.
curator --config /opt/curator/config_file.yml /opt/curator/delete_marvel_indices.yml \
    && echo "delete success"

Configuring crontab

# Run the delete script every day at 02:00
0 2 * * * source /etc/profile;bash /opt/curator/delete_marvel_daily.sh > /opt/curator/delete.log 2>&1