ELK Setup


Installing ELK with yum

# vim /etc/yum.repos.d/elk.repo
    [elasticsearch-7.x]

    name=Elasticsearch repository for 7.x packages
    baseurl=https://artifacts.elastic.co/packages/7.x/yum
    gpgcheck=1
    gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
    enabled=1
    autorefresh=1
    type=rpm-md

# yum install elasticsearch
# yum install kibana
# yum install logstash

# vim /etc/elasticsearch/elasticsearch.yml
    network.host: 0.0.0.0
    cluster.initial_master_nodes: ["node-1"]
# systemctl enable elasticsearch
# systemctl start elasticsearch                   \\ ports 9200 and 9300 should now be listening


Collecting nginx logs with ELK                            \\ Environment: CentOS 7.8, ELK 7.9.2, nginx 1.18.0


Installing Elasticsearch from the binary tarball          \\ Download page: https://elasticsearch.cn/download/

    # wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz
    # tar zxf elasticsearch-7.9.2-linux-x86_64.tar.gz -C /usr/local/
    # cd /usr/local/
    # mv elasticsearch-7.9.2/ elasticsearch
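    # vim /usr/local/elasticsearch/config/elasticsearch.yml    \\ main configuration file
        network.host: 192.168.10.12
        cluster.initial_master_nodes: ["node-1"]             \\ fixes error 3; each entry must match a node's node.name
    # vim /etc/security/limits.conf     \\ Linux resource-limit configuration; fixes error 1
        * soft nofile 65536              \\ * = all accounts; soft = the open-file limit currently in effect
        * hard nofile 131072              \\ hard = the ceiling the soft open-file limit may be raised to
        * soft nproc 2048                  \\ soft limit on the number of processes
        * hard nproc 4096                   \\ hard limit on the number of processes
        # End of file                        \\ add the lines above before this marker
    # vim /etc/sysctl.conf              \\ kernel parameters; fixes error 2
        vm.max_map_count=655360
        fs.file-max=655360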
    # sysctl -p

    # groupadd elk
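    # useradd -g elk elk                         \\ create a dedicated user; Elasticsearch refuses to start as root
    # chown -R elk:elk /usr/local/elasticsearch
    # su elk
    # /usr/local/elasticsearch/bin/elasticsearch    \\ start Elasticsearch; make sure the firewall and SELinux allow it
    # ss -tnl                                        \\ 9200 and 9300 should be listening; run this in a new terminal
    http://192.168.10.12:9200                         \\ open in a browser, or query it with curl
    # curl 127.0.0.1:9200                              \\ from another session; this output means Elasticsearch started successfully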
        {
          "name" : "localhost.localdomain",              \\ node name
          "cluster_name" : "elasticsearch",               \\ cluster name
          "cluster_uuid" : "34Su3E64SOq2J6w0sFs29w",       \\ unique cluster identifier
          "version" : {
            "number" : "7.9.2",                              \\ Elasticsearch version
            "build_flavor" : "default",
            "build_type" : "tar",
            "build_hash" : "d34da0ea4a966c4e49417f2da2f244e3e97b4e6e",
            "build_date" : "2020-09-23T00:45:33.626720Z",
            "build_snapshot" : false,
            "lucene_version" : "8.6.2",
            "minimum_wire_compatibility_version" : "6.8.0",
            "minimum_index_compatibility_version" : "6.0.0-beta1"
          },
          "tagline" : "You Know, for Search"
        }

    Note: errors you may hit at startup
        1. max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
        2. max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
        3. the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
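
Errors 1 and 2 can be checked before starting Elasticsearch. The following is a minimal sketch, assuming a Linux host; the thresholds are copied from the two error messages above, and the function names check_limits and read_current_limits are made up for this illustration.

```python
# Thresholds taken from the bootstrap error messages quoted above.
MIN_NOFILE = 65535
MIN_MAX_MAP_COUNT = 262144


def check_limits(nofile_soft, max_map_count):
    """Return a list of human-readable problems; an empty list means both checks pass."""
    problems = []
    if nofile_soft < MIN_NOFILE:
        problems.append(
            f"max file descriptors [{nofile_soft}] is below [{MIN_NOFILE}]"
        )
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(
            f"vm.max_map_count [{max_map_count}] is below [{MIN_MAX_MAP_COUNT}]"
        )
    return problems


def read_current_limits():
    """Read the values for the current user and kernel (Linux only)."""
    import resource  # Unix-only standard-library module

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        soft = float("inf")  # unlimited always satisfies the check
    with open("/proc/sys/vm/max_map_count") as fh:
        max_map_count = int(fh.read().strip())
    return soft, max_map_count
```

Run it as the elk user, since limits.conf applies per login session: check_limits(*read_current_limits()) should return an empty list once limits.conf and sysctl.conf have taken effect.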


Installing Kibana              \\ Download page: https://www.elastic.co/cn/downloads/kibana

    # wget https://artifacts.elastic.co/downloads/kibana/kibana-7.9.2-linux-x86_64.tar.gz \\ version must match Elasticsearch
    # tar zxf kibana-7.9.2-linux-x86_64.tar.gz -C /usr/local/
    # cd /usr/local/
    # mv kibana-7.9.2-linux-x86_64/ kibana
    # vim /usr/local/kibana/config/kibana.yml
        server.port: 5601                                     \\ listening port
        server.host: "192.168.10.12"
        elasticsearch.hosts: ["http://192.168.10.12:9200"]      \\ Elasticsearch address
        kibana.index: ".kibana"                                  \\ index Kibana stores its own data in
        i18n.locale: "zh-CN"                                      \\ switch the Kibana UI to Chinese
    # chown -R elk:elk /usr/local/kibana
    # su elk
    # /usr/local/kibana/bin/kibana                                  \\ start Kibana; make sure the firewall and SELinux allow it
    # ss -tnl                                                        \\ 5601 should be listening; run this in a new terminal
    http://192.168.10.12:5601/    --> Dev Tools                       \\ open in a browser


Installing Logstash                  \\ Download page: https://www.elastic.co/cn/downloads/logstash

    # wget https://artifacts.elastic.co/downloads/logstash/logstash-7.9.2.tar.gz       \\ version must match Elasticsearch
    # tar zxf logstash-7.9.2.tar.gz -C /usr/local/
    # cd /usr/local/
    # mv logstash-7.9.2/ logstash
    # vim /usr/local/logstash/nginx_access.conf              \\ custom pipeline config; parsed events are emitted as structured (JSON-like) documents
        input {
            file {
                path => "/usr/local/nginx/logs/access.log"
                start_position => "beginning"
            }
        }
        filter {
            grok {
                match => { "message" => "%{COMBINEDAPACHELOG} %{QS:x_forwarded_for}"}
            }
            date {
                match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
            }
            geoip {
                source => "clientip"
            }
        }
        output {
            elasticsearch {
                hosts => "192.168.10.12:9200"
            }
            stdout { codec => rubydebug }
        }
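    # mkdir /usr/local/logstash/patterns           \\ directory for custom grok patterns
    # vim /usr/local/logstash/patterns/nginx        \\ custom grok patterns; only loaded if the grok filter sets patterns_dir
        WZ ([^ ]*)
        NGINXACCESS %{IP:remote_ip} \- \- \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{WZ:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:status} %{NUMBER:bytes} %{QS:referer} %{QS:agent} %{QS:xforward}
    # /usr/local/logstash/bin/logstash -f /usr/local/logstash/nginx_access.conf
    # ss -tnl                                        \\ 9600 should be listening; run this in a new terminal

    Note: nginx must be installed. The default nginx log format works, but it has no trailing quoted X-Forwarded-For field,
          so drop " %{QS:x_forwarded_for}" from the grok pattern unless your log_format appends it.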
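
To see what the grok and date filters are doing, here is a rough Python equivalent for one line in nginx's default combined format. It is a sketch, not the grok implementation: the regular expression is a hand-written approximation of COMBINEDAPACHELOG, the group names follow the fields that pattern is expected to emit (clientip and timestamp are the two the pipeline above relies on), and the sample log line is invented.

```python
import re
from datetime import datetime

# Hand-written approximation of the combined log format; not the real grok pattern.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the named fields of one access-log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def parse_timestamp(value):
    """Python counterpart of the date filter pattern dd/MMM/yyyy:HH:mm:ss Z."""
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")


# Invented sample line in the default nginx "combined" format.
SAMPLE = (
    '192.168.10.1 - - [01/Oct/2020:12:00:00 +0800] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.29.0"'
)
```

parse_line(SAMPLE) yields a dict whose clientip is what the geoip filter looks up, and parse_timestamp turns the timestamp field into the value that ends up in @timestamp. A line that does not match returns None, which corresponds to a _grokparsefailure tag in Logstash.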

    Viewing statistics in Kibana
          -->   Dev Tools

    http://192.168.10.12:5601/
        Stack Management --> Index Patterns --> Create index pattern --> search for lo* --> logstash-2020.10*
        --> Next step --> @timestamp --> Create index pattern

        Dev Tools
            GET logstash-2020.10.01-000001/_search                         \\ view the collected nginx logs
        Visualize --> Create visualization --> line --> logstash-2020.10*   \\ line charts, pie charts, and so on
            Metrics --> Y-axis --> Count                                     \\ Y axis: number of requests
            Buckets --> Add --> X-axis --> Date Histogram --> Update          \\ X axis: time


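Building an Elasticsearch cluster

    # vim /usr/local/elasticsearch/config/elasticsearch.yml
        cluster.name: my-elk                   \\ cluster name; must be identical on both machines
        node.name: node-1                       \\ node name; must differ between the machines
        path.data: /usr/local/elasticsearch/data   \\ data directory
        path.logs: /usr/local/elasticsearch/logs    \\ log directory
        node.master: true                          \\ master-eligible node
        node.data: true                             \\ data node; a node can be master-eligible and hold data at the same time
        http.port: 9200                              \\ HTTP (REST) port
        transport.tcp.port: 9300                      \\ port for node-to-node communication (transport.port is the 7.x name)
        discovery.zen.minimum_master_nodes: 2          \\ split-brain guard, half the master-eligible nodes + 1; ignored by 7.x, which manages the quorum itself
        discovery.seed_hosts: ["192.168.10.14"]         \\ addresses of the other nodes; list several separated by commas
    # /usr/local/elasticsearch/bin/elasticsearch         \\ if the nodes cannot find each other, delete each node's data and logs directories (this discards indexed data)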
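
The "half + 1" rule behind minimum_master_nodes is plain integer arithmetic. A minimal sketch (the function name quorum is made up for this note):

```python
def quorum(master_eligible_nodes):
    """Smallest number of master-eligible nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1
```

For the two-node setup above quorum(2) is 2, so losing either machine leaves the cluster unable to elect a master. Three master-eligible nodes also give a quorum of 2, which is why three is the usual minimum for tolerating one failure.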

    Installing cerebro                                     \\ monitoring and management tool for Elasticsearch clusters
        # wget https://github.com/lmenezes/cerebro/releases/download/v0.9.2/cerebro-0.9.2.tgz
        # tar zxf cerebro-0.9.2.tgz -C /usr/local/
        # cd /usr/local
        # mv cerebro-0.9.2/ cerebro
        # /usr/local/cerebro/bin/cerebro                         \\ start cerebro; port 9000 should be listening
        http://192.168.10.12:9000/                                \\ open in a browser
            http://192.168.10.12:9200                              \\ enter the Elasticsearch address


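    Using cerebro
        more --> create index
            name   my                     \\ index "my": 2 primary shards, 1 replica per primary
            number of shards    2         \\ primary shards: hold the data and spread the load
            number of replicas  1         \\ replica shards: copies of the primaries, for high availability

            name   my1                       \\ index "my1": 3 primary shards, 1 replica per primary
            number of shards    3             \\ shards are distributed across the nodes automatically
            number of replicas  1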
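
The shard totals cerebro shows follow directly from these two settings. A small sketch of the arithmetic, with function names invented for this note; it relies on the fact that Elasticsearch never places two copies of the same shard on one node:

```python
def total_shard_copies(primaries, replicas):
    """Primaries plus all their replicas for one index."""
    return primaries * (1 + replicas)


def unassigned_copies(primaries, replicas, nodes):
    """Copies that cannot be allocated because each copy of a shard needs its own node."""
    copies_per_shard = 1 + replicas
    return primaries * max(0, copies_per_shard - nodes)
```

With the values above, "my" has total_shard_copies(2, 1) = 4 shard copies and "my1" has total_shard_copies(3, 1) = 6. On a single node, unassigned_copies(2, 1, 1) = 2, meaning the two replicas of "my" stay unassigned until a second node joins.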


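        Overview screen
            my-elk    cluster name
            nodes     number of Elasticsearch nodes
            indices   number of indices (2 here, each shard having one replica)
            shards    total number of shards
            docs      number of documents
            KB        disk space used

            Status colors
                green   all primary and replica shards are allocated
                yellow  all primary shards are allocated, but some replicas are not
                red     at least one primary shard is not allocated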
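
The three colors reduce to a two-step decision. A minimal sketch of that rule (cluster_status is a name chosen for this note, not an Elasticsearch or cerebro function):

```python
def cluster_status(unassigned_primaries, unassigned_replicas):
    """Map counts of unassigned shards to the health color described above."""
    if unassigned_primaries > 0:
        return "red"      # some data is unavailable
    if unassigned_replicas > 0:
        return "yellow"   # all data is available, but redundancy is reduced
    return "green"
```

A single-node cluster holding an index with one replica therefore reports yellow: every primary is up, but the replica has no second node to live on.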


------------------------------------------------------------------------------------------------------------------------


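Logstash pipeline configuration format
    input plugins
        stdin : reads from standard input, either piped or typed interactively; it can be started in two ways
            1 # /usr/local/logstash/bin/logstash -f /usr/local/logstash/config/text.conf
                lucy111
            2 # echo "lucy111" | /usr/local/logstash/bin/logstash -f /usr/local/logstash/config/text.conf
            codec: type codec; how the input data is decoded
            type: type string; a free-form label for the event, usable in later conditionals
            tags: type array; custom tags for the event, usable in later conditionals
            add_field: type hash; adds fields to the event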

        file : reads events from files, such as ordinary log files
            # vim /usr/local/logstash/config/text2.conf
                input {
                    file {
                        path => ["/usr/local/nginx/logs/access.log"]
                        start_position => "beginning"
                        type => "nginx"
                    }
                }
                output {
                    stdout {
                        codec => "rubydebug"
                    }
                }
            # /usr/local/logstash/bin/logstash -f /usr/local/logstash/config/text2.conf

                path => ["/var/log/*/*.log","/var/log/messages"]        files to read; may list several
                exclude => "*.gz"                                       files to skip
                sincedb_path => "var/log/message"                       where the sincedb (read-position) file is stored   .......
                start_position => "beginning"                           read from the start of the file, or "end"
                stat_interval => 1000                                   seconds between checks for file changes; default 1

        Reading from Elasticsearch
            # vim /usr/local/logstash/config/text3.conf
                input {
                    elasticsearch {
                        hosts => "192.168.14.10"
                        index => "teo4"
                        query => '{ "query": { "match_all": {} }}'
                    }
                }

                output {
                    stdout {
                        codec => "rubydebug"
                    }
                }

            # ./bin/logstash -f ./config/text3.conf


Using Kibana
    http://192.168.10.12:9200/_cat               \\ lists the available _cat endpoints of Elasticsearch
    http://192.168.10.12:9200/_cat/health         \\ cluster health

    Creating an index by inserting documents
        PUT /info1/doc/1                  \\ PUT inserts a document; POST can insert as well
        {                                  \\ info1: index name  doc: type (deprecated in 7.x, _doc is preferred)  1: id; POST without an id auto-generates one
            "name":"lucy",                  \\ name: field name, followed by its value
            "first_name":"xiaol",
            "last_name":"ming",
            "age":22,
            "job":"java",
            "about":"i love to go rock climbing",
            "interests":["sports","music"]
        }
        PUT /info2/doc/1
        {
            "name":"king",
            "first_name":"xiaok",
            "last_name":"empty",
            "age":18,
            "job":"php",
            "about":"i like to collect rock albums",
            "interests":["music"]
        }

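    Bulk insert                            \\ e.g. rows from MySQL or another database can be converted with Python into this format and inserted
        POST info3/doc/_bulk
        {"index":{"_id":1}}                 \\ inserts the document with id 1; each action and each document must stay on a single line
        {"username":"lucy3","age":"19"}
        {"index":{"_id":2}}
        {"username":"king3","age":"26"}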
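
The conversion mentioned above (database rows to bulk format) can be sketched in a few lines of Python. This only builds the request body; the function name build_bulk_body and the id_field parameter are made up for this note, and the rows are assumed to arrive as dicts, for example from a database cursor.

```python
import json


def build_bulk_body(rows, id_field="id"):
    """Turn a list of dict rows into the newline-delimited body of a _bulk request."""
    lines = []
    for row in rows:
        doc = dict(row)              # copy so the caller's row is not modified
        doc_id = doc.pop(id_field)   # the id goes into the action line, not the document
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows matching the example above.
ROWS = [
    {"id": 1, "username": "lucy3", "age": "19"},
    {"id": 2, "username": "king3", "age": "26"},
]
```

build_bulk_body(ROWS) produces the same four lines as the console example; json.dumps never emits a line break inside an object, which satisfies the one-line-per-entry rule.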

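    Updating / overwriting a document
        POST /info2/doc/1             \\ replaces the whole of document 1; only age remains
        {
            "age":55
        }

    Deleting a document
        DELETE info2/doc/2

    Searching indices
        GET _search                      \\ search the documents of all indices
        GET info1/_search                 \\ search one index
        GET info1,info2/_search            \\ search several indices
        GET info*/_search                   \\ * wildcard
        GET info1/doc/1                      \\ fetch the whole document with id 1
            {
              "_index" : "info",                    \\ index name
              "_type" : "doc",                       \\ type
              "_id" : "1",                            \\ id
              "_version" : 2,                          \\ version; increases with every modification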
              "_seq_no" : 1,
              "_primary_term" : 1,
              "found" : true,
              "_source" : {
                "name" : "lucy",
                "first_name" : "xiaol",
                "last_name" : "ming",
                "age" : 22,
                "job" : "java",
                "about" : "i love to go rock climbing",
                "interests" : [
                  "sports",
                  "music"
                ]
              }
            }
        GET info1/doc/1?_source=first_name    \\ fetch only the first_name field of document 1
        GET info1/doc/_search                  \\ _search returns every document of type doc in the index
            {
              "took" : 1,                     \\ time taken, in milliseconds
              "timed_out" : false,
              "_shards" : {                     \\ shards involved in the search
                "total" : 1,                     \\ 1 shard in total
                "successful" : 1,                 \\ 1 succeeded
                "skipped" : 0,                     \\ 0 skipped
                "failed" : 0                        \\ 0 failed
              },
              "hits" : {                              \\ search results
                "total" : {
                  "value" : 2,
                  "relation" : "eq"
                },
                "max_score" : 1.0,            \\ highest relevance score among the hits
                "hits" : [
                  {
                    "_index" : "info",
                    "_type" : "doc",
                    "_id" : "1",
                    "_score" : 1.0,
                    "_source" : {
                      "name" : "lucy",
                      "first_name" : "xiaol",
                      "last_name" : "ming",
                      "age" : 22,
                      "job" : "java",
                      "about" : "i love to go rock climbing",
                      "interests" : [
                        "sports",
                        "music"
                      ]
                    }
                  },
                  {
                    "_index" : "info",
                    "_type" : "doc",
                    "_id" : "2",
                    "_score" : 1.0,
                    "_source" : {
                      "name" : "hank",
                      "first_name" : "xiaok",
                      "last_name" : "king",
                      "age" : 18,
                      "job" : "php",
                      "about" : "i like to collect rock albums",
                      "interests" : [
                        "music"
                      ]
                    }
                  }
                ]
              }
            }
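        GET info1/doc/_search?q=lucy            \\ ?q=<keyword>: search for documents containing lucy
        GET info1/doc/_search?q=lucy&df=username&sort=age:asc&from=4&size=10
            ?:  starts the query parameters
            q:  the search keyword(s)
            df: default field to search; if omitted, Elasticsearch searches all fields
            sort: ordering, asc ascending or desc descending
                sort=age:asc sorts by the age field in ascending order
            from,size: pagination
                from=0 start at the first hit (offset 0)
                size=10 return 10 hits
            timeout: caps how long the search may run, so a large query cannot block indefinitely; no timeout by default
                timeout=1s
            &: separates parameters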
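
When these URLs are built by a script rather than typed into Dev Tools, the parameters should be URL-encoded. A minimal sketch using only the standard library; uri_search_url is a helper name invented for this note, and the parameters are passed as a dict because from is a reserved word in Python.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def uri_search_url(base, index, params):
    """Build a URI-search URL such as <base>/<index>/_search?q=...&from=...&size=..."""
    return f"{base}/{index}/_search?{urlencode(params)}"


# The parameters from the example above.
EXAMPLE_URL = uri_search_url(
    "http://192.168.10.12:9200",
    "info1",
    {"q": "lucy", "df": "username", "sort": "age:asc", "from": 4, "size": 10},
)
```

urlencode escapes characters such as the colon in age:asc and spaces in multi-word queries, so the resulting URL can be handed to curl or urllib.request as-is.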
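        GET info1/doc/_search?q=username:lucy king            \\ term query: username contains lucy, OR the default field contains king
        GET teo4/doc/_search?q=username:"collect rock"         \\ phrase query: the words must appear together, in this order
        GET teo4/doc/_search?q=username:(lucy OR king) AND lili \\ username contains lucy or king, and the default field contains lili
            "profile":true                                       \\ shows how the query was executed
        GET teo4/doc/_search?q=username:(luc NOT kin)             \\ contains luc but not kin
        GET teo4/doc/_search?q=username:(+luc -kin)                \\ must contain luc, must not contain kin
        GET teo4/doc/_search?q=age:[18 TO 22]
            [ ] closed (inclusive) bounds
            { } open (exclusive) bounds
            [1 TO 10]                 \\ 1 <= age <= 10
            [1 TO *]                   \\ 1 <= age
            [* TO 10]                   \\ age <= 10
        Wildcard queries
            ?: exactly one character
            *: zero or more characters
        Regular expressions
            /   /        must be wrapped in / /
        GET info1/doc/_search?q=username:lucy~1         \\ matches terms within one edit of lucy, e.g. lucya luc lucy2 lucb
            ~1              \\ fuzzy matching; the edit distance can only be 0, 1, or 2
            ~2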
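
The number after ~ is an edit distance. The sketch below computes the plain Levenshtein distance (insertions, deletions, substitutions) to show why the four example terms match lucy~1. It is a simplification: Elasticsearch's fuzzy matching also counts swapping two adjacent characters as a single edit, which this function does not model.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,                       # delete char_a
                    current[j - 1] + 1,                    # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute or keep
                )
            )
        previous = current
    return previous[-1]
```

Each of lucya, luc, lucy2 and lucb is one edit away from lucy, so ~1 finds them, while a term such as king is four edits away and is out of reach even with ~2.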
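        POST /teo4/doc/_mget                               \\ fetch several documents at once
            {
                "ids":["1","2"]                              \\ the documents with id 1 and 2
            }

    Querying from a browser with the URI Search API            \\ the query goes at the end of the request path
        http://192.168.10.12:9200/lucy1/doc/1                   \\ fetch one document
        http://192.168.10.12:9200/lucy1/doc/_search              \\ fetch all documents of one type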

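    Analyzers, tried out from Kibana
        POST _analyze                     \\ calls the analyze API
        {
            "analyzer":"standard",          \\ which analyzer to use: standard
            "text":"we are a family"         \\ the text to analyze
        }

        General-purpose analyzers
            standard: the default; multilingual, splits on word boundaries and lowercases
            simple: splits on non-letters and lowercases
            whitespace: splits on whitespace
            stop: like simple, and also removes stop words such as the, an
            keyword: no tokenization; the whole input is one token
            pattern: splits by regular expression; default \W+, i.e. non-word characters are the separators
            language: analyzers for 30+ common languages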
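
The difference between the simpler analyzers is easiest to see side by side. The functions below are rough Python imitations written for this note, not the Lucene implementations; in particular the simple imitation only treats ASCII letters as letters.

```python
import re


def whitespace_analyzer(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def simple_analyzer(text):
    """Split on anything that is not an ASCII letter, then lowercase."""
    return [token.lower() for token in re.split(r"[^A-Za-z]+", text) if token]


def keyword_analyzer(text):
    """No tokenization: the whole input is a single token."""
    return [text]
```

For the input "We are a Family-2020", the whitespace version keeps Family-2020 as one token with its capital letter, the simple version reduces it to family and drops the digits, and the keyword version returns the whole string unchanged.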
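        Chinese analyzers
            ik: splits Chinese and English words; supports custom dictionaries https://github.com/medcl/elasticsearch-analysis-ik
                ik_smart: coarsest split; faster and more efficient, recommended for internal use
                ik_max_word: finest-grained split; more tokens and more complete search results, at the cost of more space and slower retrieval; recommended for public-facing search
            jieba: popular Python segmenter; supports segmentation and part-of-speech tagging, traditional Chinese, custom dictionaries, and parallel segmentation https://github.com/singlee/elasticsearch-jieba-plugin
            Hanlp: Java toolkit made up of a set of models and algorithms, aimed at bringing NLP into production use https://github.com/hankcs/hanlp
            THULAC: Chinese lexical analysis toolkit from Tsinghua University; segmentation and part-of-speech tagging https://github.com/microbun/elasticsearch-thulac-plugin

        ik analyzer    \\ version must match Elasticsearch; download page https://github.com/medcl/elasticsearch-analysis-ik/releases

        # cd /data/
        # unzip -o -d /usr/local/elasticsearch/plugins/analysis-ik/ elasticsearch-analysis-ik-7.9.2.zip
        # cd /usr/local/elasticsearch/plugins
        Restart Elasticsearch
        Restart Kibana
        http://192.168.10.12:5601/    --> Dev Tools             \\ open in a browser
            POST _analyze
            {
              "analyzer": "ik_smart",                              \\ use the ik_smart analyzer
              "text":"php是世界上最好的语言"
            }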

        Custom analyzer
            PUT my_analyzer                     \\ create the index
            {
                "settings":{                      \\ index settings
                    "analysis":{                   \\ analysis configuration
                        "analyzer":{
                            "my":{                                   \\ analyzer name, chosen freely
                                "tokenizer":"punctuation",            \\ step 2: the tokenizer defined below
                                "type":"custom",                       \\ custom analyzer
                                "char_filter":["emoticons"],            \\ step 1: character filtering
                                "filter":["lowercase","english_stop"]    \\ step 3: token filters applied after tokenizing
                            }
                        },
                        "tokenizer":{                   \\ custom tokenizer definitions
                            "punctuation":{              \\ parameters of the punctuation tokenizer
                                "type":"pattern",
                                "pattern":"[.,!?]"         \\ split on . , ! ?   ([a] would split on a, [0-9] on digits)
                            }
                        },
                        "char_filter":{                      \\ character filter definitions
                            "emoticons":{
                                "type":"mapping",              \\ mapping character filter
                                "mappings":[                    \\ the replacements
                                    ":)=>_happy_",               \\ :) becomes _happy_
                                    ":(=>_sad_"
                                ]
                            }
                        },
                        "filter":{                            \\ token filter definitions
                            "english_stop":{
                                "type":"stop",                  \\ stop-word filter
                                "stopwords":"_english_"          \\ use the built-in English stop-word list
                            }
                        }
                    }                                              \\ end of the analysis configuration
                }
            }

            POST my_analyzer/_analyze
            {
                "analyzer":"my",                                      \\ use the custom analyzer my
                "text":"I'm a :) person, and you?"
            }
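
The three stages of this analyzer (character filter, tokenizer, token filters) can be imitated in Python to make the order of operations concrete. This is an approximation written for this note, not what Elasticsearch runs: the stop-word set is a small illustrative subset of _english_, and the sample sentence is an assumed input.

```python
import re

CHAR_MAPPINGS = {":)": "_happy_", ":(": "_sad_"}        # step 1: emoticons char filter
SPLIT_PATTERN = re.compile(r"[.,!?]")                     # step 2: punctuation tokenizer
STOP_WORDS = {"a", "an", "and", "the", "is", "are"}       # step 3: illustrative subset only


def analyze(text):
    """Apply char filter, then tokenizer, then lowercase and stop-word filters."""
    for source, target in CHAR_MAPPINGS.items():
        text = text.replace(source, target)
    tokens = [token for token in SPLIT_PATTERN.split(text) if token]
    tokens = [token.lower() for token in tokens]
    return [token for token in tokens if token not in STOP_WORDS]
```

Because the tokenizer splits only on punctuation, analyze("I'm a :) person, and you?") gives two long tokens, "i'm a _happy_ person" and " and you" (leading space included), and the stop filter removes nothing since it compares whole tokens. Splitting on whitespace as well would be needed before a stop-word list has any visible effect.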


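ELK is short for Elasticsearch, Logstash, and Kibana. These three are the core components, but not the whole stack.

Elasticsearch is a real-time full-text search and analytics engine that collects, analyzes, and stores data. It is a scalable
              distributed system that exposes REST and Java APIs for efficient search, built on top of the Apache Lucene search library.

Logstash      is a tool for collecting, parsing, and filtering logs. It handles almost any kind of log, including system logs,
              error logs, and custom application logs. It accepts input from many sources, such as syslog, message queues
              (for example RabbitMQ), and JMX, and it can output data in many ways, including email, websockets, and Elasticsearch.

Kibana        is a web-based graphical interface for searching, analyzing, and visualizing log data stored in Elasticsearch
              indices. It retrieves data through the Elasticsearch REST interface, letting users build custom dashboard views
              of their own data as well as query and filter it in ad hoc ways.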
Teo