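  • Links
  • Preface
  • I. Concepts
    • 1. Main features
    • 2. Architecture
    • 3. Use cases
    • 4. Modules
    • 5. Monitoring and management
  • II. Download
  • III. Installing version 7.6.2 on Linux
    • filebeat.yml configuration reference (do not copy verbatim)
    • Multiline matching configuration
    • Filtering configuration
    • Final configuration (1: multiline matching, reading log files directly, EFK stack)
    • Final configuration (2: multiline matching, reading log files directly, ELFK stack)
  • IV. Combined theory and case study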








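Filebeat is a lightweight log shipper in the Elastic Stack, built to forward and centralize log data. It is designed to read log files efficiently and forward them to Elasticsearch or Logstash, where the logs can then be analyzed and monitored. The sections below cover its main features, architecture, use cases, and configuration examples.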

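1. Main features


2. Architecture



3. Use cases


4. Modules


5. Monitoring and management

Filebeat integrates with other Elastic Stack tools such as Kibana for log visualization and monitoring. In Kibana, users can build dashboards to view the collected log data in real time.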

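type: log  # the input type is log
enabled: true  # enables this log input
paths:  # log files to monitor. Patterns are expanded with Go's glob function; directories are not searched recursively. For example:
  - /var/log/*/*.log  # looks for files ending in ".log" in the direct subdirectories of /var/log, but not in /var/log itself
recursive_glob.enabled:  # enables recursive glob expansion of **, e.g. /foo/** covers /foo, /foo/*, /foo/*/*
exclude_lines: ['^DBG']  # drop lines that match any of these regular expressions
include_lines: ['^ERR', '^WARN']  # export only lines that match any of these regular expressions
harvester_buffer_size: 16384  # buffer size, in bytes, that each harvester uses when reading a file
max_bytes: 10485760  # maximum size, in bytes, of a single log message. Bytes beyond max_bytes are discarded and not sent. Default: 10 MB (10485760)
exclude_files: ['\.gz$']  # regular expressions matching files that Filebeat should ignore
ignore_older: 0  # default 0 (disabled). Accepts durations such as 2h or 2m; files not updated within that span are ignored.
                 # ignore_older must be greater than close_inactive
close_*  # the close_* options close the harvester after a given criterion or time. Closing the harvester closes the file handle.
         # If the file is updated after the harvester is closed, it is picked up again once scan_frequency has elapsed.
         # If the file is moved or deleted while the harvester is closed, Filebeat cannot pick it up again
close_inactive  # when set, closes the file handle if the file has not been read for the specified duration.
                # An internal timestamp tracks reads: the countdown starts when the last log line was read. Use durations such as 2h or 5m
close_renamed  # when enabled, Filebeat closes the harvester if the file is renamed or moved
close_removed  # when enabled, Filebeat closes the harvester when the file is deleted. Use it together with clean_removed
close_eof  # for files that are written only once: Filebeat closes the harvester as soon as the end of the file is reached
close_timeout  # when set, each harvester gets a fixed lifetime and is closed once that time is up, whether or not the file is still being read.
               # close_timeout must not equal ignore_older, otherwise a file updated while the harvester is closed is not read again.
               # The timeout does not apply while the output is stalled and no events can be sent.
               # 0 disables it
clean_inactive  # removes the state of previously harvested files from the registry file after the given period of inactivity
clean_removed  # when enabled, Filebeat removes a file's state from the registry if the file can no longer be found on disk.
               # If close_removed is disabled, clean_removed must be disabled too
scan_frequency  # how often the input (formerly "prospector") checks the configured paths for new files. Default: 10s
backoff:  # how aggressively Filebeat checks open files for updates: the time it waits after reaching EOF before checking the file again. Default: 1s
max_backoff:  # the longest time Filebeat waits after reaching EOF before checking the file again
backoff_factor:  # multiplier applied to the backoff wait after each empty check, until max_backoff is reached. Default: 2
harvester_limit:  # limits how many harvesters one input starts in parallel; directly affects the number of open files
tags  # adds entries to the tags list, useful for filtering, e.g. tags: ["json"]
fields  # optional extra fields added to the output; values can be scalars, arrays, dictionaries, or any nested combination
  app_id: query_engine_12
fields_under_root  # if true, the custom fields are stored at the top level of the output document
multiline.pattern  # the regexp pattern to match
multiline.negate  # whether the pattern match is negated. Default: false.
                  # With pattern '^b' and negate: false, lines that match (start with b) are the continuation lines that get merged;
                  # with negate: true, lines that do not start with b are the ones merged
multiline.match  # how Filebeat combines lines into an event, "after" or "before"; the effect depends on negate above
multiline.max_lines  # maximum number of lines combined into one event; extra lines are discarded. Default: 500
multiline.timeout  # after this timeout Filebeat sends the event even if no new pattern match has started a new event. Default: 5s
max_procs  # maximum number of CPUs that can execute simultaneously. Default: the number of logical CPUs available on the system
name  # a name for this Filebeat instance. Default: the hostname of the machine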
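The interplay of backoff, max_backoff and backoff_factor in the reference above can be worked through numerically. The following is a minimal Python sketch, not Filebeat code: the function name backoff_schedule and its checks parameter are made up for illustration, and the defaults (1s, 10s, factor 2) are the documented Filebeat defaults as far as I know.

```python
def backoff_schedule(backoff=1.0, max_backoff=10.0, backoff_factor=2.0, checks=6):
    """Return the waits (in seconds) between successive checks of an idle file.

    Hypothetical helper for illustration only. Each time the harvester hits
    EOF and finds nothing new, the wait is multiplied by backoff_factor,
    capped at max_backoff.
    """
    waits = []
    wait = backoff
    for _ in range(checks):
        waits.append(wait)
        wait = min(wait * backoff_factor, max_backoff)
    return waits


# With the defaults the wait doubles until it hits the 10s ceiling.
print(backoff_schedule())  # [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
```

Setting backoff_factor to 1 keeps the wait constant at backoff, which trades more frequent file checks for lower latency on rarely updated files.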





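cd /usr/local/filebeat
tar -zxvf filebeat-7.6.2-linux-x86_64.tar.gz  # unpack the tarball; this takes a while
cd /usr/local/filebeat/filebeat-7.6.2-linux-x86_64  # on the author's company server (古博)
cp /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat.yml  /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat.bak.yml  # back up the original
vim /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat.yml  # after the backup, change the path under paths; with a single-node ES nothing else needs to change
# multiline does multi-line matching, e.g. merging an exception stack trace into one event.
# Pitfall: watch the indentation, otherwise the settings do not take effect.
  paths:
    - /java/logs/*/*.log
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
/usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat test config  # validate the configuration file
nohup /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat -e -c filebeat.yml > /dev/null 2>&1 &  disown  # start in the background and discard the output. Keep the trailing disown: it is a shell builtin (not a filebeat flag) and keeps filebeat from exiting when the Xshell session is closed.
#### nohup /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat -e -c filebeat.yml > /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat.txt  &  # variant that keeps the output in a file, useful for troubleshooting
####tail -n 10 -f  /usr/local/filebeat/filebeat-7.6.2-linux-x86_64/filebeat.txt
###  -c is followed by a custom configuration file; -d turns on debug output; 5044 is the default port (the Logstash beats input that Filebeat sends to)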
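The paths entry set in filebeat.yml above, /java/logs/*/*.log, goes exactly one directory level deep. The sketch below illustrates that with Python's glob module as a stand-in for Go's glob function; that the two agree for simple * patterns like this one is an assumption, and the file names are invented for the demonstration.

```python
import glob
import os
import tempfile


def matched_logs(root, pattern):
    """Return the paths under root that match pattern, relative to root, sorted."""
    hits = glob.glob(os.path.join(root, pattern))
    return sorted(os.path.relpath(p, root) for p in hits)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one log at the top, one a level down, one two levels down.
    for rel in ("top.log", "auth/auth.log", "auth/deep/nested.log"):
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    one_level = matched_logs(root, "*/*.log")  # same shape as /java/logs/*/*.log
    top_only = matched_logs(root, "*.log")     # same shape as /var/log/*.log

print(one_level)  # only the file exactly one directory down
print(top_only)   # only the file directly in the root
```

Logs that sit directly in the root directory, or more than one level down, need their own paths entry (or recursive_glob with **).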


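There are four combinations of negate and match. Taking the lines that match pattern b as the anchor, the combinations decide whether the lines before b or after b are merged into it. For example, the stack-trace lines of an exception that start with "at" can be kept in one event instead of being split apart.

# Java stack traces: indented "at ..." / "..." lines and "Caused by:" lines are appended to the preceding line
multiline.pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
multiline.negate: false
multiline.match: after

# Lines that do not start with a yyyy-MM-dd date are appended to the preceding line that does
multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
multiline.negate: true
multiline.match: after

# This one merged everything from one thread (the officially recommended example; it merges too aggressively and did not work here: the timestamps were jumbled after merging)
multiline.pattern: '^\['
multiline.negate: true
multiline.match: after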
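How negate and match interact is easier to see on sample input. Below is a minimal Python sketch of the grouping rule, assuming Filebeat's documented semantics; it is not Filebeat's implementation. The function group_multiline and the sample log lines are invented for illustration, max_lines and timeout are not modelled, and Python's re module stands in for Go's regexp (POSIX classes such as [[:space:]] are not supported by re, so only the date pattern is used).

```python
import re


def group_multiline(lines, pattern, negate=False, match="after"):
    """Group raw lines into events the way multiline.* is described to work.

    A line is a "continuation" when (pattern matches) != negate.
    match="after":  continuations are appended to the previous non-continuation line.
    match="before": continuations are prepended to the next non-continuation line.
    """
    if match not in ("after", "before"):
        raise ValueError("match must be 'after' or 'before'")
    regex = re.compile(pattern)
    events, current = [], []
    for line in lines:
        continuation = bool(regex.search(line)) != negate
        if match == "after":
            # A non-continuation line starts a new event.
            if not continuation and current:
                events.append("\n".join(current))
                current = []
            current.append(line)
        else:
            # A non-continuation line closes the event it belongs to.
            current.append(line)
            if not continuation:
                events.append("\n".join(current))
                current = []
    if current:
        events.append("\n".join(current))
    return events


sample = [
    "2024-01-01 10:00:00 ERROR request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Demo.run(Demo.java:10)",
    "2024-01-01 10:00:01 INFO recovered",
]
events = group_multiline(sample, r"^[0-9]{4}-[0-9]{2}-[0-9]{2}", negate=True, match="after")
for event in events:
    print(repr(event))
```

With the date pattern, negate: true and match: after, the exception and its stack-trace line stay attached to the ERROR line, and the next dated line starts a new event.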




###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    # - /var/log/*.log
    - /java/logs/*/*.log
    - /java/logs/auth/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after

  # multiline.pattern: '^\['
  # multiline.negate: true
  # multiline.match: after

  # multiline.pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
  # multiline.negate: false
  # multiline.match: after

  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#  env: staging

#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true








In the TCP variant, Logstash reads the logs directly over TCP, so Filebeat does not need to be configured.



