Getting Started with Keepalived

A first introduction to using Keepalived.

High-Availability Clustering with Keepalived

1. High-availability clusters

1.1 Cluster types

  • LB: Load Balancing
    LVS/HAProxy/nginx (http/upstream, stream/upstream)
  • HA: High Availability
    Databases, Zookeeper, Redis
    SPoF: Single Point of Failure; HA clusters exist to remove single points of failure
  • HPC: High-Performance Computing
    https://www.top500.org

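1.2 System availability

SLA: Service-Level Agreement, an agreement or contract accepted by both the service provider and the customer that covers the quality, level, and performance of the service
A = MTBF / (MTBF + MTTR), where MTBF is the Mean Time Between Failures and MTTR is the Mean Time To Repair

99.95%: (60*24*30)*(1-0.9995) = 21.6 minutes   #downtime is usually counted per month

Common targets: 99.9%, 99.99%, 99.999%, 99.9999%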
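The downtime budget for the other targets follows from the same arithmetic. A minimal Python sketch, assuming a 30-day month as in the 99.95% example; the function name `monthly_downtime_minutes` is only illustrative:

```python
def monthly_downtime_minutes(availability_percent, days=30):
    """Allowed downtime per month, in minutes, for a given availability target."""
    minutes_per_month = 60 * 24 * days
    return minutes_per_month * (1 - availability_percent / 100)


# Print the monthly budget for each target mentioned above.
for target in (99.9, 99.95, 99.99, 99.999, 99.9999):
    print(f"{target}% -> {monthly_downtime_minutes(target):.3f} minutes per month")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why the higher targets are hard to reach without automatic failover.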

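1.3 System failures

Hardware failures: design defects, wear-out, natural disasters, and so on
Software failures: design defects, bugs

1.4 Achieving high availability

To improve availability, reduce MTTR (Mean Time To Repair)
Approach: build in redundancy

  • active/passive: master/backup
  • active/active: dual master
  • active --> HEARTBEAT --> passive
  • active <--> HEARTBEAT <--> active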
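To see why shortening MTTR matters, the availability formula from section 1.2 can be evaluated directly. This is only a sketch: the MTBF and MTTR figures below are made-up numbers for illustration, not measurements.

```python
def availability(mtbf_hours, mttr_hours):
    """A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 1000 hours.
manual = availability(1000, 4)        # 4 hours to repair by hand
automatic = availability(1000, 0.01)  # about 36 seconds for an automatic failover
print(f"manual repair:      {manual:.4%}")
print(f"automatic failover: {automatic:.4%}")
```

With the same failure rate, the redundant setup wins purely because the standby takes over in seconds instead of hours.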

1.5 High-availability technologies

1.5.1 HA service

Resources: the "components" that make up a highly available service, for example the VIP, the service process, and shared storage
(1) the number of passive nodes
(2) resource failover

1.5.2 Shared storage

  • NAS (Network Attached Storage): a shared file system accessed over the network.
  • SAN (Storage Area Network): block-level sharing over the network

1.5.3 Network partition

1.5.3.1 Quorum

with quorum: > total/2
without quorum: <= total/2
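The quorum rule is a strict-majority test. A small Python sketch of it (the function name `has_quorum` is illustrative):

```python
def has_quorum(votes, total):
    """True when a partition holds strictly more than half of all votes."""
    return votes > total / 2


print(has_quorum(3, 5))  # majority side of a 5-node cluster
print(has_quorum(2, 5))  # minority side
print(has_quorum(1, 2))  # either half of a split two-node cluster
```

In a two-node cluster neither half of a split can reach a majority on its own, which is why such clusters rely on an extra tie-breaker.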

1.5.3.2 Fencing devices

node: STONITH = Shoot The Other Node In The Head (force the node offline or cut its power)
References:
https://access.redhat.com/documentation/zh-cn/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-unfence-haar

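1.5.4 Two-node clusters

Auxiliary devices: arbitration device, ping node, quorum disk

  • Failover: when the primary node of a resource fails, the resource is moved to another node
  • Failback: after the failed primary node has been repaired and brought back online, the resource that was moved away is switched back to it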

1.5.5 HA cluster implementations

1.5.5.1 AIS: Application Interface Specification

  • RHCS: Red Hat Cluster Suite
    Reference: https://access.redhat.com/documentation/zh-cn/red_hat_enterprise_linux/5/html/cluster_suite_overview/ch.gfscs.cluster-overview-cso
  • heartbeat: service high availability based on heartbeat monitoring
  • pacemaker+corosync: resource management and failover

1.5.5.2 VRRP: Virtual Router Redundancy Protocol

Removes the static default gateway as a single point of failure

  • Hardware: routers, layer-3 switches
  • Software: keepalived

1.5.6 VRRP

1.5.6.1 VRRP hardware implementations at the network layer

Reference links:
https://support.huawei.com/enterprise/zh/doc/EDOC1000141382/19258d72/basic-concepts-of-vrrp
https://wenku.baidu.com/view/dc0afaa6f524ccbff1218416.html
https://wenku.baidu.com/view/281ae109ba1aa8114431d9d0.html

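1.5.6.2 VRRP terminology

  • Virtual router: Virtual Router
  • Virtual router identifier: VRID (1-255), uniquely identifies a virtual router
  • VIP: Virtual IP
  • VMAC: Virtual MAC (00-00-5e-00-01-VRID)
  • Physical routers:
    master: the active device
    backup: the standby device
    priority: the priority value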
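The VMAC is derived mechanically from the VRID: the last byte of the address is the VRID in hexadecimal. A short Python sketch, assuming the valid VRID range of 1 to 255 from the VRRP specification; `vrrp_vmac` is an illustrative name:

```python
def vrrp_vmac(vrid):
    """Virtual MAC of an IPv4 virtual router: 00-00-5e-00-01-{VRID in hex}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return f"00-00-5e-00-01-{vrid:02x}"


print(vrrp_vmac(55))  # VRID 55 is 0x37
print(vrrp_vmac(255))
```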

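1.5.6.3 VRRP mechanisms

Advertisements: heartbeat, priority, and so on; sent periodically

Operating behavior: preemptive, non-preemptive

Authentication:

  • No authentication
  • Simple text authentication: a pre-shared key
  • MD5

Working modes:

  • Master/backup: a single virtual router
  • Master/master: master/backup (virtual router 1), backup/master (virtual router 2)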

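2. Keepalived basics

2.1 About keepalived

A software implementation of the VRRP protocol, originally designed to make the ipvs service highly available
Website: http://keepalived.org/
Features:

  • Moves the VIP between nodes using the VRRP protocol
  • Generates ipvs rules on the node that holds the VIP (predefined in the configuration file)
  • Health-checks each RS of the ipvs cluster
  • Runs user-defined scripts through its script hooks so that they can influence cluster behavior, which is how it supports services such as nginx and haproxy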

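2.2 Keepalived architecture

Official documentation:

https://keepalived.org/doc/
http://keepalived.org/documentation.html

  • Core user-space components:
    vrrp stack: VIP advertisements
    checkers: monitor the real servers
    system call: runs scripts on VRRP state transitions

    SMTP: mail component
    IPVS wrapper: generates IPVS rules
    Netlink Reflector: network interface handling
    WatchDog: monitors the processes

  • Control component: the keepalived.conf parser, which applies the Keepalived configuration

  • I/O multiplexer: keepalived's own thread abstraction, optimized for networking

  • Memory management component: gives access to common memory-management functions (allocate, reallocate, release, and so on)

Keepalived process tree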

Keepalived <-- Parent process monitoring children
\_ Keepalived <-- VRRP child
\_ Keepalived <-- Healthchecking child

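2.3 Preparing the environment

  • Time must be synchronized on all nodes: ntp, chrony
  • Disable the firewall and SELinux
  • Nodes can reach each other by hostname: optional
  • Using /etc/hosts for name resolution is recommended: optional
  • root on each node can reach the other nodes over key-based ssh: optional

2.4 Keepalived files

  • Package name: keepalived
  • Main binary: /usr/sbin/keepalived
  • Main configuration file: /etc/keepalived/keepalived.conf
  • Sample configuration files: /usr/share/doc/keepalived/
  • Unit file: /lib/systemd/system/keepalived.service
  • Environment file for the unit:
    /etc/sysconfig/keepalived CentOS
    /etc/default/keepalived Ubuntu

Note: there is a bug on CentOS 7 that can show up as follows

systemctl restart keepalived #the new configuration may not take effect
systemctl stop keepalived;systemctl start keepalived #the process may not stop and has to be killed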

2.5 Installing Keepalived

2.5.1 Package installation

#CentOS
[root@centos ~]#yum install keepalived

#ubuntu
[root@ubuntu1804 ~]#apt -y install keepalived

2.5.1.1 Installing keepalived on CentOS

[root@ka1 ~]#yum install -y keepalived.x86_64 
[root@ka1 ~]#yum info keepalived.x86_64         #CentOS 8 ships version 2.1.5, which is recent enough to use as-is
Last metadata expiration check: 0:00:28 ago on Tue 28 Dec 2021 02:17:23 PM CST.
Installed Packages
Name         : keepalived
Version      : 2.1.5
Release      : 6.el8
Architecture : x86_64
Size         : 1.5 M
Source       : keepalived-2.1.5-6.el8.src.rpm
Repository   : @System
From repo    : appstream
Summary      : High Availability monitor built upon LVS, VRRP and service pollers
URL          : http://www.keepalived.org/
License      : GPLv2+
Description  : Keepalived provides simple and robust facilities for load balancing
             : and high availability to Linux system and Linux based infrastructures.
             : The load balancing framework relies on well-known and widely used
             : Linux Virtual Server (IPVS) kernel module providing Layer4 load
             : balancing. Keepalived implements a set of checkers to dynamically and
             : adaptively maintain and manage load-balanced server pool according
             : their health. High availability is achieved by VRRP protocol. VRRP is
             : a fundamental brick for router failover. In addition, keepalived
             : implements a set of hooks to the VRRP finite state machine providing
             : low-level and high-speed protocol interactions. Keepalived frameworks
             : can be used independently or all together to provide resilient
             : infrastructures.

[root@ka1 ~]#systemctl enable --now keepalived.service

[root@ka1 ~]#ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:95:b7:a2 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.11/24 brd 10.0.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.200.16/32 scope global ens33   #these addresses come from the default configuration file and are changed later
       valid_lft forever preferred_lft forever
    inet 192.168.200.17/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet 192.168.200.18/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe95:b7a2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

2.6 Keepalived configuration

2.6.1 Configuration file layout

Sections of /etc/keepalived/keepalived.conf

  • GLOBAL CONFIGURATION
    Global definitions: mail settings, router_id, VRRP settings, multicast address, and so on
  • VRRP CONFIGURATION
    VRRP instance(s): defines each VRRP virtual router
  • LVS CONFIGURATION
    Virtual server group(s)
    Virtual server(s): the VS and RS of the LVS cluster

2.6.2 Configuration syntax

Help

man keepalived.conf

2.6.2.1 Global configuration

[root@ka1 ~]#cat /etc/keepalived/keepalived.conf 
! Configuration File for keepalived

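global_defs {
   notification_email {
     acassen@firewall.loc       #recipients of the mail sent when keepalived fails over; one address per line
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc    #sender address
   smtp_server 192.168.200.1                                #mail server address
   smtp_connect_timeout 30                                  #mail server connection timeout
   router_id LVS_DEVEL      #identifier of this keepalived host; the hostname is recommended, although duplicate names across nodes do no harm
   vrrp_skip_check_adv_addr #by default every advertisement is fully checked, which costs performance; with this option the check is skipped when an advertisement comes from the same router as the previous one
   vrrp_strict              #strict VRRP compliance; with this option the service will not start if: 1. no VIP is configured 2. unicast peers are configured 3. IPv6 addresses are used with VRRP version 2. When it is enabled without vrrp_iptables, iptables rules are added automatically and the VIP becomes unreachable by default, so leaving this option out is recommended
   vrrp_garp_interval 0     #delay between gratuitous ARP messages; 0 means no delay
   vrrp_gna_interval 0      #delay between unsolicited NA messages
   vrrp_mcast_group4 224.0.0.18 #multicast group address, range 224.0.0.0 to 239.255.255.255; default: 224.0.0.18
   vrrp_iptables    #when enabled together with vrrp_strict, no firewall rules are added; not needed if vrrp_strict is not set

}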


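2.6.2.2 Configuring a virtual router

[root@ka1 ~]#cat /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {        #VRRP instance name, usually the name of the service
    state MASTER            #initial state of this node in the virtual router: MASTER or BACKUP
    interface eth0          #physical interface this virtual router is bound to, e.g. eth0, bond0, br0; it may differ from the interface that carries the VIP
    virtual_router_id 51    #identifier of the virtual router, range 1-255; every virtual router needs its own value, otherwise the service will not start; all keepalived nodes of the same virtual router must use the same value, and the value must be unique within the network
    priority 100            #priority of this node in the virtual router, range 1-254; use a different value on each keepalived node
    advert_int 1            #VRRP advertisement interval, default 1s
    authentication {        #authentication
        auth_type PASS      #AH is IPSEC authentication (not recommended); PASS is a simple password (recommended)
        auth_pass 1111      #pre-shared key; only the first 8 characters are used; must be identical on all keepalived nodes of the same virtual router
    }
    virtual_ipaddress {     #virtual IPs; production setups may list hundreds of addresses
        192.168.200.16      #a VIP; with no device given it goes on eth0 by default; note: without /prefix the mask defaults to /32
        192.168.200.17
        192.168.200.18
        192.168.200.101/24 dev eth1     #VIP on a specific interface; using a different interface from the one in the interface directive is recommended
        192.168.200.102/24 dev eth2 label eth2:1    #VIP with an interface label
    }
    track_interface { #interfaces to monitor; if one fails, the instance enters the FAULT state and the address moves away
        eth0
        eth1
        …
    }
}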

2.6.2.3 Enabling keepalived logging

[root@ka1 ~]#vim /etc/sysconfig/keepalived 
[root@ka1 ~]#cat /etc/sysconfig/keepalived | grep -Ev '^#|^$'
KEEPALIVED_OPTIONS="-D -S 6"

[root@ka1 ~]#vim /etc/rsyslog.conf 
local6.*                                                /var/log/keepalived.log   

[root@ka1 ~]#systemctl restart keepalived.service rsyslog.service 

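2.6.2.4 Using separate sub-configuration files

In a complex production environment /etc/keepalived/keepalived.conf grows large and hard to manage. The configuration of each cluster, for example its VIP settings, can be moved into its own sub-configuration file.

The include directive pulls in sub-configuration files.

Format:

include /path/file

[root@ka1 ~]#mkdir /etc/keepalived/conf.d/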
[root@ka1 ~]#cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
[root@ka1 ~]#ls /etc/keepalived/
conf.d  keepalived.conf  keepalived.conf.bak

[root@ka1 ~]#vim /etc/keepalived/keepalived.conf
[root@ka1 ~]#cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

include /etc/keepalived/conf.d/*.conf       #keep the VRRP configuration in sub-configuration files

3. Keepalived in practice

3.1 A single-master (master/slave) Keepalived setup

3.1.1 MASTER configuration

#global configuration
[root@ka1 ~]#cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     448803503@qq.com
   }
   notification_email_from 448803503@qq.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id ka1
   vrrp_skip_check_adv_addr
   #vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
   vrrp_mcast_group4 224.0.0.18
}
include /etc/keepalived/conf.d/*.conf

#VRRP sub-configuration
[root@ka1 ~]#cat /etc/keepalived/conf.d/master.conf 
vrrp_instance test1 {
    state MASTER
    interface ens33
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
}

#restart the service
[root@ka1 ~]#systemctl restart keepalived.service
[root@ka1 ~]#systemctl status keepalived.service 
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-12-28 15:08:42 CST; 24s ago
  Process: 34597 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 34599 (keepalived)
    Tasks: 2 (limit: 23364)
   Memory: 1.8M
   CGroup: /system.slice/keepalived.service
           ├─34599 /usr/sbin/keepalived -D -S 6
           └─34600 /usr/sbin/keepalived -D -S 6

Dec 28 15:08:45 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:45 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:45 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:45 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:50 ka1 Keepalived_vrrp[34600]: (test1) Sending/queueing gratuitous ARPs on ens33 for 1>
Dec 28 15:08:50 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:50 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:50 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:50 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15
Dec 28 15:08:50 ka1 Keepalived_vrrp[34600]: Sending gratuitous ARP on ens33 for 10.0.0.15

#check the addresses
[root@ka1 ~]#ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:95:b7:a2 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.11/24 brd 10.0.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 10.0.0.15/32 scope global ens33            #the configured virtual IP is visible on the master
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe95:b7a2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

3.1.2 BACKUP configuration

#the configuration is almost the same as on the master; only three lines change
[root@ka2 ~]#cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     448803503@qq.com
   }
   notification_email_from 448803503@qq.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id ka2                        #changed
   vrrp_skip_check_adv_addr
   #vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
   vrrp_mcast_group4 224.0.0.18
}
include /etc/keepalived/conf.d/*.conf


[root@ka2 ~]#cat /etc/keepalived/conf.d/backup.conf 
vrrp_instance test1 {
    state BACKUP                        #changed
    interface ens33
    virtual_router_id 55
    priority 80                         #changed
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
}

#restart the service
[root@ka2 ~]#systemctl restart keepalived.service 
[root@ka2 ~]#ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:ff:33:b2 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.21/24 brd 10.0.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever              #no virtual IP here
    inet6 fe80::20c:29ff:feff:33b2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


3.1.3 Connectivity test

#ping the virtual address from a client
root@ubuntu1804:~# ping 10.0.0.15
PING 10.0.0.15 (10.0.0.15) 56(84) bytes of data.
64 bytes from 10.0.0.15: icmp_seq=1 ttl=64 time=0.418 ms
64 bytes from 10.0.0.15: icmp_seq=2 ttl=64 time=0.260 ms
64 bytes from 10.0.0.15: icmp_seq=3 ttl=64 time=0.668 ms
64 bytes from 10.0.0.15: icmp_seq=4 ttl=64 time=0.297 ms
64 bytes from 10.0.0.15: icmp_seq=5 ttl=64 time=0.317 ms
64 bytes from 10.0.0.15: icmp_seq=6 ttl=64 time=0.294 ms
64 bytes from 10.0.0.15: icmp_seq=7 ttl=64 time=0.255 ms
64 bytes from 10.0.0.15: icmp_seq=8 ttl=64 time=0.315 ms
64 bytes from 10.0.0.15: icmp_seq=9 ttl=64 time=0.281 ms
64 bytes from 10.0.0.15: icmp_seq=10 ttl=64 time=0.242 ms

#capture packets on any machine in the same segment   #all multicast advertisements come from the master
[root@centos7blog ~]# tcpdump -i ens33 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
23:25:38.412683 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:25:39.414746 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:25:40.416592 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:25:41.420546 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:25:42.424322 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:25:43.427916 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:25:44.430978 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
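The fields tcpdump prints (vrid, prio, authtype, intvl, length 20) map onto the VRRPv2 advertisement layout in RFC 3768: an 8-byte header, 4 bytes per virtual IP, and 8 bytes of authentication data. The Python sketch below builds such a packet for the values used in this lab. It is an illustration of the RFC layout only; the function names are made up here, and the bytes keepalived actually puts on the wire may differ in detail.

```python
import socket
import struct


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_vrrpv2_advert(vrid, priority, vips, auth_type=1, advert_int=1, password=b""):
    """Build a VRRPv2 advertisement following the RFC 3768 field order."""
    addrs = b"".join(socket.inet_aton(ip) for ip in vips)
    auth = password[:8].ljust(8, b"\x00")  # authentication data is always 8 bytes
    # 0x21 = version 2 in the high nibble, type 1 (advertisement) in the low nibble
    fields = (0x21, vrid, priority, len(vips), auth_type, advert_int)
    unsummed = struct.pack("!BBBBBBH", *fields, 0) + addrs + auth
    checksum = internet_checksum(unsummed)
    return struct.pack("!BBBBBBH", *fields, checksum) + addrs + auth


packet = build_vrrpv2_advert(55, 100, ["10.0.0.15"], password=b"sunxiang")
print(len(packet), packet.hex())
```

With one virtual IP the packet is 8 + 4 + 8 = 20 bytes, which agrees with the "length 20" in the capture. The fixed 8-byte authentication field is also why only the first 8 characters of auth_pass count.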

3.1.4 Failure simulation

#stop the keepalived service on the master
[root@ka1 ~]#systemctl stop keepalived.service 

#check whether the ping is interrupted
64 bytes from 10.0.0.15: icmp_seq=176 ttl=64 time=0.410 ms
From 10.0.0.11: icmp_seq=177 Redirect Host(New nexthop: 10.0.0.15)      #a brief interruption
64 bytes from 10.0.0.15: icmp_seq=177 ttl=64 time=314 ms
64 bytes from 10.0.0.15: icmp_seq=178 ttl=64 time=0.321 ms
64 bytes from 10.0.0.15: icmp_seq=179 ttl=64 time=0.548 ms


#watch the capture
23:29:22.915391 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
23:29:22.961564 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 0, authtype simple, intvl 1s, length 20
23:29:23.650834 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20            #the former backup becomes master and sends the multicast advertisements
23:29:24.653162 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
23:29:25.656959 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
23:29:26.657822 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
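The roughly 0.7 s between the priority-0 advertisement and the first advertisement from 10.0.0.21 matches the VRRPv2 timers in RFC 3768: a backup that sees priority 0 waits only its skew time, (256 - priority) / 256 seconds, before taking over, while a master that disappears silently is declared dead after 3 * advert_int + skew time. A small Python check against the timestamps in the capture above (function names are illustrative, and this assumes keepalived follows the RFC timers here):

```python
def skew_time(priority):
    """Skew_Time in seconds per RFC 3768: (256 - priority) / 256."""
    return (256 - priority) / 256


def master_down_interval(priority, advert_int=1):
    """Seconds a backup waits without any advertisement before taking over."""
    return 3 * advert_int + skew_time(priority)


# Timestamps copied from the capture above (seconds within minute 23:29).
prio0_seen = 22.961564    # 10.0.0.11 announces priority 0 on shutdown
backup_first = 23.650834  # first advertisement from 10.0.0.21

print(f"observed takeover delay:         {backup_first - prio0_seen:.4f}s")
print(f"skew time for priority 80:       {skew_time(80):.4f}s")
print(f"timeout after a silent failure:  {master_down_interval(80):.4f}s")
```

A clean `systemctl stop` is therefore the fast case. If the master crashes or loses its network without sending priority 0, the backup in this setup would wait a little under four seconds.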


#check the addresses
[root@ka2 ~]#ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:ff:33:b2 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.21/24 brd 10.0.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 10.0.0.15/32 scope global ens33            #the virtual address is now on the former backup
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:feff:33b2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

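3.2 Preemptive and non-preemptive modes

3.2.1 Non-preemptive mode: nopreempt

The default is preemptive mode (preempt): when a higher-priority host comes back online, it takes the master role away from the lower-priority host, which causes a brief network disturbance. Setting nopreempt is recommended, so that a recovered higher-priority host does not take the master role back from the lower-priority host.

In non-preemptive mode the VIP still moves back to the original host if the new master later goes down as well.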
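The difference between the two modes can be modelled in a few lines. This is a deliberately simplified Python sketch: it ignores VRRP timers and tie-breaking, and the `Node` class and `next_master` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int
    up: bool = True


def next_master(nodes, current, preempt):
    """Return the name of the node that should hold the VIP next."""
    alive = [n for n in nodes if n.up]
    if not alive:
        return None
    best = max(alive, key=lambda n: n.priority)
    holder = next((n for n in alive if n.name == current), None)
    if holder is None:
        return best.name  # the current master is gone: fail over
    if preempt and best.priority > holder.priority:
        return best.name  # a higher-priority node takes the role back
    return holder.name    # otherwise the current master keeps the VIP


ka1, ka2 = Node("ka1", 100), Node("ka2", 80)
nodes = [ka1, ka2]

ka1.up = False
print(next_master(nodes, "ka1", preempt=True))   # failover to ka2 in either mode
ka1.up = True
print(next_master(nodes, "ka2", preempt=True))   # preemptive: ka1 takes over again
print(next_master(nodes, "ka2", preempt=False))  # non-preemptive: ka2 stays master
```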

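3.2.2 Preemptive mode demonstration

#the backup is currently acting as master; bring the original master back
[root@ka1 ~]#systemctl restart keepalived.service   

#the multicast advertisements switch from the backup back to the original master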
[root@centos7blog ~]# tcpdump -i ens33 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
00:03:23.477748 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:03:24.478645 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:03:25.481569 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:03:26.483825 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:03:26.548875 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20           #the original master takes over again
00:03:27.552367 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:03:28.552676 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:03:29.553209 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:03:30.556446 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20

#this preemption caused no visible interruption in the ping
root@ubuntu1804:~# ping 10.0.0.15
PING 10.0.0.15 (10.0.0.15) 56(84) bytes of data.
64 bytes from 10.0.0.15: icmp_seq=1 ttl=64 time=0.222 ms
64 bytes from 10.0.0.15: icmp_seq=2 ttl=64 time=0.260 ms
64 bytes from 10.0.0.15: icmp_seq=3 ttl=64 time=0.310 ms
64 bytes from 10.0.0.15: icmp_seq=4 ttl=64 time=0.273 ms
64 bytes from 10.0.0.15: icmp_seq=5 ttl=64 time=0.728 ms
64 bytes from 10.0.0.15: icmp_seq=6 ttl=64 time=0.307 ms
64 bytes from 10.0.0.15: icmp_seq=7 ttl=64 time=0.265 ms
64 bytes from 10.0.0.15: icmp_seq=8 ttl=64 time=0.304 ms
64 bytes from 10.0.0.15: icmp_seq=9 ttl=64 time=0.271 ms
64 bytes from 10.0.0.15: icmp_seq=10 ttl=64 time=0.581 ms
64 bytes from 10.0.0.15: icmp_seq=11 ttl=64 time=0.314 ms
64 bytes from 10.0.0.15: icmp_seq=12 ttl=64 time=0.321 ms
64 bytes from 10.0.0.15: icmp_seq=13 ttl=64 time=0.279 ms
64 bytes from 10.0.0.15: icmp_seq=14 ttl=64 time=0.328 ms
64 bytes from 10.0.0.15: icmp_seq=15 ttl=64 time=1.01 ms

3.2.3 Disabling preemption

Note: to disable VIP preemption, state must be set to BACKUP on every keepalived server

#edit the configuration file on the master
[root@ka1 ~]#vim /etc/keepalived/conf.d/master.conf 
vrrp_instance test1 {
    state BACKUP                #BACKUP on both nodes
    interface ens33
    virtual_router_id 55
    priority 100                #higher priority
    advert_int 1
    nopreempt                   #add this line on both nodes
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
}

#restart the service
[root@ka1 ~]#systemctl restart keepalived.service

#edit the configuration file on the backup
[root@ka2 ~]#vim /etc/keepalived/conf.d/backup.conf 
vrrp_instance test1 {
    state BACKUP                #BACKUP on both nodes
    interface ens33
    virtual_router_id 55
    priority 80                 #lower priority
    advert_int 1
    nopreempt                   #add this line on both nodes
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
}

#restart the service
[root@ka2 ~]#systemctl restart keepalived.service

3.2.4 Non-preemptive mode demonstration

#the master normally sends the advertisements; after a simulated master failure the backup sends them, and it keeps sending them after the master recovers
[root@centos7blog ~]# tcpdump -i ens33 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
00:22:50.623167 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:22:51.624082 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:22:52.624925 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:22:53.625535 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:22:54.626160 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:22:55.627201 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:22:56.560252 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 0, authtype simple, intvl 1s, length 20
00:22:57.250830 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20        #the backup takes over
00:22:58.251002 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:22:59.252198 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:23:00.256237 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:23:01.258095 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:23:02.260847 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:23:03.263876 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:23:04.267782 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20        #no preemption
00:23:05.270750 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:23:06.272648 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20

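3.2.5 Delayed preemption: preempt_delay

With a preemption delay, a recovered higher-priority host does not take the VIP back immediately; it waits for the configured delay first.

preempt_delay #   #preemption delay of # seconds; range 0 to 1000, default 0 (no delay)

Note: state must be BACKUP on every keepalived server, and vrrp_strict must not be enabled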
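As a rough mental model of the delay, the lower-priority node keeps the VIP until the delay has elapsed after the higher-priority node returns. The Python sketch below is a simplification with made-up names (`vip_holder`, `returned_at`); real keepalived timing also depends on advertisement intervals.

```python
def vip_holder(t, returned_at, preempt_delay, high="ka1", low="ka2"):
    """Simplified model: who holds the VIP at second t, given that the
    high-priority node came back at second returned_at."""
    if t < returned_at:
        return low
    return high if t - returned_at >= preempt_delay else low


# ka1 comes back at t=1 with a 5-second preemption delay.
for t in range(8):
    print(t, vip_holder(t, returned_at=1, preempt_delay=5))
```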

#edit the configuration file on the master
[root@ka1 ~]#vim /etc/keepalived/conf.d/master.conf 
vrrp_instance test1 {
    state BACKUP                    #BACKUP on both nodes
    interface ens33
    virtual_router_id 55
    priority 100                    #higher priority
    advert_int 1
    preempt_delay 5                 #wait 5 seconds before preempting
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
}

#restart the service
[root@ka1 ~]#systemctl restart keepalived.service 


#edit the configuration file on the backup
[root@ka2 ~]#vim /etc/keepalived/conf.d/backup.conf 
vrrp_instance test1 {
    state BACKUP                    #BACKUP on both nodes
    interface ens33
    virtual_router_id 55
    priority 80                     #lower priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
}

#restart the service
[root@ka2 ~]#systemctl restart keepalived.service 

3.2.6 Delayed preemption demonstration

#the master normally sends the advertisements; after the simulated failure the backup sends them; 5 seconds after the master recovers it becomes master again
[root@centos7blog ~]# tcpdump -i ens33 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
00:35:54.839513 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:35:55.840257 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:35:56.843010 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:35:57.845830 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:35:58.848068 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:35:59.850687 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:36:00.851696 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:36:01.854594 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:36:02.855587 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
00:36:03.099360 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 0, authtype simple, intvl 1s, length 20
00:36:03.790192 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:04.791700 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:05.794717 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:06.796545 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:07.798453 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:08.799963 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:09.802847 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:10.805284 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:11.809026 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:12.810821 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:13.813552 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:14.816171 IP 10.0.0.21 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 80, authtype simple, intvl 1s, length 20
00:36:15.420084 IP 10.0.0.11 > 224.0.0.18: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20

3.3 VIP unicast configuration

By default keepalived hosts advertise to each other over multicast, which can congest the network; switching to unicast reduces the traffic.
Note: unicast cannot be used when vrrp_strict is enabled

#edit the configuration file on the master
[root@ka1 ~]#vim /etc/keepalived/conf.d/master.conf 
vrrp_instance test1 {
    state BACKUP
    interface ens33
    virtual_router_id 55
    priority 100
    advert_int 1
    preempt_delay 5
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
    unicast_src_ip 10.0.0.11        #IP of this host
    unicast_peer {
        10.0.0.21                   #IP of the peer; with more keepalived nodes, add the IPs of the other nodes
    }
}

#restart the service
[root@ka1 ~]#systemctl restart keepalived.service

#edit the configuration file on the backup
[root@ka2 ~]#vim /etc/keepalived/conf.d/backup.conf 
vrrp_instance test1 {
    state BACKUP
    interface ens33
    virtual_router_id 55
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass sunxiang
    }
    virtual_ipaddress {
        10.0.0.15 dev ens33 label ens33:0
    }
    unicast_src_ip 10.0.0.21        #IP of this host
    unicast_peer {
        10.0.0.11                   #IP of the peer
    }
}

#restart the service
[root@ka2 ~]#systemctl restart keepalived.service

#capturing on the multicast group now shows nothing
[root@centos7blog ~]# tcpdump -i ens33 -nn host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
1 packet received by filter
0 packets dropped by kernel

#capture unicast by source and destination address; the advertisements show up
[root@centos7blog ~]# tcpdump -i ens33 -nn src host 10.0.0.11 and dst host 10.0.0.21
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
05:24:24.528624 IP 10.0.0.11 > 10.0.0.21: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
05:24:25.534558 IP 10.0.0.11 > 10.0.0.21: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
05:24:26.543636 IP 10.0.0.11 > 10.0.0.21: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
05:24:27.549720 IP 10.0.0.11 > 10.0.0.21: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
05:24:28.557037 IP 10.0.0.11 > 10.0.0.21: VRRPv2, Advertisement, vrid 55, prio 100, authtype simple, intvl 1s, length 20
^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
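With unicast, each node's unicast_peer list has to point at the unicast_src_ip of the other nodes, and a mistyped address (for example 10.0.0.2 instead of 10.0.0.21) silently stops the advertisements from reaching the peer. The Python sketch below is a hypothetical pre-flight check, not a keepalived feature: it takes a hand-written mapping of each node's unicast_src_ip to its unicast_peer list and reports entries that do not match up.

```python
def check_unicast_peers(nodes):
    """nodes maps each node's unicast_src_ip to its unicast_peer list.
    Returns a list of problems; the list is empty when the peers are consistent."""
    problems = []
    for src, peers in nodes.items():
        for peer in peers:
            if peer not in nodes:
                problems.append(f"{src}: peer {peer} is not the unicast_src_ip of any node")
            elif src not in nodes[peer]:
                problems.append(f"{src}: peer {peer} does not list {src} back")
    return problems


good = {"10.0.0.11": ["10.0.0.21"], "10.0.0.21": ["10.0.0.11"]}
typo = {"10.0.0.11": ["10.0.0.2"], "10.0.0.21": ["10.0.0.11"]}

print(check_unicast_peers(good))
for line in check_unicast_peers(typo):
    print(line)
```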