I recently bought a few lightweight cloud servers from Alibaba Cloud. All of them are 1C/0.5G, yet while the Ubuntu 20.04 machines run perfectly fine, the Ubuntu 22.04 ones stop responding every so often. The symptoms: the machine still answers ping, but SSH logins time out, and the host-side monitoring shows CPU and disk I/O spiking. Logging in over VNC shows messages like these:

```
[44211.553196] Out of memory: Killed process 95466 (apt-check) ... shmem-rss:0kB, UID:0 pgtables:300kB oom_score_adj:0
[44213.738115] systemd[1]: Failed to start Refresh fwupd metadata and update motd.
[63358.618014] Out of memory: Killed process 123118 (apt-check) total-vm:190376kB ... UID:0 pgtables:364kB oom_score_adj:0
[63359.212458] systemd[1]: Failed to start Daily apt download activities.
[74055.581349] Out of memory: Killed process 126756 (apt-check) total-vm:190376kB, ... UID:0 pgtables:376kB oom_score_adj:0
[121996.542525] Out of memory: Killed process 210249 (apt-check) total-vm:100788kB, ... UID:0 pgtables:228kB oom_score_adj:0
[121996.882131] systemd[1]: snapd.service: Watchdog timeout (limit 5min)!
[121997.311208] systemd[1]: Failed to start Daily apt download activities.
[179235.303036] Out of memory: Killed process 292938 (apt-check) total-vm:190376kB, ... UID:0 pgtables:372kB oom_score_adj:0
```

Automatic apt updates were already disabled on this machine. After some searching, it turned out that snapd can cause this problem:

1. List the installed snap packages with `snap list`:

```
Name    Version        Rev    Tracking       Publisher   Notes
core20  20240416       2318   latest/stable  canonical✓  base
lxd     5.0.3-80aeff7  29351  5.0/stable/…   canonical✓  -
snapd   2.63           21759  latest/stable  canonical✓  snapd
```
2. Remove the snap packages one by one (they depend on each other, so the order matters):

```
sudo snap remove --purge lxd
sudo snap remove --purge core20
sudo snap remove --purge snapd
```
3. Remove snapd itself:

```
sudo apt remove snapd
sudo apt purge snapd
```
4. Add an apt preferences file so snapd does not get reinstalled:

sudo vim /etc/apt/preferences.d/nosnap.pref

```
Package: snapd
Pin: release a=*
Pin-Priority: -10
```
5. Clean the apt cache and refresh the indexes (optional):

sudo apt clean && sudo apt update

After completely removing snapd, the machine has now been running normally for two or three days…

Follow-up

It later turned out that this was still not enough. The actual fix was to give this tiny-memory machine some swap: Alibaba Cloud's image ships without any swap, and on top of that sets swappiness to 0! To guard against anything odd happening, I put the commands into a crontab entry that runs every minute.

Create and enable swap

```
dd if=/dev/zero of=/swap.img bs=1M count=1024
chmod 600 /swap.img
mkswap /swap.img
swapon /swap.img
```

Add the scheduled entries with `sudo crontab -e`:

```
@reboot swapon -s | grep -q swap || swapon /swap.img
@reboot echo 60 > /proc/sys/vm/swappiness
```
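If you'd rather not depend on cron, the conventional way to make both settings persistent is an /etc/fstab entry plus a sysctl drop-in; a minimal sketch (the drop-in file name is my choice, not from the original setup):

```
# /etc/fstab: activate the swap file at every boot
/swap.img  none  swap  sw  0  0

# /etc/sysctl.d/99-swappiness.conf: persist the swappiness setting
vm.swappiness = 60
```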

### References

[Terminate unattended-upgrades or whatever is using apt in ubuntu 18.04 or later editions](https://askubuntu.com/questions/1186492/terminate-unattended-upgrades-or-whatever-is-using-apt-in-ubuntu-18-04-or-later)

[How to Remove Snap Packages in Ubuntu Linux](https://www.debugpoint.com/remove-snap-ubuntu/)

[How do I configure swappiness?](https://askubuntu.com/questions/103915/how-do-i-configure-swappiness)

[How to read oom-killer syslog messages?](https://serverfault.com/questions/548736/how-to-read-oom-killer-syslog-messages)

[How can I check if swap is active from the command line?](https://unix.stackexchange.com/questions/23072/how-can-i-check-if-swap-is-active-from-the-command-line)

[Linux Partition HOWTO: 9. Setting Up Swap Space](https://tldp.org/HOWTO/Partition/setting_up_swap.html)

[How to Clear RAM Memory Cache, Buffer and Swap Space on Linux](https://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/)

[Swappiness: What it Is, How it Works & How to Adjust](https://phoenixnap.com/kb/swappiness)

The drive is an HP Ultrium 6-SCSI using LTO-6 tapes, which hold roughly 2 TB each.

Install the mt-st tool to manage the tape drive, then check the drive status:

```
sudo mt -f /dev/nst0 status
```

The output looks roughly like this:

```
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x5a (LTO-6).
Soft error count since last status=0
General status bits on (41010000):
 BOT ONLINE IM_REP_EN
```

tar does not impose any particular ordering on the files in an archive; the final order is whatever readdir returns, which may not be what you want. You can generate a sorted file list ahead of time and feed it to tar:

```
find tobackupdirname -print0 | sort -z > /tmp/filelist.txt
```
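To see what the NUL-separated, sorted list looks like, here is a tiny self-contained demo (the `backup` directory and its files are made up for illustration):

```shell
demo=$(mktemp -d) && cd "$demo"
mkdir -p backup/b backup/a
touch backup/b/2.txt backup/a/1.txt
# -print0 / sort -z keep arbitrary file names intact; sorting makes the order deterministic.
find backup -print0 | sort -z > filelist.txt
tr '\0' '\n' < filelist.txt
# backup
# backup/a
# backup/a/1.txt
# backup/b
# backup/b/2.txt
```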

If the total size of the files to be archived exceeds the capacity of one tape, enable tar's multi-volume support (running it inside tmux is recommended so the copy is not interrupted):

```
sudo tar -cvf /dev/nst0 -M --no-recursion --null -T /tmp/filelist.txt
```

tar will then prompt for a volume change whenever the tape fills up:

```
...
Prepare volume #2 for ‘/dev/nst0’ and hit return:
```

Swap in a new tape and hit return, and tar carries on with the backup.
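The volume-switching behavior can be rehearsed without a tape drive by writing to regular files with an artificially small volume size; with GNU tar, extra `-f` options name the successive volumes, so no interactive prompt is needed (all file names below are made up for the demo):

```shell
cd "$(mktemp -d)"
# 60 KiB of data with volumes capped at 50 KiB forces a switch to volume 2.
dd if=/dev/zero of=big.bin bs=1024 count=60 2>/dev/null
tar -c -M -L 50 -f vol1.tar -f vol2.tar big.bin
ls -l vol1.tar vol2.tar
```

Restoring works the same way: `tar -x -M -f vol1.tar -f vol2.tar`.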

Of course, LTFS is another option for tape backups, but that is beyond the scope of this note.

References

How do I create a tar file in alphabetical order?

Tar Splitting Into Standalone Volumes

I have a Windows VM created long ago that boots via OVMF UEFI, but its main disk is attached over IDE. Because the guest has no Virtio drivers installed, simply reattaching the disk as SCSI makes Windows fail with INACCESSIBLE_BOOT_DEVICE. SATA fails the same way.

After some searching I found a reliable answer:

  1. Shut down the VM.
  2. Attach a Windows installation ISO; I used Win10 22H2 2024.07 Business Edition.
  3. Attach the Virtio driver ISO.
  4. Set the boot devices to the two ISOs you just attached and untick every other disk. (One gotcha: the Virtio ISO must have a higher boot priority than the Windows ISO, otherwise the driver disc cannot be found once you are inside Windows Setup.)
  5. Start the VM, wait for the Windows Setup screen, and press Shift+F10 to open a console.
  6. Use commands such as dir C:\ and dir D:\ to work out which drive letters hold the system disk and the Virtio driver disc.
  7. Assuming the system disk is C: and the driver disc is E:, install the driver with: dism /image:C:\ /add-driver /driver:E:\vioscsi\w10\amd64
  8. When the command finishes, shut down with: wpeutil shutdown -s
  9. Detach the ISOs, then detach the original IDE disk and reattach it via SCSI.
  10. Start the VM; it now boots into the system normally.

After walking through the whole procedure, the VM was successfully moved off the IDE disk. I also upgraded the Machine type definition, and so far nothing looks wrong.

References

Change disk type (IDE/SATA to SCSI) for existing Windows machine

While installing from the PVE 8.2.4 ISO, the installer noticed an existing PVE 7.4 installation on another disk in the same machine and prompted to rename it to pve-OLD; this cannot be skipped.

The old PVE on that disk is preserved after installation, but this also means the disk cannot be reused for anything else (Disk Management reports it as in use, so Wipe Disk fails).

Follow these steps to remove pve-OLD.

Remove the LVs

```
lvdisplay
lvchange -an /dev/pve-OLD-0BFE9176
lvremove /dev/pve-OLD-0BFE9176
```

Remove the VG

```
vgdisplay
vgremove pve-OLD-0BFE9176
```

Remove the PV

```
pvdisplay
pvremove /dev/nvme1n1p3
```

The old disk can now be used for other purposes.

References

How to delete PVE Old

jordanhillis/pvekclean

After installing PVE 8.2.4 from the ISO, syslog keeps reporting:

```
...
Aug 31 22:29:31 pve-sg pve-firewall[1727]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Aug 31 22:29:41 pve-sg pve-firewall[1727]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Aug 31 22:29:51 pve-sg pve-firewall[1727]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
... (repeats every 10 seconds) ...
```

Poking at it directly, ipset and iptables -vnL report:

```
ipset v7.10: Cannot open session to kernel.
command 'ipset save' failed: exit code 1
...

iptables v1.8.9 (legacy): can't initialize iptables table...: Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
```

It looks like the new version enables ebtables but something is still broken somewhere. I haven't found a proper fix; the current workaround is Datacenter -> Firewall -> Options, set ebtables to No. After a reboot the error log is gone.

There is another odd issue: with ebtables enabled (the default), CIFS mounts fail with no errors visible in dmesg or journalctl. I haven't found a reliable fix for this either; disabling ebtables makes it work.

References

status update error: iptables_restore_cmdlist

The WebUI only lets you add Storage; there is no way to remove one, so it has to be done manually from a node shell.

Assume the storage to delete is called localssd.

1. Disable the mount unit (this also unmounts it):

```
systemctl status mnt-pve-localssd.mount
systemctl disable --now mnt-pve-localssd.mount
```
2. Verify the mount point is no longer mounted and delete it:

```
ls -al /mnt/pve/localssd
rmdir /mnt/pve/localssd
```
3. Delete the mount unit file:

```
rm /etc/systemd/system/mnt-pve-localssd.mount
```
4. Remove the storage definition:

```
nano /etc/pve/storage.cfg
```

Find the localssd stanza and delete it:

```
dir: localssd
    path /mnt/pve/localssd
    content images
    is_mountpoint 1
    nodes pve-sg
    shared 0
```

References

[SOLVED] Removing old Storage from GUI

Proper way to remove old kernels from PVE 8.0.4 & which are safe to remove

Linux Quick Tip: How to Delete or Remove LVM volumes

  1. Make a bootable Ubuntu Desktop USB stick and boot the machine from it.

  2. Back up the entire disk:

```
dd if=/dev/nvme1n1 of=/mnt/download/diskbackup/diskc.raw bs=4M
```

The output file given to of can live on the network, e.g. on an SMB share. Mount it beforehand with something like:

```
sudo mkdir /mnt/download
sudo mount -t cifs <SMB address> /mnt/download -o rw,uid=1000,gid=1000
```

If the copy is accidentally interrupted, boot back into Ubuntu Desktop and resume it with:

```
dd if=/dev/nvme1n1 of=/mnt/download/diskbackup/diskc.raw bs=4M seek=123456789 skip=123456789 iflag=skip_bytes oflag=seek_bytes
```

Here 123456789 is the length in bytes of the partial backup file. It is advisable to round down a bit: if 157 GiB (168577466368 bytes) have already been copied, resume from 150 GiB (161061273600 bytes) instead.

skip_bytes and seek_bytes make skip and seek count bytes rather than blocks of bs; without them you would actually skip 123456789 * 4M. Be careful not to swap iflag and oflag.
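The offset arithmetic and the resume semantics can be tried safely on ordinary files first; a small sketch (sizes are tiny stand-ins for the real disk):

```shell
cd "$(mktemp -d)"
# A 100 KiB stand-in for the source disk.
dd if=/dev/urandom of=disk.img bs=1024 count=100 2>/dev/null
# Simulate an interrupted first pass: only the first 64 KiB got copied.
dd if=disk.img of=backup.img bs=4096 count=16 2>/dev/null
# Resume from the partial file's current size, counting bytes on both sides.
# (dd preserves the seeked-over part of the output instead of truncating it.)
RESUME=$(stat -c %s backup.img)
dd if=disk.img of=backup.img bs=4096 skip=$RESUME seek=$RESUME iflag=skip_bytes oflag=seek_bytes 2>/dev/null
cmp disk.img backup.img && echo identical   # → identical
# The GiB-to-bytes conversion used in the text:
echo $((150 * 1024 * 1024 * 1024))          # → 161061273600
```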

  3. (Optional) Convert the raw image to qcow2, which takes less space. The larger the source file, the longer the conversion takes:

```
qemu-img convert -O qcow2 diskc.raw diskc.qcow2
```
  4. Create the Proxmox VM.

Since the source physical machine runs Windows, the VM is created as a Windows guest here.

Don't attach an ISO; pick OVMF (UEFI) as the BIOS and create the EFI disk as usual.

Put the image file under the path Proxmox expects. If that path is on SMB, make sure the disk file's owner and permissions are correct, otherwise the VM fails to start with Permission denied. (My case differed from the one on Stack Exchange: chown plus chmod 644 was enough to get the VM booting.)

  5. Booting directly from the raw image lands in GRUB, and exiting GRUB does not continue into the system.

I don't know the root cause, but Windows can be chainloaded by typing the following at the GRUB prompt. (hd0,gpt1) was found with GRUB's ls command.

```
insmod part_gpt
insmod chain
set root=(hd0,gpt1)
chainloader /EFI/Microsoft/Boot/bootmgfw.efi
boot
```

References

Translating bash to python; “dd” command “iflag=skip_bytes” how can be converted?

Determine the size of a block device

Resuming a DD of an entire disk

How to output file from the specified offset, but not “dd bs=1 skip=N”?

2.4. Converting Between RAW and QCOW2

How to use QEMU/KVM virtual machine disk image on SMB/CIFS network share: Permission denied

How to start a windows partition from the Grub command line

This only covers Linux VMs.

After cloning in the PVE UI, check whether the new VM's NIC got a different MAC address. Older PVE versions appear to have a bug where cloning copies the MAC as well. An identical IP is not a problem: as long as both VMs use DHCP, it sorts itself out after a reboot.

  1. Boot the newly created VM and change its hostname:

sudo vim /etc/hostname

  2. Update the hosts file to include the new hostname:

sudo vim /etc/hosts

  3. Reset the machine-id:

```
echo -n | sudo tee /etc/machine-id
sudo rm /var/lib/dbus/machine-id
sudo ln -s /etc/machine-id /var/lib/dbus/machine-id
```
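The reset can be rehearsed in a scratch directory first; ROOT below is a stand-in for / (this is purely a demo, not part of the live procedure):

```shell
# Set up a fake root containing the two machine-id files.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/etc" "$ROOT/var/lib/dbus"
echo 0123456789abcdef0123456789abcdef | tee "$ROOT/etc/machine-id" > "$ROOT/var/lib/dbus/machine-id"

# Step 3 from above: empty /etc/machine-id (systemd regenerates it on next boot)
# and turn the dbus copy into a symlink to it.
: > "$ROOT/etc/machine-id"
rm "$ROOT/var/lib/dbus/machine-id"
ln -s /etc/machine-id "$ROOT/var/lib/dbus/machine-id"

wc -c < "$ROOT/etc/machine-id"   # → 0
```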
  4. Regenerate the OpenSSH server keys (the cloned VM carries the same host keys as the original, which is insecure):

```
sudo rm -rf /etc/ssh/ssh_host*
sudo dpkg-reconfigure openssh-server
```
  5. Reboot:

sudo reboot

Configuring the IPSec tunnels on GCP

  1. Create the peer VPN gateway (Peering VPN Gateway).

  2. Create the Cloud VPN gateway and tunnels; at this point GCP plays the client role. You choose the number of interfaces here, at most two, and each interface is assigned a public IP.

BGP sessions cannot be configured yet; the connection has to be established first.

Building the IPSec link

Install strongSwan (the IPSec implementation):

```
sudo apt install strongswan strongswan-pki
```

Configure IPSec; the config file is /etc/ipsec.conf:

```
config setup
    charondebug="all"
    uniqueids=yes
    strictcrlpolicy=no

conn %default
    ikelifetime=600m # 36,000 s
    keylife=180m # 10,800 s
    rekeymargin=3m
    keyingtries=3
    keyexchange=ikev2
    mobike=no
    ike=aes256gcm16-sha512-modp4096 # must match the GCP side
    esp=aes256gcm16-sha512-modp8192 # must match the GCP side
    authby=psk

conn net-net1
    leftupdown="/var/lib/strongswan/ipsec-vti.sh 0 169.254.232.77/32 169.254.232.78/32" # GCP-side and local IPs of the point-to-point link
    left=10.0.8.4 # In case of NAT set to internal IP, e.g. 10.164.0.6
    leftid=10.0.8.4
    leftsubnet=0.0.0.0/0
    leftauth=psk
    right={GCP public IP}
    rightid=%any
    rightsubnet=0.0.0.0/0
    rightauth=psk
    type=tunnel
    # auto=add - means strongSwan won't try to initiate it
    # auto=start - means strongSwan will try to establish connection as well
    # Note that Google Cloud will also try to initiate the connection
    auto=start
    # dpdaction=restart - means strongSwan will try to reconnect if Dead Peer Detection spots
    # a problem. Change to 'clear' if needed
    dpdaction=restart
    mark=%unique
    # mark=1001
    # reqid=1001

conn net-net2
    leftupdown="/var/lib/strongswan/ipsec-vti.sh 1 169.254.155.53/32 169.254.155.54/32" # as above
    left=10.0.8.4 # In case of NAT set to internal IP, e.g. 10.164.0.6
    leftid=10.0.8.4
    leftsubnet=0.0.0.0/0
    leftauth=psk
    right={GCP public IP}
    rightid=%any
    rightsubnet=0.0.0.0/0
    rightauth=psk
    type=tunnel
    # auto=add - means strongSwan won't try to initiate it
    # auto=start - means strongSwan will try to establish connection as well
    # Note that Google Cloud will also try to initiate the connection
    auto=start
    # dpdaction=restart - means strongSwan will try to reconnect if Dead Peer Detection spots
    # a problem. Change to 'clear' if needed
    dpdaction=restart
    mark=%unique
    # mark=1002
    # reqid=1002
```

The VTI script it references is at /var/lib/strongswan/ipsec-vti.sh (taken from the web; see references):

```
#!/bin/bash
set -o nounset
set -o errexit

IP=$(which ip)

PLUTO_MARK_OUT_ARR=(${PLUTO_MARK_OUT//// })
PLUTO_MARK_IN_ARR=(${PLUTO_MARK_IN//// })

VTI_TUNNEL_ID=${1}
VTI_REMOTE=${2}
VTI_LOCAL=${3}

LOCAL_IF="${PLUTO_INTERFACE}"
VTI_IF="vti${VTI_TUNNEL_ID}"
# GCP's MTU is 1460, so it's hardcoded
GCP_MTU="1460"
# ipsec overhead is 73 bytes, we need to compute new mtu.
VTI_MTU=$((GCP_MTU-73))

case "${PLUTO_VERB}" in
    up-client)
        ${IP} link add ${VTI_IF} type vti local ${PLUTO_ME} remote ${PLUTO_PEER} okey ${PLUTO_MARK_OUT_ARR[0]} ikey ${PLUTO_MARK_IN_ARR[0]}
        ${IP} addr add ${VTI_LOCAL} remote ${VTI_REMOTE} dev "${VTI_IF}"
        ${IP} link set ${VTI_IF} up mtu ${VTI_MTU}

        # Disable IPSEC Policy
        sysctl -w net.ipv4.conf.${VTI_IF}.disable_policy=1

        # Enable loose source validation, if possible. Otherwise disable validation.
        sysctl -w net.ipv4.conf.${VTI_IF}.rp_filter=2 || sysctl -w net.ipv4.conf.${VTI_IF}.rp_filter=0

        # If you would like to use VTI for policy-based routing you should take care of routes yourself, e.g.
        #if [[ "${PLUTO_PEER_CLIENT}" != "0.0.0.0/0" ]]; then
        #    ${IP} r add "${PLUTO_PEER_CLIENT}" dev "${VTI_IF}"
        #fi
        ;;
    down-client)
        ${IP} tunnel del "${VTI_IF}"
        ;;
esac

# Enable IPv4 forwarding
sysctl -w net.ipv4.ip_forward=1

# Disable IPSEC Encryption on local net
sysctl -w net.ipv4.conf.${LOCAL_IF}.disable_xfrm=1
sysctl -w net.ipv4.conf.${LOCAL_IF}.disable_policy=1
```
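The MTU line in the script deserves a note: GCP fixes the tunnel MTU at 1460, and the script assumes 73 bytes of IPsec/ESP overhead, so the VTI MTU comes out as:

```shell
GCP_MTU=1460
VTI_MTU=$((GCP_MTU - 73))   # IPsec/ESP overhead assumed by the script
echo "$VTI_MTU"             # → 1387
```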

Note: the VTI setup here works for both route-based and policy-based IPSec. Set the VTI local IP to your own eth0 address and the remote IP to GCP's public IP, i.e. the same value as the right= parameter. You can also skip the script and do everything by hand:

```
sudo ip tunnel add ipsec0 local <eth0 IP> remote <GCP IP> mode vti key 1  # the key must match the one shown by `ipsec status`; it can also be pinned via the key parameter when configuring the IPSec tunnel
sudo ip addr add 10.10.0.1/30 dev ipsec0  # fairly arbitrary: for a policy-based peer pick an IP the peer allows; for route-based, follow the peer's routing plan
sudo ip link set dev ipsec0 up
sudo ip route add 10.20.0.0/16 dev ipsec0 src 10.10.0.1  # policy-based: the peer's subnet; route-based: whatever the routing plan dictates
# Disable policy on the VTI so it behaves like an ordinary interface.
sudo sysctl net.ipv4.conf.ipsec0.disable_policy=1
sudo sysctl net.ipv4.conf.ipsec0.rp_filter=2
# Disable XFRM and policy on eth0 so IPsec traffic is no longer transparently encrypted/decrypted there.
sudo sysctl net.ipv4.conf.eth0.disable_xfrm=1
sudo sysctl net.ipv4.conf.eth0.disable_policy=1
# Optionally let the peer reach more of the internal network.
sudo iptables -t nat -A POSTROUTING -o eth0 -s 10.20.0.0/16 -j MASQUERADE
```

Adjust the charon configuration so strongSwan does not install routes itself (bird2 will manage the actual routes in the next step; otherwise strongSwan installs a 0.0.0.0/0 route, much like WireGuard would). Edit /etc/strongswan.d/vti.conf; this is roughly the equivalent of WireGuard's Table = off:

```
charon {
    # We will handle routes by ourselves
    install_routes = no
}
```

Configure the PSK by editing /etc/ipsec.secrets. The PSK can be at most 63 characters long (a strongSwan limitation; the RFC allows more):

```
# This file holds shared secrets or RSA private keys for authentication.

# RSA private key for this host, authenticating it to any other host
# which knows the public part.

{GCP public IP} : PSK "pskhere"
```

Open the firewall / set up port forwarding if needed; IPSec uses 500/udp, 4500/udp, 4510/udp and 4511/udp.

Control the IPSec tunnels from the command line:

Start: `sudo ipsec start`

Stop: `sudo ipsec stop`

Status: `sudo ipsec statusall` (or the shorter `sudo ipsec status`)

Seeing ESTABLISHED means the tunnel is up. (Only one shows up here because the test environment was already being reclaimed when this was written.)

```
Status of IKE charon daemon (strongSwan 5.9.5, Linux 5.15.0-107-generic, x86_64):
  uptime: 13 minutes, since Jul 25 14:04:56 2024
  malloc: sbrk 3100672, mmap 0, used 1427696, free 1672976
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 4
  loaded plugins: charon aesni aes rc2 sha2 sha1 md5 mgf1 random nonce x509 revocation constraints pubkey pkcs1 pkcs7 pkcs8 pkcs12 pgp dnskey sshkey pem openssl fips-prf gmp agent xcbc hmac gcm drbg attr kernel-netlink resolve socket-default connmark stroke updown eap-mschapv2 xauth-generic counters
Listening IP addresses:
  10.0.8.4
  169.254.232.78
Connections:
    net-net1:  10.0.8.4...34.128.44.254  IKEv2, dpddelay=30s
    net-net1:   local:  [10.0.8.4] uses pre-shared key authentication
    net-net1:   remote: uses pre-shared key authentication
    net-net1:   child:  0.0.0.0/0 === 0.0.0.0/0 TUNNEL, dpdaction=restart
    net-net2:  10.0.8.4...<redacted>  IKEv2, dpddelay=30s
    net-net2:   local:  [10.0.8.4] uses pre-shared key authentication
    net-net2:   remote: uses pre-shared key authentication
    net-net2:   child:  0.0.0.0/0 === 0.0.0.0/0 TUNNEL, dpdaction=restart
Security Associations (1 up, 0 connecting):
    net-net1[4]: ESTABLISHED 13 minutes ago, 10.0.8.4[10.0.8.4]...<redacted>
    net-net1[4]: IKEv2 SPIs: <redacted>, pre-shared key reauthentication in 9 hours
    net-net1[4]: IKE proposal: AES_GCM_16_256/PRF_HMAC_SHA2_512/MODP_4096
    net-net1{1}:  INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: <redacted>
    net-net1{1}:  AES_GCM_16_256, 5986 bytes_i, 6007 bytes_o (96 pkts, 4s ago), rekeying in 2 hours
    net-net1{1}:   0.0.0.0/0 === 0.0.0.0/0
```

Configuring the BGP link on GCP

Create the HA VPN. It is essentially a group of tunnels: for HA you need at least one tunnel from each of the two interfaces created in step 2 to the gateway from step 1, and up to n*m tunnels can be configured.

Building the BGP link

Install BIRD2: `sudo apt install bird2`

Write the bird2 configuration at /etc/bird/bird.conf:

```
log syslog all;
debug protocols all;

protocol device {
    scan time 10;
}

protocol direct {
    ipv4; # Connect to default IPv4 table
}

protocol kernel {
    ipv4 { # Connect protocol to IPv4 table by channel
        import none;
        export all; # Export to protocol. default is export none
    };
}

protocol static {
    ipv4;
    route 192.168.48.0/24 via 10.181.0.2;
    route 169.254.232.77/32 via 169.254.232.78;
}

protocol bgp gcp_vpc_a_tun1 {
    local 169.254.232.78 as 65003; # local ASN
    neighbor 169.254.232.77 as 64520; # peer ASN
    multihop;
    keepalive time 20;
    hold time 60;
    graceful restart aware;
    ipv4 {
        import filter {
            gw = 169.254.232.77; # not sure why, but this has to be set
            accept;
        };
        import limit 10 action warn;
        export filter {
            if (net ~ 192.168.0.0/16) then accept; # filter applied to routes exported to GCP
            else reject;
        };
        export limit 10 action warn;
    };
}

protocol bgp gcp_vpc_a_tun2 {
    local 169.254.155.54 as 65003; # local ASN
    neighbor 169.254.155.53 as 64520; # peer ASN
    multihop;
    keepalive time 20;
    hold time 60;
    graceful restart aware;
    ipv4 {
        import filter {
            gw = 169.254.155.53;
            accept;
        };
        import limit 10 action warn;
        export filter {
            if (net ~ 192.168.0.0/16) then accept;
            else reject;
        };
        export limit 10 action warn;
    };
}
```

Reload the bird configuration: `sudo birdc configure`

Check protocol status: `sudo birdc show protocols all`

```
BIRD 2.0.8 ready.

<snip>

gcp_vpc_a_tun1 BGP --- up 14:04:57.581 Established
  BGP state:          Established
    Neighbor address: 169.254.232.77
    Neighbor AS:      64520
    Local AS:         65003
    Neighbor ID:      169.254.232.77
    Local capabilities
      Multiprotocol
        AF announced: ipv4
      Route refresh
      Graceful restart
      4-octet AS numbers
      Enhanced refresh
      Long-lived graceful restart
    Neighbor capabilities
      Multiprotocol
        AF announced: ipv4
      Route refresh
      Graceful restart
        Restart time: 60
        Restart recovery
        AF supported: ipv4
        AF preserved: ipv4
      4-octet AS numbers
    Session:          external multihop AS4
    Source address:   169.254.232.78
    Hold timer:       43.240/60
    Keepalive timer:  1.847/20
  Channel ipv4
    State:          UP
    Table:          master4
    Preference:     100
    Input filter:   (unnamed)
    Output filter:  (unnamed)
    Import limit:   10
      Action:       warn
    Export limit:   10
      Action:       warn
    Routes:         1 imported, 0 exported, 1 preferred
    Route change stats:  received rejected filtered ignored accepted
      Import updates:           1        0        0       0        1
      Import withdraws:         0        0      ---       0        0
      Export updates:           3        1        2     ---        0
      Export withdraws:         0      ---      ---     ---        0
    BGP Next hop:   169.254.232.78
    IGP IPv4 table: master4
```

The second tunnel is shown below. With both peers alive and configured correctly it would also read Established, but this peer had already been reclaimed:

```
gcp_vpc_a_tun2 BGP --- start 14:04:56.161 Active Socket: Connection closed
  BGP state:          Active
    Neighbor address: 169.254.155.53
    Neighbor AS:      64520
    Local AS:         65003
    Connect delay:    2.781/5
    Last error:       Socket: Connection closed
  Channel ipv4
    State:          DOWN
    Table:          master4
    Preference:     100
    Input filter:   (unnamed)
    Output filter:  (unnamed)
    Import limit:   10
      Action:       warn
    Export limit:   10
      Action:       warn
    IGP IPv4 table: master4

localnet OSPF master4 up 14:04:56.161 Alone
  Channel ipv4
    State:          UP
    Table:          master4
    Preference:     150
    Input filter:   (unnamed)
    Output filter:  (unnamed)
    Routes:         0 imported, 3 exported, 0 preferred
    Route change stats:  received rejected filtered ignored accepted
      Import updates:           0        0        0       0        0
      Import withdraws:         0        0      ---       0        0
      Export updates:           4        0        0     ---        4
      Export withdraws:         0      ---      ---     ---        0
```

If all is well, `ip route` now shows the routes pushed by GCP:

```
...
10.0.1.0/24 via 169.254.232.77 dev vti0 proto bird metric 32
...
169.254.232.77 dev vti0 proto kernel scope link src 169.254.232.78
169.254.232.77 dev vti0 proto bird scope link metric 32
...
```

References

How to set up a VPN between strongSwan and Cloud VPN

Using Strongswan to setup site to site IPsec VPN between GCP and Digital Ocean

Configuring Site-to-Site IPSec VPN on Ubuntu using Strongswan

Route-based VPN - strongswan Documentation

Figuring out how ipsec transforms work in Linux

Establish VPN tunnel for in-house machine to access GCP network

KB: Connecting OpenWRT/LEDE router to Azure Virtual Network Gateway (IKEv2)

Secure site-to-site connection with Linux IPsec VPN

Google Cloud HA VPN interoperability guide for AWS

How does IPsec VPN really work?

BGP peering sessions

Google's dreadful documentation: Create two fully configured HA VPN gateways that connect to each other | Establish BGP sessions

GCP Networking: Part 2 Cloud Router

Foo over UDP

IPsec vs. WireGuard

The Noise Protocol Framework

rp_filter - Sysctl Explorer

howto/Bird2 - dn42 (a dn42 tutorial, but it works very well here)

Multiple connections with longer than 64 byte PSK keys fail with “MAC mismatched”