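This post focuses on configuration with the ip command. It avoids legacy tools such as ifconfig, route, and brctl where possible, and also avoids network managers such as systemd-networkd.

This post builds on the earlier post "High Availability for WireGuard with OSPF" (利用OSPF协议实现WireGuard高可用) and assumes that a network of secure WireGuard point-to-point links already exists, with an IGP such as OSPF running on it.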

OSPF Network Diagram

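The topology is shown in the diagram above. Every point-to-point link uses a /30 subnet and every router runs OSPF. The goal is to use the two routers 10.65.1.1 and 10.65.1.2 to maximize throughput between Router 10.65.2.2 and Router 10.65.2.1.

WireGuard operates at L3 (the IP layer), and the official implementation cannot set a MAC address (a modified fork reportedly can), so Linux's native bonding cannot be applied to the WireGuard interfaces directly for load balancing. Instead, we can build two GRE tunnels between 10.65.2.2 and 10.65.2.1, each taking a different route, and then bond the two GRE interfaces on each side.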

GRE Tunnel Diagram

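Note that there are two kinds of GRE device: gre, which also runs at L3, and gretap, which runs at L2. Bonding needs an L2 device, so gretap is used below. GRE provides no encryption, but the outer WireGuard tunnel is already encrypted, so this is not a security problem, and it avoids the performance cost of encrypting twice.

First load the required kernel modules (this step can probably be skipped, since the modules appear to be loaded automatically when a gre device is created):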

modprobe ip_gre
modprobe bonding

Creating the GRE tunnels

On Router 10.65.2.2:

ip link add gre1 type gretap local 10.65.0.2 remote 10.65.0.6 ttl 255
ip addr add 10.66.0.1/24 dev gre1
ip link set dev gre1 up

ip link add gre2 type gretap local 10.65.0.14 remote 10.65.0.10 ttl 255
ip addr add 10.66.1.1/24 dev gre2
ip link set dev gre2 up

On Router 10.65.2.1:

ip link add gre1 type gretap local 10.65.0.6 remote 10.65.0.2
ip addr add 10.66.0.2/24 dev gre1
ip link set dev gre1 up

ip link add gre2 type gretap local 10.65.0.10 remote 10.65.0.14
ip addr add 10.66.1.2/24 dev gre2
ip link set dev gre2 up

The two sides should now be able to ping each other through the GRE tunnels:

PING 10.66.0.2 (10.66.0.2) 56(84) bytes of data.
64 bytes from 10.66.0.2: icmp_seq=1 ttl=64 time=...
...

Creating the bond

Note: a device must not be up when it is added to a bond as a slave.

On Router 10.65.2.2:

ip link add bond0 type bond
ip link set dev bond0 type bond mode balance-rr
ip addr add 10.67.0.1/24 dev bond0

ip link set dev gre1 down
ip link set dev gre1 master bond0
ip link set dev gre1 up

ip link set dev gre2 down
ip link set dev gre2 master bond0
ip link set dev gre2 up

ip link set dev bond0 up

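Since the goal is to use as much bandwidth as possible, balance-rr mode is used here: packets sent through the bond are transmitted over the two slave interfaces in round-robin order. The other modes are active-backup, balance-xor, broadcast, 802.3ad, balance-tlb, and balance-alb.

On Router 10.65.2.1: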

ip link add bond0 type bond
ip link set dev bond0 type bond mode balance-rr
ip addr add 10.67.0.2/24 dev bond0

ip link set dev gre1 down
ip link set dev gre1 master bond0
ip link set dev gre1 up

ip link set dev gre2 down
ip link set dev gre2 master bond0
ip link set dev gre2 up

ip link set dev bond0 up

The two sides should now be able to ping each other using the bond0 addresses:

PING 10.67.0.1 (10.67.0.1) 56(84) bytes of data.
64 bytes from 10.67.0.1: icmp_seq=1 ttl=64 time=...
...

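For reasons I have not tracked down, once bond0 is up on both sides, pinging from only one side does not work at first. As soon as the other side starts pinging as well, both sides can reach each other from then on. My guess is that this happens because parameters such as miimon were not set on bond0. (MIIMON stands for Media Independent Interface Monitoring.)

The state of the bond can be read from /proc/net/bonding/bond0: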

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: gre1
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: ...
Slave queue ID: 0

Slave Interface: gre2
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: ...
Slave queue ID: 0
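
To check the bond state from a script rather than by eye, the status text can be split into the bond's own fields and one record per slave. The following is a minimal sketch, assuming the file keeps the "key: value" layout shown above. The function name parse_bonding_status and the shortened SAMPLE text are illustrative and not part of the original setup.

```python
def parse_bonding_status(text):
    """Return (bond_fields, slaves) parsed from /proc/net/bonding/<bond> text."""
    bond = {}
    slaves = []
    current = bond  # fields before the first "Slave Interface" belong to the bond
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines carry no "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value}  # start a new per-slave record
            slaves.append(current)
        else:
            current[key] = value
    return bond, slaves


# Shortened copy of the output shown above, used as test input.
SAMPLE = """\
Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0

Slave Interface: gre1
MII Status: up
Link Failure Count: 0

Slave Interface: gre2
MII Status: up
Link Failure Count: 0
"""

bond, slaves = parse_bonding_status(SAMPLE)
print(bond["Bonding Mode"])
print([(s["name"], s["MII Status"]) for s in slaves])
```

On a real host the input would come from reading /proc/net/bonding/bond0 instead of SAMPLE.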

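The setup is now essentially complete. Measuring with iperf3 between the bond0 addresses on the two sides showed almost double the throughput.

In addition, because the underlay is WG+OSPF, the bond interface drops packets for a while when a node in the network goes offline (observed loss was above 50%, close to 65%). Once OSPF has converged (about 45 seconds with the default configuration), the bond interface returns to normal. My guess is that if miimon were configured on the bond, the bonding layer might remove the timed-out slave first.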
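
A loss figure of at least one half is what a simple model of balance-rr predicts. The sketch below assumes that one of the two tunnel paths silently drops everything until OSPF reroutes it, and that the bond keeps alternating between both slaves during that time. The function simulate_balance_rr is a hypothetical helper for illustration, not a model of the kernel driver.

```python
from itertools import cycle


def simulate_balance_rr(slaves, dead, packets):
    """Send `packets` packets round-robin over `slaves`.

    Packets transmitted on a slave listed in `dead` count as lost.
    Returns the fraction of packets lost.
    """
    lost = 0
    chooser = cycle(slaves)
    for _ in range(packets):
        if next(chooser) in dead:
            lost += 1
    return lost / packets


loss = simulate_balance_rr(["gre1", "gre2"], dead={"gre2"}, packets=1000)
print(f"one-way loss with one dead path: {loss:.0%}")
```

This counts one direction only. A ping also needs its reply to survive the trip back, which this model ignores, and that may be part of why the observed figure was above 50%. The recovery time of about 45 seconds is consistent with the 40 s dead timer visible in the OSPF Hello packets captured in the OSPF post.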

Finally, let's work out the overhead:

GRE with WireGuard Packet

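Considering IPv4 only, from the outermost layer inward:

  • Outer IPv4, 20 bytes
  • UDP, 8 bytes
  • WireGuard, 32 bytes
  • Inner IPv4, 20 bytes
  • GRE, 4 bytes
  • Ethernet frame header, 14 bytes (because L2 GRETAP is used)

So with a base MTU of 1500, the innermost MTU is 1402. If both the outer and inner IP layers are switched to IPv6, whose header is 40 bytes, the innermost MTU becomes 1362, which still leaves some room above the 1280 minimum that IPv6 requires.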
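
The arithmetic above can be reproduced with a few lines of code. This is a minimal sketch that uses the per-layer sizes listed in the post. The constant and function names are chosen here for illustration.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
WIREGUARD_OVERHEAD = 32
GRE_HEADER = 4
ETHERNET_HEADER = 14  # inner Ethernet header carried by gretap


def innermost_mtu(base_mtu, outer_ip_header, inner_ip_header):
    """Subtract every encapsulation layer from the base MTU."""
    overhead = (
        outer_ip_header      # IP header on the physical link
        + UDP_HEADER         # WireGuard runs over UDP
        + WIREGUARD_OVERHEAD
        + inner_ip_header    # IP header inside WireGuard, carrying GRE
        + GRE_HEADER
        + ETHERNET_HEADER
    )
    return base_mtu - overhead


print(innermost_mtu(1500, IPV4_HEADER, IPV4_HEADER))  # IPv4 outer and inner
print(innermost_mtu(1500, IPV6_HEADER, IPV6_HEADER))  # IPv6 outer and inner
```

The two calls give 1402 and 1362, matching the figures in the text.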

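At first I attached the gre tunnels to a bridge, and the moment the bridges came up on both sides the network looped into a broadcast storm… = =||

References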

GRE bridging, IPsec and NFQUEUE

SETUP GRE TUNNEL ON UBUNTU 20 LINUX SERVER

Syntax for changing the bond mode of an interface

ip-link(8) — Linux manual page

networking:bonding [Wiki]

Bonding - Debian Wiki

7.7. Using Channel Bonding

10.5 Configuring Network Interface Bonding

Switch flooding when bonding interfaces in Linux

A Beginner’s Guide to Generic Routing Encapsulation

How to create a GRE tunnel on Linux

Marnik - Up & Down (Official Video)

Schadenfreude - S3RL

OUTRAGE & Jetty Rachers & Hi3ND - Desire

YOASOBI「三原色」Official Music Video

Doki Doki ドキドキ - S3RL ft Kawaiiconic

Wanna Fight Huh - S3RL

PinocchioP - Magical Girl and Chocolate feat. Hatsune Miku | 【初音ミク】魔法少女とチョコレゐト【ピノキオピー】

PinocchioP - SLoWMoTIoN feat. Hatsune Miku

【脈アリ?】最近カレ死が冷たいの / みつあくま feat. 初音ミク【プロセカNEXT】 (Necro-Fantasista)

【ママに内緒で】ショウコ隠滅、少女純潔 / みつあくま feat. 初音ミク【プロセカNEXT】 (Virgin birth) | 【初音ミク】【对妈妈保密】消灭证据,少女纯洁【みつあくま】 (Virgin birth)

“終わカレ”はブロックで / みつあくま feat. 初音ミク【プロセカNEXT】(My ex Blocker)

Anemone / mitsu_devil

Ephemeral Melody

Recommended past picks

June-July 2022

Jannik - Grace 惊鸿 / 网易云音乐

PinocchioP - God-ish feat. Hatsune Miku / 【初音ミク】神っぽいな (像神一样呐)【ピノキオピー】

【五学】像阁下一样呐

The Weeknd - Out of Time (Official Video) 103.5 DAWN FM

R3HAB & KSHMR - Strong (Official Music Video)

Marnik, LUNAX - Bye Bye Bye (Lyrics Video)

Doja Cat - Vegas (From the Original Motion Picture Soundtrack ELVIS) (Official Video)

伊格赛听 & 叶里 - 谪仙(DJ名龙)「称谪仙瑶宫难留,去凡间红楼斗酒」【動態歌詞/pīn yīn gē cí】

麦小兜 - 下山【動態歌詞/Lyrics Video】

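Recommended past picks

May 2022

Symptom: WmiPrvSE.exe (SYSTEM) uses a lot of CPU.

Diagnosis: in Event Viewer, go to Applications and Services Logs/Microsoft/Windows/WMI-Activity/Trace, right-click, and enable the log. The entries show the ClientProcessId that issued each WMI call, which points to the process AUEPMaster.exe.

Fix: in Ryzen Master, go to Settings/User Experience Program/AMD User Experience Program and opt out.

References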

WMI Provider Host at high usage due to AUEPMaster.exe causing errors.

Web-Based Enterprise Management Wbem

WMI-Activity Event 5858 logged frequently with ResultCode 0x80041032

OS: Ubuntu 20.04.4 LTS

Kernel: 5.4.0-105-generic #119-Ubuntu SMP, 5.4.0-109-generic #123-Ubuntu SMP, 5.4.0-110-generic #124-Ubuntu SMP

OSPF Network Diagram

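This post uses three machines as an example. First, build a mesh network with WireGuard (steps omitted). While setting it up, note the following:

  1. Table=off disables wg-quick's automatic routing-table entries.

  2. Address=xxxx/xx must carry the correct CIDR prefix length so that BIRD recognizes the network. (I left it out at first, so BIRD treated the address as a /32 and learned no routes…)

  3. sysctl net.ipv4.conf.all.forwarding=1 and sysctl net.ipv4.ip_forward=1

  4. (Optional) iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE, if hosts at the two ends need to ping each other.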
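
The /32 pitfall in item 2 is easy to reproduce outside BIRD. The sketch below uses Python's ipaddress module, which likewise treats an address with no prefix length as a /32. It only illustrates the effect and says nothing about how BIRD parses interface addresses internally. The addresses 10.65.0.1 and 10.65.0.2 are assumed example values.

```python
import ipaddress

# Same interface address, once without and once with a prefix length.
without_prefix = ipaddress.ip_interface("10.65.0.1")
with_prefix = ipaddress.ip_interface("10.65.0.1/24")

print(without_prefix.network)  # a host route that contains only this address
print(with_prefix.network)     # the whole subnet

# The peer on the other end of the link is only "on-link" in the second case.
peer = ipaddress.ip_address("10.65.0.2")
print(peer in without_prefix.network)
print(peer in with_prefix.network)
```

With the /32, the peer falls outside the interface's network, which fits the symptom described in item 2.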

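Next, install BIRD v2 (BIRD Internet Routing Daemon): sudo apt install bird2. Be careful not to install BIRD v1 by mistake.

The goals for this setup are:

  1. Join the three machines into one OSPF network and learn the basic concepts.

  2. Implement failover for WireGuard: when the link between two nodes of the mesh goes down, routing switches automatically to a detour.

  3. Try to implement load balancing.

After installing BIRD, edit /etc/bird/bird.conf. The configuration below can be used as a reference (see the comments):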

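log syslog all;
debug protocols all;

# This value should differ between nodes, but it does not have to be a real IP address.
router id 10.65.2.2;

# On edge nodes, add this section.
protocol direct {
    ipv4;
    interface "eth0"; # Use the actual NIC name on the host.
}

protocol kernel {
    ipv4 {
        export where proto = "wg";
    };
}

protocol ospf v2 wg {
    # Whether to load-balance across equal-cost paths. ECMP is on by default.
    ecmp yes;
    merge path yes;

    ipv4 {
        import where net !~ [10.65.2.0/24, 10.65.1.0/24];
        export all;
    };

    # The area ID does not have to be a real IP address either, but this name is convenient.
    area 10.65.2.0 {
        interface "test0" {
            # The default cost is 10; a lower cost is preferred. Note that the cost applies to the outgoing direction only.
            cost 5;

            # Password. Without a matching one on the peer, no adjacency forms. Can be removed.
            authentication cryptographic;
            password "pass" {
                algorithm hmac sha256;
            }

            # Link type. Since the link runs over WireGuard it can be set to PTP, which slightly reduces overhead and speeds things up, though the practical benefit is small.
            type ptp;
        };
        interface "test1";
    };

    # More areas can be defined here. Area 0 is the special backbone area.
}

# Additional OSPF networks can be added below.
#protocol ospf v2 lan {
# ...
#}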
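
The import filter in the OSPF block rejects two prefixes and accepts everything else. As I understand BIRD's prefix sets, an entry written without a length modifier such as + matches only that exact prefix. The sketch below mimics that behaviour in Python under this assumption. The function ospf_import_accepts is a hypothetical stand-in for the filter, not BIRD code.

```python
from ipaddress import ip_network

# Prefixes listed in the filter: import where net !~ [10.65.2.0/24, 10.65.1.0/24];
REJECTED = {ip_network("10.65.2.0/24"), ip_network("10.65.1.0/24")}


def ospf_import_accepts(prefix):
    """Accept a route unless its prefix is exactly one of the listed ones."""
    return ip_network(prefix) not in REJECTED


for candidate in ["192.168.31.0/24", "10.65.2.0/24", "10.65.2.0/25"]:
    print(candidate, ospf_import_accepts(candidate))
```

Under this assumption a more specific route such as 10.65.2.0/25 would still be imported. Check the filter documentation for the BIRD version in use before relying on that.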

Run sudo birdc configure to apply the configuration.

Packets sent locally over WireGuard to the multicast addresses 224.0.0.5 and 224.0.0.22 can be seen with: sudo tcpdump -vvni test1

03:10:32.994883 IP (tos 0xc0, ttl 1, id 13399, offset 0, flags [none], proto OSPF (89), length 64)
10.65.2.1 > 224.0.0.5: OSPFv2, Hello, length 44
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Options [External]
Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 1
03:10:33.006977 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
10.65.2.1 > 224.0.0.22: igmp v3 report, 1 group record(s) [gaddr 224.0.0.5 to_ex { }]
03:10:33.146978 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
10.65.2.1 > 224.0.0.22: igmp v3 report, 1 group record(s) [gaddr 224.0.0.5 to_ex { }]

224.0.0.5: The Open Shortest Path First (OSPF) All OSPF Routers address is used to send Hello packets to all OSPF routers on a network segment. Not routable.

224.0.0.22: Internet Group Management Protocol (IGMP) version 3. Not routable.

After starting BIRD on the other end, the two sides can be seen exchanging routing information:

03:10:42.003012 IP (tos 0xc0, ttl 1, id 33479, offset 0, flags [none], proto OSPF (89), length 68)
10.65.2.2 > 224.0.0.5: OSPFv2, Hello, length 48
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0)
Options [External]
Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 1
Neighbor List:
10.65.2.1
03:10:42.003332 IP (tos 0xc0, ttl 1, id 21524, offset 0, flags [none], proto OSPF (89), length 52)
10.65.2.1 > 10.65.2.2: OSPFv2, Database Description, length 32
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Options [External, Opaque], DD Flags [Init, More, Master], MTU: 1420, Sequence: 0x5cf0abf6
03:10:42.337447 IP (tos 0xc0, ttl 1, id 39044, offset 0, flags [none], proto OSPF (89), length 52)
10.65.2.2 > 10.65.2.1: OSPFv2, Database Description, length 32
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0)
Options [External, Opaque], DD Flags [Init, More, Master], MTU: 1420, Sequence: 0x159cc5bf
03:10:42.337594 IP (tos 0xc0, ttl 1, id 21573, offset 0, flags [none], proto OSPF (89), length 92)
10.65.2.1 > 10.65.2.2: OSPFv2, Database Description, length 72
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Options [External, Opaque], DD Flags [none], MTU: 1420, Sequence: 0x159cc5bf
Advertising Router 10.65.2.1, seq 0x80000001, age 9s, length 16
External LSA (5), LSA-ID: 192.168.50.255
Options: [External]
Advertising Router 10.65.2.1, seq 0x80000001, age 8s, length 28
Router LSA (1), LSA-ID: 10.65.2.1
Options: [External, Opaque]
03:10:42.671947 IP (tos 0xc0, ttl 1, id 39091, offset 0, flags [none], proto OSPF (89), length 92)
10.65.2.2 > 10.65.2.1: OSPFv2, Database Description, length 72
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0)
Options [External, Opaque], DD Flags [Master], MTU: 1420, Sequence: 0x159cc5c0
Advertising Router 10.65.2.2, seq 0x80000001, age 393s, length 16
External LSA (5), LSA-ID: 192.168.31.0
Options: [External]
Advertising Router 10.65.2.2, seq 0x80000003, age 74s, length 28
Router LSA (1), LSA-ID: 10.65.2.2
Options: [External, Opaque]
03:10:42.671966 IP (tos 0xc0, ttl 1, id 39092, offset 0, flags [none], proto OSPF (89), length 68)
10.65.2.2 > 10.65.2.1: OSPFv2, LS-Request, length 48
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0)
Advertising Router: 10.65.2.1, External LSA (5), LSA-ID: 192.168.50.255
Advertising Router: 10.65.2.1, Router LSA (1), LSA-ID: 10.65.2.1
03:10:42.672043 IP (tos 0xc0, ttl 1, id 21603, offset 0, flags [none], proto OSPF (89), length 52)
10.65.2.1 > 10.65.2.2: OSPFv2, Database Description, length 32
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Options [External, Opaque], DD Flags [none], MTU: 1420, Sequence: 0x159cc5c0
03:10:42.672065 IP (tos 0xc0, ttl 1, id 21604, offset 0, flags [none], proto OSPF (89), length 68)
10.65.2.1 > 10.65.2.2: OSPFv2, LS-Request, length 48
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Advertising Router: 10.65.2.2, External LSA (5), LSA-ID: 192.168.31.0
Advertising Router: 10.65.2.2, Router LSA (1), LSA-ID: 10.65.2.2
03:10:42.672092 IP (tos 0xc0, ttl 1, id 21605, offset 0, flags [none], proto OSPF (89), length 132)
10.65.2.1 > 10.65.2.2: OSPFv2, LS-Update, length 112
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0), 2 LSAs
LSA #1
Advertising Router 10.65.2.1, seq 0x80000001, age 10s, length 16
External LSA (5), LSA-ID: 192.168.50.255
Options: [External]
Mask 255.255.255.0
topology default (0), type 2, metric 10000
0x0000: ffff ff00 8000 2710 0000 0000 0000 0000
LSA #2
Advertising Router 10.65.2.1, seq 0x80000001, age 9s, length 28
Router LSA (1), LSA-ID: 10.65.2.1
Options: [External, Opaque]
Router LSA Options: [ASBR]
Stub Network: 10.65.0.0, Mask: 255.255.255.0
topology default (0), metric 5
Stub Network: 10.65.2.0, Mask: 255.255.255.0
topology default (0), metric 10
0x0000: 0200 0002 0a41 0000 ffff ff00 0300 0005
0x0010: 0a41 0200 ffff ff00 0300 000a
03:10:42.992559 IP (tos 0xc0, ttl 1, id 14826, offset 0, flags [none], proto OSPF (89), length 68)
10.65.2.1 > 224.0.0.5: OSPFv2, Hello, length 48
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Options [External]
Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 1
Neighbor List:
10.65.2.2
03:10:43.005716 IP (tos 0xc0, ttl 1, id 39106, offset 0, flags [none], proto OSPF (89), length 132)
10.65.2.2 > 10.65.2.1: OSPFv2, LS-Update, length 112
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0), 2 LSAs
LSA #1
Advertising Router 10.65.2.2, seq 0x80000001, age 395s, length 16
External LSA (5), LSA-ID: 192.168.31.0
Options: [External]
Mask 255.255.255.0
topology default (0), type 2, metric 10000
0x0000: ffff ff00 8000 2710 0000 0000 0000 0000
LSA #2
Advertising Router 10.65.2.2, seq 0x80000003, age 76s, length 28
Router LSA (1), LSA-ID: 10.65.2.2
Options: [External, Opaque]
Router LSA Options: [ASBR]
Stub Network: 10.65.1.0, Mask: 255.255.255.0
topology default (0), metric 5
Stub Network: 10.65.2.0, Mask: 255.255.255.0
topology default (0), metric 10
0x0000: 0200 0002 0a41 0100 ffff ff00 0300 0005
0x0010: 0a41 0200 ffff ff00 0300 000a
03:10:44.093123 IP (tos 0xc0, ttl 1, id 21893, offset 0, flags [none], proto OSPF (89), length 108)
10.65.2.1 > 10.65.2.2: OSPFv2, LS-Update, length 88
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0), 1 LSA
LSA #1
Advertising Router 10.65.2.1, seq 0x80000002, age 1s, length 40
Router LSA (1), LSA-ID: 10.65.2.1
Options: [External, Opaque]
Router LSA Options: [ASBR]
Stub Network: 10.65.0.0, Mask: 255.255.255.0
topology default (0), metric 5
Neighbor Router-ID: 10.65.2.2, Interface Address: 10.65.2.1
topology default (0), metric 10
Stub Network: 10.65.2.0, Mask: 255.255.255.0
topology default (0), metric 10
0x0000: 0200 0003 0a41 0000 ffff ff00 0300 0005
0x0010: 0a41 0202 0a41 0201 0100 000a 0a41 0200
0x0020: ffff ff00 0300 000a
03:10:44.427037 IP (tos 0xc0, ttl 1, id 39221, offset 0, flags [none], proto OSPF (89), length 108)
10.65.2.2 > 10.65.2.1: OSPFv2, LS-Update, length 88
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0), 1 LSA
LSA #1
Advertising Router 10.65.2.2, seq 0x80000004, age 1s, length 40
Router LSA (1), LSA-ID: 10.65.2.2
Options: [External, Opaque]
Router LSA Options: [ASBR]
Stub Network: 10.65.1.0, Mask: 255.255.255.0
topology default (0), metric 5
Neighbor Router-ID: 10.65.2.1, Interface Address: 10.65.2.2
topology default (0), metric 10
Stub Network: 10.65.2.0, Mask: 255.255.255.0
topology default (0), metric 10
0x0000: 0200 0003 0a41 0100 ffff ff00 0300 0005
0x0010: 0a41 0201 0a41 0202 0100 000a 0a41 0200
0x0020: ffff ff00 0300 000a
03:10:44.503378 IP (tos 0xc0, ttl 1, id 21953, offset 0, flags [none], proto OSPF (89), length 104)
10.65.2.1 > 10.65.2.2: OSPFv2, LS-Ack, length 84
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Advertising Router 10.65.2.2, seq 0x80000001, age 395s, length 16
External LSA (5), LSA-ID: 192.168.31.0
Options: [External]
Advertising Router 10.65.2.2, seq 0x80000003, age 76s, length 28
Router LSA (1), LSA-ID: 10.65.2.2
Options: [External, Opaque]
Advertising Router 10.65.2.2, seq 0x80000004, age 1s, length 40
Router LSA (1), LSA-ID: 10.65.2.2
Options: [External, Opaque]
03:10:44.837345 IP (tos 0xc0, ttl 1, id 39305, offset 0, flags [none], proto OSPF (89), length 104)
10.65.2.2 > 10.65.2.1: OSPFv2, LS-Ack, length 84
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0)
Advertising Router 10.65.2.1, seq 0x80000001, age 10s, length 16
External LSA (5), LSA-ID: 192.168.50.255
Options: [External]
Advertising Router 10.65.2.1, seq 0x80000001, age 9s, length 28
Router LSA (1), LSA-ID: 10.65.2.1
Options: [External, Opaque]
Advertising Router 10.65.2.1, seq 0x80000002, age 1s, length 40
Router LSA (1), LSA-ID: 10.65.2.1
Options: [External, Opaque]
03:10:52.002272 IP (tos 0xc0, ttl 1, id 34658, offset 0, flags [none], proto OSPF (89), length 68)
10.65.2.2 > 224.0.0.5: OSPFv2, Hello, length 48
Router-ID 10.65.2.2, Area 10.65.2.0, Authentication Type: none (0)
Options [External]
Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 1
Neighbor List:
10.65.2.1
03:10:52.993760 IP (tos 0xc0, ttl 1, id 14999, offset 0, flags [none], proto OSPF (89), length 68)
10.65.2.1 > 224.0.0.5: OSPFv2, Hello, length 48
Router-ID 10.65.2.1, Area 10.65.2.0, Authentication Type: none (0)
Options [External]
Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 1
Neighbor List:
10.65.2.2
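
The hex dumps that tcpdump prints under each Router LSA can be decoded by hand. The sketch below parses the 40-byte Router-LSA body sent by 10.65.2.1 at 03:10:44 in the capture above, following the Router-LSA layout in RFC 2328 (flags, link count, then 12-byte link entries). The function name and the LINK_TYPES table are illustrative, and the code assumes no per-TOS metrics are present, which holds for this capture.

```python
import ipaddress
import struct

# Router-LSA body from the 03:10:44.093123 LS-Update (hex as printed by tcpdump).
BODY = bytes.fromhex(
    "0200 0003 0a41 0000 ffff ff00 0300 0005"
    "0a41 0202 0a41 0201 0100 000a 0a41 0200"
    "ffff ff00 0300 000a"
)

LINK_TYPES = {1: "point-to-point", 2: "transit", 3: "stub", 4: "virtual"}


def decode_router_lsa_body(body):
    """Return (is_asbr, links); each link is (link_id, link_data, type, metric)."""
    flags, _zero, link_count = struct.unpack_from("!BBH", body, 0)
    links = []
    offset = 4
    for _ in range(link_count):
        link_id, link_data, link_type, _tos_count, metric = struct.unpack_from(
            "!4s4sBBH", body, offset
        )
        links.append((
            str(ipaddress.IPv4Address(link_id)),
            str(ipaddress.IPv4Address(link_data)),
            LINK_TYPES.get(link_type, "unknown"),
            metric,
        ))
        offset += 12  # one link entry without TOS metrics
    is_asbr = bool(flags & 0x02)  # E bit: the router is an AS boundary router
    return is_asbr, links


is_asbr, links = decode_router_lsa_body(BODY)
print("ASBR:", is_asbr)
for link in links:
    print(link)
```

The decoded entries line up with tcpdump's own rendering: two stub networks with metrics 5 and 10, and one point-to-point link to neighbor 10.65.2.2 with metric 10.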

The BIRD log can be followed with: sudo journalctl -f -u bird.service

May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: HELLO packet received from nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: New neighbor 10.65.2.2 on test1, IP address 10.65.2.2
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: Neighbor 10.65.2.2 on test1 changed state from Down to Init
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: Neighbor 10.65.2.2 on test1 changed state from Init to 2-Way
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: Neighbor 10.65.2.2 on test1 changed state from 2-Way to ExStart
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: DBDES packet sent to nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 32
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: mtu 1420
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: imms I M MS
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: ddseq 1559276534
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: DBDES packet received from nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 32
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.2
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: mtu 1420
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: imms I M MS
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: ddseq 362595775
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: Neighbor 10.65.2.2 on test1 changed state from ExStart to Exchange
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: DBDES packet sent to nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 72
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: mtu 1420
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: imms
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: ddseq 362595775
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSA Type: 0005, Id: 192.168.50.255, Rt: 10.65.2.1, Seq: 80000001, Age: 9, Sum: 9120
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.1, Rt: 10.65.2.1, Seq: 80000001, Age: 8, Sum: 2c90
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: DBDES packet received from nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 72
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.2
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: mtu 1420
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: imms MS
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: ddseq 362595776
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSA Type: 0005, Id: 192.168.31.0, Rt: 10.65.2.2, Seq: 80000001, Age: 393, Sum: 5d66
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000003, Age: 74, Sum: 2196
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: DBDES packet sent to nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 32
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: mtu 1420
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: imms
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: ddseq 362595776
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: Neighbor 10.65.2.2 on test1 changed state from Exchange to Loading
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSREQ packet sent to nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 48
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSR Type: 0005, Id: 192.168.31.0, Rt: 10.65.2.2
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSR Type: 0001, Id: 10.65.2.2, Rt: 10.65.2.2
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSREQ packet received from nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 48
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.2
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSR Type: 0005, Id: 192.168.50.255, Rt: 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSR Type: 0001, Id: 10.65.2.1, Rt: 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSUPD packet sent to nbr 10.65.2.2 on test1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: length 112
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSA Type: 0005, Id: 192.168.50.255, Rt: 10.65.2.1, Seq: 80000001, Age: 10, Sum: 9120
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.1, Rt: 10.65.2.1, Seq: 80000001, Age: 9, Sum: 2c90
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: HELLO packet sent via test0
May 17 03:10:42 ubuntu-ss-new bird[99124]: wg: HELLO packet sent via test1
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: LSUPD packet received from nbr 10.65.2.2 on test1
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: length 112
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: router 10.65.2.2
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: LSA Type: 0005, Id: 192.168.31.0, Rt: 10.65.2.2, Seq: 80000001, Age: 395, Sum: 5d66
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000003, Age: 76, Sum: 2196
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Installing LSA: Type: 4005, Id: 192.168.31.0, Rt: 10.65.2.2, Seq: 80000001, Age: 395
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Scheduling routing table calculation
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Installing LSA: Type: 2001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000003, Age: 76
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Neighbor 10.65.2.2 on test1 changed state from Loading to Full
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Updating router state for area 10.65.2.0
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Originating LSA: Type: 2001, Id: 10.65.2.1, Rt: 10.65.2.1, Seq: 80000002
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation for area 10.65.2.0
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation for inter-area (area 10.65.2.0)
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation for ext routes
May 17 03:10:43 ubuntu-ss-new bird[99124]: wg: Starting routing table synchronization
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSUPD packet flooded via test1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: length 88
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.1, Rt: 10.65.2.1, Seq: 80000002, Age: 1, Sum: 4cb9
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSUPD packet received from nbr 10.65.2.2 on test1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: length 88
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: router 10.65.2.2
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000004, Age: 1, Sum: 45bb
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: Installing LSA: Type: 2001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000004, Age: 1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: Scheduling routing table calculation
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSACK packet sent via test1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: length 84
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: router 10.65.2.1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0005, Id: 192.168.31.0, Rt: 10.65.2.2, Seq: 80000001, Age: 395, Sum: 5d66
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000003, Age: 76, Sum: 2196
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.2, Rt: 10.65.2.2, Seq: 80000004, Age: 1, Sum: 45bb
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSACK packet received from nbr 10.65.2.2 on test1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: length 84
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: router 10.65.2.2
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0005, Id: 192.168.50.255, Rt: 10.65.2.1, Seq: 80000001, Age: 10, Sum: 9120
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.1, Rt: 10.65.2.1, Seq: 80000001, Age: 9, Sum: 2c90
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: LSA Type: 0001, Id: 10.65.2.1, Rt: 10.65.2.1, Seq: 80000002, Age: 1, Sum: 4cb9
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: Strange LSACK from nbr 10.65.2.2 on test1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: Type: 2001, Id: 10.65.2.1, Rt: 10.65.2.1
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: I have: Seq: 80000002, Age: 0, Sum: 4cb9
May 17 03:10:44 ubuntu-ss-new bird[99124]: wg: It has: Seq: 80000001, Age: 9, Sum: 2c90
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation for area 10.65.2.0
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation for inter-area (area 10.65.2.0)
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg: Starting routing table calculation for ext routes
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg: Starting routing table synchronization
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg > added [best] 192.168.31.0/24 unicast
May 17 03:10:45 ubuntu-ss-new bird[99124]: kernel1 < added 192.168.31.0/24 unicast
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg < rejected by protocol 192.168.31.0/24 unicast
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg > added [best] 10.65.1.0/24 unicast
May 17 03:10:45 ubuntu-ss-new bird[99124]: kernel1 < added 10.65.1.0/24 unicast
May 17 03:10:45 ubuntu-ss-new bird[99124]: wg < rejected by protocol 10.65.1.0/24 unicast
May 17 03:10:52 ubuntu-ss-new bird[99124]: wg: HELLO packet received from nbr 10.65.2.2 on test1
May 17 03:10:52 ubuntu-ss-new bird[99124]: wg: HELLO packet sent via test1
May 17 03:10:52 ubuntu-ss-new bird[99124]: wg: HELLO packet sent via test0

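Once the OSPF exchange is complete, every node holds the same, complete routing information, because all three nodes are in the same area.

List the OSPF neighbor relationships of the current node: sudo birdc show ospf neighbors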

BIRD 2.0.7 ready.
wg:
Router ID Pri State DTime Interface Router IP
10.65.2.1 1 Full/PtP 34.935 test0 10.65.0.2
10.65.2.2 1 Full/PtP 32.671 test1 10.65.1.2

Show the OSPF state: sudo birdc show ospf state

BIRD 2.0.7 ready.

area 10.65.2.0

router 10.65.1.1
distance 0
router 10.65.2.1 metric 5
router 10.65.2.2 metric 5
stubnet 10.65.0.0/24 metric 5
stubnet 10.65.1.0/24 metric 5

router 10.65.2.1
distance 5
router 10.65.1.1 metric 5
router 10.65.2.2 metric 10
stubnet 10.65.0.0/24 metric 5
stubnet 10.65.2.0/24 metric 10
external 192.168.50.0/24 metric2 10000

router 10.65.2.2
distance 5
router 10.65.1.1 metric 5
router 10.65.2.1 metric 10
stubnet 10.65.1.0/24 metric 5
stubnet 10.65.2.0/24 metric 10
external 192.168.31.0/24 metric2 10000

Show the routes managed by BIRD: sudo birdc show route

BIRD 2.0.7 ready.
Table master4:
192.168.31.0/24 unicast [wg 11:21:02.317] E2 (150/5/10000) [10.65.2.2]
via 10.65.1.2 on test1
10.65.2.0/24 unicast [wg 11:21:06.318] I (150/15) [10.65.2.2]
via 10.65.0.2 on test0 weight 1
via 10.65.1.2 on test1 weight 1
192.168.50.0/24 unicast [wg 11:21:06.318] E2 (150/5/10000) [10.65.2.1]
via 10.65.0.2 on test0
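
The cost 15 and the two next hops for 10.65.2.0/24 can be re-derived from the show ospf state output. The sketch below runs a shortest-path calculation over that link-state data from the viewpoint of router 10.65.1.1 and keeps every equal-cost first hop. It is a simplified illustration that ignores external (E2) routes and is not BIRD's actual SPF code. LSDB is transcribed from the output above, and spf_with_ecmp is a name chosen here.

```python
import heapq

# Link-state database as printed by `birdc show ospf state` above.
LSDB = {
    "10.65.1.1": {
        "routers": {"10.65.2.1": 5, "10.65.2.2": 5},
        "stubnets": {"10.65.0.0/24": 5, "10.65.1.0/24": 5},
    },
    "10.65.2.1": {
        "routers": {"10.65.1.1": 5, "10.65.2.2": 10},
        "stubnets": {"10.65.0.0/24": 5, "10.65.2.0/24": 10},
    },
    "10.65.2.2": {
        "routers": {"10.65.1.1": 5, "10.65.2.1": 10},
        "stubnets": {"10.65.1.0/24": 5, "10.65.2.0/24": 10},
    },
}


def spf_with_ecmp(lsdb, source):
    """Return {prefix: (cost, first_hop_routers)} as seen from `source`."""
    dist = {source: 0}
    first_hops = {source: set()}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in lsdb[node]["routers"].items():
            nd = d + cost
            hops = {nbr} if node == source else first_hops[node]
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                first_hops[nbr] = set(hops)
                heapq.heappush(heap, (nd, nbr))
            elif nd == dist[nbr]:
                first_hops[nbr] |= hops  # equal-cost path: keep both first hops
    routes = {}
    for router, d in dist.items():
        for prefix, metric in lsdb[router]["stubnets"].items():
            total = d + metric
            if prefix not in routes or total < routes[prefix][0]:
                routes[prefix] = (total, set(first_hops[router]))
            elif total == routes[prefix][0]:
                routes[prefix][1].update(first_hops[router])
    return routes


routes = spf_with_ecmp(LSDB, "10.65.1.1")
cost, hops = routes["10.65.2.0/24"]
print(cost, sorted(hops))
```

Both neighbors reach 10.65.2.0/24 at cost 5 + 10 = 15, so the route is installed with two next hops. In the route table these appear as the neighbors' interface addresses, 10.65.0.2 on test0 and 10.65.1.2 on test1.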

References

OSPF Explained | Step by Step

OSPF Multi Area Explained

HIGH AVAILABILITY WIREGUARD SITE TO SITE Very useful. For OSPFv2 only the first half is needed; the later OSPFv3 and IPv6 parts can be skipped for now.

The BIRD Internet Routing Daemon Project - 4. Remote control All commands supported by birdc.

The BIRD Internet Routing Daemon Project - 6.8 OSPF

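使用BIRD+OSPF动态路由加速游戏 Rather disorganized, and it mixes BIRD v1 and BIRD v2, which makes it tiring to read…

BGP and OSPF. How do they interact. BGP is the protocol used between ASes. I have no need for it yet, but may run into it later when playing with DN42.

Solved: ospf path selection!! - Cisco Community Three factors decide OSPF route selection: the route prefix, the administrative distance, and other parameters (metrics such as cost).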

4.4. Securing Network Access Red Hat Enterprise Linux 7 | Red Hat Customer Portal

Some secondary references:

Understanding OSPF External Route Path Selection | INE

How to Influence Routes in OSPF to Take Precedence Over Static Routes

Commands to Influence OSPF Routing Decisions - Directed Broadcast

debian - OSPF route costs in BIRD - Unix & Linux Stack Exchange

ospf的链路类型分类,ospf 链路的transnet和stub net有什么区别 - 网络工程师培训、思科认证、华为认证培训-onelab网络实验室

subject:”Re: Bird just doesn’t want to find OSPF neighbors although they are there and can communicate”

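wireguard “server” HA set-up Mentions floating IPs, but that adds yet another header, and the MTU is only 1420 to begin with…

Storage has been tight lately: every disk on the host is nearly full, iSCSI is still under investigation, and a few hard drives from around 2019 happened to become free. The data involved is a copy backed up from a RAID on another node, so losing it temporarily would not hurt much, and I decided to experiment.

Proxmox version: pve-manager/7.2-3/c743d6c1 (running kernel: 5.13.19-6-pve)

After the USB hard drive is plugged in, the system detects a new USB device and identifies it as a mass storage device: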

[535371.886471] usb 12-4.3: new SuperSpeed USB device number 4 using xhci_hcd
[535371.911173] usb 12-4.3: New USB device found, idVendor=1058, idProduct=25e1, bcdDevice=10.21
[535371.911178] usb 12-4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[535371.911179] usb 12-4.3: Product: My Passport 25E1
[535371.911181] usb 12-4.3: Manufacturer: Western Digital
[535371.911182] usb 12-4.3: SerialNumber: ########################
[535371.921050] usb-storage 12-4.3:1.0: USB Mass Storage device detected
[535371.921211] scsi host10: usb-storage 12-4.3:1.0
[535371.921287] usbcore: registered new interface driver usb-storage
[535371.922774] usbcore: registered new interface driver uas
[535372.946932] scsi 10:0:0:0: Direct-Access WD My Passport 25E1 1021 PQ: 0 ANSI: 6
[535372.947169] scsi 10:0:0:1: Enclosure WD SES Device 1021 PQ: 0 ANSI: 6
[535372.948792] sd 10:0:0:0: Attached scsi generic sg4 type 0
[535372.948914] scsi 10:0:0:1: Attached scsi generic sg5 type 13
[535372.949942] sd 10:0:0:0: [sde] Spinning up disk...

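Errors like the following may appear next. The main cause is that the external disk spins up too slowly (not ready within 3 seconds), so the disk returns the wrong page when the system tries to read a diagnostic page. It does not mean the disk is broken.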

[535372.949942] sd 10:0:0:0: [sde] Spinning up disk...
[535373.970474] .
[535375.241700] scsi 10:0:0:1: Wrong diagnostic page; asked for 1 got 8
[535375.241722] scsi 10:0:0:1: Failed to get diagnostic page 0x1
[535375.241728] scsi 10:0:0:1: Failed to bind enclosure -19
[535375.241789] ready
[535375.241911] sd 10:0:0:0: [sde] 1953458176 512-byte logical blocks: (1.00 TB/931 GiB)
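... (disk information follows)

About 18 hours later the disk dropped off, either because it went to sleep or because the connector was loose. The kernel log shows the USB device disconnecting and then reconnecting: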

[601159.551185] usb 12-4.3: USB disconnect, device number 4
[601159.989726] usb 12-4.3: new SuperSpeed USB device number 5 using xhci_hcd
[601160.014426] usb 12-4.3: New USB device found, idVendor=1058, idProduct=25e1, bcdDevice=10.21
[601160.014429] usb 12-4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[601160.014430] usb 12-4.3: Product: My Passport 25E1
[601160.014432] usb 12-4.3: Manufacturer: Western Digital
[601160.014432] usb 12-4.3: SerialNumber: ########################
[601160.014786] usb-storage 12-4.3:1.0: USB Mass Storage device detected
[601160.014921] scsi host11: usb-storage 12-4.3:1.0
[601161.050115] scsi 11:0:0:0: Direct-Access WD My Passport 25E1 1021 PQ: 0 ANSI: 6
[601161.050358] scsi 11:0:0:1: Enclosure WD SES Device 1021 PQ: 0 ANSI: 6
[601161.051738] sd 11:0:0:0: Attached scsi generic sg4 type 0
[601161.051811] ses 11:0:0:1: Attached Enclosure device
[601161.051857] ses 11:0:0:1: Attached scsi generic sg5 type 13
[601161.051994] sd 11:0:0:0: [sdf] 1953458176 512-byte logical blocks: (1.00 TB/931 GiB)
[601161.052267] ses 11:0:0:1: Wrong diagnostic page; asked for 1 got 8
[601161.052278] ses 11:0:0:1: Failed to get diagnostic page 0x1
[601161.052284] ses 11:0:0:1: Failed to bind enclosure -19
[601161.052557] sd 11:0:0:0: [sdf] Write Protect is off
[601161.052559] sd 11:0:0:0: [sdf] Mode Sense: 47 00 10 08
[601161.052849] sd 11:0:0:0: [sdf] No Caching mode page found
[601161.052869] sd 11:0:0:0: [sdf] Assuming drive cache: write through
[601163.942028] sdf: sdf1
[601163.967991] sd 11:0:0:0: [sdf] Attached SCSI disk

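The output above shows that the device name changed from sde to sdf. For some reason Proxmox mounted the disk by device name rather than by UUID, and ext4 filesystem errors followed immediately (I only noticed several hours later):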

[601164.336944] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337047] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337072] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337096] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337119] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601169.845913] Aborting journal on device sde1-8.
[601169.845935] Buffer I/O error on dev sde1, logical block 121667584, lost sync page write
[601169.845944] JBD2: Error -5 detected when updating journal superblock for sde1-8.
[601173.798480] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601173.798550] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601173.798562] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
... (rest omitted; roughly 3 lines every 10 seconds)
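The errors above still name sde1 even though the disk now shows up as sdf. One way to confirm that a mount is stuck on the old device node is to look it up in the mount table. The following is a minimal sketch; the function name mounts_for_device is made up for illustration, and the second argument exists only so the function can be tried against a sample file.

```shell
# List every mount point whose source is the given device node.
# $1: device node as it appears in the mount table (e.g. /dev/sde1)
# $2: mount table to read (defaults to /proc/mounts)
mounts_for_device() {
    awk -v dev="$1" '$1 == dev { print $2 }' "${2:-/proc/mounts}"
}

# Example: is anything still mounted from the old node after the reconnect?
# mounts_for_device /dev/sde1
```

If this prints /mnt/local-usb while lsblk only lists sdf, the mount refers to a device that no longer exists.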

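Running umount /mnt/local-usb reported "target is busy", and lsof found no open files (presumably because the underlying filesystem was effectively gone, so there were no file-level handles).

Use umount -l /mnt/local-usb to lazily unmount the old device. (According to the answer listed in the references this is not entirely safe: some processes may still hold handles pointing into the old filesystem, and mounting again on the same mount point may cause problems.)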

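Edit /etc/fstab and add:

UUID=<disk UUID> /mnt/local-usb ext4 defaults 0 2

Then run mount -a to mount the disk. In Proxmox, create a new Directory storage and enter the path /mnt/local-usb. This sidesteps PVE's default behavior of mounting by device name, and the ext4-fs errors did not come back.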
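The fstab line can also be generated instead of typed by hand. This is only a sketch: make_fstab_entry is a hypothetical helper, and the defaults (ext4, defaults, 0 2) simply mirror the entry used in this post.

```shell
# Print an fstab entry that mounts a filesystem by UUID.
# $1: filesystem UUID   $2: mount point
# $3: filesystem type (default ext4)   $4: mount options (default "defaults")
make_fstab_entry() {
    printf 'UUID=%s %s %s %s 0 2\n' "$1" "$2" "${3:-ext4}" "${4:-defaults}"
}

# Example (device node is an assumption; check it with lsblk first):
# make_fstab_entry "$(blkid -s UUID -o value /dev/sdf1)" /mnt/local-usb >> /etc/fstab
```

For a removable disk it may be worth passing defaults,nofail as the fourth argument so that a missing drive is not treated as a boot-time error; that option is my suggestion, not something the original setup used.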

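The timestamps in dmesg output are seconds since boot (uptime); journalctl -k shows the actual wall-clock time. You can also cat /proc/uptime: the first value is the current uptime in seconds and the second is the idle time in seconds.

To map the 11:0:0:0 printed by the SCSI layer to a disk, run cat /proc/scsi/scsi. The output looks like this: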
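The uptime arithmetic can be done by hand: boot time is "now minus uptime", and the event time is boot time plus the dmesg timestamp. A rough sketch follows (dmesg_to_epoch is a made-up name; fractions of a second are dropped, and the result drifts if the machine has been suspended):

```shell
# Convert a dmesg timestamp to a Unix epoch (whole seconds).
# $1: dmesg timestamp, e.g. 601159.551185
# $2: current uptime in seconds (default: first field of /proc/uptime)
# $3: current Unix time (default: date +%s)
dmesg_to_epoch() (
    ts=${1%.*}
    up=${2:-$(cut -d' ' -f1 /proc/uptime)}
    up=${up%.*}
    now=${3:-$(date +%s)}
    echo $(( now - up + ts ))
)

# Example (GNU date): date -d "@$(dmesg_to_epoch 601159.551185)"
```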

Host: scsi11 Channel: 00 Id: 00 Lun: 00
Vendor: WD Model: My Passport 25E1 Rev: 1021
Type: Direct-Access ANSI SCSI revision: 06

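Matching the fields scsi11, Channel: 00, Id: 00 and Lun: 00 against 11:0:0:0 confirms that this is the disk in question, i.e. the external USB drive discussed in this post.

blkid shows a disk's UUID, which is what the automatic mount configuration needs.

lsblk lists disks and their current mount points.

If the disk has gone to sleep, fdisk -l (in addition to the commands above) can also wake it up without writing to it.

References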
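The matching can be scripted. In the sketch below, both function names are invented for illustration; the first rebuilds the "Host: ..." line to search for in /proc/scsi/scsi, and the second assumes the usual sysfs layout in which the block device name appears under /sys/class/scsi_device/<H:C:T:L>/device/block.

```shell
# Turn an H:C:T:L address from the kernel log (e.g. 11:0:0:0)
# into the header line used by /proc/scsi/scsi.
hctl_to_proc_header() (
    IFS=: read -r h c t l <<EOF
$1
EOF
    printf 'Host: scsi%d Channel: %02d Id: %02d Lun: %02d\n' "$h" "$c" "$t" "$l"
)

# Print the block device name(s) behind an H:C:T:L address, if any.
# Assumption: the sysfs path below exists on the running kernel.
hctl_to_block() {
    ls "/sys/class/scsi_device/$1/device/block" 2>/dev/null
}

# Example:
# grep -A 2 "$(hctl_to_proc_header 11:0:0:0)" /proc/scsi/scsi
# hctl_to_block 11:0:0:0
```

lsblk -S -o NAME,HCTL,VENDOR,MODEL gives a similar overview in a single command.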

4 Node Storage Spaces Direct - SCSI Enclosure Services (SES) - HDDs with “Default Enclosure”

SCSI Peripheral Device Type - Wikipedia

[SOLVED-ISH] External USB hard drive issue

How to read dmesg from previous session? (dmesg.0)

Ubuntu 20.04.1 - EXT4-fs error

How to unmount a busy device [closed]

Why is lazy MNT_DETACH or umount -l unsafe / dangerous?

LINUX – SCSI Device Management – Identifying Devices

What command will wake up a sleeping USB drive?

Reserved space for root on a filesystem - why?

【4K修复】郑智化《星星点灯》这是一部值得一辈子珍藏的MV

【4K修复】郑智化《水手》 这是一个值得一辈子珍藏的MV

DANBALAN - Lendo Calendo ( MIX ) | Tiktok Music 抖音神曲

EnV - Dynasty

Hinkik - Outbreaker

EnV - Microburst

DG812 - Eternity

Tungevaag - Peru (Official Lyric Video)

《如果我是___,你会爱我吗》

老司机带带我,但是是欧陆节拍 / 云南山道最速伝説 (网易云音乐完整版)

热火斯卡拉 (DJ版) - 承利

4K超高清 修复版 《明天会更好》1985 60位歌星大合唱~ / 【4K60FPS】华语群星《明天会更好》大合唱神曲!五四青年站起来!

永恒不朽的经典《We are the world》,每次听都是震撼心灵的盛宴

K-391 - Summertime [Sunshine]

Recommended picks from previous months

April 2022

The following template file can be used as a reference:

[Unit]
Description=Docker compose on boot
Wants=network-online.target
After=network-online.target
RequiresMountsFor=/run/containers/storage

[Service]
WorkingDirectory=/opt/project
ExecStart=docker-compose up -d

[Install]
WantedBy=multi-user.target

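If other directories must be mounted first, for example when a container depends on a CIFS share, append their paths to RequiresMountsFor, separated by spaces.

Link or copy the file into /etc/systemd/system/, then run systemctl daemon-reload and systemctl enable ... so that the service starts at boot (add --now to start it immediately as well).

Without podman-docker installed, the docker-compose command may not work out of the box. In that case just pass the socket explicitly, e.g. (for root) docker-compose -H unix:///run/podman/podman.sock, or set it as the DOCKER_HOST environment variable.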
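As a small sketch of the DOCKER_HOST approach: the helper below (podman_socket_url, a name made up here) picks the socket path by user ID. The root path is the one given above; the rootless path assumes the common default where XDG_RUNTIME_DIR is /run/user/<uid>.

```shell
# Print the Podman API socket URL for a user ID.
# $1: numeric user ID (default: the current user)
podman_socket_url() (
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        echo "unix:///run/podman/podman.sock"
    else
        # Assumption: rootless socket lives under /run/user/<uid>
        echo "unix:///run/user/$uid/podman/podman.sock"
    fi
)

# Example:
# export DOCKER_HOST="$(podman_socket_url)"
# docker-compose up -d
```

Either way the socket only exists while Podman's API service is running, typically via the podman.socket unit.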

Mod: Minecraft Transit Railway

Issue: Refresh Path button not working on multiplayer server #352

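A very good mod, but the Refresh Path button never worked properly for me, and reading the mod's source did not reveal anything wrong. Turning to the client log, one error message stood out: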

[15:37:50] [Render thread/ERROR]: Error executing task on Client
java.lang.IndexOutOfBoundsException: readerIndex(23) + length(1) exceeds writerIndex(23): PooledUnsafeDirectByteBuf(ridx: 23, widx: 23, cap: 23)
at io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1442) ~[netty-all-4.1.68.Final.jar%2326!/:4.1.68.Final]
at io.netty.buffer.AbstractByteBuf.readByte(AbstractByteBuf.java:730) ~[netty-all-4.1.68.Final.jar%2326!/:4.1.68.Final]
at net.minecraft.network.FriendlyByteBuf.readByte(FriendlyByteBuf.java:909) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.network.FriendlyByteBuf.m_130242_(FriendlyByteBuf.java:344) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.network.FriendlyByteBuf.m_130136_(FriendlyByteBuf.java:486) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.network.FriendlyByteBuf.m_130277_(FriendlyByteBuf.java:482) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at mtr.packet.PacketTrainDataGuiClient.createRailS2C(PacketTrainDataGuiClient.java:120) ~[MTR-forge-1.18.2-3.0.0.jar%2372!/:?]
at mtr.MTRClient.lambda$init$22(MTRClient.java:288) ~[MTR-forge-1.18.2-3.0.0.jar%2372!/:?]
at mtr.forge.RegistryClientImpl.lambda$registerNetworkReceiver$0(RegistryClientImpl.java:47) ~[MTR-forge-1.18.2-3.0.0.jar%2372!/:?]
at dev.architectury.networking.forge.NetworkManagerImpl.lambda$createPacketHandler$6(NetworkManagerImpl.java:150) ~[architectury-4.3.53.jar%2354!/:?]
at dev.architectury.networking.transformers.PacketTransformer$1.inbound(PacketTransformer.java:47) ~[architectury-4.3.53.jar%2354!/:?]
at dev.architectury.networking.forge.NetworkManagerImpl.lambda$createPacketHandler$7(NetworkManagerImpl.java:145) ~[architectury-4.3.53.jar%2354!/:?]
at net.minecraftforge.eventbus.EventBus.doCastFilter(EventBus.java:247) ~[eventbus-5.0.3.jar%232!/:?]
at net.minecraftforge.eventbus.EventBus.lambda$addListener$11(EventBus.java:239) ~[eventbus-5.0.3.jar%232!/:?]
at net.minecraftforge.eventbus.EventBus.post(EventBus.java:302) ~[eventbus-5.0.3.jar%232!/:?]
at net.minecraftforge.eventbus.EventBus.post(EventBus.java:283) ~[eventbus-5.0.3.jar%232!/:?]
at net.minecraftforge.network.NetworkInstance.dispatch(NetworkInstance.java:68) ~[forge-1.18.2-40.1.0-universal.jar%2384!/:?]
at net.minecraftforge.network.NetworkHooks.lambda$onCustomPayload$1(NetworkHooks.java:75) ~[forge-1.18.2-40.1.0-universal.jar%2384!/:?]
at java.util.Optional.map(Optional.java:260) ~[?:?]
at net.minecraftforge.network.NetworkHooks.onCustomPayload(NetworkHooks.java:75) ~[forge-1.18.2-40.1.0-universal.jar%2384!/:?]
at net.minecraft.client.multiplayer.ClientPacketListener.m_7413_(ClientPacketListener.java:1824) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.network.protocol.game.ClientboundCustomPayloadPacket.m_5797_(ClientboundCustomPayloadPacket.java:57) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.network.protocol.game.ClientboundCustomPayloadPacket.m_5797_(ClientboundCustomPayloadPacket.java:7) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.network.protocol.PacketUtils.m_131356_(PacketUtils.java:22) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.util.thread.BlockableEventLoop.m_6367_(BlockableEventLoop.java:157) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.util.thread.ReentrantBlockableEventLoop.m_6367_(ReentrantBlockableEventLoop.java:23) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.util.thread.BlockableEventLoop.m_7245_(BlockableEventLoop.java:131) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.util.thread.BlockableEventLoop.m_18699_(BlockableEventLoop.java:116) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.client.Minecraft.m_91383_(Minecraft.java:1013) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.client.Minecraft.m_91374_(Minecraft.java:663) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at net.minecraft.client.main.Main.main(Main.java:205) ~[client-1.18.2-20220404.173914-srg.jar%2380!/:?]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
at net.minecraftforge.fml.loading.targets.CommonClientLaunchHandler.lambda$launchService$0(CommonClientLaunchHandler.java:31) ~[fmlloader-1.18.2-40.1.0.jar%2316!/:?]
at cpw.mods.modlauncher.LaunchServiceHandlerDecorator.launch(LaunchServiceHandlerDecorator.java:37) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.modlauncher.LaunchServiceHandler.launch(LaunchServiceHandler.java:53) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.modlauncher.LaunchServiceHandler.launch(LaunchServiceHandler.java:71) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.modlauncher.Launcher.run(Launcher.java:106) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.modlauncher.Launcher.main(Launcher.java:77) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.modlauncher.BootstrapLaunchConsumer.accept(BootstrapLaunchConsumer.java:26) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.modlauncher.BootstrapLaunchConsumer.accept(BootstrapLaunchConsumer.java:23) [modlauncher-9.1.3.jar%235!/:?]
at cpw.mods.bootstraplauncher.BootstrapLauncher.main(BootstrapLauncher.java:149) [bootstraplauncher-1.0.0.jar:?]

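Starting from this error message, I found a very short and obscure thread, "I cannot connect to a modded server", which suggests deleting the affected player's dat file. After stopping the server, deleting the file and restarting, everything worked. My guess is that an earlier power outage shut the server down uncleanly and corrupted the player data.

When Windows accesses SMB shares with more than one user on the same server, it reports "You do not have permission to access xxx. Contact your network administrator to request access.", yet there is nowhere in the UI to log in to SMB again. This calls for the command line (much to my annoyance):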
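A slightly safer variant of the fix is to move the file aside rather than delete it. The sketch below assumes the usual server layout where player data lives at <world>/playerdata/<player UUID>.dat; the function name and the .bak suffix are my own choices. Stop the server before running it.

```shell
# Move a player's data file aside so the server regenerates it on next login.
# $1: world directory (e.g. ./world)   $2: player UUID (with dashes)
backup_playerdata() {
    if [ ! -f "$1/playerdata/$2.dat" ]; then
        echo "no player data at $1/playerdata/$2.dat" >&2
        return 1
    fi
    mv -- "$1/playerdata/$2.dat" "$1/playerdata/$2.dat.bak"
}

# Example (UUID left as a placeholder):
# backup_playerdata ./world <player-uuid>
```

Note that the player's inventory and position are lost once the file is regenerated, exactly as with deleting it.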

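net use
List all current connections.

net use <path> /delete
Find the established connections to the server in question and delete them. The connection named IPC$ must be deleted as well.

net use * /delete
Alternatively, delete all SMB connections.

net use \\<server IP>\<Share> /USER:<username> <password> /PERSISTENT:YES
Supply the new username and password. Note that if you delete the connection and then simply double-click the share in Explorer, the old credentials are still used.

net use <drive letter>: \\<server IP>\<Share> /USER:<username> <password> /PERSISTENT:YES
Add a mapped network drive. Do not run this command as administrator, otherwise the mapped drive will not be visible in the regular user's File Explorer.

If the same server has been accessed with more than one account, net use fails with: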

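System error 1219 has occurred.

Multiple connections to a server or shared resource by the same user, using more than one user name, are not allowed. Disconnect all previous connections to the server or shared resource and try again.

In that case, first clear the established connections with the commands above. If that does not help, check Windows Credential Manager for saved credentials and delete them if present. If all of that checks out, try restarting the Workstation service (shown as LanmanWorkstation in Task Manager), either with Restart in the Services console or with the commands below (administrator privileges required):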

net stop workstation
net start workstation

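Workstation service description: Creates and maintains client network connections to remote servers using the SMB protocol. If this service is stopped, these connections will be unavailable. If this service is disabled, any services that explicitly depend on it will fail to start.

Working around the one-account-per-server restriction: Windows enforces the limit per server name, so you can use several names that resolve to the same IP (e.g. via the hosts file or a custom DNS record), each with a different account, and effectively access the same server with multiple accounts.

References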

Accessing a Windows share with a different username

“net use” does not create drives in Explorer on Windows 10

Net Use Error 1219 - I just want to mount a network share

How can I clear the “authentication cache” in Windows 7 to a password protected samba share?

How to Clear Saved Credentials for Network Share or Remote Desktop Connection

Samba+Windows: Allow multiple connections by different users?

深入NAS协议系列: 召唤SMB2 OpLock/Lease