Mounting and Using a USB External Hard Drive on Proxmox

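Storage has been tight lately: every disk on the host is nearly full, and I am still looking into iSCSI. A few hard drives from around 2019 happened to free up. The data going onto them is backed up from a RAID on other nodes, so temporarily losing it would not hurt much, and I decided to experiment.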

Proxmox version: pve-manager/7.2-3/c743d6c1 (running kernel: 5.13.19-6-pve)

After plugging in the USB drive, the system detected a new USB device and identified it as a mass storage device:

[535371.886471] usb 12-4.3: new SuperSpeed USB device number 4 using xhci_hcd
[535371.911173] usb 12-4.3: New USB device found, idVendor=1058, idProduct=25e1, bcdDevice=10.21
[535371.911178] usb 12-4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[535371.911179] usb 12-4.3: Product: My Passport 25E1
[535371.911181] usb 12-4.3: Manufacturer: Western Digital
[535371.911182] usb 12-4.3: SerialNumber: ########################
[535371.921050] usb-storage 12-4.3:1.0: USB Mass Storage device detected
[535371.921211] scsi host10: usb-storage 12-4.3:1.0
[535371.921287] usbcore: registered new interface driver usb-storage
[535371.922774] usbcore: registered new interface driver uas
[535372.946932] scsi 10:0:0:0: Direct-Access WD My Passport 25E1 1021 PQ: 0 ANSI: 6
[535372.947169] scsi 10:0:0:1: Enclosure WD SES Device 1021 PQ: 0 ANSI: 6
[535372.948792] sd 10:0:0:0: Attached scsi generic sg4 type 0
[535372.948914] scsi 10:0:0:1: Attached scsi generic sg5 type 13
[535372.949942] sd 10:0:0:0: [sde] Spinning up disk...

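Errors like the following may appear next. The most likely cause is that the external disk spins up too slowly (not ready within 3 seconds), so it returns the wrong page when the system asks for a diagnostic page. It does not mean the disk is faulty.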

[535372.949942] sd 10:0:0:0: [sde] Spinning up disk...
[535373.970474] .
[535375.241700] scsi 10:0:0:1: Wrong diagnostic page; asked for 1 got 8
[535375.241722] scsi 10:0:0:1: Failed to get diagnostic page 0x1
[535375.241728] scsi 10:0:0:1: Failed to bind enclosure -19
[535375.241789] ready
[535375.241911] sd 10:0:0:0: [sde] 1953458176 512-byte logical blocks: (1.00 TB/931 GiB)
... (disk information follows)

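About 18 hours later the disk dropped off, either because it went into sleep mode or because of a loose connector. The kernel log shows the USB device disconnecting and then reconnecting: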

[601159.551185] usb 12-4.3: USB disconnect, device number 4
[601159.989726] usb 12-4.3: new SuperSpeed USB device number 5 using xhci_hcd
[601160.014426] usb 12-4.3: New USB device found, idVendor=1058, idProduct=25e1, bcdDevice=10.21
[601160.014429] usb 12-4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[601160.014430] usb 12-4.3: Product: My Passport 25E1
[601160.014432] usb 12-4.3: Manufacturer: Western Digital
[601160.014432] usb 12-4.3: SerialNumber: ########################
[601160.014786] usb-storage 12-4.3:1.0: USB Mass Storage device detected
[601160.014921] scsi host11: usb-storage 12-4.3:1.0
[601161.050115] scsi 11:0:0:0: Direct-Access WD My Passport 25E1 1021 PQ: 0 ANSI: 6
[601161.050358] scsi 11:0:0:1: Enclosure WD SES Device 1021 PQ: 0 ANSI: 6
[601161.051738] sd 11:0:0:0: Attached scsi generic sg4 type 0
[601161.051811] ses 11:0:0:1: Attached Enclosure device
[601161.051857] ses 11:0:0:1: Attached scsi generic sg5 type 13
[601161.051994] sd 11:0:0:0: [sdf] 1953458176 512-byte logical blocks: (1.00 TB/931 GiB)
[601161.052267] ses 11:0:0:1: Wrong diagnostic page; asked for 1 got 8
[601161.052278] ses 11:0:0:1: Failed to get diagnostic page 0x1
[601161.052284] ses 11:0:0:1: Failed to bind enclosure -19
[601161.052557] sd 11:0:0:0: [sdf] Write Protect is off
[601161.052559] sd 11:0:0:0: [sdf] Mode Sense: 47 00 10 08
[601161.052849] sd 11:0:0:0: [sdf] No Caching mode page found
[601161.052869] sd 11:0:0:0: [sdf] Assuming drive cache: write through
[601163.942028] sdf: sdf1
[601163.967991] sd 11:0:0:0: [sdf] Attached SCSI disk

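The output above shows the device name changing from sde to sdf. For some reason Proxmox had mounted the disk by device name rather than by UUID, and ext4 filesystem errors followed immediately (I only noticed several hours later):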

[601164.336944] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337047] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337072] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337096] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601164.337119] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601169.845913] Aborting journal on device sde1-8.
[601169.845935] Buffer I/O error on dev sde1, logical block 121667584, lost sync page write
[601169.845944] JBD2: Error -5 detected when updating journal superblock for sde1-8.
[601173.798480] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601173.798550] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
[601173.798562] EXT4-fs error (device sde1): __ext4_find_entry:1611: inode #2: comm pvestatd: reading directory lblock 0
... (rest omitted; roughly 3 messages every 10 seconds)
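To spot this kind of rename in a long kernel log, a small filter can list the sdX names in the order they first appear. This is a minimal sketch; the function name `sd_names` is hypothetical, and it assumes the log lines tag the disk as `[sdX]` the way the output above does.

```shell
# List each [sdX] device name once, in order of first appearance.
# Reads kernel log lines on stdin. dmesg timestamps such as
# "[601161.051994]" do not match because they do not start with "sd".
sd_names() {
    grep -oE '\[sd[a-z]+]' |        # keep only the "[sdX]" tokens
        sed 's/^\[//; s/]$//' |     # strip the surrounding brackets
        awk '!seen[$0]++'           # drop repeats, keep first-seen order
}

# Usage (run as root on the host):
#   dmesg | sd_names
```

If the same physical disk shows up under more than one name, it was re-enumerated at some point, as happened here with sde and sdf.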

Running umount /mnt/local-usb reported target is busy, and lsof found no open files (probably because the filesystem effectively no longer existed, so there were no file-level handles).

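I used umount -l /mnt/local-usb to lazily unmount the old device (although, according to the answers listed in the references, this is not entirely safe: some processes may still hold handles pointing to the old filesystem, and remounting at the same mount point could cause problems).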

Edit /etc/fstab and add:

UUID=<disk-UUID> /mnt/local-usb ext4 defaults 0 2

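Then run mount -a to mount the disk. In Proxmox, create a new Directory storage and enter the path /mnt/local-usb. This avoids the problem of PVE mounting by device name by default, and the ext4-fs errors did not come back afterwards.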
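The fstab entry can also be generated from the UUID that blkid reports, which avoids copy-paste mistakes. This is a sketch under assumptions: the helper name `fstab_line` is hypothetical, /dev/sdf1 is the partition name from the log above and may differ on your system, and the `nofail` option is my addition (it lets the host boot even when the USB disk is absent), not part of the original entry.

```shell
# Print an fstab line that mounts an ext4 filesystem by UUID.
#   $1: filesystem UUID
#   $2: mount point
# "nofail" is an assumption added here so that a missing USB disk
# does not block boot; drop it to match the original entry exactly.
fstab_line() {
    uuid=$1
    mountpoint=$2
    printf 'UUID=%s %s ext4 defaults,nofail 0 2\n' "$uuid" "$mountpoint"
}

# Usage (as root; review the line before appending it):
#   uuid=$(blkid -s UUID -o value /dev/sdf1)
#   fstab_line "$uuid" /mnt/local-usb >> /etc/fstab
#   mount -a
```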

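The timestamps in dmesg output are seconds since boot (uptime); journalctl -k shows the actual wall-clock time. You can also run cat /proc/uptime: the first value is the current uptime in seconds, and the second is the idle time in seconds.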
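The relationship between the two clocks can be sketched as boot time plus the dmesg timestamp. The function names below are hypothetical, and the result is only approximate: the kernel clock behind dmesg can drift from wall-clock time, for example across suspend.

```shell
# Boot time as epoch seconds: "now" minus the first field of /proc/uptime.
boot_epoch() {
    awk -v now="$(date +%s)" '{ printf "%.0f\n", int(now - $1) }' /proc/uptime
}

# Convert a dmesg timestamp (seconds since boot) to epoch seconds.
#   $1: dmesg timestamp, e.g. 601159.551185
#   $2: boot time in epoch seconds (optional, defaults to boot_epoch)
dmesg_to_epoch() {
    ts=$1
    boot=${2:-$(boot_epoch)}
    awk -v ts="$ts" -v boot="$boot" 'BEGIN { printf "%.0f\n", int(boot + ts) }'
}

# Usage (GNU date, as shipped with Proxmox/Debian):
#   date -d "@$(dmesg_to_epoch 601159.551185)"
```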

To find the disk that corresponds to the 11:0:0:0 address printed by the SCSI layer, run cat /proc/scsi/scsi. The output looks like this:

Host: scsi11 Channel: 00 Id: 00 Lun: 00
Vendor: WD Model: My Passport 25E1 Rev: 1021
Type: Direct-Access ANSI SCSI revision: 06

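Here the scsi11, Channel: 00, Id: 00 and Lun: 00 fields match the four parts of 11:0:0:0 one by one, which confirms that this is the USB external drive discussed in this post.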
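The field-by-field matching can be automated by rewriting each Host: line into the host:channel:id:lun form that the kernel log uses. A minimal sketch, assuming the /proc/scsi/scsi layout shown above; the function name `scsi_hctl` is hypothetical.

```shell
# Turn "Host: scsi11 Channel: 00 Id: 00 Lun: 00" lines into "11:0:0:0".
# Reads /proc/scsi/scsi-style text on stdin; other lines are ignored.
scsi_hctl() {
    awk '$1 == "Host:" {
        host = $2
        sub(/^scsi/, "", host)                        # "scsi11" -> "11"
        printf "%d:%d:%d:%d\n", host, $4, $6, $8      # strips leading zeros
    }'
}

# Usage:
#   scsi_hctl < /proc/scsi/scsi
# On systems with sysfs, the path printed by
#   readlink -f /sys/block/sdf/device
# should also end in the same address (sdf is the name from the log above).
```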

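The blkid command shows a disk's UUID, which is needed to configure automatic mounting.

lsblk shows disks and their current mount points.

If the disk has gone to sleep, fdisk -l, in addition to the commands above, can wake it without writing to it.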

References

4 Node Storage Spaces Direct - SCSI Enclosure Services (SES) - HDDs with “Default Enclosure”

SCSI Peripheral Device Type - Wikipedia

[SOLVED-ISH] External USB hard drive issue

How to read dmesg from previous session? (dmesg.0)

Ubuntu 20.04.1 - EXT4-fs error

How to unmount a busy device [closed]

Why is lazy MNT_DETACH or umount -l unsafe / dangerous?

LINUX – SCSI Device Management – Identifying Devices

What command will wake up a sleeping USB drive?

Reserved space for root on a filesystem - why?