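I recently bought a few Alibaba Cloud lightweight servers, all with the same 1C/0.5G spec (1 vCPU, 0.5 GB RAM). The ones running Ubuntu 20.04 work fine, but the ones running Ubuntu 22.04 stop responding every so often. The symptoms: ping still works, but ssh logins time out, and the host-side monitoring shows CPU and disk IO spiking. Logging in over VNC shows a message like this: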
[44211.553196] Out of memory: Killed process 95466(apt-check) ... shmem-rss:0KB, UID:0 pgtables:300KB oom_score_adj:0
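To see which processes the OOM killer has picked without reading the console by eye, the kernel log can be filtered. This is a minimal sketch, assuming the log lines have the shape shown above (with or without a space before the parenthesis); the function name `oom_victims` is made up for illustration.

```shell
# Print "PID NAME" for every OOM-killer victim found in kernel log text on stdin.
# Assumption: lines contain "Out of memory: Killed process <pid> (<name>)";
# older kernels word this message differently and will not match.
oom_victims() {
  sed -nE 's/.*Out of memory: Killed process ([0-9]+) ?\(([^)]*)\).*/\1 \2/p'
}

# Usage (reading the kernel log may require root):
#   dmesg | oom_victims
#   journalctl -k | oom_victims
```

If the same process name keeps showing up, that is a hint about what fills the memory, although the victim is not always the process that caused the shortage.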
This machine already had apt automatic updates disabled. After some searching, I found that snapd might be causing the problem:
snap list
Name Version Rev Tracking Publisher Notes
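Because some snaps depend on others, the order of removal matters. The sketch below derives one plausible order from `snap list` output: ordinary snaps first, then base snaps, then `snapd` itself. The function name `snap_removal_order` and the rule that `core*` and `bare` are base snaps are assumptions for illustration, not something the snap tooling guarantees.

```shell
# Read `snap list` output on stdin and print snap names in a removal order:
# ordinary snaps (rank 0), then base snaps (rank 1), then snapd (rank 2).
snap_removal_order() {
  awk 'NR > 1 {
         if ($1 == "snapd") rank = 2
         else if ($1 ~ /^(core[0-9]*|bare)$/) rank = 1
         else rank = 0
         print rank, $1
       }' | sort -s -k1,1n | cut -d" " -f2
}

# Usage (removes every snap, so review the list first):
#   snap list | snap_removal_order
#   snap list | snap_removal_order | xargs -r -n1 sudo snap remove --purge
```

If a removal still fails because of a dependency, rerunning the command after the dependent snap is gone is usually enough.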
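- Remove the individual snap packages (they depend on one another, so order matters)
sudo snap remove --purge lxd
- Remove snapd
sudo apt remove snapd
- Add an apt preferences file so that snapd does not get installed again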
sudo vim /etc/apt/preferences.d/nosnap.pref
Package: snapd
- Clear the apt cache and refresh the package lists (optional)
sudo apt clean && sudo apt update
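Only the first line of the preferences file (`Package: snapd`) appears above. A commonly used complete stanza pins the package to a negative priority so that apt refuses to install it. The sketch below assumes that stanza; the `Pin` and `Pin-Priority` values and the function name `nosnap_pref` are not taken from the original post.

```shell
# Print an apt preferences stanza that blocks installation of snapd.
# Assumption: a negative Pin-Priority for all releases is the intended pin.
nosnap_pref() {
  cat <<'EOF'
Package: snapd
Pin: release a=*
Pin-Priority: -10
EOF
}

# Usage (needs root):
#   nosnap_pref | sudo tee /etc/apt/preferences.d/nosnap.pref
#   apt-cache policy snapd   # inspect the resulting candidate version
```

Writing the file with `tee` instead of an editor makes the step easy to repeat on several machines.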
After removing snapd completely, the machine has now been running normally for two or three days…
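### Follow-up
It later turned out that this was still not enough. The fix was to add swap to this extremely low-memory machine, because Alibaba Cloud did not create any swap and even set swappiness to 0! To keep anything odd from happening, I also put the commands into crontab so that they run every minute.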
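Before changing anything, it helps to confirm both claims on the machine itself. This is a small sketch; the function name `swap_total_kb` is made up for illustration, and it relies only on the `SwapTotal:` line of `/proc/meminfo`.

```shell
# Print SwapTotal in kB from a meminfo-style file (defaults to /proc/meminfo).
swap_total_kb() {
  awk '$1 == "SwapTotal:" { print $2 }' "${1:-/proc/meminfo}"
}

# Usage:
#   swap_total_kb                  # 0 means no swap is configured
#   cat /proc/sys/vm/swappiness    # current swappiness value
```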
Create and enable swap
dd if=/dev/zero of=/swap.img bs=1M count=1024
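Only the `dd` step is shown above. A minimal sketch of the usual full sequence follows, assuming the remaining steps are the standard `chmod`, `mkswap`, and `swapon` calls. The helper names `run` and `make_swap` and the `DRY_RUN` switch are made up for illustration.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Create a swap file of SIZE_MB megabytes at PATH and enable it.
# Defaults mirror the post: /swap.img, 1024 MB. Needs root when not a dry run.
make_swap() {
  swap_img=${1:-/swap.img}
  swap_mb=${2:-1024}
  run dd if=/dev/zero of="$swap_img" bs=1M count="$swap_mb" &&
    run chmod 600 "$swap_img" &&   # restrict access before formatting
    run mkswap "$swap_img" &&      # write the swap signature
    run swapon "$swap_img"         # enable it immediately
}

# Usage:
#   DRY_RUN=1 make_swap /swap.img 1024   # print the commands only
#   make_swap /swap.img 1024             # run as root to apply them
```

The dry run makes it possible to review the exact commands before touching a machine that has very little memory to spare.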
Crontab entries (in root's crontab, once at boot and then every minute):
@reboot swapon -s | grep -q swap || swapon /swap.img
@reboot echo 60 > /proc/sys/vm/swappiness
* * * * * swapon -s | grep -q swap || swapon /swap.img
* * * * * echo 60 > /proc/sys/vm/swappiness
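The check `swapon -s | grep -q swap` succeeds whenever any line of the output contains the string "swap", so it depends on how the swap file is named. A stricter variant, sketched below, compares the first column of `/proc/swaps` with the exact path. The function name `swap_active` is made up for illustration.

```shell
# Succeed if the given path is listed as active swap.
# $1: swap file or device path; $2: swaps table (defaults to /proc/swaps).
swap_active() {
  awk -v f="$1" 'NR > 1 && $1 == f { found = 1 } END { exit !found }' \
    "${2:-/proc/swaps}"
}

# Usage in a root cron job or boot script:
#   swap_active /swap.img || swapon /swap.img
```

For a single swap file named `/swap.img`, both checks behave the same way; the stricter one only matters if other swap areas are present.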
### References
[Terminate unattended-upgrades or whatever is using apt in ubuntu 18.04 or later editions](https://askubuntu.com/questions/1186492/terminate-unattended-upgrades-or-whatever-is-using-apt-in-ubuntu-18-04-or-later)
[How to Remove Snap Packages in Ubuntu Linux](https://www.debugpoint.com/remove-snap-ubuntu/)
[How do I configure swappiness?](https://askubuntu.com/questions/103915/how-do-i-configure-swappiness)
[How to read oom-killer syslog messages?](https://serverfault.com/questions/548736/how-to-read-oom-killer-syslog-messages)
[How can I check if swap is active from the command line?](https://unix.stackexchange.com/questions/23072/how-can-i-check-if-swap-is-active-from-the-command-line)
[Linux Partition HOWTO: 9. Setting Up Swap Space](https://tldp.org/HOWTO/Partition/setting_up_swap.html)
[How to Clear RAM Memory Cache, Buffer and Swap Space on Linux](https://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/)
[Swappiness: What it Is, How it Works & How to Adjust](https://phoenixnap.com/kb/swappiness)