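Q&A: Root Drive Space Warning on EMC Unity Storage Systems
Here is a common customer question from this week that is worth sharing.
The customer reported the following:
An EMC Unity 350F array raised an alert, the same one seen before the National Day holiday. The details are: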
Unity node "spb" component "EmcSupportServices" detected a warning
event: The system root drive on SPB has less than 15% of its drive space left.
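We may need a remote session later to confirm the details. We are waiting for the end user to confirm a time window and will let you know as soon as it is set. Thanks for your support!
The alert is unambiguous: the root volume is running out of space, with less than 15% free, and needs attention soon. The root volume hosts the storage operating system, and EMC best practice is to keep more than 15% of it free. If it fills up or gets close to full, the controller can easily go down, or a code upgrade can fail. Let's walk through the fix.
First, log in to the array's command line over ssh or ipmitool. For how to log in with either tool, see the earlier articles. This alert cannot be resolved from the graphical interface alone.
Run df -h, which everyone knows, to check the current filesystem usage: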
service@Unity spa:~/user# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/alteroot 13G 6.4G 5.7G 85% /
udev 992M 8.0K 992M 1% /dev
tmpfs 4.0G 295M 3.7G 8% /dev/shm
tmpfs 4.0G 449M 3.6G 12% /run
tmpfs 4.0G 0 4.0G 0% /sys/fs/cgroup
/dev/ram0 59M 13M 46M 22% /tmp
/dev/sda1 117M 6.0M 105M 6% /mnt/c4lx-cfg_msata
/dev/sda7 13G 1.8G 11G 15% /cores
/pramfs 122M 38M 85M 31% /pramfs
/dev/mirrora5 15G 3.8G 9.8G 28% /EMC/backend/service
/dev/c4nasdba2 3.5G 557M 2.8G 17% /EMC/backend/CEM
/dev/c4nasdba1 1013M 61M 901M 7% /nbsnas
/dev/c4loga1 3.4G 266M 3.0G 9% /EMC/backend/log_shared
/dev/c4loga2 4.7G 138M 4.3G 4% /EMC/backend/perf_stats
/dev/metricsluna1 16G 4.5G 11G 30% /EMC/backend/metricsluna1
/dev/c4fastvpa1 6.0G 143M 5.6G 3% /EMC/fastvp
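Note that the filesystems listed by df -h can differ between software versions. Also, the alert refers to the root drive, so any cleanup has to target the root drive's partition (the filesystem mounted on /).
The svc_purge_logs command can analyze and clean up space usage on the root filesystem.
-f generates a report
--clear (short form -c) cleans up space, but nothing is actually removed unless --force is also given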
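To make the 15% threshold concrete, the check can be sketched as a small shell helper. This is only an illustration: it assumes df-style columns (use% in column 5, mount point in column 6), and free_pct and root_space_ok are hypothetical names, not Unity commands.

```sh
# Hypothetical helper: read df output on stdin and print the free-space
# percentage (100 - Use%) of the filesystem mounted at the given mount point.
free_pct() {
    mount_point=$1
    awk -v mp="$mount_point" 'NR > 1 && $6 == mp { sub(/%/, "", $5); print 100 - $5 }'
}

# Hypothetical helper: succeed when the free percentage is at or above the
# threshold (default 15, matching the alert text), fail otherwise.
root_space_ok() {
    pct=$1
    threshold=${2:-15}
    [ "$pct" -ge "$threshold" ]
}

# Example (run on the array): df -P / | free_pct /
```

A root filesystem shown at 85% used yields 15 from free_pct, which sits right at the alert boundary.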
root@CETV****700021 spa:/cores/service/user# svc_purge_logs -c
INFO: Peer SP found - attempting to run on both SPs.
--force not used, files will not be removed
*** WARNING *** Unity service shell activated! *** WARNING ***
Only --clear together with --force actually frees space:
root@CETV****00021 spa:/cores/service/user# svc_purge_logs -c --force
INFO: Peer SP found - attempting to run on both SPs.
Now searching for files that could be deleted on SPA ...Done!
[Mon May 1 01:17:58 UTC 2023] Filesystem Usage Report (SPA) :
=============================================
Local Root filesystem mount point : /
Total Size of Drive : 13041MB
Current Total Space Used : 5855MB, 44.89% usage
A total of 67MB can be saved on this system by removing files.
Total Space Used (after / cleanup) : 5788MB, 44.38% usage
Writing report file for SPA ...
The report file contains the list of files which could be safely deleted.
Removed local file: /EMC/CEM/log/cemtracer_file_services.log
Removed local file: /EMC/CEM/log/cemtracer_sysapi.log
Removed local file: /EMC/CEM/log/cimomlog.txt.1
Removed local file: /EMC/CEM/log/securitylog.txt.0
Removed local file: /EMC/CEM/log/cimomlog.txt.0
Removed local file: /EMC/C4Core/log/obs_collector/ObsCollector.log
Removed local file: /EMC/C4Core/log/logDaemon_trace_debug.log
Removed local file: /EMC/C4Core/log/memory.log
Removed local file: /EMC/C4Core/log/EMCSystemLogBackup.log
Removed local file: /usr/apache-tomcat/logs/timing.log
Removed local file: /usr/apache-tomcat/logs/server.log
Removed local file: /var/log/pacemaker/pacemaker.log
Removed local file: /var/log/audit/audit.log
Cleared 67MB of space off of SPA root filesystem!
Checking SPA for deleted files still consuming space ...
Files that have been deleted but are still considered open
==========================================================
0.0283556M of deleted files is still considered used by the system.
Purge old dc/core dump files...
Report for SPB can be found here:
SPB:/home/service/user/svc_purge_logs_REPORT-SPB-1682903878.txt
Now searching for files that could be deleted on SPB ...Done!
[Mon May 1 01:17:59 UTC 2023] Filesystem Usage Report (SPB) :
=============================================
Local Root filesystem mount point : /
Total Size of Drive : 13041MB
Current Total Space Used : 5767MB, 44.22% usage
A total of 23MB can be saved on this system by removing files.
Total Space Used (after / cleanup) : 5744MB, 44.04% usage
Writing report file for SPB ...
The report file contains the list of files which could be safely deleted.
Removed local file: /EMC/CEM/log/cimomlog.txt.0
Removed local file: /EMC/C4Core/log/obs_collector/ObsCollector.log
Removed local file: /EMC/C4Core/log/logDaemon_trace_debug.log
Removed local file: /var/log/audit/audit.log
Cleared 23MB of space off of SPB root filesystem!
Checking SPB for deleted files still consuming space ...
Files that have been deleted but are still considered open
==========================================================
0.0283556M of deleted files is still considered used by the system.
Purge old dc/core dump files...
*** WARNING *** Unity service shell activated! *** WARNING ***
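The "Files that have been deleted but are still considered open" section in the report reflects standard Linux behavior: a file removed while a process still holds it open keeps its disk blocks until that descriptor is closed. A minimal sketch for listing such files by walking /proc is shown here. It assumes /proc is mounted and readlink is available, and deleted_open_files is a hypothetical helper name, not part of the Unity service tools.

```sh
# Hypothetical helper: list open file descriptors whose target has been
# deleted. On Linux, readlink on /proc/<pid>/fd/<n> reports such targets
# with a " (deleted)" suffix.
deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        # Skip descriptors that vanished or that we are not allowed to read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Run as root to see descriptors of all processes. The space is released once the owning process closes the file or exits.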
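The following standard Linux commands can also help find large files on the root filesystem. They read one of these two files:
/EMC/C4Core/log/fsstats.root.critical.log or
/EMC/C4Core/log/fsstats.root.high.log
Both files record the state of the root volume's filesystem, that is, its files and directories. Use cat with head to list the 10 largest files: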
cat /EMC/C4Core/log/fsstats.root.high.log | head -10
or
cat /EMC/C4Core/log/fsstats.root.critical.log | head -10
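If the fsstats logs are missing or stale, a generic Linux alternative is to scan the filesystem directly. The sketch below is an assumption-based illustration rather than a Unity-specific procedure: largest_files is a hypothetical helper, and -xdev keeps find on the starting filesystem so that other mounts such as /cores are not counted against the root drive.

```sh
# Hypothetical helper: print the N largest regular files (size in KB, then
# path) under a directory without crossing into other mounted filesystems.
largest_files() {
    dir=${1:-/}
    count=${2:-10}
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}

# Example (run on the array): largest_files / 10
```

Sizes are reported as disk usage in kilobytes, largest first.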
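From this output you can spot items such as upgrade packages and dump files, then delete the large ones with rm. Be careful: not every file in the top 10 is safe to delete. Work out what each file is for before removing it. That's all for today. For other Dell EMC Unity storage issues, add StorageExpert on WeChat (vx) to discuss further.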