rhel6: dbus-daemon and udisks-daemon consume CPU when a USB drive is plugged into a non-UTF-8 system
Solution Verified - Updated September 28 2016 at 4:26 PM
Environment
· Red Hat Enterprise Linux 6
· gnome-disk-utility-2.30.1-2.el6
· non UTF-8 locale
Issue
· System freezes with high load average.
· dbus-daemon and udisks-daemon consuming lots of CPU.
· The issue occurs on systems using a non-UTF-8 locale.
1578 dbus 20 0 22564 2112 780 S 31.6 0.0 46:14.10 dbus-daemon
4130 root 20 0 40732 2992 2240 R 19.7 0.0 28:42.05 udisks-daemon
· Many messages like the following are logged in the ~/.xsession-errors file.
(nautilus:6380): GVFS-RemoteVolumeMonitor-WARNING **: New owner :1.55 for volume monitor org.gtk.Private.GduVolumeMonitor connected to the bus; seeding drives/volumes/mounts
(gnome-panel:6379): GVFS-RemoteVolumeMonitor-WARNING **: invoking List() failed for type GProxyVolumeMonitorGdu: org.freedesktop.DBus.Error.NoReply: Message did not receive a reply (timeout by message bus)
· The problem is seen when a USB drive is plugged into the system before logging in to the GNOME session.
Resolution
· RHEL6: Update gnome-disk-utility to 2.30.1-3.el6 (released with RHBA-2016:0725) or later, and dbus to 1.2.24-8.el6 to solve this issue. This fix is part of RHEL6.7GA and later. This issue was investigated in (private) bz1118456.
· RHEL6.2.z: Update dbus to 1.2.24-6.el6_2 (released with RHBA-2016-1579) or later to solve this issue.
· RHEL6.4.z: Update dbus to 1.2.24-8.el6_4 (released with RHBA-2016-1578) or later to solve this issue.
· RHEL6.5.z: Update dbus to 1.2.24-8.el6_5 (released with RHBA-2016-1577) or later to solve this issue.
NOTE: For the update to take effect, all running instances of dbus-daemon and all running applications that use the libdbus library must be restarted, or the system must be rebooted.
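To see which running applications would need a restart, one option is to look for processes that have the libdbus shared library mapped. The following is a minimal sketch, not a Red Hat tool: it assumes the library file name starts with `libdbus-1.so` and that `/proc/<pid>/maps` is readable (run it as root to cover all users). The function names are hypothetical.

```python
import os
import re

# Assumption: the library appears in maps as .../libdbus-1.so.<version>
LIBDBUS_RE = re.compile(r"/libdbus-1\.so\S*")

def maps_uses_libdbus(maps_text):
    """Return True if a /proc/<pid>/maps dump references libdbus-1."""
    return LIBDBUS_RE.search(maps_text) is not None

def read_text(path):
    """Read a /proc file as text, tolerating non-UTF-8 bytes."""
    with open(path, "rb") as f:
        return f.read().decode("utf-8", "replace")

def processes_using_libdbus(proc_root="/proc"):
    """Yield (pid, command line) for processes that have libdbus-1 mapped."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        pid_dir = os.path.join(proc_root, entry)
        try:
            maps_text = read_text(os.path.join(pid_dir, "maps"))
            cmdline = read_text(os.path.join(pid_dir, "cmdline"))
        except (IOError, OSError):
            # The process exited or we lack permission; skip it.
            continue
        if maps_uses_libdbus(maps_text):
            # Arguments in cmdline are NUL-separated.
            yield int(entry), cmdline.replace("\0", " ").strip()
```

Iterating over `processes_using_libdbus()` after the package update lists the candidates for a restart. On a desktop system the list is usually long, in which case a reboot is the simpler choice.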
As a workaround, a UTF-8 locale can be used. For instructions on changing your locale, refer to: How to change the system locale in RHEL
· If a UTF-8 locale cannot be used, try removing GDU (gnome-disk-utility, udisks).
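Before applying the locale workaround, it can help to confirm whether a session is actually running with a non-UTF-8 locale. The sketch below is illustrative only: it inspects the environment variables using the standard precedence LC_ALL, then LC_CTYPE, then LANG, and checks the codeset part of the locale name. The function names are hypothetical.

```python
import os

def effective_locale(env):
    """Return the locale name that governs character encoding.

    Precedence is LC_ALL > LC_CTYPE > LANG; "C" applies when none is set.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"

def is_utf8_locale(locale_name):
    """Return True if the codeset part of a locale name is UTF-8."""
    # A locale name looks like language[_territory][.codeset][@modifier]
    if "." not in locale_name:
        return False
    codeset = locale_name.split(".", 1)[1].split("@", 1)[0]
    return codeset.lower().replace("-", "") == "utf8"

if __name__ == "__main__":
    name = effective_locale(os.environ)
    print("%s: %s" % (name, "UTF-8" if is_utf8_locale(name) else "not UTF-8"))
```

A name such as `ja_JP.eucJP` is reported as not UTF-8, which matches the environment in which this issue occurs.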
Root Cause
When affected versions of dbus-daemon fail to read the command line of a process, the result in some cases is an excessively high CPU load. The fix ensures that file descriptors used by the D-Bus system are closed correctly, which prevents the high CPU load in the described scenario.
We expect this issue to occur mainly when many processes connect to dbus-daemon simultaneously. However, read() (discussed below) can fail for many reasons, so many simultaneous connections are only the most probable trigger, not the exclusive one. These conditions can arise during boot, normal operation, or shutdown of the system.
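The class of bug described here (a descriptor that stays open when reading a command line fails) can be illustrated with a short sketch. This is not the dbus patch, which is C code inside dbus itself. It is a hypothetical Python helper showing the general pattern: the descriptor is closed on every path, including a failed or empty read.

```python
import os

def command_for_pid(pid, proc_root="/proc"):
    """Return the command line of `pid` as a string, or None on failure.

    The descriptor is closed on every path, so repeated failures
    cannot accumulate open descriptors and exhaust the fd limit.
    """
    path = os.path.join(proc_root, str(pid), "cmdline")
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        data = os.read(fd, 4096)
    except OSError:
        return None
    finally:
        os.close(fd)  # runs on success, on error, and on an empty read
    if not data:
        return None  # read() returned 0 bytes: treat the command as unknown
    # Arguments are NUL-separated in /proc/<pid>/cmdline.
    return data.replace(b"\0", b" ").strip().decode("utf-8", "replace")
```

If the `os.close(fd)` call were skipped on the failure paths, each failed lookup would leave one descriptor open, and a long-running daemon would eventually reach its open-file limit.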
Diagnostic Steps
Question: In particular, according to the fixing patch, read() in _dbus_command_for_pid() on /proc/$pid/cmdline must return 0 when the issue occurs. Could you explain when and how 0 is returned?
Answer: The read(2) man page documents several error conditions that would cause 0 to be returned when calling read(): http://man7.org/linux/man-pages/man2/read.2.html#ERRORS. It is important to note the text "Other errors may occur, depending on the object connected to fd." As to how an error during read() may be triggered, the most likely scenario is that many processes attempt to initiate connections to dbus-daemon in a short space of time, causing it to reach the limit of open file descriptors. Reading command lines from connected processes would then fail, causing file descriptor leaks and leading to an infinite loop.
Question: Can hung_task_timeout detect this problem? If yes, please tell us the message text that hung_task_timeout reports. We'd like to know the task name indicated in the hung_task_timeout message.
Answer: There is no good way to use hung_task_timeout to detect the issue. This is more of a symptom, and hung tasks can be associated with numerous other issues, such as high IOwait.
Question: Does dbus-daemon output any error message when this problem occurs?
Answer: When the issue occurs, no messages are directly seen. Once the dbus-daemon process consumes 100% of CPU resources, strace can be attached to it and will output the following: accept(3, 0x7fff848dd900, [16]) = -1 EMFILE (Too many open files) <0.000017>
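Since EMFILE means the process has reached its open-file limit, another way to check for this condition is to compare the number of descriptors dbus-daemon holds against its soft limit. The following is a rough sketch that reads `/proc/<pid>/fd` and `/proc/<pid>/limits`. The function names are hypothetical, and it needs root privileges to inspect a process owned by another user.

```python
import os
import re

def open_fd_count(pid, proc_root="/proc"):
    """Count the entries in /proc/<pid>/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))

def soft_nofile_limit(limits_text):
    """Extract the soft 'Max open files' value from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max open files\s+(\S+)\s+(\S+)", limits_text, re.M)
    if match is None:
        return None
    soft = match.group(1)
    return None if soft == "unlimited" else int(soft)

def fd_usage(pid, proc_root="/proc"):
    """Return (open descriptor count, soft limit) for a process."""
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        limit = soft_nofile_limit(f.read())
    return open_fd_count(pid, proc_root), limit
```

Calling `fd_usage()` with the PID of dbus-daemon (1578 in the top output shown in the Issue section) returns both numbers. A count at or near the limit is consistent with the EMFILE result from accept().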
Question: Is there any workaround for this problem?
Answer: Our recommendation for this issue is to apply the fix delivered in the errata. As a workaround, dbus could be restarted regularly via a cron job. However, clients that communicate with dbus (whether system daemons or client applications) are unlikely to tolerate a dbus-daemon restart gracefully. In other words, dbus clients would likely be terminated if dbus-daemon were restarted. A desktop session has many clients connected over dbus, so it would not survive a dbus-daemon restart. The customer would need to evaluate which clients are connected to dbus-daemon and whether it is acceptable for those clients to be terminated. Because this would (potentially) have to be done many times during a day or week, a package upgrade to the fixed version of dbus would be less invasive, as it requires only a single restart cycle.
· Product(s)
· Red Hat Enterprise Linux
· Component
· gnome-panel
· gnome-session
· nautilus
· Category
· Troubleshoot
· Tags
· desktop
· gnome
· rhel
· rhel_6
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.