Opened 16 years ago
Closed 16 years ago
#7135 closed defect (invalid)
multiple [mythfrontend] <defunct>
| Reported by: | | Owned by: | Isaac Richards |
|---|---|---|---|
| Priority: | trivial | Milestone: | unknown |
| Component: | MythTV - General | Version: | head |
| Severity: | low | Keywords: | mythwelcome |
| Cc: | | Ticket locked: | no |
Description
My box starts mythwelcome automatically (through autologin of the mythtv user and a .xinitrc). I have 3 instances of mythfrontend, and 2 of them are <defunct>.
This could be because mythwelcome tries to launch mythfrontend before mythbackend is completely up and running.
Attachments (8)
Change History (35)
comment:1 by , 16 years ago
| Status: | new → infoneeded_new |
|---|
comment:3 by , 16 years ago
| Milestone: | 0.22 → unknown |
|---|
Nothing notable in the log. We're probably just not waiting on the child process some place in mythwelcome. As a rule this doesn't consume any resources aside from the entry in the process table, so not a big deal to fix before 0.22.
Simon you are seeing this with trunk, not 0.21-fixes? In trunk, we use myth_system() which should be waiting on the mythfrontend pid to exit.
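For readers unfamiliar with the term: a `<defunct>` (zombie) entry is a child that has exited but whose parent has not yet waited on it. The following is a minimal POSIX sketch of that life cycle, not MythTV code. All function names are hypothetical, and reading `/proc/<pid>/stat` assumes Linux.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <fstream>
#include <string>

// Hypothetical helper (Linux only): return the one-letter state of a process
// from /proc/<pid>/stat, or '?' if the process no longer exists.
char process_state(pid_t pid)
{
    std::ifstream stat("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!std::getline(stat, line))
        return '?';
    // Layout is "pid (comm) state ..."; comm may contain spaces or
    // parentheses, so locate the state after the last ')'.
    std::string::size_type close = line.rfind(')');
    if (close == std::string::npos || close + 2 >= line.size())
        return '?';
    return line[close + 2];
}

// Fork a child that exits at once. The parent deliberately does not wait.
pid_t spawn_short_lived_child()
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);
    return pid;
}

// Block until the child has exited, but leave it unreaped (WNOWAIT),
// so that it stays in the process table as a zombie.
bool wait_for_exit_without_reaping(pid_t pid)
{
    siginfo_t info;
    return waitid(P_PID, static_cast<id_t>(pid), &info,
                  WEXITED | WNOWAIT) == 0;
}

// Collect the exit status; this is what removes the <defunct> entry.
bool reap_child(pid_t pid)
{
    int status = 0;
    return waitpid(pid, &status, 0) == pid;
}
```

Between `wait_for_exit_without_reaping()` and `reap_child()` the child shows up in `ps` with state `Z`, which is what the reports in this ticket are showing.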
comment:5 by , 16 years ago
| Status: | infoneeded_new → new |
|---|
Are you sure this is a problem with MythWelcome? I'm seeing a defunct mythfrontend process from just starting it and letting it sit on the menu.
comment:7 by , 16 years ago
At the request of sphery from #mythtv-users:
I'm seeing the mythfrontend<defunct> processes and am NOT using mythwelcome.
My system is a mythbuntu based machine, without VDPAU, operating as a remote frontend.
I removed all traces of mythtv that I could find before installing mythtv 0.22-fixes.
This includes removing the old autostart entry from "Startup Programs" and replacing it with my own.
Attaching a snippet of 'ps -efw' for reference.
comment:8 by , 16 years ago
| Component: | MythTV - Mythwelcome & Mythshutdown → MythTV - General |
|---|---|
| Summary: | mythwelcome creating mythfrontend <defunct> → multiple [mythfrontend] <defunct> |
Seems unrelated to mythwelcome.
comment:9 by , 16 years ago
| Status: | new → infoneeded_new |
|---|
Can you provide compressed logs with -v all, please - and a matching ps -efw for when you see the issue.
comment:10 by , 16 years ago
This may or may not be important, but I notice that once I get the <defunct> processes, they are reaped by the kernel when mythfrontend is killed (as they should be).
However, when I restart mythfrontend, the defunct processes come back with the new mythfrontend instance. This behavior is occurring as soon as mythfrontend is started, no sort of interaction with mythfrontend has been done otherwise.
comment:11 by , 16 years ago
I am also doing the same:
- autologin with mingetty on tty7
- start mythwelcome from .xinitrc
I always see two defunct mythfrontend processes but there are no visible problems. Logs are absolutely ok.
root@mythbox:/tmp# ps -efw | grep mythfront
mythtv 4004 3944 0 08:43 tty7 00:00:04 /usr/bin/mythfrontend -d -v general
mythtv 4023 4004 0 08:43 tty7 00:00:00 [mythfrontend] <defunct>
mythtv 4026 4004 0 08:43 tty7 00:00:00 [mythfrontend] <defunct>
comment:12 by , 16 years ago
On my combined frontend/backend running mythbuntu 9.04, when mythfrontend is started, I get:
PID PPID USER STAT COMMAND
13399 1 bill Rl mythfrontend --verbose all --logfile /var/log/mythtv/0.22-fe.log
13420 13399 bill Z \_ [mythfrontend] <defunct>
13423 13399 bill Z \_ [mythfrontend] <defunct>
13425 13399 bill Z \_ [mythfrontend] <defunct>
13428 13399 bill Z \_ [mythfrontend] <defunct>
13431 13399 bill Z \_ [mythfrontend] <defunct>
13434 13399 bill Z \_ [mythfrontend] <defunct>
13437 13399 bill Z \_ [mythfrontend] <defunct>
13440 13399 bill Z \_ [mythfrontend] <defunct>
13443 13399 bill Z \_ [mythfrontend] <defunct>

MythTV Version : 22679M
MythTV Branch : trunk
Network Protocol : 50
Library API : 0.22.20091022-1
QT Version : 4.5.0
My frontend is not started automatically.
This was for a 4 minute session. Started, waited for 'quiet' log and exited.
Logfile attached (I think.)
Bill
comment:13 by , 16 years ago
Started wondering why I have 13 defuncts and the report before mine and the original had only 2. So, I plugged in an SD card into my card reader, restarted the frontend and my defunct count dropped from 13 to 12.
Most of the time, there are no cards plugged into the card reader.
On a roll here, I shutdown and disconnected the USB plug for the card reader (which has 5 slots CF/SD/uSD...). Restarting the frontend again, I got 2 defuncts, (for /dev/sd0?) which I'm guessing match log entries:
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
When the card reader is plugged in, there are 12 Error entries. 2 each for /dev/sd[defgh] and /dev/sr0.
bill@rc1:~/Download$ zcat mlog.gz|cut -c25- |grep /dev/
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdd
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sde
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdf
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdg
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdh
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sr0
There are truly no /dev/bdi or /dev/power files, however,
/sys/devices/pci0000:00/0000:00:14.1/host6/target6:0:0/6:0:0:0/block/sr0/bdi
exists.
My frontend logs go back as far as 2009-06-16, which is when I started running the trunk. The 1st entry with this type of error started on 2009-07-21 and I was at 20844. I update my box about every 100 commits. The card reader was purchased on 2008-11-12 and most likely installed the same day, although I won't swear to that.
Hope this helps.
Bill
by , 16 years ago
| Attachment: | mediamonitor.diff added |
|---|
hard codes udevadm rather than udevinfo (deprecated?)
comment:14 by , 16 years ago
The attached changes work on a 9.04 mythbuntu distribution. If there are still distributions without udevadm, this 'fix' will give them the same problem we're seeing in this ticket.
Point me in the right direction and give me a shove and I'd be happy to make a real fix.
Details:
trunk/mythtv/libs/libmyth/mediamonitor-unix.cpp executes udevinfo, which doesn't exist in mythbuntu 9.04 and ubuntu 9.10 (the two distributions I have.)
% type udevinfo
-bash: type: udevinfo: not found
% type udevadm
udevadm is /sbin/udevadm
If the device is valid, both return the full path, as in:
udevinfo -q name -rp /sys/block/sdd (existing code)
udevadm info -q name -rp /sys/block/sdd (proposed)
/dev/sdd
In the error case, (... -q name -rp /sys/block/sdfoo) the existing 'udevinfo' code checks for a response of:
device not found in database
but udevadm returns:
device path not found
Also, if udevinfo is used but linked to udevadm, the following will appear in mythfrontend.log:
MMUnix::GetDeviceFile(/sys/block/sdd) - udevinfo error... the program '/usr/local/bin/mythfrontend' called 'udevinfo', it should use 'udevadm info <options>', this will stop working in a future release
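Since the two tools report a missing device with different strings, any check on the captured output would have to accept both. A minimal sketch, assuming the error text appears somewhere in the output; `is_udev_lookup_failure` is a hypothetical helper and not a MythTV function. The two messages are the ones quoted in this comment.

```cpp
#include <string>

// Hypothetical helper: true if 'output' contains either of the
// "no such device" messages quoted in this ticket.
bool is_udev_lookup_failure(const std::string &output)
{
    static const char *const kFailures[] = {
        "device not found in database", // reported by udevinfo
        "device path not found",        // reported by udevadm info
    };
    for (const char *msg : kFailures)
    {
        if (output.find(msg) != std::string::npos)
            return true;
    }
    return false;
}
```

A successful lookup such as `/dev/sdd` matches neither string and is treated as a valid device path.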
comment:16 by , 16 years ago
| Status: | infoneeded_new → new |
|---|
We can't switch to udevadm, it's root-only on some distributions.
by , 16 years ago
| Attachment: | mythtv-7135-defunct_processes.patch added |
|---|
comment:17 by , 16 years ago
| Status: | new → infoneeded_new |
|---|
Attached patch, mythtv-7135-defunct_processes.patch, might work to prevent the zombie processes. I can't reproduce the issue, so I'm posting the patch for others to test.
The only way I could get defunct processes with my contrived test application was to delete the QProcess before the process exited. It's possible that this can happen when deleteLater() is called right after kill(), so the patch just puts another waitForFinished() call after the kill() to allow the process to die before deleteLater() gets called (it will wait much less than 2s, but will give up after 2s if the kill() fails). This extra waitForFinished() should probably be done regardless of whether it solves the issue or not.
However, in theory, if the specified application doesn't exist, waitForStarted() should return false, meaning that in the described cases, we should be returning from the waitForStarted() block above. (If you enable -v important,general,media on mythfrontend, you should see "Error - udevinfo failed to start!" and/or "Error - udevinfo failed to end! Terminating" which will indicate whether the problem is in the waitForStarted() or waitForFinished() block.) If the problem is in the waitForStarted() block, we may need to do the same kill() and waitForFinished() in it before deleteLater().
If someone can test and adjust the patch as necessary, please report back whether it works (and upload any modified version of the patch). #6137 (udevadm vs udevinfo) will be handled separately.
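The kill-then-wait ordering described in this comment can be illustrated outside Qt. What follows is a minimal POSIX sketch, not the attached patch: `kill_and_reap` and `spawn_sleeping_child` are hypothetical names, and the polling loop is only an assumed stand-in for a bounded wait such as `waitForFinished(2000)`.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that sleeps until it is killed.
pid_t spawn_sleeping_child()
{
    pid_t pid = fork();
    if (pid == 0)
    {
        for (;;)
            pause();
    }
    return pid;
}

// Hypothetical helper: send SIGKILL, then wait up to timeout_ms for the
// child to be reaped. Returns true only if the child was collected, i.e.
// no zombie is left behind when the caller drops its handle on it.
bool kill_and_reap(pid_t pid, int timeout_ms)
{
    if (kill(pid, SIGKILL) != 0)
        return false;

    const int step_ms = 10;
    for (int waited = 0; waited <= timeout_ms; waited += step_ms)
    {
        int status = 0;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return true;   // child exited and has been reaped
        if (r < 0)
            return false;  // not our child, or already reaped elsewhere
        usleep(step_ms * 1000);
    }
    return false;          // gave up; the child may still become a zombie
}
```

In practice the loop returns well before the timeout, which matches the expectation above that the wait will be much shorter than 2 s unless the kill fails.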
by , 16 years ago
| Attachment: | mediamonitor-unix.cpp-svn-diff added |
|---|
Patch as modified per mdean's request.
comment:18 by , 16 years ago
Thanks for the quick response. Unfortunately, the kill()/waitFor...() patch didn't eliminate the defunct processes.
Your theory is spot on, the "Error - udevinfo failed to start!" leg is the one taken.
I've attached your patch, to be sure I modified it correctly. Also, attached is a stand alone file with libudev tests that would solve??? #6137 and eliminate the need for these changes.
The program will take /sys/block/sda and find /dev/sda like udevinfo/adm do.
Full disclosure, I don't know spit about udev, but would be willing to try modifying the program if it looks reasonable. It did require getting libudev-dev, which I'm guessing is a drawback.
by , 16 years ago
| Attachment: | mdean.patch.mediamonitor-unix.cpp added |
|---|
comment:19 by , 16 years ago
The fix from the mailing list eliminates the defunct processes.
Nice work!
To be clear, the udev.c attachment I added wasn't intended as a fix or workaround, but a test of the udev library which could replace the existing calls to udevinfo.
What I don't know is how to find out which distributions have the library and/or if they require root access to run them.
comment:20 by , 16 years ago
Bad test.
I had a udevinfo script that called udevadm in place when I tested.
Removed it and the defuncts are still happening.
Sorry.
comment:21 by , 16 years ago
I have also tested this fix and it does not work.
1648 ? Rsl 0:11 /usr/local/bin/mythfrontend
1746 ? Z 0:00 \_ [mythfrontend] <defunct>
1748 ? Z 0:00 \_ [mythfrontend] <defunct>
comment:22 by , 16 years ago
| Severity: | medium → low |
|---|---|
| Status: | infoneeded_new → new |
comment:23 by , 16 years ago
Update: haven't given up, but the number of defuncts changes. I can't figure out why.
I usually had about 15 defunct processes. The next patch dropped that to a solid 5. Both "bdi" and "power" appear on my machine, "trace" was added because I saw a comment on gossamer-threads by jarpublic. This seems like a keeper.
Index: mediamonitor-unix.cpp
===================================================================
--- mediamonitor-unix.cpp (revision 22872)
+++ mediamonitor-unix.cpp (working copy)
@@ -553,7 +553,9 @@
         // skip some sysfs dirs that are _not_ sub-partitions
         if (*pit == "device" || *pit == "holders" || *pit == "queue"
-            || *pit == "slaves" || *pit == "subsystem")
+            || *pit == "slaves" || *pit == "subsystem"
+            || *pit == "bdi" || *pit == "power"
+            || *pit == "trace")
             continue;
         found_partitions |= FindPartitions(
I suspect that mediamonitor-unix.cpp isn't and wasn't causing the actual defuncts to occur. When I was at 5, there were a matching 5 failures to mount. That makes sense, because there are no memory cards plugged in any of the 5 slots. In these cases, myth_system() is called to do the mounts and I tried the below. But the number of defuncts now floats between 1 and 6. Not recommending this as a fix, but the change does make a difference.
Index: mythmedia.cpp
===================================================================
--- mythmedia.cpp (revision 22872)
+++ mythmedia.cpp (working copy)
@@ -121,7 +121,7 @@
         .arg(m_DevicePath);
     VERBOSE(VB_MEDIA, QString("Executing '%1'").arg(MountCommand));
-    if (0 == myth_system(MountCommand))
+    if (0 == myth_system(MountCommand, MYTH_SYSTEM_DONT_BLOCK_PARENT))
     {
         if (DoMount)
         {
I keep tripping over why is it that udevinfo works and the error case doesn't? udevinfo returns the same string, although the default has no trailing new line. I tried appending a new line to the default case, (ret.append(QChar '\n')) to no avail.
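The name filter in the first diff of this comment can also be expressed as a table lookup, which makes it easier to add entries like "bdi", "power" and "trace". A minimal standalone sketch using std::string rather than the Qt types in mediamonitor-unix.cpp; `is_non_partition_sysfs_dir` is a hypothetical name, and the list is just the eight names from the diff, not a complete inventory of sysfs directories.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: true for sysfs sub-directories of a block device
// that are _not_ sub-partitions and should be skipped when scanning.
bool is_non_partition_sysfs_dir(const std::string &name)
{
    static const char *const kSkip[] = {
        "device", "holders", "queue", "slaves", "subsystem",
        "bdi", "power", "trace", // the names added by the diff
    };
    return std::any_of(std::begin(kSkip), std::end(kSkip),
                       [&name](const char *skip) { return name == skip; });
}
```

With such a helper, a scan loop would call it once per directory entry and `continue` when it returns true.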
comment:24 by , 16 years ago
| Priority: | minor → trivial |
|---|
There is a known bug with mythsystem in 0.22/trunk: multiple concurrent processes started with mythsystem share the same pid file, meaning they aren't cleaned up properly when complete. This is probably related.
comment:25 by , 16 years ago
I added VERBOSE lines to print out the child PIDs in myth_system and found that they didn't match those of the defunct processes. So I did the same [udevinfo->pid()] in mediamonitor-unix.cpp right after udevinfo->start and got an exact match! The usleep() in the following has eliminated the defuncts for me. The additional tests below that cut down the number of ?unrequired? attempts.
Index: mediamonitor-unix.cpp
===================================================================
--- mediamonitor-unix.cpp (revision 22889)
+++ mediamonitor-unix.cpp (working copy)
@@ -229,6 +229,8 @@
     args << sysfs;
     udevinfo->start("udevinfo", args);
+    usleep(100000);
+
     if (!udevinfo->waitForStarted(2000 /*ms*/))
     {
         VERBOSE(VB_MEDIA, msg + ", Error - udevinfo failed to start!");
@@ -553,7 +555,10 @@
         // skip some sysfs dirs that are _not_ sub-partitions
         if (*pit == "device" || *pit == "holders" || *pit == "queue"
-            || *pit == "slaves" || *pit == "subsystem")
+            || *pit == "slaves" || *pit == "subsystem"
+            || *pit == "bdi" || *pit == "power"
+            || *pit == "trace")
+
             continue;
         found_partitions |= FindPartitions(
Of course those affected could just add a shell script something like this as /sbin/udevinfo:
#!/bin/sh
# If you already have udevinfo, you don't want this script!!!
UDEVADM=/sbin/udevadm
if [ ! -e "$UDEVADM" ]; then
    echo "Strange, you don't have $UDEVADM either, bye!"
    exit 1
fi
# Pass up to four arguments on to 'udevadm info', capturing stdout and stderr
RESULT=`$UDEVADM info $1 $2 $3 $4 2>&1`
RETURN_CODE=$?
if [ $RETURN_CODE = 0 ]; then
    echo "$RESULT"
else
    echo "device not found in database"
fi
exit $RETURN_CODE
comment:26 by , 16 years ago
Update:
This link http://bugreports.qt.nokia.com/browse/QTBUG-5990 seems to address the defunct/zombie problem.
Also, http://qt.nokia.com/doc/4.5/qprocess.html#starts speaks to starting a process that is still running. The log shows all my udevinfo->start()s (up to 16) were done in a 122msec window. I saw a log from another user with 5 removable devices do it in 112msec.
Also, the echo "device not found in database" above should have: 1>&2 appended in order to emulate udevinfo.
comment:27 by , 16 years ago
| Resolution: | → invalid |
|---|---|
| Status: | new → closed |
The problem is a bug in Qt ( http://bugreports.qt.nokia.com/browse/QTBUG-5990 ). When #6137 is fixed, it will prevent our seeing the symptoms of this Qt bug, even on broken Qt versions. Until #6137 is fixed, users may use workarounds mentioned above or keep an eye on QTBUG-5990.
Thanks to Bill Meek for tracking down the Qt bug and to Bill Meek and Josh Winters and all the others for all the debugging help.

Please provide log files from mythwelcome and mythfrontend, otherwise it is impossible for us to diagnose.