Opened 16 years ago

Closed 16 years ago

#7135 closed defect (invalid)

multiple [mythfrontend] <defunct>

Reported by:	simons.philippe@…	Owned by:	Isaac Richards
Priority:	trivial	Milestone:	unknown
Component:	MythTV - General	Version:	head
Severity:	low	Keywords:	mythwelcome
Cc:		Ticket locked:	no

Description

My box starts mythwelcome automatically (through autologin of the mythtv user and a .xinitrc). I have 3 instances of mythfrontend, and 2 of them are <defunct>.

This could be because mythwelcome tries to launch mythfrontend before mythbackend is completely up and running.

Attachments (8)

mythfrontend.log (3.3 KB ) - added by simons.philippe@… 16 years ago.: mythfrontend.log
process.txt (378 bytes ) - added by Josh Winters <fuxxociety@…> 16 years ago.: ps -efw
mlog.gz (22.1 KB ) - added by Bill Meek <llibkeem@…> 16 years ago.: mythfrontend log
mediamonitor.diff (1.5 KB ) - added by Bill Meek <llibkeem@…> 16 years ago.: hard codes udevadm rather than udevinfo (deprecated?)
mythtv-7135-defunct_processes.patch (585 bytes ) - added by sphery 16 years ago.
mediamonitor-unix.cpp-svn-diff (769 bytes ) - added by Bill Meek <llibkeem@…> 16 years ago.: Patch as modified per mdean's request.
udev.c (1.1 KB ) - added by Bill Meek <llibkeem@…> 16 years ago.: Standalone udev library test.
mdean.patch.mediamonitor-unix.cpp (1.8 KB ) - added by Bill Meek <llibkeem@…> 16 years ago.: From mdean, http://www.gossamer-threads.com/lists/mythtv/users/408825#408825

Change History (35)

comment:1 by cpinkham, 16 years ago

Status:	new → infoneeded_new

Please provide log files from mythwelcome and mythfrontend, otherwise it is impossible for us to diagnose.

by simons.philippe@…, 16 years ago

Attachment:	mythfrontend.log added

mythfrontend.log

comment:2 by simons.philippe@…, 16 years ago

Here is the mythwelcome log, but there is not much interesting in it...

comment:3 by danielk, 16 years ago

Milestone:	0.22 → unknown

Nothing notable in the log. We're probably just not waiting on the child process some place in mythwelcome. As a rule this doesn't consume any resources aside from the entry in the process table, so not a big deal to fix before 0.22.

Simon, are you seeing this with trunk, not 0.21-fixes? In trunk, we use myth_system(), which should be waiting on the mythfrontend pid to exit.
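For readers unfamiliar with how a <defunct> entry arises, the following is a minimal Linux-only sketch and is not MythTV code. A child that has exited stays in the process table in state Z until its parent waits on it. The names processState, ZombieDemo and demoZombie are hypothetical and exist only for this illustration.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <fstream>
#include <string>

// Linux-only: returns the one-letter state from /proc/<pid>/stat
// ('Z' means zombie/defunct), or '?' if the entry cannot be read.
char processState(pid_t pid)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!std::getline(in, line))
        return '?';
    // Format is "pid (comm) state ..."; comm may contain ')', so search from the end.
    const std::string::size_type close = line.rfind(')');
    if (close == std::string::npos || close + 2 >= line.size())
        return '?';
    return line[close + 2];
}

struct ZombieDemo
{
    char beforeWait; // state after the child exited but before it was reaped
    char afterWait;  // state after waitpid() collected the exit status
};

// Hypothetical demo: fork a child that exits at once and observe its state.
ZombieDemo demoZombie()
{
    ZombieDemo result = {'?', '?'};
    const pid_t child = fork();
    if (child < 0)
        return result;
    if (child == 0)
        _exit(0); // child: exit immediately

    // Block until the child has exited, but leave it unreaped (WNOWAIT).
    siginfo_t info;
    if (waitid(P_PID, static_cast<id_t>(child), &info, WEXITED | WNOWAIT) != 0)
        return result;
    result.beforeWait = processState(child);

    waitpid(child, nullptr, 0); // reap: the process table entry goes away
    result.afterWait = processState(child);
    return result;
}
```

On a Linux system with /proc mounted, beforeWait should be 'Z' and afterWait should be '?', because the entry no longer exists once the parent has waited.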

comment:4 by simons.philippe@…, 16 years ago

Yup, only with trunk; I didn't see this with 0.21-fixes.

comment:5 by paulh, 16 years ago

Status:	infoneeded_new → new

Are you sure this is a problem with MythWelcome? I'm seeing a defunct mythfrontend process from just starting it and letting it sit on the menu.

comment:6 by simons.philippe@…, 16 years ago

Honestly, no. It was an assumption (it seems I was wrong).

comment:7 by Josh Winters <fuxxociety@…>, 16 years ago

At the request of sphery from #mythtv-users:

I'm seeing the mythfrontend <defunct> processes and am NOT using mythwelcome.

My system is a mythbuntu based machine, without VDPAU, operating as a remote frontend.

I removed all traces of mythtv that I could find before installing mythtv 0.22-fixes.

This includes removing the old autostart entry from "Startup Programs" and replacing it with my own.

Attaching a snippet of 'ps -efw' for reference.

by Josh Winters <fuxxociety@…>, 16 years ago

Attachment:	process.txt added

ps -efw

comment:8 by sphery, 16 years ago

Component:	MythTV - Mythwelcome & Mythshutdown → MythTV - General
Summary:	mythwelcome creating mythfrontend <defunct> → multiple [mythfrontend] <defunct>

Seems unrelated to mythwelcome.

comment:9 by Dibblah, 16 years ago

Status:	new → infoneeded_new

Can you provide compressed logs with -v all, please - and a matching ps -efw for when you see the issue.

comment:10 by Josh Winters <fuxxociety@…>, 16 years ago

This may or may not be important, but I notice that once I get the <defunct> processes, they are reaped by the kernel when mythfrontend is killed (as they should be).

However, when I restart mythfrontend, the defunct processes come back with the new mythfrontend instance. This behavior is occurring as soon as mythfrontend is started, no sort of interaction with mythfrontend has been done otherwise.

comment:11 by derliebegott@…, 16 years ago

I am also doing the same:

autologin with mingetty on tty7
start mythwelcome from .xinitrc

I always see two defunct mythfrontend processes but there are no visible problems. Logs are absolutely ok.

root@mythbox:/tmp# ps -efw | grep mythfront
mythtv 4004 3944 0 08:43 tty7 00:00:04 /usr/bin/mythfrontend -d -v general
mythtv 4023 4004 0 08:43 tty7 00:00:00 [mythfrontend] <defunct>
mythtv 4026 4004 0 08:43 tty7 00:00:00 [mythfrontend] <defunct>

by Bill Meek <llibkeem@…>, 16 years ago

Attachment:	mlog.gz added

mythfrontend log

comment:12 by Bill Meek <llibkeem@…>, 16 years ago

On my combined frontend/backend running mythbuntu 9.04, when mythfrontend is started, I get:

  PID  PPID USER     STAT COMMAND
  13399     1 bill     Rl   mythfrontend --verbose all --logfile /var/log/mythtv/0.22-fe.log
  13420 13399 bill     Z     \_ [mythfrontend] <defunct>
  13423 13399 bill     Z     \_ [mythfrontend] <defunct>
  13425 13399 bill     Z     \_ [mythfrontend] <defunct>
  13428 13399 bill     Z     \_ [mythfrontend] <defunct>
  13431 13399 bill     Z     \_ [mythfrontend] <defunct>
  13434 13399 bill     Z     \_ [mythfrontend] <defunct>
  13437 13399 bill     Z     \_ [mythfrontend] <defunct>
  13440 13399 bill     Z     \_ [mythfrontend] <defunct>
  13443 13399 bill     Z     \_ [mythfrontend] <defunct>

MythTV Version   : 22679M
MythTV Branch    : trunk
Network Protocol : 50
Library API      : 0.22.20091022-1
QT Version       : 4.5.0

My frontend is not started automatically.

This was for a 4 minute session. Started, waited for 'quiet' log and exited.

Logfile attached (I think.)

Bill

comment:13 by Bill Meek <llibkeem@…>, 16 years ago

I started wondering why I have 13 defuncts when the report before mine and the original had only 2. So I plugged an SD card into my card reader, restarted the frontend, and my defunct count dropped from 13 to 12.

Most of the time, there are no cards plugged into the card reader.

On a roll here, I shut down and disconnected the USB plug for the card reader (which has 5 slots: CF/SD/uSD...). Restarting the frontend again, I got 2 defuncts (for /dev/sd0?), which I'm guessing match these log entries:

MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power,

When the card reader is plugged in, there are 12 Error entries. 2 each for /dev/sd[defgh] and /dev/sr0.

bill@rc1:~/Download$ zcat mlog.gz|cut -c25- |grep /dev/
MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power,  
MMUnix::AddDevice() - Added /dev/sdd
MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power, 
MMUnix::AddDevice() - Added /dev/sde
MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power, 
MMUnix::AddDevice() - Added /dev/sdf
MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power, 
MMUnix::AddDevice() - Added /dev/sdg
MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power, 
MMUnix::AddDevice() - Added /dev/sdh
MMUnix::AddDevice() Error: failed to stat /dev/bdi, 
MMUnix::AddDevice() Error: failed to stat /dev/power, 
MMUnix::AddDevice() - Added /dev/sr0

There are truly no /dev/bdi or /dev/power files, however,

/sys/devices/pci0000:00/0000:00:14.1/host6/target6:0:0/6:0:0:0/block/sr0/bdi

exists.

My frontend logs go back as far as 2009-06-16, which is when I started running the trunk. The 1st entry with this type of error started on 2009-07-21 and I was at 20844. I update my box about every 100 commits. The card reader was purchased on 2008-11-12 and most likely installed the same day, although I won't swear to that.

Hope this helps.

Bill

by Bill Meek <llibkeem@…>, 16 years ago

Attachment:	mediamonitor.diff added

hard codes udevadm rather than udevinfo (deprecated?)

comment:14 by Bill Meek <llibkeem@…>, 16 years ago

The attached changes work on a 9.04 mythbuntu distribution. If there are still distributions without udevadm, this 'fix' will give them the same problem we're seeing in this ticket.

Point me in the right direction and give me a shove and I'd be happy to make a real fix.

Details:

trunk/mythtv/libs/libmyth/mediamonitor-unix.cpp executes udevinfo, which doesn't exist in mythbuntu 9.04 and ubuntu 9.10 (the two distributions I have).

% type udevinfo
-bash: type: udevinfo: not found

% type udevadm
udevadm is /sbin/udevadm

If the device is valid, both return the full path, as in:

udevinfo -q name -rp /sys/block/sdd      (existing code)
udevadm info -q name -rp /sys/block/sdd  (proposed) 
    /dev/sdd

In the error case, (... -q name -rp /sys/block/sdfoo) the existing 'udevinfo' code checks for a response of:

device not found in database

but udevadm returns:

device path not found

Also, if udevinfo is used but linked to udevadm, the following will appear in mythfrontend.log:

MMUnix::GetDeviceFile(/sys/block/sdd) - udevinfo error...
the program '/usr/local/bin/mythfrontend' called 'udevinfo',
it should use 'udevadm info <options>', this will stop working
in a future release
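The two error strings quoted above differ, so a check written for only one of them misses the other. Below is a minimal sketch of a check that accepts either message. The function name isDeviceNotFound is hypothetical, and this is not the code in mediamonitor-unix.cpp; it only assumes the two messages exactly as reported in this comment.

```cpp
#include <string>

// Returns true if `output` contains either "not found" message quoted in
// this ticket: the one from udevinfo or the one from "udevadm info".
bool isDeviceNotFound(const std::string &output)
{
    static const char *const kMessages[] = {
        "device not found in database", // reported for udevinfo
        "device path not found",        // reported for udevadm info
    };
    for (const char *msg : kMessages)
    {
        if (output.find(msg) != std::string::npos)
            return true;
    }
    return false;
}
```

Matching on message text stays fragile either way, since the wording can change between udev releases; checking the tool's exit status, where it is reliable, would avoid depending on the text.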

comment:15 by sphery, 16 years ago

Refs #6137

comment:16 by stuartm, 16 years ago

Status:	infoneeded_new → new

We can't switch to udevadm, it's root-only on some distributions.

by sphery, 16 years ago

Attachment:	mythtv-7135-defunct_processes.patch added

comment:17 by sphery, 16 years ago

Status:	new → infoneeded_new

Attached patch, mythtv-7135-defunct_processes.patch , might work to prevent the zombie processes. I can't reproduce the issue, so I'm posting the patch for others to test.

The only way I could get defunct processes with my contrived test application was to delete the QProcess before the process exited. It's possible that this can happen when deleteLater() is called right after kill(), so the patch just puts another waitForFinished() call after the kill() to allow the process to die before deleteLater() gets called (it will wait much less than 2s, but will give up after 2s if the kill() fails). This extra waitForFinished() should probably be done regardless of whether it solves the issue or not.

However, in theory, if the specified application doesn't exist, waitForStarted() should return false, meaning that in the described cases, we should be returning from the waitForStarted() block above. (If you enable -v important,general,media on mythfrontend, you should see "Error - udevinfo failed to start!" and/or "Error - udevinfo failed to end! Terminating" which will indicate whether the problem is in the waitForStarted() or waitForFinished() block.) If the problem is in the waitForStarted() block, we may need to do the same kill() and waitForFinished() in it before deleteLater().

If someone can test and adjust the patch as necessary, please report back whether it works (and upload any modified version of the patch). #6137 (udevadm vs udevinfo) will be handled separately.
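The ordering described above (kill, then wait for the process to finish, and only then release it) can be illustrated without Qt. The following is a POSIX sketch of the same idea, not the attached patch and not QProcess internals. The names waitForExit, stopAndReap and spawnSleeper are hypothetical, and the sketch assumes the PID is a direct child that has not been reaped yet.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Polls for up to timeoutMs for a direct, not-yet-reaped child to exit.
// Returns true once waitpid() has collected it.
static bool waitForExit(pid_t child, int timeoutMs)
{
    const struct timespec step = {0, 10 * 1000 * 1000}; // 10 ms
    for (int waited = 0;; waited += 10)
    {
        if (waitpid(child, nullptr, WNOHANG) == child)
            return true;
        if (waited >= timeoutMs)
            return false;
        nanosleep(&step, nullptr);
    }
}

// Gives the child timeoutMs to finish; if it does not, kills it and waits
// again so that the exit status is collected before the caller moves on.
bool stopAndReap(pid_t child, int timeoutMs)
{
    if (waitForExit(child, timeoutMs))
        return true; // exited on its own
    kill(child, SIGKILL);
    return waitForExit(child, timeoutMs); // the extra wait after the kill
}

// Hypothetical helper for trying this out: a child that just sleeps.
pid_t spawnSleeper(unsigned seconds)
{
    const pid_t pid = fork();
    if (pid == 0)
    {
        sleep(seconds);
        _exit(0);
    }
    return pid;
}
```

Calling stopAndReap(spawnSleeper(30), 200) should return true after roughly 200 ms, and the child should not be left behind as a defunct entry.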

by Bill Meek <llibkeem@…>, 16 years ago

Attachment:	mediamonitor-unix.cpp-svn-diff added

Patch as modified per mdean's request.

by Bill Meek <llibkeem@…>, 16 years ago

Attachment:	udev.c added

Standalone udev library test.

comment:18 by Bill Meek <llibkeem@…>, 16 years ago

Thanks for the quick response. Unfortunately, the kill()/waitFor...() patch didn't eliminate the defunct processes.

Your theory is spot on, the "Error - udevinfo failed to start!" leg is the one taken.

I've attached your patch, to be sure I modified it correctly. Also, attached is a stand alone file with libudev tests that would solve??? #6137 and eliminate the need for these changes.

The program will take /sys/block/sda and find /dev/sda like udevinfo/adm do.

Full disclosure, I don't know spit about udev, but would be willing to try modifying the program if it looks reasonable. It did require getting libudev-dev, which I'm guessing is a drawback.
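As an illustration of the /sys/block/sda to /dev/sda mapping only, here is a sketch that uses neither libudev nor any external program. It is not what the attached udev.c does. It assumes a kernel whose sysfs uevent file for the block device contains a DEVNAME= line (for example DEVNAME=sda) and that device nodes live under /dev. The function name devNodeFromUevent is hypothetical.

```cpp
#include <sstream>
#include <string>

// Given the text of a sysfs "uevent" file (e.g. /sys/block/sda/uevent),
// returns "/dev/<DEVNAME>", or an empty string if there is no DEVNAME= line.
std::string devNodeFromUevent(const std::string &ueventText)
{
    const std::string key = "DEVNAME=";
    std::istringstream in(ueventText);
    std::string line;
    while (std::getline(in, line))
    {
        // Accept only lines that start with the key and carry a value.
        if (line.size() > key.size() && line.compare(0, key.size(), key) == 0)
            return "/dev/" + line.substr(key.size());
    }
    return std::string();
}
```

A caller would read the uevent file with std::ifstream and pass its contents in. Because no child process is started, there is nothing to wait on or reap.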

by Bill Meek <llibkeem@…>, 16 years ago

Attachment:	mdean.patch.mediamonitor-unix.cpp added

From mdean, http://www.gossamer-threads.com/lists/mythtv/users/408825#408825

comment:19 by Bill Meek <llibkeem@…>, 16 years ago

The fix from the mailing list eliminates the defunct processes.

Nice work!

To be clear, the udev.c attachment I added wasn't intended as a fix or workaround, but a test of the udev library which could replace the existing calls to udevinfo.

What I don't know is how to find out which distributions have the library and/or if they require root access to run them.

comment:20 by Bill Meek <llibkeem@…>, 16 years ago

Bad test.

I had a udevinfo script that called udevadm in place when I tested.

Removed it and the defuncts are still happening.

Sorry.

comment:21 by JohnnyJboss <johnnyjboss@…>, 16 years ago

I have also tested this fix and it does not work.

1648 ? Rsl 0:11 /usr/local/bin/mythfrontend
1746 ? Z 0:00 \_ [mythfrontend] <defunct>
1748 ? Z 0:00 \_ [mythfrontend] <defunct>

comment:22 by stuartm, 16 years ago

Severity:	medium → low
Status:	infoneeded_new → new

comment:23 by Bill Meek <llibkeem@…>, 16 years ago

Update: haven't given up, but the number of defuncts changes. I can't figure out why.

I usually had about 15 defunct processes. The next patch dropped that to a solid 5. Both "bdi" and "power" appear on my machine, "trace" was added because I saw a comment on gossamer-threads by jarpublic. This seems like a keeper.

Index: mediamonitor-unix.cpp
===================================================================
--- mediamonitor-unix.cpp	(revision 22872)
+++ mediamonitor-unix.cpp	(working copy)
@@ -553,7 +553,9 @@
 
             // skip some sysfs dirs that are _not_ sub-partitions
             if (*pit == "device" || *pit == "holders" || *pit == "queue"
-                                 || *pit == "slaves"  || *pit == "subsystem")
+                                 || *pit == "slaves"  || *pit == "subsystem"
+                                 || *pit == "bdi"     || *pit == "power"
+                                 || *pit == "trace")
                 continue;
 
             found_partitions |= FindPartitions(
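
The skip test in the patch above grows by one comparison for every new sysfs directory. A lookup table is one way to keep such a list readable. The sketch below uses the standard library rather than Qt types and the hypothetical name isNonPartitionEntry; it is not the code in mediamonitor-unix.cpp, and the entries are just the names that appear in the diff.

```cpp
#include <set>
#include <string>

// Returns true for sysfs entries under a block device directory that are
// not sub-partitions (the names listed in the patch in this comment).
bool isNonPartitionEntry(const std::string &name)
{
    static const std::set<std::string> kSkip = {
        "device", "holders", "queue", "slaves", "subsystem", // existing list
        "bdi", "power", "trace",                             // proposed additions
    };
    return kSkip.count(name) != 0;
}
```

With this helper the loop body would read "if (isNonPartitionEntry(entry)) continue;", and adding another name becomes a one-word change.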

I suspect that mediamonitor-unix.cpp isn't and wasn't causing the actual defuncts to occur. When I was at 5, there were a matching 5 failures to mount. That makes sense, because there are no memory cards plugged in any of the 5 slots. In these cases, myth_system() is called to do the mounts and I tried the below. But the number of defuncts now floats between 1 and 6. Not recommending this as a fix, but the change does make a difference.

Index: mythmedia.cpp
===================================================================
--- mythmedia.cpp	(revision 22872)
+++ mythmedia.cpp	(working copy)
@@ -121,7 +121,7 @@
                 .arg(m_DevicePath);
     
         VERBOSE(VB_MEDIA, QString("Executing '%1'").arg(MountCommand));
-        if (0 == myth_system(MountCommand)) 
+        if (0 == myth_system(MountCommand, MYTH_SYSTEM_DONT_BLOCK_PARENT)) 
         {
             if (DoMount)
             {

I keep tripping over this: why is it that the udevinfo case works and the error case doesn't? udevinfo returns the same string, although the default has no trailing newline. I tried appending a newline to the default case (ret.append(QChar('\n'))), to no avail.

comment:24 by stuartm, 16 years ago

Priority:	minor → trivial

There is a known bug with mythsystem in 0.22/trunk: multiple concurrent processes started with mythsystem share the same pid file, meaning they aren't cleaned up properly when complete. This is probably related.

comment:25 by Bill Meek <llibkeem@…>, 16 years ago

I added VERBOSE lines to print out the child PIDs in myth_system and found that they didn't match those of the defunct processes. So I did the same [udevinfo->pid()] in mediamonitor-unix.cpp right after udevinfo->start and got an exact match! The usleep() in the following has eliminated the defuncts for me. The additional tests below that cut down the number of ?unrequired? attempts.

Index: mediamonitor-unix.cpp
===================================================================
--- mediamonitor-unix.cpp	(revision 22889)
+++ mediamonitor-unix.cpp	(working copy)
@@ -229,6 +229,8 @@
     args << sysfs;
     udevinfo->start("udevinfo", args);
 
+    usleep(100000);
+
     if (!udevinfo->waitForStarted(2000 /*ms*/))
     {
         VERBOSE(VB_MEDIA, msg + ", Error - udevinfo failed to start!");
@@ -553,7 +555,10 @@
 
             // skip some sysfs dirs that are _not_ sub-partitions
             if (*pit == "device" || *pit == "holders" || *pit == "queue"
-                                 || *pit == "slaves"  || *pit == "subsystem")
+                                 || *pit == "slaves"  || *pit == "subsystem"
+                                 || *pit == "bdi"     || *pit == "power"
+                                 || *pit == "trace")
+
                 continue;
 
             found_partitions |= FindPartitions(

Of course those affected could just add a shell script something like this as /sbin/udevinfo:

# If you already have udevinfo, you don't want this script!!!
UDEVADM=/sbin/udevadm

if [ ! -e $UDEVADM ]; then
    echo "Strange, you don't have $UDEVADM either, bye!"
    exit 1
fi

RESULT=`$UDEVADM info $1 $2 $3  $4 2>&1`

RETURN_CODE=$?

if [ $RETURN_CODE = 0 ]; then
    echo "$RESULT"
else
    echo "device not found in database"
fi

exit $RETURN_CODE

comment:26 by Bill Meek <llibkeem@…>, 16 years ago

Update:

This link http://bugreports.qt.nokia.com/browse/QTBUG-5990 seems to address the defunct/zombie problem.

Also, http://qt.nokia.com/doc/4.5/qprocess.html#starts speaks to starting a process that is still running. The log shows all my udevinfo->start()s (up to 16) were done in a 122msec window. I saw a log from another user with 5 removable devices do it in 112msec.

Also, the echo "device not found in database" above should have: 1>&2 appended in order to emulate udevinfo.

comment:27 by sphery, 16 years ago

Resolution:	→ invalid
Status:	new → closed

The problem is a bug in Qt ( http://bugreports.qt.nokia.com/browse/QTBUG-5990 ). When #6137 is fixed, it will prevent our seeing the symptoms of this Qt bug, even on broken Qt versions. Until #6137 is fixed, users may use workarounds mentioned above or keep an eye on QTBUG-5990.

Thanks to Bill Meek for tracking down the Qt bug and to Bill Meek and Josh Winters and all the others for all the debugging help.
