Opened 14 years ago
Closed 12 years ago
#9836 closed Bug Report - General (Fixed)
UTF-16LE .srt subtitle files do not work
Reported by: | Owned by: | Jim Stichnoth | |
---|---|---|---|
Priority: | minor | Milestone: | 0.27 |
Component: | MythTV - Captions | Version: | 0.24-fixes |
Severity: | medium | Keywords: | subtitle srt utf-16 |
Cc: | Ticket locked: | no |
Description
I have several UTF-16LE .srt subtitle files. I had to convert them to UTF-8 with iconv before they were recognized during playback.
Attachments (5)
Change History (28)
comment:1 by , 14 years ago
Milestone: | 0.24.2 → unknown |
---|
comment:2 by , 14 years ago
Please post the first bytes of a subtitle. E.g. the output of
hexdump -n 16 [filename.srt]
Google suggests that the presence/absence of a Byte Order Mark might irritate ffmpeg.
comment:3 by , 14 years ago
Here is the hex dump requested:
0000000 feff 0031 000d 000a 0030 0030 003a 0030 0000010
You should be able to reproduce the issue by converting any UTF-8 .srt file to 16bit LE using:
iconv -f UTF-8 -t UTF-16LE OldFilename.srt > NewFilename.srt
comment:4 by , 14 years ago
Owner: | removed |
---|---|
Status: | new → assigned |
by , 14 years ago
Attachment: | external-subs-utf.diff added |
---|
comment:5 by , 14 years ago
I've attached a patch that should fix the issue.
The root problem is that the code imported from xine does not handle multi-byte encodings. The 'cleanest' route would be to ensure it does.
Instead I've taken the approach of probing the subtitle file for a UTF encoding, using it if detected and otherwise falling back to the encoding setting. The file is then loaded in full and converted to unicode, converted back to utf-8 for parsing and finally converted back to unicode for presentation. Not pretty but I couldn't see another way of doing it without touching the actual parsing code.
The only problem is that it doesn't seem to handle files converted to utf using iconv. Material converted inside QtCreator (i.e. probably using Qt code) seems to work fine.
by , 14 years ago
Attachment: | Chalet Girl (2011) BluRay.edited.srt added |
---|
comment:6 by , 14 years ago
Status: | assigned → infoneeded |
---|
Does the patch attached by Mark fix the issue?
comment:7 by , 14 years ago
I do not compile myth from source, so I cannot test any time soon... Would need a few hours to relearn the process because it has been a long time and spare time has been lacking lately. I will test when I can, but would appreciate it if other testers could give its try in case I don't get to it soon.
comment:9 by , 12 years ago
I can confirm that it is still an issue with 0.26-fixes mythbuntu nightly build from today.
I have not been able to test the patch submitted by markk.
comment:10 by , 12 years ago
I run master and I'm willing to test the patch, however most of it fails to apply (I guess because I... run master):
checking file mythtv/libs/libmythtv/textsubtitleparser.cpp Hunk #1 succeeded at 19 with fuzz 2 (offset 2 lines). Hunk #2 FAILED at 119. Hunk #3 FAILED at 163. 2 out of 3 hunks FAILED checking file mythtv/libs/libmythtv/xine_demux_sputext.cpp Hunk #2 FAILED at 96. Hunk #3 FAILED at 1108. Hunk #4 FAILED at 1123. Hunk #5 succeeded at 1151 (offset 2 lines). 3 out of 5 hunks FAILED checking file mythtv/libs/libmythtv/xine_demux_sputext.h Hunk #1 FAILED at 21. 1 out of 1 hunk FAILED MythTV Version : v0.27-pre2-2004-g30d7cc4-dirty MythTV Branch : master Network Protocol : 77 Library API : 0.27.20130719-1 QT Version : 4.8.4 Options compiled in: linux profile use_hidesyms using_alsa using_oss using_pulse using_pulseoutput using_backend using_bindings_perl using_bindings_python using_bindings_php using_dvb using_frontend using_ceton using_hdpvr using_libcrypto using_libfftw3 using_libxml2 using_libudf using_lirc using_mheg using_opengl using_opengl_video using_qtwebkit using_qtscript using_qtdbus using_taglib using_v4l2 using_x11 using_xrandr using_xv using_profiletype using_bindings_perl using_bindings_python using_bindings_php using_mythtranscode using_opengl using_vdpau using_ffmpeg_threads using_mheg using_libass using_libxml2 using_libudf
If someone can provide a patch against master, I can test it.
comment:11 by , 12 years ago
Milestone: | unknown → 0.27 |
---|---|
Status: | infoneeded → assigned |
comment:12 by , 12 years ago
Owner: | set to |
---|
comment:13 by , 12 years ago
Component: | MythTV - Video Playback → MythTV - Captions |
---|---|
Status: | assigned → accepted |
comment:14 by , 12 years ago
Yianni (and others),
Please try srt.patch and see if it works. It is an adaptation of Mark's original patch. Like Mark, I found that it still wouldn't load files converted by iconv, so I couldn't test the attached Chalet Girl example.
comment:15 by , 12 years ago
Sorry for the late reply, but we were away for the weekend. Anyway, the patch seems to work with the provided Chalet Girl (2011) BluRay.edited.srt file.
I just downloaded the srt file, I renamed it to fit one of my films, and I could see the Arabic (I think) subs with the film.
Can I be of more help?
Yianni.
comment:16 by , 12 years ago
Scratch that. I hadn't paid close attention to your post, Jim. It appears the Chalet Girl was downloaded as UTF-8, that's why it was appearing normally.
I have other srt files encoded as utl-16le that still don't play with the patch.
comment:18 by , 12 years ago
Milestone: | 0.27 → 0.27.1 |
---|
by , 12 years ago
Attachment: | srt2.patch added |
---|
comment:19 by , 12 years ago
Yianni, srt2.patch works for me with your sample, could you give it a try? (A little more cleanup is needed, e.g. it probably leaks memory.)
comment:20 by , 12 years ago
Yes, Jim, I can confirm it works here, too. Thanks for your effort. I'm off to work now!
comment:23 by , 12 years ago
Milestone: | 0.27.1 → 0.27 |
---|---|
Resolution: | → Fixed |
Status: | accepted → closed |
Resetting milestone to default.