blog.kfish.org

My name is Conrad Parker, and I live in Kyoto, Japan. I am working towards a PhD in Computer Science at Kyoto University, finishing September 2009. I also work on some free software projects including the Sweep sound editor and the Annodex media system, and various smaller projects which you can read about here.

Sunday, 13 January 2008

Release: liboggz 0.9.6

This release of Oggz 0.9.6 contains a new tool, oggz-comment, which can be used to edit the basic metadata (title, producer, copyright etc.) of Ogg Theora files. The library also has some pretty major improvements to the way it works out timestamps and does seeking, mostly the work of Shane Stephens.

In media files, timing and synchronization are extremely important. If the image and audio start to go out of sync, it is very noticeable and the video quickly becomes unwatchable. When you scan through a file you often need to decode a lot more data than you actually display. This is particularly the case when you jump backwards, which is common in a user interface that supports scrubbing. As video frames are stored as a difference relative to earlier (or later) frames, you end up needing to secretly jump further back in the file to the previous keyframe, and then decode many frames up to the one you actually want to show. For a smooth user experience you need to do this as quickly as possible.

Ogg has some interesting framing properties. Given that timing is so important, you might expect that every packet has its precise timing information associated with it. In Ogg, it turns out not to be so. Packets are stored in pages, and there is only one timestamp per page. It is common for many audio packets to be crammed onto one page; the timing information for all but the last of them is not stored in the file. On the other hand, the encoded data for video keyframes is usually much larger, and spans multiple pages. Only the last packet on a page has its timestamp recorded, so if the keyframe is followed by a much smaller packet of frame data in the same page, the timestamp for the keyframe will be lost. For these reasons I tend to refer to Ogg as a "lossy" container.

In order to minimize these problems, liboggz now inspects the encoded data to reconstruct the expected granulepos (corresponding to a timestamp) for every packet in an Ogg stream. This allows applications to use reliable timestamps, even though these are only sparsely recorded in most Ogg streams. This is not as easy as it sounds, particularly for Ogg Vorbis. To get a flavour of what's involved, read Shane's rant in the comments, explaining how to calculate Vorbis timestamps.
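
To illustrate the basic idea of working backwards from a page's single timestamp, here is a minimal shell sketch. It assumes a hypothetical codec where the duration of every packet (in granulepos units, e.g. audio samples) is already known, and that the page granulepos marks the end of the last packet on the page. The function name backfill_granulepos is made up for this example; it is not part of liboggz, and real codecs such as Vorbis need considerably more work just to find the packet durations.

```shell
# backfill_granulepos PAGE_GRANULEPOS DURATION...
# Hypothetical helper: prints the end position of each packet on a page,
# in order, given only the page's granulepos and each packet's duration.
backfill_granulepos() {
    page_gp=$1
    shift

    # Total duration of all packets on the page.
    total=0
    for d in "$@"; do
        total=$((total + d))
    done

    # Position just before the first packet on the page.
    pos=$((page_gp - total))

    # Walk forwards again, emitting the reconstructed end position
    # of every packet; the last one equals the page granulepos.
    for d in "$@"; do
        pos=$((pos + d))
        echo "$pos"
    done
}
```

For example, backfill_granulepos 10000 256 1024 1024 describes a page ending at position 10000 that holds three packets of 256, 1024 and 1024 samples.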

For an in-depth discussion, come to Ralph Giles' talk at linux.conf.au, Seeking is hard: Ogg design internals.


Release: xsel 1.0.0

XSel is a command-line program for getting and setting the contents of the X selection. You can use xsel in shell scripts and desktop keybindings, so that the contents of the X selection are available to command arguments:

mozilla --remote "openurl(`xsel`)"

This release adds UTF-8 support and fixes various bugs. The last version of XSel was 0.9.6, released sometime around 2001. It may have been the first version also. For some reason a bunch of patches came in recently, and I've had the joy of revisiting this project.

For old times' sake, my thoughts on ICCCM. (Warning: explicit language.) Back then I made a point of implementing as much of that crack as possible. You can even tell applications to delete their selected text:

  • To delete the contents of the selection: xsel --delete

(This really works; you can try it on xedit to remotely delete text in the editor window.)

This time around, of course, nothing does what the docs say anymore. So we ignore the details in the 2001 proposal for Inter-Client Exchange of Unicode Text and just grunt atoms at the selection owner until they yield all their secrets. And now, finally, xsel works on Japanese.

People have come up with some interesting uses for xsel over the years, but nobody has yet come up with a nifty use for the following options:

  • To append to the X selection: xsel --append < file
  • To follow a growing file: xsel --follow < file

Any ideas?


Saturday, 12 January 2008

Release: libfishsound 0.9.0

Now libfishsound 0.9.0 supports FLAC, the Free Lossless Audio Codec. The patches were originally contributed by Tobias Gehrig in 2004. There hasn't been much use of Ogg FLAC, whereas FLAC in its native encoding is very popular. However, the point of the Ogg mapping is to allow FLAC to be used in parallel with other codecs, in particular as the audio codec for video files. The combination of Theora video and FLAC audio can be very useful for music videos, where you might not care too much if the image has lost some quality but you want the sound to be as good as possible.

However, creating such a file isn't so easy. Let's say you have a source video, like GrooveTV #204 - Jacob Fred Jazz Odyssey. I took the MPEG-1 file as recommended; for clarity, let's call it source.mpg. To make a video to test on, I did:

ffmpeg2theora source.mpg
to encode the video into an Ogg file containing Theora video and Vorbis audio. This produces source.ogv.

oggzrip -c theora source.ogv -o video-theora.ogv
to extract only the Theora video track, into video-theora.ogv.

mpg123 -w source.wav source.mpg
to extract the audio to a WAV file, source.wav. Here the audio in the source material was encoded as MPEG-1 Audio Layer II; obviously if you were producing a music video, you'd skip this step and encode FLAC from the original recording. I didn't have that here, and I just wanted a file I could test on.

Still, decoding to WAV and then encoding losslessly means that no further artifacts are introduced into the audio beyond those already present in the MPEG encoding. If the only source material you have is already encoded, you don't want to degrade it further by re-encoding it with a different lossy codec.

flac --ogg source.wav -o audio-flac.oga
to encode the audio. This produces an Ogg FLAC file called audio-flac.oga.

oggzmerge video-theora.ogv audio-flac.oga -o final.ogv
to merge the video and audio tracks into the final Ogg video file, final.ogv.
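
The steps above can be chained into one small shell function. This is only a sketch: the names run, make_theora_flac and DRY_RUN are made up for this example, the intermediate file names are the ones used above, and it assumes ffmpeg2theora writes its output next to the input with an .ogv extension, as it did here.

```shell
# run: print the command when DRY_RUN is set, otherwise execute it.
# (DRY_RUN is an illustrative variable, not an option of any tool.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_theora_flac SOURCE.mpg
# Produces final.ogv containing Theora video and FLAC audio.
make_theora_flac() {
    src=$1              # e.g. source.mpg
    base=${src%.*}      # e.g. source

    # Encode to Ogg Theora+Vorbis; assumed to produce $base.ogv.
    run ffmpeg2theora "$src" || return 1
    # Keep only the Theora video track.
    run oggzrip -c theora "$base.ogv" -o video-theora.ogv || return 1
    # Decode the original audio to WAV.
    run mpg123 -w "$base.wav" "$src" || return 1
    # Encode the WAV losslessly as Ogg FLAC.
    run flac --ogg "$base.wav" -o audio-flac.oga || return 1
    # Merge the video and audio tracks.
    run oggzmerge video-theora.ogv audio-flac.oga -o final.ogv
}
```

Calling make_theora_flac source.mpg with DRY_RUN=1 set prints the five commands instead of running them, which is handy for checking the file names first.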

Note that we're using the recently recommended file extensions for Ogg video and audio.

If you know an easier way to create Ogg Theora+FLAC files, please leave a note in the comments :-)
