blog.kfish.org

My name is Conrad Parker, and I live in Kyoto, Japan. I work with Renesas in Tokyo, designing the Linux multimedia architecture for a new line of mobile processors; and for Wikimedia Foundation, working on Ogg integration for Mozilla Firefox. I am also working towards a PhD in Computer Science at Kyoto University. Free software projects include the Sweep sound editor and the Annodex media system, and various smaller ones that you can read about here.

Tuesday, 25 March 2008

Release: HOgg 0.4.0

HOgg is a Haskell library and commandline tool for manipulating Ogg files. This release contains a bunch of code written during FOMS and LCA 2008, including a new sort subcommand and proper handling of Skeleton when merging and ripping files. Full details are in the release notes.

sort implementation

My favourite part is the implementation of the new sort subcommand:

sort :: [OggPage] -> [OggPage]
sort = sortHeaders . listMerge . demux

This is somewhat shorter than the equivalent C implementation, oggz-sort.c. Haskell affords abstraction, whereas in C it's a trade-off. sortHeaders is a long (21-line) function that re-orders header pages according to the Theora and Skeleton specifications, and listMerge is a generic list merging function, also used in the merge subcommand. demux is tiny:

demux :: (Serialled a) => [a] -> [[a]]
demux = classify serialEq
You can read that as "demux is classification by serial number": classify is a generic list function, classifying list elements according to some criterion you give it. Here, for example, the list of pages:
[Video0, Audio0, Video1, Audio1, Audio2, Audio3, Video2, Audio4, Video3, ...]
will get classified into two separate lists:
[[Video0, Video1, Video2, Video3, ...],
 [Audio0, Audio1, Audio2, Audio3, Audio4, ...]]
This is done lazily, meaning that the processing is done on the fly and big intermediate lists are not constructed in memory. Video0 and Audio0 will be passed through listMerge and sortHeaders and written to disk by the consumer of sort well before Video103 and Audio5007 are seen.
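
To make the classification step concrete, here is a minimal sketch of how a classify-style function could be written. It is not HOgg's actual code: the Page type, the Int serial numbers and the shape of the Serialled class are assumptions made up for illustration, and only the type signature of demux is taken from above.

```haskell
-- Hypothetical stand-in for an Ogg page: the serial number of the
-- logical stream it belongs to, and its position within that stream.
data Page = Page { pageSerial :: Int, pageIndex :: Int }
  deriving (Eq, Show)

-- Assumed shape of the Serialled class; the real class may differ.
class Serialled a where
  serialOf :: a -> Int

instance Serialled Page where
  serialOf = pageSerial

-- Two items match if they carry the same serial number.
serialEq :: Serialled a => a -> a -> Bool
serialEq x y = serialOf x == serialOf y

-- Group elements into classes, in order of first appearance.
-- The first element starts a class and collects everything equivalent
-- to it; the remaining elements are classified recursively.
classify :: (a -> a -> Bool) -> [a] -> [[a]]
classify _ [] = []
classify eq (x:xs) =
  (x : filter (eq x) xs) : classify eq (filter (not . eq x) xs)

demux :: Serialled a => [a] -> [[a]]
demux = classify serialEq
```

Because filter is lazy, the first few pages of each stream can be consumed from an unbounded input without reading it to the end. One limitation of this particular sketch: it cannot know that no further serial numbers will turn up until the input ends, so asking for one class more than an infinite input actually contains will not terminate.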

Documentation improvements and self-checking

The help for each subcommand now contains long descriptions, mostly similar to the man pages of the Oggz tools. The descriptions also have explicit sections describing how Theora, Skeleton and chained files are handled. The example commandlines for each subcommand use the Ogg MIME types and file extensions that we are now recommending in Xiph.Org.

The best bit though is hogg selfcheck, which checks that the help examples are valid. It checks that all the example commandlines pass through getOpt without errors, and that all file extensions used in options are valid. This is the kind of nice touch which would have been a pain to code up in C, but fell out cleanly in the Haskell implementation. As it is fairly cheap to run (and printing help text is hardly a performance-critical operation), this option is also silently run after printing out any help output at all, so that such errors are more likely to be found and reported. The same commit that introduced hogg selfcheck also fixed two such documentation errors which were found by this option :-)
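
The idea behind such a self-check can be sketched in a few lines with System.Console.GetOpt. This is only an illustration under stated assumptions, not the code in HOgg: the Flag type, the option table, the list of accepted extensions and the names checkExample and selfCheck are all hypothetical, and the examples are split on whitespace, so quoted arguments are not handled.

```haskell
import Data.List (isSuffixOf)
import System.Console.GetOpt

-- Hypothetical option set for a single subcommand.
data Flag = Output FilePath | Help
  deriving (Eq, Show)

options :: [OptDescr Flag]
options =
  [ Option "o" ["output"] (ReqArg Output "FILE") "write output to FILE"
  , Option "h" ["help"] (NoArg Help) "display this help"
  ]

-- Assumed list of acceptable Ogg file extensions.
validExtensions :: [String]
validExtensions = [".ogg", ".oga", ".ogv", ".ogx", ".spx"]

-- Check one example command line such as
-- "hogg sort -o sorted.ogv input.ogv".
-- Returns a list of problems; an empty list means the example passed.
checkExample :: String -> [String]
checkExample line = parseErrors ++ extensionErrors
  where
    -- Skip the program name and the subcommand name.
    args = drop 2 (words line)
    (flags, _files, errs) = getOpt Permute options args
    -- getOpt reports errors as newline-terminated strings.
    parseErrors = [line ++ ": " ++ filter (/= '\n') e | e <- errs]
    -- Only filenames given as option arguments are checked here.
    extensionErrors =
      [ line ++ ": unexpected file extension in " ++ f
      | Output f <- flags
      , not (any (`isSuffixOf` f) validExtensions)
      ]

-- Check every example from the help text.
selfCheck :: [String] -> [String]
selfCheck = concatMap checkExample
```

Since selfCheck is a pure function over the same example strings that the help text prints, running it after every help request costs little: print the help, then print whatever selfCheck returns.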
