about

i am currently working as an open source software developer/maintainer and freelancer. you can hire me for any of these tasks:

  • system administration
  • system programming
  • web application development
  • consulting

for detailed information take a look at my résumé. please note that i currently do unix/linux and open source/free software only.

contact

Benedikt Böhm
Berlin, Germany

SMTP/XMPP
bb@xnull.de
IRC
Hollow at irc.oftc.net or irc.freenode.net
GPG
key id: 0xB5FAF161

detailed information is available. don't forget to take a look at my xing profile and gulp.de profile.

networking

if you do social networking you can follow me with these services:

xing gulp github ohloh twitter soup.io delicious last.fm

mtscrape

Author: Benedikt Böhm
Version: 0.2
Web:http://bb.xnull.de/projects/mtscrape/
Source:http://git.xnull.de/gitweb/?p=mtscrape.git (also on github)
Download:http://bb.xnull.de/projects/mtscrape/dist/

Rationale

In 2007, the public broadcasting services in Germany started their online portal for on-demand content. Users can now watch all broadcastings without license restrictions for (at least) 7 days after their original transmission.

Unfortunately, the only available formats are WMV and MOV, and therefore users of free software are forced to use the VLC browser plugin, which does not support pausing the stream, is awfully slow, and generally sucks.

To rectify this situation, mtscrape can save and convert these streams.

Usage

mtscrape [OPTIONS] [SOURCES]

Options:
  -a, --age       Maximum age of feed item
  -c, --convert   Convert wmv to ogg/theora
  -d, --dir       Output directory
  -f, --fast      Stream low-quality (faster)
  -m, --match     Regular expression to match against items
  -h, --help      This help text
  -v, --verbose   Verbose output

Sources:
  -A, --asx       Parse ASX link/file directly
  -C, --category  Parse RSS feed of mediathek category ID
  -I, --item      Parse JSON of mediathek item ID
  -L, --link      Parse JSON of mediathek link

Note: -a and -m are only used in combination with -C. -f is not used in combination with -A.

Note: you can specify all sources multiple times to download multiple items.

Examples

This command will download the item with ID 257404 (heute 100sec, 15.10.07):

mtscrape -v -I 257404

Same file as above, but using the ASX file directly (easier to find in the webinterface):

mtscrape -v -A http://wstreaming.zdf.de/zdf/veryhigh/071015_hko_2000.asx

Same file as above, but using the HTTP link directly (easier to find in the RSS feed):

mtscrape -v -L http://www.zdf.de/ZDFmediathek/content/heute_100SEC/166/257404

This will stream all items from category 208 (JBK) and 414 (Maybritt Illner) since one week ago into directory /data/talkshows/:

mtscrape -v -a 7 -d /data/talkshows -C 208 -C 414

This will stream all items from category ID 228 (heute) since yesterday whose title matches the regular expression '.*heute-journal.*':

mtscrape -v -a 1 -C 228 -m '.*heute-journal.*'

Bugs

This script will not work with livestreams. mtscrape has been designed for automatic download of ondemand content, but downloading (parts) of the livestream requires manual work (i.e. stoping the recording) and is therefore not implemented in this script.

In fact, mtscrape even prevents livestreams from being downloaded if they appear in the RSS feeds.

To dump the livestream you can use mplayer directly:

mplayer \
  -playlist \
  http://wgeostreaming.zdf.de/encoder/livestream15_vh.asx \
  -dumpstream \
  -dumpfile zdf_stream.wmv

Requirements

To use this script you need the following software installed on your system:

  • Ruby-1.8.x (apt-get install ruby1.8 rubygems, emerge ruby, etc)
  • Ruby-JSON (gem install json)
  • LibXML-Ruby (gem install libxml-ruby)
  • mplayer (apt-get install mplayer, emerge mplayer, etc)
  • ffmpeg2theora (only for --convert)