doc/QuickStart

To get the most out of the XMLTV package you will probably want to use
a grabber to get some listings, and then perhaps pipe them through some
filter programs and then use one of the two chooser programs to select
your viewing for the next week.


* Grabbers

These are programs which retrieve TV listings data and output them in
XMLTV format.  Grabbers are included for the following countries:

    Finland                 tv_grab_fi, tv_grab_fi_sv
    France                  tv_grab_fr
    Hungary                 tv_grab_huro
    Iceland                 tv_grab_is
    Italy                   tv_grab_it, tv_grab_it_dvb
    Portugal                tv_grab_pt_vodafone
    Switzerland             tv_grab_ch_search
    US and Canada           tv_grab_na_dd, tv_grab_na_tvmedia

Grabbers are included for the following larger geographic areas:

    Europe/US/Canda/Latin America/Caribbean
                            tv_grab_zz_sdjson, tv_grab_zz_sdjson_sqlite

Contributions from other countries are welcome, of course.

Most grabbers have a configuration stage: once you've decided
what grabber to use, first run it with --configure, for example:

    % tv_grab_fi --configure

If the grabber does not need configuration it will tell you so,
otherwise you will be able to choose what channels to download listings
for (fewer is usually faster).

By default, grabbers print listings to standard output (STDOUT) but you
will probably want to redirect output to a file, for example:

    % tv_grab_fi --output fi.xml

The default is to grab listings for the longest time period possible (one
or two weeks) but to speed up downloading you may want to specify
'--days 1' for one day only.


* Filters

There are some Unix-style filter programs which perform processing on
XMLTV listings files.  In particular the grabber output is not
guaranteed to be in any particular order and you probably want to sort
it.

Each filter is normally run with both input and output redirected, for
example:

    % tv_sort <fi.xml >fi_sorted.xml

Please see the programs' manual pages for more detailed instructions;
also each supports the --help convention to print a usage message.

* * tv_sort: sort listings into date order

Programmes are sorted based on their start time and date, and if those
are the same then their end time.  tv_sort also adds in the 'stop
time' field of a programme, if it is missing, by looking at when the
next thing starts on the same channel.  It also performs some sanity
checks such as checking that two programmes on the same channel do
not overlap.


* * tv_grep: filter listings by regexp matching

This is a tool to extract all programmes or channels that match a
given regular expression.  You can use tv_grep as a quick and dirty
way to filter for your favourite shows, but it is easier to use
tv_pick_cgi or tv_check for that.

The simple usage is with a single regular expression, for example
'tv_grep Countdown'.  It is also possible to match individual fields
of a programme, for example:

    % tv_grep --ignore-case --category drama

How effective this is depends on how well the listings data is organized
into different fields.

There are also some tests which don't quite count as 'grepping':
--on-after TIME will remove all programmes which would already have
finished at time TIME.  So 'tv_grep --on-after now' will filter out
shows you have already missed.  You can combine tests with --or and
--and.


* * tv_split: split listings into separate files

If you want separate XML files for different time periods or channels,
pipe the listings into tv_split with a suitable template for the
output filename.  For example:

    % tv_split --output %channel-%Y%m%d.xml

will generate one output file for each channel/date combination.


* * tv_extractinfo_en: extract info from English-language programmes

Often the listings source being grabbed won't be as machine-readable
as it could be.  For example instead of storing the director of a film
in a separate field, it may simply write 'Directed by William Wyler'.
Or separate programmes on at different times may be combined into a
single entry.  tv_extractinfo_en attempts to sort out this mess using
heuristics and regular expressions that match English-language
descriptions.

It was written for the UK listings, and some of the things corrected
(such as multipart programmes) are specific to that source.  But it
should work for any anglophone listings source.  The North American
programme descriptions are too terse to extract much information, but
tv_extractinfo_en occasionally manages to get the name of a
presenter.

Lots of details that could be handled are not, because any heuristic
for this kind of thing must occasionally get the wrong answer.  It's
more important to minimize the number of false positives.  You should
run a week's listings through tv_extractinfo_en and diff the results
against the original to decide whether you trust the program enough to
use it.


* * tv_imdb: enrich listings with data from the Internet Movie Database

tv_imdb is a filter program which tries to look up programmes in the
publicly available imdb data and add information to them before
writing them out again.  At present it requires you to download some
(rather large) data files from the imdb ftp site.  See the manual page
for more details.


* * tv_tmdb: enrich listings with data from The Movie Database

This filter program tries to look up programmes in the publicly 
available tmdb data and add information to them before writing them 
out again. Registration with tmdb is required to obtain a license key. 
See the manual page for more details.


* * tv_to_latex: convert listings to LaTeX source

To print out listings in a concise format run them through tv_to_latex
and then LaTeX, for example:

    % tv_to_latex <listings_sorted.xml >tv.tex
    % latex tv.tex
    % dvips tv.dvi

and then print tv.ps if it hasn't printed already.  You may want to do
this on the full sorted listings for a complete TV guide (which will
run to many pages), or on the output from tv_pick_cgi or tv_grep for a
personal TV guide.

Tools exist to convert XMLTV data to HTML or to PDFs, but they are not
included in this release.


* * tv_to_text: convert TV listings to plain text

This filter generates a plain text summary of listings.  The
information included in the summary is the same as with tv_to_latex.


* Choosers

The real point of getting a TV guide in machine-readable form is to
let the computer do the work of looking through it finding things for
you to watch.  Two programs are distributed to do this.  tv_check is a
GUI-based program where you select some shows and then generate a
printed report which flags any deviations from the normal weekly
schedule.  tv_pick_cgi is a Web-based program which takes a different
approach: it shows you all the programmes that are on and asks what to
do with each one, then generates a personal TV guide with the shows
chosen.  However preferences are remembered for the next run, so next
time you'll only be asked about new programmes.

See README.tv_check for instructions on using tv_check.

To use tv_pick_cgi, you will need an environment for running CGI
scripts.  If you're lucky enough to have a web server handy, copy the
file as tv_pick.cgi to a directory somewhere, copy XMLTV.pm and the
XMLTV/ directory (which contains more Perl modules) to the same place,
and copy a listings file there as tv.xml.  It is best for the listings
to be sorted.

If you have no web server, you can still run tv_pick_cgi using the
'CGI emulation' mode of the Lynx text-based web browser.  Run 'lynx
lynxcgi:tv_pick_cgi'.  This assumes your Lynx has the CGI emulation
compiled in - if not, suggest it to your vendor.  Quick guide to Lynx:
move between radio buttons using up-arrow and down-arrow.  Press
right-arrow to select a radio button, to press an on-screen button
like 'Submit' move the highlight to it and press Enter.

You should now be presented with a list of all programmes you haven't
seen before (on the first run, this will be everything).  For each
programme there are four choices:

  never  - no, I don't want to watch it, and don't ask me about
           programmes with this title ever again.

  no     - I won't watch it this time, but ask me again next time.

  yes    - I might watch it (put it in the output listings), but ask me
           again next time.

  always - whenever a programme with this title appears, always put it
           in the output without asking.

The default option for unrecognized titles is 'never', reflecting the
fact that most things on TV are rubbish.  Because something marked as
'never' is effectively censored from all future sessions with
tv_pick_cgi, you should be sure to change this for any programme you
might want to watch in the future.  Saying 'no' is a safe choice for
things you don't want to watch.

when you've chosen your preferences for everything on the page, press
'Submit' and a page will appear confirming your preferences and
listing which of the programmes will appear in the output ('planning
to watch').  There will probably be several pages of listings, so go
to 'Next page' and repeat as necessary.  Take comfort in the thought
that you'll never have to deal with most of these shows ever
again :-).

At the end a personal listings file towatch.xml is generated, which
you can download with your browser if you want, and your preferences
are stored for next time in the file tvprefs.  It is worth checking
this file after your first use of pick_cgi in case you accidentally
marked something as 'never'.


* Using the tools together

It's probably easiest, once you get used to the tools, to run them
together in a pipeline.  For example:

    % tv_grab_fi | tv_sort | tv_extractinfo_en \
          | tv_sort | tv_grep --on-after now >guide.xml

This gets listings, sorts them, munges them through
tv_extractinfo_en to see what it finds (in this case it will probably
break up 'Open University' into subprogrammes, among other things),
sorts again and filters out those programmes already missed.  The
first sorting is needed to add stop times to programmes to give
tv_extractinfo_en the most information to work on; the second sorting
because tv_extractinfo_en does not necessarily produce fully sorted
output.  Most of the XMLTV tools do not strictly require that the
input be sorted, but they tend to work a bit better if it is.

Then run 'tv_check --scan' or use tv_pick_cgi to generate a text
report or a personal TV listing.