To get the most out of the XMLTV package you will probably want to use
a grabber to get some listings, and then perhaps pipe them through some
filter programs and then use one of the two chooser programs to select
your viewing for the next week.
* Grabbers
These are programs which retrieve TV listings data and output them in
XMLTV format. Grabbers are included for the following countries:
  Finland              tv_grab_fi, tv_grab_fi_sv
  France               tv_grab_fr
  Hungary              tv_grab_huro
  Iceland              tv_grab_is
  Italy                tv_grab_it, tv_grab_it_dvb
  Portugal             tv_grab_pt_vodafone
  Switzerland          tv_grab_ch_search
  US and Canada        tv_grab_na_dd, tv_grab_na_tvmedia
Grabbers are included for the following larger geographic areas:
  Europe/US/Canada/Latin America/Caribbean
                       tv_grab_zz_sdjson, tv_grab_zz_sdjson_sqlite
Contributions from other countries are welcome, of course.
Most grabbers have a configuration stage: once you've decided
what grabber to use, first run it with --configure, for example:
% tv_grab_fi --configure
If the grabber does not need configuration it will tell you so,
otherwise you will be able to choose what channels to download listings
for (fewer is usually faster).
By default, grabbers print listings to standard output (STDOUT) but you
will probably want to redirect output to a file, for example:
% tv_grab_fi --output fi.xml
The default is to grab listings for the longest time period possible (one
or two weeks) but to speed up downloading you may want to specify
'--days 1' for one day only.
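As a minimal sketch, the --days and --output options described above
can be combined in a small shell function. The name grab_listings and
its argument order are illustrative only and are not part of XMLTV.

```shell
# grab_listings GRABBER DAYS FILE
# Hypothetical helper: run GRABBER for DAYS days and write the
# listings to FILE instead of standard output.
grab_listings() {
    grabber=$1
    days=$2
    outfile=$3
    "$grabber" --days "$days" --output "$outfile"
}
```

For example, to fetch one day of Finnish listings into fi.xml:
% grab_listings tv_grab_fi 1 fi.xml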
* Filters
There are some Unix-style filter programs which perform processing on
XMLTV listings files. In particular the grabber output is not
guaranteed to be in any particular order and you probably want to sort
it.
Each filter is normally run with both input and output redirected, for
example:
% tv_sort <fi.xml >fi_sorted.xml
Please see the programs' manual pages for more detailed instructions;
also each supports the --help convention to print a usage message.
* * tv_sort: sort listings into date order
Programmes are sorted based on their start time and date, and if those
are the same then their end time. tv_sort also adds in the 'stop
time' field of a programme, if it is missing, by looking at when the
next thing starts on the same channel. It also performs some sanity
checks such as checking that two programmes on the same channel do
not overlap.
* * tv_grep: filter listings by regexp matching
This is a tool to extract all programmes or channels that match a
given regular expression. You can use tv_grep as a quick and dirty
way to filter for your favourite shows, but it is easier to use
tv_pick_cgi or tv_check for that.
The simple usage is with a single regular expression, for example
'tv_grep Countdown'. It is also possible to match individual fields
of a programme, for example:
% tv_grep --ignore-case --category drama
How effective this is depends on how well the listings data is organized
into different fields.
There are also some tests which don't quite count as 'grepping':
--on-after TIME will remove all programmes which would already have
finished at time TIME. So 'tv_grep --on-after now' will filter out
shows you have already missed. You can combine tests with --or and
--and.
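The --on-after test can be run with redirected input and output like
any other filter. The following is a sketch only; the function name
upcoming and the file arguments are assumptions made for illustration.

```shell
# upcoming INFILE OUTFILE
# Hypothetical helper: keep only the programmes in INFILE that have
# not already finished, and write them to OUTFILE.
upcoming() {
    tv_grep --on-after now <"$1" >"$2"
}
```

For example:
% upcoming fi_sorted.xml fi_upcoming.xml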
* * tv_split: split listings into separate files
If you want separate XML files for different time periods or channels,
pipe the listings into tv_split with a suitable template for the
output filename. For example:
% tv_split --output %channel-%Y%m%d.xml
will generate one output file for each channel/date combination.
* * tv_extractinfo_en: extract info from English-language programmes
Often the listings source being grabbed won't be as machine-readable
as it could be. For example instead of storing the director of a film
in a separate field, it may simply write 'Directed by William Wyler'.
Or separate programmes on at different times may be combined into a
single entry. tv_extractinfo_en attempts to sort out this mess using
heuristics and regular expressions that match English-language
descriptions.
It was written for the UK listings, and some of the things corrected
(such as multipart programmes) are specific to that source. But it
should work for any anglophone listings source. The North American
programme descriptions are too terse to extract much information, but
tv_extractinfo_en occasionally manages to get the name of a
presenter.
Lots of details that could be handled are not, because any heuristic
for this kind of thing must occasionally get the wrong answer. It's
more important to minimize the number of false positives. You should
run a week's listings through tv_extractinfo_en and diff the results
against the original to decide whether you trust the program enough to
use it.
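One way to carry out the comparison suggested above is sketched below.
It assumes the week's listings have already been sorted into a single
file; the function name review_extractinfo and the '_extracted' file
suffix are illustrative choices, not XMLTV conventions.

```shell
# review_extractinfo SORTED_FILE
# Hypothetical helper: run tv_extractinfo_en on an already sorted
# listings file and print what it changed as a unified diff.
review_extractinfo() {
    original=$1
    extracted=${original%.xml}_extracted.xml
    tv_extractinfo_en <"$original" >"$extracted"
    # diff exits with status 1 when the files differ, which is the
    # expected case here, so its exit status is ignored.
    diff -u "$original" "$extracted" || true
}
```

For example, to page through the changes made to week_sorted.xml:
% review_extractinfo week_sorted.xml | less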
* * tv_imdb: enrich listings with data from the Internet Movie Database
tv_imdb is a filter program which tries to look up programmes in the
publicly available imdb data and add information to them before
writing them out again. At present it requires you to download some
(rather large) data files from the imdb ftp site. See the manual page
for more details.
* * tv_tmdb: enrich listings with data from The Movie Database
This filter program tries to look up programmes in the publicly
available tmdb data and add information to them before writing them
out again. Registration with tmdb is required to obtain a license key.
See the manual page for more details.
* * tv_to_latex: convert listings to LaTeX source
To print out listings in a concise format run them through tv_to_latex
and then LaTeX, for example:
% tv_to_latex <listings_sorted.xml >tv.tex
% latex tv.tex
% dvips tv.dvi
and then print tv.ps if it hasn't printed already. You may want to do
this on the full sorted listings for a complete TV guide (which will
run to many pages), or on the output from tv_pick_cgi or tv_grep for a
personal TV guide.
Tools exist to convert XMLTV data to HTML or to PDFs, but they are not
included in this release.
* * tv_to_text: convert TV listings to plain text
This filter generates a plain text summary of listings. The
information included in the summary is the same as with tv_to_latex.
* Choosers
The real point of getting a TV guide in machine-readable form is to
let the computer do the work of looking through it finding things for
you to watch. Two programs are distributed to do this. tv_check is a
GUI-based program where you select some shows and then generate a
printed report which flags any deviations from the normal weekly
schedule. tv_pick_cgi is a Web-based program which takes a different
approach: it shows you all the programmes that are on and asks what to
do with each one, then generates a personal TV guide with the shows
chosen. However, preferences are remembered for the next run, so next
time you'll only be asked about new programmes.
See README.tv_check for instructions on using tv_check.
To use tv_pick_cgi, you will need an environment for running CGI
scripts. If you're lucky enough to have a web server handy, copy the
file as tv_pick.cgi to a directory somewhere, copy XMLTV.pm and the
XMLTV/ directory (which contains more Perl modules) to the same place,
and copy a listings file there as tv.xml. It is best for the listings
to be sorted.
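The copying steps above might look like the following sketch. It
assumes tv_pick_cgi, XMLTV.pm and the XMLTV/ directory have been
gathered into one source directory, which may not match how your
installation lays them out; the function name install_pick_cgi and all
paths are hypothetical.

```shell
# install_pick_cgi SRCDIR LISTINGS CGIDIR
# Hypothetical helper. SRCDIR is assumed to contain tv_pick_cgi,
# XMLTV.pm and the XMLTV/ directory; CGIDIR is a directory from which
# your web server runs CGI scripts.
install_pick_cgi() {
    srcdir=$1
    listings=$2
    cgidir=$3
    mkdir -p "$cgidir"
    cp "$srcdir/tv_pick_cgi" "$cgidir/tv_pick.cgi"
    # CGI scripts generally need to be executable.
    chmod +x "$cgidir/tv_pick.cgi"
    cp "$srcdir/XMLTV.pm" "$cgidir/"
    cp -R "$srcdir/XMLTV" "$cgidir/"
    # The listings file must be named tv.xml next to the script.
    cp "$listings" "$cgidir/tv.xml"
}
```

For example, with a made-up source directory and CGI directory:
% install_pick_cgi ./xmltv-files fi_sorted.xml /var/www/cgi-bin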
If you have no web server, you can still run tv_pick_cgi using the
'CGI emulation' mode of the Lynx text-based web browser. Run 'lynx
lynxcgi:tv_pick_cgi'. This assumes your Lynx has the CGI emulation
compiled in - if not, suggest it to your vendor. Quick guide to Lynx:
move between radio buttons using up-arrow and down-arrow. Press
right-arrow to select a radio button; to press an on-screen button
like 'Submit', move the highlight to it and press Enter.
You should now be presented with a list of all programmes you haven't
seen before (on the first run, this will be everything). For each
programme there are four choices:
  never  - no, I don't want to watch it, and don't ask me about
           programmes with this title ever again.
  no     - I won't watch it this time, but ask me again next time.
  yes    - I might watch it (put it in the output listings), but ask me
           again next time.
  always - whenever a programme with this title appears, always put it
           in the output without asking.
The default option for unrecognized titles is 'never', reflecting the
fact that most things on TV are rubbish. Because something marked as
'never' is effectively censored from all future sessions with
tv_pick_cgi, you should be sure to change this for any programme you
might want to watch in the future. Saying 'no' is a safe choice for
things you don't want to watch.
When you've chosen your preferences for everything on the page, press
'Submit' and a page will appear confirming your preferences and
listing which of the programmes will appear in the output ('planning
to watch'). There will probably be several pages of listings, so go
to 'Next page' and repeat as necessary. Take comfort in the thought
that you'll never have to deal with most of these shows ever
again :-).
At the end a personal listings file towatch.xml is generated, which
you can download with your browser if you want, and your preferences
are stored for next time in the file tvprefs. It is worth checking
this file after your first use of tv_pick_cgi in case you accidentally
marked something as 'never'.
* Using the tools together
It's probably easiest, once you get used to the tools, to run them
together in a pipeline. For example:
% tv_grab_fi | tv_sort | tv_extractinfo_en \
| tv_sort | tv_grep --on-after now >guide.xml
This gets listings, sorts them, munges them through
tv_extractinfo_en to see what it finds (in this case it will probably
break up 'Open University' into subprogrammes, among other things),
sorts again and filters out those programmes already missed. The
first sorting is needed to add stop times to programmes to give
tv_extractinfo_en the most information to work on; the second sorting
because tv_extractinfo_en does not necessarily produce fully sorted
output. Most of the XMLTV tools do not strictly require that the
input be sorted, but they tend to work a bit better if it is.
Then run 'tv_check --scan' or use tv_pick_cgi to generate a text
report or a personal TV listing.