<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:t="http://www.tei-c.org/ns/1.0" xml:lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta><title>Modelling meaning: a short history of text markup </title><meta name="generator" content="Generated by TEISLIDY stylesheet"></meta><script src="https://www.w3.org/Talks/Tools/Slidy/slidy.js" type="text/javascript"></script><link rel="stylesheet" type="text/css" media="screen, projection" href="https://www.w3.org/Talks/Tools/Slidy/show.css"></link><link href="../css/egXMLhandling.css" rel="stylesheet" type="text/css"></link><link href="../css/tei.css" rel="stylesheet" type="text/css"></link></head><body class="simple" id="TOP"><div class="slide cover"><img src="media/logo.jpg" width="40%" style="float:left" alt="[Put logo here]" class="cover"></img><br clear="all"></br><h1>Modelling meaning: a short history of text markup </h1><p>Lou Burnard</p></div><div class="slide"><h2>A naive realist's manifesto</h2><div class="figure"><img src="media/model-1.png" alt="" class="graphic"></img></div><p class="box">Modelling matters</p></div><div class="slide"><h2>A very long time ago</h2><p>Let's start in the unfamiliar world of the mid-1980s... 
</p><ul class="pause"><li class="item">the World Wide Web did not exist</li><li class="item">the tunnel beneath the English Channel was still being built</li><li class="item">a state called the Soviet Union had just launched a space station called Mir</li><li class="item">serious computing was done on mainframes </li><li class="item">the world was managing nicely without the DVD, the mobile phone, cable TV, or Microsoft Word</li></ul></div><div class="slide"><h2>...but also a familiar one</h2><ul><li class="item">corpus linguistics and <span class="q">‘artificial intelligence’</span> had created a demand for large-scale textual resources in academia and beyond</li><li class="item">advances in text processing were beginning to affect lexicography and document management systems (e.g. TeX, Scribe, (S)GML ...)</li><li class="item">the Internet existed for academics and for the military; theories about how to use it <span class="q">‘hypertextually’</span> abounded</li><li class="item">books, articles, and even courses in something called "Computing in the Humanities" were beginning to appear</li></ul></div><div class="slide"><h2>Modelling the data vs modelling the text</h2><p>By the end of the 1970s, methods variously called <span class="q">‘data modelling’</span>, <span class="q">‘conceptual analysis’</span>, <span class="q">‘database design’</span> vel sim. 
had become common practice.</p><ul><li class="item">remember: a centralised mainframe world dominated by IBM</li><li class="item">spread of office automation and consequent data integration</li><li class="item">ANSI/SPARC three-level model</li></ul><div class="figure"><img src="media/ansi-sparc.png" alt="" class="graphic"></img></div></div><div class="slide"><h2>An inherently reductive process</h2><div class="figure"><img src="media/ansi-sparc-2.png" alt="" class="graphic"></img></div><p class="box">How applicable are such methods to the complexity of historical data sources?</p></div><div class="slide"><h2>The 1980s were a period of technological enthusiasm</h2><ul><li class="item">Digital methods and digital resources, despite their perceived strangeness, were increasingly evident in the Humanities </li><li class="item">There was some public funding of infrastructural activities, both at national and European levels: in the UK, for example, the <span class="titlem">Computers in Teaching Initiative</span> and the <span class="titlem">Arts and Humanities Data Service</span></li><li class="item">Something radically new, or just an update? 
</li><li class="item">Humanities Computing (aka Digital Humanities) gets a foothold by establishing courses </li></ul><div class="figure"><img src="media/slide20.png" alt="" class="graphic" style=" height:30%;"></img></div></div><div class="slide"><h2>Re-invention of <span class="foreign">Quellenkritik</span></h2><p class="box"><span class="q">‘History that is not quantifiable cannot claim to be scientific’</span> (Le Roy Ladurie 1972)</p><ul><li class="item">In the UK, a series of <span class="hi">History and Computing</span> conferences (1986-1990) showed historians already using commercial DBMS, data analysis tools developed for survey analysis, "personal database systems" ...</li><li class="item">In France, J-P Genet and others influenced by the <span class="hi">Annales</span> school proposed a programme of digitization of historical source records</li><li class="item">This was pursued further by Manfred Thaller with the program <em>kleio</em> (1982) -- a tool for transcribing and analysing (extracts from) historical sources, which included annotation of their content/significance</li><li class="item">Thaller also (in 1989) challenged advocates of Humanities Computing to define its underlying theory </li></ul></div><div class="slide"><h2>Theorizing Humanities Computing</h2><ul><li class="item">What <span class="hi">are</span> the underlying principles of the tools used in Humanities Computing (then) or Digital Humanities (now)?</li><li class="item">Unsworth and others eventually (by 2002) start using the phrase “scholarly primitives” to characterise a core set of procedures, e.g. 
<ul><li class="item"><span class="hi">searching</span> on the basis of externally-defined features</li><li class="item"><span class="hi">analysis</span> in terms of internally-defined features</li><li class="item"><span class="hi">association</span> according to shared readings</li></ul></li></ul><div class="figure"><img src="media/scholPrim.png" alt="(Hughes 2012)" class="graphic" style=" width:50%;"></img><h2>(Hughes 2012)</h2></div><p class="box">Isn't the <span class="hi">modelling of textual data</span> at the heart of all these?</p></div><div class="slide"><h2>Serious computing meets text</h2><div class="figure"><img src="media/txttrin.png" alt="The Textual Trinity (Burnard 1987)" class="graphic" style=" height:70%;"></img><h2>The Textual Trinity (Burnard 1987)</h2></div><blockquote class="quote"><p>In interpreting text, the trained human brain operates quite successfully on three distinct levels; not surprisingly, three distinct types of computer software have evolved to mimic these capabilities.</p></blockquote></div><div class="slide"><h2>Text is little boxes</h2><div class="figure"><img src="media/texQuote.png" alt="(Preliminary description of TEX: D Knuth, May 13, 1977)" class="graphic" style=" width:80%;"></img><h2>(Preliminary description of TEX: D Knuth, May 13, 1977)</h2></div><ul><li class="item"><span class="hi">TeX</span> was developed by Donald Knuth, a Stanford mathematician, to produce high quality typeset output from annotated text</li><li class="item">Knuth also developed the associated idea of <span class="hi">literate programming</span>: that software and its documentation should be written and maintained as an integrated whole</li><li class="item">TeX is still widely used, particularly in the academic community: it is open source and there are several implementations</li></ul></div><div class="slide"><h2>No, text is data</h2><div class="figure"><img src="media/germanDict.jpg" alt="" class="graphic" style=" 
height:90%;"></img></div></div><div class="slide"><h2>Database orthodoxy</h2><ul><li class="item">identify important entities which exist in the real world and the relationships amongst them</li><li class="item">formally define a conceptual model of that universe of discourse</li><li class="item">map the conceptual model to a storage model (network, relational, whatever...)</li></ul><p class="box">But what are the "important entities" we might wish to identify in a textual resource? </p></div><div class="slide"><div class="frame"><div class="col"><h2>Assize court records, for example</h2><div class="figure"><img src="media/recogModel.png" alt="(An application of CODASYL techniques to research in the humanities, 1980)" class="graphic" style=" width:80%;"></img><h2>(<span class="titlem">An application of CODASYL techniques to research in the humanities</span>, 1980) </h2></div></div><div class="col"><div class="figure"><img src="media/1671assizes.jpg" alt="" class="graphic" style=" width:80%;"></img></div></div></div></div><div class="slide"><h2>Scribe</h2><p><span class="hi">Scribe</span>, developed by Brian K Reid at Carnegie Mellon in the 1980s, was one of the earliest successful document production systems to separate content and format, and to use a formal document specification language. 
Its commercial exploitation was short-lived, but its ideas were very influential.</p><div class="figure"><img src="media/scribe-1.png" alt="" class="graphic"></img></div><div class="figure"><img src="media/scribe-2.png" alt="" class="graphic"></img></div></div><div class="slide"><h2>(S)GML</h2><div class="figure"><img src="media/sgmlQuote.png" alt="(A Brief History of the Development of SGML (C)1990 SGML Users' Group )" class="graphic" style=" width:70%;"></img><h2>(A Brief History of the Development of SGML (C)1990 SGML Users' Group )</h2></div><p>Charles Goldfarb and others developed a "Generalized Markup Language" for IBM; its successor, SGML, subsequently became an ISO standard (ISO 8879:1986) </p><p>SGML was designed to enable the sharing and long-term preservation of machine-readable documents for use in large-scale projects in government, the law, the military, and other sectors that publish on an industrial scale. </p><p class="box">SGML is the ancestor of HTML and of XML ... it defined for a whole generation a new way of thinking about <span class="hi">what text really is</span></p></div><div class="slide"><h2>Motivations</h2><ul><li class="item">an enormous increase in the quantity of technical documentation: the aircraft carrier story</li><li class="item">an enormous increase in its complexity: the Gare de Lyon story </li><li class="item">a proliferation of mutually incompatible document formats </li><li class="item">an almost evangelical desire for centrally defined standards</li><li class="item">a mainframe-based, not yet distributed, world in transition </li></ul></div><div class="slide"><h2>What is a text?</h2><ul><li class="item">content: the components (words, images, etc.) which make up a document </li><li class="item">structure: the organization and inter-relationship of those components </li><li class="item">presentation: how a document looks and what processes are applied to it </li><li class="item">... 
and possibly many readings</li></ul><div class="figure"><img src="media/19790809_002v.jpg" alt="" class="graphic" style=" height:60%;"></img></div></div><div class="slide"><h2>Separating content, structure, and presentation means : </h2><ul><li class="item">the content can be re-used </li><li class="item">the structure can be formally validated </li><li class="item">the presentation can be customized for <ul><li class="item">different media </li><li class="item">different audiences</li></ul></li><li class="item">in short, the information can be uncoupled from its processing</li></ul><p>This is not a new idea! But it's a good one...</p></div><div class="slide"><h2>Some ambitious claims ensued </h2><div class="figure"><img src="media/xml-slide.png" alt="(Presentation for Oxford IT Support Staff Conference, 1994)" class="graphic"></img><h2>(Presentation for Oxford IT Support Staff Conference, 1994)</h2></div></div><div class="slide"><div class="figure"><img src="media/questions.png" alt="" class="graphic"></img></div></div><div class="slide"><div class="figure"><img src="media/questions-2.png" alt="" class="graphic"></img></div></div><div class="slide"><div class="figure"><img src="media/questions-3.png" alt="" class="graphic"></img></div></div><div class="slide"><div class="frame"><div class="col"><h2>A fuller example...</h2><div id="index.xml-egXML-d30e342" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely--><span class="element"><carte <span class="attribute">n</span>="<span class="attributevalue">0010</span>"></span>
<span class="element"><recto <span class="attribute">url</span>="<span class="attributevalue">19800726_001r.jpg</span>"></span>
<span class="element"><desc></span>Vue d'un cours d'eau avec un pont en pierre et des
petites maisons de style mexicain. Un homme et une femme
navigue un pédalo en premier plan a gauche.<span class="element"></desc></span>
<span class="element"><head></span>San Antonio River<span class="element"></head></span>
<span class="element"></recto></span>
<span class="element"><verso <span class="attribute">url</span>="<span class="attributevalue">19800726_001v.jpg</span>"></span>
<span class="element"><obliteration></span>
<span class="element"><lieu></span>El Paso TX 799<span class="element"></lieu></span>
<span class="element"><date></span>18-08-1980<span class="element"></date></span>
<span class="element"></obliteration></span>
<span class="element"><message></span>
<span class="element"><p></span>26 juill 80<span class="element"></p></span>
<span class="element"><p></span>Chère Madame , après New-York et Washington dont le
gigantisme m'a beaucoup séduite, nous avons commencé
notre conquête de l'Ouest par New Orleans, ville folle
en fête perpétuelle. Il fait une chaleur torride au
Texas mais le coca-cola permet de résister –
l'Amérique m'enchante ! Bientôt, le grand Canyon, le
Colorado et San Francisco... <span class="element"></p></span>
<span class="element"><p></span> En espérant que vous passez de bonnes vacances,
affectueusement. <span class="element"></p></span>
<span class="element"><signature></span> Sylvie <span class="element"></signature></span>
<span class="element"><signature></span>François <span class="element"></signature></span>
<span class="element"></message></span>
<span class="element"><destinataire></span>Madame Lefrère
4, allée George Rouault
75020 Paris
France
<span class="element"></destinataire></span>
<span class="element"></verso></span>
<span class="element"></carte></span></div></div><div class="col"><div class="figure"><img src="media/19800726_001r.jpg" alt="" class="graphic" style=" height:40%;"></img></div><div class="figure"><img src="media/19800726_001v.jpg" alt="" class="graphic" style=" height:40%;"></img></div></div></div></div><div class="slide"><h2>A digital text may be ... </h2><p class="box">a <span class="q">‘substitute’</span> (surrogate) simply representing the appearance of an existing document</p><div class="figure"><img src="media/graves-2.png" alt="" class="graphic" style=" height:80%;"></img></div></div><div class="slide"><h2>... or it may be</h2><p class="box">a representation of its linguistic content and structure, with additional annotations about its meaning and context.</p><div class="figure"><img src="media/graves-1.png" alt="" class="graphic" style=" height:80%;"></img></div></div><div class="slide"><h2>What does the markup do?</h2><ul><li class="item">It makes explicit to a processor <em>how</em> something should be processed.</li><li class="item">In the past, ‘markup’ was what told a typesetter how to deal with a manuscript.</li><li class="item">Nowadays, it is what tells a computer program how to deal with a stream of <span class="hi">textual data</span>.</li></ul><p class="box">... and it also expresses the encoder's view of what <span class="hi">matters</span> in this document, thus determining how it can subsequently be analysed.</p></div><div class="slide"><h2>Where is the textual data and where is the markup?</h2><div class="figure"><img src="media/beowulf-ms.png" alt="BL Ms Cotton Vitellius A xv, fol. 129r" class="graphic" style=" height:80%;"></img><h2>BL Ms Cotton Vitellius A xv, fol. 129r</h2></div></div><div class="slide"><h2>Where is the textual data and where is the markup?</h2><div class="figure"><img src="media/beowulf-wrenn.png" alt="Beowulf, ed. C L Wrenn (with student annotations)" class="graphic" style=" height:70%;"></img><h2>Beowulf, ed. 
C L Wrenn (with student annotations)</h2></div></div><div class="slide"><h2>Which textual data matters?</h2><ul class="pause"><li class="item">the shape of the letters and their layout?</li><li class="item">the presumed creator of the writing?</li><li class="item">the (presumed) intentions of the creator? </li><li class="item">the stories we read into the writing? </li></ul><p class="box">A ‘document’ is something that exists in the world, which we can <span class="term">digitize</span>.</p><p class="box">A ‘text’ is an abstraction, created by or for a community of readers, which we can <span class="term">encode</span>.</p></div><div class="slide"><h2>The document as <span class="q">‘Text-Bearing Object’</span> (TBO)</h2><p class="box"><span style="font-style:italic">Materia appetit formam ut virum foemina</span></p><ul><li class="item">Traditionally, we distinguish form and content</li><li class="item">In the same way, we might think of an inscription or a manuscript as the bearer or container or form instantiating an abstract notion -- a text</li></ul><p class="box">But don't forget... 
digital texts are also TBOs!</p></div><div class="slide"><h2>Markup is a scholarly activity</h2><ul><li class="item">The application of markup to a document is an intellectual activity</li><li class="item">Deciding exactly what markup to apply and why is much the same as editing a text </li><li class="item">Markup is rarely neutral, objective, or deterministic : interpretation is needed</li><li class="item">Because it obliges us to confront difficult ontological questions, markup can be considered a research activity in itself</li><li class="item">Good textual encoding is never as easy or quick as people would believe -- do things better, not necessarily quicker</li><li class="item">The markup scheme used for a project should result from a detailed analysis of the properties of the objects the project aims to use or create</li></ul></div><div class="slide"><h2>Compare the markup</h2><div class="p"><div id="index.xml-egXML-d30e513" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely--><span class="element"><hi <span class="attribute">rend</span>="<span class="attributevalue">dropcap</span>"></span>H<span class="element"></hi></span>
<span class="element"><g <span class="attribute">ref</span>="<span class="attributevalue">#wynn</span>"></span>W<span class="element"></g></span>ÆT WE GARDE <span class="element"><lb/></span>na in
gear-dagum þeod-cyninga <span class="element"><lb/></span>þrym gefrunon, hu ða æþelingas <span class="element"><lb/></span>ellen
fremedon. oft scyld scefing sceaþe
<span class="element"><add></span>na<span class="element"></add></span>
<span class="element"><lb/></span>þreatum, moneg<span class="element"><expan></span>um<span class="element"></expan></span> mægþum meodo-setl
<span class="element"><add></span>a<span class="element"></add></span>
<span class="element"><lb/></span>of<span class="element"><damage></span>
<span class="element"><desc></span>blot<span class="element"></desc></span>
<span class="element"></damage></span>teah ...</div> <div id="index.xml-egXML-d30e542" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely--><span class="element"><lg></span>
<span class="element"><l></span>Hwæt! we Gar-dena in gear-dagum<span class="element"></l></span>
<span class="element"><l></span>þeod-cyninga þrym gefrunon,<span class="element"></l></span>
<span class="element"><l></span>hu ða æþelingas ellen fremedon,<span class="element"></l></span>
<span class="element"></lg></span>
<span class="element"><lg></span>
<span class="element"><l></span>Oft <span class="element"><persName></span>Scyld Scefing<span class="element"></persName></span>
sceaþena þreatum,<span class="element"></l></span>
<span class="element"><l></span>monegum mægþum meodo-setla ofteah;<span class="element"></l></span>
<span class="element"><l></span>egsode <span class="element"><orgName></span>Eorle<span class="element"></orgName></span>, syððan ærest wearþ<span class="element"></l></span>
<span class="element"><l></span>feasceaft funden...<span class="element"></l></span>
<span class="element"></lg></span></div></div></div><div class="slide"><h2>... and </h2><div id="index.xml-egXML-d30e569" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely--><span class="element"><s></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">interj</span>" <span class="attribute">lemma</span>="<span class="attributevalue">hwaet</span>"></span>Hwæt<span class="element"></w></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">pron</span>" <span class="attribute">lemma</span>="<span class="attributevalue">we</span>"></span>we<span class="element"></w></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">npl</span>" <span class="attribute">lemma</span>="<span class="attributevalue">gar-denum</span>"></span>Gar-dena<span class="element"></w></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">prep</span>" <span class="attribute">lemma</span>="<span class="attributevalue">in</span>"></span>in<span class="element"></w></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">npl</span>" <span class="attribute">lemma</span>="<span class="attributevalue">gear-dagum</span>"></span>gear-dagum<span class="element"></w></span> ...
<span class="element"></s></span></div><p>or even</p><div id="index.xml-egXML-d30e584" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely--><span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">npl</span>" <span class="attribute">corresp</span>="<span class="attributevalue">#w2</span>"></span>Gar-dena<span class="element"></w></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">prep</span>" <span class="attribute">corresp</span>="<span class="attributevalue">#w3</span>"></span>in<span class="element"></w></span>
<span class="element"><w <span class="attribute">pos</span>="<span class="attributevalue">npl</span>" <span class="attribute">corresp</span>="<span class="attributevalue">#w4</span>"></span>gear-dagum<span class="element"></w></span>
<span class="comment"><!-- ... --></span>
<span class="element"><w <span class="attribute">xml:id</span>="<span class="attributevalue">w2</span>"></span>armed danes<span class="element"></w></span>
<span class="element"><w <span class="attribute">xml:id</span>="<span class="attributevalue">w3</span>"></span>in<span class="element"></w></span>
<span class="element"><w <span class="attribute">xml:id</span>="<span class="attributevalue">w4</span>"></span>days of yore<span class="element"></w></span></div><p>.. not to mention ... </p><div id="index.xml-egXML-d30e600" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely-->
<span class="comment"><!-- ... --></span><span class="element"><l></span>Oft <span class="element"><persName <span class="attribute">ref</span>="<span class="attributevalue">https://en.wikipedia.org/wiki/Skj%C3%B6ldr</span>"></span>Scyld Scefing<span class="element"></persName></span>
sceaþena þreatum,<span class="element"></l></span></div><p>or even</p><div id="index.xml-egXML-d30e609" class="pre egXML_valid"><!--This otherwise redundant comment ensures egXMLs format nicely--><span class="element"><l></span>Oft <span class="element"><persName <span class="attribute">ref</span>="<span class="attributevalue">#skioldus</span>"></span>Scyld Scefing<span class="element"></persName></span>
sceaþena þreatum,<span class="element"></l></span>
<span class="comment"><!-- ... --></span>
<span class="element"><person <span class="attribute">xml:id</span>="<span class="attributevalue">skioldus</span>"></span>
<span class="element"><persName <span class="attribute">source</span>="<span class="attributevalue">#beowulf</span>"></span>Scyld Scefing<span class="element"></persName></span>
<span class="element"><persName <span class="attribute">xml:lang</span>="<span class="attributevalue">lat</span>"></span>Skioldus<span class="element"></persName></span>
<span class="element"><persName <span class="attribute">xml:lang</span>="<span class="attributevalue">non</span>"></span>Skjöld<span class="element"></persName></span>
<span class="element"><occupation></span>Legendary Norse King<span class="element"></occupation></span>
<span class="element"><ref <span class="attribute">target</span>="<span class="attributevalue">https://en.wikipedia.org/wiki/Skj%C3%B6ldr</span>"></span>Wikipedia entry<span class="element"></ref></span>
<span class="comment"><!-- ... --></span>
<span class="element"></person></span></div></div><div class="slide"><h2>Wait ... </h2><ul><li class="item">How many markup systems does the world need?<ul><li class="item">One size fits all?</li><li class="item">Let a thousand flowers bloom?</li><li class="item">Roll your own! </li></ul></li><li class="item">We've been here before...<ul><li class="item">One construct and many views</li><li class="item">modularity and extensibility</li></ul></li></ul><p class="box">... did someone mention the TEI ?</p></div><div class="slide"><h2>The Text Encoding Initiative</h2><ul><li class="item">Spring 1987: European workshops on standardisation of historical data (J.P. Genet, M. Thaller )</li><li class="item">Autumn 1987: In the US, the NEH funds an exploratory international workshop on the feasibility of defining "text encoding guidelines"</li></ul><div class="figure"><img src="../Graphics/poughkeepsie.png" alt="Vassar College, Poughkeepsie" class="graphic" style=" height:70%;"></img><h2>Vassar College, Poughkeepsie</h2></div></div><div class="slide"><h2>The obvious question</h2><ul><li class="item">So the TEI is <em>very old</em>! <ul><li class="item">Not much in computing survives 5 years, never mind 20</li><li class="item">Why is it still here, and how has it survived?</li><li class="item">What relevance can it possibly have today?</li></ul></li><li class="item">And with XML everyone can create their own markup system and still share data!</li><li class="item">And in the Semantic Web, XML systems will all understand each other's data!</li><li class="item">RDF can describe every kind of markup; SPARQL can search it! </li></ul><p class="box">Well .... maybe .... </p></div><div class="slide"><h2>Why the TEI?</h2><p>The TEI provides </p><ul class="pause"><li class="item">a language-independent framework for defining markup languages</li><li class="item">a very simple consensus-based way of organizing and structuring textual (and other) resources...</li><li class="item">... 
which can be enriched and personalized in highly idiosyncratic or specialised ways</li><li class="item">a very rich library of existing specialised components</li><li class="item">an integrated suite of standard stylesheets for delivering schemas and documentation in various languages and formats</li><li class="item">a large and active open-source-style user community</li></ul></div><div class="slide"><h2>Relevance</h2><p>Why would you want those things? </p><ul class="pause"><li class="item">because we need to interchange resources <ul><li class="item">between people</li><li class="item">(increasingly) between machines</li></ul></li><li class="item">because we need to integrate resources <ul><li class="item">of different media types</li><li class="item">from different technical contexts</li></ul></li><li class="item">because we need to preserve resources <ul><li class="item">cryogenics is not the answer!</li><li class="item">we need to preserve metadata as well as data</li></ul></li></ul></div><div class="slide"><h2>The virtuous circle of encoding</h2><p><img class="graphic" src="media/model.png" alt=""></img></p></div><div class="slide"><h2>The scope of intelligent markup</h2><p>The TEI provides -- amongst other things -- recommended markup for </p><ul><li class="item">basic structural and functional components of text </li><li class="item">diplomatic transcription, images, annotation</li><li class="item">links, correspondence, alignment</li><li class="item">data-like objects such as dates, times, places, persons, events (<span class="term">named entity recognition</span>)</li><li class="item">meta-textual annotations (correction, deletion, etc.)</li><li class="item">linguistic analysis at all levels</li><li class="item">contextual metadata of all kinds</li><li class="item">... and so on and so forth</li></ul><p class="box">Is it possible to delimit encyclopaedically all possible kinds of markup? 
</p></div><div class="slide"><h2>Why use a common framework?</h2><ul><li class="item">re-usability and repurposing of resources</li><li class="item">modular software development </li><li class="item">lower training costs</li><li class="item"><span class="q">‘frequently answered questions’</span>: common technical solutions for different application areas</li></ul><p class="box">The TEI was <em>designed</em> to support multiple views of the same resource</p></div><div class="slide"><h2>Conformance issues</h2><p>A document is <span class="term">TEI Conformant</span> if and only if it: </p><ul><li class="item">is a well-formed XML document</li><li class="item">can be validated against a <span class="term">TEI Schema</span>, that is, a schema derived from the TEI Guidelines</li><li class="item">conforms to the TEI Abstract Model </li><li class="item">uses the <span class="term">TEI Namespace</span> (and other namespaces where relevant) correctly</li><li class="item">is documented by means of a TEI Conformant <span class="term">ODD file</span> which refers to the TEI Guidelines</li></ul><p class="box">TEI conformance does not mean <span class="q">‘Do what I do’</span>, but rather <span class="q">‘Explain what you do in terms I can understand’</span></p></div><div class="slide"><h2>Why is the TEI still here?</h2><p class="box">Because it is a model of textual data which is ... </p><ul><li class="item">customisable, </li><li class="item">self-descriptive, </li><li class="item">and user-driven</li></ul></div></body></html>