- 1 Where to Begin...
- 2 Now to the Transcription
- 3 Transition from P4 to P5
- 4 Common Errors
- 4.1 The first lines of your document should look similar to below if transitioning from P4 to P5:
- 4.2 Certainty Attributes
- 4.3 Work Entity
- 4.4 When Attribute
- 4.5 Remove "xmlns" in Date Line
- 4.6 RespStmt
- 4.7 Date Ranges
- 4.8 Replace "00" to "01"
- 4.9 <app>
- 4.10 Anchored Tags
- 4.11 Hand Tags
- 4.12 Attribute "To"
- 4.13 "pb" Tags
- 4.14 <rdg> Tag
- 4.15 Element "l"
- 4.16 Attribute "corr"
- 4.17 List
- 4.18 <app> Tags
- 4.19 Attribute "id"
- 4.20 Type="Extract"
- 4.21 Bibliography
- 4.22 Choice Tags with Orig
- 4.23 <poem notes>
- 4.24 Profile Description
- 4.25 Figure Entity
- 4.26 <xref>
- 4.27 Permission Statements
Where to Begin...
- Oxygen is the recommended and Brett-supported program for P5 encoding. (Oxygen is an application dedicated to editing XML.)
- Template: a skeleton document that has all of the parts every document needs. It can be downloaded and filled in with the relevant information. Oxygen has a built-in feature for templates. Follow the examples in the zip file, which tell you where to put the template. Then, in Oxygen, you will have the option to select the TEIP5-wwa template. Double-click this template, and you will get a new, empty document based on it. Brett has put instructions for using the template in a few comments in the file; there are fewer in the P5 template than in the P4 template. As you fill in the necessary items, you can delete these comments.
- Brett has updated a lot of information in the header.
- The <revisionDesc> is different, but it's simple if you look at the examples. There is now a who attribute, rather than content in an element. At the very start, give the new document its canonical name and save it to whatever local folder you keep your files in; do not save it to the template file. In addition, look at the example in the Common Errors section.
Now to the Transcription
- Lots of other little things have changed in the encoding. One big one is <pb />: each page gets a <pb />. One small change is that IDs now have to be xml:id. Linking to the page image is now simpler, but it is different. There is no doctype declaration/DTD extension prior to the header. Instead, you just give a URL-type link to the image, so you now also have to include the .jpg extension. There is an example in the Common Errors section.
- Schema: there are lots of different syntaxes for schemas, among them RNG, ODD, and DTD. DTD is the oldest syntax; it is not XML, and it's really readable. Many of the other syntaxes are XML, which has value in that you don't have to learn a new language to read them, but there is no agreement on which one everyone should be using. W3C recommends XML Schema. TEI's native schema language is RNG (RELAX NG), so we are doing our schemas in RNG. There is another RELAX NG syntax, RNC (RELAX NG Compact), which does not use XML. The schema is the way the project's rules for creating XML documents are expressed.
- Authoring a schema involves a new TEI module, called ODD. You author an ODD file (a TEI XML file with a whole bunch of new elements). Then you take this file and put it in a TEI processor, such as Roma online, or a desktop processor called Vesta (for most purposes better; available from SourceForge), a simple, stripped-down program. Give it your ODD file and tell it which schemas to generate. It can also generate documentation. Start with something TEI offers off the shelf and then modify it. This is where your customizations come in.
- Brett recommends getting rid of comments as you take care of them.
- Validating files: you need to have a network connection, and the server needs to be up. In Oxygen, the Validate button is the paper-and-checkmark icon.
- Some quick comments about the differences between P4 & P5:
divs are still numbered
lgs are no longer numbered
<seg> no longer in poetry
<app> changed to <subst>
Transition from P4 to P5
- Before you begin, create a P4P5 directory and a P4P5 Output directory on the c:\ drive. Also, be sure the Saxon folder is on the c:\ drive.
- From the Whitman Archive server, copy and paste manuscripts to c:\P4P5
- Open documents in Note Tab Pro
- Open !!P4P5 clip
- Remove Doc Types by running Step 1
- Save documents
- Minimize Note Tab Pro
- Open Command Line
- At the prompt C:\Documents and Settings\Whitman Archive>, type cd../../Saxon
- Press enter
- Copy (highlight and press enter) and paste (right click) the following command:
java -jar saxon8.jar -o C:\P4P5Output\ C:\P4P5\ C:\composite2.xsl
- Hit enter to run command
- Minimize command line
- Reopen Note Tab Pro. Close all documents after saving.
- Open documents from c:\P4P5 Output
- Run Step 2 of clip
- Save all documents and exit (or minimize) out of Note Tab Pro
- Open Oxygen XML Editor
- Open documents from c:\P4P5 Output
- Associate the schema (which should be put in the P4P5 Output directory) by clicking Document > XML Document > Associate Schema > Relax NG Schema > XML, then navigating to the schema (c:\P4P5 Output\whitman-v1.rng)
- Validate all documents & navigate through errors. See Common Errors for correction.
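Step 1 of the clip (removing the doctype declarations) can also be pictured as a small script. The sketch below is not the NoteTab clip itself; it is a rough Python stand-in, and the sample document, the entity name, and "example.dtd" are made up for illustration. It assumes the internal subset (the part in square brackets) contains no "]" character of its own.

```python
import re

# Matches a DOCTYPE declaration, with or without an internal subset in [...].
# Assumption: the internal subset contains no "]" before its closing bracket.
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^\[>]*(\[.*?\])?\s*>\s*", re.DOTALL)


def strip_doctype(xml_text: str) -> str:
    """Return xml_text with the first DOCTYPE declaration removed."""
    return DOCTYPE_RE.sub("", xml_text, count=1)


# Hypothetical P4-style sample; the DTD and entity names are invented.
sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE TEI.2 SYSTEM "example.dtd" [\n'
    '  <!ENTITY fig1 SYSTEM "fig1.jpg" NDATA jpeg>\n'
    ']>\n'
    '<TEI.2 id="xxx.00000"/>\n'
)

print(strip_doctype(sample))
```

Everything outside the declaration is left untouched, so the remaining P4 markup still has to go through the Saxon step and Step 2 of the clip.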
Common Errors
The first lines of your document should look similar to below if transitioning from P4 to P5:
<?xml version="1.0" encoding="ASCII"?>
<?xml-stylesheet type="application/xml" href="../xsl/reviews.xsl"?>
<?oxygen RNGSchema="whitman-v1.rng" type="xml"?>
<TEI xmlns="http://www.whitmanarchive.org/namespace" xml:id="xxx.00000">
Certainty Attributes
Replace percentage values with high, medium, or low, depending on the value.
Work Entity
Change the entity attribute to ref. See below: <work entity="xxx.00526" cert="high"/> to <work ref="xxx.00526" cert="high"/>
When Attribute
When attributes that hold a range change to notBefore/notAfter attributes. See below: <date when="1855-1859"> to <date notBefore="1855" notAfter="1859">
Remove "xmlns" in Date Line
<date xmlns>2008</date> to <date>2008</date>
RespStmt
Reformat the statement to one line as below:
<item>Removed hi tags in Work notes; corrected miscellaneous errors</item>
<change when="2009-10-03" who="#ss">updated TEI header</change>
- ALWAYS ATTEND TO THE REVISION DESC!!!
Date Ranges
Date ranges are encoded like the "when" attribute above: <date value="1855/1857"> to <date notBefore="1855" notAfter="1857">
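If there are many files to fix, the two range patterns shown so far (value="1855/1857" and when="1855-1859") can be caught with a search-and-replace. Here is a minimal Python sketch of that idea. It is not part of the project's clips, and it assumes the range attribute is the first attribute after <date and that both ends of the range are four-digit years.

```python
import re

# Matches <date value="YYYY/YYYY" or <date when="YYYY-YYYY".
# Assumption: the range attribute comes directly after the element name.
RANGE_RE = re.compile(r'<date\s+(?:value|when)="(\d{4})[/-](\d{4})"')


def convert_date_ranges(xml_text: str) -> str:
    """Rewrite year ranges on <date> as notBefore/notAfter attributes."""
    return RANGE_RE.sub(r'<date notBefore="\1" notAfter="\2"', xml_text)


print(convert_date_ranges('<date value="1855/1857">'))
print(convert_date_ranges('<date when="1855-1859">'))
```

Single dates such as when="2009-10-03" do not match the pattern and are left as they are.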
Replace "00" to "01"
If you have values that are "00", replace them with "01" so that the document validates.
<app>
Change all <app> tags to <subst> tags.
Anchored Tags
In anchored tags, change "yes" or "no" to "true" or "false".
Hand Tags
Change <hand> to <handNote>.
Attribute "To"
<addSpan to=""> to <addSpan spanTo="">
"pb" Tags
<pb corresp="amh.00009.001" id="leaf001r" type="recto"/> to <pb facs="../figures/amh.00006.001.jpg" xml:id="leaf001r" type="recto"/>
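The <pb> change follows a regular pattern: the old corresp value becomes an image path in facs, and id becomes xml:id. The Python sketch below illustrates that pattern only. It assumes the attributes appear in the order corresp, id (as in the example above), that the image file is named after the corresp value, and that images live in ../figures/; the function name and the image_dir parameter are hypothetical.

```python
import re

# Matches the start of a P4 <pb> tag with corresp followed by id.
# Assumption: the attributes always appear in this order.
PB_RE = re.compile(r'<pb\s+corresp="([^"]+)"\s+id="([^"]+)"')


def convert_pb(xml_text: str, image_dir: str = "../figures/") -> str:
    """Rewrite P4 <pb corresp= id=> as P5 <pb facs= xml:id=>."""
    def replace(match: re.Match) -> str:
        image, page_id = match.group(1), match.group(2)
        return f'<pb facs="{image_dir}{image}.jpg" xml:id="{page_id}"'
    return PB_RE.sub(replace, xml_text)


print(convert_pb('<pb corresp="amh.00009.001" id="leaf001r" type="recto"/>'))
```

Any remaining attributes, such as type="recto", pass through unchanged. Check the resulting facs paths against the actual image files before saving.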
<rdg> Tag
You can usually transfer the information from the <rdg> tag into the "del" or "add" tag that it contains and then delete the <rdg> tag.
<add place="supralinear" rend="insertion">then</add>
<del rend="overstrike" seq="1">now</del>
<add place="supralinear" rend="insertion" seq="2">then</add>
Element "l"
Rearrange <l> and </l> so that they contain <add></add> or <del></del>, instead of vice versa.
Attribute "corr"
See the example below for how to format this:
<sic>the incorrect way it's written</sic>
<corr>the correct way to write it</corr>
Format as below:
<add type="insertion" place="supralinear">another text</add>
<add rend="insertion" place="supralinear" seq="2">another text</add>
Attribute "id"
Attribute ‘id’ changes to attribute ‘xml:id’.
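For documents with many ids, the rename can be sketched as a search-and-replace. The Python below is only an illustration under one assumption: every occurrence of id= that follows whitespace and precedes a quote is an attribute inside a tag, not running text. Attributes that are already xml:id are skipped because the colon, not whitespace, comes before "id".

```python
import re

# Matches a bare id= attribute (preceded by whitespace, followed by a quote).
# xml:id= is not matched, since ":" rather than whitespace precedes "id".
ID_ATTR_RE = re.compile(r'(?<=\s)id=(?=["\'])')


def rename_id_attributes(xml_text: str) -> str:
    """Rename id attributes to xml:id."""
    return ID_ATTR_RE.sub("xml:id=", xml_text)


print(rename_id_attributes('<div1 id="a" type="x"><pb xml:id="b"/></div1>'))
```

Validate afterwards in Oxygen, since a regular expression cannot tell markup from text that merely looks like an attribute.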
Bibliography
In the reviews, the bibliographical transcription is separated into structured and unstructured. For the reviews, the unstructured markup is used; thus, the elements <analytic>, <imprint>, and <monogr> should be removed.
DELETE: <analytic> <imprint> <monogr>
Choice Tags with Orig
<seg>As if it were not <orig reg="indispensable">indis‑</orig></seg>
<poem notes>
<poem notes> is changed to: <poemNotes>
Figure Entity
Document figures as below:
<head type="main-authorial">Whitman's alternative plan for structuring the 1855 edition of <hi rend="italic">Leaves of Grass</hi>. Harry Ransom Humanities Research Center, University of Texas at Austin.</head>
<figDesc>Manuscript containing Whitman's alternative plan for structuring the 1855 edition of <bibl><title rend="italic">Leaves of Grass</title></bibl>.</figDesc>
<xref>
The element <xref> is not allowed in P5 encoding. Instead, use <ref>, and change the attribute to "target" instead of "doc" or "url".
For a list of these corrections, click <xref doc="anc.00161"> here </xref >.
For a list of these corrections, click <ref target="anc.00161"> here </ref >.
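The same before-and-after can be sketched as a small Python search-and-replace. This is an illustration only, not one of the project's clips, and it assumes that doc or url is the first attribute on <xref>.

```python
import re

# Opening tags: <xref doc=...> or <xref url=...> become <ref target=...>.
# Assumption: doc/url is the first attribute after the element name.
XREF_OPEN_RE = re.compile(r"<xref\s+(?:doc|url)=")
# Closing tags, allowing optional whitespace before ">".
XREF_CLOSE_RE = re.compile(r"</xref\s*>")


def convert_xref(xml_text: str) -> str:
    """Rewrite P4 <xref doc|url=...> links as P5 <ref target=...>."""
    text = XREF_OPEN_RE.sub("<ref target=", xml_text)
    return XREF_CLOSE_RE.sub("</ref>", text)


print(convert_xref('click <xref doc="anc.00161"> here </xref >.'))
```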
Permission Statements
<div1 type="permission statement">