TEI project list

I was wondering: shouldn't the page keep a list of

  • projects that have articles on the wiki
  • projects that are truly TEI-based?

Firstly, the TEI has two or three (if you count their wiki) places where TEI-based projects can be listed, whether or not they are 'encyclopedic' or 'notable' from the point of view of Wikipedia. There is no need to use this article as an additional link farm.

Secondly, I'm thinking e.g. of the recent addition of the "SWORD" project, which says "The software is also capable of utilizing certain resources encoded in using the Text Encoding Initiative (TEI) format" - is that enough to qualify it as a TEI project? It seems to be an OSIS project.

Thanks, XPtr (talk) 21:12, 27 November 2009 (UTC)

The situation should probably be re-evaluated in the light of the new TEI category too. Maybe restrict the list to content-oriented projects which have TEI as the authoritative version of the text? Stuartyeates (talk) 05:56, 29 November 2009 (UTC)

The SWORD Project produces and publishes texts encoded using TEI, including texts converted from other sources/formats (XML or otherwise) as well as newly authored/encoded texts. Some of those documents are only publicly released in a privately defined format (a compressed & indexed database), but a number of them are publicly released as XML documents, à la Perseus. That's certainly more than I can say for some of the listed projects, which use TEI in only an organization-internal manner and do not offer any actual TEI documents. The same also holds for The SWORD Project with respect to OSIS. So, to whatever degree it is an OSIS project, it is likewise a TEI project. Oskilla (talk) 10:25, 3 December 2009 (UTC)


(REFS:) There are now enough references – I deleted the call for references – but the article is (JARGON:) too compact, and uses jargonish language which is slightly similar to marketroid language, e.g. "text-centric" and "community of practice". Now, words such as these are really needed and sadly missing in the standard languages of Europe – and probably most others as well – but I perceive that the text in general should be fleshed out to comply with the normal human slow speed of "conceptualization" (pardon my jargon!) with respect to the number of words produced. Also: a certain increased verbosity (pardon!) allows for somewhat better precision on how the items described in the text interconnect.

The description of the context of TEI has to be fleshed out with purpose and examples: the "initiative" is an organization created to deal with the wild-grown flora of computer formats for storing various texts from the "humanities, social science and linguistics", primary sources from great authors, dictionaries etc. Rursus dixit. (mbork3!) 08:25, 2 April 2011 (UTC)

Request for an improvement of the TEI article

Hi, TEI has a very active community. Could somebody improve the article with respect to the maintained standard, TEI-XML? I'm familiar with several data formats and XML, but I still found no place that provides a good entry point for understanding TEI. Normally, Wikipedia is the place where people (or is it just me?) look first. Please break it down for people not familiar with the TEI technical standard and explain how the format works. Also: could somebody provide an example of what TEI-XML looks like? A comparison: the RDFa article is really helpful. RDFa is also not so easy to understand, but the article provides good examples. SebastianHellmann (talk) 21:14, 5 February 2012 (UTC)
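
As a rough illustration of the kind of example asked for here, the sketch below embeds a minimal TEI-XML document (a header with title, publication and source statements, followed by a text body) and reads it with Python's standard library. The sample document content and the helper name `summarize_tei` are made up for illustration; only the element names and the TEI namespace come from the TEI Guidelines, and this is not meant as the article's definitive example.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# A minimal TEI document: a metadata header plus the text itself.
# The wording inside the elements is invented sample content.
MINIMAL_TEI = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>A sample document</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished sample.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Written from scratch as an illustration.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>This is the <hi rend="italic">text</hi> itself.</p>
      <p>A second paragraph.</p>
    </body>
  </text>
</TEI>
"""


def summarize_tei(tei_xml: str) -> dict:
    """Return the title and the body paragraphs of a TEI document.

    Hypothetical helper: it only looks at the title statement and at
    paragraphs directly inside the body.
    """
    ns = {"tei": TEI_NS}
    root = ET.fromstring(tei_xml)
    title = root.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=ns
    )
    # itertext() flattens inline markup such as <hi> into plain text.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.findall("tei:text/tei:body/tei:p", ns)
    ]
    return {"title": title, "paragraphs": paragraphs}


if __name__ == "__main__":
    summary = summarize_tei(MINIMAL_TEI)
    print(summary["title"])
    for paragraph in summary["paragraphs"]:
        print("-", paragraph)
```

The point of the sketch is only that a TEI file separates metadata (the `teiHeader`) from the encoded text (the `text` element), and that inline markup such as `hi` sits inside ordinary paragraphs.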

Suggested merge

Merge Both articles (TEI_Lite and ODD_(Text_Encoding_Initiative)) are not very long. I think it would improve the Text Encoding Initiative article (which has some issues, see my comment above) to have the content right here. I am no insider, but it seems that the Text Encoding Initiative's main purpose is to generate, standardize and maintain these formats. SebastianHellmann (talk) 21:14, 5 February 2012 (UTC)

  • Oppose. To me they look like their subjects are at very different levels of detail — TEI being a big community of people working on a broad array of text encoding issues, and ODD being a very specific technical solution to a low-level problem for this community (how to formalize the metadata that describes an encoding). I think it would cause big WP:UNDUE problems to try to merge them. I don't see a big need for the ODD article to grow; I agree that expansion of the TEI article would be reasonable but I think the place to start is in the sections on projects and customizations, which are currently very terse (just a list each), rather than trying to bring in four paragraphs of text on one very specific file format. —David Eppstein (talk) 21:52, 5 February 2012 (UTC)
Fair point. Then TEI_Lite and ODD should be extended (with an example) and the TEI article should have some sort of summary explaining the relation between the three articles (like size or popularity). Currently the article seems to stress the importance of ODD as the main aspect of the guidelines, so WP:UNDUE is countered by the article itself ... Having one article per customised format, on the other hand, seems to result in Wikipedia becoming a WP:DICTIONARY. We should discuss merging ODD and TEI Lite then as Formats build on TEI guidelines. I will write an email to the TEI List SebastianHellmann (talk) 22:28, 5 February 2012 (UTC)

Just wondering: wouldn't it be so much nicer to first get more than a faint idea about something, and only then mess up (oh, pardon, 'improve') the article about that thing? The TEI is many things, TEI Lite is a popular encoding format, and ODD is basically a meta-schema language with a zing. And it takes one hopefully well-willing but sadly ignorant zealot to mess stuff up, what a pity he didn't insert those templates asking for confirming every second statement. Ooops, I shouldn't have mentioned that, should I. XPtr (talk) 23:11, 5 February 2012 (UTC)

Should I be sorry for trying to improve the article? Just use the rollback function, I would not be offended, if you did so. By the way, if the article were better, I would already have more than a faint idea. SebastianHellmann (talk) 09:23, 6 February 2012 (UTC)
  • I suggest that what we need is a completely new structure based on encyclopedia lines rather than happenstance. I suspect that it'll need to be done by someone with perspective rather than a long-time TEI person. Stuartyeates (talk) 02:27, 6 February 2012 (UTC)
  • Oppose Different enough for 2 pages. However, Text Encoding Initiative is somewhat chaotic and I would suggest a 90% rewrite - but I am not going to volunteer for it. ODD could do with a rename to suggest it is mostly about the format not the effort to support the format. Both articles need attention and various fixes, not a merge. History2007 (talk) 10:13, 7 February 2012 (UTC)


I've had a crack at a restructure, based on SebastianHellmann's suggestions. There are still some things to do, but the structure now seems to make sense (or at least it does to me). Stuartyeates (talk) 06:22, 7 February 2012 (UTC)

TEI presuppositions

CES Eagles broken links

Missing as See Also: CES

So many CES page links are broken, that this ref may be helpful:

Oxford and Tufts: the facts

The factual question is whether non-professors are using these services.

I am under-whelmed by Tufts Perseus: am I alone?

Do students use TEI-encoded text? Are Cliff Notes and their ilk using TEI?

Do more engineering students read "notes" or the Shakespeare? Where is TEI in that social phenomenon of the non-text reader in the age of social networking? Does TEI mean more or fewer students actually read the Darwin texts? The Freud texts?

Of course the objective of TEI is to aid corpus-based research ... but does that include student term papers? And the sociological evidence is ?

G. Robert Shiplett 14:40, 19 March 2012 (UTC)


Sword doesn't generally use TEI except for a rather minuscule subsection of its books (dictionaries). The main format for Sword is [OSIS] (or […]). Could I delete the line in the table?

Ceplm (talk) 15:57, 19 April 2016 (UTC)