Talk:Metadata

This article is of interest to the following WikiProjects:

WikiProject Computing (Rated Start-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. It has been rated as Start-Class on the project's quality scale and as High-importance on the project's importance scale.

WikiProject Libraries (Rated Start-class, Mid-importance)
This article is within the scope of WikiProject Libraries, a collaborative effort to improve the coverage of Libraries on Wikipedia. It has been rated as Start-Class on the project's quality scale and as Mid-importance on the project's importance scale.

WikiProject Databases / Computer science (Rated C-class, Mid-importance)
This article is within the scope of WikiProject Databases, a collaborative effort to improve the coverage of database related articles on Wikipedia. It has been rated as C-Class on the project's quality scale and as Mid-importance on the project's importance scale. This article is supported by WikiProject Computer science (marked as Mid-importance).

WikiProject Mass surveillance (Rated C-class, Mid-importance)
Metadata is within the scope of WikiProject Mass surveillance, which aims to improve Wikipedia's coverage of mass surveillance and mass surveillance-related topics. It has been rated as C-Class on the quality scale and as Mid-importance on the importance scale.


Interesting thread.

I think the French, Spaniards, and Italians are calling it Metadato.

Data in my language means date (like the date on a calendar, or the date of an event like the sun has 1 trillion years of life left). I think the Greeks say imerominia for date. ime Rominia means something to me. Could you please fire up your Super Computer and run some numbers for me?

Computers were made to compute. Compute means to make a Mathematical calculation, or computation. What calculations has your Super Computer made with all the information about information it has collected about people? Does it have a simulator?

Can we talk about conflict:- an incompatibility of dates or events.

Blagoja73 (talk) 03:26, 16 February 2015 (UTC)



From LeeHunter

While i make a practice of quite precisely keeping visible the text of all talk-page discussion of the articles (i.e., of our meta-editing), i'm cutting some slack for myself (and any other close-readers of talk pages) by -- in the course of keeping the evolution of the discussion -- not making the evolution of my own meta-discussion visible, but leaving it to those who care to check the talk-page edit history, on my contribs where i leave two timestamps on a single passage of meta-talk.
--Jerzyt 05:22, 11 June 2013 (UTC)
The contributions to this talk section have been so chaotic that i am isolating, in pink boxes, all my commentary on omissions and confusing formats (other than the {{uns}} tags that are ubiquitously familiar on talk pages).
--Jerzyt 19:25, 10 & 05:22, 11 June 2013 (UTC)

The following chunk makes no sense to me and I think it should be deleted. --LeeHunter 21:23, 23 Aug 2004 (UTC)

The following four-and-a-fraction sentences, which i'm marking in green, were part of LeeHunter's signed 21:23, 23 Aug 2004 contrib, removed by 217.218.78.197 (talk · contribs · WHOIS) at 12:53, 14 October 2005, and now restored for this record.
--Jerzyt 05:22, 11 June 2013 (UTC)
  • Manually-created metadata adds value because it ensures consistency. [manually creating metadata certainly does not 'ensure' consistency!] If one webpage about a topic contains a word or phrase, then all webpages about that topic should contain that same word. [??] It also ensures variety, so that if one topic has two names, each of these names will be used. For example, an article about sports utility vehicles would also be given the metadata keywords '4 wheel drives', '4WDs' and 'four wheel drives', as this is what they are known as in Australia. [it's debatable whether the metadata really needs to have every synonym]
    — Preceding unsigned comment added by LeeHunter (talkcontribs) 21:23, 23 Aug 2004 (UTC)
LH's 2nd 'graph above (which follows their sig) incorporates the whole of a 'graph from the 09:16, 22 November 2001 page-creating edit of the accompanying article, by 210.49.109.xxx. [Sic! Is the "xxx" final byte a product of early WP practice re IPs??] In that 'graph, the square brackets are LH's means of marking the comments they have added here to the article-derived text in question.
--Jerzyt 19:25, 10 & 05:22, 11 June 2013 (UTC)

The example of metadata given looks more like a symbolic link.— Preceding unsigned comment added by 141.150.98.169 (talk) 14:27, 26 September 2004

The content from the end of this pink box to the start of the following one is the sole contribution to date of the user whose (misplaced) sigs it includes.
--Jerzyt 19:25, 10 & 05:22, 11 June 2013 (UTC)

120.16.36.147 (talk) 09:47, 13 August 2010 (UTC)From Shaq120.16.36.147 (talk) 09:47, 13 August 2010 (UTC) That information is correct and is in the easiest form of HTML Meta Data. But it should be titled under Meta Data (Search Engine Optimization)

As of now, the "From LeeHunter" talk-page section ends here.--Jerzyt 19:25, 10 June 2013 (UTC)

Is "computing" the right subheading?

Since this article (correctly in my view) begins with a non-computing example, I wonder if the title might not be better as "Metadata (information)". This would keep it in line with entries such as Table.— Preceding unsigned comment added by 209.107.97.72 (talk) 16:01, 13 April 2005‎

I think it would be fine simply at Metadata. Michael Z. 2005-05-23 06:19 Z—Preceding undated comment added 06:19, 23 May 2005‎

looking for info about image metadata - new feature?

I just noticed that photos I have been uploading recently contain metadata, but I can't find anything about this upgrade. Should I re-upload my old photos so that they contain metadata? Please direct me to the discussion of this new Wikipedia feature. Cacophony 01:49, September 7, 2005 (UTC)

The main encyclopaedia article is Exchangeable image file format. I couldn't find anything on the specific feature. Note that many image manipulation programs will lose the metadata (most images need manipulating before uploading). Older cameras won't generate it. --David Woolley 09:39, 12 November 2005 (UTC)

Link purge

Embedding comments (signed or not) within other's comments is consistent with responsible discussion only when thot thru and with "lines drawn". (I wouldn't have done it Ray's way, but that's just me.) Bravo for Ray's care and restraint.
--Jerzyt 19:44, 10 June 2013 (UTC)

This article has suffered from people adding their own external links which don't do a good job of providing further information on the topic. Remember that Wikipedia is not a web directory. If anyone should know the difference, it's metadata people. So here's a review of all the external links. Feel free to disagree with my assessments.

This was actually an interesting paper. Please restore. RayGates 02:31, 31 January 2006 (UTC)
This is extremely boring, but then I'm not a MAC user :). I'm surprised you kept it. RayGates 02:31, 31 January 2006 (UTC)
  • Meta Meta Data Data - Ralph Kimball is notable.
  • Rationales for using XMP metadata - Too specific. This is about digital photography metadata and nothing else. Deleted.
  • Metadatarisk.org - this site is run by a document security company. Wikipedia is not their free advertising. Deleted.
  • ISO/IEC JTC 1/SC 32 N 1102 - really, extremely boring ISO draft of something. Deleted. It could perhaps be restored if someone puts it in context and clarifies who would actually want or need to read it.
This was supporting documentation for the formal definition which you summarily removed previously. Its only purpose was to show the source of the definition. RayGates 02:31, 31 January 2006 (UTC)
  • All blog links: deleted for non-notability.

rspeer / ɹəədsɹ 20:11, 30 January 2006 (UTC)

Formal Definition

While I did not insert the formal definition originally, I did find it relevant. I am inclined to restore it, but would welcome other views. RayGates 02:31, 31 January 2006 (UTC)

desktop search programs and metadata

Since desktop search programs such as Google Desktop catalog metadata from files on one's computer, I would suggest that they be mentioned somewhere in the article just as Spotlight and WinFS are. --Cab88 22:33, 12 March 2006 (UTC)


Ok, I added my definition at the beginning, along with those of Bracket, Marco, and Tannenbaum under "Warehouse metadata". --DaveHay 22:15 (CST) 23 March, 2006. — Preceding unsigned comment added by Dave Hay (talkcontribs) 04:17, 24 March 2006‎

Added

I have added much and merged few. Someone do that for me, I have no more time at the moment. --Θ~ 17:44, 26 May 2006 (UTC)

Enterprise Metadata

I have rewritten the Enterprise Metadata section and incorporated it under General IT metadata. It had some good points once I was able to get past some of the grammatical issues. Apologies for my initial reactiveness.

Charles T. Betz 00:24, 27 May 2006 (UTC)

Digital Camera Metadata linking across all applications?

I just noticed my Netscape 7.2 program is now automatically displaying metadata for email message composition with digital camera images. By symbolic link?

John Zdralek what does god want with my quantum thoughts? 10:02 30 May 2006 (UTC)

Disambiguity of the MetaData virus listed on a webpage of symantec.com 31 May 2006?

John Zdralek what the heck? 17:52 04 June 2006 (UTC)

Weasel Words

Although the majority of computer scientists see metadata as a chance for better 
interoperability, there are some critic voices whose main arguments must be taken seriously:

The above sounds like weasel words to me--Greasysteve13 07:24, 6 August 2006 (UTC)

"must be taken seriously" has no business in Wikipedia. However, the criticisms belong if they are sourced. Unless the criticisms are clearly at the fringe, the benefits of metadata must also be qualified (e.g. "The benefits of metadata are" -> "Supporters assert that the benefits of metadata are") If the criticisms cannot be sourced, they do not belong at all. Simple! Notinasnaid 08:05, 6 August 2006 (UTC)
Ever thought about gettin' your own hands dirty and change anything yourselves? --Θ~ 21:25, 11 August 2006 (UTC)
I agree, each of these criticisms needs a citation.

Lexicography of term "metadata"

I have tampered with the contents of the following blue box in the following ways:
  1. I replaced two "top"-level (i.e. "=="-delimited, which is the default provided when "Start a new section" is invoked on a talk page) with "lower-level" sections, and made them both subordinate to a new "top"-level section entitled
    Lexicography of term "metadata"
  2. I consulted the edit history, and inserted a "sig line" ({{Uns}}) for each of two contribs that were left unsigned by the respective editors
  3. I moved one (signed) contrib, which had been positioned between another user's contrib and that second user's sig line.
(These are all routine cleanup measures falling short of Refactoring; in addition to preserving these routinely corrected neglects of our prescribed practices for attribution of discussions, i am also about to refactor a copy of the corrected contribs, below the box.) The discussion remains open, but responding after the appropriate yellow box (rather than in or immediately following the blue one) should promote clarity.
--Jerzyt 09:17, 10 June 2013 (UTC)
 

Etymology?

In the first sentence in the brackets it is stated that "metadata" comes from Greek meta=after .... But meta can also mean "about", and I think "metadata" is much more "about information" than "after information".
— Preceding unsigned comment added by 193.170.82.198 (talk) 11:56, 21 August 2006‎

On Etymology and Plural

Meta in Greek means after. Just that. About is not a correct translation. I noted that this was written in the Meta article as well so I corrected it. Another issue we should consider is if metadata is plural or singular. Data is plural for datum (latin). So my opinion is that the word metadata should be plural as well. Information is singular for different reasons. Metadata are different from information. Share your thoughts on this.— Preceding unsigned comment added by 193.192.38.242 (talk) 09:48, 3 November 2006‎

Although I agree with your reference to the Latin, I believe that the common use is based on considering data as the aggregate, i.e. a shorthand for set of data. I don't feel that strongly which way the article goes but I would like to see some consistency. The second sentence states "Metadata are...". The third sentence states "In library science metadata is..."
151.118.160.214 20:59, 4 December 2006 (UTC)Hoyt L. Kesterson II
I concur with the person above and am editing to remove the reference to the singular in the first sentence. There are almost no references to metadatum as a singular for metadata online (only about 4,000 hits in Google for metadatum). Also Dictionary.com does not list a singular for metadatum. I think it is misleading to indicate a singular for a word that in English is its own singular. 152.120.255.250 (talk) 18:18, 18 June 2008 (UTC)
The two yellow boxes amount, together, to a refactoring of the material in the blue box. The purpose of that is to facilitate discussion of the two tangentially related issues, each on its own terms and merits, despite their having been mixed together in one paragraph by one editor. Further comments on the two issues are encouraged, following the respective yellow boxes.
 

What is the etymology of "metadata"?

In the first sentence in the brackets it is stated that "metadata" comes from Greek meta=after ....
But meta can also mean "about",
and I think "metadata" is much more "about information" than "after information".
— Preceding unsigned comment added by 193.170.82.198 (talk) 11:56, 21 August 2006‎

Meta in Greek means after. Just that. About is not a correct translation. I noted that this was written in the Meta article as well so I corrected it. — Preceding unsigned comment added by 193.192.38.242 (talk) 09:48, 3 November 2006‎
  193... states elsewhere "I am Greek and I can assure you that Meta does not mean about." IMO it is a fault of English-language lexicography (which perhaps WP should labor to remedy) that "Greek" is usually stated without qualification, but with the meaning "Ancient Greek", a 2300-year-old language about which most Greeks perhaps know less than most native speakers of English can understand of the 600-year-old
"Whan that Aprill with his shoures soote/The droghte of March hath perced to the roote,/And bathed every veyne in swich licour/Of which vertu engendred is the flour"
and the 1000-ish-year-old
"Hwæt! We Gardena/in geardagum,//þeodcyninga,/þrym gefrunon,//hu ða æþelingas/ellen fremedon.//Oft Scyld Scefing/sceaþena þreatum,//monegum mægþum,/meodosetla ofteah,//egsode eorlas./Syððan ærest wearð//feasceaft funden,/he þæs frofre gebad,//weox under wolcnum,/weorðmyndum þah,//oðþæt him æghwylc/þara ymbsittendra//ofer hronrade/hyran scolde,//gomban gyldan./Þæt wæs god cyning!"
(in which only the meanings of "we", "in", "oft", "he", "under", and "him" have survived unchanged).
--Jerzyt 09:17, 10 June 2013 (UTC)
  The question for purposes of the accompanying article is not translation of "meta" from Greek (of any period), but rather the meaning of "meta" in English. While i am not a specialist, i can tell you that the English meaning of "meta" is heavily influenced by the human practice of handling the simple stuff first, and then tackling the harder (more abstract) when the simple is out of the way. The ancient compilers of Aristotle's works either
§ shelved his τὰ μετὰ τὰ φυσικά (ta meta ta fysika; literally, "the [writings] after his Physics"), next further from the library entrance than Physics, or
§ used "meta" in this case with sense of "beyond, transcending", bcz the work in question focuses largely on
"being qua being", or being understood as being[, and] examines what can be asserted about anything that exists just because of its existence and not because of any special qualities it has.
  In any case, many words, perhaps most, are individually rich in metaphorical power, and it would be a bit surprising if "data after data", "data beyond data", "data upon data", and "data about data" did not have meanings more similar to one another than the meanings of "after", "beyond", "upon", and "about" are to one another.
--Jerzyt 09:17, 10 June 2013 (UTC)
 

What is the grammatical number of "metadata"?

Another issue we should consider is if metadata is plural or singular. Data is plural for datum (latin). So my opinion is that the word metadata should be plural as well. Information is singular for different reasons. Metadata are different from information. Share your thoughts on this.— Preceding unsigned comment added by 193.192.38.242 (talk) 09:48, 3 November 2006‎

Although I agree with your reference to the Latin, I believe that the common use is based on considering data as the aggregate, i.e. a shorthand for set of data. I don't feel that strongly which way the article goes but I would like to see some consistency. The second sentence states "Metadata are...". The third sentence states "In library science metadata is..."
151.118.160.214 20:59, 4 December 2006 (UTC)Hoyt L. Kesterson II
I concur with the person above and am editing to remove the reference to the singular in the first sentence. There are almost no references to metadatum as a singular for metadata online (only about 4,000 hits in Google for metadatum). Also Dictionary.com does not list a singular for metadatum. I think it is misleading to indicate a singular for a word that in English is its own singular. 152.120.255.250 (talk) 18:18, 18 June 2008 (UTC)

Besides "after," "meta" can mean "next to" and "beyond," e.g., the word metastasis, a deposit of a cancer from cells that have migrated from a primary tumor elsewhere in the body, is composed (both components from the Greek) of "meta," beyond, and "stasis," the [original] location. "Metadata" can be thought of as "next to the data," e.g., the phone numbers from conversations of millions of people collected by the National Security Agency are "next to" the actual conversations. Stubbycat (talk) 18:41, 17 January 2014 (UTC)

criticism of the article in the media

The January 12th issue of Computable contains a column by Rick van der Lans about the meta-data article, which criticizes this article:

  • the author didn't know the terms 'back room' and 'front room' metadata, and couldn't find anyone who did at a conference devoted to meta data (apparently he didn't Google it); the article lacks an explanation. The only online source I can find to attribute an explanation to is in Dutch, so maybe someone can find an English source and expand on this section?
  • "a reason given [for the drawback that metadata is 'too complex'] is that users don't create metadata because existing formats, MPEG-7 in particular, are too complex. Pardon me?" I have no idea how to place the author's surprise at this statement. Maybe he has never met anyone who's opined that users don't add meta data (though I'd be hard pressed to find a Word document with any of the semantic metadata fields filled). The criticisms listed on the page aren't sourced other than "some critics say", though; that could be fixed.
  • the author calls for 'some one' to 'fix' the article, but apparently hasn't done so himself, nor added a section on this talk page (the article's metadata), which I personally find somewhat ironic.
  • the author's byline describes him as "specialized in software development, datawarehousing and internet"

(The quotes attributed to Rick van der Lans are my crappy translations)

85.144.113.76 10:53, 13 January 2007 (UTC)

   At a glance, i think that in the preceding contrib, "the author" referred to is Rick van der Lans, a columnist for a periodical named Computable (and apparently not recognized on our Computable page), apparently published in a foreign language (with the surname, and perhaps our colleague's knowledge of Dutch, suggesting that language as a leading possibility).
   And that "the meta-data article" and "this article" both refer to our page Metadata, tho i've made no effort to check that.
   Similarly, i think "criticisms listed on the page" are negative assessments of metadata that our article attributes only to unspecified critics of metadata usage whom our article evokes (and not to critics of, or cited by, Van der Lans).
   YMMV, including the possibility that you are more interested in offering evidence and perhaps less in perceiving and pointing out the possibility of ambiguity, than i!
--Jerzyt 06:13, 11 June 2013 (UTC)

Where this page is now...

This page seems to me to be suffering, relatively speaking, but I'd like other inputs before trying to improve the whole (especially because I'm a wikipedia newbie). I see a lot of detail that takes away from the whole (partly just because it's detailed, and partly because it's not consistently written and presented). For example:

  • Definitions are sprinkled throughout the document in support of descriptions of particular domains of metadata. Some of these definitions contradict, duplicate, or support definitions in the Definitions section. Proposal: All definitions in the Definition section.
  • There's a section for "Types of metadata" and a section for "Types". Both of them list types of metadata, though the first is more categorization and the second is more application domains.
    • Re the first: There are a lot of ways people subcategorize metadata [example] -- I'm willing to go there, but should we be trying to be comprehensive? Proposal: Only list those that are referenced to a paper.
    • Re the second: We could go on listing metadata applications until the cows come home, but will that help people understand what metadata is? Is there a criterion that can be applied to help figure out if a particular application adds value to the article? A lot of this information is good, but some of it is noticeably weaker or secondary.

Many other comments, but mostly more minor. I'd appreciate seeing feedback on whether others see the same issues, and what the best way for me to contribute to addressing them is (fell swoop or piecemeal or...?).

--Metajohng 21:34, 6 February 2007 (UTC)

Metadata trademark usage

Note: Metadata is not a trademark in France, nor in many other countries where the term is used. Please, Mr. Metadata Company, do not internationalize yourself wherever "metadata" are used in Wikipedia pages, which are not US pages, but international pages. Thank you. Jeansoulin 15:33, 7 February 2007 (UTC) University Marne La Vallee, France.


Two new sections - What is "Metadata" and "Levels"

I agree with the previous comment that the article is a bit disjointed. Much of the content is okay in isolation but as a whole it tends to confuse rather than enlighten. To address this, I have rewritten the Introduction and added two new explanatory sections to get the key concepts across. Following this it is useful to talk about Definitions, Types, Uses, Issues etc. but the current content could usefully be revised to remove duplication and make it less disjointed. I might have a go at this later.

Pete S


The difference between data and information has no practical use?

"As for most people the difference between data and information is merely a philosophical one of no relevance in practical use, other definitions are:"

I believe this phrase is at least short sighted and should be rephrased. There is huge difference between information content and data and this has incredible impact on practical applications, probably the most common and practical being compression. Dpser 10:14, 14 March 2007 (UTC)

It may be of small difference to most people but is significant in specific fields. The difference is comparable perhaps to the difference between units of measurement. Consider, an American (US) travelling at 50 is travelling far faster than an Englishman assuming each is travelling that value in his own country. Data is on the level of facts and probably meaningless without context e.g. 50. Without the context of a speed sign (& a location) it's meaningless. Information is context plus data i.e. a speed sign with a 50 on it means a limit of 50kph, assuming you're not in the US etc. Knowledge, or possibly wisdom, means knowing you shouldn't travel faster than the 50K limit and the consequences if you do. I'm a little rusty so if someone can provide a more official definition or a reference that could be useful. 203.25.1.208 (talk) —Preceding undated comment added 03:48, 12 March 2010 (UTC).

ZIP Code: new example please

A data definition such as "ZIP Code" in the cited text is hardly a useful (first!) example of metadata. What is basically a column name is either edge-case metadata or arguably not metadata at all. Metadata does not exist to give data meaning; rather, it exists to describe data. Meaning is derived: from metadata, context, presentation, personal bias... whatever.

Metadata is easier to explain in unstructured data contexts. An article titled "Solar Power Generation Today" might be assigned the metadata "alternative energy," with a metadata categorization of, say, "subject." On the other hand, if "12345" must be chosen as our example, then a nice made-up metadata might be "Processing Center assigned 1987" or something like that.

I'd suggest the first of those two. And, in any case, the ZIP Code example should be archived. (Status:Deprecated) ;-)


  • Example: "12345" is data, and with no additional context is meaningless. When "12345" is given a meaningful name (metadata) of "ZIP code", one can understand (at least in the United States, and further placing "ZIP code" within the context of a postal address) that "12345" refers to the General Electric plant in Schenectady, New York.


67.149.104.192 02:14, 18 February 2007 (UTC)


Please to be more explicit as to why/how 12345/Zip Code (I'm happy with Aus Postal code example too) is a bad example? If it's no good, please to provide a suitable substitute. I was attempting to offer a beginners example from the systems perspective. Your "Solar Power Generation Today" comment is from the world of publishing (I think). As far as I know, neither the systems domain, nor the publishing domain can claim exclusive ownership of metadata... it rather depends on one's perspective/context/experience. DEddy 02:46, 18 February 2007 (UTC)

I think the Zip Code example is perfectly good. The name and definition associated with a data element are the most important items of metadata. RayGates 21:54, 18 February 2007 (UTC)

Ray - Thanks for the vote of support for "my" simple definition/example of metadata. I'm sure I copied this example from someone else 10-15 years ago, & I've yet to see anything that comes remotely close to giving a view into how metadata fits into helping represent the real world via data. Total agreement that a good name goes a long way to resolving lots of metadata ambiguity. DEddy 00:52, 19 February 2007 (UTC)

Not sure why we are making this personal and defensive. It wasn't meant that way. I disagree with the example, I feel that it is not terribly illustrative (since nearly everyone already knows about data element names), and I disagree that there is any domain-specific definition of "metadata." I suggested some alternatives. None of these are personal attacks. 66.93.3.210 20:24, 23 February 2007 (UTC)


With all due respect, I too feel that the Zip Code example is inappropriate. This is because "zip code" tells us what 12345 means. It completes the statement "12345 is a ________", effectively turning a string of numbers (data) into information. This is almost the definition of that object within a specific context. Metadata should provide peripheral information about an object - information that is (generally) not critical to the existence/interpretation of the object.

Also, I'm confused as to why we need another example when we can stay with and explore the very good example provided earlier on in the article - the digital camera JPEG. The JPEG metadata stores the timestamp, shutter speed and aperture among other things. It does not try to store the fact that "This is a picture file". We derive that fact from the structure of the file or the extension of the file-name, both of which contribute to defining the context within which we look at the file. Both are also critical to the existence of the file. If there was absolutely no way of knowing what type of file it was, the only thing one can do with that object is destroy it. Ulric 16:43, 20 March 2007 (UTC)
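
As an illustration of the point about deriving "this is a picture file" from the file's structure or its name rather than from stored metadata, here is a minimal Python sketch. The helper names and the sample bytes are hypothetical; the only fact relied on is that a JPEG stream begins with the bytes FF D8 FF.

```python
# Hypothetical helpers: infer "this is a JPEG" from structure or file name,
# not from any descriptive metadata field stored inside the file.
JPEG_MAGIC = b"\xff\xd8\xff"  # start-of-image marker FF D8, followed by FF


def looks_like_jpeg(data: bytes) -> bool:
    """Check the leading bytes (the file's structure)."""
    return data.startswith(JPEG_MAGIC)


def has_jpeg_extension(filename: str) -> bool:
    """Check the file-name extension (the other contextual clue)."""
    return filename.lower().endswith((".jpg", ".jpeg"))


# Made-up sample data for illustration only; not a complete image.
sample = JPEG_MAGIC + b"\xe1" + b"\x00" * 16

print(looks_like_jpeg(sample))
print(has_jpeg_extension("IMG_0001.JPG"))
```

Neither check reads the timestamp, shutter speed or aperture; those would sit in the descriptive metadata that the comment distinguishes from the file type itself.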

Field name is not metadata IMHO

The reason why I think the ZIP code is a bad example the way it is now is the heart of the sentence "12345" is data, and with no additional context is meaningless. I am only a computer engineer, not a computer science theorist, but to me data has meaning; if it does not have meaning, then I call it a string of characters or bits or I call it garbage, or a cryptographic challenge (as you might try to decode meaning just by looking at the shape of data.) There's no difference between 12345 and ABCDE if I don't have any clue about what it means. A field name is what turns a string into data. Clearly, for me the field name isn't metadata, but what qualifies the string as data. Your mileage may vary. So, data has meaning, and metadata extends what I know about the data and may help me work it better. For instance, you may keep the example and make it less US-centric by saying that the Postal Code field may have the value 12345, and metadata about it would be that it refers to a USA ZIP code. So we already had a meaning for the field (it's a postal code), but we're extending it by knowing it's not a French postcode (which it could be). We can work with it better, because we won't try to validate it against a Portuguese postcode mask (9999-999), or Australian (AAA 9999), South African (9999), whatever. – Tintazul msg 10:00, 22 May 2008 (UTC)
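
The postcode-mask idea in the comment above can be sketched roughly as follows. This is only an illustration in Python: the dictionary, the country codes and the function name are assumptions made for the example, and the masks cover just the US five-digit ZIP plus the Portuguese (9999-999) and South African (9999) patterns quoted in the comment.

```python
import re

# Assumed, simplified masks keyed by a country code that plays the role of
# metadata about the Postal Code field. Real postal rules are more involved.
POSTCODE_MASKS = {
    "US": re.compile(r"\d{5}"),        # five-digit ZIP code
    "PT": re.compile(r"\d{4}-\d{3}"),  # 9999-999
    "ZA": re.compile(r"\d{4}"),        # 9999
}


def is_valid_postcode(value: str, country: str) -> bool:
    """Validate a postal code against the mask chosen by its country metadata."""
    mask = POSTCODE_MASKS.get(country)
    return bool(mask and mask.fullmatch(value))


# The same data value passes or fails depending on the metadata attached to it.
print(is_valid_postcode("12345", "US"))
print(is_valid_postcode("12345", "PT"))
```

The field name ("Postal Code") says what the value is; the country tag is the extra knowledge that lets a program pick the right mask.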

I agree zip code is a bad example. 12345 is data, zip code is a field name. 12345 can only be metadata when applied to an external object e.g. the zip code attached to an image of a house. 12345 in this case describes the house's location. Not quite longitude or latitude but headed that way. Given that I can't find the zip example in the article I'm guessing it's been cleaned out anyhow. Where we differ is, as per my comment above, data is meaningless; information is not. A field name should help turn data into information but a field may already be obviously information e.g. a book synopsis or a real estate style spiel. Metadata doesn't extend information; it compresses it, tries to summarise something or provide details not otherwise provided e.g. background of an image on your screen, synopsis of a book in a catalogue record. 203.25.1.208 (talk) 03:57, 12 March 2010 (UTC)
In this attempt to describe ONE form of metadata (I doubt if there is any sort of universal form of metadata) 12345 is NOT metadata, it is data. The metadata is the name or label—ZipCode—that gives the 12345 meaningful context. Without the context of the name how would one know what 12345 meant? DEddy (talk) 21:53, 24 August 2015 (UTC)
if I had two fields (columns, data elements, attributes, etc), say Column A & Column B... both contain the identical data value of "12345". How do you know what the meaning of "12345" is? If I have column/attribute 1 is labeled (named) ZipCode and column/attribute 2 is labeled AnnualSalary... now simply by applying useful labels (column A & column B are NOT very useful)—metadata—about the data, the hitherto meaningless data takes on useful meaning. DEddy (talk) 15:40, 17 May 2010 (UTC)

I agree column name and description are metadata. They are extra information that is used to understand the data stored in the column. I disagree with the view that metadata doesn't extend information. For example, a column of data might contain date & time. The column name or definition could tell you whether the date/time is UTC or local. Tonyjkent (talk) 22:37, 24 July 2014 (UTC)
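The two examples in this thread (identical values under different column labels, and a date/time column whose definition says UTC or local) can be sketched roughly as follows. The dictionary layout and the names `interpret` and `to_utc` are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical column metadata: a useful label plus the type to read values as.
columns = {
    "A": {"name": "ZipCode", "type": str},
    "B": {"name": "AnnualSalary", "type": int},
}
row = {"A": "12345", "B": "12345"}  # identical raw values in both columns

def interpret(row, columns):
    """Relabel and convert raw values using the column metadata."""
    return {meta["name"]: meta["type"](row[key]) for key, meta in columns.items()}

def to_utc(naive, utc_offset_hours):
    """Apply the offset recorded in the column definition, then convert to UTC."""
    local = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)

print(interpret(row, columns))
print(to_utc(datetime(2014, 7, 24, 22, 37), -5))
```

Without the `columns` table the two values are indistinguishable, and without the recorded offset the timestamp cannot be placed on a common timeline.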

Mild cleanup of talk page

Before adding my own comments, I moved the zip code discussion to the end of the document (consistent with Wikipedia guidance for Talk pages) and made a few other tweaks, hopefully considered minor. Apologies if anyone is vexed. --Metajohng 18:58, 6 April 2007 (UTC)

Quick Link Purge

I removed a couple of links that were pointing directly to a specific metadata removal tool from the "Document Metadata" category, changed around some of the wording and added a link to E-Discovery. Not sure what you guys think of that, but it seemed like the iScrub links were blatant advertising. I'm not sure if the document-metadata.com links should be removed too. Any thoughts? --TheDude813 19:55, 2 July 2007 (UTC)

Did you know...

..that The European Library has a handbook and it gives open access to the Metadata Registry it is developing?

greetings, 82.156.209.165 21:28, 1 August 2007 (UTC)

Need for section on html

It would be very helpful if someone knowledgeable added a section on html metadata. I was very surprised that a quick scan of the article seemed to find no reference to web searching. Soler97 (talk) 22:45, 31 December 2007 (UTC)

Poor wording in opening paragraph

"metadata about a title would typically include a description of the content..." - "title" can be a piece of metadata, so using the word "title" here instead of "book" or even "information object (such as a book)" is needlessly confusing. —Preceding unsigned comment added by 131.216.164.187 (talk) 18:49, 14 January 2008 (UTC)

Agreed. Even the topic of a camera's metadata is misleading or wrong, as the examples are really talking about the actual photograph's metadata. The camera's metadata would be items such as its dimensions, weight, manufacturer, et cetera.
Also, the old, popular "data about data" tag-line is rather lame, because there is also metadata about processes, motivations, etc... In a computing context, I've heard metadata better described as "information resource data". I think that could be attributed to a lecture by Larry English, an Information Quality consultant. He also drew some challenge to the word "about" in this context, as the Latin for meta is purported to mean "along side of". —Preceding unsigned comment added by 69.140.220.35 (talk) 22:42, 8 February 2008 (UTC)
Considering the camera comment above - the examples in the article don't even describe the photograph's metadata, but rather the file's metadata. This associated metadata is just normal attribute data from the viewpoint of the "photograph"? But through the article, the same metadata is referred to as if it is associated with the "photograph" the "jpeg" file and the "camera"!!
The article is slightly clearer regarding audio metadata, since it refers only to "content" or "file": obviously track information (artist, title, duration, etc.) is metadata for the .wav or .mp3 data file but is not metadata for the song (it's normal direct attribute data for the song). I think the whole article needs to be restructured, clarified and simplified.
81.144.152.8 (talk) 11:37, 31 March 2009 (UTC)

Can metadata have a useful definition?

The usage of the term recommended by this article is so broad as to render the term 'metadata' useless. When it started gaining traction in the 70s, the meaning was confined to data that constrained or organized attributes of entities (E. F. Codd coined the term relational database in 1970, so there was no practical implementation of a relational schema back then). Saying that a library card contains metadata seems highly suspect as librarians have a terminology for the descriptors of their books that was developed long before 'metadata' was coined.

It's disappointing to see that this article has grown to ensconce the prolific use of 'metadata' to include annotations of photographs. It's Jack Meyers' word, though his company doesn't seem to have the chutzpah to follow through on threats of trademark infringement suits. If we're appropriating someone's invention, it seems it should be because there is not a better (or perhaps adequate) term available.

The deep need was (is) for a term that denotes the information that must be present in order to have a definition of the properties of some entity. In the case of a library system, metadata stored would include the string 'title' to name the property of a book by which we call it, the string 'call number' to name the property used to locate the book, 'author' to name the person credited with the book, etc. The point is that the call number is not an example of metadata. The term becomes useless if one can choose some particular entity and then decide that some of its properties are data and some are metadata. The author is a property of a book. Of course, to be useful in a computing system, metadata must include more than just the name of a property: the metadata should encompass the domain and range of the property (characters with a maximum length of 100) or more elaborate datatype representations if appropriate.
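As a rough illustration of metadata in this narrower sense (property definitions rather than property values), here is a small sketch. The class `PropertyDef`, the function `violations`, and the 20-character limit on call numbers are assumptions for illustration; only the property names and the 100-character limit come from the comment above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PropertyDef:
    """Metadata in the narrow sense: describes a property, not its value."""
    name: str
    datatype: type
    max_length: Optional[int] = None

BOOK_PROPERTIES = [
    PropertyDef("title", str, max_length=100),
    PropertyDef("call number", str, max_length=20),  # assumed limit
    PropertyDef("author", str, max_length=100),
]

def violations(record, properties):
    """List the ways a record fails to conform to the property definitions."""
    problems = []
    for prop in properties:
        value = record.get(prop.name)
        if not isinstance(value, prop.datatype):
            problems.append(f"{prop.name}: expected {prop.datatype.__name__}")
        elif prop.max_length is not None and len(value) > prop.max_length:
            problems.append(f"{prop.name}: longer than {prop.max_length} characters")
    return problems

book = {"title": "Moby-Dick", "call number": "PS2384 .M6", "author": "Herman Melville"}
print(violations(book, BOOK_PROPERTIES))  # []
```

On this reading, `BOOK_PROPERTIES` is the metadata, while the contents of `book` (including its call number) are ordinary data.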

As mentioned in the article, data and document schemata constitute metadata stores as do ontologies, and this is the area in which the term metadata fills a real need. When it's claimed that the term should be applied to things like document properties that are stored and changed with the file (as in the section of 'Important Issues'), it loses value. The application creates property fields in the file for its own purposes (or for no good reason in the case of MS Word); why should those properties be called metadata? Those properties may have little to do with the 'real' content of the file or may be an innate part of the content and in either case can often be removed without any effect on the usefulness of the file.

In the section of 'Important Issues' we find another interesting example regarding digital photography: the need to include interesting properties along with the image data. The term 'metadata' would be most useful in identifying the data needed in order to recognize and separate (parse) or locate the various properties of the image in the image file. The exposure settings, date and time of the event are attributes that may be attached to the image in a number of ways - why should we apply the term metadata to these attributes? The terms 'properties', 'descriptors' and 'attributes' can be applied more meaningfully. On the other hand, the data needed to find these properties lacks a term if 'metadata' means the properties themselves. In the case of JPEG image files, the metadata includes the JFIF format rules and possibly the information about the location and format of descriptors such as the date, etc.

I think it's not too late to hold the line. The Metadata Company registered the trademark 'metadata' in 1986. Maybe the Wikipedia community can help it retain some meaning so that it doesn't become just another synonym for 'property', 'descriptor' or 'attribute'. JWBito (talk) 06:51, 17 May 2008 (UTC)

Disagree on several fronts. Metadata is not any one person's word, even if someone originated it; it belongs to all the users. It is used pervasively to mean 'data about data'. Since documents are data, the properties that describe them perfectly fit that definition. We should apply the term metadata to attributes like date, time, and exposure settings because they are, in fact, data about data. That line was crossed a long time ago.
But I do agree on one point. The term 'metadata' is not actually useful in any broad sense. A discussion about which things in a data system are metadata, and which are data, is never helpful for designing the system. In such discussions, people are using 'metadata' to stand in for other concepts, like "things that don't change often", or "things that the user enters." The term metadata won't mean anything by itself; only when the real concept of the discussion is understood can the system be properly designed. metaJohnG (talk) 06:26, 16 August 2009 (UTC)
This comment is revealing because it illustrates something that the article doesn't really bring out: the term "metadata" is used quite differently by database people and library people, and neither community seems familiar with the other usage. Database people use "metadata" to mean, primarily, a schema describing and constraining the types of data to be held in a database. Library people use the word to mean annotations attached to instances (documents, articles) to give information about a document that is not necessarily available in its content: for example, accession date, provenance, reliability, condition, shelf location, you name it. Neither usage is right or wrong. Mhkay (talk) 11:20, 21 July 2010 (UTC)
the term "metadata" is used quite differently by database people and library people, and neither community seems familiar with the other usage. Database people use "metadata" to mean, primarily, a schema describing and constraining the types of data to be held in a database. Strong agreement on database & library people using metadata with different meanings & each set of people think they have THE correct definition. In the IT field, metadata also has the meaning for relationships of things... e.g. a list of the programs in a system; a list of the schema definitions used in a system; and many other such lists of related things. DEddy (talk) 12:50, 3 August 2010 (UTC)
There are of course hybrids: librarians responsible for a database, e.g. an Image Library. Of course such people usually fall within the IT or library paradigms. In this instance the metadata is the contents within the metadata fields - description, file size, keywords etc. How the divide can be bridged I'm not sure. A colleague providing a glossary for a metadata policy was referred to another glossary. I looked at what I believe to be it and eek! The definitions are clearly absolutely invalid - for our context. The definitions given would leave a person totally bewildered as the concepts don't sync, e.g. records - transaction/activity tracking + assets - resources expected to provide economic benefits. Um, no: assets are the pictures etc. and they won't earn money. A record is an instance of metadata associated with an asset. 203.25.1.208 (talk) 07:23, 3 August 2010 (UTC)

Documentation metadata library

Just one question: whoever put in Documentation Metadata Library, are you referring to DLL? --Ramu50 (talk) 23:03, 13 July 2008 (UTC)

Criticisms

The 'criticisms' section links to a piece by Shirky, ostensibly discussing why metadata is "too expensive, time-consuming, subjective" and context-dependent. But in fact Shirky's piece argues almost the opposite: supporting metadata-based folksonomies over rigid classification systems.

I think the article is worthwhile, but not useful in this section. Also, the criticism section needs an overhaul; it's certainly not written in encyclopedic style.

CRGreathouse (t | c) 17:02, 12 September 2008 (UTC)

This article needs expert attention

I've flagged it for expert attention because I believe that it needs to be tightened up, not that I think it is a poor article.

I think the style needs to be brought out of original research phrasing and into line with an article that is an encyclopaedic report. If this were just a matter of copyediting I could do that myself, but there is a real risk of losing content if a non expert handles this. Fiddle Faddle (talk) 09:49, 27 November 2008 (UTC)

Poorly named / Too technical

This article is mostly concerned with highly technical IT industry metadata. It should be renamed, a general metadata category created, and related articles on metadata created to cover broad areas.

Sections 8 through 8.12 are heavily skewed to the IT industry.

StephenDeGabrielle (talk) 21:14, 2 January 2009 (UTC)

Agree with the complaint about "Information Technology and Software Engineering metadata" but not (quite) the solution.
The overall article is skewed toward computers/IT because metadata is fundamentally an IT-driven concept (notwithstanding the 10000 years of library applications that illustrate it). So I think that general focus is appropriate and the name is correct.
But everything inside this IT/SE section is specialist information that is too specific for a general-interest article about metadata. Can we spin this into a separate topic? (And if so, what do you call that topic?) Or do we really need to delete it, even though it has some potentially useful content? metaJohnG (talk) 00:29, 16 August 2009 (UTC)
I agree that some of the more technical information should be removed. It will show in the history and can be retrieved later if need be. 124.171.193.25 (talk) 12:08, 15 December 2009 (UTC)

It does not only need attention, it is a complete catastrophe from beginning to end. I will return and delete it in its entirety, because it fails to correctly define what metadata is. Metadata is one of two basic components of all information systems. Metadata have three functions: indexing, description and process support. Metadata is what makes a system a system, they create order and make efficient management of data possible. Metadata are to a system what an engine is to a car. They are embedded in and defined by the system requirements and system architecture. In other words, there is metadata in every system, which makes it unnecessary to list different examples of metadata. In such usages, documentation and standards are conflated with the more general and correct concept of metadata. And, professor Bo Sundgren at Statistics Sweden coined the term, and attempts to copyright it have failed after legal action from him, so in that case this article is simply a lie. 79.136.76.102 (talk) 18:51, 16 January 2010 (UTC)

This is the first I have heard of Bo Sundgren but there seems to be a lot of information out there about his contributions to metadata. I would be interested to read some more about the legal action around copyright of the term if you have any references. I revised this article based on older versions of the article which pointed to the Metadata company but agree with you that this needs to be changed. SallyRenee (talk) 09:18, 4 March 2010 (UTC)

Minor tweak, Metadata is not metadata are. It's a plural not singular concept. And the article could use a rewrite. It's too IT and not enough IM (Info. Management) equivalent information specialists. 203.25.1.208 (talk) —Preceding undated comment added 04:04, 12 March 2010 (UTC).

Metadata in Television

I'm pretty sure that it is used in Television. TiVo uses metadata when finding shows. The DVB standard has metadata. I'm pretty sure that it is in many other standards too. --Matthew Bauer (talk) 19:38, 13 March 2009 (UTC)

Confirmed and added. metaJohnG (talk) 23:55, 15 August 2009 (UTC)

Metadata confusion

In the Book description it says things like author, publication date, ISBN, etc. are metadata. Why are these things not just regular ol' attributes (data elements, columns, etc)? DEddy (talk) 22:27, 4 June 2009 (UTC)

Simplest explanation is the "sliding window" view of metadata. They _might_ be metadata, it depends on who's looking at it and why. There is no rigid split into "data" and "metadata", nor is this split a characteristic of the properties themselves - instead they're distinguished by the use being made of them at the time, and even this changes through the life of the entity. The data / metadata split is fluid and subjective.
When I read "a book" then the content text is data and ISBN is metadata. When I order the same book on-line, then the ISBN that I search for is instead treated as data and the content is just a block of opaque data that's transported, but not seen as interesting at that particular moment. Andy Dingley (talk) 22:41, 4 June 2009 (UTC)
Andy... WHY would the text of a book be data & the ISBN be considered metadata? I'm having a very difficult time accepting ISBN, author, publishing house, publishing date, etc. as anything but data about the book. DEddy (talk) 14:49, 6 June 2009 (UTC)
If I'm reading the book, I care about the text content. I don't care much about the author or the number of pages. I care even less about the ISBN, publisher or printing house. I certainly don't care about the size and weight. They're just not important to me at that time. In other uses (ordering it online) then the stock code / ISBN is crucial and the shipper also cares about things like size (what package it will fit into) and weight. They don't care about the text at this time though.
Well (and this may be the whole gist of this wrangle about what is/isn't metadata) when I'm reading a book, I care very much about publisher, printing house, date of publication, etc. Stuff that seems to be described here as metadata, & what I would simply call "attributes" ("data elements" in techi-talk). In (digital) reality, ISBN happens to be 13(?) digits long, while the body (text) of the book is simply a single(?) very long text field. I would therefore argue that the text of the book is just a run-of-the mill (but long) attribute of the thingamee, "book." DEddy (talk) 14:58, 17 August 2009 (UTC)
So we have a whole bundle of properties that "belong" to books, and remain the same for "books" no matter what. Within this overall bundle though, there's only a small set that are of interest to particular tasks, depending on the task, and this set shifts around as different tasks do their work on the book entity. We conventionally (if we're making the distinction) call this relevant set "data" and we distinguish the other properties (those that aren't data right this minute) as "metadata".
So that's the simple sliding window view of metadata, which is how it's sometimes thought about in the field, but isn't the most common way. What normally happens is that most things have a "primary use" (reading books) and we think about how data / metadata splits in that scenario, then assume that remains the same every time we talk about a book.
In one sense, you're right and these "metadata properties" aren't "anything but data". Of course they're still data. However it's useful to us to split these into "metadata data" and "data data" sub-groups too. As those terms are clunky we shorten them to just "metadata vs. data", but it doesn't mean strongly that metadata "stops being data". Andy Dingley (talk)
Early use of metadata meant those characteristics that allow one to manage data. Metadata about books include ((Number-of-pages: a positive integer), (ISBN: 10 numeric digits for books published prior to 2007; since then 13 numeric digits, formatted as NNN-N-NN-NNNNNN-N)). In other words, the user of metadata in this sense is concerned about how the book's descriptors may be represented. The usage has migrated so that metadata is a synonym for 'descriptor' (a book's ISBN number and number of pages are descriptors of the book). This leads to the need for a new word to designate the description of the descriptors. JWBito (talk) 05:18, 10 July 2009 (UTC)
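To make the "description of the descriptors" concrete, here is a hedged sketch that encodes the ISBN representation rule as a check. The function name `isbn_kind` is hypothetical. It looks only at length and character set, ignores check-digit arithmetic, and (going slightly beyond the comment) allows a final X in the 10-character form, which real ISBN-10s permit.

```python
import re

def isbn_kind(isbn: str) -> str:
    """Classify an ISBN by representation only (no check-digit validation)."""
    compact = isbn.replace("-", "").replace(" ", "")
    if re.fullmatch(r"\d{9}[\dX]", compact):
        return "ISBN-10"
    if re.fullmatch(r"\d{13}", compact):
        return "ISBN-13"
    return "invalid"

print(isbn_kind("0-306-40615-2"))      # ISBN-10
print(isbn_kind("978-0-12-345678-6"))  # ISBN-13
```

The rule inside `isbn_kind` is metadata in the early sense; the ISBN of any particular book is a descriptor.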
A descriptor is not always metadata -- it is only metadata if it is about data. So if you point to a tree, and I say it is 10 feet tall, I am not giving you metadata. (I am hoping we are all in agreement so far.) On the other extreme, if you point to the entry for Tree in wikipedia and I say that description was last edited by Joe-Bob, that clearly is metadata. The in-between case -- if I have a database of trees on my property, each one with a unique name ("Front yard spruce"), we will probably have to agree to disagree about whether the fields 'tree height', 'tree ID', and 'entry last updated' are descriptors or metadata (I'd say descriptor, metadata, metadata, but it really depends on many things and is not a distinction worth arguing about, in the end). metaJohnG (talk) 23:54, 15 August 2009 (UTC)
I disagree, a descriptor, attribute, label, element, column or property is always metadata as long as they are collected in a structured and uniform syntax. In its broadest sense metadata is 'data about data' or a description of an 'object'. The object may be a tree, a book, a photograph, a digital file or any other physical or abstract 'thing'. Therefore, if you point to a tree and say it is 10 feet tall, you are identifying metadata about the tree. I think the issue here is what you do with the metadata after it has been identified. Typically what we think of as metadata is stored in a computer database (or in a paper 'catalogue' in the pre-computing era). In order to store this metadata in a systematic and standardised manner (so that the 'objects' can be easily located), a vast array of metadata schemas (or data models for database admins) have been developed. Each metadata schema serves a different purpose for a different group of people. Depending on which metadata schema is in use particular metadata becomes relevant. SallyRenee (talk) 23:08, 14 December 2009 (UTC)

Origins of DB Metadata?

Anyone know who did this first? I ask because I suspect my development team may have. Way back when NOMAD was a cutting-edge 4GL database engine running on VM/CMS, I ran a development team that created an enterprise directory management application for centralizing and publishing user and email directory information for some 60 different email sub-systems used in our corporation. Everything from MS Mail, SMTP and X.400 to mainframe and desktop. Nightly we would accept updates and republish directories in all the needed formats. I contend that this application was the inspiration for Active Directory after we showed it to Microsoft but that is another story. And you might wonder where Outlook came from while you're at it. I know where the file extensions .ost and .pst come from.  ;)

Anyway, we were well into the use of this when we realized we were spending too much time redesigning the schema - then we had a collective lightbulb moment. We embarked upon a project to describe the schema of our Enterprise Directory in its own database. If we wanted to redesign the directory, all we did was update the directory schema database that contained the metadata, and our process would go through the steps of transforming the directory into a new master, building the schema according to the updated metadata in the schema db. This automation was a great timesaver and greatly excited the Dun & Bradstreet account management we shared our work with.
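The approach described above, a schema kept as rows in its own database and used to rebuild the directory, can be sketched very loosely with SQLite standing in for NOMAD. The table name `schema_meta`, the column names and the function `build_table` are all hypothetical, and this version simply drops and recreates the table rather than transforming existing data into a new master.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema database: one row of metadata per column of the directory.
conn.execute(
    "CREATE TABLE schema_meta (table_name TEXT, column_name TEXT, column_type TEXT)"
)
conn.executemany(
    "INSERT INTO schema_meta VALUES (?, ?, ?)",
    [("directory", "user_name", "TEXT"), ("directory", "email", "TEXT")],
)

def build_table(conn, table_name):
    """(Re)create a table from the column definitions held in schema_meta."""
    rows = conn.execute(
        "SELECT column_name, column_type FROM schema_meta "
        "WHERE table_name = ? ORDER BY rowid",
        (table_name,),
    ).fetchall()
    cols = ", ".join(f'"{name}" {ctype}' for name, ctype in rows)
    conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    conn.execute(f'CREATE TABLE "{table_name}" ({cols})')

build_table(conn, "directory")

# A redesign is just an update to the metadata, followed by a rebuild.
conn.execute("INSERT INTO schema_meta VALUES ('directory', 'mail_system', 'TEXT')")
build_table(conn, "directory")

columns = [row[1] for row in conn.execute('PRAGMA table_info("directory")')]
print(columns)  # ['user_name', 'email', 'mail_system']
```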

So much so that we were awarded a D&B Innovator's Award at their yearly user conference - well, we were mailed it as our mgmt would not fund the trip. However, I understand that the CEO or VP someone made a big deal of describing our work to the session and presenting it in absentia. I still have it somewhere - little grey marble tablet but the best award I ever received.

So there's a short history story - I'm certain someone will correct me about the origins. And to tell you the truth, we never used the term "metadata". —Preceding unsigned comment added by Xtss33 (talkcontribs) 00:46, 7 August 2009 (UTC)

When was the NOMAD work being done? I worked with a Brit who went through Cincom in the early 1970s. He claimed (and I have every reason to believe him) that he coined the terms "active vs passive." Active meant that programmers HAD to go through the directory/dictionary in order to change schemas. Passive meant such changes were to be performed as a separate documentation step. Naturally there were great debates as to resources consumed & "unnecessary" overhead. Active had its day in the sun, particularly with the success of Cullinet's IDMS/IDD. DEddy (talk) 02:14, 7 August 2009 (UTC)
If you were already using email, then you certainly don't have a first here. Email only hit most corporations in the early/mid 1980s, whereas data dictionaries for holding system/database metadata were really big in the mid-late 1970s, especially in the UK. Mhkay (talk) 11:26, 21 July 2010 (UTC)
MHkay - In its day, the gold standard for data dictionary products was DATAMANAGER by Manager Software Products (MSP) in the UK. It came to the states in about 1973. Its origins were some sort of UK government grant in what I have to assume was the 1960s. Do you have any view into history as to when DATAMANAGER went on the market in the UK & who the first customer was? DEddy (talk) 13:19, 21 July 2010 (UTC)

Status of Edits and A Proposal

I addressed a lot of the individual topics/concerns above in a recent edit, it's been a week and no one's complained (yay).

Now here's the deal: The article is waaaay too long. How would people feel if most of it after Levels went away? I would keep Uses (edited) and Criticisms, get rid of Definitions, Types, Risks, Lifecycle, Storage, Types (again!), all the IT section, Digital Library Metadata (which is actually a list of Types again!), Image metadata, Geospatial metadata, and metametadata.

In most cases I consider the content either redundant, too detailed, or highly arguable. (Many of the characterizations of metadata and metadata practices are at best specific to certain circumstances, and others are just wrong.) Key points from those sections could be integrated with existing sections if necessary.

I think we're all pretty shy about editing each other's work but my sense is we need to bite the bullet and do major surgery. I'll give this comment a few weeks to attract attention (though presumably you're watching this page if you care) and then sometime after that may dig in.

metaJohnG (talk) 00:29, 25 August 2009 (UTC)

Couldn't agree with you more that this piece is far too long & jumbled to be effective. For me a major issue is that different professions (say my systems/software/applications vs books) use the language in conflicting ways. For me author/ISBN/publisher are simply attributes of a book/magazine/publication and are NOT metadata. But what do I know... my experience & interest with metadata is in source code, programs, & portfolios of systems. As has been pointed out... it all depends on the context of one's perspective. There is no absolute (although many would like such simplicity).
I'm also not going to attempt to pull the piece apart to provide for neutral ground to describe the various flavors of "metadata"... pulling the piece apart is probably what needs to be done. As it stands (by my admittedly slanted perspective) there is far too little quality description of systems metadata. Describing stuff like author & creation date of a Word document as metadata is simply something I do not grok. Those are simply attributes (data elements/columns) for a document. But again... that's just my perspective & plenty of people (who've clearly come to the game recently) are happy to think of such attributes as metadata.
Yet another beef... I do not know how long the profession of "document management" has been around, but probably about as long as libraries (e.g. maybe 5,000 - 6,000 years). Last time I looked the definition of "document" did NOT include source code/programs. If someone wants to argue that XML is a document & a FORTRAN program is not a document, please step up to the debating podium. DEddy (talk) 16:37, 25 August 2009 (UTC)

Metadata Registry

There have been robust central (IT systems) metadata "repository" (same as "registry" as far as I know) products around since the early 1970s. The enterprise (as opposed to single/few applications) success rate of these efforts has been about 5% (if one wants to be generous).

Please explain how this "new" metadata registry is going to be any different.

Also... using the analogy of book libraries... the ones I know of all work off a distributed model. The directory (card catalog) of a library consortium appears to be central (e.g. you can search the entire catalog at one swell foop), but the physical collections (librarians speak of "collections" not books) are obviously distributed across multiple locations. Basic reality... when I'm in Location A, I'm most concerned with my collection, not the stuff over in Location B. Virtually merging the index/directory makes it appear to a "customer" that collections A & B are essentially one.

So let us not speak of A metadata registry, but multiple metadata registries. (Always with the caveat of... "if such things actually exist." Plus gently stepping around the thorny issue of will they be managed as an enterprise rather than local resource.) DEddy (talk) 15:10, 1 September 2009 (UTC)

Added a Legal Section

I just added a legal section, which I hope others will expand upon. Thanks. Gautam Discuss 01:33, 14 October 2009 (UTC)


Complete Revision

I have just uploaded a complete revision of the metadata page. Much of the technical stuff has been removed in favour of a holistic approach to the topic. Please feel free to re-add things that I took out in this revision, and if you don't like it at all please revert to the old version, although given that there have been so many comments about what needs changing I would suggest that we update this new version as we go along rather than revert back. What do others think? SallyRenee (talk) 01:07, 7 January 2010 (UTC)

Speaking as a newcomer to the article, I'm afraid that I must say that I consider the version before the above-mentioned change [2] to be very much better than the current one; mainly because the "technical stuff" has been removed, but also because it's now written like an essay or thesis rather than an encyclopaedia article. The most obvious departure from our usual style is the use of questions for section headings, and the consequent frequent use of "Metadata" in the headings, which is in contradiction to the MOS - "Section names should not explicitly refer to the subject of the article." (see WP:HEAD). The second sentence of the lead ("Metadata is an emerging practice") is meaningless ("The use of metadata" would be better), and the lead as a whole doesn't tell the reader what metadata _is_. There's virtually no discussion of the actual types of metadata found in computer files in the real world; the reader might come away with the idea that it's some arcane metaphysical system used by librarians, rather than an integral part of their digital photos and web pages. I would recommend a wholesale reversion to the December 23rd version, and that that version be used as the basis of subsequent development. Tevildo (talk) 20:57, 28 February 2010 (UTC)

Thanks for the feedback. I've updated the headings and changed the second sentence of the lead as per your suggestions. As for the technical information the article is now lacking, if you read the rest of the discussion page you will find that there has already been discussion about whether the very specific technical information belongs in a 'general-interest' encyclopaedia article. SallyRenee (talk) 09:23, 4 March 2010 (UTC)

This is a 10 year old discipline?

Most of the information presented leaves the impression that metadata (whatever it is) is maybe 10 years old, primarily has relevance to things like library books, photo images & HTML pages, and has no/minimal relevance to information systems. Seems pretty lame. DEddy (talk) 00:47, 17 March 2010 (UTC)

Actually, the concept is thousands of years old; the term, however, is much younger. Obviously it depends on the field you're in, though. This article is still heavily IT-focused, and IT is a new field, whereas Information Science, Information Management, etc. are older fields and have used the concept, if not the term, for a long time. 203.25.1.208 (talk) 03:17, 18 March 2010 (UTC)

Libraries section

I don't want to be overly critical but the section about libraries here is inaccurate. DDC is not about small 5x3" cards and is not an alphanumeric system. I would change it myself but am unfamiliar with protocol here. —Preceding unsigned comment added by Therestlesskaiser (talkcontribs) 00:06, 3 October 2010 (UTC)

Thank you for your suggestion. When you believe an article needs improvement, please feel free to make those changes. Wikipedia is a wiki, so anyone can edit almost any article by simply following the edit this page link at the top. The Wikipedia community encourages you to be bold in updating pages. Don't worry too much about making honest mistakes—they're likely to be found and corrected quickly. If you're not sure how editing works, check out how to edit a page, or use the sandbox to try out your editing skills. New contributors are always welcome. You don't even need to log in (although there are many reasons why you might want to). --M4gnum0n (talk) 09:14, 3 October 2010 (UTC)

There are a lot of things to fix

Reference to IT: "Metadata" is an IT term - it does not clarify anything to say that the concept existed for 10000 years. The term has arisen out of the need to formally describe digital resources. Bill Inmon, et al, in the book "Business Metadata" provides a history of metadata, and it is clear there that the context is always IT. He sums up with: "Metadata is very valuable to the business and helps facilitate proper understanding of the enterprise data assets. Without this understanding, the data would be relatively useless."

More confusion: The discussion about books and trees highlights the confusion which is also pervasive throughout the IT industry. Metadata started out as being data about data, and then all of a sudden it jumps to be data about objects, and then to be an alias for "description". With this kind of thinking it means that every word is metadata because every word describes something. I doubt that anybody would regard a dictionary as being an organised collection of metadata. If we take a library index as being a collection of metadata then we should regard a department store's catalog as also being a collection of metadata, similarly a telephone book, an organisation's list of personnel, its list of customers, the Oxford English Dictionary, my shopping list, wikipedia itself, news bulletins, and so on. I don't think it has occurred to anybody that any of these things can be regarded as metadata - they are merely lists or registries or catalogs of descriptions.

A better definition: Adrienne Tannenbaum, in the book "Metadata Solutions", leaves it until page 90 before attempting a definition. It begins with:

What a mess! In many cases people cannot agree on the value of information ... and now we are fighting and misinterpreting metadata. One definition that is never questioned or doubted, however, is the following:
Misinterpreted metadata: Data about data.

Adrienne then continues, but first she defines instance data, as data on its own is too generic, and suffers from the "sliding window" problem mentioned above. She says:

Instance data: That which is input into a receiving tool, application, database or simple processing engine.

(I would add "user" to the list of receivers). And then, she says:

If we adopt the philosophy that instance data is that which is input into a variety of receiving buckets, many of them used almost exclusively by application developers or by end-user data analysts, a flexible definition of metadata is:
Metadata: The detailed description of the instance data; the format and characteristics of populated instance data; instances and values dependent on the role of the metadata recipient.

Perhaps, some may find simpler the following definition from Informatica's white paper Manage Your Metadata to Better Manage Your Business:

More than just its standard definition of “data about data,” metadata facilitates the understanding of the characteristics and usage of data. From a technical perspective, metadata is used to help IT organizations better manage and maintain their data assets. From a business perspective, metadata provides context to data, acting as the semantic layer between a company’s IT systems and its business users.

Here are two even better definitions - they are more general, even though they were developed in different subject areas:

From Gartner's Mike Blechar, Mark A. Beyer, Jess Thompson, Anne Lapkin and Nicholas Gall, August 2010 (copy held at the Institute of Metadata Management):
Metadata is information that describes various facets of an information asset to improve its usability throughout its life cycle.
From Andrew Westlake, Association for Survey Computing, 2007 (ASC Metadata Doc):
Metadata is the information that the owner of a resource needs to supply to potential users of that resource, so that they can use it correctly.
In this last one I would probably replace the word "resource" with "digital asset", and then I would need to define that, perhaps along the lines of Adrienne's definition of "instance data". But that is my POV. MetaWorker (talk) 00:09, 3 March 2011 (UTC)

Big gap: Whilst the page provides many examples of usages of metadata, it does little to describe how it works, i.e. the theory behind it. Unfortunately this is again in the IT field, but it cannot be avoided. The biggest obvious gap, perhaps, is the lack of a reference to the OMG's MOF - this is probably the most thorough work available about what makes metadata work. MetaWorker (talk) 00:32, 3 March 2011 (UTC)

Wiki vandalism: I noticed above a comment that the 23 December 2009 page was more informative. I have just looked at it and I concur that a lot of information has been lost. It is not good enough to say that one can easily reinstate it - there have been hundreds of edits and it is not feasible to go through all the edits to see if something significant was dropped. I just hope that editors who remove content do it from a position of understanding the content and are absolutely sure that it is not needed, either because it is irrelevant or because it is covered in some other way. For example, these discussions, which to me seem very relevant even though they are heavily IT oriented, have been dropped: data dictionaries, CMDB, ITIL, OMG. Particularly bad is the removal of most category tags and all tags related to software engineering. This is gross vandalism. Here is a different organization's entry on metadata which has a lot of text copied verbatim from the December 2009 page.

Dead Link: I noticed that in the "Metadata Syntax" section of the article, in the sentence containing "whether for indexing or finding, is endorsed by ISO-25964", the link to the PDF document citing "ISO-25964" is dead. Chrisbear68 (talk) 17:51, 3 January 2012 (UTC)

Metacontent

hello, i have today tried to elucidate the concepts of metadata and metacontent in an attempt to provide some basis for the remediation of this page which is, and has been for the last two years or so, chaotic. please don't just delete my contribution without trying to think it through. thanks. info. otto. imom. bicycle repairman and many other epithets which are mine. —Preceding unsigned comment added by 81.245.193.93 (talk) 20:01, 26 February 2011 (UTC)

This is ridiculous. I have been working in this field for decades and I have never heard the term "metacontent" used this way. You have updated the document by placing the word metacontent in brackets immediately following the word metadata - this gives the impression that metacontent is an alias for metadata. This is clearly your own invention. From the description you gave it seems metacontent is the data stored in a metadata repository. If so, then the description about Apple's work belongs elsewhere. I think you should undo your changes. MetaWorker (talk) 23:30, 2 March 2011 (UTC)
Please correct errors when you see them. This is what Wikipedia is all about. If you know better, please, enlighten us all. As an expert in the field, I hope you will consider giving the article even further attention. Metadata is an important concept in the information age, so this is a relatively important article. As a general term (as opposed to jargon), it should be written with a lay audience in mind. It would be nice to see it be more useful to the average reader. As it stands, there are plenty of specifics but a poor overall structure and a very poor introduction. A person will look here to find out what the heck "metadata" is, and the first sentence they read is "The term Metadata is an ambiguous term which is used for two fundamentally different concepts"?!?! The whole introductory paragraph is about the semantics of the term? How about something useful? I will take a stab at correcting this (that is what wiki is all about) but I certainly second the call for the attention of an expert. __ø(._. ) Patrick("\(.:...:.)/")Fisher 12:42, 26 April 2011 (UTC)

You have asked for an expert, and here i am. please carefully consider your own status. the key to the whole introduction is exactly "semantic" ie. about "meaning" and if you don't think meaning is important, and you cannot see how meaning is useful then i am at a loss. the point is that this page has been a battle ground for years because the term is "ambiguous", it is one term for two different concepts, and as long as this, the most important aspect of the term, is not understood, then the word will never be of use to anyone and the fruitless battles between those who think it is one thing and those who think it is the other will continue endlessly. i would appreciate it if you would focus on things you know something about and let some other more knowledgeable contributor comment. note that on this most volatile of pages the intro has stood unchanged for a surprising length of time. this is the first systematic definition of the meaning of the two concepts represented by the single term that has appeared on this page. the reason the intro starts "The term Metadata is an ambiguous term which is used for two fundamentally different concepts" is because IT IS !!! i do not understand your ?!?! as if there is something surprising or incoherent about pointing out this most important fact. i have no real interest in rewriting the whole page to untangle all the different strands which are hopelessly interwoven but the intro should set the scene for others who can see the point. —Preceding unsigned comment added by 91.183.39.190 (talk) 13:43, 27 April 2011 (UTC)

I'd like to open a dialogue and avoid a revert war while we work toward an introduction that is agreeable to all. Firstly, who are you? Please sign in, and contribute with a name if you want authority based on who you are. Your ad hominem attacks on me don't further your cause. I used the word "semantics" to evoke its connotation of quibbling over details, not the meaning you are using; I clearly care about the meaning of words, as I spent quite a bit of time crafting a readable and clear definition of the term. My desire is for the introduction to be useful to the lay reader. I can understand that there are opposing viewpoints, and I do think that wikipedia should address each viewpoint. My guess is that the intro was left alone because other potential contributors like me had an awful experience with the editors involved in this page and decided they had better things to do. I don't think the solution is to remove contributions wholesale, but rather I hope you will take the time to craft an equally clear explanation of a competing definition and present each in turn. I suggest that the intro start with the elements that all camps agree on, perhaps something like:

Metadata is information about other information or information systems. The precise meaning of the term is controversial. Though a common definition of metadata is that it is "data about data", in technical disciplines the term is used to describe data structures and systems which contain data, rather than the data itself. Thus it may also refer to "data about data structures" or "data about database systems".

Since you care very much about this article, I hope you will take the time to integrate my suggestions into an introduction that is readable, enlightening and accurate. I hope you will restore the specific examples (admittedly related to the "data about data" conception, as this strikes me as more relevant to the lay reader than the more technical definitions) that I took great care to present in a reader-friendly way, while also presenting the alternative usages to your satisfaction. Please try to avoid original research (such as neologisms). The intro you have restored does not seem neutral, either. Don't get me wrong, I agree with you that there are competing definitions, here. I just think the intro should calmly present that fact, rather than forcefully argue for a viewpoint. My addition of a high-level summary of fields in which metadata are prominent was also removed, and I do not understand why. The intro should orient the reader, and I don't believe you disagree with this:

Metadata is a central concern in information science and database design. It is widely used on the World Wide Web for search engine optimization and the Semantic Web.

Why did you just trash everything I added? That's the easy option for you, but it doesn't sit well with me, when it took quite a bit of effort to craft good prose. See this comment? It's long. That's easy to write. The concise intro I wrote took much more effort. __ø(._. ) Patrick("\(.:...:.)/")Fisher 02:17, 8 May 2011 (UTC)

"Metadata types" Section

I don't feel qualified to fix this, but the first sentence of this section seems awkward. Maybe "a manifold" and "where there" might be a better alternative? Or "fields of application:" ? Gcbound (talk) 22:24, 22 May 2011 (UTC)

Metadata - Definition Dilemma

I am new as an author, but I would still like to make a contribution, because the subject is important:
Metadata are defined as "(1) data about (0) data", where (0) is the data which are described and (1) the data which describe. Therefore the (1) and (0) define the role of the data in the definition.
An example: (1) Exif data describe a (0) picture. If you want to store Exif data in a database, you need a description (CREATE TABLE or similar) of the structure of the Exif data. Among these data there is also the ISO speed attribute. Of course we also need a description of the ISO speed attribute (textual description, valid numbers, purpose). So we now have several levels where each level describes the level below, the upper level being the "metadata" and the lower level the "data".
This seems to be an endless recursion, the same problem the OMG was facing when they started to describe UML (the Unified Modeling Language). After lengthy discussions they settled on a 4-level model (the Meta Object Facility, MOF). The same idea of levels of abstraction can also be applied in a more general way to data.
Example: Ms Jane Windfield owns the dog Charly.
Level (0) contains the data about Jane Windfield (Jane;Windfield;F;1985-08-09;Dublin;...), the dog Charly and the ownership. The ownership is especially interesting, because it describes the relationship between Jane and Charly. The data of the relationship: e.g. since date; only or joint owner etc.
Level (1) contains the description of the data: Firstname=String:40;Lastname=String:40;Gender=String:1;..
Level (2) contains the description of the attributes of (1): Gender: The meaning of Gender is ..., Values M/F/D (in France Demoiselle!), Determination of the value.., Representations (M..Male;Männlich;Masculino) etc.
Level (3), the most abstract level, describes generic properties of Entities, Relationships and Attributes.
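As a rough illustration of levels (0) to (2), here is a minimal Python sketch. The field names, lengths and the M/F/D value set come from the example above; the dictionary layout and the `validate` helper are assumptions made only for this sketch, and level (3) is not modeled.

```python
# Level 0: instance data, taken from the example record above.
person = {"Firstname": "Jane", "Lastname": "Windfield", "Gender": "F"}

# Level 1: description of the record structure (type name, maximum length).
structure = {
    "Firstname": ("String", 40),
    "Lastname": ("String", 40),
    "Gender": ("String", 1),
}

# Level 2: description of individual attributes. Only the allowed values
# are modeled here; the layout of this dictionary is an assumption.
attributes = {"Gender": {"values": {"M", "F", "D"}}}


def validate(record, structure, attributes):
    """Check a level-0 record against its level-1 and level-2 descriptions."""
    errors = []
    for field, (type_name, max_length) in structure.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: missing")
            continue
        # Level 1 check: type and length.
        if type_name == "String" and (
            not isinstance(value, str) or len(value) > max_length
        ):
            errors.append(f"{field}: not a string of at most {max_length} characters")
        # Level 2 check: allowed values, where the attribute defines any.
        allowed = attributes.get(field, {}).get("values")
        if allowed is not None and value not in allowed:
            errors.append(f"{field}: value not in allowed set")
    return errors
```

The same dictionary plays a different role depending on the viewpoint: `structure` is metadata with respect to `person`, but it is itself the data that `attributes` describes.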
To resolve the problem we can (and should) stick to the most common definition of metadata as being "data about data".
Then we should introduce the concept of "role" and "level of abstraction".
We define metadata as being "data about data". In this definition we distinguish implicitly between two roles of the data - the role of the data being described and the role of the data which describe. Describing data in a given context are themselves "described data" on a higher level of abstraction. Levels of abstraction are important to reduce complexity.
Metadata are especially important for automatic processing of data:
In the old days programs contained data in the data section. (Basic).
Then the data were removed to file and the programs contained data descriptions.
Then parts of the data descriptions were removed and stored in the catalog of a relational database. (The program contains an SQL command which "knows" about the relationships.)
Model / data driven systems remove even more information about data - e.g. the presentation, the validation rules etc.
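The step where descriptions move into the database catalog can be sketched with Python's built-in `sqlite3` module. The `exif` table and its columns are hypothetical, chosen to echo the Exif example earlier in this comment; this is only one possible illustration, not a claim about how any particular system does it.

```python
import sqlite3

# In-memory database with a hypothetical table for Exif data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exif ("
    "image_id INTEGER PRIMARY KEY, iso_speed INTEGER, camera_model TEXT)"
)
conn.execute("INSERT INTO exif VALUES (1, 400, 'ExampleCam')")


def describe_table(conn, table):
    """Read column names and declared types from the SQLite catalog.

    PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so only pass trusted names.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type) for _cid, name, col_type, *_rest in rows]


columns = describe_table(conn, "exif")               # the description (metadata)
data = conn.execute("SELECT * FROM exif").fetchall()  # the described rows (data)
```

Here the program no longer carries the record layout itself; it asks the catalog for it at run time.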
We are dealing with increasing masses of data: To manage and control them - we need deep structures of Metadata.
This is just a talk page and I would be happy to receive any comments or to elaborate on a subject. I wrote much more on the homepage of my company. I am not sure about the rules - I am prepared of course to disclose the information and I look forward to getting feedback.... Metasafe Metasafe (talk) 16:46, 3 December 2011 (UTC)

Folks,
I'm unsure of the impact this thinking may have on the definition of metadata but it would be helpful to look at other concept definitions that participate in the metadata world as well. The question of whether a thing is data or metadata resides in the relationship between the two. ISO 2382-1 01.01.02 defines data as "A reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing." Data in this respect has a few fundamental qualities, "representation" and "reinterpretation" amongst them. Representation seems pretty straightforward - some physical manifestation. The reinterpretation quality however seems more a mechanism or set of rules that enable the representation to be understood. It is descriptive. To that extent metadata can be considered a reinterpretation mechanism or set of rules that exposes the semantics of a representation. It also constrains the meaning, particularly when used with controlled vocabulary. Much of the reinterpretation mechanism related to representation is inherited from our cultural context, so the rules are internalized. But this is not always the case, particularly where disambiguation is required. I think that the notion can be extended to an explanation of the representation as well, which would provide for metadata in lifecycle management of the representation or transformation from one representation to another (human to machine readable, French to English sentences) or the integration of a number of representations to create progressively enriched information structures. Approaching metadata as the reinterpretative mechanism of an information representation may help with the definition.
Cecil Somerton
198.103.161.1 (talk) 17:46, 1 March 2012 (UTC)

Unclear opening paragraph

I had hoped to use this article to help clarify metadata concepts but the initial sentence and opening paragraph are immediately confounding.

  The term metadata is an ambiguous term which is used for two fundamentally different concepts (types).

Don (talk) 01:39, 2 April 2012 (UTC)

And what did you think after you read the article? DEddy (talk) 02:27, 2 April 2012 (UTC)

I agree. The intro is barely comprehensible, particularly to someone who doesn't know much about the topic. 93.96.236.8 (talk) 13:11, 3 April 2012 (UTC)

If you can stand it, read through the Talk section. You will see Charlie Betz's comments. He knows the topic & is a published author. He took a stab at making things comprehensible, but from what I can see, all his work has disappeared. Yes, Charlie's "Enterprise Metadata" is gone.
The unfortunate fact is that "metadata" (whatever it is) has become "essential buzzword bafflegab"... meaning, when one has a gap in the thought process, say "metadata" to impress people.
Duly note (see above in Talk) that my attempt to offer "Zip Code" (or Postal or Post Code) was rejected as not a good example. DEddy (talk) 15:30, 3 April 2012 (UTC)
Metadata has a definite meaning, whether people misuse it as a buzzword or not. Also, a zip/post code is data, not metadata. — Preceding unsigned comment added by 109.145.165.125 (talk) 00:33, 2 August 2012 (UTC)
Metadata does not have A definite meaning... there are multiple conflicting, ambiguous meanings. DEddy (talk) 19:42, 8 June 2013 (UTC)

I was unclear. "Zip Code" is a metadata label that identifies data 12345. Two parts. The label & the data. A perfectly good metadata label could be M0435 (as soon as you memorize it, it's meaningful). Everything in a computer is labeled even if it's by base+displacement notation. DEddy (talk) 14:17, 27 February 2015 (UTC)
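The "two parts" idea can be sketched in a few lines of Python. `LabeledValue` is a hypothetical type invented for this illustration, not something from a library; the labels and the value 12345 are the ones used in the comment above.

```python
from typing import NamedTuple


class LabeledValue(NamedTuple):
    """A value paired with the label that says what it is."""
    label: str  # the metadata: names the value
    value: str  # the data itself


readable = LabeledValue(label="Zip Code", value="12345")
opaque = LabeledValue(label="M0435", value="12345")

# The stored data is identical; only the label attached to it differs.
same_data = readable.value == opaque.value
same_label = readable.label == opaque.label
```

Whether the label is human-friendly ("Zip Code") or arbitrary ("M0435") does not change the data, only how easily a reader can interpret it.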

First paragraph

The first paragraph refers to "the application", without establishing any context. It's pretty bad form for an encyclopaedia article to talk in terms of implicit context or make assumptions about the reader's awareness of such.

In fact, the first paragraph is rather a mess as a whole. It reads like it's been hacked together by 10 different authors. — Preceding unsigned comment added by 109.145.165.125 (talk) 00:30, 2 August 2012 (UTC)

No mention of the ISO 23081 series of metadata standards?

Seems odd that the ISO 23081 series of metadata standards is not referenced. I also thought the metamap might have been linked to as well.

Stephen — Preceding unsigned comment added by Steffclarke (talkcontribs) 03:42, 3 September 2012 (UTC)

file system metadata

The solid-state drive article mentions "file system metadata", which I assume is referring to file system#Metadata.

Is that something this metadata article should mention in Wikipedia:Summary style, with a link to the details? --DavidCary (talk) 19:51, 3 July 2013 (UTC)

Wikipedia reference in opening section

I added a reference to how Wikipedia uses metadata. I know this may seem to violate WP:SELF, but since it's a common example to readers of the site, I thought it was worth including. Timtempleton (talk) 13:42, 22 May 2014 (UTC)

External links modified

Hello fellow Wikipedians,

I have just added archive links to 2 external links on Metadata. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Cheers.—cyberbot IITalk to my owner:Online 22:48, 24 January 2016 (UTC)

Libraries

Hello to all interested in Metadata!

I'm here as part of a Wikipedia edit-a-thon, entering info on Library and Information Science.

I feel as though there are a few sections in the definitions at the top that could be moved down to Metadata Usage. I've gone ahead and moved the Libraries section down to Library and Information Science.

I'll then be adding to that section specifically. Wish me luck and let me know if you have any comments. — Preceding unsigned comment added by Binnorie (talkcontribs) 20:33, 27 February 2016 (UTC)

Added Some Citations and Removed Warning

I added some citations to the top of the article and I removed an old citation needed warning from 2010 that seemed to have been overtaken by events. There are some reasonable citations in that section (they could be better but they aren't terrible) and so I removed the warning. If anyone has an issue with that - let me know or fix it. Thank you! Alex Jackl (talk) 15:43, 17 June 2016 (UTC)