Wikipedia talk:Persondata/Archive 2

From Wikipedia, the free encyclopedia

The {{Pharaoh Infobox}} template is including {{Persondata}} at the top of pharaoh articles... Mike Dillon 00:46, 13 January 2007 (UTC)[reply]

Yes, we need to resolve this. I asked on the Pharaoh infobox talk page that the Persondata template be removed from the infobox. We need to follow up on this. --Rajah 22:05, 1 April 2007 (UTC)

Template:Persondata edit request

Could a sysop please add the line <!-- Metadata: see [[Wikipedia:Persondata]] --> to the usage section on {{Persondata}}, right after the <pre> tag? This would make it in line with the example given in the Wikipedia:Persondata#Using the template section of this page, and would make it easier for those who don't know about this system to figure it out. Thanks. Picaroon 04:25, 14 January 2007 (UTC)[reply]

Done. Luna Santin 19:39, 15 January 2007 (UTC)[reply]
Thanks. Picaroon 20:40, 15 January 2007 (UTC)[reply]
Adding the comment to the template doesn't actually do anything as the comment is not viewable either in the article view or the editing view. Perhaps it would be useful to add an actual note into the template that is not an HTML comment. Kaldari 23:18, 24 January 2007 (UTC)[reply]
An HTML comment is the only way to handle it. Persondata is not visible; it's just a textual note within the window as to what it is. Ral315 (talk) 00:35, 25 January 2007 (UTC)[reply]
My point is the HTML comment is only useful if it is outside the template, rather than inside. If it's inside the template, you'll never see it since HTML comments in templates are not displayed in editing mode. Thus the recent edit to the template should be reverted. Kaldari 02:52, 25 January 2007 (UTC)[reply]
Compare it yourself: before after
It's useful for copy/paste. Editors who are unfamiliar with {{persondata}} know where to have a look at for more information (because HTML comments _are_ visible in edit mode). --32X 05:07, 25 January 2007 (UTC)[reply]
My mistake. I thought the usage notice had been added to the template itself. Kaldari 18:14, 25 January 2007 (UTC)[reply]

Hi Kaldari. Could you edit the template to link to Template:Persondata/doc, following the template doc page pattern? Mike Dillon 18:46, 25 January 2007 (UTC)[reply]

Where's the actual benefit of it? The doc page contains less information. --32X 22:22, 25 January 2007 (UTC)[reply]
I'm not sure what you're asking, but the benefit over the current situation is that the doc portion will be editable by anyone while keeping the template itself protected. Mike Dillon 22:53, 25 January 2007 (UTC)[reply]
Ok, that's an argument. But wouldn't it be better to set a redirect to Wikipedia:Persondata since that page is all about the template? (If one knows about the template, the short form for copy/pasting is enough; otherwise the introduction is a "must read".) --32X 23:41, 25 January 2007 (UTC)[reply]

Siblings and parents

Can we add siblings and parents as a cat? That way, if the info is removed from the article, at least it will be easily found by those who need it. The info doesn't have to display, but it's a good place to store it. The biography infobox has this information but it displays all answers. This way the info could be hidden and still be available for researchers. Answer at my page please. --Richard Arthur Norton (1958- ) 20:45, 24 January 2007 (UTC)

This seems like a bad idea; persondata has been standardized for the most part. Ral315 (talk) 23:37, 25 January 2007 (UTC)[reply]

hCard microformat

It should be relatively trivial to arrange to have "Persondata" published with hCard microformat mark-up, simply by applying some standard class names to its containing elements. The data could then be extracted by a variety of parsing tools. Please see also Wikipedia:WikiProject Microformats. Andy Mabbett 20:59, 28 January 2007 (UTC)

Anyone interested in a proposed project to link to WorldCat Identities is invited to leave comments or sign up at the project proposal page. WorldCat Identities provides pages for 20 million 'identities' (authors and persons who are the subjects of published titles in WorldCat). Several thousand of these pages provide links to Wikipedia biographical pages: providing links in the other direction would allow readers of Wikipedia biographical articles to move straight to associated library information held in WorldCat libraries. Dsp13 15:17, 20 February 2007 (UTC)[reply]

Template:Birth date and age

For the "Date of Birth" parameter, should we use {{birth date and age}} or should we stay clear of this? --WillMak050389 01:10, 5 March 2007 (UTC)[reply]

I would avoid it. Any application using Persondata is likely to be working with the wiki-text directly, which means it will see {{birth date and age|1967|07|15}} rather than July 15, 1967 (age 39). The idea with Persondata is to make it easier for automatic extraction of data; either of these is yet another format your parser has to handle. In any case the age is more useful to human readers; given the birthdate any program can easily calculate the age. Dr pda 01:39, 5 March 2007 (UTC)[reply]
Thanks, I wasn't sure, but this makes sense. I'll change the ones I've edited. --WillMak050389 01:43, 5 March 2007 (UTC)[reply]
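The point above can be illustrated with a minimal Python sketch. It assumes the template call has exactly the form quoted in the reply ({{birth date and age|YYYY|MM|DD}}); the function names are hypothetical and not part of any existing persondata tool. It shows both the extra parsing step a wikitext consumer would need and how easily the age follows from the birth date.

```python
import re
from datetime import date

# Matches {{birth date and age|YYYY|MM|DD}}; the template name is matched case-insensitively.
BIRTH_TEMPLATE = re.compile(
    r"\{\{\s*birth date and age\s*\|\s*(\d{4})\s*\|\s*(\d{1,2})\s*\|\s*(\d{1,2})[^}]*\}\}",
    re.IGNORECASE,
)


def parse_birth_template(wikitext):
    """Return the birth date found in the template call, or None if absent."""
    match = BIRTH_TEMPLATE.search(wikitext)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def age_on(birth, today):
    """Age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


birth = parse_birth_template("{{birth date and age|1967|07|15}}")
print(birth, age_on(birth, date(2007, 3, 5)))
```

A plain date in the Persondata field avoids the first function entirely, which is the argument made above.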

Half-automatic tagging with persondata-tool

I come from the German Wikipedia. As of January 24, 2007, 126,332 of 133,221 persons (94.8%) are tagged with persondata. A very useful utility is the persondata tagging tool from Apper. It automatically extracts the birth date, birthplace, etc. from the article, and the only thing the user has to do is check that it's correct and then save it. If someone from your project asks him, maybe he will help you with his tools so you can tag your articles much more easily and quickly. Bones 77.180.105.11 22:57, 15 March 2007 (UTC)

I'm actually almost finished writing a script to do a similar thing, although it requires the article to have an Infobox from which the data is then extracted, rather than getting the data from the lead of the article. However there are still around 50 000 articles using one of the top 20 or so people-infoboxes (e.g. {{Infobox Football biography}}, {{Infobox musical artist}}), which is about ten times the current number of articles with persondata.
It is more difficult to extract the information from the text of the article (i.e. without an infobox) compared to the de wiki, since on the en wiki the birth/death places are typically not given in a predictable place, i.e. the opening sentence. Compare the first sentences of de:Alfred Hitchcock and Alfred Hitchcock
  • Sir Alfred Joseph Hitchcock KBE (* 13. August 1899 in Leytonstone; † 29. April 1980 in Los Angeles) war ein Filmregisseur und Filmproduzent britischer Herkunft.
  • Sir Alfred Joseph Hitchcock, KBE (August 13, 1899 – April 29, 1980) was a highly influential film director and producer who pioneered many techniques in the suspense and thriller genres.
Hopefully I will have time this weekend to get the script finished. Dr pda 01:27, 16 March 2007 (UTC)[reply]
OK, I've finished the script now. Instructions for use are at User talk:Dr pda/persondata.js. It also includes a tidied-up version of the javascript above for turning persondata on/off without editing your monobook.css. Sample results of using the script are here.
This is a very nice tool - thanks! However, at present it seems to insert the persondata at the top of the article, rather than before categories. No, sorry, it puts everything in the right place! Dsp13 12:17, 19 March 2007 (UTC)[reply]
Or rather, it puts the persondata in almost (but not quite) the right place whenever there is a defaultsort template introducing the categories - see my query below. Dsp13 21:51, 19 March 2007 (UTC)[reply]
By the way I've also got the extraction from the XML dump more-or-less working by modifying the scripts linked at WP:PDATA#Extraction from the XML dump (the last step is deciding whether to write code to parse the dates which are currently giving errors, or just change the data in the article). I don't have an appropriate place to put the scripts on the web, but if anyone wants a copy email me. Dr pda 01:50, 19 March 2007 (UTC)[reply]
User:SEWilco has left this plea, which sounds reasonable, on my talk-page: 'Please do not have your script call itself "this script". That makes reading and searching edit summaries much more difficult.' Could a simple alteration to the script be made? Dsp13 09:14, 1 April 2007 (UTC)[reply]
I've changed the edit summary; it now reads adding persondata using User:Dr pda/persondata.js. I'm not entirely convinced the previous edit summary was difficult to read (compare 'reverted vandalism using popups', 'renaming category per CFD with AWB' etc); anyone interested in knowing which script would click the link, anyone not interested would just be able to see it was done with a script. As for causing difficulty in searching through edit summaries, there should only be one instance of it in an article's history. Users of the script will need to refresh their monobook.js to pick up the change. Dr pda 12:24, 1 April 2007 (UTC)[reply]
thanks! agree with you, but nice to keep everyone we can happy! Dsp13 13:06, 2 April 2007 (UTC)[reply]
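As an illustration of why the German lead sentence is easier to mine than the English one, here is a minimal Python sketch. It assumes the "(* date in place; † date in place)" convention visible in the de:Alfred Hitchcock example above; the pattern and function name are hypothetical and are not taken from Apper's tool or from persondata.js.

```python
import re

# Pattern for the German lead convention "(* <date> in <place>; † <date> in <place>)".
# The death part is optional so that living persons also match.
LEAD = re.compile(
    r"\(\*\s*(?P<birth_date>[^;)]+?)\s+in\s+(?P<birth_place>[^;)]+?)"
    r"(?:;\s*†\s*(?P<death_date>[^;)]+?)\s+in\s+(?P<death_place>[^;)]+?))?\)"
)


def parse_german_lead(sentence):
    """Return birth/death date and place as a dict, or None if the pattern is absent."""
    match = LEAD.search(sentence)
    return match.groupdict() if match else None


german = ("Sir Alfred Joseph Hitchcock KBE (* 13. August 1899 in Leytonstone; "
          "† 29. April 1980 in Los Angeles) war ein Filmregisseur.")
english = ("Sir Alfred Joseph Hitchcock, KBE (August 13, 1899 – April 29, 1980) "
           "was a highly influential film director.")

print(parse_german_lead(german))
print(parse_german_lead(english))  # no places in the lead, so nothing to extract
```

The English lead carries dates only, so the places would still have to come from an infobox or from the body text.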

Query re positioning of persondata before categories

Where categories are immediately preceded by a Template:DEFAULTSORT, should the persondata go between the defaultsort template and the categories (which seems the strict reading of 'immediately before categories', but confusingly splits the defaultsort template from the categories it is concerned with) or immediately before the defaultsort template (which seems more natural to me, but should be specified if that is what is to be recommended)? Dsp13 12:33, 19 March 2007 (UTC)

In my opinion {{DEFAULTSORT}} is not a real but a meta-template which directly belongs to categories. I don't see the problem here, but to avoid any confusion I've added a comment. --32X 21:19, 19 March 2007 (UTC)[reply]
Thanks. I've modified the script to place the persondata before the {{DEFAULTSORT}} template if it exists. You may need to refresh your monobook.js to pick up the changes. Dr pda 23:04, 19 March 2007 (UTC)[reply]
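The placement rule agreed above (before {{DEFAULTSORT}} if present, otherwise before the first category) can be sketched in Python as follows. This is only an illustration under that assumption; the function name is hypothetical and the actual persondata.js may work differently.

```python
import re

# First {{DEFAULTSORT...}} call or first [[Category:...]] link, whichever appears earlier.
ANCHOR = re.compile(r"\{\{\s*DEFAULTSORT\b|\[\[\s*Category\s*:", re.IGNORECASE)


def insert_persondata(wikitext, persondata):
    """Insert the persondata block before DEFAULTSORT/categories, or append it."""
    match = ANCHOR.search(wikitext)
    if match is None:
        # No sort key and no categories: put the block at the end of the page.
        return wikitext.rstrip("\n") + "\n\n" + persondata + "\n"
    start = match.start()
    return wikitext[:start] + persondata + "\n" + wikitext[start:]


article = "Text.\n\n{{DEFAULTSORT:Hugo, Victor}}\n[[Category:French poets]]\n"
block = "{{Persondata\n|NAME=Hugo, Victor\n}}"
print(insert_persondata(article, block))
```

Because the search takes whichever marker comes first, the sort key and its categories are never split apart when the sort key precedes them.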

If you see someone removing persondata templates...

...you can now tell them not to do it again, by putting {{subst:pdataremove-warn}} on their user talk page. They will also be pointed here for more information on persondata. Resurgent insurgent 03:33, 25 March 2007 (UTC)[reply]

Why is persondata separate to infobox?

Further to my above comment about hCard, please can someone explain to me the purpose and advantage of having persondata in a separate, hidden-by-default table instead of having the same, standard fields in the output of the various infobox templates? What tools exist to parse persondata, inside or outside Wikipedia? Andy Mabbett 00:59, 26 March 2007 (UTC)[reply]

The {{persondata}} isn't a real information box but metadata. It was introduced for the first DVD of the German Wikipedia. The data fields are easily accessible with direct SQL (once you have downloaded a database image) and therefore allow search operations. With a large article base (more than 100,000 in de.WP) you can run SQL queries, for example to find birthplaces whose articles haven't been written yet. Some time ago I read about several tools, but because I didn't feel the need I haven't used them. --32X 18:50, 28 March 2007 (UTC)
Thank you for the explanation. The use-case makes sense, but it seems to me that this could be achieved just as easily by using hCard, and hCard-like classes, in infoboxes, instead of repeating the information separately; and that this would have additional advantages for readers and editors, through greater interoperability with other tools and websites and ease of authoring. It would also facilitate persondata-like metadata for organisations and venues, through their infoboxes. I'm happy to advise further, if anyone's interested in pursuing this possibility. Andy Mabbett 19:16, 28 March 2007 (UTC)
To clarify issues in my own mind, I've drawn up a comparison of persondata and hCard properties, on the microformats wiki. Andy Mabbett 19:49, 28 March 2007 (UTC)[reply]
A good reason is that someone using Persondata usually has read this page and knows what they're doing. It is far more common for people to mess up and misuse an infobox, which would garble the metadata. Circeus 19:03, 28 March 2007 (UTC)
Like any bad edit, surely that can be remedied? Andy Mabbett 19:16, 28 March 2007 (UTC)[reply]

The issue of persondata vs infoboxes has been raised several times on this talk page, see #Use inside implementations of other templates, #Not picked up by Google?, #Hidden Metadata, #Revisiting Infobox Person and #Why is this seperate from Infobox Person?. Some of the main arguments given against combining them are:

  • This would require every biography to have an infobox, which many editors are opposed to.
  • There are a large number of different infoboxes (approx 160), not all of which have all the fields of persondata, and which currently vary greatly in the names for the fields they do have.
  • Persondata takes names in the format surname, firstname in order to be able to create an alphabetical list by surname.

There are examples at WP:PDATA#Extraction of persondata of how to extract persondata from an SQL database, or scripts to extract and parse it from the WP XML dump and insert it into a mySQL database, on which you can then run all kinds of queries (these scripts are written for the de wiki but I have more or less adapted them to the en wiki following the hints there, see my comments above).

I notice that your comparison of infobox/persondata/hCard at the microformat wiki is expressed in terms of the rendered (X)HTML of the page; both of the previous methods for extracting persondata work with the raw wiki markup, i.e. the template field DATE OF BIRTH = 22 May 1977 as written in the wikitext, rather than the rendered 22 May 1977. Using hCard would then seem to imply a lot of HTML-scraping to get the data, rather than using the periodic database dumps. (There are over 200,000 biographies, though admittedly only a quarter or so have infoboxes and only about 7000 currently have persondata.) Looking at the list of hCard implementations here, it seems that most of these implementations deal with recognising hCards on an individual webpage/converting to vCards/adding to address books etc, rather than dealing with large collections of hCards (which would be the end goal of an equivalent to persondata), although I suppose some of the PHP tools could also be used to populate a database. I also notice that hCard does not yet support the date of death and place of birth/death fields, which would seem to argue against its immediate implementation in place of persondata. Perhaps the best way of combining persondata with hCard (if you want to go there at all) would be, as you originally suggested, adding extra class tags in the persondata template itself. Dr pda 15:10, 31 March 2007 (UTC)
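To make "working with the raw wiki markup" concrete, here is a minimal Python sketch of pulling the Persondata fields out of an article's wikitext. It is not one of the extraction scripts mentioned above; the function name is hypothetical, and the parser deliberately ignores nested templates and piped links inside field values, which a real extractor would have to handle.

```python
import re


def parse_persondata(wikitext):
    """Return the Persondata fields as a dict, or None if no template is found.

    Simplification: field values must not contain "|" or "}}".
    """
    match = re.search(r"\{\{\s*Persondata\s*(.*?)\}\}", wikitext,
                      re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    fields = {}
    for part in match.group(1).split("|"):
        if "=" in part:
            name, value = part.split("=", 1)
            fields[name.strip()] = value.strip()
    return fields


sample = """Article text.
{{Persondata
|NAME= Hitchcock, Alfred
|DATE OF BIRTH= 13 August 1899
|PLACE OF BIRTH= Leytonstone
}}
[[Category:English film directors]]
"""
print(parse_persondata(sample))
```

The resulting dictionary could then be written to a database table, which is the kind of bulk use the dump-based approach is aimed at.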


Thank you for your detailed response. I appreciate that this must be old ground for some people, but I trust that you will agree that consideration of microformats makes it worth revisiting. I'll address your points as bullets, for the sake of convenience and clarity:

  • "This would require every biography to have an infobox, which many editors are opposed to" - I would question why they're opposed, and whether they're perhaps putting personal (aesthetic?) preferences before the convenience of users. That said, perhaps, one day, it might be possible for user preferences to include a "do not display infoboxes" option, like the current "do not show TOCs" option.
  • "There are a large number of different infoboxes (approx 160), not all of which have all the fields of persondata, and which currently vary greatly in the names for the fields they do have" - I think there's a case for some standardisation here; perhaps a root "persondata" template, to be included in other biographical infobox templates, in the same way that "coor" is included in a number of other location-related infoboxes.
  • "Persondata takes names in the format surname, firstname" - It's possible for software to convert from one format to the other, or for the data entry to be in two (or more) fields (there's experience of doing this for the name field in hCard).
  • It should be possible for XML to be dumped from infoboxes/hCards if required.
  • "it seems that most of these implementations deal with recognising hCards on an individual webpage" - most, but not all, and these are just the "early adoptions"; there's, deliberately, plenty of scope for other use cases.
  • "I also notice that hCard does not yet support the date of death and place of birth/death fields" - yes, but the comparison page you cite suggests a work-around for that.
  • "adding extra class tags in the persondata template itself" - hCards (indeed, all microformats) are intended for data that is visible on the page, not for hidden metadata.

Finally, being naturally lazy, I believe strongly in both not reinventing the wheel, and not doing work (i.e. entering data) twice.

Cheers, Andy Mabbett 19:33, 31 March 2007 (UTC)[reply]

P.S. Even while I was typing the above, The Anome was adding, on the Microformats Project talk page:

This is a bootstrapping effort at the moment, and you won't see any extra utility in the very short term: but once there's a substantial amount of semantically-tagged content on Wikipedia, some very interesting things will start to happen...

Andy Mabbett 19:39, 31 March 2007 (UTC)[reply]
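The claim in this thread that software can convert the "surname, firstname" format into a display name can be sketched in a few lines of Python. The function name is hypothetical, and only the easy direction is shown; going the other way (display name to sort name) is ambiguous for multi-part surnames and would need human input.

```python
def sort_name_to_display(name):
    """Turn 'Surname, Forenames' into 'Forenames Surname'.

    Names without a comma (for example regnal names) are returned unchanged.
    """
    surname, sep, forenames = name.partition(",")
    if not sep or not forenames.strip():
        return name.strip()
    return f"{forenames.strip()} {surname.strip()}"


print(sort_name_to_display("Gerrard, Steven"))
print(sort_name_to_display("Catherine II of Russia"))
```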

Persondata box & succession box display

In the case of Victor Hugo, the displayed persondata box gets mixed up together with an immediately preceding succession box. Anyone know why, or how to fix it? Dsp13 12:12, 31 March 2007 (UTC)[reply]

There was a missing {{end box}} template after the succession box. It's fixed now. Dr pda 15:10, 31 March 2007 (UTC)[reply]

Gregorian/Julian calendar shift

How best to handle old-style dates? At the moment with Samuel Johnson I've left a template for old-style dates in his birth year, but (per discussion of dates above) I'd rather leave something more transparent in the wikitext. Dsp13 12:54, 31 March 2007 (UTC)[reply]

I think the way you have handled it is best for now. --Rajah 05:30, 2 May 2007 (UTC)[reply]

Transcluded persondata?

Ramesses II has persondata somehow 'transcluded' onto the page. I'm not quite sure how this works, or if it's desirable. Any thoughts? Dsp13 21:28, 1 April 2007 (UTC)

It is because the Pharaoh Infobox contains the persondata template. Wikipedia_talk:Persondata#Template:Pharaoh_Infobox --Rajah 23:25, 1 April 2007 (UTC)[reply]

Automatically adding Persondata from German Articles

Wouldn't it be possible (and infinitely faster) if we had a bot/script that could just translate the German persondatas into English? The German articles are already mapped to the English ones, the fields are the same (converting dates should be a breeze) and the only hard parts would be locations/descriptions/names. Does this sound like a good idea? --Rajah 06:42, 4 April 2007 (UTC)[reply]

Yes. No point in duplicating effort, and only the name and description fields really need translation, although putting it all together (following interwikis, extracting persondata, converting, and inserting) does sound kind of troublesome. --Gwern (contribs) 15:26 4 April 2007 (GMT)
Yeah, that's a great idea. Sounds like a challenging bot to write though. Kaldari 15:30, 4 April 2007 (UTC)[reply]
Sounds like a great idea! Should be mentioned on Wikipedia:Bot requests. MahangaTalk 22:58, 14 April 2007 (UTC)[reply]
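A rough Python sketch of the conversion step discussed above follows. The German field names in FIELD_MAP are an assumption about the de.wikipedia Personendaten template and should be checked against that template before use; the function names are hypothetical. Following interwiki links, fetching pages and saving edits are left out, and names and descriptions are copied untranslated because, as noted above, they need human review.

```python
import re

# Assumed mapping from German Personendaten fields to English Persondata fields.
FIELD_MAP = {
    "NAME": "NAME",
    "ALTERNATIVNAMEN": "ALTERNATIVE NAMES",
    "KURZBESCHREIBUNG": "SHORT DESCRIPTION",
    "GEBURTSDATUM": "DATE OF BIRTH",
    "GEBURTSORT": "PLACE OF BIRTH",
    "STERBEDATUM": "DATE OF DEATH",
    "STERBEORT": "PLACE OF DEATH",
}

MONTHS = {
    "Januar": "January", "Februar": "February", "März": "March",
    "April": "April", "Mai": "May", "Juni": "June", "Juli": "July",
    "August": "August", "September": "September", "Oktober": "October",
    "November": "November", "Dezember": "December",
}

GERMAN_DATE = re.compile(r"^(\d{1,2})\.\s*(\w+)\s+(\d{1,4})$")


def convert_date(value):
    """Convert '13. August 1899' to '13 August 1899'; leave anything else untouched."""
    match = GERMAN_DATE.match(value.strip())
    if match is None or match.group(2) not in MONTHS:
        return value
    day, month, year = match.groups()
    return f"{int(day)} {MONTHS[month]} {year}"


def convert_fields(german):
    """Rename known fields and convert dates; unknown fields are dropped."""
    english = {}
    for key, value in german.items():
        target = FIELD_MAP.get(key)
        if target is None:
            continue
        if target.startswith("DATE OF"):
            value = convert_date(value)
        english[target] = value
    return english


print(convert_fields({"NAME": "Hitchcock, Alfred", "GEBURTSDATUM": "13. August 1899"}))
```

Dates that do not fit the simple day-month-year pattern are passed through unchanged so that a human can deal with them.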

Other metadata information

Are there any other metadata templates? Is there any project to make more relevant metadata templates or does the microformats project pretty much take up this area? Remember 16:13, 7 April 2007 (UTC)

No, sadly, an organized metadata system does not yet exist on Wikipedia as far as I know. The microformats, persondata, geodata, etc. movements are all balkanized at present. (Not that that is a bad thing necessarily.) I'm actually in favor of articles having a separate metadata page a la talk pages, but that's just my two cents. --Rajah 05:26, 2 May 2007 (UTC)
You may want to look at Extension:Semantic_MediaWiki depending on your level of interest. --Rajah 22:18, 9 May 2007 (UTC)[reply]

Use on pages listing multiple people...

I'm looking at Delirious?_musicians and wondering if PERSONDATA can appear multiple times on the same page and not cause problems. Any thoughts? Dan, the CowMan 03:17, 10 April 2007 (UTC)[reply]

For now, I would stick to adding persondata to people with articles about themselves. --Rajah 05:24, 2 May 2007 (UTC)[reply]
You can, though, use {{Hcard-bday}} to generate in-line hCard microformats for each person. Andy Mabbett 08:58, 16 June 2007 (UTC)[reply]

hCard microformats in infoboxes

Further to earlier discussions, a number of biography-related infoboxes now produce an hCard microformat. Please feel free to add the necessary mark-up to more. (Chiefly, that's class="vcard" on the whole infobox and class="fn" on the pagename or name field.) Note that the date of birth is only included if {{Birth date}} or {{Birth date and age}} is used. Andy Mabbett 17:14, 19 April 2007 (UTC)
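For readers unfamiliar with the mark-up, here is a small Python sketch that emits the kind of HTML described above: a container with class "vcard", the name in an element with class "fn", and an optional date of birth with class "bday". The function name and the choice of div/span elements are assumptions for illustration; the real infobox templates produce table mark-up rather than this.

```python
from html import escape


def hcard(name, birth_iso=None):
    """Render a minimal hCard: vcard container, fn name, optional bday."""
    parts = [f'<span class="fn">{escape(name)}</span>']
    if birth_iso:
        parts.append(f'<span class="bday">{escape(birth_iso)}</span>')
    return '<div class="vcard">' + " ".join(parts) + "</div>"


print(hcard("Alfred Hitchcock", "1899-08-13"))
```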

This is total madness. How did someone think it was a good idea to have redundant metadata? The hcard functionality should have been implemented in Persondata, not in 500 different infoboxes. What a mess. Kaldari 22:44, 27 August 2007 (UTC)[reply]

NAME attributes

A couple of questions:

- Are nicknames acceptable within the ALTERNATIVE NAMES attribute?

- Could it be further clarified as to what should populate NAME and ALTERNATIVE NAMES? For example, for Tony Blair, his full birth name is in ALTERNATIVE NAMES, and his familiar name in NAME, but for Steven Gerrard it is the other way around.

Thanks, --Jameboy 16:16, 20 April 2007 (UTC)[reply]

The Tony Blair example is how it is supposed to work. For Steven Gerrard, his full name in the name field is enough. Having his name sans middle name in the Alternative field doesn't add any information. (If anything you would put Gerrard, Steven in the name field and Gerrard, Steven Middlename in the Alternative field.) --Rajah 20:32, 20 April 2007 (UTC)
Thanks. What about nicknames though? --Jameboy 14:28, 22 April 2007 (UTC)[reply]
I would say it would depend on the nickname and how uniquely identifying it is, e.g. "Honest Abe" shouldn't be in Abraham Lincoln, but Splendid Splinter could be in Ted Williams. For nobility, nicknames are sometimes the first name in the persondata, e.g. Catherine II of Russia. Generally, if a nickname universally and uniquely identifies someone, I think it should be listed in the Alternative Names section; if it fails to meet those criteria it should be omitted. Do you have a specific example? --Rajah 20:33, 28 April 2007 (UTC)

Colors

Is light gray really a good color to put on a white background? Maybe it could be darker and in bold. ~ EdBoy[c] 03:15, 12 May 2007 (UTC)[reply]

I guess most editors who use Persondata are familiar with the fields. Since the data is only a set of metadata without any relevance for the article at all, it's not that bad to see it in a decent colour scheme. IIRC the colours are defined by CSS, so you should be able to define your own CSS rules (dark, bold, blinking, CAPS, ...). --32X 20:23, 15 May 2007 (UTC)

Hispanic Surnames

Could the instructions please be specific that the surname generally used in Spanish-language names is the first surname where two are given? For example, I have just become aware of this template because of one of the pages I have on my watchlist, Ecuadorian footballer Ulises de la Cruz. His mother's surname is Bernal, and so the full, formal version of his name is Ulises de la Cruz Bernal, but this is not in common use. He has, however, been given a persondata box showing him as

|NAME= Bernal, Ulises de la Cruz

There will be many errors of this type if this is not made very clear. I am unsure as to how scripts to automatically extract information might avoid this error. Kevin McE 16:47, 15 May 2007 (UTC)[reply]

Yes, the script is a guide. The human being who was using it should have realized that Bernal was the maternal surname and that it was not Ulises' surname. That's why the name of his article is Ulises de la Cruz, with no Bernal. Generally, I think editors should either stick to names in languages with which they are somewhat conversant, or learn the rules for the language/culture they are editing, so that errors of this type don't propagate. --Rajah 01:48, 16 May 2007 (UTC)
Interestingly, the German Wikipedia gives this name as NAME=de la Cruz Bernardo, Ulises, while the Spanish Wikipedia has yet to have his persondata added. --Rajah 01:50, 16 May 2007 (UTC)

German Wikipedia

How is telling people how many articles on the German Wikipedia have persondata useful information on the English Wikipedia? Voretus 17:00, 17 May 2007 (UTC)[reply]

One way it is potentially useful is that a skilled programmer could transfer the persondata wholesale in the same fashion as the Interwiki bot does with interwiki links. --Rajah 18:47, 17 May 2007 (UTC)[reply]
Motivation. Compare it with my answer for So What Does This Do Now. You can't work with only a few articles, you need a larger base. --> "So they've reached > 150k? Wow. We'll try to be better in a few months." (hopefully) --32X 23:18, 17 May 2007 (UTC)[reply]

My two cents on Wikipedia's handling of metadata

Why not put metadata in its own tab for any given article? Thus all articles would have an "Article" tab, a "Discussion" tab and a "Metadata" tab. This would keep the article area clean of metadata, and the tab could (hypothetically) be limited to more advanced users. I realize this might be a bit of a pain in terms of extending the MediaWiki software, but over the long haul would this not be a major improvement? --Wikijeff 16:26, 12 June 2007 (UTC)

I totally agree, as I mentioned earlier [1]. For now though, this is the best compromise we can make. Mostly the reason this issue isn't that dwelt upon is that only a very small minority appreciate it. I'm slowly working on some offline wikipedia data mining/visualization tools that should, hopefully, get people fired up about it. --Rajah 00:37, 13 June 2007 (UTC)[reply]
I also agree that this would be a great step, but it needs to be raised with the devs (who for all I know may already be working on something of this nature). Community support is only marginally relevant for MediaWiki software issues such as this. -- Visviva 02:27, 13 June 2007 (UTC)[reply]
Until there is a change in the software, it seems to me that the best option is to store metadata on a subpage (I think this has been mentioned before). I discuss this below for the Persondata template, but in fact, the current version of my demonstration allows for arbitrary metadata: the data needed for a particular purpose (such as Persondata) are selected using a key. Geometry guy 20:07, 15 June 2007 (UTC)[reply]

Persondata on a subpage

There has been some discussion here about why Persondata should be separate from the infobox, how birthdates and names are formatted, problems with entering the same information several times, and so on.

A lot of these issues would be easier to deal with if the Persondata were stored on a subpage of the article talk page. (It would make more sense to store it on a mainspace subpage, but these don't exist.) With a small modification to the Persondata template (see User:Geometry guy/Persondata) it is possible to query the Persondata via straightforward transclusion of the subpage. I have made a "proof of concept" at Alexander Grothendieck and Talk:Alexander Grothendieck/Persondata.

Straightforward transclusion of the subpage produces the Persondata table. This may be a problem for search methods which query the wikisource of the article, but I would question whether the latter is the best way to query this data, especially if it involves downloading the entire article.

On the other hand, transclusion of the subpage with a key allows for easy extraction of the data. For example

{{:Talk:Alexander Grothendieck/Persondata|key=birthdate}}

produces

March 28, 1928

with not an SQL query in sight. This can be used to transclude DEFAULTSORT and infobox information into the article, allowing these data to be combined with the Persondata without requiring editors to use infoboxes if they don't want to.

Furthermore, the data on the subpage could be richer than in the Persondata table. I have illustrated this by allowing both the sortname and the usual name to be transcluded. The latter is often the name of the article, so this may not be so useful, but it is not difficult to imagine other applications of the same idea. Indeed one can imagine the infobox template automatically transcluding almost all of the infobox information from this subpage, removing clutter from the wikisource of the article. Geometry guy 14:43, 15 June 2007 (UTC)[reply]

I've now also produced a {{ReadPersondata}} template to make it simpler to include Persondata into an article: in the article itself (or on its talk page) one can use

{{ReadPersondata|key=birthdate}}

instead of the above. Geometry guy 16:05, 15 June 2007 (UTC)[reply]
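The idea of one metadata store queried by key can be modelled outside MediaWiki with a short Python sketch. This is only an analogy for the transclusion mechanism, not how the templates are implemented; the keys other than birthdate ("name", "sortname") and the function names are hypothetical.

```python
# Stand-in for the talk-page subpage: one place where the data is entered once.
METADATA = {
    "name": "Alexander Grothendieck",
    "sortname": "Grothendieck, Alexander",
    "birthdate": "March 28, 1928",
}


def read_persondata(store, key):
    """Analogue of {{ReadPersondata|key=...}}: return one value, or '' if unset."""
    return store.get(key, "")


def defaultsort(store):
    """Build the sort-key call from the same store, so it is not typed twice."""
    return "{{DEFAULTSORT:" + read_persondata(store, "sortname") + "}}"


print(read_persondata(METADATA, "birthdate"))
print(defaultsort(METADATA))
```

An infobox, the sort key and the Persondata table would all read from the same store, which is the duplication-avoidance argument made in this section.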

It looks good. Effectively, you are implementing the "new tab" thing discussed above, but putting it on a talkpage subpage instead. One thing though - on the subpage, the metadata is not visible. Is there a way to make it visible so people don't have to click "edit" to see what is there? Also, please see Wikipedia:Bots/Requests for approval/Polbot 3 and User:Polbot/ideas/defaultsort for the rapidly advancing ideas of using a bot to standardise the existing data. That will still encounter the problem of location, as people will still have to update article metadata and sort keys in different locations, and would need to be re-run at intervals. Your proposal would solve this. The problem is which to do first. I'd say do the bot run first (which will also show the scale of the problem), and while that is happening, get this idea of yours advertised more widely. Who knows, if the right developers hear about it, they might implement a metadata tab so we don't have to use subpages of talk pages! Carcharoth 16:55, 15 June 2007 (UTC)

There should also be a way to add references to confirm that the metadata (such as birth date and place of birth) is correct. How to do this? Carcharoth 16:56, 15 June 2007 (UTC)[reply]
Actually, this is one reason why I think the data should be in the article. People need to be able to edit things directly. If they press edit and instead of "15 April 1955", they see {{ReadPersondata|key=birthdate}}, then that will be very offputting. It is offputting enough for infoboxes at the moment. More templatization of articles would be bad. I can see why consolidating the sortkeys would be a good idea, but I think that it should all centre on DEFAULTSORT. Not sure quite where the solution lies. Carcharoth 17:14, 15 June 2007 (UTC)
I don't see any easy way to extract the sortable name from DEFAULTSORT: I view this as an application of the sortable name, rather than its source. For example, sortable names can also be used in tables, not just categories. Geometry guy 18:05, 15 June 2007 (UTC)[reply]
Yes, you are right, forget that quibble of mine. Carcharoth 23:16, 15 June 2007 (UTC)[reply]

It is no problem to make the metadata visible on the subpage. (Actually it is already visible to those who've customised their CSS to view person data.) References could also easily be added on the subpage. However, in my view, any information in infoboxes requiring verification should also be in the body of the text. It should be possible to make the infobox not at all offputting: it might just be {{Infobox_President}} for example, with all the data automatically transcluded from the subpage. There could be an edit tab on the infobox which links to edit the subpage.

Concerning what to do: the first thing is not to rush to conclusions, but to think through the various ideas before doing anything; it is probably also a good idea to separate (at least mentally) information gathering from manipulating data. I noticed the rapidly advancing plans you mention already; they look very interesting and I was intending to comment further there soon. Some bot work will surely be needed both for information gathering and data migration, but there are nearly 400000 articles to play with here, and when planning a journey of such a scale, it is invaluable to have a clear idea of the destination. Geometry guy 18:05, 15 June 2007 (UTC)[reply]

I've now made the subpage visible. For this I needed to make the use of the subpage for generating the (invisible) Persondata table explicit. Together with the above discussion, this suggests to me that the subpage could be used to store arbitrary metadata, and the key parameter can be used to extract the data which is needed for a particular purpose (such as the Persondata table). Geometry guy 20:12, 15 June 2007 (UTC)[reply]

Is there any reason that the sub-page (via the template) couldn't also include an hCard microformat? I'd be happy to supply the necessary mark-up. Andy Mabbett 20:34, 15 June 2007 (UTC)[reply]
What would you think about having it at Persondata:Alexander Grothendieck instead? – Quadell (talk) (random) 22:42, 15 June 2007 (UTC)[reply]
These are not valid namespaces at the moment, and so I expect they are viewed as articles (and so artificially inflate the number of WP articles). Geometry guy 23:00, 15 June 2007 (UTC)[reply]
That's fine. It will certainly be made into a valid namespace if this proposal is widely followed. And it will only be widely followed if it's intuitive and easy to use. Metadata:Alexander Grothendieck is a lot more intuitive than Talk:Alexander Grothendieck/Persondata. (Besides, wouldn't this artificially inflate the talkpage count?) – Quadell (talk) (random) 23:08, 15 June 2007 (UTC)[reply]
Sorry if my comment gave the wrong impression, Quadell, as I'm definitely with you in spirit. For instance, I think that disabling subpages in the mainspace is the wrong way to enforce the (sensible) policy of non-hierarchical article format. (Talk page subpages are not disabled, and so they don't inflate the talkpage count.) I agree entirely that this is about metadata in general, not just Persondata: the latter is just one application: I hope you notice this in my more recent comments and edits.
The ugliness of Talk:Alexander Grothendieck/Persondata was the main reason I introduced {{ReadPersondata}}! ({{/MetaData}} would work for me as an article subpage if these were acceptable.) But I am a pragmatist, and we have to build our ideas within the current framework. As you say, if they are successful and intuitive, they may attract wider attention and a cleaner formulation. Geometry guy 23:38, 15 June 2007 (UTC)[reply]

My answer to Andy would be that hcard format could be included in the processing of the data, but not on the subpage itself, since this data needs to be updatable by any editor. It would be easy, however, to build another subpage which transcluded primitive data into the hcard format. Geometry guy 23:00, 15 June 2007 (UTC)[reply]

I'm not sure why you think that using a microformat would affect an editor; a microformat is just HTML classes in the rendered output, and they do not appear on the page when editing. Have a look at any page using {{infobox biography}} for instance, which has hCard markup in the template, where it is invisible to anyone editing such articles. (I also like the idea of the page being called Metadata:Alexander Grothendieck, BTW). Andy Mabbett 08:22, 16 June 2007 (UTC)[reply]
Thanks for the explanation, although I'm not completely sure I have understood. Most editors do not need to edit infobox templates, so they can contain all sorts of markup. However, editors will need to edit metadata, so this needs to be stored in a simple format somewhere (not in the article). This simple format will be something like {{MetaData|data1=|data2=...}}. The {{MetaData}} template now has quite a lot of work to do. First it must display the data in a simple form on the metadata page itself. Second it must allow queries to extract individual data items. Third (optionally) it could allow queries to output some or all of the data in a particular format (such as a Persondata table). The third of these obviously allows any format.
If I understand correctly, however, you are asking for the display on the metadata page itself to be wrapped in hCard markup. This could certainly be done if it is useful. Geometry guy 10:17, 16 June 2007 (UTC)[reply]
"you are asking for the display on the metadata page itself to be wrapped in hCard markup" - yes, that's right. Andy Mabbett 10:23, 16 June 2007 (UTC)[reply]
Thanks Andy. I will try it out. Could you explain to me (perhaps on my talk page) what the hCard markup is for and how it is used? Geometry guy 10:27, 16 June 2007 (UTC)[reply]
Please try What are microformats? and hCard, and let me know if you have further questions. Andy Mabbett 10:33, 16 June 2007 (UTC)[reply]
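The three jobs described above for the {{MetaData}} template (display, query by key, output in a chosen format) can be sketched outside wiki markup. This is only an illustration in Python, not the template's actual implementation; the lower-case key names and all three function names are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical key names for the stored metadata; the real subpage may differ.
PERSONDATA_FIELDS: List[str] = [
    "name", "alternative_names", "short_description",
    "date_of_birth", "place_of_birth", "date_of_death", "place_of_death",
]


def display(metadata: Dict[str, str]) -> str:
    """Job 1: show every stored item in a simple 'key = value' form."""
    return "\n".join(f"{key} = {value}" for key, value in metadata.items())


def query(metadata: Dict[str, str], key: str, default: str = "") -> str:
    """Job 2: extract a single data item by key."""
    return metadata.get(key, default)


def as_persondata(metadata: Dict[str, str]) -> str:
    """Job 3: emit some of the data in one particular format (a Persondata call)."""
    lines = ["{{Persondata"]
    for field in PERSONDATA_FIELDS:
        label = field.replace("_", " ").upper()
        lines.append(f"|{label}={query(metadata, field)}")
    lines.append("}}")
    return "\n".join(lines)
```

Because the third job only reads through `query`, any other output format (an infobox, an hCard wrapper) could be added as another function without changing how the data is stored.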
  • I think it's important to state some things about the philosophy of metadata out front. Like:
    • Visible information in (non-template) article-space should never come from metadata. The text "Born {{Getmetadata|birthdate}}" should never appear, for instance. It is only used for categories, or in templates and such.
    • The Wikipedia article is the source of the metadata, by definition. There should never be metadata information which is not mentioned (and, ideally, sourced) in the article itself. That way we don't have to worry about cites in metadata -- the source for the info is the Wikipedia page. Conversely, if there is a discrepancy (not caused by vandalism), it's safe to assume that the article is right.
    • Templates such as infoboxes could be simply transcluded into articles without parameters. These templates would use magic words to point to the metadata. The [edit] link on infoboxes should go to the article's metadata, not the template (since these templates are, let's be honest, too complex for most users to edit anyway.)
    • It should be as simple as possible for users to find and edit metadata.

I'd actually suggest moving this discussion to someplace more centralized. Maybe Wikipedia:Separate metadata? Perhaps categorized under Category:Wikipedia proposals, with links from Wikipedia:Requests for comment/Style issues and Wikipedia:Centralized discussion? I don't want to get the conversation bogged down with too many opinions and ideas, but on the other hand, this would affect a huge portion of Wikipedia if implemented. – Quadell (talk) (random) 23:08, 15 June 2007 (UTC)[reply]

I agree entirely with your bullet points about metadata. The metadata should be taken from the article and stored in one place (so that if the article is wrong and needs to be updated, it is easy to fix the data). Then it should be used to generate persondata, infoboxes, etc. And of course edit links should point to an easy-to-comprehend metadata page, not a complicated template! Geometry guy 00:01, 16 June 2007 (UTC)[reply]
  • Responding to Quadell about namespace: do you mean a new namespace with its own talk page, or do you mean a new tab associated with articles (like the current talk pages)? If a new namespace, you have the problem of what happens when the article is moved to a new name. If a tab, then that would move with the page, and issues would be discussed on the talk page as normal. If you are going to have a new tab or new namespace, why not just go the whole hog and make it available for all metadata (including the hCard format Andy Mabbett mentioned above)? BTW, is that really a new namespace, or have you just created an article page with the title "Persondata:Alexander Grothendieck"? :-) I think a new tab is the most viable method, but unfortunately that would also be the most developer resource-intensive. Carcharoth 23:16, 15 June 2007 (UTC)[reply]
At the moment this is simply a new article page in the mainspace. I imagine the intention would be to have a new tab associated with articles. Meanwhile, we have to develop the concept with the software as it is. Geometry guy 10:24, 16 June 2007 (UTC)[reply]
  • Responding to Geometry guy's comments, thanks for making the subpage visible. One query though - why isn't the sortkey parameter visible? Is that because the original persondata template doesn't include that parameter yet? About the references, you are quite right, for this sort of data that should (in fact must) be in the main text of the article as well as an infobox, the references stay with the article. I was thinking more of the kind of data included in infoboxes for things like articles on chemical elements or planets. See hydrogen and Earth for examples, though they have their own ways of dealing with their data. We should probably stick to biographical data for now! The idea of an edit tab linking to the subpage is a great idea. You've actually answered all my worries so far, and I agree entirely about taking it slowly and getting an idea of what is needed first. So, what next? Carcharoth 23:16, 15 June 2007 (UTC)[reply]
The sortkey parameter isn't visible because I was lazy and just copied the format from the persondata table. However, now that the persondata behaviour is separated from the subpage data, anything is possible. The template can display the data on the subpage however you want it to, and still provide a query mechanism to access the individual fields, and also constructions built from these fields, such as the persondata.
Regarding infoboxes, I think we have to rely on the specific infobox to decide how to handle the data. They are all very diverse, but they might all benefit from transcluding subpage data. The wikilinking and formatting of this raw data, should, however, be left to the infobox template.
The subpage idea is still rather dominated by the initial motivation from the Persondata template. It is becoming more flexible now, and I will continue to work in that direction. Geometry guy 23:53, 15 June 2007 (UTC)[reply]
  • Responding to Quadell again, after edit conflict, I agree a separate page to discuss the wider issues of metadata is needed, but surely this has been discussed elsewhere before? This may even be a perennial proposal, though maybe no-one's ever taken it this far. Carcharoth 23:16, 15 June 2007 (UTC)[reply]

Update

I've now added more data to Talk:Alexander Grothendieck/Persondata and shown how this data can be used to generate the infobox. There is also a link in the infobox to allow the data to be edited. This is all very much an experiment and a work-in-progress, so please add comments. Geometry guy 22:09, 16 June 2007 (UTC)[reply]

There's now an hCard, too. Andy Mabbett 22:13, 16 June 2007 (UTC)[reply]

Tidying up a few pages

Should the pages at Template talk:Persondata/Removing data have persondata or not? If not, please help tidy them up. Thanks. Carcharoth 01:03, 16 June 2007 (UTC)[reply]

Just remove them. Persondata belongs only on biographies (only one instance per page) or on redirects if there's only one article covering several people (e.g. twins who are notable only as twins and not as individuals). --32X 01:30, 16 June 2007 (UTC)[reply]

Persondata tagging script

In case anyone missed it, see Wikipedia talk:Persondata#Half-automatic tagging with persondata-tool for details of a script to help extract and add persondata. Carcharoth 01:10, 16 June 2007 (UTC)[reply]

Link to the archived discussion: Wikipedia_talk:Persondata/archive2#Half-automatic_tagging_with_persondata-tool --Rajah 08:09, 17 June 2007 (UTC)[reply]

Metadata in biography infoboxes

What's to be done about projects like WikiProject Composers and WikiProject Opera, where a cabal is claiming, against the evidence, to have a consensus for the removal of biographical infoboxes from all of "their" articles? Andy Mabbett

Forum-shopping, Mabbett. Isn't it time to let this go? Moreschi Talk 09:02, 16 June 2007 (UTC)[reply]
Not forum shopping; this is perfectly on-topic - and relates to previous discussion - here. Have you read that discussion, or are you just following me around? Andy Mabbett 09:34, 16 June 2007 (UTC)[reply]
The advantage of a separate MetaData page is that if editors of an article want to remove the biographical infobox, they can. The data remains available on the MetaData page. Hopefully this will reduce conflict of the kind that I am sensing here! Geometry guy 10:36, 16 June 2007 (UTC)[reply]
I thought that the proposal was to generate the metadata page using data from the infobox? Andy Mabbett 10:59, 16 June 2007 (UTC)[reply]
Initially, there may need to be a data migration exercise to create metadata pages from current infobox and persondata tables. However, thereafter, the proposal is to generate the infobox (if desired) and the persondata (if needed) from the metadata. Indeed, the whole point of the proposal is that it is much easier to generate the infobox from the metadata than vice-versa. Geometry guy 12:54, 16 June 2007 (UTC)[reply]
That seems reasonable, so long as there is only one place where the data is entered or amended. Andy Mabbett 13:16, 16 June 2007 (UTC)[reply]

Granularity of fields

Is there any reason why a person's name is held as a single field, and not as, say, "family name" "given name", etc? Andy Mabbett 10:25, 16 June 2007 (UTC)[reply]

No. This is an example of what I had in mind by the potential for the Person/Metadata subpage to refine the information in the standard Persondata table. The dates of birth and death could be refined in a similar way, if desired. Geometry guy 10:39, 16 June 2007 (UTC)[reply]
Good; separate day, month, season and year fields would greatly facilitate display of date of birth (and of death, as planned) in hCard, which requires the YYYY-MM-DD format, and which, if DoB is given as, say, "spring 1456", should only be output as "1456". Perhaps we might also consider adding "honorific prefix" and "honorific suffix" fields (for use in, for example, "Sir Jim Smith, OBE"). A handy list of hCard properties may be found at http://microformats.org/wiki/hcard-cheatsheet Andy Mabbett 10:56, 16 June 2007 (UTC)[reply]
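To illustrate why separate fields help here, a minimal Python sketch of the date output follows. The function name `hcard_date` is hypothetical, and emitting a year-month value when only the day is missing is an assumption of this sketch rather than something settled in the discussion. A season is simply not passed in, so "spring 1456" stored as year 1456 plus a season comes out as "1456".

```python
from typing import Optional


def hcard_date(year: Optional[int],
               month: Optional[int] = None,
               day: Optional[int] = None) -> str:
    """Build a date string from granular fields, dropping unknown parts.

    Seasons are deliberately ignored: a record such as "spring 1456" is
    expected to arrive here as year=1456 with no month or day.
    """
    if year is None:
        return ""                      # nothing usable to output
    if month is None:
        return f"{year:04d}"           # year only, e.g. "1456"
    if day is None:
        return f"{year:04d}-{month:02d}"   # assumption: reduced precision is acceptable
    return f"{year:04d}-{month:02d}-{day:02d}"
```

With a single free-text date field, the same result would need a parser that recognises every way editors write dates; with granular fields it is a few comparisons.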

Location

What was the rationale behind telling people to put it before the cats and IWs? Seems far more logical to be at the very bottom of everything (ie the Cats and IWs are displayed by default, and are more directly related to Wikipedia readers--metadata would make more sense separate from all the content, just like dab headers should be on the first line, ahead of all article content). It also often gets used in a way that adds excess whitespace to the displayed article--this could easily be avoided by putting it at the bottom. I do agree that the first directions (between Cats and IWs) were worse, but I don't think before the Cats was the best "fix". Sohelpme 01:43, 17 June 2007 (UTC)[reply]

Presumably, that was because of technical reasons. See Wikipedia_talk:Persondata/archive1#Location_in_article, Wikipedia_talk:Persondata/archive1#place_inside_the_article. Hopefully, metadata templates can get out of the main article in the future... --Rajah 15:53, 17 June 2007 (UTC)[reply]
It only adds whitespaces when stub templates are following. ("By convention this is placed at the end of the article, after [...] the category tags, so that the stub category will appear last.") So better start changing stubs to articles. ;) --32X 20:23, 17 June 2007 (UTC)[reply]
All other Wikipedia language editions (I think) place cats and interwiki-links at the very bottom. Interwiki-links are frequently added/changed by users from other language editions, or by interwiki-bots. It would be quite difficult for them if they had to look up the individual rules for placing the interwiki-links in each language edition. --88.134.44.255 02:18, 19 June 2007 (UTC)[reply]

Within the persondata template, should we wikilink things like the name of a state or country or dates?Rlevse 02:41, 19 July 2007 (UTC)[reply]

Yes, you should, if possible. From the project description: "Wikilinks in the persondata are not currently necessary; however, they may be useful for some future application." I almost always add them, usually because I am just copying the already wikilinked first few lines of the bio. Having these named entities wikilinked will allow for cool applications in the future. --Rajah 16:39, 21 July 2007 (UTC)[reply]
Interesting question, since at the German Wikipedia the rule's a bit different: Link only the place of birth/death, because someone's birthplace might be Madrid, but not Spain. (Actually the rule currently isn't enforced very well.) --32X 00:15, 22 July 2007 (UTC)[reply]
I'm not sure what you mean, by "only the place of birth/death". If someone is born in Madrid, Spain, then they were born in Spain too. Yes, Istanbul was Constantinople was Byzantium, but whoever adds the persondata for people born in that city in the 1900s, the 900s and/or antiquity will add the correct name. No ancient Greek was born in Istanbul after all. For Mikhail Baryshnikov, his birthplace seems right. Riga, Soviet Union, even though Riga is now in Latvia and the Soviet Union no longer exists. If he were to return to his birthplace and expire, his deathplace would be Riga, Latvia. Right? Rlevse's question also touched on dates, and I'd like to add the point that retaining the wikilinking for professions and nationality in the short description field will also retain info that can be used for future applications. --Rajah 00:32, 22 July 2007 (UTC)[reply]
The argument was while Madrid is the birth place, Spain is not a place in that meaning.
For the other example, you've described how it should be done. My favourite example: it's not informative enough to say that someone was born in Berlin in the 1960s/70s/80s. In this case, which part of the city it was is the important information. --32X 15:11, 22 July 2007 (UTC)[reply]
Yeah, I think we're basically in agreement here. For those Berliners born in those decades, seeing birthplaces as Berlin, West Germany or Berlin, East Germany would seem to clear it up (and shows that the political entity does store valuable information). --Rajah 06:31, 23 July 2007 (UTC)[reply]

Granularity of names

FYI: WikiProject Infoboxes: Granularity of people's names. Andy Mabbett | Talk to Andy Mabbett 14:59, 31 July 2007 (UTC)[reply]

Resting places of dead people

The templates {{Infobox person}} and {{Infobox actor}} now have parameters for resting_place and resting_place_coordinates (see Marilyn Monroe for an example using both). These are included in the hCard microformat. I think we should add them in PERSONDATA, too. Andy Mabbett | Talk to Andy Mabbett 12:12, 19 August 2007 (UTC)[reply]

That's not a very good idea. Persondata is a small set of basic information about a person. The resting place isn't basic, and it isn't even known for the majority of people covered by biographies. I also fear the result of such information for Frederick I, Holy Roman Emperor. --32X 22:03, 19 August 2007 (UTC)[reply]

Metadata standardization

Please see Wikipedia talk:Metadata standardization. Thanks! Kaldari 23:25, 27 August 2007 (UTC)[reply]

Too much deference to computers -- Death data fields visible for living people

I understand the efficiency, etc., of not removing the "DATE OF DEATH" field for people who are living, but it strikes me as unseemly.

It's easier to write software to extract data from data sets where a keyword exists for all expected data fields. But aren't humans the primary audience for Wikipedia?

It may serve computers well for the form to say the equivalent of "DATE OF DEATH= PENDING", but I think that it's a downer. -- Ac44ck 22:57, 11 September 2007 (UTC)[reply]

Respice post te! Hominem te esse memento! Dsp13 00:08, 12 September 2007 (UTC)[reply]
I don't know Latin.
It seems to mean:
Look behind you! Remember that you are but a man!
According to this:
http://en.wikipedia.org/wiki/Memento_mori
Two comments:
1. I prefer to look ahead, but not _that_ far ahead.
2. I hope that "but a man" is not suggesting that machines are to be elevated above humans.
When reading up on a noteworthy contemporary (or a vibrant, world-class athlete half my age -- which is where I first encountered the visible "DEATH PENDING" notice), I don't necessarily want to be reminded of their (or my) eventual death. I don't see that it does anything for the human-readable content of the article -- it only serves a data-parsing function for computers.
What reader-focused purpose is served by bringing attention to the mortality of someone who is at the top of their game?
-- Ac44ck 01:32, 12 September 2007 (UTC)[reply]
Sorry for the flippant reply. I guess personally (living in a rich bit of the world, where death's not very visible) I'm more worried that I might forget the general fact of human mortality than that I might be reminded of it too much. (I certainly don't rate machines above humans: only humans can be conscious that they will die!). But people's sensitivities vary (& my talk page would be a better forum to have that conversation) - so sorry if I caused offence. As far as persondata death fields go, though, persondata (unlike other bits of the page) is indeed basically meant for machine reading rather than human reading - which is why the default stylesheet preference is to make it invisible. Dsp13 15:14, 12 September 2007 (UTC)[reply]
So the millions of readers of countless biography pages will encounter non sequitur "DEATH PENDING ... just you wait -- if you don't go first" notices among the praises for some noteworthy, living person for the sake of the occasional programmer who can't be bothered to write a "If it isn't there, don't read it" routine.
It seems like a bad trade off to me.
Ac44ck 20:34, 13 September 2007 (UTC)[reply]
Maybe I'm missing something, but persondata should not be visible to someone reading an article unless they have specifically modified their user CSS file (e.g. monobook.css) to make it visible. Which article did you first see the DEATH PENDING (or whatever) field in? Maybe there's an error in the syntax on that page making it visible. Dr pda 22:06, 13 September 2007 (UTC)[reply]
It does, indeed, disappear _if_ I allow the CSS file to be rendered. I wasn't aware of that, but it doesn't change my position.
I usually browse with as many bells and whistles turned off as I can manage -- webmasters frequently use unwelcome color schemes. Such is the case on the page where I first encountered the persondata template: bright white background if the CSS file is allowed to be rendered.
Who says that a bright white background is "normal"? It's glaring and promotes eye strain. As I understand it, an amber screen provides the optimum clarity and comfort. I have configured my system to show text on an amber-ish background. I find that works well for me. The Windows duh-fault is a glaring white background -- and many lemming webmasters follow suit when specifying colors in their CSS.
So I browse with CSS rendering turned off. Hence, after reading about the impressive list of accomplishments by a young, healthy, world-class athlete here:
http://en.wikipedia.org/wiki/Anna_Kournikova
The parting shot was, "Oh, yeah ... and we're waiting for her to die."
Eewwww!
The "Infobox" template has a place for a death date, and it is waiting in the wings to be filled in -- but it does so discreetly.
I think it is inconsistent with other Wikipedia behaviors for the persondata template to await the inevitable so obviously -- and in ALL CAPS.
Perhaps the current situation is helpful to computers. But computers don't write checks to the Wikimedia Foundation; people -- who have "sensitivities" -- do.
Ac44ck 03:46, 14 September 2007 (UTC)[reply]
Yeah. But most of them won't see it. Most people go for the duh-fault. Carcharoth 04:01, 14 September 2007 (UTC)[reply]

Added instructions for extraction from database dump

Hi all, I've added instructions for how to extract persondata from the database dump and load it into a MySQL database. I've also compiled some lists of articles with problems in their persondata syntax, these can be found at User:Dr pda/Sandbox. Feel free to help tidy these up. There are over 1000 articles where the name has been entered in the form John Smith rather than Smith, John, so perhaps further education is needed somewhere. Dr pda 23:45, 11 October 2007 (UTC)[reply]

Nice work! I totally agree that we need to make it easier for the people who are new to Persondata. --Rajah 06:20, 16 October 2007 (UTC)[reply]
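The kind of syntax check mentioned above (names entered as John Smith rather than Smith, John) could be sketched roughly as follows. This is not Dr pda's actual script; the function names are hypothetical, and the parsing is deliberately naive: it assumes the Persondata call contains no nested templates and no piped wikilinks, both of which would need a real wikitext parser.

```python
import re
from typing import Dict

# Naive pattern: everything between "{{Persondata" and the first "}}".
PERSONDATA_RE = re.compile(r"\{\{\s*Persondata\s*(.*?)\}\}",
                           re.DOTALL | re.IGNORECASE)


def parse_persondata(wikitext: str) -> Dict[str, str]:
    """Return the Persondata fields found in wikitext as an upper-cased dict."""
    match = PERSONDATA_RE.search(wikitext)
    if match is None:
        return {}
    fields: Dict[str, str] = {}
    for part in match.group(1).split("|"):
        if "=" in part:                      # skips comments and stray text
            key, value = part.split("=", 1)
            fields[key.strip().upper()] = value.strip()
    return fields


def name_needs_review(name: str) -> bool:
    """Heuristic: several words but no comma suggests 'John Smith' ordering.

    Single names (e.g. "Plato") are not flagged; anything flagged still
    needs a human look, since some names legitimately have no comma.
    """
    name = name.strip()
    return " " in name and "," not in name
```

Running `name_needs_review(parse_persondata(text).get("NAME", ""))` over each extracted article would give a worklist, not a list of definite errors.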

We need to rewrite/revise/update the instructions

Especially, how to deal with things like inexact birth and death dates. e.g. Are "1480/1", "late 1500s", "c. April 25, 1994", etc. acceptable? If not, what should be there? (I know a lot of this is answered in the archives to this talk page. When I get time, I'm gonna go back and collate all the suggestions/pronouncements and place them where they should be (IN THE INSTRUCTIONS).) Also, reminding people that Last name first holds in alternative names as well. (I've been guilty of that myself.) Etc. I think the best way of showing those new to Persondata how to add it would be about 4-5 examples of common things that occur: People who are still alive, people with unknown birthdates/years, people without simple "LastName, FirstName" orderings, etc. --Rajah 06:19, 16 October 2007 (UTC)[reply]

General questions on usage

Great idea, but I've no idea how the search software handles things or where the boundary lies between helpfulness and standardisation & formality, so some general questions -

  • Name / Alternative Name - Should professional titles eg Professor be included ? Is my handling of Charles Gordon-Lennox, Earl of March and Kinrara reasonable ?
  • Short description - Should this be prose or keywords ?
  • Place of death - Charlotte Long died on the M4 motorway, location unknown. Should POD be M4 motorway ? Even if POD comes to light should M4 remain because it may be searched on ?
  • Place*; Short description - To aid searching, should the contents be expanded to place, county, country, nation ?

I'll add more to this list as I come across them.

Thanks -- John (Daytona2 · talk) 20:50, 2 December 2007 (UTC)[reply]

As far as I know, no one has actually written any application to search persondata from the English wikipedia.
The templatetiger project by Kolossos on de wiki allows basic searching of selected templates in database dumps from several languages, including en persondata (see de:Wikipedia:WikiProjekt_Vorlagenauswertung/en), but this does not parse the data fields (i.e. it treats dates as text strings). Another German user, JakobVoss, has written scripts to extract and parse persondata from de wikipedia (which I translated for the English wikipedia a couple of months ago), and used these to develop a tool which shows people who were born/died on a given day (see http://tools.wikimedia.de/~voj/pd/). In principle this same tool could be used for en wiki.
Regarding your specific questions,
  • Professional titles should not be included, only titles of nobility, in which case the name is usually given as lastname,firstnames,title, e.g. Gordon-Lennox, Charles Henry, Earl of March Darnley and Kinrara. Regarding Charles Gordon-Lennox, Earl of March and Kinrara I think the persondata's changed since I first looked. I would be inclined to put the name as given in the previous sentence as the NAME, and under ALTERNATIVE NAME put any other names he is known by, i.e. non-contiguous substrings of this (e.g. Gordon-Lennox, Charles Henry, Earl of March Darnley and Kinrara), earlier courtesy titles (such as Lord Settrington), etc.
  • Short description should be keywords, e.g. British artist and photographer. Often this is what occurs in the first sentence of the article
  • Re Charlotte Long, if the place of death does become known, it could still be expressed as M4 motorway, place, county etc
  • If locations were fully expressed as you suggest then it would obviously be trivial to search for all places within, say, a given country, however I suspect it might require a lot of effort to add this information to all current articles with persondata lacking it. (At User:Dr pda/Sandbox I've got a list of ~2000 articles which currently have problems with their persondata syntax; I and others have only got through a few hundred over the last couple of months.) It might be possible to write a clever search which would get around this (e.g. using a database of places). Using full locations for any persondata you add in future is probably a good idea though.
Dr pda (talk) 02:47, 9 December 2007 (UTC)[reply]
Thanks Dr -- John (Daytona2 · talk) 19:48, 9 December 2007 (UTC)[reply]
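As a rough illustration of the point about fully expressed locations: if every place were stored as "place, county, country", a search by country reduces to splitting on commas. The sketch below is only an assumption about how such a search might look; the function names are hypothetical and wikilink markup is assumed to have been stripped already.

```python
from typing import List


def location_parts(place: str) -> List[str]:
    """Split 'Madrid, Spain' into its comma-separated components."""
    return [part.strip() for part in place.split(",") if part.strip()]


def located_in(place: str, region: str) -> bool:
    """True if region appears as one whole component of the place string."""
    return region.lower() in (part.lower() for part in location_parts(place))
```

A bare "Madrid" would not match a search for Spain, which is exactly the gap a database of places would have to fill for existing entries.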

Editprotected

{{editprotected|Could a Category:Articles with Persondata be added to this article to help with the coordination of a new WikiProject?}} Auroranorth (!) 11:33, 10 December 2007 (UTC)[reply]

The project page that this talk page discusses isn't protected. Try making the request at Template talk:Persondata. --ais523 11:35, 10 December 2007 (UTC)
that page redirects to this one at present! Dsp13 (talk) 11:38, 10 December 2007 (UTC)[reply]
I see. There may be objections to that change, though; Special:Whatlinkshere/Template:Persondata, or the 'embeddedin' API query, already provide the information that the category would provide, and avoid cluttering the Categories box at the bottom of the article. --ais523 11:53, 10 December 2007 (UTC)
My script at User talk:Dr pda/generatestats.js (which uses the embeddedin API query) can be used to generate a list of articles which transclude {{Persondata}}.
On another note it would perhaps have been nice if the new Persondata WikiProject had been discussed or at least advertised here prior to creating it, to get input from the people who have been adding persondata over the last couple of years. I can see how it could potentially be useful, e.g. by attracting more people, and by coordinating the addition of persondata to articles in a systematic way. However it's not clear to me what the plan is at the moment. There is a request on this page for a category to help "coordination of the project" (presumably just keeping track of which articles have it?), there is a bot request to tag all articles in subcategories of Category:People (which I think is unnecessary, as all biographies should already be tagged with {{WPBiography}}), there are announcements in various places of the creation of the project, however the todo list is just "tag any biography article with persondata". I would like to see some more information of how the project plans to operate. (It would also have been nice to discuss redirecting the shortcut WP:PDATA from this page to the WikiProject page). Dr pda (talk) 23:07, 10 December 2007 (UTC)[reply]
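For anyone unfamiliar with the 'embeddedin' API query mentioned above, here is a minimal Python sketch that only builds the request URL; it does not contact the server. The function name is hypothetical, the endpoint shown is the standard English Wikipedia one, and a complete listing would also need to follow the API's continuation values, which this sketch leaves out.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def embeddedin_url(template: str, limit: int = 500, namespace: int = 0) -> str:
    """Build a list=embeddedin query URL for pages transcluding a template."""
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": template,        # e.g. "Template:Persondata"
        "einamespace": namespace,   # 0 = articles only
        "eilimit": limit,           # results per request
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"
```

Restricting to namespace 0 keeps talk pages and user sandboxes out of the count, which is the same list a tracking category would have provided.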

Progress stats

How easy would it be to see a monthly figure of how many bios now have persondata? I've seen the figure of 15,000 as at Nov 07, but could it be done along the lines of x for Nov, y for Dec, total = Z? Lugnuts (talk) 18:09, 30 December 2007 (UTC)[reply]

It should be pretty easy. A simple way would be to screenscrape the pages you get from What Links Here in the mainspace. For historical data, you can check out User:Rajah/persondata#Persondata_count --Rajah (talk) 22:37, 31 December 2007 (UTC)[reply]