Wikipedia talk:Categorization/Archive 1

Wheeee!

Everyone else seems very excited about this categorization. I, however, oppose it. I think it is overkill to try to categorize articles when we should try to categorize in the articles' contexts or material(s). Are we still using "List of topics in whatever" lists? And whatever happened to the MediaWiki boxes? I don't know, maybe there is some good justification for this system. --MerovingianT@Lk 17:35, May 30, 2004 (UTC)

Mediawiki boxes were *never* supposed to be used as a substitute for categorization (which has been in the works for quite a while now)- people went a bit overboard with them. That's why the first thing I did was start this page, so we can control the phenomenon. →Raul654 17:55, 30 May 2004 (UTC)
Well, thanks Raul, that puts a whole new light on things. I will now want to look into contributing (i.e., starting a new category). --MerovingianT@Lk 18:20, May 30, 2004 (UTC)

I have another question. I looked at Category:Monarchs. I want to start my own subcategory. How does the subcategory appear in Category:Monarchs? --MerovingianT@Lk 18:32, May 30, 2004 (UTC)

Forget it. --MerovingianT@Lk 18:36, May 30, 2004 (UTC)

I am not thrilled either. So far I have seen categories which disrupt (top right) image placement and people putting in categories while not noticing that a vandal has partly translated a page into Italian. Previews do not seem to show the category text, the categories seem to be much less tolerant of non-standard coding than other functions, and those putting in categories do not seem to be looking at the overall impact on a page. Is this really ready for real world application? --Henrygb 23:47, 1 Jun 2004 (UTC)

I think you've confused bugs in the Monobook skin with bugs in the category system. Lots of people hate Monobook. Hopefully, it'll be fixed soon. I agree that MonoBook wasn't really ready for production use. --ssd 23:54, 1 Jun 2004 (UTC)

I was also rather dismayed when I suddenly noticed these "categories" appearing in various pages I've been editing and following closely. They reminded me of the old "namespaces" argument, that basically went - should we have a "Transvaal" article or a "South Africa/Transvaal" article? The decision was that namespaces aren't good, because encyclopedia entries are not hierarchical, and rarely belong to only one category. So, now these multiple categories were invented. So, you could say that Transvaal is in "South Africa" and a "Province" and a "Place" and a "Non-existent province" and a .... you get the drift - when does the list of categories stop? What's wrong with the old way of having the first paragraph say something like: The Transvaal was one of the provinces of South Africa? It's much more general, much more useful, and actually much clearer, because the relationship of the article to each of its categories is spelled out in English, rather than being implied (a category of "South Africa" doesn't say if the article is a place in South Africa, a former place in South Africa, a South African person, and so on). Moreover, it's not always clear to which category a certain article belongs. Does the Daytona race track article belong to the "NASCAR tracks", "U.S. tracks" or just "tracks" category? What was wrong with the old way of just saying that "Daytona ... is a race track in the U.S., ...", with certain key words linking to the appropriate entity or list of entities (such as list of NASCAR or U.S. race tracks)? I have to admit, I thought the old way was much better, and I wish these categories would go away. Nyh 06:45, 2 Jun 2004 (UTC)

By the way, I am also opposed to these long boxed lists people started to put in pages (see, for example, the end of South Africa), for exactly the same reasons. These boxes are not more useful than just linking to the "list of countries in Africa", and only clutter the article with irrelevant information (does a person looking up South Africa need to see the list of other African countries?) and make the whole link structure of Wikipedia more rigid and less flexible. So, I wish those boxes would go away too.... Nyh 06:45, 2 Jun 2004 (UTC)

What I don't like is the text opposite the Other names heading appearing over the horizontal line in the monobook skin, and I don't know how to correct that at the moment. -TonyW 14:41, Jun 4, 2004 (UTC)
Now that I have enabled Show table of contents in Preferences, this seems to have cured that problem. -TonyW 00:35, Jun 5, 2004 (UTC)

Real world example of categorisation

I don't know if this is a good idea, but I thought of categorizing the fundamental constants of physics, chemistry, biology, mathematics, etc. What could such a structure look like, or should they just go under each discipline instead of under "fundamental constants"?

Fund. Const. of Phys --
                        \
Fund. Const. of Chem ----+
                         |
Fund. Const. of Math ----+-- Fundamental Constants of Nature -- Mathematics and Nature Science
                         |
                         .
                         .
                         .
                        etc


or should it be

Fundamental Constants of Physics --
                                   \
                  Other category -- + Physics -+
                                               |
                                               |
Fundamental Constants of Chemistry             |
                                  \            |
                Other category -- + Chemistry -+
                                               |
                                               +- Mathematics and Nature Science
                                               | 
                                               . 
                                               . 
                                               . 
                                              etc 

or any other ideas ? // Rogper 22:41, 30 May 2004 (UTC)

You can do both at the same time. One subcategory can belong to several categories. Andris 00:32, May 31, 2004 (UTC)
To be more explicit about it (remember to prefer lowercase for titles):
 /--Physics<------Fundamental constants of physics-----\
 |                                                     |
 +--Chemistry<----Fundamental constants of chemistry---+
...                       ...                         ...
 +--Foo<----------Fundamental constants of foo---------+
 |                                                     |
 V                                                     V
Natural sciences                         Fundamental constants
 |
 \---->Science
I enjoy these ASCII diagrams far too much. grendel|khan 00:47, 2004 May 31 (UTC)
Me too. It's some kind of art, besides its content! ;) // Rogper 14:01, 1 Jun 2004 (UTC)
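
Andris's point that one subcategory can belong to several categories means the structure is a directed graph and not a strict tree. A minimal sketch of that idea follows. The dictionary layout and the function names are assumptions made for illustration and say nothing about how the wiki software stores categories.

```python
from collections import defaultdict

# Hypothetical store: each category maps to the set of its parent categories.
parents = defaultdict(set)

def add_to_category(child, parent):
    """Record that `child` is a subcategory of `parent` (several parents allowed)."""
    parents[child].add(parent)

def children_of(category):
    """List the direct subcategories of `category`, alphabetically."""
    return sorted(c for c, ps in parents.items() if category in ps)

# Links taken from the ASCII diagram above.
add_to_category("Fundamental constants of physics", "Physics")
add_to_category("Fundamental constants of physics", "Fundamental constants")
add_to_category("Fundamental constants of chemistry", "Chemistry")
add_to_category("Fundamental constants of chemistry", "Fundamental constants")
add_to_category("Physics", "Natural sciences")
add_to_category("Chemistry", "Natural sciences")
add_to_category("Natural sciences", "Science")

print(children_of("Fundamental constants"))
print(sorted(parents["Fundamental constants of physics"]))
```

Both of Rogper's proposed layouts are satisfied at once: the physics constants are reachable from Physics and from Fundamental constants.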

Using other sites as a guide to category structure

  • To minimise reinvention of wheels, consider the category structures of Web directories such as www.zeal.com, which have been painstakingly thought out over long periods. Some of them cater well for the "converging path" problem, eg "Country and Western Dancing" should be locatable under "Music" and under "Folklore" and under "Dance Styles". See, as a starter, Zeal's main top-level categories (Entertainment, Work & Money, Computing, Shopping, People & Chat, Sports, Lifestyle, Travel, Library, and Personal), and follow a few down: http://www.zeal.com/category/preview.jhtml?cid=302562 - and/or see the three paths that lead to the "Country and Western Dancing" category at http://www.zeal.com/category/profile.jhtml?cid=225504 (two of them are not the main line and are listed at "Symlinks to this category"). Wikipedia can improve on those Web directories with its upward links (Zeal doesn't let you easily browse up in any line except the main path; others are probably the same). - Robin Patterson 01:07, 2 Jun 2004 (UTC)

Category format options?

Is it possible to format the listing of the sub-categories and the contained articles? E.g. if I add Albert Einstein to the Category:People, could I use [[Category:People|Einstein, Albert (1879-1955)]] not only to sort him by Einstein, but also to display the results with the year? The current lists in the categories are a little tough to read.

Also, how much introduction text is suggested for a category scheme? One line like e.g. Category:Harry Potter, none like Category:Films, or a lengthy introduction that makes the category more like an article like Category:Japanese culture (made by me for demonstration purposes)? -- Chris 73 | Talk 07:04, 31 May 2004 (UTC)

For topmost categories (such as Category:World War II), it's a good idea to give some discussion. For subcategories (such as Category:World War II people) you probably don't need as much. →Raul654 07:25, 31 May 2004 (UTC)
That [[Category:People|Einstein, Albert (1879-1955)]] trick is supposed to work, but my own experiment with it says it doesn't work properly. -- Cyrius| 07:32, 31 May 2004 (UTC)
As I understand it, it only affects the sort order, not the display. Bummer. -- Chris 73 | Talk 07:57, 31 May 2004 (UTC)
Well, that's more or less useless. -- Cyrius| 17:02, 31 May 2004 (UTC)
It would be nice to specify what the link should look like with the pipe, so that [[Category:Television Shows|Firefly]] in the Firefly (television series) shows Firefly in the Television Shows category, instead of Firefly (television series). - Jeandré, 2004-05-31t20:57z
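
The behaviour described in this thread (the text after the pipe changes the sort order but not the displayed title) can be sketched as follows. The regular expression and the function names are assumptions for illustration only, not the actual parser.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]]; a leading colon,
# as in [[:Category:Name]], is deliberately not matched.
CATEGORY_TAG = re.compile(r"\[\[Category:([^\]|]+)(?:\|([^\]]*))?\]\]", re.IGNORECASE)

def category_tags(wikitext, page_title):
    """Return (category, sort_key) pairs; the key defaults to the page title."""
    tags = []
    for match in CATEGORY_TAG.finditer(wikitext):
        category = match.group(1).strip()
        sort_key = match.group(2) or page_title
        tags.append((category, sort_key))
    return tags

def category_listing(pages, category):
    """Order pages by sort key, but show only their real titles."""
    entries = []
    for title, text in pages.items():
        for cat, key in category_tags(text, title):
            if cat == category:
                entries.append((key, title))
    return [title for key, title in sorted(entries)]

# Hypothetical page texts.
pages = {
    "Albert Einstein": "[[Category:People|Einstein, Albert (1879-1955)]]",
    "Charles Darwin": "[[Category:People|Darwin, Charles]]",
    "Aristotle": "[[Category:People]]",
}
print(category_listing(pages, "People"))
```

Jeandré's request would amount to returning the key in place of the title in the last line of `category_listing`.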

Categories link and picture placement (DELETE?)

'precomment': I think this bug has been fixed. All the workarounds are now unnecessary, unless there are huge warts in something other than monobook. Can we delete this section now? --ssd 03:57, 8 Jun 2004 (UTC)

I've noticed that the current placement of the "Categories" link at the top-right corner of an article will force a picture in the article into the center of the screen. I'm trying to format the Batman article, for instance, so that the image is on the right side of the screen and not in the center. However, I can't seem to figure out how to place the picture without it being forced into the center by the categories assigned to that article. Please advise. --Modemac 11:08, 31 May 2004 (UTC)

Looks good to me. The picture is a bit lower because the category links are above it, but otherwise it looks just like i think it should -- Chris 73 | Talk 12:31, 31 May 2004 (UTC)
I've seen it mentioned elsewhere that it is due to a minor bug in the category tag formatting, and I've been having a bit of bother with it while entering categories in the [[category:sculptors]] section. Basically, the first line of an article has a block of right-aligned white space the width of the category block. If the first line of an article is all text, this is fine (even desirable, as it avoids a visual collision with the category block). However, if you have a right-aligned picture at the top of the article, it can't take a line break, so it tends to get moved into the centre of the page. The workaround is to either:
  • move the image down a paragraph
  • move the image to left align
  • add two line breaks (<br><br>) to the top of the article
None of which are brilliant. You have got round it in the Batman article because the first line is disambiguation text. -- Solipsist 21:19, 31 May 2004 (UTC)
I really think serious consideration needs to be given to moving the location of the Categories links on the page. As mentioned, it causes problems in any article that begins with an infobox or an image that's set up to be displayed on the right. Why not put the list above the header tag used for the article name? I can see hundreds (thousands?) of articles needing to be updated if the current placement is not changed. RedWolf 22:17, May 31, 2004 (UTC)

I think I've fixed the Batman problem. The workaround is to put <br clear=right> before the image. Any text before that will still be placed left of the category list, but stuff after the br will be below it. This seems to be a problem only in the MonoBook skin, as the standard skin does not seem to try to float the category list, and has it flat on one line instead. --ssd 13:12, 1 Jun 2004 (UTC)

Just add the following to your monobook.css (for example you can view mine at User:Tobin Richard/monobook.css):
#siteSub {
  display: block;
}
The problem should then go away. This was mentioned by someone on meta yesterday. I'd imagine it might be added to the main.css sometime soon. - Tobin Richard 01:15, 2 Jun 2004 (UTC)

Changing category names

How is this done? I feel that Category:Actors and Actresses should not have Actresses capitalised. It would also be good to rename the singular categories like Category:Poet. It seems I'm not allowed to move the category page. Lupin 11:30, 31 May 2004 (UTC)

Shouldn't it just be Category:Actors? Actress redirects to Actor after all. Jake 00:41, 2 Jun 2004 (UTC)
I agree. The only way to do this is to individually rename all the tags. Shall we?--[[User:HamYoyo|HamYoyo (Talk)]] 00:59, Jun 2, 2004 (UTC)

See Also

Wikipedia talk:Category

User page in category??

I was surprised to see that it seems possible to add a user page to a category. See http://en.wikipedia.org/wiki/Category:Harry_Potter for an example: there is a load of HP articles, then an individual's user page. Surely this is not a great idea?? --Nevilley 11:33, 31 May 2004 (UTC)

I fixed it for now. It was a link under the section Pages I've made significant contributions to, which was done incorrectly. I changed the link from [[Category:Harry Potter]] to [[:Category:Harry Potter]]. I usually do not touch other people's user pages, but I hope adding the : is OK in this case. -- Chris 73 | Talk 12:22, 31 May 2004 (UTC)
I could see categories of Wikipedians emerging, like the various voluntary pages over on meta (m:Wikipedians categorized by sub-cultural affiliation). But users shouldn't be in the same category with articles. -- Cyrius| 17:05, 31 May 2004 (UTC)
We should also make sure that talk pages do not get added to categories by the same mistake. I made some comments on this talk page about Category:Chess players earlier and forgot : at the beginning. This page was then added to chess player category. Andris 17:36, May 31, 2004 (UTC)
Yeah, that was my bad; forgot the :. Thanks for fixing it. grendel|khan 21:47, 2004 May 31 (UTC)
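
The distinction in this thread, that [[Category:X]] files the page under X while [[:Category:X]] only links to it, can be sketched as a small classifier. The pattern and the function name are assumptions made for illustration.

```python
import re

# Group 1 captures the optional leading colon, group 2 the category name.
CATEGORY_LINK = re.compile(
    r"\[\[(:?)Category:([^\]|]+)(?:\|[^\]]*)?\]\]", re.IGNORECASE
)

def split_category_links(wikitext):
    """Return (memberships, plain_links) found in a piece of wikitext."""
    memberships, plain_links = [], []
    for match in CATEGORY_LINK.finditer(wikitext):
        name = match.group(2).strip()
        if match.group(1):           # leading colon: an ordinary link
            plain_links.append(name)
        else:                        # no colon: the page joins the category
            memberships.append(name)
    return memberships, plain_links

comment = "I made some comments about [[Category:Chess players]] and [[:Category:Harry Potter]]."
print(split_category_links(comment))
```

A check like this, run over talk and user pages, would flag accidental memberships such as the ones fixed above.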

Category redirects

Redirects don't work properly with categories, and I think we will need them to work properly.

Already we have Category:Sport and Category:Sports. What is needed here is:

  1. For Category:Sports to automatically redirect to Category:Sport, so that someone trying to visit Category:Sports gets redirected to Category:Sport, just as with Sport and Sports, and
  2. for all the pages categorized into Category:Sports to be listed on page Category:Sport as if they had been categorized into Category:Sport in the first place.

-- Dominus 11:40, 31 May 2004 (UTC)

Shouldn't it be "Sports"? -- User:Docu
In British English at least, Sport sounds better. Sports might be preferred in American English. Hence we need category equivalence/category redirection. Pete/Pcb21 (talk) 12:58, 31 May 2004 (UTC)
I just encountered this same sport/sports thing and suggested the same feature, before I found this discussion. And you'll note that, at the moment, the plural form exists & has entries whereas the singular form doesn't & doesn't, so clearly the redirect is needed. Elf | Talk 05:13, 4 Jun 2004 (UTC)
I will fix it. Someone must have changed all links to Category:Sport ignoring Wikipedia:Tutorial_(Keep_in_mind)#US_English_vs_British_English. -- User:Docu
Here's an idea: how about having a general category:sport within which we have category:sports which would have articles on all the individual sports. The first category would also have articles on sportspeople, tournaments, equipment, etc.--[[User:HamYoyo|HamYoyo (Talk)]] 08:41, Jun 4, 2004 (UTC)

The category guides say categories are plural. If British English says sport then make a subcategory British sport instead of sport and quit messing up the rest. --ssd 14:58, 5 Jun 2004 (UTC)

The sport/sports thing is just an example. I agree with the general remark: we need category equivalence, something like a hard link vs. a soft link in a file system. Flyingbird 04:40, 9 Jun 2004 (UTC)
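
The two requirements Dominus lists at the top of this section can be sketched as follows: follow the redirect when a category is visited, and list pages tagged with the redirecting name under the target. This is only an illustration of the requested behaviour; the data layout and the function names are assumptions, and the software discussed here did not do this.

```python
def resolve(category, redirects):
    """Follow category redirects to the final target, refusing redirect loops."""
    seen = set()
    while category in redirects:
        if category in seen:
            raise ValueError("redirect loop involving " + category)
        seen.add(category)
        category = redirects[category]
    return category

def merged_members(tags, redirects):
    """Group pages under the resolved category name."""
    members = {}
    for page, categories in tags.items():
        for category in categories:
            members.setdefault(resolve(category, redirects), set()).add(page)
    return members

# Hypothetical data: one redirect and two pages tagged with different spellings.
redirects = {"Sports": "Sport"}
tags = {"Football": ["Sports"], "Cricket": ["Sport"]}
print(merged_members(tags, redirects))
```

Which spelling is the target is a naming decision and is independent of the mechanism.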

People by name

Is there a way to build a list like List of people by name ? -- User:Docu

My guess is add them to the appropriate categories, like The Beatles or Australian Prime Ministers or Canadian hockey players or whatever. Eventually, the parent groups will lead back to People. And then the admins can autogenerate some list from all that data?
Then again, you have (for example) Silverchair in Australian musicians in Australian people in People. Silverchair is a band, not an individual, suitable under Australian musicians, but not under People. So now I am also confused! --Chuq 12:12, 31 May 2004 (UTC)
.. and Category:The Beatles doesn't include only bio articles about each member, e.g. John Lennon. Maybe we could include a category like [[Category:People by name|Lennon, John]] in his article. "People by name" doesn't look that nice on his article though. "Biographies" might do? -- User:Docu

I think definitely that articles that are ultimately included in Category:People should be about individuals only. In terms of bands, how about something like this?

John Lennon---The Beatles members---The Beatles-----Musical groups---------Music
       |                     \                                              /
       |                      British musicians--British people--People   /
       |                              \                          /      /
       -Vocalists----------------------Musicians-----------------     /
       |                  /                 \                       /
       -Guitarists------/                     \-------------------/

(does that look right?) Or is that too much? - Lee (talk) 14:04, 31 May 2004 (UTC)

You omitted Category:John Lennon ;) -- User:Docu
Oh, don't tempt me... ;)
           /                                /
  Alternative musicians   Births by year--Births
   /     \                  /  \          /
Beck-------\--------1970 births--\------/-------------1970---1970s
  \          \                     \--/-------\       /        \
   July 6      \                    /           \   /            \
   births--------\-------Births by day           |/            21st century
                   \                            /|
                     \                        /  |
                    Alternative music        |    \---------------Events by year
                         /        \          |                          /
  Beck albums---Alternative music---\-------/---------\               /
     \                albums          \---/--------\    \           /
       \                                /            \    \       /
         \       /---------------1970 albums-----------\----\--Albums by year
           \   /                                         \    \         \
            |/             /--Rock and roll albums---------\---Albums by | 
           /|            /          /     \                  \  genre    |
         /   \---------/----------/------\  \--------------\   \    \    | 
       /             /          /          \                 \   \    \   \
Let It Be----------/--The Beatles albums---Albums by artist----\---\---Albums-\
                 /        \                   /                 |    \          \
        John Lennon albums--\---------------/   /------\         \     \          \
                     \        \               /  /-----Rock and roll--Music genres  \
251 Menlove Avenue-\   \        \           /  /                                 \  Modern music
                     \   \        \       /  /  Musical groups by genre            \  \
                       \   \        \   /  /      /                 \                \  \
                         \   \        \|  Rock and roll groups  Musical groups-----\   \  \
           /---------John Lennon------\|\            /           /                   \   \  \
         /                             |\ \        /    Musical groups by nationality  \   \  \
       /                               |  \ \    /                    /                  \   \  \
John Lennon------The Beatles members---|-The Beatles----British musical groups             \-Music
  \   \                 \        \     |                     \                               /  
    \  Vocalist  Rock and roll-----\--/                United Kingdom                       |
      \   \        musicians         \                    /                                 |
        \   \       \          British musicians---British people--People by nationality   /
          \   \      |            \                                   \                  /
            \   \----|---\   Musicians by nationality---Musicians------People          /
              \      |     \                           /  /   \        /             /
         Guitarists--|----Musicians by instrument----/  /       \----/-------------/
                      \                               /            /
                 /--Musicians by genre--------------/            /
               /                                               /
             /                                /--------------/ 


These are too much fun. - Lee (talk) 21:00, 31 May 2004 (UTC)

And I thought I had too much time on my hands!! Mike 05:24, 14 Jul 2004 (UTC)
You could probably squeeze a "British musical groups" between "The Beatles" and "Musical groups" :P --Chuq 22:12, 31 May 2004 (UTC)
fixed ;) - Lee (talk) 23:10, 31 May 2004 (UTC)
Hahha.. oh dear.. nice work. What can I say. England, English people, and English musical groups? No, that's going a bit too far.. or is it? --Chuq 01:09, 1 Jun 2004 (UTC)
Ain't they just? grendel|khan 07:33, 2004 Jun 1 (UTC)
Updated it again. I think I may have strayed slightly from the point I was trying to make, though. Basically, no matter what route you take downwards from Category:People, you should only ever get to a person (John Lennon) and not, say, his childhood home (251 Menlove Avenue) or an album he worked on (Let It Be). I need to lie down for a while. (And, yes, even I think this one is a bit excessive). - Lee (talk) 12:04, 1 Jun 2004 (UTC)
Awesome work, and I get your point now that you mention it. Now, about 1970 ... :-) --Zigger 21:42, 2004 Jun 1 (UTC)

I admit it, now I'm just being silly. (It now wraps around, top and bottom) - Lee (talk) 22:45, 2 Jun 2004 (UTC)
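
Lee's rule, that every route downwards from Category:People should end at a person, can be checked mechanically. A minimal sketch follows, assuming hypothetical dictionaries for subcategory links and category contents. The category names come from the diagrams above, but the data itself is invented for the example.

```python
def articles_under(root, subcategories, articles):
    """Collect every article reachable by walking down from `root`."""
    seen, stack, found = set(), [root], set()
    while stack:
        category = stack.pop()
        if category in seen:         # a category can be reached by several routes
            continue
        seen.add(category)
        found.update(articles.get(category, ()))
        stack.extend(subcategories.get(category, ()))
    return found

# Hypothetical links: parent category -> list of subcategories.
subcategories = {
    "People": ["British people", "Musicians"],
    "British people": ["British musicians"],
    "Musicians": ["British musicians"],
    "British musicians": ["The Beatles members"],
    "The Beatles": ["The Beatles members", "The Beatles albums"],
}
# Hypothetical contents: category -> articles filed directly in it.
articles = {
    "The Beatles members": ["John Lennon"],
    "The Beatles albums": ["Let It Be"],
}
people = {"John Lennon"}

# Anything left over is an article under People that is not about a person.
print(articles_under("People", subcategories, articles) - people)
```

If Category:The Beatles were itself filed under British musicians, Let It Be would show up in the result, which is the situation the rule is meant to prevent.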

Naming conventions

Removed from the page:

Non-gendered terms are required. Category:Actors and Actresses is used instead of Category:Actors or Category:Actresses.

I fail to see why we couldn't have gender specific categories, e.g. there is a list of famous women in history, and we could also have a similar category. -- User:Docu

The idea that categories should be plural really should be re-examined. Shakespeare was a playwright, not a playwrights. Elizabeth II is a monarch, not a monarchs, a queen, not a queens, a female, not a females, a woman, not a women. John XIII was a pope, not a popes. - Nunh-huh 20:52, 31 May 2004 (UTC)

Well, the category links to what group(s) the people belong to. Elizabeth II isn't the only monarch, but is a member of the group of monarchs, John XXIII isn't the only pope, but is one of the many people who make up the group of popes. I can see your argument, as well, so I have no preference. Gentgeen 23:21, 31 May 2004 (UTC)

Non-existent articles

I propose that non-existent articles in a category should always be listed on the category page itself. Example: Category:Documentary films. This ensures that we can completely port over existing lists.--Eloquence* 15:58, May 31, 2004 (UTC)

I agree, this is an excellent idea! Andris 17:44, May 31, 2004 (UTC)
I think it should be automated, as mentioned in Lists v. categories - Omegatron 01:09, Jun 11, 2004 (UTC)

Countries vs. occupations

I just categorized chess players and I am wondering if they should be categorized by country (with each player's page linking to, say, Category:Latvian chess players and then Category:Latvian chess players linking to Category:Chess players). This page seems to imply so but I do not quite like this idea.

Right now, a person can go to Category:Chess players and use that to look up players. That is very convenient. If players are subdivided by country, he will have to click on each country. And, if a person is looking for a player but does not know which country that player is from, he will have particular difficulty finding the player. So, I think it might be better to have links from each player's page to both Category:Chess players and the country's category (Category:People of Latvia).

I am fine with dividing politicians into categories by country, since, if you are looking for politicians, you most likely are looking for politicians from a particular country. I am not quite fine with doing that for occupations like chess or mathematics, where national identities are less emphasized. Any thoughts? Andris 16:25, May 31, 2004 (UTC)

Just call them chess players. If you can find some useful subdivision of "chess players", then use that. But I am in full agreement that dividing by country is not a useful way of dividing that category. Hopefully somebody will come up with a tool that makes it easy to perform bulk recategorization so decisions like this can be easily changed. -- Cyrius| 16:36, 31 May 2004 (UTC)
For politicians grouping by countries seems indeed a better idea compared to chess players. Maybe we should update the sample to reflect this. -- User:Docu


Silly question...lists of Categories?

Silly question here...why on the Categorization page are we making lists of categories? I thought that's what the categorization system was for? --ssd 17:33, 31 May 2004 (UTC)

(Edited the title) -- User:Docu
The idea here is that one can list which categories one is working on, and display a simplified tree structure for the categories. You can't do that with a category page. -- Cyrius| 17:43, 31 May 2004 (UTC)
I'm confused. Clearly there are categories being discussed on this Talk page that are not on the Project page. How does one find out if a category already exists or not? --bodnotbod 21:34, May 31, 2004 (UTC)
I usually type http://en.wikipedia.org/wiki/Category, and then the name of the category preceded by a colon (:) (case sensitive, with _ instead of space). There ought to be a search though.--[[User:HamYoyo|HamYoyo (Talk)]] 21:42, May 31, 2004 (UTC)
You can find them on Special:Categories, but that doesn't tell you how many elements they have nor which ones don't have a parent category. -- User:Docu
OK, I've found the page that lists existing categories. And I've, perhaps rashly, created Category:Beat writers. I now need this to be a sub category of Category:Writers by genre. How do I do that?
In general I am finding the category pages are not helping me cope with this new institution. For example, the Special:Categories page gives no indication of what one should do if they wish to create a new category: the intuitive method of editing the page is not available. --bodnotbod 01:36, Jun 1, 2004 (UTC)
I added Category:Beat writers to Category:Writers by genre for you. (yes, there is an edit page, you must have missed it??) Part of the purpose of this discussion page is to help decide how to pick names for new categories.  :) --ssd 05:39, 1 Jun 2004 (UTC)
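
HamYoyo's recipe for checking a category by hand (the base address, then Category, a colon, and the name with underscores for spaces) can be written down as a tiny helper. This is a sketch only: the function name is hypothetical, and it builds the address without checking whether the page exists.

```python
from urllib.parse import quote

def category_url(name, base="http://en.wikipedia.org/wiki/"):
    """Build the address of a category page; names are case sensitive."""
    # Spaces become underscores; other unsafe characters are percent-encoded.
    return base + "Category:" + quote(name.replace(" ", "_"))

print(category_url("Beat writers"))
print(category_url("Writers by genre"))
```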

Singular writers

It looks like there is a slight problem over in the writers category where some subcategories have been created in the singular version. For example:

  • Category:Children's writer
  • Category:Playwright
  • Category:Poet

I've already seen that it is non-trivial to move/rename a category, and Playwrights and Poets already contain quite a few members. Is this a case where category redirects are required? -- Solipsist 18:27, 31 May 2004 (UTC)

The naming convention indicates it should be singular, I think. Also, there are categories called writers and categories called authors, so the singular/plural thing is not the only problem. I was going to move one of these category names and edit all the articles under it, but wiki wouldn't let me, so I've left the names as is. --ssd 20:48, 31 May 2004 (UTC)

If you look at Wikipedia:Categorization you will find that all the examples are given in the plural. We're trying to revamp the system by renaming all singular titles to plurals.--[[User:HamYoyo|HamYoyo (Talk)]] 20:53, May 31, 2004 (UTC)
The trouble is that the tags are applied to singular items, suggesting they should be singular. Milton was a poet, not a poets. - Nunh-huh 20:58, 31 May 2004 (UTC)
Thanks HamYoyo for sorting out the poets. And I agree with Nunh-huh that the plurality often seems back to front. Someone was suggesting it was a US <-> UK thing. I end up having to phrase it as a TV game show like The $25,000 Pyramid — "And the category is..."
They're not actually 'tags', but 'categories', and categories are made up of many constituent parts; e.g. poets, dinosaurs, countries. The word 'tag' is only used to refer to the written script which places an article in a category. Could the author of the last message remember to sign?--[[User:HamYoyo|HamYoyo (Talk)]] 22:16, May 31, 2004 (UTC)

Categories and the Wikipedia namespace

I've noticed people adding Wikipedia: namespace articles to categories. I don't think this is correct. E.g. someone just added Wikipedia:WikiProject Automobiles to three categories, one of which, Category:Automobiles, is a definite 'encyclopedia space' category. In general, we follow a 'don't link to Wikipedia: from the article namespace' policy, for good reason, and this appears to violate it.

I can see the point of categorising Wikipedia: namespace articles too, but I have my reservations about having these categories in the same namespace as the regular categories.

Thoughts? —Morven 20:04, May 31, 2004 (UTC)

I hesitated to do it. Finally, I added a WikiProject as a see also to the description of the category only. On the other hand, we'd probably avoid some problems, if the category feature would sort articles by namespace or only include pages in article namespace.
BTW If we want to create a category for WikiProjects, can we do it? Should it be Category:WikiProjects or Category:Wikipedia:WikiProject ? -- User:Docu
Category:WikiProjects, with Category:Wikipedia as a supercategory. -- Cyrius| 00:59, 1 Jun 2004 (UTC)
Agreed.
James F. (talk) 12:14, 1 Jun 2004 (UTC)
We have a use/mention problem here. Templates for infoboxes often appear on WikiProject pages. These templates can have Category links. If we put the template on the page verbatim, then the WikiProject page becomes a member of the category. If we mention it using, e.g., [[:Category:Automobiles]], then I guarantee that people will cut-n-paste the templates and forget to remove the colon, which means that the pages get left out of the category. How to fix? -- hike395 07:29, 6 Jun 2004 (UTC)

Hierarchicalization

Should we add the people category to articles about people, until the software automatically adds entries from subcategories (and avoids duplication)? -- Jeandré, 2004-05-31t21:31z

No. Put people in the most specific category that applies. In general, each article should fit into only a few categories. →Raul654 21:27, May 31, 2004 (UTC)
If each personal entry was under Category:People, it would be useless as there'd be so many.--[[User:HamYoyo|HamYoyo (Talk)]] 21:29, May 31, 2004 (UTC)
What tags should biographical articles get? One for each profession? One for a sex? One (or more) for a sexuality? One (or more) for a nationality? One for a cause of death? One for a burial site? It seems like it would be good to agree what's useful before adding (for example, isn't "Austrian Nationality" a better tag than "Austria", etc.) Does "Nobel Prize winner in physics" make a "scientist" tag superfluous (probably, if "Nobel Prize winner in physics" has "scientist" as a super-class, but is there an easy way to find all the superclasses a class is in?). What tags should articles on elements get? compounds? medications? animals? plants? etc. - Nunh-huh 21:43, 31 May 2004 (UTC)
Probably, they should be put in the categories for which they are famous. An otherwise anonymous scientist who won a Nobel Prize might only go in that category, while one who is very famous in a particular field should be listed in a category for that field. Very few people are famous merely for being men or women or sexually aberrant, but there are some. I'm not sure why you would put someone in a scientist category unless you were just totally clueless as to what field they did their research in, or perhaps they research all fields? --ssd 05:43, 1 Jun 2004 (UTC)
Like many areas, the biographical articles really need a good metadata capability. There should be a good way to specify all of this information about a person, such as gender, occupation, birth date, death date, nationality, and so on. Categories certainly can be used for this. In order to decide what should be categorized, we should look at what has been manually done in lists. We don't see people working diligently on List of people who are buried in Paris France. Yes, there are some silly lists, but none that are being focused on as a major effort. The dates, nationality, and occupation are the big ones, with gender a useful one behind that. Considering the types of lists people like to create and maintain here, I personally would suggest that a bio article should contain something like this: [[category:people]], [[category:musician]], [[category:Swedish]], [[category:male]], [[category:day of birth: September 5]], [[category:year of birth: 1945]], [[category:day of death: April 12]], [[category:year of death: 1997]] or similar. The point is not that any category needs to be "small"; it is that data needs to be identifiable. Or is retrofitting proper bio metadata into categories using the wrong tool for the job? If so, what's the right tool? --Amillar 05:40, 1 Jun 2004 (UTC)
[[category:people]] is certainly redundant in this example. It will be so huge that no one will use it. Except for a program that extracts all biographies, but, in that case, [[category:musician]] is a subcategory of [[category:people]] and the program could just as well go through subcategories of [[category:people]].
I would say, [[category:musician]] and [[category:Swedes]] are necessary. [[category:people]] is redundant. The rest is not redundant, but will we need to list all people born on September 5? Might not be worth the effort.Andris 06:22, Jun 1, 2004 (UTC)
Sure, but do we need to list any people born on September 5? Somebody does think it is worth the effort, because such listing pages are currently being maintained. My point is that such information should be maintained in just one place (the actual article), and let the list be generated automatically from it. It sure isn't worth the effort to maintain the individual listings in my opinion, even though we're doing that. --20:00, 1 Jun 2004 (UTC)
People born on a specific date are already listed (or should be) on the page for that date! Let's not try to re-invent the wiki here ;-) The more I've been looking at the category stuff the more I'm thinking that we need some sort of parser to show the tree we are working on; the flat list available presently is useless for determining any structure or logic. Will have a think on methodology ... --VampWillow 00:20, 3 Jun 2004 (UTC)
Actually, that's exactly what we are doing. Categories are an octagonal wheel to replace the square ones we've been using with the list pages. And the three column sorted list with letter headings is a vast improvement IMHO. We'll see how it works when the lists get bigger. And don't forget to list them as [[category:musician|Last, First]] so they get alphabetized correctly. --ssd 03:33, 8 Jun 2004 (UTC)
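
The sort-key form [[category:musician|Last, First]] mentioned above can be generated mechanically. The following Python sketch is only an illustration: the function name category_tag is hypothetical, and it makes the naive assumption that the last whitespace-separated word of a name is the surname, which will be wrong for many real names.

```python
def category_tag(category, full_name):
    """Build a category tag with a 'Last, First' sort key.

    Naive assumption (illustration only): the last whitespace-separated
    word of full_name is the surname.
    """
    parts = full_name.split()
    if len(parts) < 2:
        # Single-word (or empty) names need no reordering; omit the sort key.
        return f"[[category:{category}]]"
    sort_key = f"{parts[-1]}, {' '.join(parts[:-1])}"
    return f"[[category:{category}|{sort_key}]]"
```

For example, category_tag("musician", "John Lennon") yields [[category:musician|Lennon, John]], so the article is alphabetized under L rather than J.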

Actors and actresses

Okay, someone draw up a map for Category:Actors_and_actresses. I'd like to suggest that we scrap that category in favour of Category:Actors as it is much too long, and include both sexes in it.--HamYoyo (Talk) 22:36, May 31, 2004 (UTC)

Alright, I'll do it:

      /People by nationality-----Actors by nationality\          /Stage actors
     /                                                 \        /                         
    /                                                   \      /                            
People--------------People by profession-----------------Actors                         
    \                                                   /      \
     \                                                 /        \
      \People by period---------------Actors by period/          \Screen Actors

--HamYoyo (Talk) 23:04, May 31, 2004 (UTC)

Someone once made various lists for actors .. I suppose they are at list of actors. -- User:Docu

Many women describe themselves as "actors"; for them, the current heading isn't NPOV! - Robin Patterson 00:53, 3 Jun 2004 (UTC)

I'm sorry... NPOV?--HamYoyo (Talk) 10:52, Jun 3, 2004 (UTC)

Pages with different entries

Should we split articles that deal with more than one thing with the same name (where one is not obviously more important than others), eg William Baldwin?, before adding categories to the page? - Jeandré, 2004-05-31t23:00z

Well, eventually such articles will need to be split anyway. If you want to split them as you're adding categories, that's fine, but if not, I don't see any harm in adding categories before splitting. --Camembert
On unsplit pages, it might be a good idea to add the category markers at the end of the text section they are relevant to, rather than the end of the article itself, to make splitting easier and less error prone in the future. If the category applies to both sections then...well...uh...just don't list the category twice. 8-> --ssd 03:43, 8 Jun 2004 (UTC)

Some questions

  1. Will someone write a bot to make this easier?
  2. Can things be in two categories? e.g. Richard Dawkins is in writers, but he's also a biologist... ? Dunc Harris | Talk 22:44, 31 May 2004 (UTC)
I'm tempted to say that for occupations, people should be in the occupation that most closely fits what they do. Richard Dawkins is a biologist first and foremost (ditto for Stephen Hawking). →Raul654 22:51, May 31, 2004 (UTC)
Whatever the specifics on Dawkins, in the general case it's fine for things to be in more than one category, yes. Arrigo Boito, for instance, is notable both because he wrote the music for operas and because he wrote the words for operas; therefore, he's in both Category:Opera composers and Category:Opera librettists. --Camembert

Where to put lists?

Example problem: Should the List of Japanese people go under Category:Japanese people or its parent category Category:People by nationality? Or should the whole list be copied to the Category:Japanese people and made a redirect? -- Chris 73 | Talk 23:48, 31 May 2004 (UTC)

If it's strictly A-Z you could replace it with a link to Category:Japanese people (I'd avoid a cross-namespace redirect). Otherwise, you could add it to Category:Japanese people, Category:lists, and other categories where it may be helpful. -- User:Docu
People are looking for articles. Lists are one system of helping people find articles by subject. Categories are another. It doesn't make sense to assign a list to a category. Think how someone would use the encyclopedia -- they use the list of categories and click on Category:Japanese people to get a list of them. Then on that list, they click on List of Japanese people, and they get another list of them! Why? If the lists are identical, you are putting the article they're seeking one step further away. GUllman 20:57, 2 Jun 2004 (UTC)
I agree. Articles that are ultimately included in Category:People should be articles about people, not lists of people. List of Japanese people should be in Category:Lists of people, or even Category:Lists of people by nationality (neither of which should be in Category:People). - Lee (talk) 21:47, 2 Jun 2004 (UTC)
Thanks for your agreement, Lee, but you still don't quite understand. No categories should be called "List of" <something> or "Lists of" <something>. A category IS a list. The subcategories REPLACE the "Lists of" lists. In the above example: move all articles in Category:Lists of people to Category:People and delete the former. All the articles in List of Japanese people would be in the Category:Japanese people, which is a subcategory of Category:People. You don't need a Category:Lists of people by nationality at all, because now they are in the list of subcategories on Category:People. GUllman 23:10, 2 Jun 2004 (UTC)
I don't think you quite got what I meant. List of Japanese people is not an article about a person, therefore it does not belong in a path that leads to Category:People. That's my reasoning at least. Category:Lists of people categorizes ("lists") articles which are in themselves lists of people. I agree in most cases the "List of people" style pages will be redundant in favor of categories, but if the pages stay, then the category they should be in is Category:Lists of people and not Category:People. - Lee (talk) 23:25, 2 Jun 2004 (UTC)
Sensible.--HamYoyo (Talk) 23:18, Jun 2, 2004 (UTC)
That should be Category:Lists of Japanese people (but I think you knew that). It might be nice to link (or copy) the article List of Japanese people in the body of Category:Japanese people with the intent to eventually convert it. A trend I have seen with this sort of thing is to list in the category body articles not yet written, and categorize the articles that have been written. --ssd 21:54, 5 Jun 2004 (UTC)

Breaking the hierarchy on purpose

I have placed Pudding, Pie, and Yoghurt into both Category:Desserts and Category:Food and drink, even though Category:Desserts is a subcategory of Category:Food and drink. That's because these items are each really two closely related ideas: a food item which may be savory, and a dessert made with that food item prepared sweet.

I'm pointing this out just so that we don't get too strict on the hierarchicalization policy. Sometimes it needs to be broken.

If we did require strict hierarchicalization, the alternatives I see would be:

  1. Create a new subcategory of Category:Food and drink, called, say, Foods that are often but not always desserts. Yuck.
  2. Just put these things into Category:Desserts. This is easy, but it's also just wrong. A chicken pot pie is not a dessert, and no amount of stamping our feet will make it one.
  3. Just put these things into Category:Food and drink. That's better — at least it doesn't require knowingly encoding wrong information into WP. But it means that when someone goes looking for desserts, they won't see Pie, which they should, because pie is a dessert — except when it's not.

--TreyHarris 01:01, 1 Jun 2004 (UTC)

Where does this leave cow pies? They are neither dessert nor food. - Nunh-huh 02:41, 1 Jun 2004 (UTC)

Cow pies are also not pies, at least in the conventional sense. -- Cyrius| 05:02, 1 Jun 2004 (UTC)

I'd just like to add that I think an overly strict adherence to hierarchy is problematic. Some of the examples provided on the main page are particularly dubious - Paul McCartney is to be classified as a Beatle rather than as a British musician? Is this at all justifiable? On what basis should there even be a category called Category:The Beatles consisting of four people? Especially considering that the article The Beatles quickly links to all four, and all four have a link to that article towards the top. Add to that the fact that all four of them also produced solo work, making them British musicians in their own right, not just through the Beatles. Ga. At any rate, the tendency of categories to break down into subcategories that are not mutually exclusive militates towards listing things in most categories that apply, even if they fit into some subcategory - there is always (or, at least, frequently) the question of whether the particular subcategory applies. There is also the fact that you end up with a system where Category:Philosophers ends up including only philosophers from countries that didn't produce many philosophers, and doesn't include Plato and Aristotle (as being in Category:Ancient philosophers) or Immanuel Kant and Arthur Schopenhauer (as being in Category:German philosophers), or whatever. The whole thing quickly becomes silly. Until items categorized in subcategories also get categorized in the larger categories, I think we should maintain most of these people in both the larger category and the subcategory. And for many instances, as this one about pies above, it makes sense to do this even if we do get to that latter point. john k 06:57, 2 Jun 2004 (UTC)

I think we should do it "correctly" and not have to fix everything later if/when the code arrives because we categorized everything in every possibly applicable category instead of the most specific applicable categories. -- Cyrius| 18:17, 2 Jun 2004 (UTC)
Oh, and you forgot about Pete Best and Stuart Sutcliffe :) -- Cyrius| 18:21, 2 Jun 2004 (UTC)
I'm rather thinking that unless a category has a sufficiently wide remit to get a reasonable number of entries it shouldn't exist. Category:Philosophers should be a link on all philosophers, Category:Musicians on all musicians. Add multiple categories (e.g. Category:German people, Category:Guitarists) or whatever, but keep the category itself fairly major, otherwise the tree will get so nested that it will become unusable by, uh, 'users'. --VampWillow 00:26, 3 Jun 2004 (UTC)
I tend to agree that categories shouldn't be so specific that they don't have much in them. However, we're only just now setting up categories, so I think it would be better right now to be nice and specific, and build category structure to fill out later, and maybe trim and un-nest it later. --ssd 22:07, 5 Jun 2004 (UTC)

Australian Prime Ministers

I added John Howard, Paul Keating and Bob Hawke to this category. They have since been removed, the user (Adam Carr) citing "horrible whitespace gaps". I don't see it on my screen (IE or Firefox) so I thought the problem had been fixed? Am I right to add them back? --Chuq 01:12, 1 Jun 2004 (UTC)

If they are Australian Prime Ministers, then they should be in this category. The John Howard, Bob Hawke and Paul Keating articles looked fine to me. There is a small gap at the top, but that is due to the category link. I think it is not only OK, but necessary to add them again. -- Chris 73 | Talk 01:36, 1 Jun 2004 (UTC)
They can be alleviated by not leaving each category tag on one line. Remove whitespace in the wikitext. Dysprosia 02:44, 1 Jun 2004 (UTC)
It also helps to put the category and language stuff at the bottom of the article instead of the top. --ssd 05:48, 1 Jun 2004 (UTC)
It seems that the category tabs affect the article line spaces. I made a test, and added a category and lots of lines to the top for a sample article here (Adachi Morinaga), then added the category and lots of lines to the bottom (Adachi Morinaga), and then added the category to the bottom with no lines in between (Adachi Morinaga). Maybe somebody can make a bot that moves all interwiki and category links to the bottom? -- Chris 73 | Talk 03:38, 1 Jun 2004 (UTC)
On further testing, the reason I'm not seeing the whitespace is because I'm not using the (broken) monobook skin. I changed to monobook and viewed the old Paul Keating version with the category listing, and saw the problem. This is a problem with the skin, NOT the categories. Yet another reason why Standard should be the default skin for unregistered users until the monobook skin is fixed. I think this is discussed elsewhere.
Also with the Adachi Morinaga articles by Chris above - the only thing I can see causing whitespace is several lines of white space added to the article. This isn't a category, or even a skin problem, it's an adding-blank-lines-for-no-reason problem :P I know you were testing, Chris, but I don't think there is anything unexpected about "adding blank lines" causing "blank lines in an article"! --Chuq 04:10, 1 Jun 2004 (UTC)
The example was excessive. What I meant was that I sometimes add the interwiki and category links to the top, and add one or two newlines to separate them from the article. Those newlines also show up, but of course much less than in the above examples.
It also seems that the problem some people have is due to the new skin. -- Chris 73 | Talk 06:10, 1 Jun 2004 (UTC)
I eventually got an answer from Adam Carr (see his/my talk pages). He knew of a workaround, but he decided removing the categories was a better idea than simply adding the six characters needed to apply the workaround. He doesn't seem to like (or understand?) the concept of categories either, saying the Australian Prime Ministers category is a waste of time because we already have a Prime Ministers of Australia article. Anyway, I've since applied the workaround (see Bob Hawke etc). It involved moving the Category links up the top, which I thought was against agreed policy, but until the skin is fixed it's the best we can do. It's unfortunate that what could have been a one minute fix involved a lot of mucking around and wasted time for all of us, but thanks all for your responses! --Chuq 06:54, 1 Jun 2004 (UTC)
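
As a rough illustration of the bot suggested in this thread (one that moves all interwiki and category links to the bottom of a page), here is a minimal Python sketch. It is an assumption-laden toy, not an existing bot: the function name move_links_to_bottom is hypothetical, and the pattern assumes that interwiki links use a two- or three-letter lowercase language prefix such as [[de:...]]. Colon-prefixed mentions like [[:Category:X]] are deliberately left in place.

```python
import re

# Assumed patterns (illustration only): a category link, or an interlanguage
# link with a two- or three-letter lowercase prefix such as [[de:Bob Hawke]].
# Trailing spaces and one newline are consumed so no blank line is left behind.
MOVABLE_LINK = re.compile(
    r"\[\[\s*(?:[Cc]ategory|[a-z]{2,3})\s*:[^\]\n]+\]\][ \t]*\n?"
)


def move_links_to_bottom(wikitext):
    """Return wikitext with category and interwiki links moved to the end."""
    links = [m.group(0).strip() for m in MOVABLE_LINK.finditer(wikitext)]
    if not links:
        return wikitext  # nothing to move; leave the page untouched
    # Remove the links, then trim the blank lines they were separated by.
    body = MOVABLE_LINK.sub("", wikitext).strip()
    return body + "\n\n" + "\n".join(links) + "\n"
```

Because the leading blank lines are stripped along with the links, this also removes the stray newlines at the top of the article that were showing up as whitespace. A real bot would need a proper list of language codes and a human check of the result.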

I'd have to say that while I think that many categories are useful, categories are sort of pointless for groups that are more clearly presented in terms of chronological lists (e.g. British monarchs, Australian prime ministers, chancellors of the exchequer, or whatever). Especially since for most of these we already have succession tables that link to the main article, and the main article has links to all the individuals under consideration. Category:Australian politicians would make sense, but I don't really see what you get out of having a special category for prime ministers. john k 20:03, 1 Jun 2004 (UTC)

Well, there is a lot more to be done. Prime Ministers was just one of the first subgroups of Australian politicians I could think of.. other ones I plan to make:
                               etc.
                                    \
      NSW MLCs   Victorian pol's -- Aust. state pol's 
             \                      /                \
          New South Wales pol's  --/                 Aust pol's by government           Politicians ---
               /                                              /           \               /       \    \
          NSW MLAs                     Aust. senators        /             \             /         \    \
              /                                     \       /               \           /           \    \
        NSW Liberal MLAs                    Aust. federal pol's       Australian politicians     People   \
          /                                       /                         /     \     \           /      \
         /                                       /                         /       \   Australian people    \
        /                                       /                         /         \           \            \
       /                                       /                         /           \         Australia      \
 John Howard           Aust. PM's ---- Aust. MHRs                       /             \         /              \
           \          /                 /                              /            Aust. politics --------- Politics
    Aust. lib. PMs   /    Aust. Lib. MHRs ---- Aust. Liberal pol's    /                         
                 \  /       /                             \          /                        
                Aust. Lib. leaders                Aust. pol's by party         
                                                         /
                                             Aust. Labor pol's


Abbreviations (some should be obvious): Aust=Australian, pol's=politicians, Lib=Liberal, NSW=New South Wales, MHR=Member of the House of Representatives (Australian federal lower house), MLA/MLC=Member of the Legislative Assembly/Council (NSW state lower/upper house).

The major downside I can see here is that many of the category names are quite long. There are more lines that can be drawn, but I'm not as good with this as ScudLee!

Some may also think it is arguable whether politicians are people :P --Chuq 00:56, 2 Jun 2004 (UTC)

I think "Australian politicians" or "Australian liberal politicians" "Australian Labor politicians" "New South Wales politicians" are fine as categories. But categories based around jobs of which there is only one such person at a time are, I think, basically worthless. What does the Australian prime ministers category give you that the page on Australian prime ministers doesn't already do? john k 02:40, 2 Jun 2004 (UTC)
IMHO categories are best at grouping distantly related things whose best order is alphabetical, where it isn't exactly sane to try to keep a central page that lists them all. The article Prime Minister of Australia has a LOT of related data per entry, and is sorted chronologically, and this data probably doesn't change more than once every year or so... all three of these items make it inappropriate for categories to replace the list in the page. If the page exists and isn't going to be replaced by the categories, do we need them too? --ssd 00:11, 2 Jun 2004 (UTC)
Yes, I think that's exactly right. john k 00:41, 2 Jun 2004 (UTC)
We could move the list to the category page, so that the list is displayed at the beginning of the category. Just a thought -- Chris 73 | Talk 03:19, 2 Jun 2004 (UTC)

That'd be fine, except that I don't see any real way in which such categories are useful at all. To whom is a page with an alphabetical listing of British monarchs useful? john k 04:00, 2 Jun 2004 (UTC)

Well to whom is a listing of Australian politicians useful? Who knows, but we still do it.
If I was to re-do the diagram above, which groups would be "not useful"? You suggest Aust PMs. Aust Lib PMs, I presume as well. What about Aust Lib leaders? --Chuq 05:20, 2 Jun 2004 (UTC)
It's not a question of usefulness. But categories for topics that are just long alphabetical lists of vaguely related names make sense, because a category makes as much sense for such things as a long alphabetical list. A category is, though, much less useful for lists of officeholders, and so forth, who already have a clear mechanism linking them to each other, and to other articles, that makes more sense than a loose alphabetical list. I don't think any list that would allow for a succession table should have a category - so no to Australian prime ministers, no to Australian liberal leaders, no to chancellors of the exchequer, no to Supreme Allied Commanders, Europe, no to General Secretaries of the CPSU, and so on and so forth. Yes to large, broad groupings where a manual list article is likely to be incomplete and where the main article is unlikely to link to the list article. john k 06:07, 2 Jun 2004 (UTC)
OK. On reading your comments, I could see where you were coming from, and agreed that they make sense. I then proceeded to create and link some of the other categories mentioned above (see my user contribs page if you want the full details). On editing the Aus PM category page, I saw someone had added the category to Category:Heads of government. It then occurred to me that if someone generated a full list of the Heads of government category, Australian PMs would not be included. The list of Aus PMs may be finite and generally static, but as part of a larger group that they are in, they are just part of an alphabetical list of names! Something to think about before removing the category completely. I have moved John Howard and Bob Hawke to Category:Australian Liberal Party MHRs and Labor MHRs respectively, but haven't changed Paul Keating yet, or added any others. Chuq 10:58, 2 Jun 2004 (UTC)

Categorizing articles that fall under WikiProjects

I'm a little concerned with the issue JamesDay brought up in the main Wikipedia:Categorization page; that people are going off and enthusiastically categorising inside areas of the 'pedia that are covered by existing WikiProjects. While I'm certainly not suggesting that this should be barred, I do think that trying to let active WikiProjects decide on their own categorization is a good thing.

So please, if there is a WikiProject covering an area you want to categorise, how about posting something in that project's talk page -- at least telling people what you're up to? —Morven 02:34, Jun 1, 2004 (UTC)

I think some of the WikiProjects need to be converted to categories and drop (at least some of) their silly boxes, banners, and lists. Perhaps if they started, then others wouldn't be doing it for them. Of course, the others doing it for them should be doing it right, and should be using the structure they already built to get it right. Of course, we haven't exactly figured out what categories are better at yet either, so perhaps not everything should be converted... --ssd 00:02, 2 Jun 2004 (UTC)
Some of the lists can definitely be deleted. Problem with categories is you can't categorise things not already written ... —Morven 00:12, Jun 2, 2004 (UTC)
Note Erik's point above: if a category is to replace a list, then it is very important to add the non-existent articles to the non-auto-generated section of the category page. Pcb21| Pete 07:30, 2 Jun 2004 (UTC)
Please make sure to check that the list is really an alphabetical one that doesn't include additional information. -- User:Docu
WikiProjects generally include infoboxes; those are generally candidates for the Template namespace rather than categories. -- User:Docu
It's generally a good idea to do that and especially to check existing Lists of articles by category for categorization schemes they are already using. -- User:Docu

Circles

Looking at the new category system, the one thing it should do in future that it doesn't do now is draw nice graphical representations of our hierarchy trees. To see whether this could be possible one day, I tested whether a category can be a super-category of its own sub-category: see Category:Testing circularity, Category:Testing circularity 2. In other words: we can do infinite loops with the category tags, which makes drawing algorithms (or any other automatic use of the category tags, like listing the content of a sub-category in its super-category) quite a bit more complex. Is this a bug, is it a feature, should this be forbidden, regulated, automatically recognized and marked? -- till we | Talk 11:39, 2 Jun 2004 (UTC)

I think it's technically a feature, but should be avoided in the general categorization scheme. Any category traversal algorithm (be it for graphing or anything else) will have to handle encountering loops, because we can't guarantee that they won't exist. -- Cyrius| 18:06, 2 Jun 2004 (UTC)

Avoiding loops in a traversal algorithm is trivial. All that is necessary is to keep a list of already visited nodes, and not re-visit them. --ssd 18:56, 2 Jun 2004 (UTC)

It's trivial as long as you remember that you need to handle it. -- Cyrius| 20:07, 2 Jun 2004 (UTC)
And it makes algorithms more time-consuming if you have to test every node with every node you have already visited. -- till we | Talk 20:18, 2 Jun 2004 (UTC)
It gets a little harder if someone creates Category:all categories which don't include themselves ;-) -- Solipsist 21:07, 3 Jun 2004 (UTC)
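
The visited-list approach described in this thread can be sketched as follows. This is an illustrative Python example, not MediaWiki code: the function name articles_under and the two plain dictionaries standing in for the category data are assumptions. Keeping the visited categories in a hash-based set means each "have I seen this one?" check takes roughly constant time on average, so a node does not have to be compared against every node already visited.

```python
from collections import deque


def articles_under(category, subcategories, articles):
    """Collect every article in `category` and in all of its subcategories.

    `subcategories` and `articles` are plain dicts (hypothetical stand-ins
    for the real category tables) mapping a category name to a list of names.
    """
    visited = {category}      # categories already queued or processed
    queue = deque([category])
    found = set()
    while queue:
        current = queue.popleft()
        found.update(articles.get(current, []))
        for child in subcategories.get(current, []):
            if child not in visited:   # a loop leads back to a seen category
                visited.add(child)
                queue.append(child)
    return found
```

With two categories that list each other as subcategories, as in the Testing circularity experiment, the traversal visits each one once and then stops instead of looping forever.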

Web sites

When I noticed that Category:Web comics only derived from comics, I created a Category:Web sites. I put slashdot, alexa, google, and wikipedia in there and I figured it would flesh itself out. However, I now realize that I should have planned this better first. Can someone help me with one of those nifty ASCII trees for this categorization? Any ideas on what the subcats should be (Portals, Search Engines, Blogs, etc?) - DropDeadGorgias (talk) 17:00, Jun 2, 2004 (UTC)

Actually, shouldn't it be Category:Websites? The concatenated version is used by website and list of websites. Fredrik (talk) 17:02, 2 Jun 2004 (UTC)
Uhh, you're right... Sorry 'bout that. Can an Admin move the cat page? - DropDeadGorgias (talk) 17:08, Jun 2, 2004 (UTC)
I started a new category Category:Websites, changed the links, and listed Category:Web sites for deletion. Fredrik (talk) 18:21, 2 Jun 2004 (UTC)
Several Web directories have already spent many hundreds of hours working out how to divide the "websites" category. As I mentioned a day or two ago, we may save a lot of wheel-invention time by looking at them. The one I know fairly intimately, and largely respect, is Zeal (for "United States" read "Universal"). (Hey, why doesn't that external link show the way it used to?) - Robin Patterson 01:11, 3 Jun 2004 (UTC)
Good point. I also like http://dir.yahoo.com/ and http://www.alexa.com/browse for structure. - DropDeadGorgias (talk) 18:58, Jun 4, 2004 (UTC)

Articles that have their own categories

If the subject of an article has its own category, should the article go both in that category, and in its parents? For instance, National Hockey League - should it be in both Category:NHL and Category:Ice hockey leagues? -- Jao 20:23, 2 Jun 2004 (UTC)

I think probably, yes, at least, in most cases. But perhaps it should be linked special to get put at the start or end of the alphabetized category list? Or maybe this should be an automatic feature? At the very least, the two articles should link to each other somehow. --ssd 02:44, 3 Jun 2004 (UTC)
I think no, in most cases—but yes in Jao's. Take a look at what I did with United States National Security Advisor and Category:United States National Security Advisors. I made the category a see also on the first page, and the page for the category links to the article. That gets the linkage, bidirectionally.
I think the difference pretty much comes down to whether the category is a plural or not. Advisors suggests that the category contains advisors. United States National Security Advisor is about the advisor, but it isn't one itself. In your case, the category doesn't say "league teams" or anything like that, it just says "NHL". So National Hockey League should go into Category:NHL. It wouldn't go into, e.g., Category:NHL teams. --TreyHarris 04:46, 3 Jun 2004 (UTC)
That sounds reasonable (except the plural part as a discriminator). I think the bidirectional linkage part is what is important. How it gets there is less so. I guess I was just thinking lazy, why link it in both places instead of just putting it in the category once? :) --ssd 05:17, 3 Jun 2004 (UTC)

Could "hidden categories" solve hierarchy problems?

There's a lot of argument over whether John Lennon (eg) should be in Category:The Beatles only (putting him in "British Musicians", "Musicians", "People" automatically) or put in all those explicitly (therefore showing up on all those lists). I have a possible solution.

Implement hidden categories: If I type [[Category:Musicians|]] in the John Lennon article, his article will show on Category:Musicians but the Musicians category will not show in the article on John Lennon. Just from looking at the categories which show up ("The Beatles" and maybe "British Musicians" too, and perhaps "Males" and "Christians" on the side, which brings me on to my next post) any sane human being understands John Lennon is also a Musician and a Person. However, if you go to Category:People, do you expect to find John Lennon on the list? Of course! r3m0t 23:29, 2 Jun 2004 (UTC)

John Lennon, who recorded much solo material from 1970-1980, clearly should be listed in both Category:British Musicians and Category:The Beatles (assuming the latter category is necessary - it seems pretty ridiculous to me). john k 06:40, 3 Jun 2004 (UTC)

See the talk page, where this has been discussed. Because all of the Beatles were British musicians, Category:The Beatles is a subcategory of Category:British musicians. Thus, John Lennon is included in British musicians indirectly. Hidden categories could end up being a nightmare for maintainability, and horribly prone to vandalism. There'd have to be a really, really good reason for including them. grendel|khan 19:26, 2004 Jun 3 (UTC)

Cities?

What should be the parent of Category:Cities? Geography? Politics? Chuq 23:41, 2 Jun 2004 (UTC)

Why not both? Fredrik (talk) 00:57, 3 Jun 2004 (UTC)
Well I wasn't thinking one or the other, I just didn't think either of them were particularly appropriate, and thought there could be better options. Civilisation for example? It's a tricky one! Chuq 01:03, 3 Jun 2004 (UTC)
I would prefer geography, but both is fine by me -- Chris 73 | Talk 01:18, 3 Jun 2004 (UTC)
I would suggest using Category:Geography for articles related to the academic study of geography. May I suggest placing Category:Cities, Category:Countries, Category:Rivers, etc. into Category:Gazetteer, which would be a subcategory of Category:Geography. (Articles about gazetteer books would be under Category:Gazetteers.) GUllman 22:37, 3 Jun 2004 (UTC)
Ahh, now that sounds like the best plan so far! Chuq 00:01, 4 Jun 2004 (UTC)
How about Category:Places ? dml 02:49, 4 Jun 2004 (UTC)
Category:Places or Category:Geographical features, perhaps — deltab 2004-06-06 04:06 UTC
I don't know about Category:Cities, but I put Category:Rivers, Category:Mountains, Category:Islands, etc. underneath Category:Landforms, which is a sub-category of Category:Geography. Makes sense to me, at least. --- hike395 12:04, 8 Jun 2004 (UTC)
Sounds like a great plan Hike395. After your comment I thought Political geography and Physical geography would be good categories? Chuq 13:39, 8 Jun 2004 (UTC)
Good ideas for subject titles -- just be careful to choose titles that will not mix nouns and proper names together. Category:Physical geography or Category:Landforms would contain articles such as Island, Volcano, Flood plain. Category:Political geography would contain City, State, Boundary. However, Category:Places would contain articles such as Greenland, Mount Vesuvius, North Pole, Greenwich. GUllman 00:27, 10 Jun 2004 (UTC)
I disagree --- Mount Vesuvius is a volcano, hence belongs in Category:Volcanoes, a sub-category of Category:Landforms, sub-category of Category:Geography. Excluding places from Category:Geography 1) is non-intuitive for users, and 2) contradicts current Wikipedia ontology, which has articles like Geography of California that describes many places (i.e., proper nouns). --- hike395 02:51, 10 Jun 2004 (UTC)
My examples were illustrative and weren't intended to define how many levels of categories are between the top level and the article. Mount Vesuvius is both a volcano of interest to geologists, and a tourist site of interest to geographers and people visiting Italy. Even though I was a Geography major, I must admit that places are not the sole property of Geography -- I proposed to create a Category:Places because the geographers, geologists, political scientists, historians, economists, transport and travel, communications, and many other fields all lay claim to the Wikipedia gazetteer. Some articles on places have information on their history, or politics, or geology, or transportation, but the rest has yet to be filled in. This way, Category:Places could be a subcategory of all these categories at once. GUllman 20:04, 10 Jun 2004 (UTC)

Lists v. categories

I think the categorisation of topics is a good idea, but I'm just wondering about the validity of lists now that the category option has been enabled in MediaWiki. Categories are in a sense lists, so wouldn't the lists created so far become obsolete eventually?

For example, there is a List of vegetables, but we also have Category:Vegetables. Are we saying there is room for both on Wikipedia? Surely once articles have been categorised, there would be no need for the lists?

Forgive me for posing these questions, but I'm trying to understand the rationale behind using both. --TonyW 10:47, Jun 3, 2004 (UTC)

Good question. I agree, but I'd suppose some people will always prefer to stick to lists so we shouldn't force our own preferences on people. The systems can coexist happily.--[[User:HamYoyo|HamYoyo (Talk)]] 10:51, Jun 3, 2004 (UTC)

Categories do have several advantages over lists, but the one advantage lists have over categories, indeed the feature that has driven the growth of Wikipedia to half a million articles so quickly, is the ability to easily type requested articles and existing articles on the same page. If an article in a list doesn't yet exist, it remains a red link until someone decides to click on it and create one. Using categories, you actually have to create a stub article for every item on the list, including a category link on each stub (or type the requested titles onto the category page so they show up in a separate grouping). It remains to be seen how this will change our ability to find gaps in coverage and create new articles. GUllman 18:02, 3 Jun 2004 (UTC)

Since categories do indeed have a page associated with them, and the category shows up red until that page is edited, that could be the place to put links for desired but non-existent articles. In my opinion, the bigger difference is that lists have, or can have, extra information beyond just the article title. For example, in List of people by name, each person often has dates and a one-line description. That's something which would be good to add to the category system. --Amillar 14:10, 4 Jun 2004 (UTC)
It just seems that there's going to be a lot of duplication. Only earlier I found a List of astronomers, and a category for the same. Admittedly, not all the astronomers in the list are in the relevant category yet until someone marks those articles with the category identifier. I was thinking if existing articles were copied over, then the list could be deleted, and new article entries could be marked under its corresponding category thereby adding it to that "list". -TonyW 19:25, Jun 3, 2004 (UTC)

I see three advantages of lists over categories:

  1. Categories can only list things for which there are articles; lists can include things for which there are no articles, and they can list several things which are discussed in one page. (For instance, a list of birds could list several species of a particular genus, which are all discussed in a single article about the genus). While creating stubs could solve the first problem, it can't solve the latter.
  2. Lists can include more information than a category can. For instance, a list of soccer world championship winners will list Brazil more than once. It will also list when the championship was won. That information is lost in categories.
  3. For now, categories are formatted badly. All entries are put in a single paragraph, giving a bunch of lines with several entries per line. It also often gives disambiguation information (because that's used in the title) which wouldn't be shown in a list. Longer lists will typically have a table of contents, making it much easier to find what you are looking for.

Abigail 16:20, 4 Jun 2004 (UTC)

It would seem, therefore, that there is a place for both, despite the obvious duplications. Perhaps as improvements are made to the software, we'll see better handling of the category system. -TonyW 18:50, Jun 4, 2004 (UTC)
Good points. Counterpoints (not necessarily disagreeing):
  1. I'm real tempted to copy lists of stuff into the category description for the stuff, and then delete what's already written. Makes a good todo list.
  2. Lists with descriptions are nice, but when the list gets too huge, the category is better, especially when it is more complete and huge, especially with the nice new category format. However, the list might still have a place as showing selected highlights of the category or something. It will be interesting to see what happens to categories when they get as big as some of the lists already split into articles by first letter.

Right now, lists are mostly more complete than categories. I wonder how long it will be before the roles are reversed and the lists are no longer maintained vigilantly enough to keep up with completeness of the categories. Or, perhaps that will never happen. --ssd 03:54, 8 Jun 2004 (UTC)

I don't think it's a matter of completeness. Lists carry, or have the potential to carry, more information than categories. A list of presidents of the USA will show that Reagan succeeded Carter, and was succeeded by Bush, and that Reagan served two terms, while the other two didn't. A category will list Reagan under 'R', and Carter under 'C' (if people pay attention, else Carter will be listed under 'J'). Furthermore, we also have lists like people born on May 20 and people died on December 25. We don't have corresponding categories (well, not yet). While categories may be able to replace some lists, I'd think it's a loss for Wikipedia if all lists are replaced by categories. So we'll have to maintain both. Abigail 13:58, 8 Jun 2004 (UTC)

We could add a feature to the categorization software so that unwritten articles can be categorized. Either by adding a category to the nonexistent article (which would still behave exactly as if it didn't exist, but the category would be in the markup when it was first edited), or by a tag on the category page itself, which would be automatically removed when the article is written (and the category automatically added to the article). This would take care of one advantage of lists over categories. I definitely think this should be implemented. Providing links to similar articles that need to be written is a very good thing.

Just some other ideas: The other advantage is formatting of the list, and extra information. Lists could be built from the information that is already in categories, so maybe when something was added to the category, it would show up in the list in an "unsorted" section or whatever.

Category pages could also have a few formatting options, so that you could define whether to list things in subcategories (such as a category of animals and then subcategories of phylum) or in a big list (such as a category of people), could define whether names should be listed in order of last name or first, etc.

The less tedious manual work, the better. - Omegatron 21:58, Jun 9, 2004 (UTC)

When inclusion would be POV

Consider Category:Terrorists. Some people would obviously belong, like Osama bin Laden. Some people would not belong, and it would be POV to include them, like George W. Bush. But lots and lots of people are borderline. Do you include Yassir Arafat or not? Either including him, or excluding him, would be POV. He's certainly notable as someone whom lots of people consider a terrorist, but including him seems to be slander and unencyclopedic. And it's not just him: consider Khalid bin Mahfouz, José Padilla, and Waleed Alshehri.

So in situations like this, where including an article to a category would be POV, and excluding the article would also be POV, what should we do? Quadell (talk) 18:56, Jun 3, 2004 (UTC)

The problem is that categorization is an unambiguous bright line in an ambiguous world. I think in cases like this, the criteria should be named on the category page, and discussion of what the criteria should be should happen on the category talk page. People would be completely justified reverting categorizations that don't meet the current criteria.
Editors of some categories may decide that an implicit "articles in any way relating to" should be assumed prior to the category name. Editors of other categories may decide that "People thought by some to be" or "people who indisputably are" is a better criterion. But I think the point is that since NPOV isn't possible here, the rule should be like journalistic bias: when you can't eliminate, disclaim. --TreyHarris 19:07, 3 Jun 2004 (UTC)
What should we do? For NPOV, we should ideally rename the category "people that have been considered terrorists by X", or not have a category for terrorists at all. Fredrik (talk) 19:14, 3 Jun 2004 (UTC)
Do we have a list of terrorists? (It's currently a redirect to terrorism.) The idea of categorization really isn't that different. The same standards should apply, with an explanation on the category:terrorists page. grendel|khan 19:19, 2004 Jun 3 (UTC)
When you decide "well, we can't do it NPOV, so we shouldn't attempt to do it at all," NPOV is becoming ideology rather than guiding principle. --TreyHarris 19:19, 3 Jun 2004 (UTC)
The advantage of a list of terrorists is that for everyone listed there, you can give a justification why the person is on the list (for instance "head of an organization considered a terrorist group by the EU"). With categories, one would have to link to each article to see the justification. So, I'd say that there is a difference between a list and a category. (Note that I don't have an opinion yet whether such a category and/or list should exist on Wikipedia). Abigail 16:27, 4 Jun 2004 (UTC)
Yes, I consider this yet another example of why categories are a bad direction in which Wikipedia can move. Formerly, we could say about Yassir Arafat that, for example, he is considered a terrorist by many (for his and the PLO's actions), but is also the elected leader of the Palestinians and a Nobel Peace Prize awardee for his contribution to the Israeli-Palestinian peace process. This is an English sentence that explains the relationship of that person to several "categories"; the sentence is NPOV exactly because of these "imprecise" English explanations, rather than the new category links, which would basically mean that "Arafat is a terrorist, an elected leader and a Nobel Peace Prize winner" - which is rather strange unexplained. Categories for "suspected terrorists", "considered terrorists", "alleged terrorists", etc., are not likely to help, because we'll start seeing there every notable person who is considered "bad" by someone, regardless of whether this "badness" is actually a form of terrorism. Nyh 09:11, 6 Jun 2004 (UTC)
Wow, someone else that thinks categories is a bad idea! The strength of Wikipedia lies in the text and the links that can be made within that text. Categories is a can of worms that will waste a lot of time resulting in what? All kinds of lists with both good, bad, and questionable entries. Whoopie! Bad and questionable entries just waste everyone's time. Create text with links and then edit those links; You can find everything relevant that way with context; lists lack context and so do categories. Should have been obvious before it was implemented. - Marshman 04:23, 8 Jun 2004 (UTC)
Hand-crafted "List of" articles have not been outlawed by the introduction of categories, and there's nothing requiring you to click on any category links. So why not just ignore them and continue on as before? Bryan 04:39, 8 Jun 2004 (UTC)
If a person is not a suspected terrorist and yet has been given category:suspected terrorist, then surely that would be corrected. If someone who hated Alanis Morissette's music were to put "Alanis Morissette is suspected of being a terrorist" in her article it would be removed, why not in the case of categories? For the more subtle shadings of meaning, one can try to explain those in plain NPOV English on the category page itself. In the case of Arafat, one would simply have to read his article to find out how he came to be grouped under such seemingly incompatible categories; the article is still the primary repository of information after all. Bryan 09:37, 6 Jun 2004 (UTC)

Example of confusion using the 9/11 commission

Two issues. First, there's Category:9/11, a member of Category:Terrorist incidents. There are lots of people associated with 9/11 in one way or the other. I was thinking of making a subcategory Category:9/11 Commission members, which would include the members of that commission, but then I thought, the Commission Director (Philip D. Zelikow) isn't technically a member, and it would be a shame to exclude him. I could make it simply Category:9/11 Commission, or Category:People associated with the 9/11 Commission, but then one might want to include the people they interviewed. Which one could. So what's the best way to organize this?

Second, let's say there comes to be a Category:People interviewed by the 9/11 Commission. (There are, after all, dozens, and this is information people might want to browse through.) Besides the fact that this would be an awfully long name, it would have to include George W. Bush. Which would indicate to me that the man would have a lot of categories. Off the top of my head, I can think of Category:Presidents, Category:People interviewed by the 9/11 Commission, Category: People associated with the 2003 invasion of Iraq, Category:Oil Executives, Category:Republicans, Category:Neoconservatives (although that's pushing it), Category: Baseball team owners, and soon, possibly, Category:Defendants in the Valerie Plame case. My point is, particularly famous and diverse people are going to be in a dozen, maybe dozens of, categories. Is this according to plan? Quadell (talk) 19:14, Jun 3, 2004 (UTC)

I see no problem with Dubya being in multiple categories- as he's been famous in several roles, I see no problem with that... He is the President of the US, Leader of the Armed forces and all that. - DropDeadGorgias (talk) 19:36, Jun 3, 2004 (UTC)
In practice most things and people can belong to several orthogonal categories. If the categories are not independent then there is a move to fit them into a category hierarchy somewhere. I guess if it gets too far out of hand you might want to give priority to some categories and hide others to avoid an over-long category line. As far as Dubya is concerned, don't forget to add Category:Functional illiterates -- Solipsist 20:47, 3 Jun 2004 (UTC)

Persons v people

Now, what is the plural of person? Is it (1) people or (2) persons? The answer is persons; people is a mass noun, and although in common usage as a plural of person, that usage is incorrect. Dunc_Harris| 20:08, 3 Jun 2004 (UTC)

People is what you have when there are several of them.

If you are referring to a person as a group (??), but you have more than one group, that could be a persons I suppose, but I can't think how exactly that would be meaningful. If you have several different groups of people, that would be peoples, as in peoples of the world and I have seen that used. --ssd 12:23, 4 Jun 2004 (UTC)

Biographies ?

Currently, there is the list of people by name. With all articles linked from those pages, one gets many (but not all) biographies, and a series of other articles due to linked dates, descriptions etc. Using Category:People and subcategories, one gets many of them as well, but also 20 Forthlin Road and similar.

I'd consider this a problem with the categorisation of 20 Forthlin Road, not the need for another category. Maybe it should go under Category:The Beatles? Remembering that The Beatles should not be a sub-category of British musicians, it should be a subcategory of British musical groups. The Beatles members should be a sub-category of both The Beatles and British musicians. Chuq 04:42, 4 Jun 2004 (UTC)

If a category and a sort key, e.g. [[:Category:Biographies|Lennon, John]] were added to every article, one could easily list those without additional articles. The layout of Category:Biographies would initially be a bit messy, but once the layout is improved, it may be just as easy to read as the list of people by name above. -- User:Docu

I think once the "expansion" of sub-categories (ie. displaying of all articles under all sub-categories) happens (ie. is programmed, is configured, is enabled) it will be fine. Adding everyone to People, or Biographies, is a really messy way of doing it, as it will all need to be undone later. Chuq

Inappropriate Categories?

I tend to be of the opinion that categories for subjects that have or could have succession tables are a bad idea. What value does Category:British Prime Ministers have? It gives you an alphabetical list of British prime ministers. How is this useful, when there's already a page that chronologically lists British prime ministers, a link to that page from the articles of every British prime minister, and succession tables to allow you to easily jump to that person's predecessor(s) and successor(s). So why is this useful? Especially since, by this logic, James Callaghan will end up in Category:British Prime Ministers, Category:British Foreign Secretaries, Category:Chancellors of the Exchequer, Category:British Home Secretaries, and Category:British Labour Party leaders, but not in Category:British politicians. Wouldn't it make more sense just to put all British politicians in Category:British politicians, and leave the individual offices to the older methods which worked perfectly fine? That is, I think categories are inappropriate for things of which there is only one at a time. I'd be happy to see Lord Callaghan in a Category:Prime Ministers, for instance, since that would presumably list politicians who were prime ministers in various different countries (although we might want to limit such a category to current prime ministers). john k 20:38, 3 Jun 2004 (UTC)

edited for clarity by Gentgeen 20:51, 3 Jun 2004 (UTC)
If one is trying to use Category:British Prime Ministers to find a list of Prime Ministers, it's probably quite useless ( #User_browsing ). It may help sort "British politicians" (if this is needed) or in terms of #Category_extraction. -- User:Docu
You say "I'd be happy to see Lord Callaghan in a Category:Prime Ministers, for instance, since that would presumably list politicians who were prime ministers in various different countries" - well why not try having Category:British Prime Ministers as a sub-category of Category:Prime Ministers?
Because it's pointless. Of what use is this category? john k 02:38, 4 Jun 2004 (UTC)

What about Category:Heads of state - it would be a pain to add each British PM to both categories, when you could add them to Category:British Prime Ministers and then add that category to any others it should be a part of? Chuq 23:51, 3 Jun 2004 (UTC)

The British PM would not need to be added to Category:Heads of State, because he (or she) is not a head of state. john k 02:38, 4 Jun 2004 (UTC)

Category:Heads of government then :P Anyway, in some countries, the Prime Minister is both - emphasising the fact that having a group for each country is a good idea. British PMs -> HoG, (other country) PMs -> HoG and HoS. Chuq 04:28, 4 Jun 2004 (UTC)
I agree it is silly to create categories that are too finite to be of use. Where a grouping is finite (ie. you can create a list, such as UK Prime Ministers, American Presidents, etc) then that list should reside on a list page and the members should be put in a category(ies) appropriate to the overall category that they are in. Lists to find, categories to browse. --VampWillow 14:46, 4 Jun 2004 (UTC)

There is not a single country in the world where the Prime Minister is, by virtue of that office, Head of State (there are various countries where the Head of State also holds the office of Prime Minister, such as Oman, Saudi Arabia, and so forth, and many countries, such as the United States, where the Head of State is also Head of Government). john k 05:28, 4 Jun 2004 (UTC)


Really? -- Joe in Canada

More sophisticated relations

OK, we have Category:People. This includes Category:Authors, which includes various other categories. One of these is Category:Stephen King. Now, in this case, that category includes lots of books directly.

This means that if you took a naïve algorithm and dumped out all the articles linked to under Category:People, you'd get a list of Stephen King's books. This is clearly not very useful.

Now, the books needn't be under Category:Stephen King directly - they could be under Category:Stephen King books. But that would want to be, along with Category:Stephen King movies, a member of the Category:Stephen King, and it is hard to see where a Category:Stephen King would go if it isn't Category:Authors or a subcategory of that.

So I think we need a way of distinguishing between a category where (a) you are asserting that everything in the category is an example of the thing it is in (ie list categories), and (b) categories where you are just providing hierarchical links for convenience.

So, basically I'm saying, the relationship between Category:Stephen King and Category:People should be different to that between Category:Authors and Category:People. I want all members of the latter to appear in the mega-list of people, I don't want all the members of the former.

Does this make sense to anyone? Morwen 18:13, 4 Jun 2004 (UTC)
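
The distinction proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category software has no such feature, the relation labels "is_a" and "related" are invented here, and the toy data is hypothetical. Each subcategory link carries a label, and a listing follows only the labels it is told to follow.

```python
from collections import deque

# Hypothetical toy data. Each category maps to (child, relation) pairs.
# "is_a": every member of the child is an example of the parent.
# "related": a convenience link only. Both labels are assumptions.
SUBCATEGORIES = {
    "People": [("Authors", "is_a")],
    "Authors": [("Stephen King", "related")],
    "Stephen King": [("Stephen King books", "related")],
}
ARTICLES = {
    "Authors": ["Stephen King", "Agatha Christie"],
    "Stephen King books": ["Carrie", "The Shining"],
}


def members(root, follow=("is_a",)):
    """List articles under root, descending only along the given relations."""
    seen = {root}
    queue = deque([root])
    found = set()
    while queue:
        cat = queue.popleft()
        found.update(ARTICLES.get(cat, []))
        for child, relation in SUBCATEGORIES.get(cat, []):
            if relation in follow and child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(found)


strict = members("People")                              # people only
naive = members("People", follow=("is_a", "related"))   # books leak in
```

With this toy data the strict listing of People contains only the two authors, while the naive listing that follows every link also picks up Carrie and The Shining.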

Yeah, I see the problem. Either we can assume that members of a sub-category are also members of its super-category -- in which case Category:Stephen King books can't be a member of Category:Stephen King, and they'd have to be unrelated -- or we can't make that assumption -- in which case Category:Stephen King would have to be a member of Category:People, since the fact that he's an author doesn't guarantee that he's a person. Similar problems are everywhere. What's to do?
One possible solution would be to have a "related categories" section, separate from a "sub categories" section. Category:Stephen King books would be related to Category:Stephen King That seems like the best solution to me. Quadell (talk) 18:33, Jun 4, 2004 (UTC)
That would work. Additionally it might be good to have a way of linking from say, Stephen King to Category:Stephen King without Stephen King actually appearing in the category Category:Stephen King (because he is already in Category:Authors; and from his categories introduction text). Morwen 20:22, 4 Jun 2004 (UTC)
Check out ScudLee's diagram further up this page at Wikipedia_talk:Categorization#People_by_name. What it shows, is that Category:Stephen King should not be in Category:Authors, as the only author in all the Stephen King related articles is Stephen King himself. I think this is a similar point to what John k said below. Chuq 23:43, 4 Jun 2004 (UTC)
Yes, that was my point. BTW, it's Category:Writers, not Category:Authors. john k 00:31, 5 Jun 2004 (UTC)

Why should Category:Stephen King be under Category:People at all? Shouldn't it be under Category:Categories about People, or some such? john k 20:36, 4 Jun 2004 (UTC)

Part of this is a naming issue. Only 4 things fall into the category "The Beatles", while many things fall into the category "Things related to the Beatles". Also, sometimes the category seems to relate to the subject the article is about; other times to the article itself, which may be a distinction worth acknowledging in some way. -- Nunh-huh

We have to think from the encyclopedia user's point of view. He/she is starting at the top level of the hierarchy with a subject in mind, and they need to know which blind path to go down to find an article on that subject. It might help to think of the problem as a game of twenty questions. The first question we may ask is, "Is your subject a Category:Persons, Category:Places, or Category:Things?" If they choose Category:Persons, then ALL the articles from then on should be about persons. Why? Because we may someday be able to click a link to collapse the hierarchy, and display all the articles below that level in one alphabetical order. If they wanted to know about Stephen King's books, they might choose Category:Things, and have a choice of Category:Animals, Category:Vegetables, Category:Minerals, Category:Ideas, etc., and go down one of those paths. My point is, Categories link only as a hierarchy; Wikipedia articles link as a network to every related article. So as long as the user reaches the article on Stephen King (the person), or the articles on Stephen King's books using the categories, the articles themselves link to each other. GUllman 16:31, 5 Jun 2004 (UTC)

This makes a lot of sense. We need to have some sort of idea of what categories are for before we go and start putting everything into categories. 17:42, 5 Jun 2004 (UTC)

On the other hand, I think there's a good case to be made for a more bottom-up approach; let's take a look at how things are being categorized, and try to find the patterns in that. It's more the Wikipedia way, too. For example, I've noticed that there are a lot of categories that are non-plural, such as Category:Medicine, Category:Biology and Category:Law. In those cases, rather than being categories containing only one article (Medicine, Biology and Law, respectively) they are instead full of articles and subcategories that are about the indicated topic. It's like there's an implied "Topics relating to -" in front of the categories with singular titles. I think that'd be a good approach to consider as a standard, personally, since it seems to be natural and it would also cut down on the wordiness of many category titles (there'd be a heck of a lot of categories starting "Topics relating to" otherwise). Bryan 05:25, 6 Jun 2004 (UTC)

The following is an archive of a section that used to be on the main page. It has since been re-factored, but this reads like a discussion, so it's been kept here.

Hierarchicalization

Categories should be kept as hierarchical as possible. That is, the number of top-level categories should be kept at a minimum.

Most importantly, an article should only link to the most specific categories it is in. Thus, Paul McCartney should link not to People, nor to People of Britain, nor to Musicians, but to British musicians, or rather The Beatles. The reason for this is that The Beatles should be hierarchicalized as follows (arrows denote parentage; linked items are articles; the rest are categories):

Paul McCartney
      |                            -----> Musicians -----
      V                           /                      \
The Beatles -> British musicians -                        ---> People
                                  \                      /
                                   -> People of Britain -
It's wrong. Imagine, Paul McCartney leaves The Beatles, and Britain. Will he belong to People? Kenny
I think it's ok, in this case. He'll still belong to "The Beatles" and British musicians. He is also, separately, directly, in the category "People living in Britain", which he would no longer be in... and he should directly be in "Solo musicians" as well. But in general, it is true, deciding whether an adjective should apply to a whole category (British musicians) or be its own category (People of Britain) or both, is difficult. +sj+ 19:51, 2 Jun 2004 (UTC)

Thus, it's known that British musicians are not only musicians and British people, but also People. Which is obvious. Thus, the number of actual category links on the article page can be kept to a minimum.
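
As a rough illustration of how the implied categories follow from the diagram above, here is a minimal Python sketch. It is not how the category software works; the PARENTS table is simply the diagram typed in by hand, and implied_categories is a hypothetical helper name.

```python
# The diagram above, written as child -> list of parent categories.
PARENTS = {
    "Paul McCartney": ["The Beatles"],
    "The Beatles": ["British musicians"],
    "British musicians": ["Musicians", "People of Britain"],
    "Musicians": ["People"],
    "People of Britain": ["People"],
}


def implied_categories(page):
    """Return every category reachable from page by repeated parent lookups."""
    seen = set()
    stack = list(PARENTS.get(page, []))
    while stack:
        category = stack.pop()
        if category not in seen:
            seen.add(category)
            stack.extend(PARENTS.get(category, []))
    return seen


paul = implied_categories("Paul McCartney")
```

The article carries a single explicit link (The Beatles), yet the lookup recovers all five categories in the diagram, with People reached once even though two paths lead to it.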

Another example:

Harry Potter   Albus Dumbledore
   |                  |      |
   |                  |      |
   |   .--------------'      |
   |   |                     V
   |   |         Hogwarts teachers ---> Harry Potter characters ---> Harry Potter
   V   V                                ^   |                            |
Gryffindors-----------------------------'   |                            V
                                            V                   Fictional universes
                                  Fictional characters                   |
                                            |                            V
                                            '--------------------->  Fiction
                                              

Note that an article may have more than one parent. Thus, the structure is not a strict top-down hierarchy. (And thus absolutely not equivalent to the old subpage model.)

Disagree with Most importantly, an article should only link to the most specific category it is in. Most people will place items in their simple (food, say) category as their first choice and articles routinely belong in multiple categories. An article should be in as many categories as it belongs in. If you want a table of contents, create a specific table of contents system, which you can also do with category tags. Category tags themselves are not solely for making a table of contents. The description above is perhaps a fair start for a description of TOCCategories. It's not a good start of a description of how to use category tags. See Category extraction below, which this proposed hierarchy directly contradicts. Jamesday 01:25, 1 Jun 2004 (UTC)
Putting an article in "as many categories as it belongs in" would create tremendous and largely pointless clutter. It's much more useful to have Category:Fictional characters as a hierarchy than as one big, flat list, which is exponentially more likely to turn into kibble. Further, there's absolutely no contradiction with #Category extraction: categories can be extracted recursively. For instance, in the above example, extracting Fictional characters would extract Harry Potter characters, and thus Hogwarts teachers, and thus Albus Dumbledore. This allows for a very fine degree of control. There's frankly no reason to add data that can be autogenerated by the software.
True, tools don't exist to do this right now, but the process isn't that difficult. Take the following pseudocode (given that the add operation on sets does nothing when we add an item that's already in there, like doing a $hash{$item} = 1 in Perl), which simply performs a breadth-first search of the category graph starting at the given element, then adds all articles contained in that subset of the category space:
Queue.enqueue(category_to_extract) 
while (!Queue.empty)
  this_category = Queue.dequeue();
  CategorySet.add(this_category)
  foreach (subcategory in this_category)
    if (!CategorySet.in(subcategory) and !Queue.in(subcategory))
      Queue.enqueue(subcategory) # if we haven't seen it before
foreach (category in CategorySet)
  foreach (article in category)
    ArticleSet.add(article)
foreach (article in ArticleSet)
   Extract(article)
Once the new-format database dumps come out, I'll see how implementing this works out.
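For anyone who wants to try the idea before such tools exist, here is a minimal Python sketch of the extraction pseudocode above. It assumes the category graph is already loaded into two plain dictionaries; the names `subcategories` and `articles_in` are hypothetical and do not correspond to anything in MediaWiki. The toy data is copied from the Harry Potter diagram.

```python
from collections import deque

def extract_articles(root, subcategories, articles_in):
    """Collect every article in `root` or in any category reachable below it."""
    seen = {root}              # categories already queued or visited
    queue = deque([root])
    articles = set()
    while queue:
        category = queue.popleft()
        articles.update(articles_in.get(category, ()))
        for sub in subcategories.get(category, ()):
            if sub not in seen:        # enqueue each category only once
                seen.add(sub)
                queue.append(sub)
    return articles

# Hypothetical toy graph, taken from the diagram above.
subcategories = {
    "Fictional characters": ["Harry Potter characters"],
    "Harry Potter characters": ["Hogwarts teachers", "Gryffindors"],
}
articles_in = {
    "Hogwarts teachers": ["Albus Dumbledore"],
    "Gryffindors": ["Harry Potter", "Albus Dumbledore"],
}

print(sorted(extract_articles("Fictional characters", subcategories, articles_in)))
```

Because each category is marked as seen when it is queued, the search also terminates if the category graph happens to contain a loop.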
Furthermore, by simply doing repeated parent-lookups, we perform a similar search in the opposite direction---up toward root categories, not down toward articles---and extract something like the following:
Fictional universes
     |
     V
   Harry Potter--\
                 |           /---->Hogwarts employees--\
                 V           |                         |
   Harry Potter characters---+                         +-->Albus Dumbledore
     ^                       |                         |
     |                       \---->Gryffindors---------/
Fictional characters
Which could even be written thus, by graph distance from the article:
  • Albus Dumbledore
    • Hogwarts employees, Gryffindors
      • Harry Potter characters
        • Harry Potter, Fictional characters
          • Fictional universes
Pseudocode follows (to produce the above list, not the generalized dag).
foreach (category of my_article)
  Queue.enqueue( [category,1] )
while (!Queue.empty)
  [this_category,depth] = Queue.dequeue();
  CategorySet.add( [this_category,depth] )
  if (this_category has no parents)
    this_category.setFlag(is_a_root)
  foreach (super_category containing this_category)
    if (!CategorySet.in(super_category) and !Queue.in(super_category))
      Queue.enqueue( [super_category,depth+1] )
      maxdepth = max(depth+1,maxdepth)
    else 
      [super_category,old_depth] = CategorySet.retrieve(super_category)
      if (old_depth > depth+1)
        CategorySet.add( [super_category,depth+1] )
foreach ([category,depth] in CategorySet)
  outputSet[depth].add(category)
foreach (depth in 1..maxdepth)
  output depth;
  foreach (category in outputSet[depth])
    output category;
The idea is to trace it back to one or more root categories. Root categories should be made with this in mind. There's no real reason, for instance, to make Mathematics and natural sciences simply to contain Natural sciences and Mathematics, when starting the extraction with the set { Mathematics, Natural sciences } would work exactly the same.
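The upward search can be sketched in Python in the same way. This is only an illustration under the assumption that parent links are available as a dictionary (the name `parents` is hypothetical). Since a breadth-first search reaches each category first by its shortest path, the sketch drops the depth fix-up branch of the pseudocode.

```python
from collections import deque

def ancestors_by_depth(article_categories, parents):
    """Group the categories above an article by graph distance from it."""
    depth_of = {}
    queue = deque((category, 1) for category in article_categories)
    while queue:
        category, depth = queue.popleft()
        if category in depth_of:       # already reached by a path at least as short
            continue
        depth_of[category] = depth
        for parent in parents.get(category, ()):
            queue.append((parent, depth + 1))
    levels = {}
    for category, depth in depth_of.items():
        levels.setdefault(depth, []).append(category)
    # A root is any reached category with no parents of its own.
    roots = sorted(c for c in depth_of if not parents.get(c))
    return levels, roots

# Hypothetical toy graph, taken from the Albus Dumbledore diagram above.
parents = {
    "Hogwarts employees": ["Harry Potter characters"],
    "Gryffindors": ["Harry Potter characters"],
    "Harry Potter characters": ["Harry Potter", "Fictional characters"],
    "Harry Potter": ["Fictional universes"],
}

levels, roots = ancestors_by_depth(["Hogwarts employees", "Gryffindors"], parents)
for depth in sorted(levels):
    print(depth, ", ".join(levels[depth]))
print("roots:", ", ".join(roots))
```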
Hope this helped explain some things. grendel|khan 14:42, 2004 Jun 1 (UTC)
Paul McCartney can be linked directly to British musicians since he did work independently of The Beatles. (He's also got other categories he can be in, but that's irrelevant to the discussion.) -- Cyrius| 03:06, 1 Jun 2004 (UTC)
The point is that the individual members of The Beatles are already qualified as British musicians through the Beatles category. -Sean Curtin 12:22, 1 Jun 2004 (UTC)
I don't think it is important that an article links only to the one most specific category. Frequently, there are several specific categories. It might even be ok if it links to a category and its parent category. But if the parent category ends up with a couple hundred articles, those also linked deeper should be weeded out. Unless, of course, someone wants to implement an automatic category TOC that splits by alphabet--first letter? first two letters? --ssd 03:36, 1 Jun 2004 (UTC)
I hope the categories will eventually work similarly to Special:Allpages, allowing categories to be used for the larger of the current "Lists of articles by category". -- User:Docu


Creation order of hierarchies

In the enthusiasm to populate Wikipedia with categories, the character of the project is changing in an unanticipated direction. Pages are being (randomly) categorised by proponents of various disciplines. There are three problems with this:

  • Firstly, over-categorisation. I have seen the following anomalies in the last 24 hours:
    • Shepton Mallet, a small town in England, is categorised as British archaeology and nothing else
    • walking is categorised as complementary and alternative medicine
    • a lawyer who played chess is categorised as a chess player (subsequently as both)
  • Secondly, clutter. Even if each of these categories is relevant (which can be doubted), the original page starts to clutter up rapidly. Logically, there is almost no limit to the extent to which an imaginative editor can apply categories to any page.
  • Thirdly, the neutral point of view is compromised. An example of this is domestic violence. One of the debates in this area is the extent to which claims of domestic violence are false or not. However, the category "abuse" has been blithely attached - thus begging one of the crucial issues.

I suggest that there should be a Guideline for categorisation by which editors (1) exercise caution and err on the side of not ascribing a category unless the text of the page justifies it, and (2) limit the size of the categorisation link text so that it remains small in relation to the size of the page. It seems to me that without these minimal requirements we will end up with pages exhibiting content confusion. Pages will go through a long gestation period looking rather weird - perhaps even looking like they derive from a banner-supported commercial website. Is there any solution to the problem of trying to create hierarchies in a more logical order? -- JPF 17:58, 2 Jun 2004 (UTC)

  • I don't know where to put this comment, so move it someplace better if you think you should. A categorization scheme that has been of tremendous use is the system in place at the National Library of Medicine. In this schema every article gets categorized, and put into every category that is appropriate. There are also two classes of categories, major and minor. The system is a major reason why medical science has advanced quickly in the last 25 years - people can find stuff. Basing the category structure on the NLM model would be useful. There is a big caveat: the category structure in NLM has a defined vocabulary of valid categories. Setting up such a structure would be a project in itself. It would have an interesting subproject: using software that identifies like articles could automate the generation of categories for articles. What is needed to address a lot of the concerns about categories is to put the categories in a new text variable in the SQL database, where only valid categories are allowed to exist. Specialized librarian users could update that part of the database. Kd4ttc 20:24, 2 Jun 2004 (UTC)
Domestic violence is by definition abuse. It's a different question whether any given incident could be called domestic violence or abuse. But you make a good point about maintaining NPOV. Some categorizations might need to be labelled as disputed. (Like whether or not something is part of Science...who knows.) -- Beland 07:36, 13 Jun 2004 (UTC)

Direct and indirect inclusion

By example. (please spell check)

Indirect

We have a category Spaghetti. Spaghetti is a kind of Pasta (it belongs to that category). Pasta is a kind of Food. Is Spaghetti a kind of Food? - Yes! Does Spaghetti belong directly to the category Food? - No. If it belonged to that category, other kinds of pasta and soups would belong too, making the Food category overloaded. Thus Spaghetti is a kind of Food only indirectly.

Direct

Spaghetti is a kind of Italian national pasta. Italian national pasta is a kind of Pasta. Would Spaghetti belong to Pasta directly? Yes. Imagine if somebody else invented Spaghetti, not the Italians. Would it be a kind of Pasta? Sure, yes. Thus Spaghetti belongs directly to Italian national pasta and Pasta independently.
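The distinction can be illustrated with a small Python sketch. It assumes each item simply lists the categories it is tagged with; the dictionary `parents` and the function `membership` are hypothetical names made up for this example.

```python
def membership(item, category, parents):
    """Return 'direct', 'indirect', or None for `item` relative to `category`."""
    if category in parents.get(item, ()):
        return "direct"
    # Otherwise walk upward through the parents of the parents.
    seen, stack = set(), list(parents.get(item, ()))
    while stack:
        current = stack.pop()
        if current == category:
            return "indirect"
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return None

# Hypothetical tags, following the Spaghetti example above.
parents = {
    "Spaghetti": ["Italian national pasta", "Pasta"],
    "Italian national pasta": ["Pasta"],
    "Pasta": ["Food"],
}

print(membership("Spaghetti", "Pasta", parents))
print(membership("Spaghetti", "Food", parents))
```

Here Spaghetti carries two tags of its own, while its link to Food exists only through Pasta.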

Are you Italian by any chance? You're addressing a very good point; the whole category engine has to be changed to be more versatile so that items in subcategories appear in the parent category as well. Also, you should be able to combine category queries so that you can view, say, all articles that come under both "Italian" and "food" and "traditional" rather than having infinite subcategories such as "Traditional Italian Food", "Modern Italian Food", etc.
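A combined query of that kind could be sketched in Python as a set intersection over recursively collected articles. This is only an illustration of the idea, not a description of how the category engine works; the dictionaries and function names are hypothetical.

```python
def articles_under(category, subcategories, articles_in):
    """All articles in `category` or in any of its subcategories, recursively."""
    seen, stack, found = set(), [category], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found.update(articles_in.get(current, ()))
        stack.extend(subcategories.get(current, ()))
    return found

def in_all(categories, subcategories, articles_in):
    """Articles that fall under every one of the given categories."""
    sets = [articles_under(c, subcategories, articles_in) for c in categories]
    return set.intersection(*sets) if sets else set()

# Hypothetical toy data.
subcategories = {"Food": ["Pasta"]}
articles_in = {
    "Food": ["Bread"],
    "Pasta": ["Spaghetti", "Lasagne"],
    "Italian": ["Spaghetti", "Lasagne", "Venice"],
    "Traditional": ["Spaghetti"],
}

print(sorted(in_all(["Italian", "Food"], subcategories, articles_in)))
print(sorted(in_all(["Italian", "Food", "Traditional"], subcategories, articles_in)))
```

With queries like this, "Italian", "Food" and "Traditional" can stay separate categories, and no "Traditional Italian Food" category is needed.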
"Italian national food" should not be its own category. Categories cannot do everything, they are only an aid, and given their limitations they are most effective when not overly precise. "Italian" and "Food" can be separate categories. +sj+