Wikipedia talk:Categorization

WikiProject Manual of Style
This page falls within the scope of WikiProject Manual of Style, a drive to identify and address contradictions and redundancies, improve language, and coordinate the pages that form the MoS guidelines.

WikiProject Categories
This page is within the scope of WikiProject Categories, a collaborative effort to improve the coverage of categories on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

WP:SUBCAT - belonging also to parent

The following statement has been in the guideline for a long time:

"When making one category a subcategory of another, ensure that the members of the subcategory really can be expected (with possibly a few exceptions[clarification needed]) to belong to the parent also."

This statement is completely contrary to how the vast majority of our categorisation system works. For example, look at the subcategories and article content in Category:Perth, Western Australia. Category:Perth, Western Australia-related lists, Category:Swan Coastal Plain, Category:Crime in Perth, Western Australia: none of these are Australian capital cities, or Cities in Western Australia or Coastal cities in Australia (the parents of the given category). Effectively, this guideline suggests that even set categories on a topic (like "People from foo") should not be in the topic category ("foo") as they will almost never share the features of the parent category.

This statement should be removed from the guideline because it is completely unrepresentative of our categorisation system. We overlap the various categories when they have a parent-child semantic relationship – by design you can get to the "People in CityX, CountryY" category by going through the "Cities in CountryY" category. That's what we've all come to expect and the implementation of the above (highly restrictive) guideline's categorisation method would profoundly change the structure of today's Wikipedia. SFB 02:21, 18 January 2015 (UTC)

The "Subcategorization" section starts off by saying "If logical membership of one category implies logical membership of a second, then the first category should be made a subcategory (directly or indirectly) of the second." And further on: "If two categories are closely related but are not in a subset relation, then links between them can be included in the text of the category pages." If this isn't how things are done, or not exclusively how things are done, then these should also be removed or modified. Perhaps we should insert a sentence along the lines of "When there is a parent-child semantic relationship between categories, the child should be a subcategory of the parent, regardless of whether the members of the subcategory can be expected to belong to the parent or not." - Evad37 [talk] 04:34, 18 January 2015 (UTC)
Isn't this the same as saying 'we have a parent-child category system'? Hmains (talk) 04:58, 18 January 2015 (UTC)
Perhaps then: "We have a parent-child category system, regardless of whether the members of the child category can be expected to belong to the parent category." Basically, make it clear that subcategories are not subsets. - Evad37 [talk] 06:23, 18 January 2015 (UTC)
Don't throw out the baby with the bathwater. That "the members of the subcategory really can be expected ... to belong to the parent" is fundamental to (wp) categorization. Without it categorization could turn into a complete mess - in particular category intersection would never work. If Foo is a city then Category:Foo can contain any article that is within the subject of that city (e.g. an article about a lake in that city). The Foo article should be categorized as a populated place, the Foo category should not - that way the lake article can be placed in Category:Foo without the lake being categorized as a populated place. That this (like most things in wp) isn't currently perfectly implemented isn't a good reason to throw it away. DexDor (talk) 08:31, 18 January 2015 (UTC)
@DexDor: Regarding the above example, that suggests that (a) "Lakes in Foo" should be removed from "Foo", or (b) "Lakes in Foo" should be included, but "Foo" should have no parents at all. The Perth example demonstrates that all the parents of that category do not apply to any of its subcategories. That category is also very typical of how Wikipedia's category system is built. If this guideline was "perfectly implemented" what would that category look like to you? What would the parents be?
Category intersection will not work via the current system, but that doesn't mean the arrangement is wrong – people want a semantic web approach of navigation. It's how humans read things. This just means that the current category system is technically unsuitable for building the category intersection feature. For that, we need a new system that allows us to set attributes for category relationships. For example, whether the relationship is "tree inclusive" (most set categories) or "tree exclusive" (most topic categories). For example, "People from Perth, Western Australia" would have a tree exclusive relationship with "Perth, Western Australia", but a tree inclusive relationship with "People from Western Australia". Note that under this model, the current parent "People by state or territory in Australia" would also need to be marked as a navigational "by"-type category, and not a functional topic or set category. Needless to say, it becomes obvious very quickly that the requirements for category intersection simply aren't met by the current system, not least because we would need to delete and reorganise half of it for intersection to work. SFB 12:19, 18 January 2015 (UTC)
"Semantic web approach of navigation" is exactly right, and SFB is correct in identifying the flaws with DexDor's interpretation, which would fragment the category system so as to make it less useful for readers, and in any event is obviously not consensus-supported practice. The last time this issue was discussed (only a few months ago, and also at this page), the consensus was clearly against such an absolutist and mechanistic view of parent-child relationships. Obiwankenobi's top comment there really says it all: "Categories really aren't like mathematical sets." postdlf (talk) 15:37, 18 January 2015 (UTC)
Replies to SFB's questions: (a) No - Category:Lakes in Foo belongs in Category:Foo and Category:Lakes (b) No - Category:Foo belongs in Category:Fooistan (the country) (and there are also "Wikipedia categories named after ..." categories) and article Foo belongs in Category:Cities in Fooistan, Category:Capital cities etc (as well as its eponymous category). The point is that there are some categories that are appropriate as parents of Foo, but not appropriate as parents of Category:Foo. Another example: the RAF article belongs in Category:1918 establishments in the United Kingdom, but the RAF category does not belong in a 1918 category (as that would have the effect of putting every RAF squadron in the 1918 category).
If anyone thinks wikipedia would benefit from changing the fundamental principles of wp categorization so much that wp:subcat no longer applies (or set up a separate categorization system) then I suggest they write an essay explaining how their system would work (including any changes needed to MediaWiki) - that way the benefits/flaws of an alternative system could be assessed. DexDor (talk) 19:50, 18 January 2015 (UTC)
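The RAF point above can be illustrated with a minimal sketch. It assumes a toy category graph stored as a plain dictionary; the squadron name, the `PARENTS` table and the `ancestors` helper are invented for illustration and are not how MediaWiki stores or resolves categories. It only shows why a parent that suits an article may not suit its eponymous category: whatever is attached to the category is inherited by everything beneath it.

```python
# Toy data, invented for illustration: page or category -> direct parents.
PARENTS = {
    "Royal Air Force": {
        "Category:Royal Air Force",
        "Category:1918 establishments in the United Kingdom",
    },
    "Example RAF squadron": {"Category:Royal Air Force"},
    "Category:Royal Air Force": set(),
}


def ancestors(page, parents):
    """Return every category reachable from `page` by following parent links."""
    seen = set()
    stack = [page]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


YEAR = "Category:1918 establishments in the United Kingdom"

# Year category on the article only: the squadron does not inherit it.
before = ancestors("Example RAF squadron", PARENTS)

# Year category on the eponymous category as well: every member inherits it.
over_categorized = dict(PARENTS)
over_categorized["Category:Royal Air Force"] = {YEAR}
after = ancestors("Example RAF squadron", over_categorized)

print(YEAR in before, YEAR in after)  # prints "False True"
```

In the second arrangement the squadron ends up under the 1918 category even though nothing about the squadron itself was changed.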
@DexDor: Surely the fact that no city categories are in the country categories shows that this guideline is the wrong way round? i.e. there are some cases where it is applied (e.g. RAF & year example), but most of the time that logic is not applied. The examples you're giving are logical enough, but they are far from a reflection of Wikipedia practice, and perhaps categorisation preference of editors as well. Guidelines should be reflecting those. SFB 22:34, 19 January 2015 (UTC)
@SFB. Category:Paris is in Category:France (by several routes) so what do you mean by "the fact that no city categories are in the country categories" ? There is a tendency for categories to be overcategorized - Once upon a time Category:France (as well as the France article) was in Category:Member states of NATO which meant that, for example, articles about the French Revolution were in Category:NATO. That was incorrect categorization, but it didn't mean that the principles of wp categorization were fundamentally flawed - it just needed some tidying up. DexDor (talk) 08:03, 20 January 2015 (UTC)
That's my point. For your logic to work the Paris category should be directly in the France category, otherwise by your category logic we get other strange results. In the current arrangement 1996 UEFA Cup Winners' Cup Final is a child of Populated places in France, Cities in France and Departments of France via the Paris category. For your logic to work (i.e. all the tree must logically apply to the children) you would not only need to completely change the way set categories are used, but you would also have to radically re-parent things like the Paris category. As I say, I'm not saying there is a problem with the system you're proposing, but the mere enforcement of the stated guideline would hugely affect topic categories like that one (which are actually the most important ones). SFB 23:19, 20 January 2015 (UTC)
No, I'm not saying that any article needs to be directly in any category - if there's a suitable subcategory then that's where the article should be (ignoring the special case of eponymous articles).
The dual set/topic nature of wp-categorization can make things complicated. E.g. if someone intersects categories like Cities-in-France and Cities-in-Belgium they might be expecting just articles about cities that are in both countries (i.e. that straddle the border, if there are any) - in fact they would get articles about interactions between cities (e.g. your example of a football match). I don't have a solution to things like this (if, indeed, it's a problem that needs a solution).
The onus should be on anyone who thinks the wp en categorization rules should change to design (what they think is) a better set of rules (that are consistent, don't require changes to MediaWiki etc). Until a clear alternative to the current scheme is proposed (and, again, I recommend doing it as an essay so that it can be clearly explained in detail) this discussion is unlikely to make progress. DexDor (talk) 22:35, 21 January 2015 (UTC)
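The intersection behaviour described above (a football match turning up where a reader might expect border-straddling cities) can be sketched as follows. This is a toy model under stated assumptions: the `CHILDREN` table and the `articles_under` helper are hypothetical, the memberships are simplified for illustration, and no real category-intersection tool is being described.

```python
# Toy data, invented for illustration: category -> direct members.
# Names starting with "Category:" are subcategories; the rest are articles.
CHILDREN = {
    "Category:Cities in France": ["Paris", "Category:Paris"],
    "Category:Paris": ["1996 UEFA Cup Winners' Cup Final"],
    "Category:Cities in Belgium": ["Brussels", "Category:Brussels"],
    "Category:Brussels": ["1996 UEFA Cup Winners' Cup Final"],
}


def articles_under(category, children):
    """Collect all articles in `category` and in its subcategories, recursively."""
    found = set()
    seen = {category}
    stack = [category]
    while stack:
        for member in children.get(stack.pop(), ()):
            if member.startswith("Category:"):
                if member not in seen:  # guard against revisiting a subcategory
                    seen.add(member)
                    stack.append(member)
            else:
                found.add(member)
    return found


france = articles_under("Category:Cities in France", CHILDREN)
belgium = articles_under("Category:Cities in Belgium", CHILDREN)

# A recursive intersection returns what the two trees share, not cities in both.
both = france & belgium
print(both)
```

With this data the intersection contains only the football match, because topic categories such as Category:Paris sit inside the set category for cities.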
I agree with what postdlf has said above. The WP category system is too intricate to regard it as a system in which contents of child categories can always (or even mostly always) be regarded as legitimate contents of the parent categories. I don't think that recognizing this is a proposal for a new system or introducing fundamental principles—it's more a recognition of the complexity and how things are currently set out in practice. As a principle it certainly works in many contexts, but there are also many in which it does not work well. Good Ol’factory (talk) 02:18, 19 January 2015 (UTC)
It seems to me the guideline should be revisited based on actual usage. The Watergate scandal is under Category:Richard Nixon which makes sense to me. But that scandal obviously would never fit directly in Category:Presidents of the United States and I think this is more common than the occasional "exception" anticipated in the guidelines. Even what seems like a hierarchical tree, like the lakes above, currently includes Category:Fish of Lake Victoria. RevelationDirect (talk) 03:42, 19 January 2015 (UTC)
A super-common example is the PLACE/People from PLACE combinations. Category:People from Paris is a subcategory of Category:Paris, but in most cases the contents of the former would not be appropriate for the latter. Good Ol’factory (talk) 03:52, 19 January 2015 (UTC)
It's not appropriate for a person from a large city like Paris to be directly categorized in Category:Paris because there is a people-from subcategory. A smaller place (e.g. a town/village) may be large enough to have its own category, but if it doesn't have a people-from subcat then (as long as the person's connection to that place is sufficiently strong) a bio article can be placed directly in the category (e.g. Twm o'r Nant in Category:Llannefydd). However, it's better in a case like that to have a people-from category even if it only has one member (which is allowed under the exception to WP:SMALLCAT) so that the type of relationship is clear. Another example: the Llanrwst railway station article is ok in Category:Llanrwst because there isn't (currently) a more specific category such as "Railway stations in Llanrwst". DexDor (talk) 20:37, 19 January 2015 (UTC)
In reply to RevelationDirect's point about Nixon. Categories are for grouping articles about similar topics; that is not quite the same thing as grouping articles about related topics. Nixon and Ford are similar topics (Republican US presidents of the 1970s) so they should be closely linked through the category system. Nixon and Watergate are very closely related topics so one would expect them to be well linked in the article text (e.g. using a "main" tag), but they are not so similar that they need to be directly linked by categorization (although both topics fit under categories for US politics etc).
There is a tendency for categories named after people to accrue articles about anything associated with that person - so, for example, we currently have California State Route 90 and Five O'Clock Follies in Category:Richard Nixon (the latter article doesn't even mention Nixon). That's just using the category system to display a list of related articles such as you might get in a see-also list or a template. DexDor (talk) 21:06, 19 January 2015 (UTC)

Proposal 1

Reword the sentence to "When making one category a subcategory of another, ensure that the main article of the subcategory will be a valid member of the parent category." RevelationDirect (talk) 13:04, 20 January 2015 (UTC)
Evad37 raises a good point that we do not follow our current guideline. This proposal is partly descriptive in that it captures what we are actually doing now. I think requiring the main article to be a member of the parent category keeps the subcategories from becoming totally unmoored but, obviously, not all categories have a main article. (This is just a rough draft; how can it be improved or is this the wrong direction?) RevelationDirect (talk) 13:04, 20 January 2015 (UTC)
In that case I really don't understand what change you are trying to make to the categorization scheme. Please give an example of part of the category structure under the current rules and what it would look like after your changes. I also suggest that you do it on a separate page (e.g. as a user essay) linked from here - that way you will have more "space" to explain your ideas. Categorizing things as subsets is common in the real world (e.g. all bats are mammals, all mammals are vertebrates - thus all bats are vertebrates) and wp categorization shouldn't move away from that without a very good reason. DexDor (talk) 22:24, 21 January 2015 (UTC)

Subcategorization

Image caption: A tree structure showing the possible hierarchical organization of an encyclopedia. Items may belong to more than one category, but normally not to a category and its parent (there are, however, exceptions to this rule, such as non-diffusing categories). An item may belong to several subcategories of a parent category (as pictured).

If logical membership of one category implies logical membership of a second, then the first category should be made a subcategory (directly or indirectly) of the second. For example, Cities in France is a subcategory of Populated places in France, which in turn is a subcategory of Geography of France.

Many subcategories have two or more parent categories. For example, Category:British writers should be in both Category:Writers by nationality and Category:British people by occupation. When making one category a subcategory of another, ensure that the [struck: members of the subcategory really can be expected (with possibly a few exceptions) to belong to the parent also.] [proposed: main article of the subcategory will be a valid member of the parent category.] Category chains formed by parent-child relationships should never form closed loops; that is, no category should be contained as a subcategory of one of its own subcategories. If two categories are closely related but are not in a subset relation, then links between them can be included in the text of the category pages.

A page or category should rarely be placed in both a category and a subcategory or parent category (supercategory) of that category (unless the child category is non-diffusing – see below – or eponymous). For example, the article "Paris" need only be placed in "Category:Cities in France", not in both "Category:Cities in France" and "Category:Populated places in France". Since the first category (cities) is in the second category (populated places), readers are already given the information that Paris is a populated place in France by it being a city in France.

Note also that as stub templates are for maintenance purposes, not user browsing (see #Wikipedia administrative categories above), they do not count as categorization for the purposes of Wikipedia's categorization policies. An article which has a "stubs" category on it must still be filed in the most appropriate content categories, even if one of them is a direct parent of the stubs category in question.
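The "no closed loops" rule in the Subcategorization text above lends itself to a mechanical check. The following is a minimal sketch, assuming the category graph is available as a dictionary from each category to its direct parents; the `find_cycle` helper and the sample data are hypothetical and are not part of MediaWiki or of any existing tool. It uses a depth-first search that reports the first loop it meets.

```python
def find_cycle(parents):
    """Return one closed loop as a list of categories, or None if there is none.

    `parents` maps each category to an iterable of its direct parents.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}

    def visit(node, path):
        state[node] = GREY
        path.append(node)
        for parent in parents.get(node, ()):
            colour = state.get(parent, WHITE)
            if colour == GREY:
                # Reached a category already on the current path: a loop.
                return path[path.index(parent):] + [parent]
            if colour == WHITE:
                cycle = visit(parent, path)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in list(parents):
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


# The chain quoted in the guideline text: no loop.
chain = {
    "Category:Cities in France": ["Category:Populated places in France"],
    "Category:Populated places in France": ["Category:Geography of France"],
}

# A deliberately broken variant in which the top category is filed under the bottom one.
looped = dict(chain)
looped["Category:Geography of France"] = ["Category:Cities in France"]

print(find_cycle(chain))
print(find_cycle(looped))
```

The recursive form is fine for a small sample like this; a graph as deep as the real category tree would need an iterative traversal instead.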

Content categories without visible parents

I've noticed that some categories completely lack any visible parent categories, seemingly making them uncategorized, except that they have hidden maintenance categories as parents. This completely breaks navigation by category tree, since these categories are unreachable from the root category by descent (you'd have to ascend from some shared subcategory, if any shared subcategories exist).

Category:Wikipedia categories named after Canadian musicians exhibits this anomalous behaviour, where the categorized categories lack any visible parents and only have this maintenance category. Since the categories it contains are not maintenance categories themselves, but content categories, this seems wrong.

-- 65.94.40.137 (talk) 05:54, 31 January 2015 (UTC)

This sort of categorization does not completely break navigation by category tree; someone can (for example) navigate from the Bryan Adams article to Category:Bryan Adams. If you think there is a problem here then what do you think should be changed to fix it ? E.g. should we (a) delete Category:Bryan Adams, (b) put Category:Bryan Adams under categories such as Category:Canadian male singer-songwriters or (c) make the category non-hidden ? Option b might look ok, but it would place articles like Queen Elizabeth II domestic rate stamp (Canada) and Zoo Magazine under Category:Canadian male singer-songwriters which wouldn't be correct categorization. DexDor (talk) 08:15, 31 January 2015 (UTC)
I'm inclined to agree with 65.94.40.137 - Category:Bryan Adams (for example) should be in a non-hidden category. Wikipedia:Categorization#Category tree organization does not mention hidden/visible categories, but I would interpret "every category ... must be a subcategory of at least one other category" to mean visible categories, not admin/hidden categories.
The obvious solution would be to include Category:Bryan Adams in Category:Canadian male singer-songwriters (presuming that he is one). While it's true that Queen Elizabeth II domestic rate stamp (Canada) is not a Canadian male singer-songwriter, the stamp is named after a Canadian male singer-songwriter. Similarly, Zoo Magazine was co-founded by a Canadian male singer-songwriter.
See also #WP:SUBCAT - belonging also to parent above. Mitch Ames (talk) 02:32, 1 February 2015 (UTC)
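The situation discussed in this section, content categories whose only parents are hidden, can be detected with a very small check. The sketch below is an assumption-laden illustration: the `PARENTS` table, the `HIDDEN` set and the `without_visible_parent` helper are invented here, and the hidden status of the maintenance category is taken from the discussion above rather than looked up.

```python
# Toy data, invented for illustration: category -> direct parent categories.
PARENTS = {
    "Category:Bryan Adams": {
        "Category:Wikipedia categories named after Canadian musicians",
    },
    "Category:Canadian male singer-songwriters": {
        "Category:Canadian singer-songwriters",
    },
}

# Assumed set of hidden (maintenance) categories.
HIDDEN = {"Category:Wikipedia categories named after Canadian musicians"}


def without_visible_parent(parents, hidden):
    """List categories whose direct parents are all hidden (or that have none)."""
    return sorted(
        category
        for category, direct in parents.items()
        if not any(parent not in hidden for parent in direct)
    )


orphans = without_visible_parent(PARENTS, HIDDEN)
print(orphans)
```

Such a list says nothing about which visible parent, if any, would be appropriate; that is the question being argued above.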

Semi-protected edit request on 31 January 2015

There is a typo: on one occurrence you have "sub-category" instead of "stub-category". 

2.125.15.86 (talk) 09:20, 31 January 2015 (UTC)

Sea Shepherd Conservation Society and Category:Sea Shepherd Conservation Society

The categories that would normally belong on the article page are on the category page only, with one exception: Category:Sea Shepherd Conservation Society. I reversed this, putting the categories on the article page, but then got reverted[1], with the reverter claiming a talk page consensus (Talk:Sea Shepherd Conservation Society/Archive 12). The discussion is lengthy, but the fact is I have never before seen an instance of this: categories only on the category page and not on the article page. Is Sea Shepherd Conservation Society wrong?...William 13:13, 3 February 2015 (UTC)

The simple answer is that categories are not restricted to a like-named category instead of the article. In this case take one item in the category, Pete Bethune, who was not established in 1977 (the subcategories must be valid for most of the category contents). That makes the parent categories for Category:Sea Shepherd Conservation Society or its contents out of sync. Since I feel the contents are proper, the problem is the categories as you pointed out. Looking at it the other way, I expect that Category:Organizations established in 1977 only applies to the parent organization so that should only be present on the main article's page. Vegaswikian (talk) 17:19, 3 February 2015 (UTC)
WP:SUBCAT says (with my examples added here in italics) that "A page [Sea Shepherd Conservation Society] or category should rarely be placed in both a category [Category:Sea Shepherd Conservation Society] and a ... parent category (supercategory) [1977 establishments in Washington (state)] of that category ...", so there is no need to include the article Sea Shepherd Conservation Society in the categories 1977 establishments in Washington (state) etc, because the article is already in those parent categories indirectly via Category:Sea Shepherd Conservation Society.
However WP:EPONYMOUS allows (but does not require) the article Sea Shepherd Conservation Society to be in the other parent categories directly - as if the child category Sea Shepherd Conservation Society did not exist - because the child cat is eponymous.
Mitch Ames (talk) 13:57, 4 February 2015 (UTC)

Alumni

What is the best way to categorize alumni so that there is a clickable link in the main article for the school? I have been using the first method, but in the past others have removed it, saying the school is not an alumnus. Of course it isn't; it is there to be the header for the category list and provide a clickable way to get to the list from the school page. --Richard Arthur Norton (1958- ) (talk) 21:33, 18 February 2015 (UTC)

  • "Category:Don Bosco Preparatory High School alumni| " Add the alumni category to the school and add a blank so when sorted it appears at the top of the list?
  • ":Category:Don Bosco Preparatory High School alumni" Add the category with a colon to the see also section?
  • "Category:Don Bosco Preparatory High School" Create a supercategory that will only contain the subcategory "Category:Don Bosco Preparatory High School alumni"?

If I understand your question, you're wondering how to properly link the alumni category within the article on the school, correct? #2 is the only acceptable option of the three you've listed, and you should also use {{cat main}} on the category's description page to link back to the school article. If there is also a standalone alumni list as well as an alumni category (and please don't confuse us by using "list" to refer to the contents of a category), then that list should be categorized by the alumni category with a blank sortkey, and then include a link to the list in the school article. postdlf (talk) 21:41, 18 February 2015 (UTC)

  • Comment. I agree that #2 is the way to go, though I would say the backlinking from the category should be with {{cat more}} rather than {{cat main}} (since calling something a "main article" for a category could imply that it too should be in the category, which would imply the approach set out in #1). Good Ol’factory (talk) 23:55, 18 February 2015 (UTC)