Wikipedia:Overcategorization: Difference between revisions

Radiant! (talk | contribs)

Revision as of 16:26, 21 December 2006

[[Category:Wikipedia wp:oc wp:ocats|Overcategorization]]

Categorization is a useful tool for finding and correlating articles. However, sometimes we tend to overcategorize; the more categories an article is in, the less meaningful all of them become. Hence, based on existing guidelines and WP:CFD precedent, this page lists types of categories that should be avoided. If created, categories of these types are very likely to be deleted.

General cases

Non-defining or trivial characteristics
Examples: People who own cats, Motorcycle riders, Cities with a McDonald's restaurant
We should categorize by what is actually important in a person's life, such as their career, origin and major accomplishments. In contrast, someone's tastes in food, their favorite holiday destination, or the number of tattoos they have are trivial. Such information may be interesting to put in the article, but is not useful for categorization. If you could easily leave something out of a biography, it is not a defining characteristic. This applies equally to articles about subjects other than people, such as the cities example above.
Subjective benchmark
Examples: Tall people, Notable architecture, Famous songs
Adjectives that imply a subjective benchmark should not be used in naming or defining a category. Examples include any reference to size (large, small, tall, short, etc.) or distance (near, far, etc.), and such subjective words as famous, notable, evil, honest, great, beautiful, ugly, young and old.
Arbitrary inclusion limit
Examples: People over six feet tall, Villages with more than 10,000 inhabitants, Disasters with more than 5,000 casualties
There is no particular reason for choosing "six", "10,000", or "5,000" as cutoff points in these three cases. A village with 9,800 people is not meaningfully different from one with 10,100 people. A better way of representing this kind of information is to put it in an article such as "List of villages in (locality) by size". Note that our software currently allows a table to be made sortable by any column. The obvious exception is categorizing by year, since making a category for each year is not arbitrary.
Arbitrary geographical grouping
Examples: Roman Catholic Bishops from Ohio, Quarterbacks from Louisiana, Male models from Dallas, Texas
Avoid subcategorizing items by geographical boundary if that boundary does not have any relevant bearing on the items' other characteristics. For example, quarterbacks' careers are not defined by the specific state that they once lived in (unless they played for a team within that state). However, geographical boundaries are useful for dividing items into regions that are directly related to the items' characteristics (for example, Roman Catholic Bishops of the Diocese of Columbus, Ohio or New Orleans Saints quarterbacks).
Arbitrary intersection
Example: Bakers who won an Emmy award
Avoid intersections of two unrelated traits, even if some person can be found who has both. For instance, Emmy awards are not awarded for baking skill.
Overly narrow categories
Example: Italian composers born in 1850, Fictional Black African-American DC animated Superheroes with the power to manipulate electricity
If an article is in "category A" and "category B", it does not follow that a "category A and B" has to be created for this article. Such intersections tend to be very narrow, and clutter up the page's category list. Even worse, an article in categories A, B and C might be put in four categories "A and B", "B and C", "A and C" as well as "A, B and C", which clearly isn't helpful.
In general, intersection categories should only be created when both parent categories are very large and similar intersections can be made for related categories (e.g. "Spanish composers born in 1870").
No potential for growth
Example: Moons of Earth, Albums by some band that has disbanded and released only two albums
Just because most bands (etc.) have a category doesn't mean all of them need one. Avoid categories that will never have more than two or three members.
Subcategories with large overlaps
Example: 1962 New York Yankees team roster, Members of United States 92nd Congress
Categories like this will likely add multiple categories to many articles, for instance if most of the 1962 team was also on the 1963 team. It is better to handle these with a single category, and create lists that detail the multiple instances.
Opinion on an issue
Example: People who like ice cream, Politicians who favor legalizing drugs
As above, holding an opinion is not a defining characteristic, and should not be a criterion for categorization, even if a reliable source can be found for the opinion.
Race, religion and sexual preference
Example: Christian ice skaters, African-American chemists, Homosexual physicists
As above, people should only be categorized by race, religion or sexual preference if this has significant bearing on their career. For instance, in sports, Christian ice skaters are not treated differently from Jewish or Muslim ice skaters. Similarly, in chemistry, a person's actions are more important than their ethnicity (and there are no separate Nobel Prizes for different ethnicities). While "LGBT literature" is a specific genre and a useful categorization, "LGBT quantum physics" is not.
Award winners and nominees
Example: Michigan Red Herring Award nominees
In general, the winners of all but the most internationally well-known awards should be put in a list rather than a category. It may nevertheless be useful to note the awards in the article. If an award doesn't have an article, it certainly doesn't need a category (and not every award that has an article needs a category). (The sole exception to this is the Academy Awards, which is currently under debate.) Nominees should not be categorized.
Inclusion in a published list
Example: Top 40 songs
Magazines and books regularly publish lists of the "top 10" (or some other number) in any particular field. Such lists tend to be subjective and somewhat arbitrary, and as such don't make for meaningful categorization. Additionally, since there are many such lists, creating categories for all of them would add needless clutter to all relevant pages. Some particularly well-known and unique lists such as the Forbes 400 may constitute exceptions, although creating categories for them risks violating the publisher's copyright.
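As an illustration of the point made under "Overly narrow categories", the count of four intersection categories for an article in categories A, B and C can be checked mechanically, and the same count grows quickly as more parent categories are added. The following is a minimal Python sketch, not part of any Wikipedia tool; the function name intersection_categories and the placeholder labels are hypothetical and chosen only for this example.

```python
from itertools import combinations


def intersection_categories(categories):
    """Return every possible intersection of two or more of the given categories.

    Hypothetical helper for illustration only: it just joins the names
    of each combination with " and ".
    """
    result = []
    for size in range(2, len(categories) + 1):
        for combo in combinations(categories, size):
            result.append(" and ".join(combo))
    return result


# The case from the text: an article in categories A, B and C.
print(intersection_categories(["A", "B", "C"]))

# How the number of possible intersections grows with more parent categories.
for n in range(2, 7):
    labels = [f"C{i}" for i in range(n)]  # placeholder category names
    print(n, len(intersection_categories(labels)))
```

For n parent categories the number of possible intersections is 2^n - n - 1, so an article in six categories could in principle attract 57 intersection categories, which is why such intersections should be created sparingly.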

Specific cases

Actors by film, or films by actor
Example: Starship Troopers actors, or Films starring John Travolta
Since most films have several dozen otherwise-unrelated actors, and most actors play in several dozen otherwise-unrelated films, these categories would add unnecessary clutter to all pages on actors and films. This information is better represented by including a list of actors in the film article, and vice versa.
In contrast, categorizing films by director is generally useful, because most films have only a single director, and directors tend to have their own styles.
Fictional characters by team or association
Examples: Excalibur members, Simpsons villains
Most short-running stories don't have enough characters to make subcategorizing by team meaningful. Most long-running stories tend to have characters leaving teams, joining teams and switching teams a lot (which means a list is better for explaining the matter), or have "sides" that aren't black-and-white but have large gray areas (which means that the categorization is not objectively defined).
Guest stars by show
Example: People who appeared on the Muppet Show
Several shows are based upon having a new set of guest stars for every episode. As such, these stars don't have much in common and don't make for a meaningful categorization. Once again, a list is a more comprehensive way of showing this.

See also